# Hardware Modeling and Design of Brain-like Information Systems

Tsutomu Miki, Takashi Morie and Takeshi Yamakawa

Department of Brain Science and Engineering, Graduate School of Life Science and Systems Engineering, Kyushu Institute of Technology, Wakamatsu-ku, Kitakyushu 808-0196, Japan {miki, morie, yamakawa}@brain.kyutech.ac.jp

Abstract-- The goal of our COE program is to develop electronic neuro-devices that lead to the realization of dedicated VLSI chips. One of the fields in the COE program is "Brain-Like Integrated Circuits." The aims of this field are to design brain-like hardware models based on the knowledge obtained in the other fields and to implement these models in dedicated VLSI devices. We take three different approaches to the final goal (brain-like VLSI): real-time vision processing, multi-modal sensory systems, and central nervous information processors (brain-like computers). This paper briefly discusses some of the projects pursued in our field toward that goal.

# I. INTRODUCTION

Our program "The World of Brain Computing Interwoven out of Animals and Robots," headed by Takeshi Yamakawa, was selected in 2003 as a 21st Century Center of Excellence (COE) Program of the Ministry of Education, Culture, Sports, Science and Technology, Japan. The aim of our COE program is to develop electronic brain-devices that lead to the realization of dedicated VLSI chips, including memory devices, sensory devices, information integration devices, motor control devices and so on. The program consists of five fields: Neurophysiology and Electrochemistry, Psychology and Human Operation, Mathematics and Linguistics, Brain-Like Integrated Circuits, and Robotics.

The role of our field, "Brain-Like Integrated Circuits," is to design brain-like hardware models based on the knowledge obtained in the other fields and to implement these models in dedicated VLSI devices; it occupies the most important position in our COE program. We currently take three approaches to the final goal from different aspects: real-time vision processing based on biological architecture; multi-modal sensory systems based on biological sensory mechanisms, which obtain information about our surroundings without depending on environmental changes; and central nervous information systems (brain-like computers), which emulate higher human brain functions.

In the human sensory systems, vision plays the most important role. Visual information processing in the brain is mainly performed in two pathways: the ventral pathway for object shape recognition and the dorsal pathway for object movement recognition. In order to develop brain-like visual processing systems using current VLSI technology, our first research target is to design feature-detection VLSIs for real-time object shape recognition. If such VLSIs are realized with sufficiently fast processing speed, movement detection can also be achieved. One of the authors has developed various image processing LSIs, such as resistive-fuse network LSIs for coarse region segmentation [1-4], nonlinear oscillator network LSIs for image region extraction [5-7], Gabor filtering LSIs for feature extraction [8-11], and convolutional neural network LSIs for image detection [12]. Section II describes the design of a Gabor filtering LSI, which achieves the function of the primary visual cortex (V1), using the merged analog-digital circuit architecture.

Decision making in the human brain demands rich information about the outer world that does not depend upon environmental changes. We synthesize information obtained from the five senses in real time and decide our actions well. The development of sensory systems based on the biological and physiological mechanisms of human sensory systems is one of the key studies for brain-like information systems. Section III focuses on brain-like sensory systems. As an example of a human-like sensory system, image pre-processing based on the degree of perceptual importance is presented. The system employs a basis function network scheme, which has good local adaptability and a simple parallel architecture suitable for silicon implementation. As applications of the proposed method, facial feature extraction and headline area extraction are presented. Furthermore, some studies carried out in collaboration with researchers from other fields in our COE program are also briefly described.

The Self-Organizing Map (SOM) was proposed by Teuvo Kohonen [13], inspired by the way the brain maps the outside world onto the cortex, where nearby stimuli are coded on nearby cortical areas. The SOM algorithm is a simplified model of such a mapping process and has been used in a wide range of technical applications, such as pattern recognition, data analysis and control tasks. It is, however, very difficult to apply the SOM to applications in which real-time processing is required. Several SOM learning algorithms aiming at speed-up and hardware implementation have been proposed [14]-[19]. In Section IV, a new fast learning algorithm and a hardware implementation employing rough comparison WTA (Winner-Take-All) are proposed to realize SOM learning with high speed and high accuracy.

# II. A PIXEL-PARALLEL GABOR FILTER LSI BASED ON MERGED ANALOG/DIGITAL ARCHITECTURE

Gabor wavelet transformation (GWT) is a model of simple cells in the primary visual cortex. It is also known as a powerful image feature extraction method for practical image recognition. GWT extracts local spatial frequency components with a given orientation because it has a sinusoidally-waving convolution kernel localized by a Gaussian window function. Since GWT requires huge computational power, pixel-parallel hardware implementation is suitable for real-time image processing. Such massively parallel operation is one feature of brain-like information processing.
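
To make the kernel shape concrete, the following minimal sketch samples a complex Gabor kernel, that is, a complex sinusoid multiplied by an isotropic Gaussian window, on a small grid. It is only an illustration of the textbook kernel, not of the LSI algorithm; the function name `gabor_kernel` and the parameters `sigma`, `freq` and `theta` are hypothetical and do not come from the cited designs.

```python
import cmath
import math


def gabor_kernel(size, sigma, freq, theta):
    """Sample a complex Gabor kernel on a size x size grid centred at the origin.

    size:  odd number of samples per axis (assumption of this sketch)
    sigma: standard deviation of the Gaussian window, in pixels
    freq:  spatial frequency of the sinusoid, in radians per pixel
    theta: orientation of the sinusoid, in radians
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Gaussian window that localizes the sinusoid.
            window = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            # Phase of the sinusoid along the chosen orientation.
            phase = freq * (x * math.cos(theta) + y * math.sin(theta))
            row.append(window * cmath.exp(1j * phase))
        kernel.append(row)
    return kernel


kernel = gabor_kernel(size=9, sigma=2.0, freq=math.pi / 4, theta=0.0)
centre = kernel[4][4]  # window = 1 and phase = 0 at the origin
```

The real and imaginary parts of each sample play the roles of the two state variables per pixel that a complex-valued filter output requires.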

Morie *et al.* have proposed a new GWT algorithm that can achieve pixel-parallel operation with only nearest-neighbor connections [9]. This algorithm is based on the cellular neural network model using double-layer resistive networks proposed by B. E. Shi [20]. However, our algorithm realizes accurate Gabor filtering, whereas Shi's model realizes Gabor-like filtering with an exponentially shaped window function.

To implement this algorithm, Morie *et al.* have proposed a pixel-parallel Gabor filter LSI based on our merged analog/digital LSI architecture, which combines the advantages of both digital and analog approaches by using pulse modulation signals [8,10]. The LSI architecture has a 2-D pixel-circuit array corresponding to the pixel array. Because Gabor coefficients are expressed by complex numbers, the pixel circuit at pixel (m,n) handles state variables $V_r(m,n)$ and $V_i(m,n)$, which correspond to the real and imaginary parts, respectively, as shown in Fig. 1 (a). The Gabor coefficients are obtained by repeatedly updating each state variable, with the initial value of each $V_r$ set to the corresponding pixel value of the input image. Each updating value is obtained by multiplying the difference between $V_{r/i}$ of the target pixel and that of a neighboring pixel by the corresponding connection coefficient. By changing the connection coefficients, an arbitrary GWT can be realized.

The real part of the pixel circuit is shown in Fig. 1 (b). The imaginary part is identical except for the pixel-data-input circuit. The value of $V_{r/i}$ is stored as charge on capacitor $C_o$ and is converted into a PWM signal by the voltage-to-pulse converter *VPC*. The *VPC* consists of an auto-zero clocked comparator. Selector *SEL* selects two PWM signals from among those of the neighboring pixel circuits and the circuit's own. Subtraction circuit *SUB* calculates the difference between the two PWM signals. A PWM signal whose pulse width is proportional to the absolute difference appears at node *Diff*, and the sign bit of the difference appears at node *Sign*. The sign bit can be reversed by *Rev*. The PWM signal switches the current source I<sub>+</sub> or I<sub>-</sub> according to the sign bit. Thus, $V_{r/i}$ at each pixel circuit is updated in parallel.



Fig. 1: Pixel-parallel GWT processor LSI architecture (a), and pixel circuit (b).

Morie *et al.* designed and fabricated a Gabor filter LSI using 0.35 µm CMOS technology [10]. The layout and a chip microphotograph are shown in Fig. 2. A Gabor impulse response was obtained in the measurement of the fabricated LSI chip, as shown in Fig. 3. The result was nearly identical to the corresponding ideal Gabor impulse response. Stripe-pattern detection using the LSI chip was also verified. When the updating cycle is 1 MHz, the operation performance of this LSI is 26 GOPS.







Fig. 3: Measurement result of the fabricated LSI chip for Gabor impulse response. The input pixel value was only given at (5,5).

# III. SENSORY INFORMATION PROCESSING FOR BRAIN-LIKE INFORMATION SYSTEMS

One of the keys to realizing a brain-like information system is how to obtain information about the outer world effectively. In this section, sensory information processing as a human-like information acquisition system is described.

#### Image pre-processing based on the degree of perceptual importance

Visual information is the most important source of rich information about our surroundings. However, the amount of data is too large to process in real time, and not all of it is needed to take an appropriate action. Our real-time judgments are based on limited information from the more attractive areas of an image. As a kind of human-like image pre-processing, Miki has proposed an effective feature extraction method [21, 22].

The proposed model is based on the simplest wavelet network [23] and has been improved into a hardware-oriented one [24]. The output of the proposed network is defined by a linear combination of weighted basis functions $W \cdot \psi(x, y)$, described as:

$$\hat{Y}(x,y) = \sum_{a=0}^{M} \sum_{bk=0}^{2^{a+1}-1} \sum_{bl=0}^{2^{a+1}-1} W_{a,bk,bl} \psi_{a,bk,bl}(x,y), \quad (1)$$

where a is the level, and bk and bl are the indices of the basis functions along the x- and y-axes, respectively. The proposed network has a multi-level structure. Each level consists of one or more basis functions whose support size differs from level to level. As the level becomes higher, the support size of the basis function becomes narrower. The support size corresponds to a spatial frequency. Approximating an image with the network therefore amounts to decomposing the image into partial components according to spatial frequency. In our method, a specialized feature image can be obtained by decomposition with particular basis functions selected appropriately for the purpose.
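
As a small worked example of the multi-level structure, the index ranges printed in Eq. (1) imply $2^{a+1}$ basis functions per axis at level a, hence $4^{a+1}$ per level in two dimensions. The sketch below simply counts them; the function names are hypothetical, and the counts follow only from the summation limits as written, not from any further detail of the cited network.

```python
def basis_functions_at_level(a):
    """Number of 2-D basis functions at level a, from the limits of Eq. (1)."""
    per_axis = 2 ** (a + 1)  # bk and bl each run from 0 to 2^(a+1) - 1
    return per_axis * per_axis


def total_weights(max_level):
    """Total number of weights W for levels 0..max_level inclusive."""
    return sum(basis_functions_at_level(a) for a in range(max_level + 1))


counts = [basis_functions_at_level(a) for a in range(4)]  # levels 0 to 3
total = total_weights(3)
```

Each additional level quadruples the number of basis functions, which is consistent with the support size shrinking as the level increases.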

The feature extraction process consists of two steps: decomposition and reconstruction. In the decomposition process, the network decomposes the original image into weighted basis functions. In the reconstruction process, the output image is then obtained by reconstruction with the weighted basis functions that are selected based on the degree of perceptual importance. As an index of the perceptual importance, the concept of the curvature energy introduced in the three-component image model [25] is used. In our method, the curvature energy $C_{i,j}$ is given by Eq. (2).

$$C_{i,j} = (x_{i-1,j} + x_{i,j-1} - 4x_{i,j} + x_{i+1,j} + x_{i,j+1})^{2}, \quad i = 1, 2, \dots, N-2, \ j = 1, 2, \dots, N-2, \tag{2}$$

where  $x_{i,j}$  represents the intensity value of the pixel (i, j).
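
Eq. (2) is the squared response of a five-point discrete Laplacian, which can be sketched directly. The code below is a minimal illustration, assuming the image is stored as a list of rows; setting the border pixels to zero and the helper `important_pixels`, which thresholds the energy per pixel, are assumptions made for this sketch and simplify the basis-function selection used in the actual method.

```python
def curvature_energy(image):
    """Return C[i][j] of Eq. (2) for an N x N image given as a list of rows.

    Border pixels, which Eq. (2) does not cover, are set to 0 (an assumption).
    """
    n = len(image)
    energy = [[0] * n for _ in range(n)]
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            laplacian = (image[i - 1][j] + image[i][j - 1] - 4 * image[i][j]
                         + image[i + 1][j] + image[i][j + 1])
            energy[i][j] = laplacian ** 2
    return energy


def important_pixels(energy, threshold):
    """Hypothetical helper: mark pixels whose curvature energy exceeds a threshold."""
    return [[value > threshold for value in row] for row in energy]


# A 5 x 5 test image with a single bright pixel in the centre.
image = [[0] * 5 for _ in range(5)]
image[2][2] = 1
energy = curvature_energy(image)
mask = important_pixels(energy, threshold=4)
```

For this test image the energy is 16 at the bright pixel and 1 at its four nearest neighbours, so only the centre survives a threshold of 4.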

As an example of our applications, human-like facial feature extraction from a gray-scale image is shown in Fig. 4. The sample image is a $256 \times 256$ pixel gray-scale image with 256 levels, as shown in Fig. 4(a). The image extracted by our method is shown in Fig. 4(b). The proposed method extracts the facial features well.



Fig. 4: Result of the facial feature extraction using the proposed method, (a) original image, (b) extracted image obtained by the proposed method.

The proposed method is also useful for highlighted area extraction [22, 26]. A highlighted area is designed so as to draw human attention. Therefore, headline areas can be extracted in the same manner. Here, the extraction of headlines from Japanese and English newspapers using the proposed method is examined, as shown in Fig. 5 and Fig. 6, respectively. The headline areas are successfully extracted. The control parameter is the threshold for the curvature energy, which corresponds to the command given from the brain (the central processor).



Fig. 5: Result of headline extraction from a Japanese newspaper, (a) original image, (b) extracted image obtained by the proposed method.



Fig. 6: Result of headline extraction from an English newspaper, (a) original image, (b) extracted image obtained by the proposed method.

The 1-D hardware-oriented model has been implemented on a Xilinx FPGA device and its real-time capability has been confirmed [24, 27]. Miki *et al.* are now designing the 2-D hardware model for real-time image pre-processing.

#### Other sensory information processing projects

Miki is developing a system that extracts emotion from speech signals. The emotion conveyed in speech is very important information for good communication. He has designed an emotion extraction system based on fuzzy theory and confirmed that the system achieves a good discrimination rate (more than 75%) for realistic speech signals [28]. The system handles six basic emotions (natural, anger, happiness, sadness, surprise and fear) and is a kind of KANSEI information system. Miki is designing sensors with cross talk, because animal taste receptors appear to take advantage of cross talk to increase sensitivity.

Miki is also trying to design a single hardware taste bud cell (TBC), with the aim of realizing an electronic TBC network as a taste bud model. This project is being carried out in collaboration with Yoshii and Tateno. Their final goal is to produce a bio-inspired silicon sensor [29].

# IV. FAST LEARNING ALGORITHM FOR SELF-ORGANIZING MAP HARDWARE

Yamakawa has proposed a fast learning algorithm for self-organizing maps (SOMs), which emulate an aspect of our brain function. The structure of the SOM is shown in Fig. 7. The SOM consists of an input layer and a competitive layer that include M and N units, respectively. The *j*-th unit in the competitive layer is connected to all units in the input layer by the weight vector $w_j = [w_{j1}, \dots, w_{ji}, \dots, w_{jM}]$. The SOM is trained using the learning data set $\{x_l | l = 1, \dots, L\}$. We omit the subscript l except in special cases. The learning procedure of the SOM is as follows:

- 0) The weight vectors of all units in the competitive layer are initialized using random values.
- 1) The input vector  $x = [x_1, \dots, x_i, \dots, x_M]$  is randomly selected from the set of learning data and it is applied to the input layer.
- 2) The distance between the input vector x and weight vector  $w_j$  of the j -th unit in the competitive layer is calculated as Euclidean distance.
- 3) The unit which has the minimum Euclidean distance is defined as the winner unit c.
- 4) The weight vectors of the winner unit and neighboring units are updated by:

$$w_j(t+1) = w_j(t) + \alpha(t)(x - w_j(t)), \tag{3}$$

where *t* represents a learning step, and $w_j(t+1)$ and $w_j(t)$ are the weight vectors after and before updating, respectively. $\alpha(t)$ is a learning coefficient.

- 5) Procedures 1 to 4 are repeated until all data have been selected.

In the iteration, the learning rate $\alpha(t)$ and the neighborhood area $N_c(t)$ decrease with the learning steps. For example, they are defined by:

$$\alpha(t) = \alpha_i \left(\frac{\alpha_f}{\alpha_i}\right)^{\frac{t}{T}},\tag{4}$$

$$N_c(t) = N_{c,i} \left(\frac{N_{c,f}}{N_{c,i}}\right)^{\frac{t}{T}},\tag{5}$$

where $\alpha_i$ and $\alpha_f$ are the initial and final values of the learning coefficient, respectively, T is the total number of learning steps, and $N_{c,i}$ and $N_{c,f}$ are the initial and final values of the neighborhood area, respectively.
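
The learning procedure can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the hardware algorithm: the competitive layer is taken to be one-dimensional, the neighborhood is every unit whose index lies within $N_c(t)$ of the winner, the loop runs for a fixed number of steps T, and both the learning coefficient and the neighborhood area decay as their final-to-initial ratio raised to the power t/T, the form of Eq. (5). The names `decay` and `train_som` and all default parameter values are hypothetical.

```python
import random


def decay(initial, final, t, total_steps):
    """Exponential schedule: initial * (final / initial) ** (t / T)."""
    return initial * (final / initial) ** (t / total_steps)


def train_som(data, n_units, total_steps, alpha_i=0.5, alpha_f=0.01,
              nc_i=None, nc_f=0.5, seed=0):
    """Train a 1-D SOM (an assumption of this sketch) and return its weights."""
    rng = random.Random(seed)
    dim = len(data[0])
    if nc_i is None:
        nc_i = n_units / 2.0
    # Step 0: initialize all weight vectors with random values.
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for t in range(total_steps):
        alpha = decay(alpha_i, alpha_f, t, total_steps)
        radius = decay(nc_i, nc_f, t, total_steps)
        # Step 1: select an input vector at random.
        x = rng.choice(data)
        # Steps 2-3: the winner has the minimum (squared) Euclidean distance.
        winner = min(range(n_units),
                     key=lambda j: sum((xi - wi) ** 2
                                       for xi, wi in zip(x, weights[j])))
        # Step 4: move the winner and its neighbours toward the input.
        for j in range(n_units):
            if abs(j - winner) <= radius:
                weights[j] = [wi + alpha * (xi - wi)
                              for xi, wi in zip(x, weights[j])]
    return weights


data = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
weights = train_som(data, n_units=8, total_steps=200)
```

Comparing squared distances gives the same winner as comparing Euclidean distances, so the square root is omitted. In this software form the winner search is a loop over all units, which is the part that parallel hardware can shorten.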



When input vectors are applied to the input layer of the SOM after learning, pairs of input vectors separated by a small distance produce winner units that are close to each other.

The learning of the SOM can be divided into the following three parts: (1) calculating distances, (2) determining the winner unit and (3) updating weight vectors. Recent VLSI technology allows all units in the competitive layer to learn in parallel, so the processing time for these procedures can be easily reduced. The winner unit is obtained by a new fast WTA algorithm with high accuracy. In the initial stage of learning, the weight vectors are distributed in a disorderly manner, and the main purpose of this stage is to arrange them in order. This means that a strict determination of the winner unit is not required at that stage. Therefore, Yamakawa proposes a fast WTA algorithm which performs rough comparison in the early stage and strict comparison in the later stage.

Fig. 8 shows the concept of the proposed rough comparison WTA algorithm. In Fig. 8, the distance between the input vector and each weight vector is represented with 8 bits. In the input (weight) vector space (bottom-left in Fig. 8), the weight vectors of units 1, 3, 4 and 5 are located within the white circle. This means that the distances between the input vector and these weight vectors cannot be distinguished by the rough comparison. The number of clocks required for WTA can be reduced by using this algorithm. Fig. 9 shows the WTA circuit for the proposed WTA algorithm.
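
One way to picture rough comparison is to compare only the most significant bits of each 8-bit distance, so that units whose distances agree in those bits become indistinguishable. The sketch below follows that reading; it is an assumption about the mechanism rather than a description of the circuit in Fig. 9. The function `rough_wta`, the tie-break by lowest index and the distance values are all made up for illustration, with the values chosen so that units 1, 3, 4 and 5 tie as in the description above.

```python
def rough_wta(distances, bits_compared, total_bits=8):
    """Pick a winner by comparing only the top `bits_compared` bits.

    Returns (winner_index, tied_indices). Ties are broken by taking the
    lowest index, which is an assumption of this sketch.
    """
    shift = total_bits - bits_compared
    truncated = [d >> shift for d in distances]  # drop the low-order bits
    best = min(truncated)
    tied = [j for j, value in enumerate(truncated) if value == best]
    return tied[0], tied


# Made-up 8-bit distances for units 1..6 (list indices 0..5).
distances = [0b00010110, 0b01000001, 0b00010001,
             0b00011010, 0b00010100, 0b10000000]

rough_winner, rough_tied = rough_wta(distances, bits_compared=4)
strict_winner, strict_tied = rough_wta(distances, bits_compared=8)
```

With four bits, indices 0, 2, 3 and 4 (units 1, 3, 4 and 5) tie and any of them may be taken as the winner; with all eight bits, only index 2 (unit 3) remains. If the comparison proceeds one bit per clock, which is a further assumption, the rough comparison needs half as many clocks in this example.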



Fig. 8: Concept of rough comparison WTA algorithm.



Fig. 9: WTA circuit for new WTA algorithm. (a) WTA cell, (b) winner selector cell.

In order to investigate the effects of the proposed algorithm and hardware, the SOM hardware was designed in VHDL using the Xilinx Alliance tools and synthesized using the Leonardo Spectrum FPGA compiler. The hardware was then simulated using ModelSim SE, with a software simulation on a personal computer as a reference. Fig. 10 shows the calculation time as a function of the number of competitive units. Although the calculation time of the software simulation increases in proportion to the number of units, that of the designed hardware does not change. For large-scale maps, the difference in calculation time becomes prominent.

Table 1 compares the proposed SOM hardware with five other implementations, using the best reported learning performance of each in MCUPS (million connection updates per second). MANTRA I is a massively parallel system based on a bidimensional systolic array of up to 1600 processing elements (PEs) [14]; its best performance is 14 MCUPS. The PARNEU general-purpose partial tree computer has also been used to implement the SOM [15]; it achieves a best performance of 8 MCUPS with four processing units. Various implementations of the SOM have been presented on the CNAPS neurocomputer, and a best performance of 446 MCUPS has been reported for a CNAPS system with 512 processing elements [16]. The architecture of the NBX-VME system for the SOM is massively parallel [17]; it achieves a best performance of 1318 MCUPS when all neurons are updated during learning. The rapid prototyping system RAPTOR2000 can implement very large-scale neural networks, such as NBX-SOM [18]. It can be equipped with five Xilinx Virtex-E or Virtex-II FPGA modules; using Virtex-II XC2V6000-6 devices, it achieves a best performance of 12976 MCUPS [19]. In this comparison, the proposed hardware outperformed the other implementations in terms of MCUPS.



Fig. 10: Calculation time corresponding to the number of competing units.

Table 1: Comparison of calculation performance of various SOM hardware.

|       | MANTRA I | PARNEU | CNAPS | NBX  | RAPTOR2000 | Proposed |
|-------|----------|--------|-------|------|------------|----------|
| MCUPS | 14       | 8      | 446   | 1318 | 12976      | 22929    |
|       |          | 2.02   | 5900  |      |            |          |
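
The relative performance implied by Table 1 can be worked out directly from the listed MCUPS figures. The short sketch below only divides the numbers given in the table; the dictionary keys are labels chosen for this example, and the ratios are not measurements reported separately in the source.

```python
# Best learning performance in MCUPS, as listed in Table 1.
mcups = {
    "MANTRA I": 14,
    "PARNEU": 8,
    "CNAPS": 446,
    "NBX": 1318,
    "RAPTOR2000": 12976,
    "Proposed": 22929,
}

# Ratio of the proposed hardware's MCUPS to each of the other systems.
speedup = {name: mcups["Proposed"] / value
           for name, value in mcups.items() if name != "Proposed"}

for name, ratio in speedup.items():
    print(f"{name}: {ratio:.2f}x")
```

On these figures the proposed hardware is about 1.77 times faster than the closest competitor, RAPTOR2000.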

# V. SUMMARY

The role of our field, "Brain-Like Integrated Circuits," is to design brain-like hardware models based on the knowledge obtained in the other fields and to implement these models in dedicated VLSI devices. As recent research results of our field, this paper presented the pixel-parallel Gabor filter LSI for real-time image processing, the human-like feature extraction algorithm based on a basis function network scheme, and the self-organizing map hardware with a fast learning mechanism. These results are the first step of our project.

When the results of all fields in our COE program are synthesized organically, a brain-like information processor will be created. In order to develop brain-like dedicated VLSI, we must deal not only with mathematical (logic-based) models but also with biology-based, physiology-based and psychology-based models. In this respect, our field plays a very important role in our COE program. We must also step out of our own field into neuroscience for collaboration.

#### **ACKNOWLEDGMENTS**

A part of this work was supported by funds from MEXT via the Kitakyushu innovative cluster project, and also by a Grant-in-Aid for Scientific Research on Priority Areas (A). The LSI chip was fabricated in the chip fabrication program of VDEC in collaboration with Rohm Corporation and Toppan Printing Corporation.

#### REFERENCES

- [1] T. Morie, M. Miyake, S. Nishijima, M. Nagata, and A. Iwata, Proc. Int. Conf. on Neural Information Processing (ICONIP), pp. 613-617, Taejon, Korea, Nov. 2000.
- [2] T. Morie, M. Miyake, M. Nagata, and A. Iwata, Ext. Abs. of Int. Conf. on Solid State Devices and Materials (SSDM), pp. 90-91, Tokyo, Sept. 2001.
- [3] T. Nakano, T. Morie, and A. Iwata, SICE Annual Conference 2003, pp. 1418-1423, Fukui, Aug. 2003.
- [4] T. Nakano, H. Ando, H. Ishizu, T. Morie, and A. Iwata, 7th World Multiconference on Systemics, Cybernetics and Informatics (SCI 2003), volume IV, pp. 186-191, Orlando, July 2003.
- [5] H. Ando, M. Miyake, T. Morie, M. Nagata, and A. Iwata, IEICE Trans. Fundamentals., vol. E83-A, no. 2, pp. 329-336, 2000.
- [6] H. Ando, T. Morie, M. Miyake, M. Nagata, and A. Iwata, IEICE Trans. Fundamentals., vol. E85-A, no. 2, pp. 381-388, 2002.
- [7] H. Ando, T. Morie, M. Nagata, and A. Iwata, European Solid-State Circuits Conference (ESSCIRC), pp. 703-706, 2002.
- [8] T. Morie, M. Nagata, and A. Iwata, Proc. Int. Symp. on Nonlinear Theory and its Applications (NOLTA2001), pp. 371-374, Zao, Japan, Oct. 2001.
- [9] T. Morie, J. Umezawa, and A. Iwata, Intelligent Automation and Soft Computing, vol. 10, no. 2, pp. 95-104, 2004.
- [10] T. Morie, J. Umezawa, and A. Iwata, Digest of Technical papers, pp. 212-213, Honolulu, Hawaii, June 2004.
- [11] T. Morie, T. Nakano, J. Umezawa, and A. Iwata, World Automation Congress, #IFMIP075, Seville, Spain, June 2004.

- [12] K. Korekado, T. Morie, O. Nomura, H. Ando, T. Nakano, M. Matsugu, and A. Iwata, 7th Int. Conf. on Knowledge-Based Intelligent Information and Engineering Systems (KES'2003), volume II, pp. 169-176, Oxford, Sept. 2003.
- [13] T. Kohonen, *Biological Cybernetics*, Vol.43, pp.59-69, 1982.
- [14] Paolo Ienne, Patrick Thiran, and Nikolaos Vassilas, *IEEE Trans. on Neural Networks*, Vol.8, No.2, pp.315-330, 1997.
- [15] P. Kolinummi, P. Hämäläinen, T. Hämäläinen, and J. Saarinen, *Microprocessors and Microsystems*, vol.24, pp.23-42, 2000.
- [16] V. Pulkki and Taneli Harju, Proc. of International Conference on Artificial Neural Networks (ICNN'96), 1996.
- [17] Rüping, M. Porrmann, U. Rückert, *Neurocomputing*, vol. 21, pp. 31–50, 1998.
- [18] M. Porrmann, H. Kalte, U. Witkowski, J. Niemann and U. Rückert, Proc. of The 5th World Multi-Conference on Systemics, Cybernetics and Informatics, SCI 2001, pp.242-247, July, 2001.
- [19] M. Porrmann, U. Witkowski, H. Kalte and U. Rückert, Proc. of the 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing (PDP 2002), pp.243-250, Jan., 2002.
- [20] B. E. Shi, IEEE Trans. Circuits & Syst. I, vol. 45, no. 2, pp. 121-132, 1998.
- [21] T. Miki and T. Sato, Intelligent Automation and Soft Computing, vol.10, pp.69-78, 2004.
- [22] T. Miki and T. Sato, ICS 1269 International Symposia on Bio-Inspired Systems (Brain IT 2004), Elsevier, in press.
- [23] T. Yamakawa, E. Uchino and T. Samatsu, Proc. the 1994 IEEE Int. Conf. Neural Networks, pp.1391-1396, 1994.
- [24] T. Miki, T. Sato and S. Kawagoe, Technical Report of IEICE VLD 2001-125 (in Japanese), pp.25-30, 2002.
- [25] Ran, X. and Farvardin, N., IEEE Trans. Image Proc 4(4), April, pp.401-415, 1995.
- [26] T. Sato and T. Miki, World Automation Congress, #IFMIP080, Seville, Spain, June, 2004.
- [27] T. Miki, S. Yano and T. Sato, 19<sup>th</sup> Fuzzy System Symposium (in Japanese), Osaka, Japan, pp.783-786, 2003.
- [28] T. Miki and J. Tashiro, 20<sup>th</sup> Fuzzy System Symposium (in Japanese), Kitakyushu, Japan, #3E2-1, 2004.
- [29] H. Hayashi and K. Yoshii, SCIS & ISIS 2004, Yokohama, Sep., in press, 2004.