Pattern Recognition 34 (2001) 2173-2179
Pattern classification using multiple hierarchical overlapped self-organising maps

P.N. Suganthan*

School of Electrical and Electronic Engineering, Nanyang Technological University, Nanyang Avenue, Singapore 639798, Singapore

Received 12 November 1999; received in revised form 20 September 2000; accepted 20 September 2000
Abstract

In this paper, we describe techniques for designing high-performance pattern classification systems using multiple hierarchical overlapped self-organising maps (HOSOM) (Suganthan, Proceedings of the International Joint Conference on Neural Networks, WCCI'98, Alaska, 1998). The HOSOM model has one first-level SOM and several partially overlapping second-level SOMs. With this overlap, every training and test sample is classified by multiple second-level SOMs. Hence, the final classification decision can be made by combining these multiple classification decisions to obtain a better performance. In this paper, we use multiple HOSOMs, and each HOSOM is trained on a distinct input feature set extracted from the same data set. Since one HOSOM yields multiple classifications, these multiple HOSOMs generate a large number of classification decisions. To combine the individual classifications, we make use of the global winner as well as a winner for every class. Our experiments yielded a high recognition rate of 99.25% on the NIST19 numeral database. © 2001 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.

Keywords: Pattern classification; Self-organising maps; Multiple neural networks; Hierarchical self-organising maps; Numerals recognition
1. Introduction

The self-organising map (SOM) is one of the most versatile neural network paradigms. It has been applied to the study of complex problems such as vector quantisation, speech recognition, combinatorial optimisation, control, pattern recognition and modelling of the structure of the visual cortex [1]. However, its performance in classification applications has not been competitive when compared with other supervised networks. A novel classifier called the hierarchical overlapped self-organising map (HOSOM) [2], based on Kohonen's algorithms, was developed recently. The HOSOM has one base SOM. Second-level SOMs are grown, one from each base-level SOM neuron. Each second-level SOM is trained using a subset of the whole training sample set. These subsets may have some common training samples among them,
* Tel.: +65-790-5404; fax: +65-792-0415. E-mail address:
[email protected] (P.N. Suganthan).
thereby yielding a degree of overlap in the second-level SOMs. This overlap also enables us to obtain multiple classifications for every data sample. So far, the HOSOM has offered the best performance of any SOM-based classifier on the numerals classification problem when tested on the NIST19 database. Further, the HOSOM can be regarded as an efficient alternative to the k-NN algorithm.

One of the primary objectives of research in the field of pattern classification is to improve the recognition accuracy. To this end, several recognition systems have been developed. As these recognition systems make use of different input features and classification methodologies, usually their errors are of different types or independent of each other. Hence, by employing a number of independent classifiers, the overall performance of the pattern classification system can be improved. In fact, in recent years, several researchers have employed the multiple classifier combination method [3-9]. We briefly review some of the related work below.

Cho used three different feature sets [3,4], namely character matrix, contour direction and Kirsch
directional edges. These features are used to train three different multilayer perceptrons using the backpropagation algorithm. The outputs from these networks are then combined using the fuzzy fusion method [4] and genetic algorithms [3]. Ha et al. [5] used two different feature sets to train two multilayer perceptrons using the backpropagation algorithm. The feature sets were the contour features and the combination of projection, profile and black-to-white transition features. The outputs of these two networks were accumulated. Rogova [9] also employed three multilayer perceptrons trained using the backpropagation algorithm and three different feature sets. Rogova used the Dempster-Shafer theory of evidence to combine these three classifiers.

In this paper, we employ multiple HOSOMs to obtain high-performance pattern classifiers. Each HOSOM is trained using a distinct input feature set. Since every HOSOM yields multiple classifications, these multiple HOSOMs generate a large number of classification decisions, which are combined to yield the final classification. When the SOM or learning vector quantisation (LVQ) is used as a classifier, only the label of the winner neuron is used in making the classification decision. In this work, we make use of the global winner as well as a winner for every class to make the final classification decision. Our results consistently showed an improved performance when we made use of the global winner as well as a winner for every class over the decision process based solely on the global winner. We obtained a high recognition rate of 99.25% on the NIST19 numeral database.

In the next section, we explain the HOSOM algorithm and the decision combination method. In Section 3, the feature extraction procedures are described. In Section 4, experimental results are presented. The paper is concluded in Section 5.
2. The HOSOM

Most of the work carried out so far on SOMs has concentrated on systems with a single self-organising layer and a fixed number of neurons. The notable exceptions are the work of Bauer et al. [10], Cho [4] and Fritzke [11], who proposed structure adaptive SOMs for two-dimensional and multi-dimensional topologies. We review some of them below.

2.1. Structure adaptive SOMs

Lee et al. [12] proposed a self-development neural network for adaptive vector quantisation. The network has one self-organising layer. It has two levels of adaptation, namely the structure and parameter (synaptic vector) levels. Kohonen's topology preserving self-organisation algorithm was used for parameter adaptation. Structure adaptation includes neuron generation, neuron annihilation, and neuron merging. If the distortion error
Fig. 1. The structure of the HOSOM network.
of a neuron exceeds a predefined threshold, the neuron is split to generate another neuron in its vicinity. If a neuron is referenced infrequently, the neuron is removed. If two neurons are similar, they are merged into one neuron. However, maintaining a regular connectivity has proved to be a difficult task during structure adaptation [12]. Wu et al. [13] investigated a supervised SOM with two self-organising layers. However, their method did not employ any structure adaptation scheme. Cho [4] proposed a single self-organising layer structure adaptive SOM. The network is first initialised with a 4×4 map and trained using the SOM algorithm [2]. The map is labelled using training data, and any node which does not produce a unique class label is replaced with a 2×2 submap. If a node does not get activated for a long time, the node is deleted. The resulting map has an irregular connectivity. Bauer et al. [10] presented a growing self-organising map (GSOM) algorithm. The GSOM has a general hypercubical shape which is adapted during learning. Bauer et al. [10] also showed that the GSOM produces maps which preserve neighbourhoods in a nearly optimal fashion. Fritzke [11] proposed a structure adaptive SOM algorithm which has one self-organising layer with sophisticated multi-dimensional lattice topologies. The algorithm includes cell insertion and removal based on reference frequency.

Although several hierarchical and structure adaptive SOM models have been proposed [2,11-15], the novelty of our approach is to allow a degree of overlap between adjacent higher-level SOMs, as conceptually shown in Fig. 1 by the overlap between the two second-level SOMs grown out of base-level neurons A and B. This overlapped structure possesses several beneficial properties. Further, our experiments showed that the classification performance of the HOSOM is far better than that of any other SOM-based classifier.
2.2. The hierarchical overlapped SOMs

The overlapping is achieved by using every training sample to train N upper-level SOMs. These N upper-level SOMs are grown from the N lower-level SOM neurons which are the N best matching nodes for that particular training sample. Here N is the degree of overlap in the training samples. That is, the winning neuron as well as the (N-1) runner-up neurons make use of the same training sample to train the higher-level SOMs grown from them. By duplicating the training samples in the upper-level SOMs, we obtain multiple overlapped SOMs. The testing samples are duplicated to a lesser degree, for example (N-4) times. Hence, the testing samples fit well inside the feature maps developed using the best matching and (N-1) runner-up nodes on the training data. This duplication of samples within one HOSOM, together with the use of multiple HOSOMs, generates a large number of classification decisions for every sample and subsequently allows us to employ one of several classification combination schemes to obtain the final classification. The proposed multiple HOSOMs-based classification system is shown in Fig. 2.

Traditionally, only the labels of the winner neurons are considered to obtain the final classification when the SOM or the LVQ is used as a classifier. In our study, we obtain a confidence value for every sample for its membership in every class using the expression DIST_w / DIST_wi. That is, we define the ratio between the global minimum distance (DIST_w) and the minimum distance for every class (DIST_wi denotes the smallest distance between the sample pattern and a neuron with label i) as the confidence value for the pattern to belong to the ith class. Naturally, the class which has the global minimum distance will yield a confidence value of 1. These confidence values generated by the multiple HOSOM classifiers can be accumulated to obtain the final classification decision. For example, if we have an overlap of 9 for the training samples, each HOSOM generates nine sets of confidence values. As we employ three HOSOMs, we obtain 27 sets of confidence values for each training sample. In our study, we simply accumulate them and choose the label with the highest aggregate as the final classification. Similarly, if we have an overlap of 5 for the testing samples, we obtain 15 sets of confidence values from the three HOSOMs (refer to the third row in Table 4, with overlaps (9,5)). We believe more sophisticated decision fusion techniques [4,8,9,16], such as fuzzy decision fusion or the Dempster-Shafer theory of evidence, may be employed to improve the final decision based on these multiple classifiers.
Fig. 2. The proposed multiple HOSOMs-based classification system.
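To make the confidence measure and the accumulation rule described above concrete, the following is a minimal NumPy sketch; the function and variable names are ours and not from the paper, and the tie/zero-distance handling is an assumption.

```python
import numpy as np

def class_confidences(sample, prototypes, labels, n_classes):
    # Distances from the sample to every labelled neuron of one second-level SOM.
    dists = np.linalg.norm(prototypes - sample, axis=1)
    dist_w = dists.min()                         # global minimum distance DIST_w
    conf = np.zeros(n_classes)
    for i in range(n_classes):
        class_dists = dists[labels == i]         # distances to neurons labelled i
        if class_dists.size:
            d_i = class_dists.min()              # DIST_wi for class i
            # confidence_i = DIST_w / DIST_wi; the winning class gets 1.0
            conf[i] = 1.0 if d_i == dist_w else dist_w / d_i
    return conf

def combine_decisions(confidence_sets):
    # Accumulate all confidence vectors (e.g. 3 HOSOMs x overlap 5 = 15 vectors
    # per test sample) and return the label with the highest aggregate.
    return int(np.argmax(np.sum(confidence_sets, axis=0)))
```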
2.3. Training the HOSOM

Having explained the structure and the classification combination in detail, we now briefly discuss the training of an HOSOM. For the initial adaptation of the synaptic weight vectors of an SOM, we employ Kohonen's standard topology preserving map algorithm [2], summarised in Table 1. Having completed the SOM learning, the neurons are labelled using a simple voting mechanism. Then the supervised LVQ 2 learning algorithm [2] is applied a number of times to fine-tune the prototype vectors. In the LVQ 2 training stage, if the winning neuron has the same label as the training sample, we update the weights of only the winning neuron (i_w, j_w). If the winning neuron has a different label, then we (a) update the weights of only the winning neuron, using a small negative learning rate; and (b) locate the closest neuron with the same label as the training sample and update its weights. The LVQ 2 algorithm is shown in Table 2. The training and structure adaptation of the HOSOM is given in Table 3. We apply the following structure adaptation techniques once in between the supervised LVQ 2 training iterations.

1. Removing a neuron: If a neuron is inactive for a period of time, that neuron is removed from the lattice. A neuron may be considered inactive if it is not chosen frequently as the winner over a finite time interval.

2. Merging neurons: The merging operation is essential in particular in the final layer, which is not to be grown further. It should be noted that merging of neurons improves the performance on cross-validation and test data sets in case the network has been over-trained or over-specialised [2,13]. We employ a simple scheme: if an end-node neuron represents only a few (in our experiments, fewer than 3) training samples, that neuron is merged with another neuron which is the closest one with the same label. If there is no other neuron with the same label, the neuron is removed.
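The two structure adaptation operations can be sketched as below. The paper does not specify the exact merge rule or the inactivity threshold, so the count-weighted average and the "zero wins" criterion used here are assumptions.

```python
import numpy as np

def prune_and_merge(weights, labels, win_counts, sample_counts, min_samples=3):
    # 1. Removing a neuron: drop neurons never chosen as the winner over the
    #    last training interval (threshold of zero wins is an assumption).
    keep = win_counts > 0
    weights, labels, sample_counts = weights[keep], labels[keep], sample_counts[keep]

    # 2. Merging neurons: an end-node neuron representing fewer than
    #    `min_samples` training samples is merged with the closest neuron of
    #    the same label, or removed if no such neuron exists.
    alive = np.ones(len(weights), dtype=bool)
    for j in np.where(sample_counts < min_samples)[0]:
        candidates = np.where((labels == labels[j]) & alive)[0]
        candidates = candidates[candidates != j]
        if candidates.size == 0:
            alive[j] = False                      # no same-label neuron: remove it
            continue
        k = candidates[np.argmin(np.linalg.norm(weights[candidates] - weights[j], axis=1))]
        total = sample_counts[j] + sample_counts[k]
        # Count-weighted average merge: an assumed realisation of "merging".
        weights[k] = (sample_counts[k] * weights[k] + sample_counts[j] * weights[j]) / total
        sample_counts[k] = total
        alive[j] = False
    return weights[alive], labels[alive], sample_counts[alive]
```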
Table 1
The unsupervised SOM algorithm

USOM1: Initialise the weights for the given size map. First-layer weights are randomly initialised. Subsequent layers are initialised around the root node. Initialise the learning rate parameter and neighbourhood size, and set the number of unsupervised learning iterations.
USOM2: Present the input feature vector x = [x_1, x_2, ..., x_n, ..., x_N] in the training data set of the root neuron, where x_n is the nth element in the feature vector.
USOM3: Determine the winner node c such that ||x - w_c|| = min_i ||x - w_i||.
USOM4: Update the weights, w_i, within the neighbourhood of node c, N_c(t), using the standard update rule: w_i(t+1) = w_i(t) + alpha(t)[x - w_i(t)], where i is in N_c(t). The neighbourhood wraps around at the edges, i.e. column and row indices are in modulo representation.
USOM5: Update the learning rate, alpha(t), and the neighbourhood size, N_c(t): alpha(t+1) = alpha(0)(1 - t/K); N_c(t+1) = N_c(0)(1 - t/K), where K is a constant usually set equal to the total number of iterations in the self-organising phase.
USOM6: Repeat USOM2-USOM5 for the specified number of unsupervised learning iterations.
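A compact NumPy sketch of steps USOM1-USOM6 follows. The rectangular toroidal lattice with a square neighbourhood and the random sample presentation order are our assumptions where Table 1 leaves the details open.

```python
import numpy as np

def train_som(data, rows, cols, alpha0=0.6, n_iter=None, rng=np.random):
    n_iter = n_iter or 6 * len(data)                 # ~6x the number of samples (Section 4)
    weights = rng.rand(rows * cols, data.shape[1])   # USOM1: random initialisation
    r, c = np.divmod(np.arange(rows * cols), cols)   # lattice coordinates of each neuron
    radius0 = 2.0 * max(rows, cols) / 3.0            # initial neighbourhood: 2/3 of map size

    for t in range(n_iter):
        x = data[t % len(data)]                      # USOM2: present a training sample
        winner = np.argmin(np.linalg.norm(weights - x, axis=1))   # USOM3: winner node c
        alpha = alpha0 * (1.0 - t / n_iter)          # USOM5: linearly decaying rate
        radius = radius0 * (1.0 - t / n_iter)        # ... and neighbourhood size
        # USOM4: toroidal lattice distance to the winner (indices wrap at edges)
        dr = np.minimum(np.abs(r - r[winner]), rows - np.abs(r - r[winner]))
        dc = np.minimum(np.abs(c - c[winner]), cols - np.abs(c - c[winner]))
        inside = np.maximum(dr, dc) <= radius        # square neighbourhood N_c(t)
        weights[inside] += alpha * (x - weights[inside])
    return weights
```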
Table 2
The supervised LVQ 2 learning algorithm for the SOM

SSOM1: Present the input feature vector x = [x_1, x_2, ..., x_n, ..., x_N] in the training data set.
SSOM2: Locate the winner node c such that ||x - w_c|| = min_i ||x - w_i||.
SSOM3: If the winning neuron has the same label as the training sample, update the weights of the winning neuron only, using the standard update rule: w_c(t+1) = w_c(t) + alpha[x - w_c(t)]. If the winning neuron has a different label, (a) update the weights of the winning neuron only, using a small negative learning rate alpha' as follows: w_c(t+1) = w_c(t) + alpha'[x - w_c(t)]; and (b) locate the closest neuron with the same label as the training sample and update its weights using the update equation with the positive learning rate alpha.
SSOM4: Repeat SSOM1-SSOM3 for the specified number of supervised learning iterations.
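A matching sketch of the supervised fine-tuning pass SSOM1-SSOM4 is given below; the numerical value of the small negative learning rate is not given in the paper and is an assumption here.

```python
import numpy as np

def lvq2_finetune(weights, labels, data, targets, n_iter, alpha=0.1, alpha_neg=-0.02):
    for t in range(n_iter):
        x, y = data[t % len(data)], targets[t % len(data)]   # SSOM1: present a sample
        dists = np.linalg.norm(weights - x, axis=1)
        c = np.argmin(dists)                                 # SSOM2: winner neuron
        if labels[c] == y:
            weights[c] += alpha * (x - weights[c])           # pull the correct winner closer
        else:
            weights[c] += alpha_neg * (x - weights[c])       # push the wrong winner away
            same = np.where(labels == y)[0]
            if same.size:                                    # SSOM3(b): nearest correct neuron
                k = same[np.argmin(dists[same])]
                weights[k] += alpha * (x - weights[k])
    return weights
```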
Table 3
The HOSOM algorithm

HOSOM1: Apply USOM (Table 1).
HOSOM2: Label all output nodes using a voting scheme.
HOSOM3: Apply SSOM (Table 2).
HOSOM4: Merge/remove neurons.
HOSOM5: Apply SSOM (Table 2).
HOSOM6: Obtain recognition rates on the training data.
HOSOM7: Grow an additional layer and repeat HOSOM1-HOSOM6 until a satisfactory recognition rate is achieved OR the maximum complexity level is reached.
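The growing step HOSOM7 relies on the overlap mechanism of Section 2.2: every training sample is routed to the second-level SOMs grown from its N best matching first-level neurons. A minimal sketch of that routing (names are ours) follows.

```python
import numpy as np

def assign_with_overlap(data, first_level_weights, n_overlap=9):
    # For each first-level neuron, collect the indices of the samples that will
    # be used to train its second-level SOM: the winner and the (N-1) runners-up
    # for every sample all receive a copy of that sample.  For test samples a
    # smaller overlap (e.g. N - 4) would be used.
    subsets = [[] for _ in range(len(first_level_weights))]
    for idx, x in enumerate(data):
        dists = np.linalg.norm(first_level_weights - x, axis=1)
        for node in np.argsort(dists)[:n_overlap]:
            subsets[node].append(idx)
    return subsets
```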
3. Feature extraction for the handwritten digit recognition application

We performed handwritten digit recognition experiments to test the proposed classification method. The handwritten digit samples were extracted from NIST database 19, from the files by_class/30/hsf_0.mis through to by_class/39/hsf_0.mis. We employed three HOSOMs trained using three different feature vectors, namely (a) the normalised character matrix, (b) contour profiles, projections and black-to-white transition counts, and (c) Kirsch's edge features. Prior to extracting the features, each character image was preprocessed. Preprocessing involved two steps. First, vertical and horizontal projections were obtained to detect isolated blobs. All isolated blobs whose size was 15% or less of that of the biggest blob were removed, and isolated 1 and 0 pixels were also removed. Then the character image was centered. Having preprocessed the images, we applied the following three feature extraction methods.

3.1. Global features

The first feature set is simply the character image, named the global feature. To extract the normalised character matrix, a feature extraction algorithm similar to the one proposed by Wu et al. [13] is used. The algorithm is summarised as follows: (a) each centered character image is rescaled to 88×72 pixels; (b) each scaled image is divided into 8×8 blocks and the pixel values in each block are summed; and (c) the result is an 11×9 image with pixel values in the range [0, 64]. This procedure yields a feature vector with 99 elements.
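A minimal NumPy sketch of this block-summing step is shown below; the function name is ours, and it assumes the caller has already rescaled and centered the binary (0/1) character image.

```python
import numpy as np

def global_features(image):
    # image: 88x72 binary (0/1) character image, already rescaled and centered.
    assert image.shape == (88, 72)
    blocks = image.reshape(11, 8, 9, 8)          # (block_row, 8, block_col, 8)
    return blocks.sum(axis=(1, 3)).ravel()       # 11 x 9 = 99 block sums in [0, 64]
```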
3.2. Projection features

The second feature set consists of (a) the contour profiles (a contour profile is defined to be the number of white pixels from the bounding box to the first black pixel on the character image in a given direction, and is obtained in four directions), (b) projections in the horizontal, vertical, left diagonal and right diagonal directions, and (c) black-to-white transition counts in the horizontal and vertical directions. These features were extracted from the centered character image, which was normalised to 40×40. The extracted features are reduced to a feature vector 100 elements long (40 profile, 40 projection and 20 transition features) by appropriately accumulating the three individual feature sets.

3.3. Directional edge features

The third feature set is the Kirsch directional edges [4]. The Kirsch features are obtained using the following equations:

G(i, j) = max{1, max_k [5 S_k - 3 T_k]},                                  (1)

where

S_k = A_k + A_{k+1} + A_{k+2},                                            (2)

T_k = A_{k+3} + A_{k+4} + A_{k+5} + A_{k+6} + A_{k+7}.                    (3)

Here G(i, j) is the gradient of pixel (i, j), and A_k, A_{k+1}, etc., represent pixel (i, j)'s eight neighbours, as shown in Fig. 3, with subscripts counted modulo 8. We obtain horizontal, vertical, right diagonal and left diagonal Kirsch edges from centered binary images which are normalised to 20×20. These four edge images are sub-divided into 5×5 images, the edge values are accumulated within each sub-division independently, and the results are concatenated to obtain feature vectors 100 (4×5×5) elements long. The horizontal (H), vertical (V), left diagonal (L) and right diagonal (R) edge values of a pixel are each obtained as

G_d(i, j) = max[5 S_k - 3 T_k, 5 S_k' - 3 T_k'],   d in {H, V, L, R},     (4)-(7)

where k and k' index the two Kirsch masks associated with direction d.

Fig. 4. Sample masks used to compute horizontal and right diagonal edges.
Fig. 3. Eight neighbors of pixel (i, j).
The masks used to obtain horizontal and right diagonal edge features are shown in Fig. 4.
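A NumPy sketch of this directional edge extraction for a 20×20 binary image follows. The circular neighbour ordering and the pairing of the eight mask responses into the four directions are assumptions; only the general form 5*S_k - 3*T_k of Eqs. (1)-(3) is taken from the text, and the precise mask indices are given only by Fig. 4 and Ref. [4].

```python
import numpy as np

def kirsch_directional_features(image):
    assert image.shape == (20, 20)
    padded = np.pad(image.astype(float), 1)      # zero border for edge pixels
    # Eight neighbours A_0..A_7 of every pixel, in one assumed circular order.
    offsets = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]
    A = np.stack([padded[1 + di:21 + di, 1 + dj:21 + dj] for di, dj in offsets])
    S = np.stack([A[k] + A[(k + 1) % 8] + A[(k + 2) % 8] for k in range(8)])
    T = A.sum(axis=0) - S                        # T_k = sum of the remaining five neighbours
    G = 5 * S - 3 * T                            # responses of the eight Kirsch masks
    # One edge image per direction: max of two opposite masks (assumed pairing).
    direction_pairs = {"H": (2, 6), "V": (0, 4), "R": (1, 5), "L": (3, 7)}
    feats = []
    for k1, k2 in direction_pairs.values():
        edge = np.maximum(G[k1], G[k2])
        blocks = edge.reshape(5, 4, 5, 4).sum(axis=(1, 3))   # 5x5 sub-divisions
        feats.append(blocks.ravel())
    return np.concatenate(feats)                 # 4 x 25 = 100-element feature vector
```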
4. Experimental results

In our experiments, we restricted the number of layers to 2. The first layer has a 15×15 lattice. The initial number of neurons in the second layer is determined by the number of training samples available for training that map, using the expression max[5, sqrt(Number_of_training_samples/8)]. The initial neighbourhood size is set to two-thirds of the map size for all SOMs. The number of iterations needed to form the topological map using the unsupervised SOM learning is set to around six times the number of training samples. The learning rates during map formation and LVQ 2 learning are set to 0.6 and 0.1, respectively. The training involves applying the unsupervised SOM and LVQ 2 algorithms once to train the first-layer SOM, and applying the same combination 15×15 times to train the 225 second-level overlapped SOMs, to obtain the two self-organising layer HOSOM network. The data sets have 23,000 training samples and 23,000 testing samples. We experimentally determined that the overlap in testing samples should be about four less than the overlap in the training samples to obtain optimum performance from a HOSOM. It is also obvious from Table 4 that overlaps of 9 and 5 for the training and testing data, respectively, yield the best performance.
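For reference, these settings can be collected in one place as below; the dictionary layout is ours, and the expression max[5, sqrt(n/8)] is interpreted here as the initial number of second-level neurons, which the text leaves open.

```python
import math

def hosom_settings(n_training_samples):
    return {
        "first_layer_lattice": (15, 15),
        "initial_second_layer_neurons": max(5, int(math.sqrt(n_training_samples / 8))),
        "initial_neighbourhood": "two-thirds of the map size",
        "unsupervised_iterations": 6 * n_training_samples,   # ~6x the samples
        "som_learning_rate": 0.6,
        "lvq2_learning_rate": 0.1,
        "training_overlap": 9,
        "testing_overlap": 9 - 4,   # about four less than the training overlap
    }
```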
Table 4
Recognition results (%) on test data using the global winner and a winner neuron for each class

Degree of overlap       Feature sets   Feature sets   Feature sets   All 3
(training, testing)     1 and 2        1 and 3        2 and 3        feature sets
(7,3)                   99.01          99.05          98.98          99.12
(9,5)                   99.07          99.14          99.08          99.25
(11,7)                  99.13          99.12          99.05          99.22
We performed comparative experiments using two alternative classifier combination schemes, one using the global winner only and the other using, in addition, one winner for each class. Our experiments consistently showed that a better performance was obtained when we combined the class labels and confidence values obtained from the global winner and a winner for each class. Combining classifications on the basis of the winner neurons' labels alone yielded a lower recognition rate. However, it should be mentioned that both schemes yielded classification rates within a tight range of 0.15%. The classification accuracies on the testing samples, obtained using the better classifier combination method (namely, the global winner and a winner for each class), are shown in Table 4.

The best results obtained using the SOM and its variants (to our knowledge) are as follows: Wu et al. [13]: 97.3% and Cho [14]: 96.05%. The recognition rate of our method is far superior to that of other SOM-based classifiers, due to the usage of multiple HOSOMs. The two-level SOM of Wu et al. does not have any structure adaptation properties, nor any overlap in the second level. The structure adaptive SOM of Cho has structure adaptation capabilities, but does not have any overlap in the second level. As these two SOM-based classifiers do not have any overlap in the second level, they yield just one class label for every data sample. To our knowledge, only the recognition rates of Ha et al. [5] are better than ours. They used almost twice the number of training samples and trained two different feedforward neural networks using two different feature sets and the backpropagation algorithm. They fused the decisions of these two neural networks and obtained a recognition rate of 99.34%.
5. Conclusion

In this paper, we have presented a robust pattern classification system based on multiple HOSOMs. We have also investigated alternative schemes for combining the multiple classifications generated by the HOSOMs and showed that, by exploiting the information captured by the global winner as well as the winner neuron for each class, the performance can be improved. The developed network, the HOSOM, was tested on the handwritten numerals recognition problem and obtained the best results reported for any SOM-based classifier and the second-best results reported by any method.
6. Summary

The self-organising map is one of the most versatile neural network paradigms. It has been applied to the study of complex problems such as vector quantisation, speech recognition, combinatorial optimisation, control, pattern recognition and modelling of the structure of the visual cortex [1]. However, its performance in classification applications has not been competitive when compared with other supervised networks. A novel classifier called the HOSOM [2], based on Kohonen's algorithms, was developed recently. The HOSOM has one base SOM. Second-level SOMs are grown, one from each base-level SOM neuron. Each second-level SOM is trained using a subset of the whole training sample set. These subsets may have some common training samples among them, thereby yielding a degree of overlap in the second-level SOMs. This overlap also enables us to obtain multiple classifications for every data sample.

One of the primary objectives of research in the field of pattern classification is to improve the recognition accuracy. To this end, several recognition systems have been developed. As these recognition systems make use of different input features and classification methodologies, usually their errors are of different types or independent of each other. Hence, by employing a number of independent classifiers, the overall performance of the pattern classification system can be improved. In fact, in recent years, several researchers have employed the multiple classifier combination method [3-9].

In this paper, we also employ multiple HOSOM classifiers (namely three) to obtain high-performance pattern classifiers. These HOSOMs are trained using distinct input feature sets, namely (a) the character matrix, (b) projections, contour profiles and black-to-white transitions and (c) Kirsch directional edges. Since every HOSOM yields multiple classifications, the three HOSOMs generate a large number of classification decisions, which are combined to yield the final classification. When the SOM or the LVQ is used as a classifier, only the label of the winner neuron is used in making the classification decision. In this work, we make use of the global winner as well as a winner for every class to make the final classification decision. Our results consistently showed an improved performance when we made use of
the global winner as well as a winner for every class over the decision process based solely on the global winner. We obtained a high recognition rate of 99.25% on the NIST19 numeral database. Further, the HOSOM offered the best performance of any SOM-based classifier on the numerals classification problem on the NIST19 database, and it can be regarded as an efficient alternative to the k-NN algorithm.

References

[1] T. Kohonen, Self-Organising Maps, Springer, Berlin, 1995; T. Kohonen, The self-organising map, Proc. IEEE 78 (9) (1990) 1464-1480.
[2] P.N. Suganthan, Structure adaptive multilayer overlapped SOMs with partial supervision for handprinted digit classification, Proceedings of the International Joint Conference on Neural Networks, WCCI'98, Alaska, May 1998.
[3] S.B. Cho, Pattern recognition with neural networks combined by genetic algorithm, Fuzzy Sets and Systems 103 (1999) 339-347.
[4] S.B. Cho, Neural-network classifiers for recognising totally unconstrained handwritten numerals, IEEE Trans. Neural Networks 8 (1) (1997) 43-53.
[5] T.M. Ha, H. Bunke, Off-line, handwritten numeral recognition by perturbation method, IEEE Trans. Pattern Anal. Mach. Intell. 19 (5) (1997) 535-539.
[6] T.K. Ho, J.J. Hull, S.N. Srihari, Decision combination in multiple classifier systems, IEEE Trans. Pattern Anal. Mach. Intell. 16 (1) (1994) 66-75.
[7] Y.S. Huang, C.Y. Suen, A method of combining multiple experts for the recognition of unconstrained handwritten numerals, IEEE Trans. Pattern Anal. Mach. Intell. 17 (1) (1995) 90-94.
[8] J. Kittler, M. Hatef, R.P.W. Duin, J. Matas, On combining classifiers, IEEE Trans. Pattern Anal. Mach. Intell. 20 (3) (1998) 226-239.
[9] G. Rogova, Combining the results of several neural network classifiers, Neural Networks 7 (5) (1994) 777-781.
[10] H.-U. Bauer, T. Villmann, Growing a hypercubical output space in a self-organising feature map, IEEE Trans. Neural Networks 8 (2) (1997) 218-226.
[11] B. Fritzke, Growing cell structures - a self-organising network for unsupervised and supervised learning, Neural Networks 7 (9) (1994) 1441-1460.
[12] T.C. Lee, A.M. Peterson, Adaptive vector quantisation using a self-development neural network, IEEE J. Selected Areas Commun. 8 (8) (1990) 1458-1471.
[13] J. Wu, H. Yan, A. Chalmers, Handwritten digit recognition using two-layer self-organising maps, Int. J. Neural Systems 5 (4) (1994) 357-362.
[14] M. Herrmann, R. Der, G. Balzuweit, Hierarchical feature maps and non-linear component analysis, Proceedings of ICNN'96, Vol. 2, Texas, USA, 1996, pp. 1390-1394.
[15] J. Lampinen, E. Oja, Clustering properties of hierarchical self-organising maps, J. Math. Imaging Vision 2 (3) (1992) 261-272.
[16] T. Denoeux, A k-nearest neighbor classification rule based on Dempster-Shafer theory, IEEE Trans. Systems Man Cybernet. 25 (1995) 804-813.
About the Author - P.N. SUGANTHAN received the B.A. degree, Postgraduate Certificate and M.A. degree in Electrical and Information Engineering from the University of Cambridge, UK in 1990, 1992 and 1994, respectively. He obtained his Ph.D. degree from the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore. He was a predoctoral Research Assistant in the Department of Electrical Engineering, University of Sydney in 1995-1996 and a lecturer in the Department of Computer Science and Electrical Engineering, University of Queensland in 1996-1999. Since July 1999, he has been an Assistant Professor in the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore. His research interests include neural networks, pattern recognition, computer vision, genetic algorithms, and fuzzy systems. He is a senior member of the IEEE and an associate member of the IEE.