Sensors and Actuators B 122 (2007) 493–502
Chiral behavior of TGS gas sensors: Discrimination of the enantiomers by the electronic nose K. Brudzewski a,∗ , J. Ulaczyk b , S. Osowski c,d , T. Markiewicz c a
Department of Chemistry, Warsaw University of Technology, ul. Noakowskiego 3, Warsaw, Poland b Department of Physics, Warsaw University of Technology, ul. Koszykowa 75, Warsaw, Poland c Department of Electrical Engineering, Warsaw University of Technology, Koszykowa 75, Warsaw, Poland d Department of Electronic Engineering, Military University of Technology, Kalickiego 1, Warsaw, Poland Received 16 March 2006; received in revised form 14 June 2006; accepted 15 June 2006 Available online 28 July 2006
Abstract An electronic nose measurement system, composed of a TGS gas sensor array cooperating with a support vector machine (SVM), is proposed in the paper to discriminate between enantiomeric odor pairs: the (+) and (−) forms of α-pinene, carvone and limonene. The array of semiconductor TGS gas sensors responds with a signal pattern characteristic of each enantiomer type, and the SVM classifier is responsible for the recognition of this pattern. This study demonstrates that the ability to recognize these enantiomers can be explained by the chiral behavior of the TGS gas sensors. The chirality of these sensors can be linked with chirally selective adsorption on the metal oxide crystal surfaces. The ability of some inorganic crystalline surfaces to adsorb chiral molecules selectively is widespread in nature. © 2006 Elsevier B.V. All rights reserved. Keywords: TGS gas sensors; Electronic nose; Enantiomers; Chiral recognition
1. Introduction Chiral recognition of substances, i.e. the ability to distinguish a molecular structure from its mirror image is one of the most important and widespread principles of biological activity. The property of chirality has profound effects in physics, chemistry and biology, ranging from parity violations for weak forces, to the exclusive use of one mirror-form of amino acids by all lifeforms on the earth. Crystallization is still an important means for separating chiral molecules into their two different mirror-image isomers (enantiomers) [1]. Crystalline chiral surfaces offer fascinating possibilities in the fields of technology and sensor devices. The problem is difficult, because there are no firm rules to predict whether a particular pair of chiral partners will follow the behavior of the vast majority of chiral molecules and crystallize together as racemic crystals, or as the separate enantiomers. A simpler problem occurs when crystallization is in two dimensions, such as a formation of the surface structures by the adsorbed molecules.
∗
Corresponding author. E-mail address:
[email protected] (K. Brudzewski).
0925-4005/$ – see front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.snb.2006.06.021
Moreover, molecular species that are achiral as isolated species in the gas phase may become chiral when chemisorbed on a surface. A racemic mixture of adsorbed chiral species may, under suitable conditions, undergo chiral separation on the surface into ordered one- or two-dimensional structures, which can be observed by scanning tunneling microscopy [2–5]. The selective adsorption of chiral organic molecules onto metal or chiral mineral surfaces is widespread in nature [6–8]. On the other hand, many surface structures have been found to be racemic [9–12]. Here we show that TGS gas sensors are able to discriminate the pairs of enantiomers. An aroma identification system (electronic nose) has been applied in this work for the recognition of the enantiomer pairs. The system is based on an array of semiconductor tin dioxide gas sensors and neural processing algorithms [13,14]. The multi-sensor system employs metal oxides as the sensing materials. The combination of sensor arrays and pattern recognition techniques has been used in the paper to analyse the response of tin oxide sensor arrays to enantiomers. Tin oxide semiconductor chemical sensors use the change of their electrical resistance to detect reducing vapors. The porous tin oxide contains oxygen vacancies in the lattice. Electrons that can be thermally activated are trapped in these vacancies. The tin
oxide can be used as a gas sensor because the number of electrons in the conduction band is determined by the adsorption and desorption of gaseous species at its surface. The chemical process involved in the sensing mechanism is the chiral adsorption of the gaseous molecules on the crystalline surfaces and the reduction reaction of these molecules with the adsorbed oxygen atoms. As a result of the adsorption and reaction, the sensor resistance decreases due to the release of the electrons trapped in the oxygen adsorbates. The applied sensors are inexpensive, easy to control and remarkably sensitive to a wide spectrum of reducing vapors. Hence they can be used for the recognition of many different enantiomer materials.

2. Description of the electronic nose system and test procedure

In our computerized measurement system, we have applied an array of several tin oxide-based gas sensors from Figaro Engineering Inc. These sensors were mounted in an optimised test chamber. The chamber is placed in a mass-flow-controlled measurement system with laminar gas flow and controlled gas temperature. The carrier gas (synthetic air) was used to deliver the atmosphere from the head-space of the sample chamber containing the tested enantiomer sample to the sensors. The carrier flow, the temperature, the volume of the enantiomer sample and the volume of the measuring chamber were kept constant throughout the measurements. The features used for enantiomer data analysis were extracted from the averaged temporal series of sensor resistances R(j), one for each jth sensor of the array. In order to produce consistent data for the pattern recognition process, some preprocessing of the data from the sensor array is necessary. As the diagnostic feature we have used the relative variation r(j) of each sensor resistance:

$$ r(j) = \frac{R(j) - R_0(j)}{R_0(j)} \qquad (1) $$
where R(j) is the actual resistance of the jth sensor in the array and R0(j) represents the baseline value of the resistance. As the reference we have used the baseline values of the sensor resistances measured in the synthetic air atmosphere. Application of expression (1) provides automatic normalization of the signals. The sensor array used in all enantiomer recognition experiments was composed of seven tin oxide-based gas sensors (TGS813, TGS830, TGS822, TGS825, TGS824, TGS842, and TGS822 with heater voltage VH = 4.5 V) from Figaro Engineering Inc., mounted in an optimised test chamber. The measurements performed for the classification of different enantiomer types were carried out under the following experimental conditions: carrier flow, 0.2 l/min; enantiomer temperature, 25 °C; volume of the enantiomer sample, 100 ml; volume of the sample chamber, 200 ml. In the experimental system we have used an eight-channel analog input module (ADAM-4017, Rev. D1) as the serial communication interface with the computer. The resistance sampling rate used in the experiments was 36 times per minute. The measured sensor resistances R(j) were preprocessed according to relation (1), delivering the relative variations of each sensor resistance r(j), for j = 1, 2, ..., 7, used as the diagnostic features. The feature vector x, on the basis of which the recognition is done, takes the form x = [r(1), r(2), ..., r(7)]^T.

3. The techniques of recognition of smells on the basis of sensor signals

The final recognition of smells sensed by gas sensors is a heavily researched subject in pattern recognition. As mentioned above, the smell is characterized here by the preprocessed sensor signals arranged in the form of the seven-dimensional vector x. Recognition of these vectors should answer the question of the similarity of smells. Different approaches to this problem are applied in practice. They include statistical measures of the distribution of vectors, distance measures and artificial neural network methods. This paper is primarily concerned with the application of supervised neural methods [15]. In our solution the key role in recognizing the vector patterns is played by the support vector machine (SVM) working in the classification mode [16,17]. In the learning stage the SVM network needs a set of learning pairs (x, d), where x is the input feature vector and d is the destination (the class) associated with x. After the learning stage the parameters of the network are fixed and the classifier works in the testing mode. Applying the currently measured sensor signals, organized in the form of the vector x, to the input of the trained SVM network generates the output signal indicating the class membership. The efficient application of this classifier requires the use of principal component analysis [18] as the preprocessing tool. Additionally we compare the proposed solution with some other methods belonging to non-supervised techniques.
They include the space vector approach, K-means and fuzzy clustering methods [19,20]. All of them apply distance measures to find the membership of the vector x in the proper class. Appendix A contains the basic information concerning all the tools and methods used.

4. The description of experiments

4.1. Enantiomer samples used in the experiments

Three different enantiomeric pairs, each in the (+) and (−) form, have been tested in the experiments: carvone, limonene and α-pinene. All substances were of a nominal purity of at least 97–98% (see Table 1). The enantiomers of a given pair were presented at equal concentration to ensure that differences in intensity, rather than in odour quality, did not dominate the discrimination performance of the system. Equal concentration of the enantiomers of a particular type was achieved by controlling the flow of the synthetic air and by stabilizing the temperature of the liquid samples. We have measured 10 samples of each enantiomer, performing 36 measurements for each sample, by using
Table 1
The basic data of the substances used

Substance            Purity (%)   Odour quality
(R)-(−) carvone      98           Spearmint
(S)-(+) carvone      98           Caraway
(R)-(+) limonene     97           Orange
(S)-(−) limonene     97           Turpentine
(1R)-(+)-α-Pinene    98           Pine-like
(−)-α-Pinene         98           Pine-like

Producers: Lancaster Synthesis Alfa Aesar Johnson Matthey Company Merck Alfa Aesar; Johnson Matthey Company
a seven-sensor array. The experimental data set for each enantiomer type therefore contained 360 vectors in a seven-dimensional space. The measurements of each enantiomer were carried out in an alternating way: first the measurement of the (+) form of the enantiomer, then cleaning of the measuring system, and next the measurement of the (−) form. The whole sequence was repeated 10 times. The cleaning was performed for about 15 min using synthetic air, after which the sensor signals were recorded to obtain the baseline values. Each repetition of the measurement used a new portion of the liquid sample of the particular enantiomer.
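The preprocessing of relation (1) can be sketched in a few lines of Python. This is only an illustration of the formula: the function name `relative_variation` and the resistance values are hypothetical and are not taken from the measurements reported here.

```python
def relative_variation(resistances, baselines):
    """Relation (1): r(j) = (R(j) - R0(j)) / R0(j) for each sensor j."""
    if len(resistances) != len(baselines):
        raise ValueError("one baseline value is needed per sensor")
    return [(r - r0) / r0 for r, r0 in zip(resistances, baselines)]


# Hypothetical resistances for a seven-sensor array (illustrative values only).
baseline_air = [50.0, 40.0, 30.0, 60.0, 45.0, 55.0, 35.0]  # R0(j) in synthetic air
in_vapour = [25.0, 30.0, 15.0, 54.0, 36.0, 44.0, 28.0]     # R(j) in the sample vapour

# Feature vector x = [r(1), ..., r(7)]; a resistance drop gives negative entries.
x = relative_variation(in_vapour, baseline_air)
```

Repeating this for every sampled resistance vector would give one seven-dimensional feature vector per measurement, i.e. 360 vectors per enantiomer form under the sampling scheme described above.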
4.2. The mean responses of the sensor signals In our experiments we have used a seven-sensor array. This number was carefully selected to obtain the highest selectivity of the system at lowest possible noise. It is known that the individual TGS sensor suffers from the lack of selectivity. Application of a few of them is the cure for this drawback. However, a too large number of sensors results in an increase of the measurement noise. We have optimized the number of sensors by trying different configuration and choosing the best one.
Fig. 1. The mean responses of the sensor signals to limonene, α-pinene, carvone; the sensor notation: 1, TGS813; 2, TGS830; 3, TGS822; 4, TGS825; 5, TGS826; 6, TGS842; 7, TGS822.
Fig. 2. The distribution of differences of the mean normalized sensor signals corresponding to (+) and (−) forms of the enantiomers: limonene, α-pinene, carvone.
As the result of the experiments we obtained 360 vectors for each form of the enantiomers (720 vectors altogether). Fig. 1 illustrates the mean responses of the sensors for the three tested chemicals: limonene, α-pinene, and carvone. The left-hand diagrams correspond to the (+) forms and the right-hand ones to the (−) forms of the enantiomers. The horizontal axis denotes the sensors (from 1 to 7) and the vertical axis the mean values of the normalized sensor responses. One can immediately see that the individual sensors in the array are not selective (the responses are nonzero in all cases). Thus, analysing the individual sensor responses separately is not adequate for accurate classification. Fig. 2 presents the distribution of differences of the mean sensor signals corresponding to the (+) and (−) forms of the three enantiomers: limonene, α-pinene and carvone. For most sensors we observe large differences of the same polarity. This observation confirms their potential for recognizing both forms of the enantiomers. A compact representation of the measured data set requires choosing the most descriptive vector for each form of the enantiomers. This is the so-called space vector description of the problem. The concept of the most descriptive space vector is derived from the partitioning around medoids method [21]. The descriptor of each cluster is the centroid of the group or the element nearest to it, called the medoid. The standard use of the vector space measures (α and Ee) of the data corresponding to
three considered types of the enantiomers (carvone, limonene and α-pinene) has produced the results shown in Table 2. The Euclidean norm has been applied in the calculations. α is the angle between the mean vectors of the (+) and (−) forms of the enantiomers and Ee is the enantiomeric excess calculated for these vectors. It is evident that these standard measures, defined to characterize the state of the data belonging to two classes of enantiomers, do not provide proper discrimination between the two opposite classes. The α-measures are close to zero for all enantiomers, suggesting that the vectors are nearly parallel. The Ee measure also takes small values, indicating insufficient discrimination between both classes, especially for carvone and α-pinene.

Table 2
The state space measures for three types of enantiomers

Enantiomer   α (°)   Ee
Carvone      4.30    0.137
Limonene     9.00    0.354
α-Pinene     4.44    0.140

4.3. The PCA representation of the measured data

The next investigations were directed to the visualization of the data distribution using the PCA technique. Fig. 3 presents the
Fig. 3. The PCA three-dimensional plot of the data for (a) carvone, (b) α-pinene and (c) limonene enantiomers. The symbol represents (+) form and × the (−) form.
distribution of the seven-dimensional data vectors x corresponding to the carvone, α-pinene and limonene enantiomers, mapped onto the three most important principal components: PC1, PC2 and PC3. For each component the percentage of the represented information (the variance of the data in the direction of the principal component) is also given. The three components represent more than 99% of the variance of the data. At the same time the three components form a three-dimensional system that allows visual inspection of the data distribution.
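The share of variance carried by a principal component can be illustrated with a minimal pure-Python sketch. It uses power iteration on the sample covariance matrix to find only the first component, which is a simplification chosen for brevity and not necessarily the PCA routine used in this work. The helper names and the toy data are hypothetical.

```python
def covariance_matrix(data):
    """Sample covariance of the rows of `data` (one feature vector per row)."""
    n, dim = len(data), len(data[0])
    mean = [sum(row[k] for row in data) / n for k in range(dim)]
    centred = [[row[k] - mean[k] for k in range(dim)] for row in data]
    return [[sum(c[a] * c[b] for c in centred) / (n - 1) for b in range(dim)]
            for a in range(dim)]


def first_principal_component(cov, iterations=200):
    """Leading eigenpair of a symmetric covariance matrix by power iteration."""
    dim = len(cov)
    v = [1.0 / dim ** 0.5] * dim  # start vector; assumed not orthogonal to PC1
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(dim)) for a in range(dim)]
        norm = sum(t * t for t in w) ** 0.5
        v = [t / norm for t in w]
    eigenvalue = sum(v[a] * cov[a][b] * v[b]
                     for a in range(dim) for b in range(dim))
    return eigenvalue, v


# Hypothetical 3-D data lying almost on a straight line (not the sensor data).
data = [[t, 2.0 * t + 0.01 * (-1) ** i, -t] for i, t in enumerate(range(10))]
cov = covariance_matrix(data)
eigenvalue, direction = first_principal_component(cov)

# Fraction of the total variance (trace of cov) represented by PC1.
explained = eigenvalue / sum(cov[k][k] for k in range(len(cov)))
```

For the measured data the same ratio would be evaluated for PC1, PC2 and PC3 of the seven-dimensional vectors; a full eigen-decomposition from a numerical library would normally replace the power iteration.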
We have also made the additional mapping of the data containing three classes of material: the (+) and (−) as well as the racemic forms. The aim of this experiment was to find out if the distribution of the data belonging to these three classes is separable. Fig. 4 presents the exemplary three-dimensional plot of the carvone data. Once again we see the ideally separated classes. The data corresponding to the racemic substance is placed in the middle between the opposite forms of chiral data. Analyzing the distribution of data mapped on three principal components, it is visible that the data points are gathered in the practically perfect separated clusters, corresponding to either (+) or (−) enantiomer classes or the racemic form. Each cluster is composed of 360 points grouped in 10 sub-clusters. This perfect separation of classes means that the linear discrimination classifier will be sufficient for recognition of the classes. Observe that the clusters are of irregular elongated shapes. The clusterization performed by K-means algorithm [19] will be rather inefficient since in this method equal scales are applied for each axis. Better results will be obtained by applying the fuzzy GK algorithm [18], scaling the axes individually.
Fig. 4. The PCA three-dimensional plot of the carvone data: the symbol represents S(+) carvone, × the R(−) carvone, and the racemic form of carvone.

5. The results of recognition of enantiomers
The PCA results shown in the previous section clearly indicate that the data vectors grouped in two clusters are well
Table 3
The relative misclassification rate of enantiomers

Recognition method   Carvone (%)   Limonene (%)   α-Pinene (%)   Mean (%)
K-means              10            10             15             11.67
GK algorithm         2.2           3.3            11.4           5.56
SVM-classifier       0             0              0              0
Linear combiner      0             0              0              0
separated from each other. This means that linear neural network classifiers of the simplest structure should be able to solve the recognition problem. To perform the final recognition we have applied two approaches: the supervised neural network of SVM type and fuzzy self-organization using the Gustafson–Kessel algorithm. The two differ significantly in their principle of operation, so their results provide independent characterizations of the process. The SVM with a linear kernel was applied first. To obtain the most objective results we used the cross-validation approach. The whole data set was split into six equal exchangeable parts. The learning group was composed of five parts and the testing group of one part used only for testing. The SVM network was trained on the learning data and then tested on the remaining testing set. The experiments were repeated five times, exchanging the contents of the five learning subsets and the testing subset. The misclassification ratio in the learning mode was calculated as the mean over all five runs. The number of testing errors is simply the sum of the errors committed by the system in each cross-validation run. To get an objective assessment of the results we performed 100 runs of the cross-validation procedure with different compositions of the data forming each group. The final results are simply the mean over all runs. The results of recognition were perfect (no misclassification cases). They are shown in Table 3. The separability of the clusters corresponding to both classes in the PCA analysis suggests that an ordinary linear combiner should also be able to recognize both classes properly. With seven inputs, the output signal of this combiner is defined as

$$ y(\mathbf{x}) = \sum_{i=1}^{7} w_i x_i + b \qquad (2) $$

where $w_i$ are the synaptic weights and $b$ is the bias. Positive values of y(x) indicate class 1 and negative values class 2. The weights are adjusted by minimizing the error function defined over all learning data points. This is a quadratic optimization problem, which is easy to solve. Application of this simplest possible solution has also led to perfect recognition results (Table 3). The other approach tested in the experiments applied the fuzzy self-organization of GK type. The set of all vectors x was clustered into two groups starting from different random initial centre positions. One hundred trials were carried out. After each trial all vectors were assigned to one of the clusters according to the Mahalanobis distance measure (relation (A.6) of Appendix A). The obtained cluster membership of the vectors was then compared with their real membership. Any difference in membership was regarded as an error. The results in the form of the relative misclassification rate, calculated as the mean over all 100 runs, are presented in Table 3. For comparison the K-means algorithm [19] was applied to the same sets of data. The results are also gathered in Table 3. As expected they are inferior to those of the GK self-organization. Comparing all results it is evident that the supervised methods (SVM and linear combiner) are the best. The next question is what the advantage of using the SVM is. Our next experiments were directed to checking the robustness of all approaches, including the supervised ones. This time we purposely distorted the measurements by noise of normal distribution with zero mean and different standard deviations. With the sensor signals normalized to the range (0, 1) we checked the efficiency of all methods at two values of the standard deviation: std1 = 0.2 and std2 = 0.3. The results, in the form of the misclassification rates, are presented in Table 4. This time the SVM-based solution was clearly the best. Thanks to the margin of separation applied at the learning stage the solution is very resistant to noise.
At very high noise (std = 0.3) distorting the normalized measured signals the mean misclassification rate was still below 3%. On the other hand the linear combiner has no such immunity. With high noise added to the samples its accuracy is even below that of the K-means approach. At the testing stage a problem that may arise is the classification of samples not belonging to the classes used in the learning stage (the open set of enantiomer families). Since
Table 4
The relative misclassification rate of enantiomers at the distorted data

                     Carvone (%)          α-Pinene (%)         Limonene (%)         Mean (%)
Recognition method   Std1=0.2  Std2=0.3   Std1=0.2  Std2=0.3   Std1=0.2  Std2=0.3   Std1=0.2  Std2=0.3
K-means              10.5      10.8       10        10.2       13.9      13.9       11.46     11.63
GK algorithm         5.7       16.1       10.4      22.7       18.1      29.3       11.45     22.7
SVM-classifier       0.1       1.5        0         0.4        1.0       6.67       0.37      2.86
Linear combiner      2.4       14.9       1.7       8.9        5.3       17.5       3.13      13.76
the unfamiliar enantiomer does not belong to any class defined so far, all classes of the already trained classifier should reject it. In the case of self-organization this can be done by comparing the average distance of the vectors within the same class to the centre with the actual distance of this particular vector to the same centre. If the actual distance exceeds some threshold value, typical for the distribution of data samples belonging to the particular class, the sample is rejected and labelled as unclassified. The SVM classifier, however, by its principle of operation always points to one particular, already existing class. To solve this problem we have proposed a modification, resulting in a two-stage classification of the acquired data vector x:

1. In the first stage the ordinary SVM classification is performed and the preliminary winner class is determined.
2. In the second stage the distance between the vector x under classification and the mean vector m of all learning vectors x belonging to the winning class is determined. The Mahalanobis distance d(x, m) is preferred:

$$ d^2(\mathbf{x}, \mathbf{m}) = (\mathbf{x} - \mathbf{m})^T \mathbf{C}^{-1} (\mathbf{x} - \mathbf{m}) \qquad (3) $$

where C is the covariance matrix defined as follows:

$$ \mathbf{C} = E[(\mathbf{x} - \mathbf{m})(\mathbf{x} - \mathbf{m})^T] \qquad (4) $$

and E denotes the expectation operator. The vector m is determined as the mean of the $N_i$ learning vectors $\mathbf{x}_i$ belonging to the appropriate class, $\mathbf{m} = (1/N_i) \sum_{i=1}^{N_i} \mathbf{x}_i$.

If the distance d(x, m) exceeds some threshold value, typical for the distribution of data samples belonging to the particular class, the sample is rejected and labelled as unclassified. We have performed such experiments by training the SVM network on one kind of enantiomer and testing the trained system on a mixed data set containing the samples of the forms of this particular enantiomer and representatives of the two other enantiomers. We have trained three different systems, each directed to the recognition of one of the three investigated enantiomers (carvone, limonene and α-pinene) and using the other enantiomer types as the "pollution". In each case the results of recognition were perfect. The applied threshold value in each case was equal to one (for the normalized data). The obtained results confirm that the SVM classifier is not only a very good tool for recognizing two particular enantiomer forms but also has high immunity to "alien" data mixed with the proper class.

6. Conclusions

An attempt has been made to discriminate between different enantiomer forms using an electronic nose and an artificial intelligence approach. It has been shown that the application of TGS gas sensors based on crystalline thin films of tin oxides secures specific chiral interactions (chiral adsorption of the enantiomers on the crystalline surface). The enantio-separation of chiral molecules in two dimensions is expected to occur more readily because the planar confinement excludes some bulk crystal symmetry elements and enhances chiral interactions. Different approaches to the final recognition of two-class enantiomers have been applied and compared in this work. They include supervised learning, fuzzy self-organization of GK type and classical state vector methods. The results have shown the clear superiority of the SVM-based solution. The system applying the SVM is very accurate and at the same time very resistant to the noise contaminating the measurements. The experiments with artificially distorted data samples have shown only a negligible loss of accuracy when the SVM classifier is applied. The other approaches have shown a lack of such immunity. The results of the performed experiments have confirmed that TGS gas sensors arranged in the electronic nose system discriminate very well between two different enantiomer forms. Moreover they are also able to recognize "alien" samples not belonging to any of the trained enantiomer forms.

Appendix A

A.1. The support vector machine (SVM) classifier
At present the SVM network solution [16,17] is regarded as the most efficient classification tool with very good generalization ability. Trained on a limited number of representative examples of each class, the network is able to perform well on a wide spectrum of data differing significantly within the same class. The SVM is a three-layer structure containing N input nodes, K hidden units described by the kernel function and one output node. The learning task of the SVM network in the classification mode is to determine the parameters of the separating hyperplane that maximizes the separation margin between the two classes of destination values $d_i = 1$ and $d_i = -1$ while keeping the number of misclassifications as small as possible. The separating hyperplane is described by the relation

$$ y(\mathbf{x}) = \mathbf{w}^T \boldsymbol{\varphi}(\mathbf{x}) + b = \sum_{j=1}^{K} w_j \varphi_j(\mathbf{x}) + b = 0 $$

where the vector $\boldsymbol{\varphi}(\mathbf{x}) = [\varphi_1(\mathbf{x}), \ldots, \varphi_K(\mathbf{x})]^T$ is composed of the activation functions of the hidden units and $\mathbf{w} = [w_1, \ldots, w_K]^T$ is the weight vector of the network. Mathematically the primary learning problem [16,17] is defined as the minimization of the objective function $\phi(\mathbf{w}, \boldsymbol{\xi})$:

$$ \phi(\mathbf{w}, \boldsymbol{\xi}) = \frac{1}{2} \mathbf{w}^T \mathbf{w} + C \sum_{i=1}^{p} \xi_i \qquad (A.1) $$

at the following linear constraints (i = 1, 2, ..., p):

$$ d_i (\mathbf{w}^T \boldsymbol{\varphi}(\mathbf{x}_i) + b) \ge 1 - \xi_i, \qquad \xi_i \ge 0 \qquad (A.2) $$

where $\xi_i$ are the slack variables. The first term in Eq. (A.1) corresponds to the maximization of the margin of separation. The constant C is the regularization parameter responsible for the minimization of the learning errors. The higher its value, the bigger the impact of the second term on the final parameters of the hyperplane.
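For a linear kernel, where φ(x) = x, the constrained problem (A.1)–(A.2) is equivalent to minimizing ½ wᵀw + C Σ max(0, 1 − dᵢ(wᵀxᵢ + b)). The sketch below minimizes this form by plain subgradient descent. This is a simplification swapped in for illustration and not the quadratic-programming procedure referred to in the text; the function names, the step size, the number of epochs and the toy data are all assumptions.

```python
def train_linear_svm(samples, labels, c=1.0, epochs=2000, rate=0.01):
    """Subgradient descent on 0.5*w.w + C*sum(max(0, 1 - d*(w.x + b)))."""
    dim = len(samples[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = list(w), 0.0  # gradient of the margin term 0.5*w.w is w
        for x, d in zip(samples, labels):
            margin = d * (sum(wk * xk for wk, xk in zip(w, x)) + b)
            if margin < 1.0:  # slack xi_i > 0: the hinge term is active
                for k in range(dim):
                    grad_w[k] -= c * d * x[k]
                grad_b -= c * d
        w = [wk - rate * gk for wk, gk in zip(w, grad_w)]
        b -= rate * grad_b
    return w, b


def predict(w, b, x):
    """Class +1 for non-negative y(x) = w.x + b, class -1 otherwise."""
    return 1 if sum(wk * xk for wk, xk in zip(w, x)) + b >= 0.0 else -1


# Hypothetical, linearly separable 2-D toy data (not the sensor data).
samples = [[3.0, 3.0], [4.0, 3.0], [3.0, 4.0], [4.0, 4.0],
           [0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
labels = [1, 1, 1, 1, -1, -1, -1, -1]

w, b = train_linear_svm(samples, labels)
predictions = [predict(w, b, x) for x in samples]
```

With a fixed step size the weights only settle in a neighbourhood of the optimum, which is sufficient to separate well-spaced classes; a dedicated quadratic-programming solver would be the usual choice in practice.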
In practice the learning procedure is simplified to quadratic programming by introducing the Lagrange multipliers. All operations in the learning and testing modes are done in the SVM using so-called kernel functions $K(\mathbf{x}, \mathbf{x}_i) = \boldsymbol{\varphi}^T(\mathbf{x}_i) \boldsymbol{\varphi}(\mathbf{x})$ satisfying the Mercer conditions [16]. The output signal y(x) of the SVM in the retrieval mode (after learning) is determined as

$$ y(\mathbf{x}) = \sum_{i=1}^{N_{sv}} \alpha_i d_i K(\mathbf{x}_i, \mathbf{x}) + b \qquad (A.3) $$

where $\alpha_i$ are the Lagrange multipliers, $d_i$ the destination values, $b$ the bias, and $N_{sv}$ is the number of support vectors (vectors $\mathbf{x}_i$ for which $\alpha_i \ne 0$). A positive value of the output signal means that the vector x belongs to the appropriate class, while a negative value assigns it to the opposite one. An important point in designing the SVM classifier is the choice of the kernel function. The simplest, linear kernel is the most preferred. However, it is efficient only for linearly separable data. In the case of non-separable data a nonlinear kernel is needed. The most often used are the Gaussian and polynomial kernels. At the stage of designing the SVM classifier, some parameters, usually called hyperparameters, have to be chosen, such as the regularization constant C and the parameter σ of the Gaussian function. A small value of C results in the acceptance of more non-separated learning points. At higher values of C we get a lower number of classification errors on the learning data points, but a more complex network structure. The optimal value is determined after an additional series of learning experiments using validation test sets [17]. Many different values of C and σ are usually tested in the learning process, and the values for which the classification error on the validation data set is smallest are accepted.

A.2. Fuzzy self-organization of data

The SVM solution proposed in the previous section applies supervised learning. To get a wider view of the problem we have also tried the self-organization of data. The most important question is whether the data belonging to different classes form separate clusters. The measured signals are expected to have a fuzzy distribution. A good view of this distribution is provided by the methods of competitive clustering, especially the Gustafson–Kessel (GK) algorithm, which takes into account individual scaling in different directions [19].
Let us assume that there are p vectors x that should be divided into c clusters (in the case of (+) and (−) enantiomers the number of classes c = 2). Denote by uij the membership coefficient value of the particular vector xj to the ith cluster of the center ci . The fuzzy clustering GK algorithm searches for the membership partition matrix U and the cluster centers such that the following objective function E is minimized: E=
p c i=1 j=1
2 um ij dij
(A.4)
subject to c
uij = 1
(A.5)
i=1
for $j = 1, 2, \ldots, p$ and $i = 1, 2, \ldots, c$. The parameter $m$ controls the fuzziness of the clusters (usually $m = 2$). The function $d_{ij} = d(x_j, c_i)$ measures the Mahalanobis distance between the data vector $x_j$ and the center $c_i$ of the $i$th cluster.

The GK algorithm can be presented in the following form [19]. For the given data vectors x, choose the number of clusters $c$, their initial positions $c_i$, the weighting coefficient $m$ and the termination tolerance $\varepsilon$. Assume the starting value of the cluster covariance matrices $F_i = \mathbf{1}$ (the identity matrix) and then iterate through the following steps:

1. Compute the distances $d_{ij}$ ($i = 1, 2, \ldots, c$ and $j = 1, 2, \ldots, p$) between the input vector $x_j$ and the cluster centers $c_i$ using the equation:

$$d_{ij}^{2} = (x_j - c_i)^{T} \left[ \sqrt[N]{\det(F_i)}\, F_i^{-1} \right] (x_j - c_i) \tag{A.6}$$

where $N$ is the dimension of the input vectors x.

2. Update the entries $u_{ij}$ ($i = 1, 2, \ldots, c$ and $j = 1, 2, \ldots, p$) of the membership partition matrix U according to the rule:

$$u_{ij} = \frac{1}{\sum_{k=1}^{c} \left( d_{ij} / d_{kj} \right)^{2/(m-1)}} \tag{A.7}$$
If $d_{ij} = 0$ for some $i = I$, take $u_{Ij} = 1$ and $u_{ij} = 0$ for all $i \neq I$. Iterate until $\|U_l - U_{l-1}\| \le \varepsilon$ for two succeeding iterations.

3. Determine the cluster prototype centers $c_i$ for all clusters, $i = 1, 2, \ldots, c$:

$$c_i = \frac{\sum_{j=1}^{p} u_{ij}^{m} x_j}{\sum_{j=1}^{p} u_{ij}^{m}} \tag{A.8}$$

4. Calculate the cluster covariance matrices $F_i$ ($i = 1, 2, \ldots, c$) according to the relation:

$$F_i = \frac{\sum_{j=1}^{p} u_{ij}^{m} (x_j - c_i)(x_j - c_i)^{T}}{\sum_{j=1}^{p} u_{ij}^{m}} \tag{A.9}$$

and return to step 1.

After the learning process is finished, all parameters (the cluster centers and covariance matrices) are frozen. Each vector x can now be tested to determine which cluster it belongs to. This is done on the basis of its distance to the cluster centers, calculated according to Eq. (A.6). With two classes we have two centers and two covariance matrices, and the smaller of the two distances indicates the class membership. The covariance matrix used in determining the distance plays an important role in practice, since the shapes of the clusters are far from circular. This matrix provides proper scaling of the particular coordinates of the vector x. In addition, the fuzzy character of the GK algorithm takes into account the fuzzy distribution of the data and assures better clustering results than the very popular K-means algorithm [20].
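The four steps of the GK iteration can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' Matlab implementation: it is restricted to two-dimensional inputs (N = 2) so that the determinant and inverse of each covariance matrix can be written in closed form, it measures the change of U by the largest absolute difference of its entries, and it adds a small constant `reg` to the diagonal of each covariance matrix to keep it invertible. The function names, `reg`, `max_iter` and the sample points are hypothetical.

```python
import math


def gk_distance_sq(x, center, F):
    """Squared GK distance of Eq. (A.6), written out for N = 2."""
    (a, b), (_, d) = F                      # symmetric covariance [[a, b], [b, d]]
    det = a * d - b * b
    dx, dy = x[0] - center[0], x[1] - center[1]
    # (x - c)^T F^-1 (x - c), using the closed-form inverse of a 2x2 matrix
    quad = (d * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det
    return math.sqrt(det) * quad            # det(F)^(1/N) with N = 2


def gustafson_kessel(data, centers, m=2.0, eps=1e-6, max_iter=200, reg=1e-9):
    """Fuzzy GK clustering of 2-D points; returns (centers, covariances, U)."""
    c, p = len(centers), len(data)
    centers = [tuple(ci) for ci in centers]
    covs = [((1.0, 0.0), (0.0, 1.0))] * c   # starting value F_i = identity
    U_prev = None
    for _ in range(max_iter):
        # Step 1: distances between every x_j and every center c_i, Eq. (A.6)
        d2 = [[gk_distance_sq(x, centers[i], covs[i]) for x in data]
              for i in range(c)]
        # Step 2: membership update, Eq. (A.7); the ratio is taken on squared
        # distances, so the exponent 2/(m-1) becomes 1/(m-1)
        U = [[0.0] * p for _ in range(c)]
        for j in range(p):
            zero = [i for i in range(c) if d2[i][j] == 0.0]
            if zero:                         # x_j coincides with a center
                U[zero[0]][j] = 1.0
                continue
            for i in range(c):
                U[i][j] = 1.0 / sum((d2[i][j] / d2[k][j]) ** (1.0 / (m - 1.0))
                                    for k in range(c))
        # Steps 3 and 4: centers, Eq. (A.8), and covariances, Eq. (A.9)
        for i in range(c):
            w = [u ** m for u in U[i]]
            sw = sum(w)
            cx = sum(wj * x[0] for wj, x in zip(w, data)) / sw
            cy = sum(wj * x[1] for wj, x in zip(w, data)) / sw
            sxx = sum(wj * (x[0] - cx) ** 2 for wj, x in zip(w, data)) / sw
            syy = sum(wj * (x[1] - cy) ** 2 for wj, x in zip(w, data)) / sw
            sxy = sum(wj * (x[0] - cx) * (x[1] - cy)
                      for wj, x in zip(w, data)) / sw
            centers[i] = (cx, cy)
            covs[i] = ((sxx + reg, sxy), (sxy, syy + reg))
        # Termination: largest change of any membership between iterations
        if U_prev is not None and max(
                abs(U[i][j] - U_prev[i][j])
                for i in range(c) for j in range(p)) <= eps:
            break
        U_prev = U
    return centers, covs, U


def classify(x, centers, covs):
    """Index of the cluster with the smallest distance (A.6) to x."""
    d2 = [gk_distance_sq(x, ci, Fi) for ci, Fi in zip(centers, covs)]
    return d2.index(min(d2))


# Two elongated, made-up clusters (illustrative values, not measured data)
data = [(0.0, 0.1), (1.0, -0.1), (2.0, 0.05), (3.0, -0.05), (4.0, 0.1),
        (0.0, 3.1), (1.0, 2.9), (2.0, 3.05), (3.0, 2.95), (4.0, 3.1)]
centers, covs, U = gustafson_kessel(data, centers=[(2.0, 0.5), (2.0, 2.5)])
print(centers)
print(classify((2.5, 0.2), centers, covs), classify((2.5, 2.8), centers, covs))
```

The elongated sample clusters are chosen to show why the covariance scaling matters: after a few iterations each $F_i$ stretches along the horizontal axis, so a point far along that axis still stays close to its own cluster in the sense of Eq. (A.6).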
A.3. Space vector approach
The space vector approach provides very popular measures of the distance between two vectors in enantiomer research. The most popular are the angle between the vectors and the parameter $E_e$, also called the enantiomeric excess. The angle $\alpha$ between two vectors $x_i$ and $x_j$ is computed on the basis of their inner product. The cosine of this angle represents one of the similarity measures between the two vectors:

$$\cos(\alpha) = \frac{x_i \cdot x_j}{\|x_i\| \, \|x_j\|} \tag{A.10}$$

References

[1] R.A. Seldon, in: M. Dekker (Ed.), Industrial Synthesis of Optically Active Compounds, Chiral Technologies, New York, 1993, pp. 173–204.
[2] S. De Feyter, F.C. De Schryver, Two-dimensional super molecular self assembly probed by scanning tunnelling microscopy, Chem. Soc. Rev. 32 (2003) 139–150.
[3] L. Ortega, M. Baddeley, C.J. Muryn, R. Raval, Extended surface chirality from super molecular assemblies of adsorbed chiral molecules, Nature 404 (2000) 376–379.
[4] T. Kuhnie, R. Linderoth, B. Hammer, F. Besenbacher, Chiral recognition in dimerization of adsorbed cysteine observed by scanning tunnelling microscopy, Nature 415 (2002) 891–893.
[5] Q. Chen, N.V. Richardson, Enantiomeric interactions between nucleic acid and amino acids on solids surfaces, Nat. Mater. 2 (2003) 324–328.
[6] R. Raval, Review chiral expressions at metal surface, Curr. Opin. Solid State Mater. Sci. 7 (2003) 67–74.
[7] S.M. Barlow, R. Raval, Complex organic molecules at metal surfaces: bonding, organisation and chirality, Surf. Sci. Rep. 50 (2003) 201–341.
[8] V. Humblot, S.M. Barlow, R. Raval, Two-dimensional organisational chirality through supramolecular assembly of molecules at metal surfaces, Prog. Surf. Sci. 76 (2004) 1–19.
[9] S. De Feyter, A. Gesquiere, K. Wurst, D. Amabilino, J. Veciana, F.C. De Schryver, Homo- and heterochiral supramolecular tapes from achiral, enantiopure, and racemic promesogenic formamides, Angew. Chem. Int. Ed. Engl. 40 (2001) 3217–3220.
[10] S. Romer, B. Behzadi, R. Fasel, K.H. Ernst, Homochiral conglomerates and racemic crystals in two directions: tartaric acid on Cu(1 1 0), Chem. Eur. J. 11 (2005) 4149–4154.
[11] Y. Cai, S.L. Bernasek, Chiral pair monolayer adsorption of iodine-substituted octadecanol molecules on graphite, J. Am. Chem. Soc. 125 (2003) 1655–1659.
[12] Y. Wie, K. Kananappan, W.G. Flynn, M.B. Zimmt, Scanning tunnelling microscopy of prochiral anthracene derivatives on graphite: chain length effects on monolayer morphology, J. Am. Chem. Soc. 126 (2004) 5318–5322.
[13] K. Brudzewski, Electronic nose, Elektronizacja 1 (1997) 15–16 (in Polish).
[14] K. Brudzewski, S. Osowski, T. Markiewicz, J. Ulaczyk, Classification of gasoline with supplement of bio-products by means of an electronic nose and SVM neural network, Sensors Actuators B 113 (2006) 135–141.
[15] S. Haykin, Neural Networks, a Comprehensive Foundation, Macmillan, NY, 2002, pp. 1–842.
[16] V. Vapnik, Statistical Learning Theory, Wiley, New York, 1998, pp. 1–764.
[17] B. Schölkopf, A. Smola, Learning with Kernels, MIT Press, Cambridge MA, 2002, pp. 1–626.
[18] K. Diamantaras, S. Kung, Principal Component Neural Networks, Theory and Applications, Wiley, New York, 1996, pp. 1–232.
[19] D. Gustafson, W. Kessel, Fuzzy clustering with a fuzzy covariance matrix, in: Proceedings of the IEEE CDC, San Diego, 1979, pp. 761–766.
[20] F. Hoppner, F. Klawonn, R. Kruse, T. Runkler, Fuzzy Cluster Analysis, Wiley, 1999, pp. 1–279.
[21] A. Jain, R. Dubes, Algorithms for Clustering Data, Prentice-Hall, 1988, pp. 1–320.
[22] Matlab manual, Natick, USA, 2002.
The similarity values range from 0 (for orthogonal vectors, $\alpha = 90^\circ$) to 1 (for parallel vectors, $\alpha = 0$). In the latter case the two vectors fit each other ideally. The second parameter, $E_e$, is another similarity measure, defined on the basis of the norms of these vectors:

$$E_e = \frac{\|x_i\| - \|x_j\|}{\|x_i\| + \|x_j\|} \tag{A.11}$$
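Both space vector measures reduce to a few lines of arithmetic. The sketch below assumes the Euclidean norm and assumes that the two norms in Eq. (A.11) belong to the two compared vectors $x_i$ and $x_j$. The function names and the sample response vectors are hypothetical and are not taken from the measurements.

```python
import math


def norm(x):
    """Euclidean norm of a vector given as a sequence of numbers."""
    return math.sqrt(sum(v * v for v in x))


def angle_deg(xi, xj):
    """Angle alpha between xi and xj in degrees, from Eq. (A.10)."""
    cos_alpha = sum(a * b for a, b in zip(xi, xj)) / (norm(xi) * norm(xj))
    cos_alpha = max(-1.0, min(1.0, cos_alpha))   # guard against rounding
    return math.degrees(math.acos(cos_alpha))


def enantiomeric_excess(xi, xj):
    """Parameter E_e of Eq. (A.11), based on the norms of xi and xj."""
    ni, nj = norm(xi), norm(xj)
    return (ni - nj) / (ni + nj)


# Made-up mean sensor responses for the (+) and (-) classes
mean_plus = [0.82, 0.40, 0.35, 0.61]
mean_minus = [0.78, 0.44, 0.31, 0.66]
print(angle_deg(mean_plus, mean_minus))
print(enantiomeric_excess(mean_plus, mean_minus))
```

Note that the angle is insensitive to the overall signal amplitude while $E_e$ depends only on it, so the two measures capture complementary differences between the class means.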
In enantiomer research both measures are applied to the vectors characterizing the means of the two enantiomer classes, (+) and (−). Any metric (Euclidean, Mahalanobis, Minkowski, etc.) can be used to calculate the norms. Small values of $\alpha$ or $E_e$ indicate high similarity of the two clusters (classes), whereas large values of these measures mean well-separated classes.

A.4. Principal component analysis of data

Principal component analysis (PCA) is a very convenient tool for visual inspection of the distribution of data in a multidimensional space. PCA is described as the linear transformation $y = Wx$, mapping the N-dimensional original vector x into the K-dimensional output vector y, where $K < N$. It is a classical statistical technique for analyzing the covariance structure of multivariate statistical observations, enhancing the most important elements of information collected in the reduced-dimension vectors y. The $K \times N$ matrix W is the PCA transformation matrix composed of the eigenvectors $w_i$ of the correlation matrix $R_{xx}$ associated with the $K$ largest eigenvalues, $W = [w_1, w_2, \ldots, w_K]^{T}$. The limited but representative portion of the information contained in the first few principal components $y_i$ allows the measured data to be analyzed in an easy graphical way. Usually the three most important components represent more than 99% of the original information. Mapping all measured data onto these three components enables a graphical interpretation of the data distribution. If the data belonging to the two classes under consideration form two separate clusters in the reduced-dimension space, the classes can be linearly separated using any discriminating system, for example a linear kernel SVM or a simple linear combiner.

All the data processing procedures presented above, including the SVM networks, GK clustering, PCA and the space vector approach, have been implemented in Matlab [22] as our own software tools.
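The PCA mapping $y = Wx$ can be illustrated with a short, dependency-free Python sketch. It makes several assumptions that the text does not fix: the data are mean-centered before $R_{xx}$ is estimated as the average outer product, and the eigenvectors are found by power iteration with deflation, a simple substitute for a library eigen-solver. The names `pca` and `mat_vec`, the parameters `iters` and `seed`, and the sample points are hypothetical.

```python
import math
import random


def mat_vec(R, w):
    """Product of a matrix (list of rows) and a vector."""
    return [sum(r * v for r, v in zip(row, w)) for row in R]


def pca(data, K, iters=500, seed=0):
    """Return (W, eigenvalues, Y): the K leading eigenvectors of R_xx as the
    rows of W, their eigenvalues, and the projections y = W x of the data."""
    p, N = len(data), len(data[0])
    mean = [sum(x[n] for x in data) / p for n in range(N)]
    X = [[x[n] - mean[n] for n in range(N)] for x in data]   # centered data
    # R_xx estimated as the average outer product of the centered vectors
    R = [[sum(x[a] * x[b] for x in X) / p for b in range(N)] for a in range(N)]
    rng = random.Random(seed)
    W, eigenvalues = [], []
    for _ in range(K):
        w = [rng.uniform(-1.0, 1.0) for _ in range(N)]
        length = math.sqrt(sum(t * t for t in w))
        w = [t / length for t in w]
        for _ in range(iters):               # power iteration
            v = mat_vec(R, w)
            length = math.sqrt(sum(t * t for t in v))
            if length == 0.0:
                break
            w = [t / length for t in v]
        lam = sum(a * b for a, b in zip(w, mat_vec(R, w)))
        W.append(w)
        eigenvalues.append(lam)
        # Deflation: remove the component just found before the next search
        R = [[R[a][b] - lam * w[a] * w[b] for b in range(N)] for a in range(N)]
    Y = [mat_vec(W, x) for x in X]
    return W, eigenvalues, Y


# Made-up 3-D points lying close to a single line (not measured data)
points = [(0.0, 0.1, 0.0), (1.0, 0.9, 0.1), (2.0, 2.1, -0.1),
          (3.0, 2.9, 0.0), (4.0, 4.1, 0.1), (5.0, 4.9, -0.1)]
W, eigenvalues, Y = pca(points, K=2)
print(W[0])
print(eigenvalues)
```

Dividing each eigenvalue by the sum of all N eigenvalues gives the share of the total variance carried by the corresponding component, which is one way to check how much information the retained components represent.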
Biographies

Kazimierz Brudzewski was born in Poland in 1943. He received the PhD in solid-state physics from Warsaw University of Technology, Warsaw, in 1974, after which he joined the staff of the Department of Chemistry. He habilitated in thin-film physics in 1981. His present position is Head of the Sensor Technique Laboratory. His work encompasses many aspects of thin solid films, including sensor techniques and applications of artificial neural networks.

Jan Ulaczyk was born in Poland in 1978. He received the MSc degree in Physics from Warsaw University of Technology, Warsaw, Poland, in 2002. He is currently a PhD student at the Faculty of Physics, Warsaw University of Technology. His research includes bioinformatics, neural networks and sensor techniques.

Stanislaw Osowski was born in Poland in 1948. He received the MSc, PhD, and DSc degrees from the Warsaw University of Technology, Warsaw, Poland, in 1972, 1975, and 1981, respectively, all in Electrical Engineering. Currently he is a Professor of Electrical Engineering at the Institute of the Theory of Electrical Engineering, Measurement and Information Systems, Warsaw University of Technology. His research and teaching interests are in the areas of neural networks, optimization techniques, and computer-aided system analysis and design.

Tomasz Markiewicz was born in Poland in 1977. He received the MSc and PhD in Electrical Engineering from the Warsaw University of Technology, Warsaw, in 2001 and 2006, respectively. Currently he is an Associate Professor of Electrical Engineering at the Institute of the Theory of Electrical Engineering, Measurement and Information Systems, Warsaw University of Technology. His scientific interests are in neural networks and signal processing.