Ultramicroscopy 62 (1996) 65-78
Unconventional immuno double labelling by energy filtered transmission electron microscopy
P.J.B. Koeck a,*, R.R. Schröder b, M. Haider a, K.R. Leonard a
a European Molecular Biology Laboratory, P.O. Box 102209, D-69012 Heidelberg, Germany
b Max Planck Institute for Medical Research, P.O. Box 103820, D-69028 Heidelberg, Germany
Received 17 January 1995; accepted 27 July 1995
Abstract A new method of immuno double labelling of biological specimens with a very high spatial resolution is presented. The advantage over conventional techniques is the possibility of using two very small labels leading to higher labelling efficiency, better penetration into the specimen and reduced steric hindrance between labels at closely spaced sites. The two labels are distinguished by their electron energy loss spectra using principal component analysis and then identified by comparison with an external standard using discriminant function analysis. The method is tested on samples of insect flight muscle labelled with 8 nm colloidal gold and silver and the statistical reliability of the classification is assessed. Extensions of the method are suggested and its potential for biological research is discussed.
* Corresponding author. Fax: +49 6221 387306; E-mail: [email protected].

1. Introduction
Double immuno labelling is a technique used in biological electron microscopy to label different proteins specifically, so that they can be distinguished and thus localized simultaneously on an electron micrograph of a biological sample. The standard methods use gold labels and/or silver enhanced gold labels of sizes sufficiently different to distinguish them in a conventional transmission electron microscope (CTEM) [1] or a scanning electron microscope (SEM) [2]. The obvious limit to these methods is the size of the larger label when colocalized proteins need to be resolved. In this case the labelling probability could be affected by neighbouring labels due to steric hindrance [3]. Other reasons for using only small labels are the poor penetration of large labels into the specimen [4] and their low binding efficiency [5].
A recently proposed method [6] that uses two labels of similar size exploits the difference in angular distribution of electrons elastically scattered by labels differing in atomic weight (for instance gold and silver labels). Due to the poor visibility of labels of low atomic weight and the need for a considerable difference in atomic weight to distinguish the labels, this method cannot be used with very small labels and probably cannot be extended beyond two or maybe three different labels.
Theoretical and experimental research has also been carried out into plasmon excitations in small metal spheres. The electron energy loss (EEL) spectra due to these surface plasmons have been shown to be characteristic of the material over a large range of diameters, although below a critical diameter in
0304-3991/96/$15.00 Copyright © 1996 Elsevier Science B.V. All rights reserved. SSDI 0304-3991(95)00088-7
the order of 10 nm the volume plasmon energy is expected to increase, whereas the surface plasmon energy is expected to decrease quickly with decreasing cluster diameter. This effect has been shown theoretically and experimentally for oxidized tin and for metallic as well as oxidized gallium clusters [7]. In principle the position of the plasmon loss peak would be a good means of distinguishing different labels, provided that large labels can be used or the label size is very well defined. For gold and silver, however, it lies at too low an energy loss to be separated from the primary (zero loss) peak with the energy filters currently available for transmission electron microscopy [8].
In the following a new method is presented to distinguish gold and silver labels of the same or arbitrary size by their EEL spectra at energy losses of 10-40 eV. The structure of the spectra in this region can be attributed to interband transitions [9]. Preliminary measurements on a scanning transmission electron microscope (STEM) with a parallel electron energy loss spectrometer [10] have shown that the shape of the EEL spectra in this energy region depends sensitively on where the electron beam hits the labels [11]. This can be explained by the difference in the electronic structure of bulk and surface atoms and makes it necessary to select the area from which the spectra are taken reproducibly and with high accuracy. Using characteristic inner shell losses of gold and silver would have required a much higher dose, leading to greater radiation damage to the specimen, and was therefore not considered advantageous.
Multivariate statistical analysis in its various forms (such as principal component analysis) is the method of choice for the analysis and classification of spectra in the energy region considered. Similar methods have been applied to electron spectroscopic imaging (elemental mapping) [12,13] and also to the quantitative determination of element distributions in biological samples [14,15].
Table 1
Electron dose used for each energy filtered image of the first (left) and second (right) set of images

Energy loss (eV)    Dose (electrons/nm²)
10                  24000/24000
15                  18000/19000
20                  14000/14000
25                  14000/14000
30                  20000/16000
35                  28000/24000
40                  42000/37000
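Since radiation damage is the reason given for avoiding inner shell losses, it can help to add up the doses of Table 1. The following is a minimal sketch, assuming that the doses of the seven energy filtered images and of the single bright field image (800 electrons/nm²) of a set simply accumulate on the same specimen area; the variable names are illustrative only.

```python
# Doses from Table 1 in electrons/nm^2, as (first set, second set) per energy loss in eV.
doses = {
    10: (24000, 24000),
    15: (18000, 19000),
    20: (14000, 14000),
    25: (14000, 14000),
    30: (20000, 16000),
    35: (28000, 24000),
    40: (42000, 37000),
}
bright_field = 800  # dose of the unfiltered bright field image

# Assumption: exposures of one image set add up on the same area.
first = sum(pair[0] for pair in doses.values())
second = sum(pair[1] for pair in doses.values())

print(first, second)                                # energy filtered images only
print(first + bright_field, second + bright_field)  # including the bright field image
```

Under these assumptions the energy filtered series amounts to 160000 and 148000 electrons/nm² for the first and second set respectively.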
2. Experimental procedures
Samples of unstained thin freeze-substituted sections of stretched Lethocerus indicus muscle fibres were labelled with ~8 nm colloidal gold coupled to an antibody against myosin and ~8 nm colloidal silver coupled to an antibody against troponin. Both labels were linked to the antibodies via protein A. For details on the preparation of samples see Ref. [6].
The sections were studied in a Zeiss 912 Omega energy filtering transmission electron microscope. Energy filtered images were recorded at an energy resolution of 5 eV at energy losses from 10 to 40 eV in steps of 5 eV. An unfiltered bright field image was recorded at 5 µm defocus with a dose of 800 electrons/nm² to be able to relate the position of the gold and silver labels to the structure of the muscle. The primary magnification used for all the images was 20000. The dose was chosen individually for each energy filtered image in order to exploit the full dynamic range of the photo negatives. The dose used for each energy filtered image for the two sets of images discussed is given in Table 1. In order to perform statistical analysis a 5 × 5 cm area of each image was digitized with a pixel size of 25 µm on an Optronics scanner, giving a set of 8 images of 2000 × 2000 pixels. All further image processing and data analysis was carried out using the image processing package “Khoros” [16].

3. Results and discussion
3.1. Preliminary image processing
In order to extract spectra from the images at different energy losses it is necessary to align them
to the corresponding bright field image. This involves shifting and rotating the images as well as a slight warping, since the magnification of the microscope depends differently on position in different energy regions. This alignment step yields two sets of 7 energy filtered images and one bright field image aligned to each other. The sizes of the two image sets are 1600 × 1600 and 1024 × 1024 pixels respectively. The bright field images, with their clearly visible labels, are used to locate the centres of the gold and silver labels. Per image set two data sets of 7-component spectra are extracted: one set of “label spectra”, each averaged over a circle of 5 pixels diameter (the labels being about 8 pixels in diameter), and one set of background spectra, each averaged over a ring of 20 pixels diameter and 1 pixel width around the centre of each label.
These two data sets are then normalized in two successive steps. First each individual spectrum is divided by its integral, yielding two sets of spectra with an integral of 1. This step ensures that the only possible variance in the data set comes from the shape of the spectra and not from their intensity, making the analysis insensitive to variations in label size. In the second step the average spectrum is subtracted for each of the two data sets separately, leading to two sets of spectra with each component centred around 0.

3.2. Classification of spectra
The classification of the extracted spectra into gold and silver is initially only carried out with the first image set. If the spectra are pictured in a seven-dimensional co-ordinate space the two normalization steps restrict all the spectra to a six-dimensional hyper plane with the origin of the co-ordinate system at the centre of gravity of the set of spectra (see Appendix). The main spread of these data points is now due to variation in the shape of the spectra.
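The two normalization steps of Section 3.1 can be sketched in a few lines. This is a minimal illustration, not the Khoros processing used for the measurements: the function name `normalize_spectra`, the row-per-spectrum layout and the random toy data are assumptions made here, and the integral of a spectrum is taken as the plain sum of its seven components.

```python
import numpy as np

def normalize_spectra(raw):
    """Normalize spectra given as an (N, J) array, one J-component spectrum per row."""
    raw = np.asarray(raw, dtype=float)
    # Step 1: divide each spectrum by its integral (sum over the J energy windows).
    unit = raw / raw.sum(axis=1, keepdims=True)
    # Step 2: subtract the average spectrum of the data set from every spectrum.
    return unit - unit.mean(axis=0, keepdims=True)

# Hypothetical toy data: 4 spectra with 7 components (not measured values).
rng = np.random.default_rng(0)
toy = rng.uniform(1.0, 2.0, size=(4, 7))
D = normalize_spectra(toy)

print(np.allclose(D.sum(axis=1), 0.0))               # every spectrum sums to zero
print(np.allclose(D.mean(axis=0), 0.0))              # every component is centred
print(np.allclose(normalize_spectra(3.0 * toy), D))  # overall intensity drops out
```

The first check corresponds to the six-dimensional hyper plane mentioned above, the second to the centre of gravity lying at the origin, and the third to the insensitivity to label size.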
The direction of this spread can be calculated by principal component analysis and is given by the first eigenvector of the data set’s covariance matrix. The subsequent eigenvectors give the directions of greatest spread of the data set orthogonal to the preceding eigenvectors. Now the entire set of spectra can be described by a new co-ordinate system consisting of the eigenvectors (or “principal components”). This co-ordinate system is adapted to the data set in the sense that the cloud of data points (spectra) can easily be visualized, since the largest part of the data set’s variance is concentrated in a few dimensions and the other dimensions contain mainly noise (see Appendix).

Fig. 1 shows the result of the diagonalization for the two data sets of label and background spectra. A two-dimensional projection onto the plane defined by the first two eigenvectors (a 2D scatter plot) is sufficient to see the dominant variations in the data sets. As can be seen from the comparison of the first two eigenvectors of the two data sets in Fig. 2, the main direction of variance is identical for the label and the background spectra. This is an effect of the aberrations of the energy filter used, which allows only a finite energy dispersion (5 eV energy windows) and has additional anisochromasy. Fig. 1 also shows that the set of label spectra has more variance in the direction of the second eigenvector than does the set of background spectra, and since the second eigenvectors (L2 and B2) in Fig. 2 differ considerably this is an effect that comes only from the labels (namely the difference between gold and silver labels) and is not visible in the surrounding background.

Fig. 1. Scatter plots of the individual spectra in the planes given by the first two eigenvectors (principal components). (a) Shows the scatter plot of the gold and silver label spectra whereas (b) shows that of the background spectra.

Fig. 2. The first two eigenvectors for the data set of gold and silver label spectra (L1 and L2) and that of background spectra (B1 and B2).

Fig. 3 shows histograms of the projections of the label data set onto the first four eigenvectors. A grouping of the data into two classes can be clearly seen in histogram B, whereas histogram A shows the spread due to the imaging aberrations mentioned above and D is mainly noise, as implied by the lack of structure. Histogram C also shows two peaks and could be used as a basis for separating the data into two classes. However, this classification is inconsistent with that based on histogram B and must therefore be due to some additional, unknown effect.

Fig. 3. Histograms of the projections of the gold and silver label data set onto the first four eigenvectors. The units of the x-axis are relative units and dimensionless due to the preceding data normalization.

In principle histogram B can be used to distinguish gold and silver labels. To improve signal separation, however, the background data should be incorporated to reduce the artefact dominant in both data sets, since the slight difference between the first eigenvectors in Fig. 2 could indicate that there is useful information in the first eigenvector of the set of label spectra which could lead to a better separation between gold and silver. In a further step of analysis the set of label spectra is therefore projected onto the eigenvectors 2-7 of the background data set, leading to a set of six-dimensional spectra (in the co-ordinate system defined by the eigenvectors 2-7 of the background spectra) which now shows no variance due to the mentioned artefact. Diagonalization of this new data set, which is already centred, now leads to discrimination between gold and silver in the first eigenvector.

Fig. 4 shows the projection of this data set onto the plane defined by the first two eigenvectors. A separation into two classes is clearly discernible from this plot, and the statistical significance can be estimated from Fig. 5, which shows the eigenvalues corresponding to the six eigenvectors of the covariance matrix. The eigenvalues can be interpreted as variances in the direction of the corresponding eigenvectors and are a measure of how much the data spreads in a given direction. Notice that the last eigenvalue is 0 because, owing to the normalization applied, all data points lie on a hyper plane as explained above. The eigenvector corresponding to the eigenvalue 0 is simply orthogonal to that plane.

Fig. 4. Scatter plot of the label spectra projected onto the eigenvectors 2-7 of the background spectra in the plane given by the first two eigenvectors of this new data set with reduced dimensions.

Fig. 5. Eigenvalues of the reduced dimension data set as in Fig. 4.

The histogram of the data set projected onto the first eigenvector is shown in Fig. 6. This histogram clearly separates into two peaks. A sum of two Gaussians fits this histogram very well, indicating that the data set falls into two classes of spectra both exhibiting statistical noise. The classification with the smallest error is achieved by placing the boundary approximately at the point of intersection of the two Gaussians.

Fig. 6. Histogram of the projection of the reduced dimension data set as in Fig. 4 onto its first eigenvector (a) and a fit of a sum of two Gaussians into the histogram (b). In (b) the individual Gaussians and their sum are shown as solid lines and the histogram of (a) as dots.

An identification of gold and silver labels according to this classification is shown in Fig. 7, with the gold labels marked by black rings and the silver labels by white rings. To increase image contrast of the unstained specimen, Fig. 7 is the superposition of the 20 eV loss energy filtered image and the label positions from the bright field image. This was necessary since the unstained muscle sample gives very poor contrast in bright field whereas the labels show up very poorly in energy loss images. This second effect is due to a counteraction of two contrast mechanisms in energy filtered imaging. Material of low atomic weight is seen in a sort of dark field since here the inelastic scattering dominates, whereas for high atomic weight areas such as the labels a high percentage of electrons is lost to elastic high angle scattering and therefore the labels have a bright field appearance. If the cross sections for elastic and inelastic scattering with a given energy loss have a certain ratio the labels cannot be distinguished from the background, which can be shown for both gold and silver labels at different energy losses.

Fig. 7. Classification of the labels on a double labelled muscle section into gold (black rings) and silver (white rings) labels. The border between the two classes is given by the point of intersection of the two Gaussians in Fig. 6b, gold being on the right hand side and silver on the left.

The reliability of the classification into gold and silver spectra can be estimated from the Gaussian fit in Fig. 6b. The parameters and errors of the fit are given in Table 2. It is clear from Fig. 7 that the labels classified as gold are concentrated in the bright area of thick myosin filaments on the top left half of the image whereas the labels classified as silver are concentrated in the darker area of thin filaments containing troponin. Apparently misplaced labels can be explained partly by misclassification of labels in the region of the overlap of the two Gaussians in Fig. 6b and partly by unspecific labelling, which always occurs with a certain probability.

Table 2
Function, parameters and errors of double Gaussian fit in Fig. 6b

$$I(x) = A_1 \exp\left(-\frac{(x - c_1)^2}{2 s_1^2}\right) + A_2 \exp\left(-\frac{(x - c_2)^2}{2 s_2^2}\right)$$

Parameter    Value       Error
s_1          0.00337     0.00057
s_2          0.00299     0.00023
c_1          -0.00718    0.00054
c_2          0.00310     0.00023
A_1          10.387      1.0928
A_2          23.46       1.1473

3.3. External standards
The principal component analysis applied above can obviously only classify a data set into subsets, but without additional knowledge it is not possible to tell which subset is associated with which label, that is, which are the gold and which are the silver spectra. In many practical cases the information one gets from this sort of analysis will therefore be insufficient to answer the biological questions underlying the labelling experiment. There are various ways to overcome this difficulty. For instance one could use one of the labels bound to two different antibodies, one against the antigen of interest and the other against a well localized epitope (in muscle a protein on the Z-disk for instance). This method of an “internal standard” will make the biochemistry of labelling more complicated, since for double labelling one would need three different label-antibody combinations (and for triple labelling five). Another method is to use an “external standard”, that is a second data set where one has already identified the subsets by some additional criterion.

In the following this avenue will be taken with the data set analyzed above as an external standard and the spectra taken from the second set of images as the unknown data set to be classified. In a first step the complete principal component analysis is repeated on this second set of label and background spectra. As above both the label and background spectra are normalized to unity and each set is centred. The artefactual anisochromasy due to aberrations of the energy filter is extracted from the set of label spectra by restricting this set to a six-dimensional subspace orthogonal to the vector describing the artefact in the original seven-dimensional co-ordinate space. The results of principal component analysis applied to this dimensionally reduced set of label spectra are shown in Figs. 8-10. Judging from the scatterplot in the first two eigenvectors shown in Fig. 8 and from the eigenvalues shown in Fig. 9, the separability of gold and silver in this second data set is comparable to that analyzed above. The function, parameters and errors of the double Gaussian fit to the histogram of the projection onto the first eigenvector shown in Fig. 10 are given in Table 3.

Fig. 8. Scatterplot of the second set of label spectra projected onto the eigenvectors 2-7 of the corresponding background spectra in the plane given by the first two eigenvectors of this new data set with reduced dimensions.

Fig. 9. Eigenvalues of the reduced dimension data set as in Fig. 8.

Fig. 10. Histogram of the projection of the dimensionally reduced data set as in Fig. 8 onto its first eigenvector (a) and a fit of a sum of two Gaussians into the histogram (b). In (b) the individual Gaussians and their sum are shown as solid lines and the histogram of (a) as dots.

Table 3
Function, parameters and errors of double Gaussian fit in Fig. 10b

$$I(x) = A_1 \exp\left(-\frac{(x - c_1)^2}{2 s_1^2}\right) + A_2 \exp\left(-\frac{(x - c_2)^2}{2 s_2^2}\right)$$

Parameter    Value       Error
s_1          0.00583     0.00150
s_2          0.00474     0.00039
c_1          -0.01078    0.00125
c_2          0.00448     0.00045
A_1          6.5196      0.72662
A_2          17.359      0.83831

The first data set will now be treated as an external standard with which to compare this second data set in order to find out which of the two peaks in the histogram in Fig. 10 corresponds to silver and which to gold spectra. The comparison is carried out using the normalized seven-dimensional label spectra (which still show the anisochromasy) and not the dimensionally reduced spectra. The discriminant function of the external standard data set is calculated (see Appendix C) based on the classification derived in Section 3.2, and then the data set of unknown label spectra is projected onto this discriminant function. The direction of the discriminant function (which is of course a vector in seven-dimensional co-ordinate space) is chosen such that it points from the silver to the gold spectra, so the gold spectra of the unknown data set should have higher projection values than the silver spectra. The histogram of this projection is shown in Fig. 11.

To check the consistency of the class separation due to the discriminant function shown in Fig. 11 and that due to the first eigenvector of the reduced data set shown in Figs. 8-10, the two projection values are plotted against one another in Fig. 12. The linear behaviour of this plot shows that there is a strong correlation between these two projections. It is therefore possible to classify the unknown data set using Fig. 10 and subsequently identify the two classes by comparison with an external standard using discriminant function analysis.

Fig. 11. Histogram of the projection of the unknown data set of normalized label spectra onto the discriminant function of the external standard data set.

Fig. 12. Plot of the projection of the dimensionally reduced data set (see Figs. 8-10) onto its first eigenvector against the projection of the normalized label data set onto the discriminant function of the external standard.

A classification of the labels into gold and silver based on the double Gaussian fit in Fig. 10b, such that the border between the two classes is at the intersection of the two Gaussians, is shown in Fig. 13. The analyzed area includes only regions of thick filaments and a small overlap region, therefore gold and silver labels are evenly distributed over the whole area with gold being far more abundant. This second data set excludes the possibility that the analysis simply classifies the labels according to where they lie on the sample, as could be suspected from the results presented in Section 3.2.

Fig. 13. Classification of the labels on a double labelled muscle section into gold (black dots) and silver (white dots). The border between the two classes is given by the point of intersection of the two Gaussians in Fig. 10b, gold being > -0.004 and silver < -0.004. The 1024 by 1024 pixel area analyzed is highlighted within the total 2000 by 2000 pixel area scanned.
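The two ingredients of this identification step can be sketched as follows. This is an illustration under stated assumptions, not the authors' Khoros implementation: the discriminant vector is computed in the classic two-class Fisher form (inverse within-class scatter times the difference of the class means), a pseudo-inverse is used because zero-sum spectra make the scatter matrix singular, the double Gaussian is taken as a sum of terms A exp(-(x - c)²/(2 s²)), and all function names and the synthetic spectra are hypothetical. Only the six fit parameters in the last call are taken from Table 3.

```python
import numpy as np

def discriminant_direction(silver, gold):
    """Unit vector pointing from the silver to the gold class (rows are spectra)."""
    diff = gold.mean(axis=0) - silver.mean(axis=0)
    # Pooled within-class scatter matrix W of the external standard.
    w = sum((c - c.mean(axis=0)).T @ (c - c.mean(axis=0)) for c in (silver, gold))
    # Zero-sum spectra make W singular, so a pseudo-inverse is used (an assumption).
    a = np.linalg.pinv(w, rcond=1e-10) @ diff
    return a / np.linalg.norm(a)

def gaussian_boundary(a1, c1, s1, a2, c2, s2):
    """Point between c1 and c2 where the two fitted Gaussians intersect."""
    # Equating the logarithms of both Gaussians gives p*x^2 + q*x + r = 0.
    p = 1 / (2 * s2**2) - 1 / (2 * s1**2)
    q = c1 / s1**2 - c2 / s2**2
    r = c2**2 / (2 * s2**2) - c1**2 / (2 * s1**2) + np.log(a1 / a2)
    lo, hi = sorted((c1, c2))
    return next(float(x.real) for x in np.roots([p, q, r])
                if abs(x.imag) < 1e-12 and lo < x.real < hi)

def fake_spectra(shape, n, rng):
    """Hypothetical zero-sum 7-component spectra scattered around a mean shape."""
    noise = rng.normal(0.0, 5e-4, size=(n, shape.size))
    return shape + noise - noise.mean(axis=1, keepdims=True)

rng = np.random.default_rng(1)
gold_shape = np.array([3, 2, 1, 0, -1, -2, -3]) * 1e-3   # invented, not measured
standard_gold = fake_spectra(gold_shape, 60, rng)        # external standard
standard_silver = fake_spectra(-gold_shape, 40, rng)
unknown_gold = fake_spectra(gold_shape, 30, rng)         # set to be identified
unknown_silver = fake_spectra(-gold_shape, 30, rng)

a = discriminant_direction(standard_silver, standard_gold)
# Gold spectra of the unknown set should project higher than silver spectra.
print((unknown_gold @ a).mean() > (unknown_silver @ a).mean())

# Table 3 parameters: the boundary comes out close to the -0.004 quoted for Fig. 13.
print(gaussian_boundary(6.5196, -0.01078, 0.00583, 17.359, 0.00448, 0.00474))
```

With the Table 3 parameters the intersection lies close to the border of -0.004 quoted in the caption of Fig. 13, which supports the assumed form of the Gaussian terms.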
4. Conclusion and outlook
The new method of double immuno labelling presented here is not limited by the necessity of using labels of different sizes and could therefore open a whole range of new biological applications requiring very small labels. It also does not depend on labels of very different atomic weight, which allows two similarly heavy labels (such as gold and platinum) to be used, resulting in better visibility, especially in stained samples. Only a defined electronic structure of the labels is required, which leads to a defined electron energy loss spectrum. This can be achieved in two ways. As soon as a label reaches a certain size it starts developing characteristics of a solid. Electronic behaviour typical of bulk material sets in at cluster sizes of about 300 atoms, that is diameters of about 2 nm, though only the cores of such clusters show their specific metallic behaviour whereas the properties of the surface atoms depend also on the environment (such as ligands bound to the surface) [17]. Another way to ensure a characteristic EEL spectrum is to use uniform clusters with a defined atomic structure and ligand environment. Such clusters are available even at diameters smaller than 2 nm and some of them are already used in immuno labelling [4,17].
Finally it should be emphasized that the method is not restricted to two labels. The number of different labels that can be distinguished depends only on the number of labels with sufficiently different energy loss spectra that can be produced. Therefore it should be possible to go far beyond the limitations of conventional multiple labelling techniques.

Acknowledgements
The authors would like to thank Thomas Bastian and Magdalena Akke for their software contributions to the image programming system “Khoros” and assistance in computer related matters and Willem Tichelaar for the test samples.

Appendix A. Principal component analysis
The following presentation follows Ref. [18]. The aim of principal component analysis (PCA) is to find “the best description” of a multivariate data set (a set of spectra) by finding the principal axes of the data set. It is easiest to visualize the data set as a collection of points in a multidimensional co-ordinate space. The first principal axis is such that the sum of squares of the distances of the data points from the axis is a minimum (and the variance along the axis a maximum); the second principal axis meets the same requirement with the restriction that it has to be orthogonal to the first axis. It can be shown that the principal axes have the same direction as the eigenvectors of the covariance matrix of the data set arranged in decreasing order of the corresponding eigenvalues.

A data set consisting of N spectra $D_j^{(i)}$, where the index i (from 1 to N) denotes the spectra and the index j (from 1 to J) denotes the components of each spectrum, can be represented as a matrix $D_{ji}$ with dimensions J × N, where each spectrum is a column vector. This set of spectra with J components each can be visualized as a set of vectors in a J-dimensional co-ordinate space. The spectra are assumed to be normalized such that the mean spectrum of the whole data set is the zero vector:

$$\bar{D}_j = \frac{1}{N} \sum_{i=1}^{N} D_{ji} = 0 \quad \text{for } j = 1 \text{ to } J. \tag{1}$$

This means that the centre of gravity of the whole data set lies in the origin of the J-dimensional space. The main step of PCA is the diagonalization of the covariance matrix, which for vanishing mean spectra is $D_{ji} D_{ki}$, where identical indices signify that a summation is to be performed. Eigenvalues and eigenvectors can be calculated from the equation

$$D_{ji} D_{ki} U_k^{(n)} = \lambda^{(n)} U_j^{(n)}, \tag{2}$$

where $U_j^{(n)}$ is the jth component of the nth eigenvector and $\lambda^{(n)}$ is the corresponding eigenvalue. The index n runs from 1 to J. The eigenvalues are arranged in decreasing order and the eigenvectors are normalized to a length of 1, that is they fulfil the orthonormality condition

$$U_j^{(n)} U_j^{(m)} = \delta_{nm}. \tag{3}$$

The J eigenvectors define a new co-ordinate space which can be derived from the original one by a rotation in J dimensions. The variance matrix in this new co-ordinate system is diagonal, with the eigenvalues as diagonal elements (in decreasing order). Now the central idea of this application of PCA is that the eigenvector corresponding to the largest eigenvalue and defining the direction of greatest variance of the data set contains the most information about the data. If in the data set there are two distinct groups of spectra, which form two clouds of points in co-ordinate space, they should be separated furthest in the direction of the first eigenvector (if the mean spectrum is zero) and thus be most easily distinguishable. By projecting the original data set onto the J eigenvectors one obtains the data matrix in the rotated (diagonal) co-ordinate system:

$$D'_{ni} = D_{ji} U_j^{(n)}. \tag{4}$$

In this data set most of the information is in matrix elements with low index n since these result from multiplication with eigenvectors corresponding to the highest eigenvalues. The high-n portion of the matrix contains mainly noise and can be used to identify aberrant spectra and to determine the type of noise in the data. The projection of the data cloud onto the plane of the first two eigenvectors is nothing but a two-dimensional plot of the N points defined by $D'_{1i}$ as x-values and $D'_{2i}$ as y-values for i = 1 to N.

The same analysis can also be carried out in an N-dimensional co-ordinate space, where N is the number of spectra in the data set as defined above. The variance matrix then reads $D_{ji} D_{jk}$ and the eigenvalue equation

$$D_{ji} D_{jk} V_k^{(m)} = \mu^{(m)} V_i^{(m)} \quad \text{for } m = 1 \text{ to } N. \tag{5}$$

The following relations between $\lambda^{(n)}$, $U^{(n)}$, $\mu^{(m)}$ and $V^{(m)}$ can be found (assuming that J is smaller than or equal to N):

$$\lambda^{(n)} = \mu^{(n)} \quad \text{for } n \le J, \tag{6}$$

$$D_{ki} U_k^{(n)} = \sqrt{\lambda^{(n)}} \, V_i^{(n)} \quad \text{for } n \le J, \tag{7}$$

$$D_{ki} V_i^{(n)} = \sqrt{\lambda^{(n)}} \, U_k^{(n)} \quad \text{for } n \le J, \tag{8}$$

$$\mu^{(n)} = 0 \quad \text{for } n > J. \tag{9}$$

Note that, apart from a scalar factor, $V^{(n)}$ is simply the projection of the original data onto $U^{(n)}$ and vice versa as discussed above. The original data set can be exactly or approximately reconstructed from the eigenvalues and the two sets of eigenvectors:

$$D_{ji} = \sum_{n=1}^{J} \sqrt{\lambda^{(n)}} \, U_j^{(n)} V_i^{(n)}. \tag{10}$$

For an approximate reconstruction the summation index n need only run from 1 to P (P < J), where P is chosen in such a way that for every m > P $\lambda^{(m)}$ is negligible compared to the total variance Var, which is given by the trace of the variance matrix

$$\mathrm{Var} = \sum_{n=1}^{J} \lambda^{(n)}. \tag{11}$$
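The relations of this appendix can be checked numerically. The following sketch assumes a hypothetical random data set with J = 7 components and N = 50 spectra; it applies only the centring of Eq. (1), not the integral normalization, so that all eigenvalues are positive and the division in Eq. (7) is safe. The variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
J, N = 7, 50                                  # components per spectrum, number of spectra
D = rng.normal(size=(J, N))                   # hypothetical data, one spectrum per column
D -= D.mean(axis=1, keepdims=True)            # Eq. (1): the mean spectrum is zero

lam, U = np.linalg.eigh(D @ D.T)              # Eq. (2): J x J matrix D_ji D_ki
order = np.argsort(lam)[::-1]                 # eigh returns ascending eigenvalues
lam, U = lam[order], U[:, order]              # columns of U are the eigenvectors U^(n)

D_rot = U.T @ D                               # Eq. (4): D'_ni = D_ji U_j^(n)
V = D_rot / np.sqrt(lam)[:, None]             # Eq. (7): row n of V is V^(n)

P = 2                                         # number of components kept
D_approx = U[:, :P] @ (np.sqrt(lam[:P])[:, None] * V[:P])   # Eq. (10), truncated at P
explained = lam[:P].sum() / lam.sum()         # share of the total variance, Eq. (11)
print(f"{explained:.2f}")
```

Plotting `D_rot[0]` against `D_rot[1]` gives the kind of two-dimensional scatter plot described after Eq. (4), and the squared error of the truncated reconstruction equals the sum of the discarded eigenvalues.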
Appendix B. Normalization of data

In order to obtain meaningful results with PCA it is necessary to normalize the data in an appropriate way. In a first normalization step the spectra are normalized to give an integral of 1. With $X$ denoting raw spectra and $Y$ denoting spectra after the first normalization step this can be written:

$$Y_{ji} = \frac{X_{ji}}{S_i} \quad \text{with} \quad S_i = \sum_{j=1}^{J} X_{ji}. \tag{12}$$

The physical reason for this step is that the integral of the low-loss peak depends strongly on parameters such as sample thickness and beam current, whereas it is the shape of the peak that is used to distinguish gold and silver labels. In a second step the average spectrum of the whole data set is subtracted from each individual spectrum to obtain a data set that shows only the deviations of each spectrum from the average:

$$D_{ji} = Y_{ji} - \frac{1}{N}\sum_{i'=1}^{N} Y_{ji'}. \tag{13}$$

The data set normalized in this way has the following properties:

$$\sum_{j=1}^{J} D_{ji} = 0 \quad \text{for } i = 1 \text{ to } N, \tag{14}$$

that is, the integral of every spectrum is zero, or, in other words, each data vector lies on the hyperplane $\sum_{j=1}^{J} x_j = 0$ in $J$-dimensional space, and

$$\sum_{i=1}^{N} D_{ji} = 0 \quad \text{for } j = 1 \text{ to } J, \tag{15}$$

that is, the centre of gravity of all the data vectors lies in the origin of the $J$-dimensional co-ordinate space. This second property ensures that the first eigenvector (corresponding to the largest eigenvalue) of the variance matrix gives the direction of greatest variance of the data set. Without the second normalization step, centre of gravity and origin do not normally coincide, and the first eigenvector will be approximately parallel to the direction defined by these two points unless their separation is small compared to the size of the data cloud. The first property (14) means that one of the eigenvalues is zero, namely that corresponding to the eigenvector $U_j = 1/\sqrt{J}$ for $j = 1$ to $J$, which is simply orthogonal to the hyperplane mentioned above.

Appendix C. Discriminant function analysis

The following presentation follows Ref. [18]. The aim of discriminant function analysis is to find the best description of a data set already separated into classes. The following outline will be restricted to two classes, which means that the discriminant function, a vector in data space, is defined such that the variance (of the data set projected onto this vector) within each class is a minimum and the variance between the classes a maximum. Of course in this case there is only one discriminant function. The covariance between two components $j$ and $j'$ of the vectors is written:

$$\mathrm{cov}(j,j') = \frac{1}{N}\sum_{i=1}^{N} (D_{ji} - \bar{D}_j)(D_{j'i} - \bar{D}_{j'}), \tag{16}$$

where the mean of component $j$ over the entire data set is $\bar{D}_j = (1/N)\sum_{i=1}^{N} D_{ji}$, which with the normalization chosen (see Appendix B) is zero. If the column vectors of $D$ are partitioned into $q$ classes with index $k = 1$ to $q$, the covariance matrix can also be written:

$$\mathrm{cov}(j,j') = \frac{1}{N}\sum_{k=1}^{q}\Bigl(\sum_{i \in I_k} (D_{ji} - \bar{D}_j)(D_{j'i} - \bar{D}_{j'})\Bigr). \tag{17}$$

This expression can be factorized into four terms using the identity $(D_{ji} - \bar{D}_j) = (D_{ji} - \bar{D}_{j(k)}) + (\bar{D}_{j(k)} - \bar{D}_j)$, where $\bar{D}_{j(k)}$ is the average of $D_{ji}$ taken over the class $k$: $\bar{D}_{j(k)} = (1/N_k)\sum_{i \in I_k} D_{ji}$, where $N_k$ is the number of spectra in class $k$. Two of these terms are zero and the other two give the Huyghens decomposition formula (or analysis of variance equation), which decomposes the total covariance into the sum of covariances within each class and the covariance between the class averages:

$$T = W + B, \tag{18}$$

where

$$T_{jj'} = \mathrm{cov}(j,j'), \tag{19}$$

$$W_{jj'} = \frac{1}{N}\sum_{k=1}^{q}\sum_{i \in I_k} (D_{ji} - \bar{D}_{j(k)})(D_{j'i} - \bar{D}_{j'(k)}), \tag{20}$$

$$B_{jj'} = \sum_{k=1}^{q}\frac{N_k}{N}\,(\bar{D}_{j(k)} - \bar{D}_j)(\bar{D}_{j'(k)} - \bar{D}_{j'}). \tag{21}$$
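The two normalization steps of Appendix B and the decomposition $T = W + B$ can be checked numerically. The following is a minimal sketch, assuming the raw spectra are stored as the columns of a $J \times N$ array; the function names and the `labels` array assigning each spectrum to a class are hypothetical and introduced only for illustration.

```python
import numpy as np

def normalize_spectra(X):
    """Apply both normalization steps to a J x N array of raw spectra (one per column)."""
    Y = X / X.sum(axis=0, keepdims=True)     # Eq. (12): every spectrum integrates to 1
    D = Y - Y.mean(axis=1, keepdims=True)    # Eq. (13): subtract the average spectrum
    return D

def covariance_decomposition(D, labels):
    """Return total (T), within-class (W) and between-class (B) covariance matrices."""
    labels = np.asarray(labels)
    J, N = D.shape
    mean = D.mean(axis=1, keepdims=True)     # overall mean, zero after normalization
    T = (D - mean) @ (D - mean).T / N        # Eqs. (16) and (19)
    W = np.zeros((J, J))
    B = np.zeros((J, J))
    for k in np.unique(labels):
        Dk = D[:, labels == k]               # spectra belonging to class k
        Nk = Dk.shape[1]
        mean_k = Dk.mean(axis=1, keepdims=True)
        W += (Dk - mean_k) @ (Dk - mean_k).T / N
        B += (Nk / N) * (mean_k - mean) @ (mean_k - mean).T
    return T, W, B
```

For any partition of the spectra into classes, `T` and `W + B` should agree to rounding error, and the column and row sums of `D` should vanish as stated in Eqs. (14) and (15).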
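The centred projection of data vector $D(i)$ onto a given $J$-dimensional vector $A$ is:

$$a(i) = \sum_{j=1}^{J} A_j (D_{ji} - \bar{D}_j). \tag{22}$$

The variance of $a(i)$ is:

$$\mathrm{var}(a) = \frac{1}{N}\sum_{i=1}^{N} a^2(i) = \frac{1}{N}\sum_{i=1}^{N}\Bigl[\sum_{j=1}^{J}\sum_{j'=1}^{J} A_j A_{j'} (D_{ji} - \bar{D}_j)(D_{j'i} - \bar{D}_{j'})\Bigr]. \tag{23}$$

By changing the order of summation this can be written:

$$\mathrm{var}(a) = \sum_{j=1}^{J}\sum_{j'=1}^{J} A_j A_{j'}\,\mathrm{cov}(j,j'). \tag{24}$$

$A$ is a discriminant function if it maximizes $A_j B_{jj'} A_{j'}$ and minimizes $A_j W_{jj'} A_{j'}$, or else maximizes $(A_j B_{jj'} A_{j'})/(A_k W_{kk'} A_{k'})$, where summation over repeated indices is implied. Owing to Eq. (18),

$$A_j T_{jj'} A_{j'} = A_j W_{jj'} A_{j'} + A_j B_{jj'} A_{j'},$$

so this is equivalent to maximizing

$$f(A) = \frac{A_j B_{jj'} A_{j'}}{A_k T_{kk'} A_{k'}}.$$

Since $f(A)$ is invariant to multiplication of $A$ by a constant factor, this corresponds to maximizing $A_j B_{jj'} A_{j'}$ with the constraint $A_j T_{jj'} A_{j'} = 1$, leading to the matrix equation (see Ref. [18], p. 26ff):

$$T^{-1} B A = \lambda A, \tag{25}$$

if $T$ is non-singular, which is generally the case. The eigenvector $A$ of this equation is the discriminant function and the eigenvalue $\lambda$ is sometimes called the discriminating power of $A$. In the case of two groups with indices 1 and 2 the matrix of between-group covariances is:

$$B_{jj'} = \sum_{k=1}^{2}\frac{N_k}{N}\,(\bar{D}_{j(k)} - \bar{D}_j)(\bar{D}_{j'(k)} - \bar{D}_{j'}), \tag{26}$$

which, since $\bar{D}_j = (N_1 \bar{D}_{j(1)} + N_2 \bar{D}_{j(2)})/N$, can be written as $B_{jj'} = (N_1 N_2/N^2)(\bar{D}_{j(1)} - \bar{D}_{j(2)})(\bar{D}_{j'(1)} - \bar{D}_{j'(2)})$. The matrix $B$ can be considered as the product of a vector $C$, with components $C_j = (\sqrt{N_1 N_2}/N)(\bar{D}_{j(1)} - \bar{D}_{j(2)})$, with its transpose:

$$B = C C^{T}, \tag{27}$$

leading to:

$$T^{-1} C C^{T} A = \lambda A. \tag{28}$$

Premultiplying by the transpose of $C$ leads to:

$$[C^{T} T^{-1} C]\, C^{T} A = \lambda\, C^{T} A, \tag{29}$$

giving

$$\lambda = C^{T} T^{-1} C \tag{30}$$

as the only non-zero eigenvalue of this equation and $A = T^{-1} C$ as the corresponding eigenvector.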
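The closed-form result of Appendix C for two classes can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the data vectors are the columns of `D`, the scaling of `C` is chosen so that $C C^{T}$ equals the between-class covariance for a covariance normalized by $1/N$, and $T$ is assumed to be non-singular (for example because the spectra are expressed by a few principal-component coordinates rather than by all $J$ channels).

```python
import numpy as np

def two_class_discriminant(D, labels):
    """Return the discriminant vector A = T^-1 C and its discriminating power lambda.

    D is a J x N array with one data vector per column; labels holds exactly two
    distinct class values. T is assumed to be non-singular.
    """
    labels = np.asarray(labels)
    classes = np.unique(labels)
    if classes.size != 2:
        raise ValueError("exactly two classes are required")
    J, N = D.shape
    Dc = D - D.mean(axis=1, keepdims=True)
    T = Dc @ Dc.T / N                              # total covariance matrix
    D1 = D[:, labels == classes[0]]
    D2 = D[:, labels == classes[1]]
    N1, N2 = D1.shape[1], D2.shape[1]
    # C is scaled so that the outer product C C^T is the between-class covariance.
    C = np.sqrt(N1 * N2) / N * (D1.mean(axis=1) - D2.mean(axis=1))
    A = np.linalg.solve(T, C)                      # A = T^-1 C without forming the inverse
    lam = float(C @ A)                             # lambda = C^T T^-1 C
    return A, lam

def project(D, A):
    """Centred projection of every data vector onto A, as in Eq. (22)."""
    return A @ (D - D.mean(axis=1, keepdims=True))
```

Solving the linear system instead of inverting $T$ explicitly is a numerical design choice; the resulting `lam` lies between 0 and 1 and approaches 1 as the two classes become better separated along `A`.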
References

[1] M. Bendayan, Protein A-Gold and Protein G-Gold Postembedding Immunoelectron Microscopy, in: Colloidal Gold: Principles, Methods and Applications, Ed. M.A. Hayat (Academic Press, San Diego, 1989) p. 33.
[2] E. De Harven and D. Soligo, Backscattered Electron Imaging of the Colloidal Gold Marker on Cell Surfaces, in: Colloidal Gold: Principles, Methods and Applications, Ed. M.A. Hayat (Academic Press, San Diego, 1989) p. 229.
[3] M. Horisberger, Colloidal Gold for Scanning Electron Microscopy, in: Colloidal Gold: Principles, Methods and Applications, Ed. M.A. Hayat (Academic Press, San Diego, 1989) p. 217.
[4] J.F. Hainfeld and F.R. Furuya, J. Histochem. Cytochem. 40 (1992) 177.
[5] R. Newman, G.W. Butcher, B. Bullard and K.R. Leonard, J. Cell Sci. 101 (1992) 503.
[6] W. Tichelaar, C. Ferguson, J.-C. Olive, K.R. Leonard and M. Haider, J. Microscopy 175 (1994) 10.
[7] M. Acheche, C. Colliex and P. Trebbia, Scanning Electron Microscopy/1986 (SEM, AMF O'Hare, IL, 1986) p. 25.
[8] H. Raether, Excitation of Plasmons and Interband Transitions by Electrons (Springer, Berlin, 1980).
[9] J. Daniels, C. v. Festenberg, H. Raether and K. Zeppenfeld, Optical Constants of Solids by Electron Spectroscopy, in: Springer Tracts in Modern Physics 54 (Springer, Heidelberg, 1970) p. 77.
[10] M. Haider, Ultramicroscopy 28 (1989) 190.
[11] P.J.B. Koeck, Unconventional Immuno Double Labelling by Electron Spectroscopic Imaging and Multivariate Statistical Analysis, Doctoral Thesis, Karl-Franzens-Universität Graz, Austria, 1995.
[12] P. Trebbia and N. Bonnet, Ultramicroscopy 34 (1990) 165.
[13] P. Trebbia and C. Mary, Ultramicroscopy 34 (1990) 179.
[14] E.S. Gelsema, A.L.D. Beckers, C.W.J. Sorber and W.C. de Bruijn, J. Microscopy 166 (1992) 287.
[15] E.S. Gelsema, A.L.D. Beckers and W.C. de Bruijn, J. Microscopy 174 (1994) 161.
[16] K. Konstantinides and J.R. Rasure, IEEE Trans. Image Process. 3 (1994) 243.
[17] G. Schmid, Chem. Rev. 92 (1992) 1709.
[18] L. Lebart, A. Morineau and K.M. Warwick, Multivariate Descriptive Statistical Analysis (Wiley, New York, 1984).