ADVANCES IN ELECTRONICS A N D ELECTRON PHYSICS. VOL . 66
Statistical Aspects of Image Handling in Low-Dose Electron Microscopy of Biological Material CORNELIS H . SLUMP* AND HEDZER A . FERWERDA Department of Applied Physics Rijksuniuersiteit Groningen Groningen. 7he Netherlands
I . Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A . Need for a Fundamental Statistical Analysis . . . . . . . . . . . . . . . . . . . . . B. Interaction of the Electron Beam with the Specimen . . . . . . . . . . . . . . . . .
I1.
111.
1V.
V.
C. Relation between the Object Structure and the Electron Wave Function . . . . . D . Image Formation in the CTEM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . E. Remarks about the Contrast Mechanisms . . . . . . . . . . . . . . . . . . . . . . . F. The Phase Problem in Electron Microscopy . . . . . . . . . . . . . . . . . . . . . G . The Stochastic Process Characterizing the Low-Dose Image . . . . . . . . . . . . Object Wave Reconstruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A . Introduction and Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B. Derivation of the Basic Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . C. Solution of the Integral Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . D . Statistical Analysis of an Approximate Solution . . . . . . . . . . . . . . . . . . . Wave-Function Reconstruction of Weak Scatterers . . . . . . . . . . . . . . . . . . . A . Introductory Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B. Axial Illumination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C. Tilted Illurnination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Parameter Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A . Maximum Likelihood Estimation in Electron Microscopy . . . . . . . . . . . . . B. Illustrative Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C. Two-Dimensional Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D . Discussion and Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Statistical Hypothesis Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A . lntroduction to Statistical Hypothesis Testing in Electron Microscopy . . . . . . B. Object Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C. Position Detection of Marker Atoms . . . . . . . . . . . . . . . . . . . . . . . . . . D . Statistical Significance of Image Processing . . . . . . . . . . . . . . . . . . . . . . E . Discussion and Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Appendix A: The Statistical Properties of the Fourier Transform of the Low-Dose Image . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Appendix B: The Statistical Properties of an Auxiliary Variable . . . . . . . . . . . . Appendix C: The Cramer-Rao Bound . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
202 202 203 204 205 201 209 209 213 213 215 219 224 230 230 230 242 254 254 259 213 216 211
211 281 290 295 295 291 299 305 306
* Present address: Philips Medical Systems. Eindhoven. The Netherlands. 20 1 Copyright 0 1986 by Academic Press. Inc All rights of reproduction in any form reserved.
202
CORNELIS H. SLUMP AND HEDZER A. FERWERDA
I. INTRODUCTION A . Need for a Fundamental Statistical Analysis
It is well known that in the electron microscopy of biological specimens radiation damage constitutes a severe problem. This radiation damage manifests itself as the breaking of chemical bonds due to the bombardment of the specimen by electrons. Obviously, radiation damage can be reduced by limiting the number of electrons during exposure. The price to be paid for this economy is that the pictures exhibit a grainy appearance (“shot noise”) which introduces a probabilistic feature in the imaging process. The image is to be considered as the realization of a stochastic process. The processing of the noisy micrographs consequently acquires a stochastic nature. It is to be expected that the smaller the number of participating electrons the larger the uncertainty in the results will be. It is the aim of the present article to give a statistical characterization of the results obtained from noisy exposures. By this we mean that not only the average of a certain calculated quantity is needed but also its variance, etc. The present analysis puts intuitive notions such as signal-to-noise ratio on a firm basis. Ideally, the probability density function of the relevant quantity should be determined. As we shall see in the subsequent sections, such a goal may be too ambitious in general. In the past several attempts have been made in electron microscopy to improve the statistical significance of the obtained results. The most obvious approach is to repeat the experiments under identical circumstances. This will reduce the variance of the quantities in which we are interested. Unwin and Henderson (1975) achieve this for substances which can be crystallized. In this case the number of repetitions of the exposure of one single unit equals the number of unit cells in the crystal. Unfortunately, not all biological materials can be forced to crystallize. Periodic structures can be handled by the Fourier filtering method of Unwin and Henderson (1975) or the cross-correlating techniques of Saxton and Frank (1977). Nonperiodic objects have only been analyzed in a systematic way by techniques borrowed from pattern recognition, e.g., cluster analysis. Van Heel and Frank (1981) have applied a statistical version of such an algorithm, called correspondence analysis, to a large number of images that each contain a similar, single isolated biological macromolecule. These images could thereafter be oriented with respect to each other, aligned and averaged, resulting in a higher signal-to-noise ratio and showing many more details of the object. Even in the case of the abovementioned improvements in signal-to-noise ratio, a statistical characterization of the final quantities remains necessary.
IMAGE HANDLING IN ELECTRON MICROSCOPY
203
Before entering upon the statistical analysis proper we have to discuss briefly the interaction of the electron beam with the specimen and the image formation in the CTEM (conventional transmission electron microscope).
B. Interaction of the Electron Beam with the Specimen
A basic understanding of the scattering of electrons by the specimen under consideration is needed if one wants to deduce the object structure from one or more micrographs. Our discussion will be very sketchy. A more detailed account can be found in Heidenreich (1964), Haine (1961), Misell (1973a). We distinguish two categories of interactions: (1) inelastic interactions, and (2) elastic interactions. In inelastic interactions the incident electrons transfer energy and momentum to the object, leading to an excitation of the specimen. Such an excitation could be a breaking of a chemical bond, ionization or excitation to another energy level, plasmon interactions, etc. In elastic interactions there is no internal excitation of the object whatsoever. The transfer of momentum and energy is determined by conservation of energy and momentum. The elastic scattering is the scattering of the incident electrons from the electrons and atomic nuclei (Coulomb scattering) in the specimen. When considering monoenergetic incident electrons, the characteristic difference between elastic and inelastic scattering behavior is that the spread in scattering angles is larger for elastic scattering than for inelastic scattering. Intuitively we can use this fact to make some guesses about the resolution which might be obtained when deducing the object structure from the images. This calculation is an idealization because the scattering characteristics of the objects are recorded in some complicated way in the electron micrograph, as will be clarified in the subsequent sections. Let Ap denote the spread of the momentum transfer between the incident electron and the object. According to the Heisenberg uncertainty principle, the object can be localized with an uncertainty of position of the order Ax = %i/Ap,-h being Planck’s constant divided by 271. As Ap is largest for elastic scattering, high-resolution work (aiming at discovering structural information of the order of a few angstrom) has to utilize elastically scattered electrons. It is convenient to remove the inelastically scattered electrons by a filter lens (see e.g., Henkelman and Ottensmeyer, 1974; Egerton et al., 1975). The inelastically scattered electrons form an unwanted low-resolution background signal which can be interpreted as blur. The wave function describing the elastic scattering of an electron by an object has an important property: There exists a definite phase relationship between the incident wave function and the scattered wave function, a
204
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
relationship which does not hold for inelastically scattered electrons. For that reason we shall assume monochromatic, spatially coherent illumination. This is an idealized situation which is experimentally approximated by the field emission gun, which has a very small apparent spot size. Strictly speaking, the illumination is partially coherent. The degree of partial coherence might be incorporated in the calculations, which become increasingly cumbersome without yielding additional insight (for a discussion see Hawkes, 1980b). Therefore, in the present contribution, we shall assume monochromatic,fully spatially coherent illurnination. The monochromaticity requirement is satisfied by removing the inelastically scattered electrons by an energy filter lens. C . Relation between the Object Structure and the Electron Wave Function The discussion in this section only applies to the case of elastic scattering. In order to determine the object structure from the scattered wave function, whose determination will be discussed in the following sections, we need a model for the object. We shall give two examples. (1) The object is considered as a collection of scattering centers. The position and scattering strength of each center is described by a certain number of parameters. In principle, the intensity distribution of the image can be calculated and is compared with the measured image intensity distribution. The parameters can now be determined by a fitting procedure. Needless to say, this procedure is only viable if the number of parameters is restricted, which is the case when we have considerable prior information about the object from other sources (e.g., chemical). This approach will be discussed in further detail in Section IV. (2) The object is described by an electrostatic potential distribution V(r) which is due to all charges inside the object. According to the WKB approximation and Glauber theory (Lenz, 1971; Glauber, 1959), the phase shift cx(.,.) imparted to a plane wave incident along the z direction is proportional to the projection of V(r) on a plane perpendicular to the z axis 4 x 0 , y o ) 0~
b0, y o , z)dz
(1)
The reconstruction of the scattered wave function yields a projection of the potential distribution. It should be clear that the inverse scattering problem (i.e., the determination of the object’s structure from the scattered wave function) is more
IMAGE HANDLING IN ELECTRON MICROSCOPY
205
complicated than the present examples suggest and needs further investigation.
D. Image Formation in the CTEM The setup of the electron microscopic image formation has been schematically sketched in Fig. 1. The z axis of the coordinate system is chosen along the optical axis of the microscope. The wave function in a plane perpendicular to the z axis and situated immediately behind the object shall henceforth be denoted by the name “object wave function,” and is written as $O(XO>YO)
= exPCi4x0,Yo) - P(x0,yo)l
(2)
assuming a plane wave incident along the z axis, exp(ikz), and choosing the z coordinate of the plane, zo, equal to zero. xo and yo are the coordinates in the object plane and will be measured in units of the wavelength of the incident electrons. We shall give a physical interpretation of the quantities CI and fl occurring in Eq. (2). For a plane incident wave along the z axis, a ( x o , y o ) represents the phase shift which is imparted to the incident wave function by the object and which can be related to the projection of the electrostatic potential distribution on the object plane z = zo = 0 [see Eq. (l)]. The quantity P(x0, y o ) has a phenomenological interpretation: This quantity describes the loss of electrons due to inelastic scattering. These electrons are supposed to have been removed by a filter lens. This filtering is incorporated phenomenologically by p(xo,yo). The wave function in the image plane of the microscope, i.e., the image wave function (see Fig. l), is related to the object wave function by
(3) where K ( . ,.) is given by [see, e.g., Hawkes (1980)l K ( x 0 - x,yo - Y )=
II d5
d?exP{ -iY(5,?) - 2.niKxo - X I 5
+ (YO -Y h I J (4)
Equations (3) and (4) are the equations of the linear transfer theory of image formation, and they apply when the optical system is isoplanar, i.e., when the wave-optical aberrations are independent of the object coordinates .toand y o . The real function y ( - , .) introduced in Eq. (4) is the wave aberration function. It incorporates the effects of spherical aberration through the coefficient C, and
206
CORNELIS H. SLUMP AND HEDZER A. FERWERDA $(X.Y
1
-2 optic a x i s
z .za=o
e x i t pupil
z.zp
image plane
z .zi
FIG.1. Schematic of image formation in the transmission electron microscope
the defocus through the coefficient D The value of C,, usually in the range of 1 to 3 mm, is fixed for a given microscope. However, the defocus D is variable and is to be chosen by the experimentalist. In Eq. (5) aberrations of higher order such as coma and astigmatism are neglected. In the object plane xo and yo are measured in units of E.; in the exit pupil 5 and q are expressed in units of the (back) focal length of the optical system. In the image plane x and y are measured in MA, with M representing the lateral magnification. A wave function is not a measurable physical quantity in itself. In the image plane only the intensity, i.e., the wave function multiplied by its complex conjugate, can be observed and registered. The information about the structure of the specimen is contained in the phase function In an aberration-free microscope this information would disappear from the product $(.,.)$*(.,.) when the defocus parameter D is also set to zero, a situation which would correspond to the Gaussian reference plane. The aberration function y(. ,-)[cf. Eq. ( 5 ) ] is primarily responsible for the generation of phase contrast. In this context phase contrast is defined (as usual) as the contrast in the image caused by the phase shift imparted to the incident electron wave function by the object. This phase contrast is ~(e;).
IMAGE HANDLING IN ELECTRON MICROSCOPY
207
observable thanks to the phase shift provided by the aberration function y(.,.) which plays a role similar to the phase plate in a phase-contrast microscope.
E . Remarks about the Contrast Mechanisms We will briefly make some comments on the physical mechanisms which give rise to image contrast. We will assume that the image is due to the elastically scattered electrons. For low-resolution microscopy one gets useful information from the scattering contrast. The scattering contrast (also called diffraction contrast) is caused by the removal of electrons which have been scattered over such large angles that they are intercepted by the apertures of the microscope. This scattering contrast should be clearly distinguished from the contrast caused by the removal of the inelastically scattered electrons which have been removed by the energy filter lens. Scattering contrast is fully taken into account in our treatment by the finite size of the aperture in the exit pupil, which is incorporated in our formulas. In our treatment scattering contrast is caused by electrons that have been intercepted by the diaphragm in the exit pupil. This diaphragm should not be taken too literally: All the apertures inside the microscope are represented by one effective aperture taken to be located in the exit pupil. Scattering contrast can be observed when essentially only the unscattered electrons pass through the microscope. When the specimen undergoes crystalline scattering (in the context of crystalline specimens the name diffractionis usually preferred), contrast is observed when only the zero-order diffracted beam is transmitted. In some cases scattering contrast admits a simple physical interpretation. The object is considered to consist of a number of point scatterers, which scatter independently. Let us assume that the electrons are never scattered more than once; multiple scattering is excluded. Let the object be illuminated by a wave propagating in the z direction and let Z(x,y , z ) denote the beam intensity in an arbitrary point of space. a is the total cross section for scattering over angles larger than the angular half-width E of the aperture in the exit pupil. As a depends on the chemical composition of the scatterer, a is taken to depend on the space coordinates: a = a(x,y , z). We now consider the propagation of the beam intensity Z(x,,y,z)when proceeding in the z direction (see Fig. 2). Going from the plane perpendicular to the z axis with z coordinate z to a similar plane with z coordinate z Az,we find that the number of electrons which is prevented from reaching the image plane is given by
+
I ( x ,y , 4
-
Z ( X , Y ,z
+ A 4 = 4x9 Y , z ) l ( x ,y, z)p(x,Y, 4 Ai!
where p(x,y,z) denotes the number of scatterers per unit volume. I ( x , y , z )
208
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
/ I
I
I
/
1
zi
1
1Z+AZ
v
I
I
FIG.2.
object
/
>
2 axis
Propagation of the beam intensity through the object.
consequently satisfies the transport equation ar(x, y , z, = i3Z
- a(x,
y , z)p(x,y , z)Z(x, y , z)
which is solved by
assuming a uniform incident beam intensity lint before the beam hits the object. In particular, for a weakly scattering object (meaning a “small” value of a), we obtain by expanding the exponential function and retaining only the first two terms
L
I(x,Y , z ) = Iinc - Iinc
d x . Y ,Z’)P(X, y , z’) dz’
(8)
From this formula one sees that the image contrast is the projection of a(x, y , z)p(x, y , z) on the object plane. ~ ( xy ,, z)p(x, y , z) might be interpreted as
the “scattering power” of the object per unit length. If the scattering cross section a does not vary appreciably over the object, the image contrast is approximately proportional to the projection of the density of the scattering centers, which in its turn is approximately equal to the mass density. The image contrast due to the inelastically scattered electrons gives information on the loss characteristics of the object and is irrelevant, even a nuisance, for structure determination. The reader should be warned not to take the simple-minded discussion of scattering contrast too seriously. The state of coherence of the incident beam
IMAGE HANDLING IN ELECTRON MICROSCOPY
209
has been completely ignored, a liberty that seems to be common in the study of transport phenomena. For high-resolution electron microscopy the most important contrast mechanism is phase contrast, which arises from the interference between the unscattered and the scattered electron waves. In the next section this contrast mechanism is treated in greater depth.
F . The Phase Problem in Electron Microscopy It is clear from the foregoing that determination of the structure of the object requires the knowledge of the complex object wave function, in particular its phase, given by cx(xo,yo) [cf Eq. (1) and (2)]. We shall see later on that we can determine the complex object wave function if we know the complex wave function in the image plane, +(x, y). Unfortunately, only the intensity in the image plane is observable, which is proportional to the square of the modulus of this wave function, I+(x, y)12.So, apparently the phase of the image wave function is not recorded and must be considered as being lost. This prevents a unique determination of the object wave function and frustrates the determination of the object structure. This complication is known as the “phase problem.” In this article the phase problem will be solved by using two or more exposures of the same specimen under different settings of the microscope, e.g., by changing the defocusing between two exposures. A more detailed account can be found in the review articles by Ferwerda (1978, 1981, 1983), where more references can be found, and Saxton (1980).
G. The Stochastic Process Characterizing the Low-Dose Image Within the limits of the virtually countless number of electrons contributing to the image intensity, the observed intensity is proportional to the squared modulus of the image wave function We will call this situation the deterministic case. In this study, however, the specimens of interest are the radiation sensitive objects of biological material. As has been stated earlier, irreparable radiation damage resulting from the imaging electrons restricts the electron dose tremendously. Lowering the dose in order to prevent uncontrollable structural changes in the specimen will cause quantum noise to become manifest. The interpretation of the image will become difficult due to quantum noise. It is obvious that a compromise must be made between radiation damage of the structure and interpretability of the image contrast [see, e.g., Kellenberger (1980)l. We will not pursue this further. We minimize the structural damage by reducing the electron dose, and we will investigate in $(a,-).
210
CORNELIS H. SLUMP AND HEDZER A. FERWERDA
subsequent sections the consequences of the reduction for the retrieval of information from noisy images in general and in particular with respect to the phase problem. In low-dose circumstances the image intensity distribution recorded on the micrograph is a realization of a stochastic process. This noise process, which is directly related to the way in which images are recorded, is investigated in this section. The stochastic nature of the recorded image has a consequence that the results of image processing also become stochastic quantities, for example the results obtained with an algorithm for phase retrieval. This section is therefore of fundamental importance in this study. From among other noise sources, such as thermal fluctuations of the current in the magnetic lenses and mechanical vibrations, we restrict our analysis to the so-called quantum noise (“shot noise”). The reason for this restriction is that quantum noise is a fundamental and major noise source which cannot be made arbitrarily small by good instrumentation and intelligent operation of the microscope. Throughout this study we will assume that low-dose images are recorded in the following idealized away. The registration of the image is performed by a detector (viz. a photographic plate) which is divided into a large number of N 2 (to be specified later) identical nonoverlapping squares. We assume that each image cell counts the exact number of electrons which arrive in the cell. Hence, a recorded image consists of an N x N array of random counts f i k , l , ( k , I ) = { 1,. . . ,N } . We use the hat symbol to denote random variables. The above assumptions are not limiting because the number of developed grains of silver in a region of a photographic emulsion is proportional to the number of incoming quanta. The step of scanning the image with a microdensitometer is omitted from the analysis presented here. The process of digitizing the photographic plate will inevitably add noise to the signal to be processed. The approach in this section is to assume that the information in the micrograph is available in digital form. This is in line with new developments in instrumentation, where a direct interface exists between microscope image and computer, and with developments in solid-state image-receptor technology. The intensity of the electron source must be low because of the restriction to the low-dose regime. It is to be expected therefore that the electron wave packets of the successively emitted electrons do not overlap. This is experimentally supported by Munch (1975), who found an average space of 15 m between two electrons in a beam current of 1.5 x lo-’’ A at 75 keV. On the average there is only one electron in the microscope at a time. Therefore, the successively emitted electrons do not interact with each other, and the emissions are statistically independent events. A further consequence is that the scattering process of the beam with the specimen and the subsequent image formation can be described by a one-electron wave function. Due to the
IMAGE HANDLING IN ELECTRON MICROSCOPY
21 1
statistical independence of the sucessive emissions the total number fi, of emitted electrons during the exposure time T is a realization of a Poisson process, as is shown for example in Davenport and Root (1958, Chap. 7). This means that fi, is a Poisson-distributed random variable with probability distribution
P{Z,
= k } = exp(-/Z,T)(3b,T)k/k!,
k
= 0,1,2,3,.. .
(9) where A, is the source intensity parameter, equal to the mean number of emissions per second. The random counts f i k , i , ( k ,1) = { 1 , . . . ,N } of which the recorded image consists, are independent, Poisson-distributed random variables, as will be shown in the following. In the low-dose regime the probability that an electron which has been emitted by the source will arrive in the Ith image cell, 1 = { 1,. . .,N ] is expressed in terms of the one-electron image wave function, cf. Eq. (3) by
s = jjI$(X.,.)l'dXdY
(10)
ar
in which a, is the area of the Ith image cell. Let fil denote the number of detected electrons in the Ith image element. The probability that k electrons will arrive in the Zth image element is given by a k-times independently repeated Bernoulli trial, with 4 being the probability of success. We arrive at
k = 0,1,2,3,.. .
(1 1)
In Eq. (11) a combination of the Poisson distribution for the total number of electrons and the binomial distribution can be recognized. Defining a new variable j = rn - k, we can write for Eq. (1 1) P ( f i l = k ) = exp(-i,T)(k!)-'Pt
1 (&T)j+k(j!)-l(l- 4)' m
j=O
= exp( -/Z,Tfi)(k!)-'(&T~)k
(12)
From Eq. (12) we conclude that f i l is Poisson distributed with parameter & T e . Consider two nonoverlapping area elements of the image plane, area i and j (see Fig. 3). We will determine the joint probability that k electrons arrive in area i and I electrons in area j. The line of approach followed here parallels the discussion by Papoulis (1965, pp. 76,77). In the low-dose regime the electrons are independently emitted. Therefore the image formation can be modeled as
212
CORNELIS H. S L U M P A N D HEDZER A. FERWERDA Y
-d
I
FIG.3. The area elements i and j in the image plane.
an m-times independently repeated experiment, where m is a Poissondistributed random variable with parameter &T. We have necessarily
(13) The following three mutually exclusive events can be distinguished as possible outcomes of an experiment: m>k+l
event i electron arrives in area element i event j__ electron arrives in area element j event i + j electron arrives somewhere in the image but not in i or j The individual probabilities of the three events are related by (14)
P,+p,+Pq=l
If m has a fixed value satisfying Eq. (13), the probability of the joint event ( k electrons in i and I electrons i n j } is Pm{fii= k , i i j
=
I}
=
{k!I![m - (k
+ l ) ] ! } - ' m ! P ~ P ~ P ~ " + k (15) '
As m is Poisson distributed with the distribution of Eq. (9), we arrive al the following expression for the probability of the joint event that k electrons will arrive in area element i and I electrons will arrive in area element j
P { i , = k,Gj = l } =
C m
m=l+k
exp(-i,,T)(m!)-'(A,T)"Pm{fii = k , f i j = I} (16)
Using Eqs. (14) and (15) and introducing a new variable m' = m
-
(k
+ l ) , we
213
IMAGE HANDLING IN ELECTRON MICROSCOPY
obtain from Eq. (1 6) P{G, = k,Cj = I } m
= e x p ( - A ~ ~ ) ( ~ ! ) - l ( ~ ~ s T ~ ) ~ ( ~ !C ) - l( m ( A’ !s) -Tl (~&)~l ) m ’ ( --l m’=O
= ex p( -
-
p,)“’
A, Tp)(k !) (A, T e ) kexp( - A,T e ) (/ !) (As ~ 5 ) (1 7)
With Eq. (1 2) we obtain
P(G, = k,Ci
=I
}
= P{Ci = k } P { n j = I}
(18)
which shows that the random variables Gi and iij are statistically independent. Notice that this property does not depend on the size or shape of the areas i andj; the only requirement is that they do not overlap. Consequently, the recorded image {fil, G2, fi3, ...,C N l ) consists of N 2 independent and hence uncorrelated Poisson-distributed random variables and has the probability of occurrence F{C1,G2,...,CN2], =
n N2
I=1
e x p ( - ~ s ~ F * ) ( ~ ~ ! ) - l ( ~ s T ~ (19) l)~’
As Cl is Poisson distributed according to Eq. (12), the mathematical expectation value of GI is given by E{ti,}
=
&T&
(20)
and the variance is given by var{Gl) = &T&
(21)
The stochastic image process is completely specified statistically by Eqs. (9), (lo), (ll), and (19). The stochastic properties of the data will play a dominant role in subsequent sections which deal with the extraction and evaluation of information from the low-dose images.
11. OBJECT WAVERECONSTRUCTION
A . Introduction and Review
The reconstruction of the object wave function is of great importance for the imaging of the structure of biological materials, especially at high resolution by means of an electron microscope. This is due to the relation between the object wave function and the electrostatic potential of the object,
214
CORNELIS H. SLUMP AND HEDZER A. FERWERDA
as has been briefly touched upon in Section I. In Section I,D it is indicated that the interaction between the incoming electrons and the specimen is described by a shift in the phase of the electron wave function. The amount of phase shift corresponds to the projection in the propagation direction of the object’s electrostatic potential. The information about the specimen structure is not related to a directly measurable physical quantity, a situation which leads to the so-called phase problem. Only the intensity of a wave function, which is proportional to its squared modulus, can be recorded, for example, on a photographic plate. In this section we will first discuss the properties of the various methods and algorithms which are proposed in the literature for solving the phase problem when they are applied to low-dose imaging conditions. We hereby exclude periodic objects so that the noise-reducing techniques of averaging over the periodic repetition of image cells cannot be applied (Unwin and Henderson, 1975). Apart from analytical techniques such as zero flipping, which are mainly of theoretical importance, basically three algorithms exist for the retrieval of the phase of the object wave function. The elegant method proposed by Frank (1973)will in practical situations run into difficulties and is omitted here together with the Gerchberg-Saxton algorithm (Gerchberg and Saxton, 1972), which can give nonunique results even for the noise-free case. The three methods discussed in this paragraph are: (1) Newton-Kantorovich approach (Van Toorn and Ferwerda, 1976; Roger, 1981). In this approach the nonlinear equations, describing the object wave function in terms of recorded intensities, are expanded as a kind of Taylor expansion based on a first (intelligent) guess or starting from prior information about the solution to be obtained. Based on the first-order term of this expansion, the wave function is updated, which is the starting point for a new expansion, and the process is repeated. Owing to the noisy data, it is difficult to formulate a criterion which has to indicate that the calculational procedure has converged. A too stringent adaptation to the noisy data has to be prevented. A further problem with this method concerns the convergence to a correct solution when the initial starting function is not close to it. Owing to the iterative nature of the procedure, the influence of the noise on the results obtained is difficult to quantify analytically. The determination of even the expectation value and the variance of the obtained results is a hard task. However, some insight can be obtained by Monte Carlo studies. Analytically tractable is the case of an initial starting solution which is close to the true solution, when only one correction update of the wave function is necessary. However, the statistical analysis of this situation is not of great value as not much improvement is to be expected from adaptation to noisy data. What is most likely to happen is that the good initial a priori solution will be corrupted
IMAGE HANDLING IN ELECTRON MICROSCOPY
215
by noise and that the procedure ends up with an inferior result if compared with the a priori information. (2) Misell’s algorithm (Misell, 1973b).This is also an iterative procedure, based on two exposures which are defocused with respect to each other. Starting from an initial image wave function with the correct modulus according to the first exposure and with an arbitrary or even random phase, the image wave function belonging to the second exposure is calculated. The modulus of this wave function is corrected to satisfy the second exposure. Based on this corrected wave function, the wave function corresponding to the first exposure is calculated, the correct modulus is enforced, etc. When this computational scheme has converged, the object wave function is obtained by inversion of the integral equation [cf. Eq. (3)], which relates the object and the image wave function to each other. With the Misell algorithm the same problem arises with respect to the convergence and the noise influence on the result as with the Newton-Kantorovich approach. Owing to quantum noise the data of the two exposures are not consistent with each other; so that in a strict mathematical sense there exists no solution, and consequently the algorithm cannot be expected to converge. (3) The direct approach (Van Toorn and Ferwerda, 1976).This method is also based on two exposures with a different defocusing parameter. The wave function in the exit pupil is calculated by solving two coupled nonlinear Volterra integral equations of the first kind. The algorithm is not an iterative procedure and is therefore of potential interest for evaluating the solution obtained statistically. The algorithm is very sensitive to noise, as has been reported by Van Toorn et al. (1978), because of error accumulation. Of all methods the approach of solving the integral equations directly seems to be the most relevant for the analysis of the influence of the stochastic data on the reconstructed wave function. Iterative procedures are not tractable for statistical evaluation; therefore, in the next two sections the direct method is analyzed in greater detail.
B. Derivation of the Basic Equation
In this section we derive the basic integral equation which relates the object wave function to a recorded intensity distribution in the image plane. In order to keep the equations as simple as possible, we treat in this chapter one lateral dimension of the images only. For electron microscopes with square diaphragms (if there are any), the extension to two lateral dimensions is straightforward. With a circular symmetry the mathematical treatment becomes more complicated. However, this elaborate analysis will not yield
216
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
more insight than the case of one lateral dimension, which is treated in this section. The illuminating electron beam is described by a one-electron wave function. For this we assume a quasimonochromatic plane wave propagating in the direction of the optical axis of the imaging system, which we consider to be isoplanar. The object wave function is written as
(22)
$o(xo) = expCia(x0) - B(x0)l
where the phase term a(.) denotes the phase shift due to the object’s electrostatic potential and where the attenuation term p(.) describes the removal of inelastically scattered electrons from the imaging beam by the appropriate energy filter lens. The geometry of the electron microscope being considered here with one lateral dimension is presented in Fig. 4. The optical system is characterized by the relations between the wave functions in the three planes: object plane, exit pupil, and image plane. The wave function in the exit pupil $ p ( - ) is related to the object wave function $o(.) by
I-+**
$,(O = expC O(01 -
$o(xo)exp( - 27cix00 dxo
(23)
where y(.) is the aberration function in the exit pupil. The coordinate 4 in the exit pupil is measured in units off; the focal length of the imaging system. In
X
5) specimen holder
5 4Jolxo)
vacuum
XO
*d *E
F
coherent
G
i IIu mi n a t i on
Z
L
. optic a x i s
-E -d
I
object plane
z:z,=o
exit pupil
z.z
P
image p l a n e ZZZ,
FIG.4. Schematic diagram of the imaging system for the case of axial illumination
217
IMAGE HANDLING IN ELECTRON MICROSCOPY
the object plane xo is measured in units of 2, the wavelength of the accelerated electrons. In the image plane the coordinate x is measured in units of MA, with M the (lateral) magnification. The aberration function y(.) for an isoplanar system is given by
r(5) = 2 n ~ - 1 ( g , 5 4- + ~ p )
(24) where only the spherical aberration with coefficient C, and the defocusing with coefficient D have been taken into account, neglecting higher-order aberrations such as coma and astigmatism. The image wave function $(-)is related to the wave function by $&a)
As has been discussed in Section I,G the recorded low-dose image consists of a (one-dimensional) array of statistically independent, Poisson-distributed, random counts n^ = (n^-,j2, fi-N,2+ . . . , i?N,2- The Shannon number N equals 2 x 2d x 2~ because the intensity $$* in the image plane has a bandwidth of 48. The t i k (k = - * N , . . . , i N - 1) random counts represent the electrons which have arrived in the kth image cell. In Section 1,G it has been shown that Zk is a Poisson-distributed random variable with intensity parameter ikequal to its expectation value, given by
A, = E{fik) = &T(2d)-'
S.,
(261
$(x)$*(x)dx
+
where a,, denotes the kth image cell: ( 4 ~ ) -k - ( 8 ~ ) - < x < ( 4 ~ ) - l k ( 8 ~ ) Calculating the Fourier transform of the stochastic image according to
1
N/2- 1
?(()
=
k= -N/2
fikexp[2ni(4&)-'k5]
(27)
we obtain the stochastic function ?(-). For reasons of convenience ?(-) is defined here for a continuum of values of 5. In practice Eq. (27) will be calculated using a fast Fourier transform (FFT) algorithm, which results in 2(cl)for a discrete set of ( values (to. Where appropriate we will use this discrete representation. The function ?(.) is a complex stochastic function, and its statistical properties are studied in Appendix A. In Appendix A it is shown that the autocorrelation function R ( . , .) of the complex stochastic process Eq. (27) is given by
Wt,,t z ) = E { ~ ( 5 1 ) c ^ * ( 5 z ) f= E(C^(51))E(C^*(t2)) + CAkexp[2ni(4~)-'k(t, - t2)1 I from which it follows that
c^(cl)and c^((,)
are correlated
(5,
#
t2).
(28)
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
218
The basic integral equation is derived from Eq. (27) taking the expectation value of both sides. Using Eq. (26) this results in E(c^(())= i.,T(2d)-l
N/2 - 1
C
k= -N/2
exp[2ni(4~)-'k51
la,
$(x)$*(x)dx
(29)
Substituting Eq. (25) and performing the integration over x yields
X
sin [2n(Se)-'(]
n4
exp[2ni(4~)-'k(<' - (")I
Carrying out the summation over k leads to
Because d >> 1, the aperture in the object plane is numerically large compared with the wavelength of the accelerated electrons, and we have
+
(' - <")I sin[2nd(( sinCn(4~)-'(( + 5' - (")I
= 4&&5+ 5' - i"")
With Eq. (32) the integration with respect to 5'' in Eq. (31) can be evaluated leading to [note that $,(() is a band-limited function only defined in the region --i; I ( 5 c]
(33) In Eq. (33) an integral equation of the first kind is presented which is of the Volterra type. In practical situations E(c^(())is unknown, as we only have one realization of the stochastic function c^((). Therefore the integral equation is stochastically driven. Note that ?(() equals c^*( - 5) because of the symmetry properties of the Fourier transform of a real function. Equation (33) is the basis for the discussion of the direct method for the determination of the object wave function. This is the subject of the next section.
IMAGE HANDLING IN ELECTRON MICROSCOPY
219
C. Solution of the Integral Equations In this paragraph we discuss the direct method for the reconstruction of the object wave function based on two defocused exposures (Van Toorn and Ferwerda, (1976). Defining the auxiliary wave function $q(-) (34) we obtain from Eq. (33) with the two exposures i?(-) and c^2(.), which correspond to the defocus parameter D' and D2, respectively, the set of integral equations $q(()
= e~p(-2ni/,-'505~)$~(5)
x e~p[-2nii.-'iD'((~
c^(5) = 3.,T(2d)-'
+ 255')] d(',
J:,i $q((')$:(t'+
0 < 5 < 2~
(35)
()exp[ - 2 ~ i A - ' f D ~ ( 5+ ~ 25(')] d t '
In Eq. (35) the sinc function in front of the integral in Eq. (33) has been approximated by a constant
In order to reconstruct the object wave function, we have to solve t,hq(.) from Eq. (35). Inversion of an equation similar to Eq. (23) yields the required $,J.). Introducing a new integration variable q
Eq. (35) becomes
Until now the coordinate 5 has been treated as a continuous variable. It is, however, a discrete variable due to the calculation of ?(-)in practical situations by means of a fast Fourier transform algorithm (FFT).Setting the sampling
220
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
distance 9 in the exit pupil equal to (2d)- ', 5 is expressed as k = 1 , 2 ,..., 4 N - 1
<=2~-kh,
(39)
approximating the integration according to the trapezoidal rule, Eq. (38) is transformed into a set of algebraic equations. For k = 1 , t equals 2~ - h, and Eq. (38) becomes
C^'(~E
-
h)
=
i . , ~ ( 2 d ) - ' + h ( $ ~ ( -+~ h)$q*(~)exp[-2nii.-'D'(+h)(2~ - h)]
+ I)~(-E)$~*(E
-
F2(2&- h) = &T(2d)-'4h{$q(
+ I)~(-E)$~*(E
-
h)exp[-271i~-'D1(-~h)(2&- h ) ] } -E
+ h)]$q*(~)exp[-2niiLF'D2($h)(2~
-
h)exp[-2nK1D2(-+h)(2&
-
h)]
(40)
h)]}
For the moment, we treat the two unknown constants $q*(~) and tjq(-&) as if they were known. Their actual values follow from the following procedure. We set $:(E) equal to 1, making the overall phase factor a fixed value. The derivative of E(-) at the point 5 = 28 yields the product 2i.,T(2d)- 't,bq( - E)$:(E) from which $q( - E) can be evaluated. The two exposures both give rise to an estimated value for t,hq( -E), of which the mean is taken. Note that d,c^'92(2~) = 1,6k'.22ni(4~)-1k( - l)k. In the reconstruction procedure both $ q * ( ~ ) and t+hq(- E) are multiplicative coefficients. From the normalization constraint Jd4 tjq(4)$q*(l), which corresponds to the total number of electrons detected in the image, we obtain after the reconstruction a correction for the modulus of $;(E). From Eq. (40) t,bq( - 6 h) and $X(E - h) easily follow using Cramer's rule. Throughout this whole procedure i+hq(-)and $q*(.) are treated as unknown functions which are determined independently; the complex conjugation relation remains to be checked afterwards. Choosing D 2 equal to -D' and D' equal to D in the following, the determinant of Eq. (40) becomes
+
detk,,
=
-22iSin[2nl-'Dh(2~
-
h)]
(41)
If det # O the sample values t,hq( - E
+ h) and $:(E
? ' ( 2 ~- 2h) = 3 . , T ( 2 d ) - ' 3 h { $ q ( - ~
+ h)$,*(~)exp[--nij.-'~h(2r:
-
h) are obtained. For k = 2,
t = 2~ - 2h, with the trapezoidal rule Eq. (38) becoming
+ 2$hq( + h)$q*(E h) + ($q(-~)$q*(~ 2h)exp[2niA-'Dh(2~ -&
-
211)l
-
-
-
2h)l)
+
- 2/11] F2(2c - 2h) = 3L,T(2d)-'3h{$q(-~ h)t,b,*(s)exp[2xii:~'~h(2~
+ 2Ijq( + h)$q*(E h) + ($q(-~)t,bq*(~ 2h)exp[-2nK1Dh(2~ -E
-
-
-
2/91]
(42)
IMAGE HAh-DLING IN ELECTRON MICROSCOPY
22 1
The determinant of Eq. (42) is equal to det,,, = --2isin[2nK1D2h(2~- 2h)] (43) With the two samples $q( - E + h) and $,*(E - h) obtained in the previous step k = 1 and if det,,, # 0, the values $q( - E + 2h) and $X(E - 211) can be evaluated. The next step k = 3 leads to 5 = 2.5 - 3h and Eq. (38) corresponds to
+
?(2e - 3h) = A , T ( 2 d ) - 1 $ h { $ q ( - ~ 3h)$,*(&)exp[-2ni3,-'D$h(2~- 3h)l
+ 21,b~(-~+ 2h)$,*(~- h)exp[-2niLX1D$h(2~ 3h)l + 2$q(-e -th)$q*(~- 2h)exp[2nilb-'D$h(2~- 3h)l + 3h)exp[2nii.-'D$h(2~ 3h)l) (44) - 3h)l E2(2&- 3h) = A,T(2d)-'$h{$q(-e + 3h)$,*(~)exp[2niA-'D$h(2& -
$q(-~)$,*(~ --
-
+ 21j~c-c + 2h)$q*(~- h)exp[2xiA-'D+h(2~ 3h)l + 21,b~(-~+ h)$q*(~ 2h)exp[-2niAX1D+h(2~ - 3h)l + -E)$,*(E 3h)exp[-2niEL-'D$h(2~ - 3h)l) -
-
-
$q(
which has the determinant det,
2i sin[2d- D$h(2~- 3h)l (45) From Eq. (44) t+hq(- E + 3h) and $,*(E - 3h) are obtained using all the sample points determined in the steps k = 1 and 2. If the defocus parameter D is chosen not too large in relation with the size of the exit pupil E, the determinant of the set of algebraic equations is only zero for ( = 2~ and ( = 0. Repeating the numerical procedure for k = 4, 5,. . . , +A' - 1 will result in a computed function t+hq(-) in the interval ( - E , + E ) , except in the endpoints, where the determinant is equal to zero. The mathematical structure of the algorithm is easily seen from the explicitly presented steps k = 1, 2, and 3. The statistical analysis of the whole procedure is nevertheless still far from trivial due to the many complex terms involved and due to the correlation of the ?(-) samples. We therefore analyze an idealized system of equations similar to Eq. (38) but simplified, in order to obtain insight into the behavior of the above algorithm under noisy data. With the model which is described in the following, the performance of the algorithm of Van Toorn and Ferwerda (1976) is simulated in a stochastic environment. ~
=
-
1. Model Computation In order to reveal the behavior of the set of integral equations (38) with stochastic driving function, we simplify the set by
222
CORNELIS H. SLUMP AND HEDZER A. FERWERDA
y((= )
s s
E-512
-(&
-512)
f(-31r + rl)Y(+5 + q )
x sin(2nEb-'~'tq)dg,
j'(5) =
&-
-(&
-
0<4 I 2~
(46)
fc-2+ v)g(+5 + v )
512)
0 < t 5 2~
x cos(2n2- Dz [ q ) dv,
With D = D' = - D 2 Eq. (46) is constructed to have a determinant similar to Eq. (36) in the computational steps k = 1,2,. . . ,$ N - 1 of the algorithm det
= sin(2nA-'D2<~)
(47)
The two functions f(.)and g ( - ) to be determined from Eq. (46) are real functions. The statistical properties of the stochastic driver functions PI(-) and f2(-) will be discussed further on. The first step ( k = 1) of the algorithm with = 2~ - h leads to
<
f'(2t:
-
h) = i h [ f ( - s
+ h)g(E)
f2(2c
-
h) = + h [ f ( - ~
+ h)g(&)+ f ( - & ) q ( &
-
f'( -E)g(E
h)] sin[2n3.-'Dih(2~
-
(48) h)]c0~[2n2-'D+h(2t: - h)]
The general step, i' = 2~ - kh, k = 1,2,. . ., i N fl(2E
-
h)]
-
-
-
1 gives
kh)
+ kh)q(s) f ( - E ) g ( e kh)] sin[2nA-'@kh(2~ k h ) ] + h 1 f ( - & + ( k / ) h ) g ( & /h)sin[27L1D~h(k- 21)(2~ kh)]
=ih[f(-e
-
-
-
k- 1
-
-
-
I=1
(49)
Y2(2t:- kh) = i h [ . f ' ( --6
+ kh)g(c) + f ( -e)g(E
+ h 1 f ( - ~+ ( k k~
1
I=1
-
-
kh)] cos[2nj.-' Dikh(2s - k h ) ]
I)h)g(e - Ih)cos[2nj.-' D$h(k
-
21)(2e - kh)]
For simplicity we assume the noise in the stochastic functions j(.) to be Gaussian with zero mean and variance equal to unity. Furthermore, the samples of 9(.)are taken to be statistically independent. The correlation in the driving functions of Eq. (38) is neglected in the computation. We have
Y(i1)= $ ( < I ) E ( f'.2(Ck)f1,2(t,))
+ N(O, 11, =
y^2(r,)= y2(<,)+ N ( 0 , l )
E~.c'1,"r,)}Ec4;'.2(<~)),
5k
#
i',
(50)
IMAGE HANDLING IN ELECTRON MICROSCOPY
223
) equal to 1 we obtain from Eq. (48) With d E ) and f ( - ~ set
.f( --E
+ h)
= {sin[2nL1Dh(2e - h)])-'{2h-'y^'(2& - h)cos[2nA-'Dih(2~ - h)]
+ 2h-'y^2(2~
h)sin[2nA-'D$h(2~ - h)])
(51) and a simila! expression for $(E - h). As the functions j(.) are Gaussian distributed, .f(- E h) is also Gaussian distributed -
+
+ h)) = + h) (52) var{.f(-E + h)) = var[@(E h ) ) = 4{h2sin2[2nA-1Dh(2~- h)]} In the second step ( k = 2) of the algorithm, the product of J ( - E + h) and E{f(-&
f(-E
-
- h) has to be computed in order to be able to solve the matrix equation, Eq. (49) for ?( - E + 2h) and G(E - 2h). We now analyze the stochastic properties of this product. The product z^ of two Gaussian-distributed variables 2 and j with zero mean and variances o: and a;, respectively, is distributed according to Kendall and Stuart (1963) @(E
(53) with 2 = N(0,a:), y^ = N(0, o.;) and z^ = 2 j and where KO(.)denotes the modified Bessel function of the second kind of order zero. We have P(z^) = (n%oy)-' Ko((~,a,)-'lzl)
E(5) = 0 (54)
var{z^) =o,Zo,'
In order to avoid unnecessary notational complication, we simply replace the varying sine and cosine terms by 3 s . Then the determinant of the matrix equation becomes equal to unity. The variances off( - E + h ) and $(E - h ) are equal to 4h-'. We obtain from Eq. (49)
j ( - &+ 2h) = h-'$y^'(2&
-
+ h-'J29'(2&
2h) - 2 j ( - & - 2h)
+ h)$(E
-
h) (55)
from which it follows that
E { j ( - & + 2h)) = f ( - & + 2h)
var{j( --E
(56)
+ 2h)) = 4h-' + 64K4
A similar expression with the same variance exists for @(E - 2h). Because h << 1 [h = (2d)-']- we observe a significant increase in the variance of the estimated values f(- E + 2h) and $(E - 2h). For the determination of j ( --E + 3h) and @(E - 3h) (the next step) the cross-products
I ( - &+ 2 h ) @ ( ~ h ) -
and
j ( - &+ /I)$(&
- 2h)
224
CORNELIS H. SLUMP AND HEDZER A. FERWERDA
$re required. These products will have a larger variance than the product f ' ( - - E + h)ij(c - h ) , which leads to a larger variance of f ' ( - - ~ 311) and ij(c - 312) as compared with the variance of the estimated samples in the previous step of the reconstruction algorithm. This process continues. Ultimately the variance of the estimated samples will have increased so much that the pertinent data values do not influence the result any longer. It has to be concluded that because of the noise in j1v2(-) the algorithm is unstable. The solution thus obtained has an increasing noise variance because of the inherent error propagation. This variance is very much larger than the noise variance of the data, a situation which is characteristic for so-called ill-posed problems. The conclusion of the above model computation applies to the algorithm of Van Toorn and Ferwerda (1976) and is consistent with the reported noise sensitivity of the algorithm by Van Toorn et ul. (1978).The performance of the pertinent algorithm in low-dose imaging of biological material is severely degraded due to the inherent noise accompanying low-dose exposures. In the next paragraph an algorithm will be presented with negligible error propagation and with its stochastic behavior still analyzable. However, the results of this algorithm are only approximate in the noise-free case, which manifests itself in the fact that the algorithm is not unbiased. With noisy data the approximate character of the reconstruction method is not a drawback, as exact results are impossible in this situation.
+
D. Statistical Analysis of' an Approximate Solution In this paragraph an approximate solution is presented of the fundamental set of coupled integral equations [Eq. (38)]. The solution is approximate in the sense that certain assumptions are made which are not exactly fulfilled. In practical situations where noise is manifest, however, the deviation from the true solution is expected to be smaller than the variance of the approximate solution due to the noise in the data. The method proposed in this section does not suffer from the increasing noise variance as the algorithm in the previous paragraph. With partial integration we obtain from the basic set of Eqs. (38).
IMAGE HANDLING IN ELECTRON MICROSCOPY
:'(()
= ).,T(2d)-'(
-
-
-2nij.-'D2[)-'
$¶( - E ) $ , * ( - &
/
(
$¶(E
- t)$:(c)
x exp[ -2nii.-'D2((c
- ( 8 - 5,Z)
-
+t)]
+ t1exp[2nij--'D2((e -it)]
8
&-ti2
225
+ r)$,*(+4+ r)]exp(-2nii-'DZir)d0) -[$¶(-it ar
(57)
For a fixed value of t we have two equations with the four unknown variables <), $:( - c + t),$:(E), and $¶( -e). The latter two occur in the equations for every value of ( chosen, a situation like the algorithm of the previous section. Therefore $:(E) and -6) are treated in the same way. Together with the unknown terms an integral also appears in Eqs. (57). The contribution due to this integral will be examined in greater detail. In the algorithm of Van Toorn and Ferwerda (19761, a similar term was responsible for the error propagation. The approach here is geared to control the noise amplification by reducing the influence of the integral term with the help of an extra exposure. In order to determine the contribution of the integral to Eq. (57), we first make the new assumption that the influence of the spherical aberration (C,) is negligible. A large value of I: (high-resolution imaging) corresponds to a small sampling cell in the image and therefore for a given electron dose to a low signal-to-noise ratio. The dominance of shot noise can be mitigated by choosing a smaller value of c, which leads to a larger sampling cell with better statistics at the expense of spatial resolution. The practical reason for working at medium resolution motivates the neglect of C,, as is shown in Fig. 5. Neglecting the spherical aberration C,, we have from Eqs. (23), (24). and (34) $q(~ -
Spatial frequencies of +b0(-)up to E do contribute to the image contrast. In the case of noisy data we refrain from the reconstruction of the object wave function &(.) in all its spatial frequencies (bandwidth extrapolation). which would lead to a very high noise variance in the reconstruction result. Instead, we only reconstruct spatial frequencies of the object wave function which do contribute to the observed image contrast. This results in a low-pass filtered approximation &(-) to the original object wave function &(-)
With Eq. (59). $¶(-) in Eq. (58) can be represented by N14- 1
$¶(t)= C
k= -Nl4
$,(k/2e)exp[-2.rri(2e)-'kt]
226
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
\
0
1.0
0
,
I
\
\
\
\
I
3.0
2.0
\
L.O X 1 ~ - 3
b
5.0
-E
FIG.5 . The influence of neglecting C, in the aberration function y(.) [cf. Eq. (24). (a) Real part exp[ - b?(.)]; solid line: cos[y(.)] and dashed line: cos(2nj.-'$D('); (b) imaginary part exp[-bj(.)]; solid line: -sin[?(-)] and dashed line: -sin(-zj.-'fD('); for the microscope parameters C, = 1.6 mm, E = 0.5 x lo-', D 3 = 160 nm, A = 4 pm.
With Eq. (60) the contribution of the integral term I((; D ) to Eq. ( 5 7 )can be evaluated, where /((; D) is defined by 1
re-e2
which is equivalent to
/(t;D) = - C i + b ~ ( k / 2 ~ ) C i + b * 0 ( k ' / 2 ~ ) 2 7 1 i ( 2 ~ ) -k')exp[2ni(4~)-'(k '(k + k')(] k
k'
r-:
2
exp{ -2ni[(2~)-'(k
-
k')
-(r-:/2)
=
+ j . - ' D t ] q } dy
- - n i x i+bO(k/2&)Ci+b*0(k'/2e)(2e)-'(k - k')exp[2ni(4~)-' ( k k X
k'
sin{2n[(2~)-'(k - k ' ) - j . - ' D t ] ( t ; n[(2c)-'(k - k ' ) K ' D t ]
+
-
it))
+ k')t]
227
IMAGE HANDLING IN ELECTRON MICROSCOPY
The computational strategy in this section is to avoid error propagation by reducing the influence of already reconstructed sample points of the wave function on the reconstruction process. With an extra exposure these cumulative effects as expressed in the integral term Z((; D) can be eliminated for the greater part. Taking three exposures with defocusing parameters D', D 2 , and D3, respectively, we obtain from Eqs. (57) and (61), setting both &( - E) and $G(E) equal to unity - 2 4 4 T )- 1 2ni2- 1 D(j )5 ( ; j )(5) = &(E - 5)e ~p[ -2ni3~ -'D(j) ((~ - $()I -$:(
---E
+ ()exp[2nii.-'D(j)((~ it)] -
-
I ( ( ; D")),
j = 1,2,3 (63)
The integral term [Eq. (62)] can be written
Z((;D)
=
- 2 n i c C ~O(k/2~)$~(k'/2~)exp[271i(4~)-1(k + k')(l k k'
X
sin{2n[(2~)-'(k - k') - ~.-'D(](E + it)} n[1 + ( k - k')-'i.-'D(2~]
whereby terms with k = k' are omitted from the summation because these terms are zero. When the defocusing parameters are chosen according to
D'=D-AD D2 = D D3 = D
(65)
+ AD
with the practical values D = 80 nm and AD = 40 nm, we observe that the denominator of Eq. (64) hardly depends on the different D values, especially for larger values of k - k'. The same situation applies to the sine term in Eq. (64), which is nearly the same for the three D values. We conclude that the integral term I((;D) is in good approximation the same for the three exposures. We now can eliminate this term from Eq. (63). Subtracting the equation with the defocusing D2 from the equations corresponding to D' and D 3 , respectively, we obtain the following set of equations -2d(i.,T)-'2niR-'([D1c^'(S) = &(s -
-
$;(-E
-
D2c^2(()]
-it)] - exp[-2nij.-'D2t(~: -*()I} + ()(exp[2niX'D'((~ - $31 -exp[2niiL-'D25(~ - +()I}
()(exp[-2niZ 'D'((E
-2d(EL,T)-'2~iX'( [ D 3 C 3 ( ( )- D2E2(t)] =
-
-$:(-E
(){exp[-2ni-'D3((e
-$()I
-
exp[-2d-'D2((~
-
+()I}
+ ()(exp[2ni2-'D3C(~ - $31 - exp[2xX'D2((& -it)]) (66)
228
CORNELIS H. SLUMP AND HEDZER A. FERWERDA
With Eq. (65) the determinant of Eq. (66) becomes
-+()I
det(D,AD) = 2isin[2n3--'2AD<(e
- 4isin[2nX1 AD<(c -
it)],
0 < ( < 2c (67) this determinant is presented in Fig. 6 as a function of
-
() = (2isin[2nIL-'2AD((~- $()I
2d(i,T) '27ciIL-'( x [ ( D - AD)?: - D?f] - 2d(i.,T)- ' 2711'3. ' t x [ ( D + AD)?: - D?;] -
X
-
4isin[2ni-'AD((~
-
$<)I}
{exp[2ni(D - AD)((&- $()/).I exp[2niDc(~- + t ) / i ] } {exp[2ni(D + AD)((&- ic)/i] - exp[2niD((c - +<)/i] (68) -
~
The object wave function Go(.) follows from Il/,(.)by means of a Fourier transform, remembering that is in fact a discrete variable 1,'2d, I = [-$N, ...,$ N - 11, we have
<
JO(k,'2e) = ( i N ) -
1
.v 4
~~
I
I = -3'4
Jq(1/2d) exp[2ni(;N)--' k I ]
(69)
From Fig. 6 it is to be expected that for values of ( near 4 = 0 and ( = 2c the reconstructed Jq(-) function will have a very large variance, as the determinant Eq. (67) is close to zero. For this reason it is advisable to sacrifice some bandwidth and omit these $q(*) values from the reconstruction of &,(.). In the
-10
'
FIG 6. The determinant of Eq. (67) of the set of Eq. (66) for the case of Eq. (65) with the microscope parameters D = 160 nm, c = 0.5 x AD = 80 nm, i. = 4 pm.
IMAGE HANDLING IN ELECTRON MICROSCOPY
229
case of Fig. 6 we would have, say, 1.0 x I (I 9.0 x We now turn to the statistical properties of the reconstructed object wave function. A complete statistical characterization of the reconstruction result of Eq. (69) together with Eq. (68) requires the determination of the probability density function of Jo(.). This is a very complicated and elaborate task. Therefore we only consider here the first two moments of the probability density function in question. With respect to the first-order moment, we easily obtain from E { C ( < ) } = c(<) that
N O ( W 4 )
=
(70)
$o(k/24
TWOremarks must be made with respect to Eq. (70).In the first place we have, according to Eq. (26), Gk being an integer, a finite number of significant decimals in E{c^((5)}.Hence, E{Jo(.)) has a finite number of significant decimals also. It is in this respect that Eq. (70) is valid. Secondly, the set of equations (66) is an approximation, as the integral term Z(t; D) of Eq. (64) is not exactly equal for the three D values of Eq. (65). This may result in a bias term in the reconstruction of Jo(-). The bias is, however, small in comparison with the variance of J0(-),which is calculated in the following. The Fourier transform [Eq. (27)] resulted in correlated random variables c^([,) and c^(t2), # t2.The inverse Fourier transform [Eq. (69)] is now expected to be a decorrelating operator, resulting in uncorrelated random variables G0(k/2.2), k = { - - ifi, N - 1). For the calculation of the variance of the reconstructed I)~(.) from Eqs. (69)and (70),we use that var {c^(
...,a
-4sin{2rrL
A D ( 2 d ) - ’ l ’ [ ~- (2d)-’I’]}).
x ( 2 ~ 2 - ’ l ’ ) ~ ( [ (-DAD)2 + D2]
’
x [ 2 - 2~0~(27&’(2d - AD)(2d)-’l’[~- $(2d)- ‘l’])]
+ [(D +
AD)^
+ 021~2
-
+ AD)(2d)-’/’(&
-
k
=
+(2d)-’/’))],
( - $ N , ...,* N
-*
2cos(2na-y2~
-
1)
(71)
where I’ is chosen so as to avoid the small values of the determinant around 5 = 0 and 2.2, e.g., I’ = { N’, ..., 4 N ‘ - I } with N ’ = 0.9N, in the case of the determinant of Fig. 6. From Eq. (71) we observe that the variance is independent of the sample value k/2e. Furthermore, the larger the electron dose i.,T applied, the smaller the variance in tj0(-), as is to be expected.
230
CORNELIS H. SLUMP AND HEDZER A. FERWERDA
111. WAVE-FUNCTION RECONSTRUCTION OF WEAK SCATTERERS
A . Introductory Remarks
The imaging of the substructure of biological specimens by means of an electron microscope is greatly limited by the radiation sensitivity of these objects. Because of this sensitivity the number of interacting electrons incident on the specimen has to be minimized in order to reduce the radiation damage. In low-dose imaging, however, the contrast is very noisy. Because of this poor signal-to-noise ratio the evaluation in particular of nonperiodic object structures is very cumbersome. In Section I,G the stochastic process that governs low-dose image formation was analyzed. This provided the basis for the discussion in Section I1 of the reconstruction of the object wave function from two defocused images obtained with axial illumination. The discussion was presented for one lateral spatial dimension only. In this Section the same problem is discussed, but this time for the special case of weak scattering objects. A two-dimensional analysis is presented; the first part of this Section is based on axial illumination, while the second part applies tilted-beam illumination. Tilted-beam imaging has been the subject of numerous papers (see, for example, Hawkes, 1978, 1980a). This imaging mode is related to holography (see e.g., Wade, 1980). We assume that the apertures of the microscope are square. We know that the apertures of a microscope are in fact circular, but this geometry complicates the mathematical description and does not lead to deeper insight. The object wave function is estimated from two defocused low-dose images (Section II1,B) or from low-dose images obtained with different directions of the oblique illumination (Section 111,C). The stochastic properties of the recorded images are taken into account, and the probability density function of the reconstructed object wave function is given in terms of the operating parameters of the microscope. We assume coherent (quasimonochromatic) illumination. The inelastically scattered electrons are supposed to be removed from the imaging process by means of an appropriate energy filter lens. Only the elastically scattered electrons contribute to the image contrast. B. Axial Illumination
A scheme of the optical system is shown in Fig. 4, where the diaphragms are taken to be symmetrical with respect to the optical axis. The image formation is described by specifying the relations between the electron wave function in the three planes of Fig. 4: object plane, exit pupil, and image plane.
23 1
IMAGE HANDLING IN ELECTRON MICROSCOPY
Under the assumption that the optical system is isoplanar, the image wave function $(.,-) is related to the object wave function t,bO(-,-) by W Y )=
I, I, dxo
dYoWo- x, Y o - Y ) $ O ( X O , Y o )
(72)
where K ( - , . )is given by [see, for example, Hawkes (1980b)l w
o - X9YO
-
Y) =
J. J. d5
d?exP{ -i?(L?d
-
2"mO - 4
5 + (Yo
-
Y)?l)
(73) The wave-aberration function y(., -) contains the spherical aberration (with coefficient C,) and the defocus (with coefficient D)of the optical system where A denotes the wavelength of the accelerated electrons. In Eq. (74) higherorder aberrations such as coma and astigmatism are neglected. In the object plane xo and yo are measured in units of A; in the exit pupil 5 and q are expressed in the (back) focal length f of the imaging system. In the image plane x and y are measured in units of MA, where M is the (lateral) magnification of the optical system. The interaction between the illuminating electron beam and the specimen is now described in a simple model. The illuminating beam is represented by a coherent (quasimonochromatic) one-electron wave function for which we take a plane wave exp(ikz) propagating along the optical z axis with wave number k = 2n/,i. The object wave function is expressed by $o(xo, Y o ) = expCi+o, Y o ) - K X O , Yo11
(75)
Apart from the two lateral dimensions, the formalism in this section is until now identical to that in the previous section. However, as this section is restricted to (thin) weak scattering objects, differences will arise that will change the formalism completely. In the case of a weak scattering specimen, Eq. (75) is approximated by $O/o(XO?YO)
= 1
+ i~/o(XO,YO) - D(X03Yo)
(76)
Together with Eqs. (72) and (73) we obtain for the image wave function from Eq. (76)
232
CORNELIS H. S L U M P A N D HEDZER A. FERWERDA
I"
where &(-, -) denotes the wave function in the exit pupil, which is defined as
*&L ? I ) =
JI,,
dx,
4%exPC - W t x o + YY,)I
M x o 2 Y o ) - K x o , Yo11 (78)
The constant d is numerically large since it corresponds to the aperture in the object plane measured in units of the electron wavelength A. Therefore the two sinc functions in Eq. (77) can be approximated by 6 functions. This results in $(x, Y ) = 1
+
s. I dt
drl$&<, rl) exPC-
b(t,rl) + 2ni(xt + 1'rl)I
(79)
The squared modulus of the image wave function is of great importance for the image formation. Neglecting the term quadratic in $&-,-), we obtain $(x3Y)$*(x,Y ) = 1
+
drl
$p(L r l ) expl- b ( 5 , r l ) + 274x4 + y q ) ] + C.C. (80)
where C.C.denotes the complex conjugate of the preceding term. In the weakobject approximation of Eq. (76) we have neglected terms of higher order in z(-, -) and b(-,-),while in Eq. (80) we assume that the integral which is quadratic in z(-;) and p(-;) is negligible in comparison with the integrals which are linear in a ( . , - )and fl(-;). The assumption of Eq. (80) is not quite the same as the assumption in Eq. (76). Substituting Eq. (78) into (80) and using the symmetry properties of Fourier transforms of real functions, we obtain the well-known result (see, for example, Hawkes, 1980b) $(x, Y)$*(X>Y ) = 1
+2
j dx, on
jm,. jo jm dYo a(xo,Y,)
I,
x exp{2niC(x - x o ) 5 -
2
I,
dxo
drl sinIr(L 1?)1
d4
Jo I
+ (Y - ~ 0 ) r l l )
dyoB(xo,yo)
x exp{2niC(x - xo)4 + (4'
-
d5
d~cosCy(5,~)l
~011)
(81)
In the deterministic (noise-free) case the image intensity distribution is proportional to Eq. (81). This situation applies when the total number of electrons involved in the image formation is infinite. In low-dose imaging the recorded image intensity is a realization of a stochastic process. The following briefly outlines the characterization of this stochastic process.
IMAGE HANDLING IN ELECTRON MICROSCOPY
233
I . Low-Dose lmuge Recording The stochastic process that characterizes the low-dose image has been described extensively in Section I,G. We thus only recapitulate the main properties here. In low-dose imaging the emissions of electrons by the source are statistically independent events. Therefore the total number of electrons Cr emitted during the exposure time T is a random variable distributed according to the Poisson distribution
P { C T = k ) =exp(-;I,T)(AsT)k/k!,
k = 0 , 1 , 2 , ...
(82)
where the source intensity is is the mean number of electron emissions per second. The image intensity is assumed to be recorded in the following idealized way. The image plane is divided into a large number N 2 of identical nonoverlapping squares. It is assumed that each image cell exactly counts all the electrons arriving at the cell. Consequently a recorded image consists of an N x N array of independent random counts Ck,/,( k , 1) = { 1,. . . ,N }. In lowdose imaging the probability that an electron which is emitted by the source will arrive in the (k, 4th image cell is given by Pk,/
=
ss
dx d y $(x3
Y)$*(x> Y )
(83)
ak,l
where uk,[denotes the area of the ( k , I)th image cell. The recorded image is a realization of a stochastic Poisson process, characterized by
The random counts Ck,/ are Poisson-distributed random variables with the parameter = kTpk,,
(85)
The image wave function is a band-limited function of bandwidth E. In Eq. (80) only the linear terms have been taken into account. Therefore in this approximation the image intensity has bandwidth E also. Applying Whittaker-Shannon sampling to the image results in N Zimage cells with the Shannon number N equal to goo. In the next subsection the consequences of the noise in the data on the reconstruction of a ( - ; ) and fi(-,-) are treated in detail.
234
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
2. Object Wuae Reconstruction
We will determine the object wave function from two low-dose images with different defocusing parameters D. The Poisson-distributed integer data values will be transformed into new random variables $ I by subtracting the background intensity and scaling, as follows
With Eq. (86) a representation of the recorded image is obtained with the sample values s^(k/2&, 1/28) which are denoted by &I. The data rik.1 are integers. Therefore the number of significant decimals in sk,Iis governed by the value of N 2 ( A ,T)-'. Further decimals are beyond retrieval and are further discarded here. For the (mathematical) expectation value S k , ) of s^k,l,we have E { $ k , l } = ask,[ = N Z O i 2 0k.l
+ 27ci(x5 + y q ) ] + C.C. where uk,[is the area of the (k,l)th image cell, which is given by ( 2 & ) - ' k xI (2&)-'k + (4&)-1,(2&)-'1 - (4&)-1I y I (2&)-'1 + (4&)-'. (4&)-1I The Fourier transform ?((, q ) of the transformed data values is defined by
For reasons of convenience the function E(5, q ) is defined for a continuum of values t and q. In actual practice Eq. (88) will be calculated using a fast Fourier transform (FFT) algorithm, which yields 2(tp,qq) for a discrete set of 5, and ylq values. The function E(-, -) is a complex stochastic function, the properties of which are described in Appendix B. As is discussed in this appendix, the discrete representation of 2(-, .) has the advantage of consisting of uncorrelated random variables. When explicit calculations need to be performed with we shall return to this discrete representation. Appendix B shows that, to a good approximation (for not too few electrons per sample cell), the probability distribution of E(-,-) is complex Gaussian with its mean equal to the true, deterministic function and a (nearly) constant variance N4(3.,T ) - ' . This variance is a microscope parameter, because its value depends only on the illumination dose and the resolution (i.e., the area of the sample cell). With Eq. (87) we can write for the expectation value of 2(-,.), by carrying out the integrations over x and y and with evaluation of the summations over ?(.,a)
IMAGE HANDLING IN ELECTRON MICROSCOPY
x exp[-ni(2~)--l(t- 5" X
+q
-
q")]
sin[2n(2e)-'+N(t . sin[n(2~)-'(5 .-
235
(")I <")I
-
sin[2~(2~)-'+N(q - q")] sin[n(2~)-'(q - q")]
(89)
Because the aperture in the object plane is large in comparison to the wavelength of the imaging electrons, we have
(90) The integrations in Eq. (89) can be carried out with Eq. (90),leading to
E(c^(t,?))= N 2 ~ 0 2 ( $ p ( - t-v)exP[-iy(t,rlll ?
+ IC1;(5,?)exPcil)(t,?)I}
where resulting sinc functions have been set equal to unity sin [2n(46)-1t] sin pn(4&)-lq] N1 n(2E) t n(2E)- 1 q
(91) (92)
~~
The result in Eq. (91) will be used for the determination of the object wave function. Just as in the previous section the expectation value of C(-,.) is not known; instead we must use C(-,-)itself. For the case of weakly scattering objects we have obtained in Eq. (91) a direct relation between 2(-,*) and the wave function in the exit pupil. The corresponding equation in the previous section is an integral equation, which is much more difficult to handle.
236
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
In terms of the functions a(.,.) and from Eq. (91) with Eq. (78) at
fi(.,.) to be reconstructed, we arrive
~ { ? ( ( , q )=N'0;~2sin[y(<,q)] }
-N2oO22COSCY(5,v)I
dJ'o expC2ni(tx, + VY,)l P(x0 Y o ) 3
(93) For the reconstruction of a(.,.) and b(-,-)two exposures with different defocusing parameters D are made. The quantities related to the two exposures are denoted by the superscripts ( I ) and ( 2 ) , respectively. By means of elimination we obtain straightforwardly from Eq. (93) with the two exposures leading to ?'(-,.) and t 2 (.)- ,
From Eq. (74) we obtain for the determinant with D 1 sin[;'*(t, q ) - ~
' ( 5q ,) ] = sin[nX'
~
D 2 = AD
A D ( t 2 + q')]
(95)
Equations (94) are the basis for the statistical estimation of the functions a(.,-) and b(.,-),which contain the information about the specimen in question. We now return to the discrete representation of ?(-,-) and denote the sample value ?(k'/2d, / ' / 2 d ) by E k f , l , . The discrete inverse Fourier transform which corresponds to Eq. (88) is represented by
The inverse Fourier transform Eq. (96) is applied to Eq. (94) in the following analysis. For the left-hand side of the first equation of Eq. (94)
IMAGE HANDLING IN ELECTRON MICROSCOPY
237
we obtain
where the summation over k' is approximated as follows
c
N/2 - 1
k'= -N/Z
exp[2zi(2d)-'(xb
-
= exp[-zi(2d)-'(xb
N
2d
(97)
x,)k']
-
xo)]
sin[2n(2d)-'+N(xb - x o ) j sin[n(2d)-'(xb - x,,)]
sin[2m(x0 -- xb)] n(x0 - Xb)
and the summation over I' is approximated by a similar expression. In the expression on the right-hand side of Eq. (97) frequency components of a ( - ,.) above E are filtered out (low-pass filter equation). In general the functions a ( - , - ) and b(.,.) are not band limited. The images do not, however, contain information about frequency components beyond E . Although in principle bandwidth extrapolation can be applied, it is not widely used because of the severe drawback of the considerable increase of the noise variance. Therefore @(.,.) and b(-,.) are represented by band-limited functions denoted by c((-,-) and -), respectively
a(.,
X-
sin[2ne(xo ~XC[X,
-
(2~)-'k)]sin[2n~(y, - (2~)-'1)1 - (2~)-'13 27~~[yo
-(2~)~'kl
(99)
A similar expression applies for the relation between b(.,.) and p(.,.). By substituting Eq. (99) into Eq. (97), and using the orthonormality property of the sinc functions
s
+m
-m
sin{2ne[xo - (2~)-'k]} sin{2m[x0 - (2~)-'k']} dxo = d k , k f 271&[XO- (2e)-'k]7r[xo - (2&)-'k']
(100)
(which applies closely here because d >> l), we obtain the following set of equations [which are the discrete counterparts of Eq. (94)] by which i(-, .) and
238
CORNELIS H. SLUMP AND HEDZER A. FERWERDA
B(-,-)are estimated
1
1
N/2-l
!?(k/2~,1/28) = N-2
N/2-1
exp[ -2niN-’(kk‘
k‘=-N/21‘=-N/2 ( k ’ . l ’ ) # (0.0)
+ U’)]
x (2 sinEn2-l AD(2d)-2(k’2+1’2)]}-1 x (cos[y2(k’/2d, 1’/2d)] t:f,l’ -cos[y 1 (k’/2d),1’/2d)]t;.,,.}
1
N/2- 1
E(k/2~,1/2&)=N-’
N/2- 1
k‘= -N/21’= -N/2
x
exp[-2niN-’(kk’+lZ’)]
+ lf2)]}-’
{2sin[7tX1
x {sin[yl (k’/2d, 1’/2d)]t;f,1,- sin[y2(k’/2d, 1’/2d)It:,,l,}
(101) In the equation for &(-,.) in Eq. (101) the point k‘ = 1‘ = 0 must be excluded from the summatiqn because in this case the denominator is equal to zero. In the expression for B(., .) this precaution is not necessary because in this case the numerator is also equal to zero for k‘ = 1’ = 0. With Eq. (101) we have obtained an estimate for the band-limited approximations E(-, -) and to the functions in the object wave function a(-,-)and b(.,-),respectively. We now proceed to investigate the stochastic properties of Eqs. (101). Restricting ourselves to the significant decimals of the gk.1 variables, we notice that the statistics of Eq. (91) are unbiased
p(.,.)
{ - i N , . ..,$N
E{@(k/2&, 1/28)) = E(k/2e, 1/2~),
( k ,I )
E{;(k/2~, 1/28)) = p(k/2~,1/28),
( k , 1) = { -$N,. . .,$N - l}
=
-
I}
(102)
The variances of the two statistics of Eq. (101) will be calculated separately. We first consider the object Cmplitude function B(.,.). From Eq. (101) we note that the random variable p(.,.) is a weighted sum of Gaussian random variables,-and therefore it is a Gaussian variable likewise. The expectation value of B(.,-) is the true as is expressed by Eq. (102) and the variance, which defines cri, is given by $(.,a)
1
N/2- 1
var{~(k/2&,1/2~)}=(41,T)-’
1
N/2- 1
k’=-N/2I‘= -N/2
(sin[ni-’ AD(2d)-2(k’2+1’2)]}-2
x {sin2[y’(k’/2d, 1’/2d)] +sin2[y2(k’/2d, I’/2d)]}
=(4i,T)-b;,
( k , l ) = { -3N )...,$ N
B(., A
-
l}
(103)
From Eq. (103)we see that the variance of .) does not depend on ( k , I). The value of the constant variance (4AST)-’op2 is fully determined by the microscope parameters. From the definition of y(., .) [cf. Eq. (74); we note that
239
IMAGE HANDLING I N ELECTRON MICROSCOPY
every term in the summation is finite provided that AD (cf. Eq. (95)] is chosen I”)] is zero only in (k’, 1’) = (0,O). in such a way that sin[nL-’ AD(”)-’(k’’ For (k‘,1’) = (0,O) the term in the summation in Eq. (103) takes the finite vAalue of (AD)-’(D” + 0’’ ). To summarize, the object amplitude function B(., estimated by Eq. (101), can be described as the sum of the true .) function value plus a signal-independent Gaussian stochastic process with zero mean and constant variance
+
p(-,
9)
~ ( k / 2 ~ , 1 / 2 ~ ) = ~ ( k / 2 ~ , 1 / 2 c ) + N ( O , ( 4 L , T ) -( ’ka, I~) )=,{ - t N , . . . , + N - l } (104) where N(0, a’) denotes a normal distributed random variable with mean equal to zero and a variance of a’. Next we turn to the object phase function ti(-,-). From Eq. (101)we see that $(-,.) is also a weighted sum of Gaussian random variables. Therefore I?(-, -) is a Gaussian random variable, its expectation value is the true ti(., .) value [cf. Eq. (102)], and its variance is given by var{$(k/2s, 1/24) =(41,T)-’
NI2 - 1
N/2-1
C C k’=-N/21’=-N/Z
(sinCn1-l AD(2d)-2(k’2+1’2)]}-2
WJ’) # (0.0)
+ cos2[y1(k’/2d,If/2d)]}, ...,4 N - l}
x {cos2[y2(k’/2d,1’/2d)]
(k,I)={-*N,
(105)
Unlike the situation with the object amplitude function, not every term in the summation in Eq. (105) is finite. If the term belonging to (k’, 1’) = (0,O)had not been excluded from the summation, an infinite variance would have resulted. The variance, Eq. (105), is nevertheless still dominated by the (k’,l’) combinations close to (0,O)where the denominator of Eq. (105) is very small while the numerator is of order unity. The estimated object phase function @(-,.) obtained from Eq. (101) appears to be very sensitive to noise. The variance of :(-, .) is considerably larger than the variance of the object amplitude function p(-,.), and in fact so large that it is questionable whether this quantity is of any use at all. We will now investigate the mechanism which is responsible for the extremely high variance of g(.,-).For this purpose it is necessary to check the contributions of the different Fourier coefficients of the phase function Z(-, -) to the data iik,l.From Eq. (81) we arrive at the following expression for the contribution of the phase function a(-,.) to the recorded image intensity E{Afik,l}= 1,TN-22
s. s. d5
dqsinCy(5, q)] JuodxoJuodYoa(xo~Yo)
x exp[2ni{(k(2~)-’ - xo)5
+(
4 2 ~-) y~o ) ~q ) ]
(106)
240
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
where A& denotes the contrast resulting from the phase function alone. To a good approximation we can substitute for this expression
E { A k k , f }= R,TN-22N-2
c
N/Z- 1
N/2- 1
1 sin[y(k’/2d,1’/2d)]
k’= -N/Z f‘= - N / 2
x exp[2niN-’(kk’
+ 11’)l
In order to be detectable, the right-hand side of Eq. (107) must have at least a numerical value of one as Ai?k,l has to be an integer. The term W 2 @(r/2~, s / 2 ~exp{ ) - 2niN-’(rk’ + sl’)} in Eq. (107) is of the same order of magnitude as i(-, .) which is known to be less than one. Taking i(-,.) to be approximately 0.1, a rough guess is obtained as to which Fourier coefficients do contribute to the phase contrast. In order to observe phase contrast, the following relation now applies
In order for a specific Fourier coefficent (k’,I’) to contribute to the image intensity 6, the more stringent condition must be imposed Isin[y(k’/2d, 1‘,’2d)]l 2 5 N 2 ( i . , T ) - ’
(109)
From Eq. (74) we observe that y(-, -)isa function of (2d)-2(k’2 + I”). Therefore we can find a circle around (0,O) with a radius q given by
k”
+ 1”
I q2
(1 10)
with k‘ and 1’ such that /sin[y(k‘/2d, /‘/2d)]l I 5N2(&T)-’
(111)
The recorded intensity distribution fi does not contain information about the Fourier coefficients (k’,1’) inside the circle with radius q. Therefore these coefficients cannot be retrieved. In Eq. (105) we already observed that the variance of ;(-,-) is dominated by the (k’,!’)combinations close to (0,O). If the Fourier coefficients (k’, 1’) satisfying Eqs. (1 10)and (1 11) are excluded from the computation of g(-,.) in Eq. (101)the value of the noise variance in Eq. (105) is improved considerably. Nevertheless all the phase information contained in the images is used. This in fact represents the bandpass filtration of the object phase function tl(-;). In addition to the frequency components above c, the low-frequency Fourier coefficients are filtered away also. Therefore we do not
24 1
IMAGE HANDLING IN ELECTRON MICROSCOPY
reconstruct the E(.,.) function. Instead a filtered version iq(-,.) is computed which does not contain frequency components in the circle with radius q. The bandpass-filtered object phase function Eq(-,.) is estimated by [cf. Eq. (lol)]
x {2sin[rrA-’ AD(2d)-2(k‘2
x
{COS[Y’(
+ l”)]}-’
’
k’/2d, 1’/2d)]c^:,,l, - cos [y (k‘/2d, 1’/2d)]c^;,,l,},
( k , l ) = {-+N, ...,+N
l}
-
(1 12)
The bandpass-filtered object phase function of Eq. (1 12) can be described as the sum of the expectation value of gq(.,-),which is equal to the true iiq(-,-) value, plus a signal-independent Gaussian stochastic process with zero mean and a constant variance (4%sT)-’0&.This variance follows from [cf. Eq. (105)l
x {cos2[ ~ ( ~ ) ( k ’ / 21’1241 d, = (42sT)-10-~q,
(k,l)
=
+ cos2[y(’)(k’/2d,1’/2d)]}
{ -4N... .,$N
-
l}
(113)
To summarize, the bandpass-filtered object phase function Gq(-,.)estimated by Eq. (1 12) can be expressed as
tq(k/2&,112~)= Eq(k/2&,1/28) + N(0,(4AsT)-’~:q),
( k , 1 )= {
-4 N, ...,4 N
-
(114)
1)
where the relation between the bandpass-filtered Eq(., .) and the true E(-, -) function is represented by
x
expf2niN-’[(k
-
r)k’
+ (1
-
( k , / ) = {-+N, ...,+N - 1)
s)1’1}, (1 15)
Equations (101), (103), (104), ( 1 12), (1 13), and (1 14) are the principal results of this section. For examples of the reconstruction of object wave functions from simulated low-dose images involving the bandpass filtering, the interested reader is referred to Slump and Ferwerda (1982). The next paragraph is devoted to a discussion of a reconstruction algorithm for the object wave
242
CORNELIS H. SLUMP AND HEDZER A. FERWERDA
function which is also capable of reconstructing the lower Fourier coefficients of the object phase function.
C . Tilted Illumination In the previous paragraph the low-dose reconstruction of a weak phaseamplitude object was discussed. The reconstruction algorithm was based on two defocused images obtained with axial illumination. It became evident that the lower spatial frequencies of the object-phase structure were only weakly transmitted and therefore hardly contribute to the image contrast, especially when the applied electron dose is of a low level. Low-dose imaging leads to large noise variances in the reconstructed phase part of the object wave function. In this section we discuss the properties of a promising though more eleborate method of reconstructing the object wave function in the context of low-dose electron microscopy. This reconstruction algorithm is based on several image intensity distributions, which are obtained by illuminating the object structure consecutively from different directions. In Fig. 7 a diagram of the imaging system with tilted illumination is presented. The illuminating electron beam is again described by a one-electron wave function. We assume coherent illumination by a plane wave exp(ik-r,) with wave number k = Ikl = 2x12, propagating in a direction which makes an angle 8, with the optic axis (see Fig. 7). In the simple model for thin weak scattering objects which is considered in this chapter, the object wave function is represented by
$o(x,,Y,)
=
$O(XO>YO)
=
exP~ia(x,,.Yo) - P(xo,~,)l$b(xo,Yo)
(1 16)
The wave $b(.,.) represents the illuminating electron wave function; it is the object wave function in the absence of an object. The restriction to thin weak objects allows the approximation of the object wave function by I1
+ ia(xo,y,)
-
P(xo>Yo)lIl/b(xo,Yo)
(1 17)
Denoting the polar angles of the wave vector k by Bo and 40,we easily obtain $b(xo, y o ) = exp( - 27ri(x0sin 8, cos @ O
+ y o sin 0, sin 40))
( 1 18)
where xo and y o are measured in units of 1 and the position of the object plane zo = 0. Defining the background wave function Il/bg(*,-) as the image wave function in the absence of an object, we obtain by substituting Eqs. (1 18) and (73) into Eq. (72), and carrying out the integration over xo and y , dv exPC - iY(5, v ) + 2 n w X
+ YV)I
sin[2nd(5 + sin Bo cos 40)] sin[2nd(v + sin 8, sin 40)] (1 19) n(v + sin 8, sin &), n(4 sin 8, cos 40)
+
243
object plane
2.0
exit pupil
z=zp
image plane z=zi
FIG.7. Schematic diagram of the imaging system for the case of tilted illumination.
Because of the numerically large value of d (the aperture in the object plane expressed in units of A) the two sinc functions in Eq. (1 19)can be approximated by 6 functions. This yields $bg(x,y ) N exp{ -- iy( - sin 8, cos q50, -sin 8, sin 4,) - 2ni(x sin 6, cos 4, + y sin 8, sin 4,) ( 120) In the derivation of Eq. (120) it is assumed that both sin8,cos4, and sin 0, sin 4, are contained in the interval; this corresponds to bright-field imaging. In the image plane information about the object structure is contained in the image wave function &(., -), defined by dvl$~b(5,vl)expC-iy(5,vl)
+ 274x5 + y q ) ]
(121)
244
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
lo(, lo,
where the wave function in the exit pupil $&-,.) is given by
$&t> II) =
dY0 exp{ - 2niC(t
dx,
+ (v + ~
~
~
~
O
~
~
~
~
+ sin 0 0 cos 40)xo
O P (~X OY? Y OO) I l
~
C
~
(122) ~ ~
~
,
The squared modulus of the image wave $(-,.) is given by $(x,Y)$*(x,Y)=I$bg(x,~)+ $i(x>Y)12
(123)
Using Eq. (120) we obtain for the modulus of the image wave function
I$(x~Y)I
=
C1
+ $bg(x,~)lCIT(x?~) + $&(x,Y)$i(x,Y)+ $ i / i ( x > ~ ) $ T ( x , ~ ) I ~ ' ~ (124)
In the next subparagraph the recorded noisy image is expanded into a set of orthonormal functions. The properties of this expansion are then investigated.
1. Orthonormul Expansion Of the Low-dose Image The image wave function is a band-limited function of bandwidth E ; thus its squared modulus has bandwidth 2.5. In the next subparagraph we will show that the highest (spatial) frequency which is used in the reconstruction of the object wave function is equal to 3e. In order to improve the signal-to-noise ratio, we consider the squared modulus of the image wave function to have a bandwidth of ;E. Applying Whittaker-Shannon sampling to the image results in N 2 image cells, with N equal to ~ C J ~ The E . recorded image intensity is a realization of a stochastic Poisson process. The random counts Yik,l are Poisson-distributed random variables with intensity parameter which follows from Eq. (85) and the approximation of the integral in Eq. (83). We now expand the modulus of $(.,-) into a set of orthonormal functions. It is convenient to write the two-dimensional orthonormal functions as a direct product of two one-dimensional functions. From Eq. (124) we have the expansion
The functions 4,,, and bnare chosen to be orthonormal on the interval ( - d, + d ) . This set of functions is complete if the indices rn and n in E q . (126) continue to infinity. This is not required here because the image is sampled in squares with sides of length ( 3 e ) - ' . Within the cells the value of the functions is taken to be a constant.
~
Y
IMAGE HANDLING IN ELECTRON MICROSCOPY
245
From Eq. (125), it follows that
Our purpose is to estimate the expansion coefficients a = (. . . ,am,n,. . .) from the random variables fi = (. . . , t ? k , I ,...) using Eqs. (84) and (127). As the variables fi are integers, the accuracy attainable in the coefficients a is limited to approximately (&T)-’N 2 . The maximum likelihood method claims the best estimate of a to be those values which maximize the likelihood function L(ii, a). This function is the joint probability function of the observations. When the parameters a have their true value, L(ii,a) is the probability of obtaining the recorded count pattern 3 given in
In Appendix C the likelihood function [Eq. (128)] is used to determine the amount of information about the parameters a contained in the recorded image fi. Closely related to this Fisher-information matrix is the minimum achievable error variance of the parameters a as expressed in the Cramer-Rao bound (see, for example, Kendall and Stuart, 1967; Van der Waerden, 1969; Van Trees, 1968).The estimated values for the parameters depend on the data; thus they also are random variables. Knowledge about their probability density function, or at least of the first two moments, is of as much importance as the values themselves. We will return to this subject further on. In order to simplify the estimation of the parameters a, the auxiliary variable (k, I ) = { -*N,. . . , i N - I}, is introduced E{t?k,l}
From Eq. (129)
=
&TN-’(I
+ Sk,1}’
( 1 29)
is estimated by
Appendix B shows that the probability density function of is to a good approximation Gaussian, with mean equal to s k , l and variance equal to N2(4iST)-’.From Eqs. (127), (129), and (130), the relation between the auxiliary random variables gk,k,land the parameters a is obtained as m=O n=O
By using Eq. (131), the parameters a can be estimated either by the method of least squares or by the maximum likelihood method, because the probability density function of each Fk,, is Gaussian. The variances of do not depend
246
CORNELIS H. SLUMP AND HEDZER A. FERWERDA
on a. Thus the estimated values for a are obtained by minimizing Q 2 , which is defined as
Minimizing Q 2 with regard to a results in the following expression for the parameters up,q
x 4 p ( k / 3 & ) 4 ( ~ / 3= 40
Using the orthonormality relations
we obtain
For the expectation value of hP,, we obtain from Eq. (131) %P41
= ZPdl
thus the statistic of Eq. (135) is unbiased. We wish to remark here again that the number of significant figures which can be retrieved from the integer data values ii is limited to an accuracy of about (1,T)-'NZ.As the probability density function of the $k,l variables is (approximately) Gaussian, as is shown in Appendix B, the probability density of GP,, is also Gaussian. For the we find that variance of ip,q Nl2- 1
NlZ-1
=(4&T)-'N2,
The covariance matrix
I/ of
( p , q ) = {O,.. . ,N
-
l}
(1 37)
the estimated parameters is given by
Vr,s:p,q = E ( C 2 r . s - E { c r , s } I C ' p , q - E{'p,q}I] = (4AT)-'N26r,p6s,q (138) because of the identical independence of the P parameters, a result that follows from the independence of the ii recordings (cf. Section 1,G). From Eqs. (1 36) and (1 37) we conclude that the estimated parameters P of the expansion in Eq. (127) are uncorrelated and Gaussian distributed. The mean equals the true value given in Eq. (1 36), and the variance is given by Eq. (137), which shows that the variance is a quantity independent of the object.
IMAGE HANDLING IN ELECTRON MICROSCOPY
247
Moreover, when comparing the covariance matrix I/ in Eq. (138) with the Cramer-Rao bound in Appendix C, we see that they are identical. We therefore conclude that the expansion parameters a are efJicientZy estimated, i.e., estimated with the lowest achievable error variance. Equation (135) is therefore an efficient statistic, and all the information that is contained in the data is converted into estimated values. In the next subparagraph we use the expansions of the recorded images to reconstruct the object wave function. 2. Reconstruction of the Object Wave Function
In this subparagraph relations are derived between the object wave function and the recorded low-dose images. In order to do so we determine the wave function in the exit pupil using the orthonormal expansion of the image data described in the previous subsection. Using Eqs. (124)and (136) we obtain
where we neglected the squared modulus of 1,9~(-,-) and approximated the square root by the first two terms of its Taylor expansion. This is an admissible approximation because we have restricted ourselves to weak objects. Until now we have not specified the orthonormal functions to be used in the expansion in Eq. (127). Because of their convenient properties under Fourier transformation, we choose the prolate spheroidal functions (Slepian and Pollak, 1961). An overview of the properties of these functions is given, for example, by Frieden (1971). The prolate spheroidal functions @:(-) are the eigenfunctions of the finite Fourier transform operator, and they are defined by the integral equations (Slepian, 1964) 1
at@<( ) =
J J y
exp(icxy)@;(x)dx,
J-l
-
1I x I1
Introducing the new integration variables x' = dx and 5 the function 4j(x') = @ ; ( x ' / d ) , transforms Eq. (140) into
=+
( 140)
~ yand , defining
. interval of 5 is given by - $ E I 5 $E. The where we chose c = 3 x ~ d The superscipt c has been omitted in Eq. (141); c corresponds with the spacebandwidth product. In its discrete representation Eq. (141) is written +anN +,,(l/%)
=
1
N / Z- 1 k = -N/Z
exp(2niklN- ')4,(k/3~)
248
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
with N = 6de = 3 0 , ~ , 1 = { -+N, -+N + l , . . ., i N - 1). Calculating the Fourier transform of the left-hand side of Eq. (139), we obtain using Eq. (143)
c c
N/Z- 1
N/2- 1
k = -N/21= -N/2
exp[2ni(kk'
+ Il')N-']
c
N-1N-1 m=Om=O
For the right-hand side of Eq. (139) we obtain, with Eqs. (120) and (121)
1
N/Z- 1
1
N/Z- 1
k = -N/Z I= -N/2
1 =-
exp[2ni(kk'
1
N/Z- 1 N / 2 - 1
2 k = -N/2
C
I = -N/2
+ l I ' ) i V 1 ] Re
exp[2ni(kk'
+ I~')N-']
where C.C. denotes the complex conjugate of the preceding term. The summations in Eq. (144) over k and 1 can be carried out and result in sinc functions which can be approximated by 6 functions, as in Eq. (1 19). These 6 functions allow us to carry out the integrations over 5 and v] in Eq. (144), which leads to
= t ( 3 ~ ) ~ e x p [ - - i y ( - s i n H , c o s ~ , ,-sinB,sin$,)
+ ig((2d)-'k'
-
sin0,cos~,,(2d)-'/'
x
$;((2d)-'k'
x
rect,(sin 8, cos 4,
-
k'
iy( -(2d)-
x
$p(-(2d)-1k'
x
rect,(
-
sinQOsin4,,)]
sinfIOcos4,,(2d)~~'I' - sinB,sin4,)
+ + ( 3 ~exp[i;l( )~ -sin -
-
-
( 2 d ) - 'k') rect,(sin 8, sin 4,
-
(2d)
1')
8, cos 4,,, -sin (I, sin 4,)
+ sin Q, cos 4,, -(2d)- I' + sin 8, sin 4,)] + s i n 8 , ~ 0 ~ 4 , , - ( 2 d ) - ~ /+' sinH,sin@,)
sin ( I , cos 4,,
~
( 2 d ) - k ' ) rect,(
-
sin c), sin 4,
~
(2d) '1') (145)
IMAGE HANDLING IN ELECTRON MICROSCOPY
249
where the rect function is defined as rect,(t) =
1, 0,
It1 I E
elsewhere
Equation (145) is the basic equation for the determination of the $J.,.) function. We assume that the first exposure [denoted by superscript (I)] has been made with the tilt angles B0 and 4o of the illuminating beam specified by
0;
=:
0;
62-1’2;
-
=
an
(147)
10-3-10-2).Usingthevalues sothatsinUhsit-14; = sinUhcos4; =$(ass in Eq. (147), we obtain from the first exposure bN2
1 C
N-IN-1 m=O n = 0
=
a^~,,.,.m0m(k1/3~)4,(1’/3E)
t ( 3 ~ ) ~ e x p [ - i ~ ( - $ e-46) ,
+ iy((2d)-’k’
x $;((2d)-’k’ - +E, (2d)-’l’ x rect,($e -
-
(2d)- ‘ 1 ’ )
i~(-(2d)-’k’ + + E ,
-(2d)-’l’
-
-
is, (2d)-’l’
--$)I
+E)rect,(*s - (2d)-’k’)
+ 3(3~)~ exp[iy( -$e, -+e) -(2d)-’l’ + 3~)]$~(-(2d)-’k’+ SE,
+ tE)rect,(-$e
-
(2d)-’k’)rect,(-+e
-
(2d)-’l’) (148)
Since the rect functions in Eq. (148) do not overlap each other everywhere, regions in the (k’, 1’) plane now emerge where $,*(.,.) or t,bP(-,-) as appropriate can be determined separately. These regions are shown in Fig. 8. From Fig. 8 we see that only when (2d)- k‘ and (2d)- I‘ both lie between -3e and +E do $&-,.)and $p*(.,-)appeartogetherinEq.(148).Thus $J.,.)and $,*(.,.)cannot be determined in this region from the first exposure only. In the regions of Fig. 8 marked with $: and $c/p, these functions can be determined separately. Also from Fig. 8 we see that the highest-frequency component of the (k’,l’) plane corresponds to %E. This is why in the previous section the WhittakerShannon sampling of the recorded image was chosen to correspond to a bandwidth of 3 ~ We . will now focus on the determination of $,(k‘/2d, 1‘/2d) in the square: - E < (k‘/2d, 1’/2d) IE. By taking another exposure with tilt angles
’
250
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
FIG.8. The regions in the (k',l') plane representing the support of the rect functions in Eq.
(148).
we obtain the relation
+
= $ ( 3 ~ ) ~ e x p [ - i y ( + ~ , + s )iy((2d)-'k'
+ f~,(2d)-'l'+ 3s)
x $:((2d)-'k' x rect,(-+e
-
(2d)-'k')rect,(-$&
+ i(3e)' exp[iy(+e,$e) x i+bp(-(2d)-'k' x rect,(+s
-
-
iy( -(2d)-
' c , - (2d)-'l'
- 2
'
(2d)- k ' ) rect,(+s
-
+ $ ~ , ( 2 d ) - l l+' $&)I -
k'
(2d)-'I') Le, -( 2 d ) - l l '
- 2
-+&)I
- +&)
(2d)- ' I ' )
(150)
Equation (150) allows us to calculate the t+bP(-,-)function in that part of the square in the ( k ' , ! ' ) plane which was inaccessible in the previous step [Eq. (148)]. In the region shown in Fig. 8, where the rect functions do overlap, we still cannot determine i+bp(-,.) and $,*(.,.). The reason is that Eqs. (148) and (150) are not independent in this region, as can be seen by taking the complex
25 1
IMAGE HANDLING IN ELECTRON MICROSCOPY
conjugate of Eq. (150) and changing k‘ into - k‘, 1’ into - 1’, and using the even symmetry of the aberration function y(-, .). The $p(., .) function is now known in the full square ( - 1 8 < k‘/2d,1’/2d < +&) from Eqs. (148) and (150) and the object wave function can be determined by inverting Eq. (122). Equation (150) contains more information, however, because the same procedure as utilized in Eq. (148) can be repeated here. Using all the available information leads to an accuracy in the determination of t,bp(-,-) which is not constant over the range of spatial frequencies involved. In order to achieve uniform accuracy we must take yet another two exposures with beam tilt angles
6;
=
6;
4;
=
-4;
= E2-1/2.
=
6:
=
8;
= E2-I”
440 -- 4; + +x -p; 1
= 27.c
With these four exposures we are able to make three complete reconstructions for each point ( k / 2 ~1/2&) , of the object wave function. Taking the mean of these reconstructed values results in a reduced variance of the noise. The reconstructed object wave functions result from inverting Eq. (122). AS highfrequency information about the object is lost due to the aperture in the exit pupil, we have to approximate ia(.,-) - B(.,.) by a band-limited (filtered) version denoted by i i ( - , - )In principle, band-width extrapolation is possible, but it increases the variance of noise considerably. Except for the “missing” quadrant of the square in the (k’,1’) plane (see Fig. 8) the $p(., .) resulting from exposure is given by
p(.,.).
+ $8, -(2d)-’I’ + + E ) = exp[-iy(-+&, - $ E ) + i ~ ( - ( 2 d ) - ’ k ’ + +E,
(3~)~$;(-(2d)-Ik’
-(2d)-I1’
$- + E ) ]
Using Eq. (122) we obtain the following relation for the object wave function
Introducing the new variables ( 2 d ) - ’ k = - ( 2 d ) - k’ + $ E and ( 2 d ) - ‘ 1 = -(2d)-I1’ + $E and taking the discrete Fourier transform of both sides of Eq. (153), we obtain by carrying out the summations over k and 1 on the
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
252
right-hand side with N' = 2 d 2 ~= o0o N c -2
"72
1
-
1
"12
-
k=-N'/ZI=-N'Z
X
1
&i(k/2d,1/2d)exp[-2ni(kr + Is)N'-']
+
2nE[(2E)- s Yo1 sin{n(2d)-'[(2~)-'s yo])
+
The integration over xo and y o in Eq. (154) can be carried out using thc orthogonality property
which applies in our case in good approximation because d is very large compared with the wavelength.
253
IMAGE HANDLING IN ELECTRON MICROSCOPY
Based on a straightforward computation we find
c
N/2- 1
N’-’
k = -N/2 I = - N
=(
+ ls)N‘-’]
k l x,2&k(,,.,)..p[-2ni(kr
N/2- I
”)
2 ~ ) - ~ [ 2E i C’(2E~ :
-
fl(z 2E ’z)]exp[ 2E in(r + s ) / 2 ]
(158)
From Eq. ( I 52) we obtain, using Eqs. (1 57) and (158) *
[iE;-( - rj2c,
-
s / ~ E-) fl( - r / 2 ~ , s / ~ E )exp[in(r ]
-&)I
= exp[-iy(-+e,
c
N’/2 - 1
x
N-lN-1
C
”‘2 - 1
1 2ni(kr + Is)”- ‘ 3
k = - N’/2 I =
x exp[iy(k/2d,1/2d) x
+ s)/21
-
N’/2
+
G ~ , ~ Z , , U ~ $ ~ [ - ( ~ ~Ed) ]-$~, [~- ( 3 ~ ) - ’ 1 + @ ] (159)
m=O n=O
In Eq. (1 59) we must exclude in the summation over k and 1the region in which $:(-,-) could not be resolved from exposure ( l ) . For each of exposures ( l ) through (4) we perform the calculation in Eq. (159),excluding the inaccessible zones and taking the-correct tilt angles. This results in four descriptions of the estimated i(-, .) and fl(., .) functions. These four reconstructions are equivalent to three reconstructions using the function $Jk’/2d, 1’/2d)in the whole square. The final estimated ;(.,.) and functions are the mean values of the four descriptions [Eq. (1 59)]. Since Eqs. (152) and (1531 are linear, the estimated function values are independent Gaussian random variables with
p(.,.)
*
~ { i E ; - ( - r / 2- ~- s, / ~ E ) - & Y / ~ E ,
-s/~E)}
-
=
iE(--r/2c, - s / ~ E )- f i ( - r / 2 ~ , - s / ~ E ) , (r,s) = {
-+ N ’ , ...,+N ’
-
l}
(160) The computation of the variances is more involved. From the first exposure we obtain the variance var{gl}
= fN2(4i,T)-’
c’
N’/2-1
1’
c
N’/2-1 N - l N - l
k=-N‘/Zf=-N’/Zm=O
x 4:[-(3~)-’k
Ianl2laml2
n=O
+ $ f ] $ ; [ - ( 3 ~ ) - ~+1+ d l
(161)
The prime on the summations over k and 1 indicates that we must exclude those values of k and 1 for which $:(-,-) could not be solved from exposure According to Slepian (1964),Ian12is approximately constant and equal to N - ’ for 0 I n I N - 1 , and zero elsewhere. Thus lc1,I2 and l c ( , I 2 can be replaced by
254
CORNELIS H. SLUMP AND HEDZER A. FERWERDA
N - ' . The three remaining exposures can all be treated in the same way. Combining the variances of the four exppures, we find for the variance of the resulting object wave function i$(.,-) .)
B(.,
N/2-1
=&(3)N2(4&T)-'
N/2-1
k=-N/21=-N/2
$:(
-
k / 3 ~ ) 4 , 2-( 1/34
=iN2(41,,T)-'
(162)
where we used the orthonormality of &(-,-). Note that in combining the four exposures, the summations over k and 1 cover the rectangle - $N _< k, 1 I i N - 1 three times.
IV. PARAMETER ESTIMATION A . Maximum Likelihood Estimation in Electron Microscopy
The present section introduces a second line of approach to low-dose electron microscopy; i.e., certain aspects of the optimal use of a priori information are considered. In this section we focus on the estimation of unknown parameters related to the object structure, using a single exposure of the specimen. Emphasis is on the statistical significance of the obtained results, and on the merits of numerical strategies. First a brief introduction to the statistical estimation method of maximum likelihood. From among other methods maximum-likelihood estimation is chosen because of its properties in the application to electron microscopy. This does not imply that maximum likelihood is the most suitable method in every case. Whenever considered appropriate, other methods will be used. The properties of maximumlikelihood estimation are treated in many textbooks on statistics (e.g., Kendall and Stuart, 1967; Van der Waerden, 1969).This estimation method originates from R.A. Fisher, who developed the theory in several papers around 1920(see his collected works, 1950). The principle of maximum likelihood was already applied, however, by C.F. Gauss in 1795, in dealing with observational errors in astronomy. It selects for the pertinent parameters those values which assign to the observed data the largest probability of occurrence. Electron microscopy is often applied in cases in which one already has the disposal of much information about the specimen to be observed with the microscope. This information can, for example, originate from knowledge of analytical chemistry or from x-ray diffraction. Usually the recorded image of the object structure is not of interest in itself. The aim in particular of low-dose
IMAGE HANDLING IN ELECTRON MICROSCOPY
255
electron microscopy should be to extract information about the observed structure, in other words to provide a description of the internal structure of the specimen in question. This task can be facilitated considerably through the use of a priori information. In order to be of practical value, the a priori information has to be translated into a description of the pertinent structure as a functional relationship between not too many unknown parameters. The type of biological specimens we focus our attention on can be modeled by their electrostatic potential V(x,,, y o , z; 8), involving, for example, the positions of several heavy atoms, where 8 = (O,,0 2 ,. . .,Om) denotes the m distinct parameters to be estimated. The electron wave function in the object plane results from this potential distribution by taking the projection of the distribution with respect to the object plane, as is discussed in Section 1,C
where u denotes the velocity of the electrons impinging on the specimen. Hence, the image wave function is also a known function of the unknown parameters. The intensity of the image Poisson process can now be calculated (in principle) for the (k, 1) image cell = 1sT(2d)-’
ss
Il/(X,Y;8)Il/*(X,y;8)dxdy,
Uk.1
( k , 1 ) = { -*N,. . .,*N - l }
(164)
The actual recorded image 5 is a realization of this stochastic process which we have analyzed in depth in Section I,G. From Eq. (164), the intensity being a function of 8, all the information contained in the recorded image can be used for the estimation of the m unknown parameters. The N 2 observations are independent Poisson-distributed random variables, as has been shown in Section I,G. Their joint probability density function is therefore
The function L(5,0) is called the likelihood function if the 6k.l values are fixed to the values of a realization of the stochastic image process, making L(5,O)a a function of 0 alone. According to the maximum-likelihood principle mentioned above, the “best” values for the parameters 8 are those values 6 for which the likelihood function has its absolute maximum. A necessary condition for the existence of a maximum in L(5,8)is that
a
-logL(ii,O)
a4
= 0,
i
=
1 , . .., m
256
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
We hereby explicitly assume that the maximum does not occur at the boundary-of the range of permitted 8 values. The maximum-likelihood estimates 8 are a root of the set of likelihood equations obtained from (167) Because the estimated parameters depend directly on the stochastic data, they are random variables themselves. The merits of the estimated parameter values can be judged best by their probability distribution. Often the determination of such a probability distribution is a task in itself, and usually this computation cannot be performed analytically. In this case one has to be satisfied with less complete characterizations of the accuracy of the estimates 6. such as the first two moments of the distribution. In particular the determination of the second moment can also become rather complicated. The analytical complications arise from the fact that the functional relationship between the parameters representing the a priori information about the specimen is generally highly nonlinear. With respect to the first moment, the bias b(6)of the estimator 6i s defined as the deviation of the expectation value of 6 from the true value 8 b(6)= E ( 6 ) - 8 (168) whereas the second moment is related to the mean-square error matrix V ( 6 ) , of which the cr.selement is defined as The matrix V ( 5 )is $so called the error covariance matrix when the estimator 5 is unbiased, i.e., b(8) = 0. The maximum likelihood estimator has several properties which are of great importance in the application of the method to low-dose electron microscopy. First, it is a consistent estimator. This means that the estimates converge in probability to their true values when the number of observations increases. Furthermore, the estimator is unbiased asymptotically in the number of observations N 2 . A lower bound for the mean-square error of any estimator of 8 is provided by the famous Cramer-Rao inequality (if it exists), (see e.g., Van Trees, 1968, p. 72, for a discussion in a general setting; or Snyder, 1975, p. 80, in particular for Poisson-distributed data). An intuitively appealing discussion of this bound can be found in Gardner (1979). The variance of any unbiased estimator of 8 is greater than or equal to the value specified by the Cramer-Rao bound. This value appears to be equal to the inverse of the Fisher information matrix, which can be interpreted as the information contained in the data about the unknown parameters (Van der Waerden, 1969; Kendall and Stuart, 1967; Van Trees, 1968).
IMAGE HANDLING IN ELECTRON MICROSCOPY
257
The amount of information that is contained in the stochastic data ii is of crucial importance in statistical estimation theory. An estimator can do no more than transform the information present in the data and redistribute it over the unknown parameters. An important problem in the theory of statistical estimation is whether a proposed estimator uses all of the available information efficiently, i.e., when there is no information loss. In order to compare the performance of different estimators it is very useful to have a measure for the information that is contained in the data. According to Fisher (1950) this amount of information is expressed in the following matrix (if it exists)
where L(. ,.) is the likelihood function. Assuming that L ( - .), is regular enough to allow the interchange of differentiation and integration, Eq. (170) is equivalent to
The m x m matrix F ( 8 ) with entries is called the Fisher information matrix and is widely used in the theory of statistical estimation. For Poissondistributed data and likelihood function [Eq. (165)], the Fisher matrix is given by (Snyder, 1975, p. 81) N/2 - 1 .fi,j(8)
1
N/2-1
=k=-N/Zi=-N/2
2
a
~ . ~ ~ ( e ) ~ ~ k , l ( 8 ) ~ ~ ~ k , l ( 8 (172) )
If the inverse of the matrix in Eq. (172) exists, then the inverted matrix yields the Cramer--Rao bound for any unbiased estimator of 8. The Cramer-Rao bound thus obtained is to be compared with the covariance matrix of the unbiased estimator for 8. This comparison yields insight into the merits of the pertinent estimator, and whether it uses all the available information efficiently. I t appears possible to construct efficient estimators only for the restricted class of exponential probability distributions, irrespective of the number of observations (Koopman, 1936; Pitman, 1936). The maximum-likelihood estimator is asymptotically unbiased in the number of measurements and normally distributed, while the covariance matrix approaches the Cramer-Rao bound. Hence, asymptotically the maximum-likelihood estimator has the lowest attainable variance. It should be noted that these properties of the maximum-likelihood estimator are not generally valid but are only true when the likelihood function satisfies certain regularity conditions; for example, the interchange of differentiation and
258
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
integration with respect to 0 should be admissible. In our case these conditions are fulfilled since a physical object is imaged with a finite aperture E. This guarantees rather smooth intensity variations. Although the Shannon number N 2 of low-dose images will in practical situations be of the order of magnitude of several thousands, and therefore the number of observations N 2 is relatively large, we can not always rely on the nice properties of the maximum-likelihood estimator which are asymptotically in N 2 . In many cases the asymptotic limit will be reached quite slowly. The likelihood function will often have several local maxima when the number of observations N 2 is finite. The fact that there may be several local maxima poses the difficult numerical problem of obtaining the global optimum of a function of several variables. The problem will be discussed in some detail further on. The great advantage of using a priori information is, as mentioned earlier, the use of all the information that is contained in the observations for the determination of the parameters to be estimated. Another advantage lies in its directness, as no inverse problem remains to be solved. The reconstruction procedures of the object wave function described in the previous two sections resulted in only one projection of the electrostatic potential distribution. The potential distribution of the specimen remains to be reconstructed from projections obtained under different tilt angles of the specimen. Disadvantages of the a priori information approach are the numerical complications in obtaining unique results (e.g., the global extreme of the likelihood function) and the evaluation of the statistical significance of the results obtained. It is almost impossible to make generalizations about the proposed method for information evaluation in low-dose electron microscopy. This is due to the heavy dependence of the obtained results on the functional relationship between the pertinent parameters. A minor modification of this relationship may change the performance drastically. Every application is almost unique. Therefore, only some examples are presented here. In the next subsection we consider three simplified estimation problems, each one with only one single lateral dimension. However, all features of the a priori information approach advocated in this chapter can be demonstrated with these examples. With respect to the Fisher information it should be noted that Eq. (170) is not the only measure for information. Other measures of information exist, for example, the Shannon information, which is widely used in communication theory. In information theory there exists a vast amount of specialized literature about measures of information (see e.g., Aczel and Daroczy, 1975). A useful overview of the role of information measures in information theory is contained in Van der Lubbe (1981, Chap. 1). The quantity defined by R.A.
IMAGE HANDLING IN ELECTRON MICROSCOPY
259
Fisher in Eq. (1 70) or (171) has the following properties which are intuitively associated with the concept of information. (1) The amount of information is positive definite. (2) The available information increases directly with the number of measurements. Doubling the number of independent measurements doubles the amount of information. (3) Information is related to the accuracy of the measurement. (4) The amount of information is related to the knowledge to be retrieved from the measurement. Data not relevant for the parameter set 8 chosen do not contribute to the information. B. Illustrative Examples
In this subsection three examples are presented showing different ways of using a priori information about the specimen for the evaluation of low-dose images. The examples are formulated for one (lateral) dimension only: In order to concentrate on the function of the prior information the equations are kept as simple as possible. The extension to two dimensions is straightforward for rectangular apertures in the microscope.
I . A Weak Phase Object The first example deals with a weak phase object. This situation generally applies to thin biological specimens when scattering contrast (i.e., interception of electrons by apertures) may be neglected. The prior knowledge that we are dealing with a weak phase object allows us to expand the object wave function $,,(.) of Eq. (163) in the first two terms of its Taylor expansion $o(xo) = 1 + i4(x,)
(173)
where 46) is a real function representing the projection of the object's electrostatic potential. That part of the Fourier spectrum which corresponds to spatial frequencies above E is intercepted by the exit pupil and is not propagated to the image plane. Therefore the information contained in the spatial frequencies above E is lost, as these frequencies do not contribute to the data. In the sequel we will discard spatial frequencies above E by modeling the object wave function by
where the bar indicates a low-pass filtered approximation. In theory it is possible to reconstruct $(-) with a bandwidth higher than E, using
260
CORNELIS H. SLUMP AND HEDZER A. FERWERDA
regularization techniques, see Bertero et al. (1980). In general, however, these techniques amplify the noise considerably and are therefore of limited value for low-dose microscopy. With Eq. (174) the low-pass filtered approximation to the original 4(-)function is represented by N amplitudes 4(k(2&)-'). We consider these to be unknown parameters ak, k = { --i N , ...,$N - l } that must be estimated from the recorded image ii using the maximum-likelihood method. From Eqs. (23) and (173) in combination with Eq. (174), the wave function in the exit pupil is expressed by
Hence, the image wave function is given by approximating the sinc function by a delta function
$(XI
=
1
+ iCa, k
J
r
tF
exp(-iy(t)
-
271i([k(2~)-'
-
x]}d(
(176)
-E
from which the squared modulus results f + E
where only terms that are linear in u k have been taken into account. This is consistent with the approximation in Eq. (1 73), where the quantities related to the object have been retained up to the first order. The expectation value for the number of electrons arriving in image cell 1 is E(n^,) = i, = (2d)-'i.,T /a
Ul:(2E)-'I -
(
+ (4c)-'
(4c)-' I XI I (2E)-'I
With Eq. (177) using the relation N i.= ,2,TN
dxt,b(x)$*(.u),
= 2 d ( 2 ~we )
' 1 + 4&zak k
1-y
x exp[-2ni5(2~)-'(k
~
( 178)
obtain
d(sin[y(()]
1)](71()-'
sin[2n((41:)-'])
(179)
The occurrence of sin y(-) in the integrand of Eq. (179) obstructs the analytical solution of that equation. However, by setting the defocus parameter D according to Scherzer focus conditions (Scherzer, 1949) in order to maximize the phase contrast, the shape of the sin y(-) function can be approximated by a few straight-line segments as shown in Fig. 9.
IMAGE HANDLING IN ELECTRON MICROSCOPY
26 1
0
t
g -0.5 2an
c
\\
-1.0
FIG.9. The function sin ~ ( tunder ) , Scherzer focus conditions approximated by straight-line segments, y ( ( ) = 2 7 r l ~ ~ ' ( ~ C4DC2), , ~ 4 C, = 1.6 mm, D = 80 nm, E = lo-', i. = 4 pm. ~
In order to arrive at manageable expressions, the sinc function in Eq. (179) is approximated by the constant value of (2&)-'. Instead of sin )(y we take th ree straight-line segments and the integral in Eq. (179) is set equal to
.-
sinn(k -
-
I)
n(2&)-'(k - I
I-& +&
d($(()exp[-2ni(2&)-'(k
+
where we approximated shy(() Fig. 10)
I b:
'Y
-1
+ 4((), lirl
with
4(()
-
l)]
(180)
defined by (see
48 2 812
(181)
The integral on the right-hand side of Eq. (1 80) can be elementarily evaluated by repeated integration by parts since the second derivative of $(()consists of four delta functions at k c/8 and f612. This results in
J
+&
dr$(;)exp[-2ni((2&)-l(k
-
113
-&
-
8sin[3e(k - /)/I61 sin[Sn(k - 1)/16] 3 ~ n ( k- l)(2&)-' n(k - 1)(2&)-l
~~
(182)
262
CORNELIS
N.SLUMP A N D HEDZER A. FERWERDA O(5)
-0.5E
-0.125E
0'
FIG.10. The function
-z
0.125E
0.5E
4(5)of Eq. (181).
With Eqs. (180) and (182) we obtain from Eq. (179) sin n(k - 1) - 8 sin[3n(k - 1)/16] 3~74k- 1)(2~)-' n ( 2 ~ ) - ' ( k- 2) -
sin[Sn(k - 1)/16] n(k - 1)(2&)-'
Equation (1 83) is the required relation between the Poisson-process parameter i.,and the unknown ak's to be estimated, cf. Eq. (164). The parameters a k can be evaluated by means of inversion of Eq. (183),replacing the expectation values on the left-hand side by the measured values In this example, however, we will follow the recipe of maximum-likelihood estimation. The maximumlikelihood estimates P are a solution of the likelihood equations, cf. Eq. (167)
-
8sin[3n(k - /)/16] sin[5n(k - /)/16] 3 m ( k - 1)(2&)-l n(k - I ) ( ~ E ) - '
x A,7"-'2(
r
=
sinn(r - 1) 7~(2~)-'(r -I)
{ - i N , . .. , i N
-
-
l}
11
8sin(3n(r - 1)/16) sin(Sn(r - 1)/16) 3m(r - I ) ( ~ E ) - ' n(r - l)(2&)-' (184)
Due to the presence of the parameters ak in the denominator, the set of equations is nonlinear and cannot be solved analytically. Instead the solution must be obtained numerically. It is usually very difficult to indicate the statistical significance of a result which has been obtained numerically. Of interest is the probability distribution or at least the first two moments of the estimated parameters P.
263
IMAGE HANDLING IN ELECTRON MICROSCOPY
In order to be able to judge the statistical significance of the estimated values, we proceed analytically by linearizing the set of Eq. (184). The denominator is expanded in the first two terms of the binomial series neglecting higher-order terms. This is consistent with earlier approximations, i.e., first order in the object properties. The following set of equation results
-
11
8sin[3n(k - 1)/16] sin[5n(k - 1)/16] 3 4 k - 1)(2&)-' n(k - 1)(2&)-' sin n(r - 1) n(2&)-'(r- 1)
-
8 sin[3n(r - 1)/16] sin[Sn(r - 1)/16] =O 3&n(r- 1)(2&)-' n(r - l)(2&)-'
(185)
Taking only the most significant contributions in the double summation into account, and neglecting terms which are smaller by one order of magnitude results in N/2-1 sinn(r - I ) 6, N +(fir + l ) - ' 1 (A,TN--' - f i J ( I= -N/2 n&(r- I ) -
16sin[3n(r - 1)/16] sin[5n(r - 1)/16] n(r - 1 ) 3n4r - 1 )
r = {-$N, ...,$ N - I }
(186)
The constant 1 is added to fir in the denominator because of the nonzero probability P{G, = 0 } that ii, will become zero, regardless of how large A, may be. By taking the expectation value of Eq. (186),using the fact that the fir's are independent random variables, and using E{(fi, l)-'} = &'[1 exp( - &)I, we obtain with the same type of approximations that were used in the derivation of Eq. (186)
+
E(6,)
= a,
(187)
Calculating the variance of G, in the same way using that E { ( f i , + 1)-'} A;' + AL3 + O(A;"), we obtain var (6,)
1:
12 1 (1024)- N(A,TE')-
N
N(~A,TE')-
=
(188)
The variance of the estimator in Eq. (186) as expressed in Eq. (188) can be compared to the lowest attainable variance of any estimator for a, as given by the CramCr-Rao bound. In order to determine this bound for a, the Fisher information matrix of the parameters a is calculated first [see Eq. (172)]
264
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
With the approximations used earlier in the derivation of Eqs. (186)-(188), and neglecting terms which are an order of magnitude smaller, Eq. (189) results in j;,s(a) 2 1 2 1 ( 1 6 ) ~ ' i . , T N ~ ' ~2 ~ SN, -, ,' 8 j . , T ~ ~ h , , ,
(190)
According to our approximations the estimator in Eq. (186) is unbiased [Eq. (187)l;therefore, the Cramer-Rao bound is equal to the inverse of the Fisher information matrix. As this matrix is in good approximation in diagonal form, the inversion is easy. It thus follows that the variance Eq. (188) is close to the value of the Cramer-Rao bound, an indication that the estimator in Eq. ( 1 86) is (nearly) optimal. In summary, in the first example the prior knowledge about the object structure is not very detailed; it is only a weak object [cf. Eq. (174)l. However, it became apparent that rather complicated nonlinear equations must be solved. With suitable approximations we are able to obtain analytical results of which the statistical features are established. The evaluation of the statistical significance of results that are obtained numerically is a difficult task. An example of such a situation is encountered in the next example. In that example the functional form of the image intensity distribution is supposed to be known, the pertinent function depending on several parameters.
2. linage Intensity Distribution in Analytical Form In this second example we assume the image intensity distribution to be an priori known function of several unknown parameters. This prior functional relationship, together with the observations which can be rather severely disrupted by noise, is the usual information which estimation methods take as their starting point. In the previous example the a priori knowledge concerned the object structure, and this resulted in the pertinent object wave function. The calculation of the corresponding image intensity is generally quite complicated. But in this second example this difficulty is sidestepped because the functional form of the image intensity is assumed to be LI
E(6,j
= j., =
i5TN-'( 1
~
h-1
akexp{-+s,2[/(2c)-1
- pkj2j),
(191)
1 = { - * N ,...,+N - I } with ak << 1 , k = ( 1 , . . . , m ) , such that every i.> ,0. The theoretical image contrast, i.e., the intensity distribution in the limit of an infinite number of electrons contributing to the image formation, consists of the sum of m Gaussian-shaped functions. The kth Gaussian function has an amplitude parameter u k , a position parameter pkr and the width of the bell shape is described by sL. From the recorded image, which is a realization of the stochastic Poisson process [Eq. (165)], the 3m parameters have to be
IMAGE HANDLJNG IN ELECTRON MICROSCOPY
265
estimated. A functional relationship between data and parameters, such as Eq. (191), can originate from a weakly scattering phase-amplitude object which is imaged in focus (i.e., with defocusing D set equal to zero) or from the mass-density variations of an object which is imaged under scattering contrast conditions. In order to have a problem that is manageable, the number of parameters in this example is made equal to three and the position parameters p k and the width parameters sk are set to fixed values. The three unknown parameters are the amplitudes of three Gaussian functions. The estimation problem is simplified to a linear relation between only three parameters. The remaining problem is nonetheless not trivial, as will become clear in the following analysis. The full problem will be treated in the next section, where we use twodimensional images. From Eq. (167) and using Eq. (191) we obtain the three likelihood equations
from which a solution yields candidate values for the amplitude a , , u 2 ,and u 3 . The set of equations (192) is nonlinear; hence the solution has to be obtained numerically. This can be done by a zero-locating algorithm (e.g., Powell, 1970) which attempts to find a zero in the neighborhood of a specified starting point. After a zero has been located, it remains to be checked whether it corresponds to a minimum, a maximum, or a saddle point of the corresponding loglikelihood function. In general Eq. (192) will have more than one zero. Thus, in order to obtain the global maximum of the log-likelihood function, the calculation must be repeated for various starting points. The set of equations (192) provides only a partial description of the function to be maximized:
266
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
For computational purpose it is better to maximize Eq. (193) directly than to locate the zeroes of Eq. (192) because using the function values in Eq. (193) will reveal regions of low likelihood where the location of local minima is not of interest. In this way the computational effort in the determination of zeroes which correspond to local minima is avoided. A local maximum of Eq. (193) is obtained using gradient information about the log-likelihood function by applying the algorithm proposed by Fletcher (1970), which is a variation of the algorithm of Fletcher and Powell (1963). The Fletcher algorithm departs from a starting point specified by the user, and then performs Newton-like steps in the direction of the local maximum in the neighborhood of the starting point until certain convergence criteria are satisfied. The estimated values for the parameters are defined to correspond to the global maximum of the log-likelihood function. With the algorithm indicated above there is generally no certainty that the obtained maximum is indeed the global optimum. Repeating the calculation from various starting points or from a number of randomly chosen starting points (Box, 1966), increases the likelihood that the largest maximum obtained is indeed the global optimum. The above outlined numerical strategy does not provide absolute certainty of obtaining the global optimum. We will return to this point further on where we will indicate a method that obtains the global optimum with certainty. The reliability of the numerical strategy of using the maximizing algorithm from different starting points can be improved upon by analyzing the shape of the log likelihood at the largest maximum found as a function of the parameters. A low-dose electron micrograph consists of a large number of data points. Hence it is to be expected that, for cases that are not excessively pathological, the likelihood function will tend to its asymptotic form. The asymptotic form of the likelihood function has one unique maximum, the width of which is characterized by the Cramer-Rao bound. The actual width is calculated as a function of the parameters at the maximum attained, and these values are compared. When these values are consistent with each other, the pertinent maximum is believed to be the global optimum. In Fig. 11 a realization of a simulated Poisson process with parameter j.,, 1 = { -fN,. . . ,fN- 1 ) [cf. Eq. (189)l is presented. Table I contains a summary of a series of numerical simulations with increasing electron dose 2, of the estimation of a,, u 2 , and u 3 using the procedure described above. The largest maximum found after a series 1, 6, 11, and 16 sets of random starting points (Box, 1966) is taken to be the global maximum the location of which determines the estimated value GZ, and 2,. It appeared from the simuiations that there is no difference in the maximum found after the different starting points generated. This adds to the confidence that the maximum obtained is indeed the global optimum. The shape of this maximum, which is
5
FIG.1 I. The simulated one-dimensional low-dose image (solid line) and 1, (dashed line) of Eq. (189) with the following a2 = 0.35, p 2 = 0, s2 = d / 3 , d = lo4, a3 = 0.25, p , = d / 3 , parameter setting: N = 201, a, = 0.25, p1 = - d / 3 , s, = d/5, c: = 0.5 x sj = d / 5 , I = 4 pm, I, TN-' = 8, dose F i, =, 50 e-/nmz(0.5 e-/A2).
Error analysis. confidence bounds
Estimated values
Electron dose -.-.
ir,
ir,
ir,
0.31
0.36 0.37 0.35 0.35 0.35 0.35
0.41
0.33 0.31 0.18 0.28 0.27
0.35 0.33 0.31 0.29 0.2x
I'ositivc hound -.
0.12 0.w 0.06 0.05 0.03 0.02
0.10
0.11
0.07 0.05
0.08
0.04
0.03 0.02
0.06
0.04 0.03 0.02
Ncg;~tIVC hound
-0.13 -0.w -0.07 -0.05
-0.03 -0.02
-0.11 -0.ox
-0.12 -0.09
-0.06 -0.04 -0.03 0.02
-0.06 -0.05
-0.03 -0.02
Experimental covarkince
0.13 0.09 0.06 0.05 0.03 0.02
0.11
0.07 0.06
0.04 0.03 0.02
0.12 0.08 0.06 0.05
0.03 0.02
Cramer Rao
0.13 0.m 0.07
0.05 0.03 0.02
0.1 I 0.08 0.06
0.13 0.09 0.07
0.04 0.03 0.02
0.03 0.02
0.05
269
IMAGE HANDLING IN ELECTRON MICROSCOPY
to be compared with the width predicted by the Cramer-Rao bound, is analyzed as follows. The parameters are set to the values corresponding to the largest maximum. For each parameter a consecutive determination is made of the two values for which the log-likelihood function has decreased by 0.5 in comparison to its value at the pertinent maximum. During this calculation of the positive and negative error bounds of the parameter in question, the other parameters retain their values corresponding to the maximum. When the log-likelihood function has its asymptotic form corresponding to an infinite number of observations, the above procedure yields the exact confidence intervals of the parameters (Eadie et al., 1971, p. 203). The decrease of the log-likelihood function by 0.5 corresponds to confidence intervals of one standard deviation. Hence in the asymptotic regime there is a probability of 68.3% that the parameter in question will be in the interval bounded by the two error values found. Evaluation of the Cramer-Rao bound yields a ( 3 x 3 ) matrix V which is the inverse of the Fisher information matrix
1
N/2-1
f;,j(a) =
L= -N/2
A,'(a)exp{ - ~ s ; ~ [ Q ~ E ) - ' - p i I 2 - 3 s i 2 [ I ( 2 e ) - '
-
pj12} ( 194)
Maximum-likelihood estimates are asymptotically unbiased in the number of observations. Hence taking the square root of the diagonal terms of V yields the predicted confidence intervals for the parameters in question. The Cramer-Rao bound depends via Eq. (194) on the true values of a. In practice this bound has to be evaluated using the estimated parameters a^,, a^2, and a^, . The confidence values according to the experimental covariance matrix are indicated in Table I. In the simulations of this example in which the true values of a are known, we have determined for comparison purposes the exact Cramer-Rao confidence intervals. We see that with increasing dose the error bounds approach the Cramkr-Rao confidence intervals quite closely. The symmetry in the parameters describing the Gaussian function 1 and 3 (cf. Fig. I 1) is reflected in the error bounds of the parameters a , and u 3 . In the following example the a priori information is, with respect to the object structure, unlike the present example where the intensity distribution is an u priori known function of a number of parameters. 3. A IVelikly Scattering Bell-Shaped Object Structure The third and final example of this subsection, presenting case studies about the estimation of a priori parameters in the evaluation of lowdose electron micrographs, consists of a weak Gaussian amplitude object. Like the other two examples in this section, the analysis is for simplicity purposes limited to one lateral dimension. We assume that the object wave function
270
CORNELIS H. SLUMP AND HEDZER A. FERWERDA
can be expressed as the following function of three unknown parameters 8 = (a,p,s) i,bo(xo)= exp{ -+aexp[-s-'(x,
-
P)~]}
(195)
These three parameters must be estimated from a recording of the intensity distribution in the image plane of the microscope. Assuming furthermore that parameter a is not too large, a < 1, Eq. (195) can be approximated by (weakly scattering object) t,bo(xo) = 1 - +a exp[ - - ~ - ~ ( xP ~ )~]
(196)
The wave function in the exit pupil of the optical system becomes
where it is assumed that position pis not located near one of the endpoints -d and + d of the object plane and that hence the Fourier transform can be carried out from - x to + co.If the width s of the Gaussian function is of the order of magnitude of several sampling distances ( 2 ~ ) - in ' the image plane, then the Gaussian function in the exit pupil is so narrow that the integration interval --e to + E is equivalent to - x to + a.Due to the rapid decrease of the Gaussian function, the aberration function exp[ - iy(()] only contributes in the region t = 0. The image wave function follows
x exp(2nitx)
I - taexp[-ss2(x - P ) ~ ]
(198) We see that, under the conditions stated above, the image wave function is in good approximation equal to the object wave function, cf. Eq. (196). The intensity parameter of the recorded Poisson process ii = (6-, v , z , . . . ,6N,2 ) is given by, see, e.g., Eq. (1 64) or (1 78) 2
-
In order to obtain estimated values for the parameters 8 = ( a , p , s ) , we maximize the log-likelihood function log L(., .) which reads as
logL(6,O) =
N2-1 I = -N/2
[-/Z,TN '(1
+ 6,log(l
-
-
~ e x p ( - s - ~ [ 1 ( 2 ~ )-- 'p]'))
aexpj-s-'[1(2~)-~ - p ] ' } ) ]
(200)
27 1
IMAGE HANDLING IN ELECTRON MICROSCOPY
A necessary condition for a maximum of Eq. (200) is that the likelihood equations are zero, (d/dQi) log L(ii,fl) = 0, i = 1,2,3. This yields
C(A,TN-' - ;il;i;')2~-~[1(2~)-~ - p12 exp{ - s ~ ~ [ 1 ( 2 & ) - ~PI'} I
=
o
which form a coupled set of nonlinear equations which cannot be solved analytically. Therefore we proceed numerically in order to obtain estimated values for the parameters. As indicated in the previous example, we prefer to maximize Eq. (200) iteratively from a starting point rather than locating the zeroes of Eq. (201). Repeating the numerical procedure from various starting points will in the end reveal the global optimum. When using only a finite number of starting points, one has no absolute certainty that the highest maximum found in the optimizing process is the global maximum. Comparison of the shape of the maximum in question with its asymptotic values as expressed by the Cramer-Rao bound can identify the global maximum when the shape values are consistent. Table I1 summarizes a series of numerical simulations estimating a, p, and s, performed with increasing electron dose A,, = 2, T N - ' . The numerical procedure is identical to the one used in the previous example, which is referred to for details. We observe that for the highest value the width of the largest maximum approaches the asymptotic values predicted by the Cramer-Rao bound. This indicates that this maximum is indeed the global maximum. For the lower dose values the deviations are considerable, and this check on the results obtained does not work. As has already been remarked in connection with the previous example, the Cramer-Rao bound depends in many cases on the unknown parameters to be estimated. For very low-dose electron micrographs, one cannot be certain of being in the regime of asymptotic statistics. These two facts make the procedure of locating the global maximum by repeating the same calculations over and over while varying the starting values rather unsatisfactory. Fortunately, a numerical procedure exists which locates the global optimum with certainty. A discussion of this method in a general setting is beyond the scope of this article (see Slump and Hoenders, 1985). On behalf of the present discussion, an outline of the procedure is presented in the following. The method just referred to is based on information about the total number of stationary points of the log-likelihood function in the domain of interest. A stationary point of log I,(.,.) corresponds to a zero of the set of likelihood equations (d/dOi)logL(5,fl)= 0, i = 1,2,. . . [cf. Eq. (201)] of this
TABLE 11 SUMMARY 01' T H I NUMERICAL SIMIJLA.IIONS or THE ESTIMATION OF THE AMPLITUIIE N , THE POSITION po
A N I I THE
WIDTHs
OF A
GAUSSIAN FUNCTION"
Error analysis, confidence bounds Dose
Estimated values
'.I,
11
P
Y
2 4 6
0.369 0.346 0.750 0.322
598.6 578.8 492.1 671.9
950.2 907.9 148.2 820.4
X
Positive bo ti nd
0.147 -~
0.189 0.078
340.9 267.9 31.4 183.3
> 1550. 760.0 82.6 333.9
Experimental covariance
Negative bound
-0.205 -
-0.188 -0.079
-0.1534.6 -468.6 - 42.8 -196.2
-544.0 -312.3 66.3 -208.4 -
0.156 ~
0.185 0.077
395.4 -
37.7 179.1
Cramer-Rao
548.6 ~
70.9 243.1
0.187 0.133 0.108 0.094
339.2 239.9 195.8 169.6
491.2 347.4 283.6 245.6
' I 0". Eq. (199) from Poisson-distributed data with the following parameter setting: N = 101, a = 0.3, i: = 10 .', p = d / 3 = 833.33, d = 2500, 4 4 = 625.0. A horizontal bar indicates that the value in question could not be calculated due to severe nonparabolic behavior of the log-likelihood function for the corresponding parameter, caused by the neighboring local extrema. The size of the sampling cell is 4 A'. Hence the dose values correspond with 0.5, 1.0, 1.5. and 2.0 e/AZ.
.s
IMAGE HANDLING IN ELECTRON MICROSCOPY
273
example. The information about the total number of local extrema present in a domain is of great value as it tells us whether an iterative procedure to locate all of the zeros of the likelihood equations has missed a zero. This information is provided by evaluating an integral derived by Picard (1892) from previous work by Kronecker (1878) at the end of the nineteenth century. The integrands contain relatively simple algebraic quantities containing derivatives up to the third order of the log-likelihood function involved. The integration must be performed over the domain of interest. For an extensive discussion of this socalled Kronecker-Picard (KP) integral illustrated with examples, see Hoenders and Slump (1983). The Kronecker-Picard integral yields the exact number of zeros of a set of equations in a domain, provided that the zeros are simple; i.e., the Jacobian must not be equal to zero for these points. C. Two-Dimensional Examples The application of the maximum-likelihood method to the estimation problems of the previous section illustrates the possibilities and properties of this method in estimating LI priori parameters in the evaluation of low-dose electron micrographs. Image data are, however, essentially two dimensional. Therefore, in this section a more realistic example is presented which is based on two-dimensional data. The estimation problem presented in this section is inspired by the second example of the previous paragraph [cf. Eq. 1911. The a priori image intensity is assumed to be a function of 15 parameters A k J = x o (1
+ 1 umexp(-fs;2[k(2&)-~ 3
m= I
-
pm12 - +r;2[/(2+1-
qm]2}) (202)
= i.,TN-’ and = E(&). Besides the two-dimensional data, a with i0 difference with the estimation problem in Eq. (191) is that the amplitudes a, are not constrained to be smaller than unity. The problem of this section is the estimation of the parameters of the three Gaussian blobs from simulated lowdose images, with Poisson-distributed picture elements ( k , I ) = ( - i N , . .., $N - l), of which the corresponding intensity &,, is given by Eq. (202). The simulated images are presented in Fig. 12. The estimated values for the parameters are obtained from maximizing the log-likelihood function corresponding to Eq. (202). The numerical procedure is identical to the one used in the one-dimensional examples of the previous section, where the details are described. Table I11 summarizes the series of simulations estimating the parameters p, q, r, and s, performed with increasing electron dose I.,, with fixed values for the amplitudes a and with the image data of Fig. 12. The amplitudes
274
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
FIG.12. Simulated images used in the estimation calculations which are summarized in Table 111. (a) contains the noise-free image corresponding to Eq. (202).
SUMMARY OF
THE NUMERICAL SIMULATIONS OF THE
Estimated parameter values
Dose 10
8 16 32 48 64
TABLE 111 ESTIMATION OF THE PARAMETERS p, q. r, AND s OF THE THREE GAUSSIAN BLOBS"
PI
-2516.3 2529.4 -2537.1 -2547.2 -2558.6 -
41
3127.9 3 170.2 3185.1 3194.9 3193.6
-
f-1
3026.3 3 144.9 3173.7 3175.9 3170.7
s^1
2059.8 2084.0 2114.9 2105.8 2116.2
P2
42
F2
F2
5070.6 5 1 15.3 5098.2 5088.9 5094.6
1578.8 1594.6 1593.6 1597.0 1601.1
2607.6 2587.5 2574.1 2557.2 2568.1
2070.9 2093.1 2112.2 2115.0 2124.6
P, -1288.5 - 1296.0 -1287.1 - 1285.5 -1285.9
43
F3
-5181.4
3000.6 3 1 10.0 3156.9 3172.5 3179.0
- 5076.4
-5085.2 -5118.0 -5122.9
"Cf. Eq. (202), from Poisson-distributed data, see Fig. 12, with the following parameter setting: N = 128, a , = 4, p1 = -2560, y I = 3200, rl a2 = 6, p 2 = 5120, q2 = 1600, rz = 2 5 6 0 , ~=~2133.3, d = 6400, a , = 5, p , = 1280, q , = 5120, r 3 = 3200, s, = 3840.0. sI = 2 1 3 3 , ~= 0.5 x
F3
3660.5 3707.6 3752.3 3827.8 3822.1 =
3200,
276
CORNELIS H. SLUMP AND HEDZER A FERWERDA TABLE I V THEVALUESOF THE E X A ~CTK A M ~ R - R BAOLND O OF THF PARAMETERS p, q. r, A W s OF THE THREE GAUSSIAN BLOBS"
35.0 24.6 17.4 14.2 12.3
8 16 32 48 64 ~
~~
26.2 18.5 13.1 10.7 9.3
43.7 30.8 21.8 17.8 15.4
25.0 17.6 12.5 10.2 8.8
33.4 23.6 16.7 13.6 11.8
13.6 9.6 6.8 5.6 4.8
41.5 29.4 20.8 17.0 14.7
11.3 8.0 5.6 4.6 4.0
14.2 10.1 7.1 5.8 5.0
62.0 43.8 31.0 25.3 21.9
14.7 10.4 7.4 6.0 5.2
111.4 78.8 5.5.7 45.5 39.4
~
" Cf. Eq. (202).
a are excluded from the estimation because of reasons of computational convenience. The parameters that must be estimated are now all of the same order of magnitude. The exact Cramer-Rao bounds of the estimated parameters are presented in Table IV. An in-depth analysis of the shape of the attained maxima reveals that only for the highest dose values the width of these maxima approaches the values of the Cramer-Rao bound as presented in Table IV. This is due to the fact that Eq. (202) is a highly nonlinear function of the parameters that must be estimated. The analysis of the shape of the attained maxima was greatly facilitated through the use of the MINu1.r program (James and Roos, 1975), developed at C E R N , Geneva, for function optimization. Even with the capabilities for global optimum search offered by the M I N U I program, ~ we still have n o guarantee of attaining this maximum. D. Discussion and Conclusions
The subject of this section is the optimal use of u priori information about the structure of the imaged specimen in low-dose electron microscopy. In this section we take advantage of the prior information available by modeling the object structure in a functional relationship between a number of parameters. From this description a theoretical image intensity distribution results, i.e., the image contrast in the limit of an infinite number of electrons contributing to the image formation. Using the statistical technique of maximum-likelihood estimation, numerical values are obtained for the unknown parameters from the registered realization of the stochastic low-dose image process. The advantage of the approach of parameter estimation is that all the information available in the data is used to determine the relevant parameters about the imaged specimen one wants to know. A disadvantage of parameter estimation is the theoretical image contrast which is required as a function of
IMAGE HANDLING IN ELECTRON MICROSCOPY
277
the parameters to be estimated. This image contrast must be based on the object wave function, a calculation which is analytically very elaborate and complicated for phase contrast images. Furthermore, the determination of the object wave function as function of a number of parameters is not a simple task. Of course, the required functions can be computed numerically. However, the whole estimation procedure will become rather time consuming. More feasible is the situation at a much lower resolution scale, where scattering contrast dominates the image formation. The required image contrast as a function of parameters now can be based on the much more simple mass-density model of the specimen involved. Because of the lower resolution, the sampling cells in the image are much larger and better statistics in the data are achieved for low electron-dose values. A further complication with parameter estimation is the fact that in general the estimation problem is highly nonlinear in the parameters of interest. This nonlinearity manifests itself in the presence of local maxima in the likelihood function. The search for the global maximum of the likelihood function is a very complicated numerical problem when local extrema are present. Since the estimated parameters are based on stochastic data, the obtained values are also random variables. The statistical properties of the results are as important as the actual numerical values calculated. Unfortunately, the determination of even the first two moments is often a complicated task, due to the nonlinearity of the problem. A statistical characterization of the estimated parameters can only be established in the asymptotic regime of the maximumlikelihood estimator. Again the low-resolution imaging of specimens with scattering contrast is the most promising situation for the application of maximum-likelihood parameter estimation to low-dose electron microscopy in molecular biology.
v.
STATISTICAL HYPOTHESIS TESTING
A . Introduction to Statisticul Hypothesis Testing in Electron Microscopy
The present section is the second one which is devoted to the optimal use of a priori information. The evaluation of low-dose electron micrographs is
considered using the techniques of statistical decision theory. First we provide a short introduction to the kery useful technique of statistical hypothesis testing. This technique will be applied in consecutive subsections to three key problems in the evaluation of low-dose images (1) The detection of the presence of an object with a specified error probability for missing the object and false alarm.
278
CORNELIS H. SLUMP AND HEDZER A. FERWERDA
(2) The detection of the positions of single heavy atoms, to be used as markers in the analysis of images of identical molecules with random orientation. The markers allow the images to be aligned and averaged which leads to a higher signal-to-noise ratio (Frank, 1980; Van Heel and Frank, 1981). (3) How to measure the statistical significance of, e.g., applying image processing to the low-dose image, in order to judge to what extent artefacts are introduced by the computer processing of the image. The visual interpretation of low-dose electron micrographs of moderately stained or unstained biological material is almost impossible due to the low and noisy contrast. Therefore computer processing of these images is indispensable. However, image processing applied to electron micrographs by means of a digital computer has to be performed with great care in order to prevent artefacts and false judgements about the structure to be observed. For overcoming these complications, which can be severe, especially for low-dose images, statistical decision theory offers a tool for an independent and objective check afterwards by quantifying the statistical significance of the obtained results. This can be done by statistical hypothesis testing whenever one has prior information about the structure being observed. In many cases occurring in practice, the information in an electron micrograph is partially redundant. This redundancy of the image, which is equivalent to a priori information, offers the opportunity to reduce the influence of noise. One way in which this a priori information can be used optimally is to apply the method of maximum likelihood to the estimation of unknown parameters. This technique which has been studied in depth in the previous section is especially suited when detailed a priori information about the parametrization of the specimen and the resulting image intensity distribution is available. Another approach of using the available a priori information in an optimal way is the construction of one or more hypotheses about the image distribution. Next, the statistical significance of the hypothesis under consideration is tested against the recorded image intensity which results in case of consistency in acceptance of the hypothesis and otherwise it is rejected. The rest of this section contains an outline of this technique of hypothesis testing (for a more general discussion see e.g., Van der Waerden, 1969; Kendall and Stuart, 1967; Lehmann, 1959). Throughout this chapter a recorded low-dose image is represented by an N x N array of statistically independent Poisson-distributed random counts i?k,lwhich correspond to the number of electrons that have arrived in the ( k , I ) image cell, ( k , I ) = { - $ N , . . . , i N - 1 ) with N Z roughly equal to the number of degrees of freedom of the image. The probability distribution of an individual iik,lhas been discussed in Section I,G [cf. Eq. (12)]. The following example is a simple application of hypothesis testing to a recorded image.
IMAGE HANDLING IN ELECTRON MICROSCOPY
279
Suppose that we have to decide between two possibilities: in the image ii or in a smaller region of interest, either specimen A or specimen B is imaged. The specimens A and B can be, e.g., two different biological molecules. Specimen A is characterized by the intensity parameter /z& of the Poisson process of the image recording and let specimen B correspond to the intensity parameter &. We assume here that the intensity parameters A t l and A& are completely specified; i.e., they do not depend on unknown parameters that have to be determined from the image data. In this case we have two simple hypotheses, the null hypothesis H,: Specimen A is imaged and the alternative hypothesis HI: specimen B is imaged. The null hypothesis is the hypothesis which is tested, here chosen to correspond to specimen A . Composite hypotheses also exist, in the case there is not one simple alternative hypothesis but instead a number of alternatives usually involving a free parameter. In the next subsection an example of such a composite hypothesis will be encountered. From the recorded image ii we now have to test hypothesis H , against its alternative HI and to decide whether specimen A or B was imaged. In order to do so, a so-called test statistic T is needed, which is a function of the experimental data to be specified further. Let W be the sample space of the test statistic, i.e., the space containing all possible sets of values of T.The space W is now divided into a critical region w and a region of acceptance W - w. If T falls within the critical region, hypothesis H , is rejected; otherwise it is accepted. The critical region w is chosen in such a way that a preselected level of significance c1 of the test is achieved. This level of significance sl is defined as the probability that T is in w while H , is true (see Fig. 13a) c(
=
P { T €WJH,)
=
i‘
p(TIH,)dT
(203)
In other words, c1 is the probability that H , is rejected although the hypothesis is true. Having chosen a value of a, the value of c follows from Eq. (203) such that if T 2 c: H , is rejected and thus HI is accepted; if T I c: H , is accepted, HI is rejected. Whether a test is useful or not depends on its ability to discriminate against the alternative hypothesis H I .This is measured by the power of the test, which is defined as the probability 1 - p that T is in w while HI is true. This makes p the probability that H , is accepted although HI is true (see Fig. 13b)
p = P { T €w - WIH,}
=
SI
p(TIH,)dT
(204)
m
The performance of a specific test is measured by the two types of error that may occur. The first is type I: H I is chosen while H , is true (“false alarm”). The probability of a type-I error is c(. The second error is called type 11: H , is chosen while HI is true (“miss”).The probability that a type-I1 error will be made is p.
280
CORNELIS H . SLUMP A N D HEDZER A. FERWERDA
Fic;. 13. The level of significance 2 and the power I against the simple alternative hypothesis H , .
- [j
of testing the null hqpothesis ff(,
In hypothesis testing one has to choose the significance level 2, i.e., the probability of a type-I error one is willing to accept and the test statistic T which is to be chosen such that for a given value of 2, /lis minimal. In this section three test statistics will be compared: the likelihood ratio, the chisquare test, and Student's t test. These test statistics are introduced in the following. The likelihood of observing the recorded realization fi of the stochastic image process is given by [cf. Eq. (19)]
I!d(fi,k) = n n e x p ( ~ j " k . J ) ( ~ k ,; .Jfk.1 l k!' ) ~ ' k
(205)
I
The likelihood ratio q is the test statistic which is defined as the ratio of the probabilities of obtaining the recorded count pattern for the hypotheses H , and H ,
q(ii) = L(ii, Ho)/L(fi,H , )
= exp
CC[.~,, '
-
+ fik,i(Io!gi&
-
~og;.;,,)~)
(206) Having calculated the likelihood ratio q according to Eq. (206), its value is to be compared with threshold value q,. If q 2 qo hypothesis H , is accepted, otherwise H , is chosen. The test procedure is now completely specified; what remains to be solved is how the threshold value qo should be chosen in order to correspond to the desired CI level. A further question is what the resulting
IMAGE HANDLING IN ELECTRON MICROSCOPY
28 1
power of the test will be. In general these matters depend on the hypothesis at hand, i.e., the differences between At.,and j&. The likelihood ratio is a powerful test statistic for the decision between the two simple hypotheses H , and H I . Another test statistic which is especially suited for measuring the discrepancy between observed iik,ldata values is the chi-square statistic Tx2with N 2 - 1 degrees of freedom T,,(N2
1) = x x j . ; l ' ( k k , L- i,J2
~
k
I
(207)
The larger the values of Tx2,the larger is the discrepancy between the observed and expected data values. The expected data values are computed on the basis of a hypothesis H , . This H , is rejected if the obtained q2value exceeds the critical value at the desired significance level, e.g., x:,95 or xi,99,which are the critical values to be obtained from tables at the 5% and 1 "/, significance level, respectively (see, for example, Van der Waerden, 1969, Chap. 14, Table 6). In that case the chi-square test concludes that the observations differ significantly from the expected values at the chosen level of significance. Otherwise H , is accepted or at least not rejected. In the next subsection also a third test statistic will be encountered which has a more limited scope of application, namely Student's t test. This is the appropriate test statistic if one wants to test whether an observed mean value of a set of N 2 independent normal-distributed random variables ( X I ,X , , . . ,X N 2 )is consistent with the expected value p . The test statistic t , which is defined as
x
,
r
= .f '(X-
p)N
(208)
where X = N-' C i X , is the sample mean and s2 = ( N 2 - 1 ) - ' c i ( X i- X)' denotes the sample variance. has a Student's t distribution with N 2 - 1 degrees of freedom. Also for this test statistic the critical values, e.g., to,,, and to,99 can be obtained from tables (for example Van der Waerden, 1969, Chap. 14, Table 7). If the test statistic exceeds the critical value, the hypothesis H , is rejected. In the next subsections statistical hypothesis testing is applied to problems in electron microscopy. B. Object Detection
A critical problem in the evaluation of low-dose electron micrographs is the detection of the presence of an object in the noisy images. Once an object has been detected in a certain region of interest, various techniques can be applied to extract information about this object. However, first the question has to be answered whether there is an object present or that the pertinent image intensity variation is .just a random fluctuation. Otherwise, faulty
282
CORNELIS H. S L U M P A N D HEDZER A. FERWERDA
conclusions will arise from applying image processing to an image which consists of random noise only. The detection of objects in question in noise-limited micrographs has been treated by Saxton and Frank (1977) using a matched filter approach based on cross-correlation and by Van Heel (1982)applying a variance image operator to the low-dose image. The visual perceptibility of objects under low-intensity levels has been treated in the pioneering work of Rose (1948a,b) in the early days of television systems. The results of Rose’s analysis are fundamental to low-dose electron microscopy and will be outlined briefly. In order to detect an image-resolution cell of one picture element (pixel) having a contrast C, where C is defined as the relative difference with respect to the background intensity A,, C = A&,/2, in an image of N x N pixels, a total number of electrons nT is needed in the image information. According to Rose (1948a,b) this number is given by
nT = N2k2Cp2 (209) It is assumed that this total number of electrons is uniformly distributed over the image and further the detection quantum efficiency (DQE) is taken to be unity, so that every impinging electron is recorded. The factor k in Eq. (209) is introduced in order to avoid false alarms and should be between 4 and 5 when the image has about lo5 pixels. The following example adopted from Rose ( 1973) illustrates Eq. (209) and clarifies the role of the factor k. Suppose we want to detect a single picture element with a constrast value C of at an unknown position in an image consisting of 100 x 100 pixels. According to Eq. (209) a total number of (at least) 108k2imaging electrons is needed in order to make this pixel visible. A pixel of the background receives in the mean a number of electrons 2, equal to 104k2 and the pixel to be detected expects to receive 1 = 9900k2electrons. The recorded numbers of electrons, ii, and ii, respectively, are Poisson-distributed random variables as is discussed in Section 1,G. Due to the relatively large numbers involved, the Poisson distribution is well approximated by the Gaussian probability density function P { ii,
= m } = exp( - i.,)(m!)-
’2;
= p(m)= ( 2 n a ~ ) p ” 2 e x p [ - - ~ ~ 2(m
(210)
where 0; equals j-,. Figure 14 presents the two probability density functions, the shaded areas correspond to the type-I error a, and type-I1 error p, which depend on the decision threshold c. The distance between the two peaks is look2.This is equal to k times the standard deviation, which is nearly the same for both density functions. If the decision threshold c would be situated halfway in between 2 and L o , the error probabilities a and fl would be equal.
IMAGE HANDLING IN ELECTRON MICROSCOPY
283
FIG.14. The two probability density functions p ( m , io) and p(m, A),together with the shaded areas a and p, representing, respectively, the probability of a type-I error (“false alarm”) and a type-I1 error (“miss”).
Usually this is a desirable situation, but not in this case because there are lo4 - 1 background pixels. Although Rose cannot use explicitly the threshold c because his detection criterion is the visual perceptibility, it is argued in Rose (1973) that the distance from c to I., should be at least 4 standard deviations in order to bring the total CI risk down to 0.3. With a distance from A to c of one standard deviation, the risk becomes 0.1 58, and the value for k is found to be 5. When the image contains less pixels, the value for k can be lowered somewhat. Note that the dose values in this example (I., = 25 x lo4 electrons/pixel) are far away from low-dose imaging conditions, which underlines the inherent difficulty of the evaluation of low-dose electron micrographs at high resolution. Considering the detection of image detail with a larger spot size than one pixel, Eq. (209) can be rewritten as
nT = A k 2 ( d C ) - ’
(21 1 )
where d is the diameter of the spot to be detected and A is the area of the image. According to Eq. (21 1 ) the diameter of a test spot which is just visible varies inversely with the contrast c‘ for a fixed value of the electron dose. This relation is illustrated in Fig. 15, where a test pattern adopted from Rose (1948b) is presented, which consists of a two-dimensional array of discs in a uniform background. The diameter d of the discs decreases in steps of a factor of 2 while moving to the right along a row and the contrast C of the discs decreases in steps of a factor of 2 while moving downwards along a column. These images show that the boundary between the visible and invisible discs lies roughly along a diagonal where the product dC is constant. With Fig. 15 the discussion of Rose’s detection criterion is determined, and we turn to hypothesis testing for the detection of objects. For objects
284
CORNELIS H. SLUMP AND HEDZER A. FERWERDA
d FIG.15. Test pattern [Rose (3948b)], illustrating Eq. (21 1). (a) Original test pattern. (b)-(d) Test patterns for electron-dose values of 8, 16, and 32 electrons per pixel.
consisting of one pixel, not so much can be improved upon Eq. (209); however, for larger objects a significant improvement is possible, as has also been reported by Saxton and Frank (1977) and by Van Heel (1982). This can be understood from the fact that the Rose criterion is the visibility of the test spot, which is based on the integrated contrast over image elements with the size of the test spot. In the case of extended objects detection methods based on the statistics of the individual pixels use more information from the recorded image and therefore in principle are more appropriate for the treatment of lower dose values.
285
IMAGE HANDLING I N ELECTRON MICROSCOPY
Applying statistical hypothesis testing to the detection of the presence of an object, one has to decide between two possibilities: either there is no object, the null hypothesis ( H o : j . k , l= constant = 1,,7iV2) or there is an object present, the alternative hypothesis ( H , : j u k , larbitrary, but not all equal). If ,?,7?v-’ is known, then H , is a simple hypothesis because it is completely specified; H , , however, is a composite hypothesis (the object is not specified). The likelihood-ratio test statistic as it is discussed in the previous section does not apply to this situation because the probability of the alternative hypothesis cannot be specified. This complication is overcome by using instead the generalized or maximum-likelihood ratio as test statistic, which is defined as the ratio of the maximum-likelihood values of the two hypotheses. Of both hypotheses the likelihood is maximized by variation over the pertinent hypotheses; cf. the maximum-likelihood estimation. When H , is true = ii = N L(ii,KO) is maximal estimating Ak,[by CkEl Ck,[. If H , is true, we have = Gk,[;in the case of one observation the best estimated value is the observation itself. The generalized likelihood ratio q is
q ( 6 ) = L(fi,J?,)/L(h,
fi,) = exp
N2iilog ii
-
k 1
Ck.[log Ck,,
from which it follows that 0 5 q I 1. It can be shown (e.g. Kendall and Stuart, 1967, p. 233) that when H , is true, - 2 log q is distributed for N2 -+ CT, as z 2 ( N 2- 1). For image data we may well expect to be in the asymptotic regime so that the probability distribution of -2 log q equals the chi-square distribution with N Z - 1 degrees of freedom. When one has chosen the level of significance ci, usually of the order of 1 the threshold value c for the decision acceptance or rejection of H , can be obtained from tables of the ;c2 distribution (see Fig. 16).The decision threshold
x,
-r
FIG. 16. The chi-square distribution with N *
1 degrees of freedom of r = threshold value c is chosen such that the shaded area equals 1 - a. -
--
21og q. The
286
CORNELIS H. SLUMP AND HEDZER A. FERWERDA
c with which r
=
-
2 log q is to be compared, is chosen such that P x z { r ,N 2 - l} dr
However, for larger values of N 2 - 1 the chi-square distribution is approximated very well by the Gaussian distribution. Defining the value z as follows ~ ( q=) (2NZ- 2)- '"(1 - N 2 - 210gq)
(214)
we have obtained a variable which is distributed standard normal N(0,l). The a levels with the corresponding threshold values c are presented below.
r level
Threshold value c
0.1
1.282
0.05
1.645
0.01
2.326
If the value z exceeds a threshold value c, the H , hypothesis is rejected at the corresponding significance level a. We will follow here the conventional terminology that results significant at the c( level of 1% are highly significant, results significant at the 5% level are probably significant, and results significant at levels larger than 5% are not significant. Because H , is the set of all possible alternatives to H , , nothing can be said about the probability of a type -11 error /I. Applying the chi-square test for the detection of the presence of an object, the test statistic is the following [cf. Eq. (207)] T , , ( N ~-
where 6 = N 2 Z kZ I G k , l . Because of the asymptotic properties of the x 2 distribution, which are already discussed in connection with Eqs. (213) and (214), the value z is defined as z(T) = (2N2 - 2)-' ' [ T , 2 ( N Z- 1) - N Z +
11
(216)
is distributed standard normal N(0,l). Therefore, the threshold values c of the above table also apply to the chi-square test. With the third test statistic, is., Student's t test of Eq. (208) the consistency of the observed mean value is measured against the expected value p, which corresponds to i.,TN -'. The value of p is to be determined from previous exposures under the same conditions, e.g., by dose measurements using a Faraday cup. The t test requires that the random variables to be tested are
IMAGE HANDLING IN ELECTRON MICROSCOPY
287
(approximately) normally distributed. We therefore first apply the square-root transformation to the Poisson-distributed image data j k , [ = (kk.1
+
$)1’2
(217)
The obtained j k , [ values are in good approximation normally distributed N ( p L y , b )where , p y = ( p + ;)l/’. The square-root transformation in Eq. (217) is discussed in Appendix B. The test statistic t t = s - ’ ( y - py)N
(218)
where J = N - 2 C k C I F k k ,and l s2 = ( N 2 - 1 ) - ’ X k X L ( & [ - #, has the Student’s t distribution with N 2 - 1 degrees of freedom. For large values of N 2 - 1, t is distributed standard normal N(0,l). It is to be expected that when an object is present the number of electrons arriving in the image will be reduced, e.g., by scattering contrast or because of the energy filter lens which removes inelastically scattered electrons. Therefore, only the significance levels less than the measured mean corresponding to the lower tail of N(0,I) have to be tested. The threshold values c of the above table apply here, however with opposite sign. The three test statistics which are compared in this section are first applied to the detection of the presence of objects in simulated images of four model objects. The simulated images are arrays of 64 x 64 pixels. The objects are represented by their object wave function and consist of two amplitude objects and two phase objects: amplitude: $,(x,,y,)
phase:$&,,
yo)
=
exp(--0.05),
inside a circle with a diameter of 8 and 16 pixels, respectively;
=
1,
otherwise.
= exp(i/4), =
inside a circle with a diameter of 8 and 16 pixels, respectively; otherwise.
1.
The image wave functions are calculated according to Eqs. (3)-(5) for the following setting of the microscope parameters: D = 180 nm, C, = 1.6 mm, A = 4 pm, and E = 5 x From this parameter setting it follows that a resolution cell in the image which corresponds in this simulation to a picture element, is equal to (4 x 4) A’. The contrast calculated from the image wave function $(-,-)is the parameter I,,,, of the image Poisson process. By means of random number generation a realization of the low-dose image is obtained. The results of the three test statistics, generalized likelihood ratio, chi-square, and Student’s t test, respectively, given in Eqs. (214), (216), and (218) are summarized in Table V. $(a,.)
TABLE V SUMMARY ok THE DETECTION OF THE PRESENCE OF THE MODELOBJECTS I N THE SIMULATED IMAGESW I T H THE THKWTESTSTATISTICS FOR INCREASINGELECTRON DOSE' Amplitude object
Phase object -.
i.,TN-'(e-/pixel)
i n ( e ./nm'), (e-/A')
Likelihood ratio
Chi-square
Student's
Likelihood ratio
Ho Ho Hn H0,l H", I
HO H0 H0 H", I
H0,l H, Hl Hl Hl
Ho11 Ho, L H0,l HOIl HI
Ho HO Ho
Hwi
HO Ho Ho Ho
If0 Ho Hl Hl Hi
Hl Hl Hl Hl Hi
H,, 1 HOIl HI HI HI
H" HO Ho, 1 Hl
Ho Ho Ho Ho
11,
11"
Chi-square
Student's
Object diameter 8 pixels 12 16 32 48 64
75 (0.75) 100(1)
200 (2) 300 (3) 400 (4)
H0
h0,1
lfn
Object diameter 16 pixels 12 16 32 48 64
15 (0.75) 100 (1)
200 (2) 300 (3) 400 (4)
Ho! I
If,, 1 HI HI Hi
" I I , means that the hypothesis !I,, is not rejected, the test statistic is not significant. Hi indicates that the test statistic is highly signilicant. If,, is rejected, while I I , I indicates that the test statistic is probably significant.
IMAGE HANDLING IN ELECTRON MICROSCOPY
289
From the simulated experiments summarized in Table V we observe that the Student’s t test, which has a very sharp response on amplitude contrast, is not sensitive to phase contrast at all. This is not surprising, as the t test statistic measures the deviation between the total number of expected electrons in the image and the acquired number of detected electrons in the image. In the case of amplitude contrast, the number of electrons arriving at the image detector plane will be reduced if compared with the case that there is no object present. In the case of phase contrast this difference is negligible in comparison with the statistical fluctuation in the total number of electrons that is involved in the image formation. For the detection of phase contrast we observe from Table V that the likelihood-ratio test is more sensitive than the chi-square test. In a small experiment in which the presence of phase contrast in low-dose images is tested, we used the likelihood ratio as test statistic. The three lowdose electron micrographs which are the input data for this experiment are a courtesy of Dr. E. J. Boekema and Dr. W. Keegstra of the Biochemisch Laboratorium, Rijksuniversiteit Groningen. The imaged specimens presented in Fig. 17 are an image of a carbon support foil, a carbon foil with a small amount of uranyl acetate, which is used as staining material, and an image of a NADH: Q oxidoreductase crystal (slightly negatively stained with uranyl acetate) from bovine heart mitochondria (see Boekema et al., 1982). The crystal structure of the last image is visualized in Fig. 17d, which is obtained by Fourier peak filtration (Unwin and Henderson, 1975). The images are small sections of 128 x 128 pixels of low-dose CTEM electron micrographs, obtained with an electron dose in between 5 and 7 e-/A2. The micrographs have been scanned by a microdensitometer with a sampling grid of 25 pm. Since the magnification is 46 600, the size of a pixel corresponds to 5.3 A. The low-dose images are recorded on a photographic plate, which is not the ideal device for electron detection. Moreover, due to the scanning by the microdensitometer, we cannot expect the recorded image to have the Poisson statistics of Section I,G. The test statistics developed in this section are based on the Poisson statistics of a recorded image as derived for ideal electrondetection conditions in Section I,G. The following crude approach has been chosen to correct the statistics of the image to the Poisson regime. From the carbon foil image the mean and variance are calculated. Hence the image is scaled in such a way that an equal mean and variance value are obtained which numerically corresponds to a dose of 6 eC/A’. Exactly the same scaling is applied to the two other images. The scaled images serve as input for the estimation experiment. The likelihood-ratio test statistic detects phase contrast, and thus the presence of an object at the 5% level in the image of the carbon foil with uranyl acetate. In the image of the NADH dehydrogenase crystal, phase contrast is detected at the 1o/;l level. Object detection by means of
290
CORNELIS H. SLUMP AND HEDZER A. FERWERDA
CARBON FOIL
CARBON FOIL- UAC
CRYSTAL
CRYSTAL FILTERED
FIG.17. Low-dose CTEM images used for an object detection experiment. The imaged specimens are consecutively (a) carbon foil, (b) carbon foil with a small amount of uranyl acetate, and (c) NADH: Q oxidoreductase crystal. (d) is obtained from (c) by means of Fourier peak filtration.
hypothesis testing exceeds the capability of the human eye, which is very useful in the noisy images of low-dose electron microscopy.
C. Position Detection of Marker Atoms Another key problem in the evaluation of low-dose images is the detection of the position of single heavy atoms. These atoms are used as markers in the analysis of images of identical macromolecules with a random orientation
IMAGE HANDLING IN ELECTRON MICROSCOPY
29 1
with respect to each other. If the marker positions are known, the images can be aligned and integrated, which leads to a higher signal-to-noise ratio (Frank, 1980; Van Heel and Frank, 1981). The imaging of single atoms in electron microscopy has been studied by many authors (see, e.g., Proceedings of the 47th Nobel Symposium, Direct Imaging of Atoms in Crystals and Molecules, Chernicu Scriptu 14 (1978-1979). Calculations of the image contrast due to a single atom have been reported (e.g., by Scherzer, 1949; Niehrs, 1969, 1970; Reimer and Gilde, 1973; Iijima, 1977). Experimental observations showing evidence of imaged single atoms are among others, in Dorignac and Jouffrey (1980), Kirkland and Siege1 (l98l), and Isaacson et a / . (1 974). For the construction of hypotheses about the location of the marker, the theoretical image contrast of one isolated heavy atom is required. However, the calculation of the theoretical image intensity distribution as a function of the lateral position of the atom in the object plane is a complicated task. Our purpose here is to discuss the detection capability of marker atoms under lowdose illumination. In order to bring out the essentials, we will simplify this calculation considerably by neglecting inelastic scattering phenomena completely. Furthermore, we represent the electrostatic potential of the pertinent atom at the position ( x a ,y,, z,) in the specimen which is situated just before the object plane z = zo by the Wentzel potential (Lenz, 1954) V(r,) =
where r, = [(xO- x,)’ atom R is given by
+ (yo
-
Ze exp( - r,R ’) 471c0r,
~
~
y,)’
R
+ (zo
~
= uHZ-’
z,)~]”’. The “radius” of the (220)
with aH the hydrogenic Bohr radius corresponding to 0.529 8, and Z is the charge of the atomic nucleus. According to Eq. (I), plane-wave illumination exp(ikz) incident parallel to the optical axis results in the object wave function (zo = 0)
The integral in Eq. (221) can be evaluated analytically if we extend the upper bound of the integration interval to infinity. This is allowed because of the rapid exponential decay of the potential in Eq. (219). With a change of variables we obtain (using Gradshteyn and Ryzhik, 1980, f.3.387.6, p. 322)
292
CORNELIS H. SLUMP AND HEDZER A. FERWERDA
where KO(.)is the modified Bessel function of the second kind and of zero order. The constant a is given by
In the weak-object approximation* we obtain the object wave function
1 - 2iaKo(R-'C(xo - x,I2 + ( Y O - Y,) 1 ' ) (224) If the integration is extended over the whole object plane, we obtain from Eqs. ( 3 ) and (4) introducing the new variables x b = xo - x, and yb = y o - y, for the wave function in the exit pupil 2
$o(xo,~o;x,,~,)
$,(C3 '11
=
exp[
- pi((,
I')
~
d.uo dy, [ I
2ni(C.u, -
112
+ tly,)]
2ia K,(R-'(.x:
+Y:)':~)]
- 1
(225) exp[ -2ni((-ub + qyb)] The structure of the integrand suggests the use of polar coordinates. s;,= r b c o s 4 b , y b = r;,sin4b,c = r,cos4,,and 17 = r,sin4, $,,(r,. 41,) = exp[ - r;,(r,,) - 2nir,(.ua cos cj,, + J', sin 4,,)] x
x
Using
I:
r1;
riIdrh =
((:
r ~ ) d r ~ ~ [ c ~ n d $- 2h i[alK o ( R - ' r b ) ] (226)
x exp[- z n i r ; r p c o s ( ~ ~-; ,
dd)h K,(R 'r;)exp[-2nirbr,cos($b
drbrbK,(R-'r;)J,(2nr,rb)
=
[R-'
-
4,)]
+ (2~w,)']-'
(227)
we obtain the following expression for the image wave function, where we introduce the new coordinates .x' = .Y - .xa, y' = J' y,, x' = r'cos q$', y ' = r'sin $', and with the coordinate center (.x,,y;,) in the image plane ~
(228) In the derivation of Eq. (228) we assume a circular exit pupil with radius I:. * In the immediate neighborhood of ( . Y " , J ~ )the weak-object approximation is not valid. However, this corresponds to scattering electrons passing the atomic nucleus at a very small distance. These scattering events occur with a very low probability.
IMAGE HANDLING IN ELECTRON MICROSCOPY
293
The image intensity distribution is proportional to the squared modulus of the image wave function. In the weak-object approximation we obtain with the first-order terms only $(r’, d)’)$*(r’, 4’) = 1 - 1 6 d a
J O
dr, r, sin[y(r,)]Jo(2?cr,r’)
It is statistically more convenient to detect in the image regions with a surplus of electrons (peak detection) than it is to detect regions with less electrons, especially with a low-intensity background. Therefore it is important in Eq. (229) to set the defocus parameter D[cf. Eq. ( 5 ) ] to a value which maximizes the contrast under the condition of contrast reversal. At the position of the imaged atom in the image plane, more electrons arrive than in the background. The stochastic low-dose image is characterized by the parameter of the Poisson process I,,,,
x $*(k(2&)-’
+ x,, 1(2&)-’ + y,)
(230)
In a simulation experiment low-dose images are obtained for different dose values by means of random-number generation of Poisson-distributed variables. The images contain 128 x 128 pixels, and the presence of 3 uranium atoms is simulated. The separation between the atoms is taken such that the respective image contrast patterns do not overlap. The detection procedure is as follows. The position parameters x, and ya are also restricted to multiples of (2~)-’,corresponding to the pixel dimensions in the image. The hypothesis is now tested for every pixel being the center of the calculated atomic contrast pattern, Eq. (230). For each pixel (r,s) the likelihood L(6, A(r/2~,s/2~))[cf.Eq. (205)] is calculated. The obtained likelihood values which are beyond a threshold value c correspond to detected marker positions. The choice of the threshold value c determines the error probabilities CI and p, as has been discussed at length in the previous two subsections to which we refer. The simulated images are generated with the following values for the microscope parameters: N = 128, C , = 1.6mm, D = 144 nm (contrast reversal), 2 = 4 pm, and E = lo-’. In the simulations which are summarized in Table VI the presence of an object is also tested by means of the three test statistics discussed in the previous subsection. In this simulation we have simply taken the three highest likelihood values to correspond with the three atomic positions. In practical situations this extra a priori knowledge is not always available.
TABLE VI SUMMARY OF THE DETECTION OF THE PRESENCE OF A N OBJECT WITH THREE TESTSTATISTICS' Object detection 3 Uranium atoms ~ , T N'(<,-/pixel)
8 12 16 20
52
i o ( e /nm2),( e -
/AZ)
200 (2) 300 (3) 400 (4) 500 ( 5 ) 1300(13)
Likelihood ratio
Chi-square
Student's
HO,,
HO
&,I
Ho
Ho,, Ho,, HI
Ho
HO Ho Ho H"
HO HI
Position detection (Number of atoms at correct position)
Ho
For an explanation of the symbols H o ,Ho,,, and H, see the legend of Table V. The position detection states the number of marker locations which have been detected correctly.
IMAGE HANDLING IN ELECTRON MICROSCOPY
295
D. Statistical Signijicance of Image Processing
The technique of statistical hypothesis testing is also applicable to the measurement of the statistical significance of the effect of image-enhancement algorithms on the low-dose electron micrographs. When applying imageprocessing techniques, great care should be exercised in order to avoid the generation of artefacts and hence false conclusions about the imaged specimen. As an example we will measure the statistical significance of a popular noise-reducing (“filter”) algorithm for periodic objects, i.e., the Fourier peak filtration technique of Unwin and Henderson (1975). In Fig. 17c the low-dose micrograph of a periodic object is presented and Fig. 17d is obtained by Fourier peak filtration from the image in Fig. 17c. (For more details about these images see the end of Section V,B.) We now wish to determine quantitatively whether the filtered image is consistent with the data. Therefore two hypotheses H , and H , are constructed. The null hypothesis H , is the outcome of the computer processing of the filtered image (cf. Fig. 17d).The H , hypothesis is the set of all possible alternatives to H , . In the latter case A k , l is estimated by Xk,l = iik,l.Let p = { Pk,l}, (k,I ) E { 1,. . .,N } denote the processed image. We now obtain the likelihood ratio
For the images in Figs. 17c and d we conclude by calculating the test statistic - 2 log q the rejection of the null hypothesis at the 5% level of significance; i.e., H , is probably significant. For a noise-reducing algorithm this result is rather poor. This procedure is also applied to a filtered image, which is calculated from the image in Fig. 17c, however with a small shift in the positions of the filter peaks. With this image the H , hypothesis is rejected at the 10% level of signijicance, which means that the HI hypothesis is highly significant. E . Discussion and Conclusions
This section deals with the handling in an optimal way of prior information available about the imaged specimen in low-dose electron microscopy. The prior information is quantified and transformed into a set of different hypotheses. Hence the statistical significance of each alternative hypothesis is measured against the recorded data. In the application of statistical hypothesis testing to low-dose electron microscopy, it is difficult to make general statements about performance and attainable accuracy, likewise
296
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
for the case of parameter estimation. One of the problems analyzed in this section is the basic problem of the detection of an object in a region of interest in the low-dose micrograph. Once an object has been detected, techniques from the field of image processing can be applied for further information extraction. In a simulation experiment Student’s t test appears to be very effective for the detection of amplitude contrast. This could be expected, as the t statistic calculates the deviation from the mean number of electrons per pixel. Amplitude objects “absorb” electrons, leading to less electrons in the image, a situation which can be detected by the t statistic. For phase objects the mean contrast over the image does not change. No electrons are captured by diaphragms at high resolutions (no scattering contrast), and therefore the t statistic shows no significance at all. Also from the simulation experiment we see that the likelihood-ratio test statistic is more powerful for phase-contrast detection than the commonly used x 2 statistic. The detection of the positions of heavy-atom markers (Section V,C) turned out to be quite successful even for modest electron-dose values. The detection method will benefit from a priori knowledge about the total number of marker positions to be detected. Then simply the highest likelihood values can be taken to correspond with the locations to be determined, likewise in the simulation experiment. An earlier attempt failed to obtain this information from the inelastically scattered electrons which are missing from the image (energy filter lens) together with knowledge of the total cross section of the atom for inelastic scattering. This failure is due to the fact that for increasing 2 the cross section for inelastic scattering decreases. This implies that the inelastic signal from a U atom gets submerged in the inelastic signal from its surrounding, which consists mainly of organic material. The technique of statistical hypothesis testing can also be applied to measure the statistical significance of image processing on the low-dose electron micrograph. The example discussed in Section V,D which concerned the application of the Fourier peak filtration method for periodic objects of Unwin and Henderson (1975) is only included for illustrative purposes. It is not to be considered as an analysis of the statistical relevance of the Unwin and Henderson technique. For periodic objects the cross-correlating techniques of Saxton and Frank (1977) are believed to be more powerful as these are able to accomodate lattice deformations. Another problem which can be handled with hypothesis testing is the problem of measuring the radiation damage. Again for periodic objects a measure of the radiation damage exists in the broadening of the diffraction spots for increasing electron dose. For aperiodic specimens a series of images can be made of which the differences are quantified by means of statistical hypothesis testing. When the images of the series start to differ significantly this is due to the radiation damage.
297
IMAGE HANDLING IN ELECTRON MICROSCOPY OF THE APPENDIX A: THESTATISTICAL PROPERTIES FOURIER TRANSFORM OF THE LOW-DOSE IMAGE
In this appendix we derive the stochastic properties of the complex stochastic process t(t)defined in Eq. (27)
E(5) =
c
N/2 - 1 k = -N/2
ikexp[271i(4&)-1k(]
(232)
which are required in Section 11. The measurements (counted electrons) i i k are . autocorrelation Poisson-distributed random variables with intensity i k The function R ( - ,.) of the complex stochastic process defined in Eq. (232) is given by
N5,, 5 2 ) = E { ~ ( W * ( t 2 ) )
Using the statistical independence of the 6,"s(cf. Section I,G),Eq. (233)can be rewritten as
kfk'
+ C E { i i z )e ~ p [ 2 ~ i ( 4 ~ ) - ' k ( 5 ,l z ) ] -
k
C E { i i k } E ( j i kexp[2ni(4e)-'(kt1 ,} - k't2)] + C [ E { ~ : }E2{fik}]exp[27ci(4~)-'k(t, - t2)1 = E { ~ ^ ( S , ) ) E { ~ ^ * ( S+, )Ci.,exp[2ni(4~)-lk(t~ } t2)1 (234) =
k.k'
-
k
-
k
where the property of Poisson-distributed variables has been used that
E{i?i} = i.; + iLk
(235)
In practice ?(-) will be calculated by means of a fast Fourier transform (FFT) algorithm. This results in the discrete values = (2d)-ll,
I
=
{ --;A',...,$ N
-
1)
298
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
for the continuous variable 5. The autocorrelation in Eq. (234) becomes
where denotes c'(l/2d). We now determine the probability distribution of the complex random variable tI.It is convenient to examine the characteristic function @-(.) of tl, because completely determines the probability distribution o?tI and is more easily obtained. The characteristic function of the discrete random variable & is defined @",(o) = E(exp(iw6,)f =
= exp{d,[exp(io)
C exp(iol)exp(- A k ) a : / i ! m
l=O
-
11)
(237)
fik exp(2niN-'
(238)
From Eq. (232) we have that (€+,(o) = E exp io H
k
Straightforward computation using the fact that the Gk variables are statistically independent (cf. Section 1,G)yields @tI(o) = exp
CA,(exp[io e x p ( 2 n i ~ ' k l ) l- 1 ) )
( k
(239)
Inverse Fourier transformation of Eq. (239) results in the probability distribution of t l .The moments m, of C; which are defined as E(c^y}= m,
(240)
are related to the derivatives of the characteristic function @?,(-):
are most easily The first two moments of the distribution function of obtained from the logarithm of Eq. (241) [see, for example, Papoulis (1964), p. 1581 d
IMAGE HANDLING IN ELECTRON MICROSCOPY
299
We obtain for the expectation value E{?,} E{?,}
= x,?keXp(2niN-'k1) k
(243)
which is consistent with applying the expectation operator to the right-hand side of Eq. (232). To obtain the variance we proceed by defining the real functions GI and 6, to be the real and imaginary parts of El
c; = GI + ib, A
(244)
According to Eq. (242), we have
E{G,} = a,,
E { & } = b,
(245)
Straightforward computation of the variances e l and ez,results in 015,
=
c k
/?k
c0s2(27cN-' k l )
From the definition of the variance of a complex random variable, we obtain for the variance of ?, =
E{l?I
-
E{c[)1'} = 1 2 , k
(247)
From Eq. (247) we conclude that the variance of ?, is equal to a constant value independent of 1.
APPENDIX B: THESTATISTICAL PROPERTIES OF AN AUXILIARY VARIABLE This appendix is devoted to the stochastic properties of an auxiliary random variable together with its Fourier transform, which is used in Section 111. The variable &is defined in Section I11 in two places in two different ways. According to Eq. (86) & , is given by ?k,[
= N'(&T)p1(6k,,
7
&TN-')
(248)
while Eq. (130) reads as gk,k,l
2 -112
= (IsTN- )
-112
[nk,,
- (&TN-2)'1']
(249)
Apparently the two definitions in Eqs. (248)and (249) are quite different. They originate from
+
E{6k,l) = &TNp2(1
Sk,,)
(250)
300
CORNELIS H . S L U M P A N D HEDZER A. FERWERDA
and
respectively. Since sk,l << 1, it will appear that the two definitions give rise to same probability distribution. The only difference in the two definitions is a factor of four between the variances. A . Part
I
In the first part of Section 111, we are interested in the statistical properties of the Fourier transform of gk.1 according to Eq. (248). The Fourier transform of the variable gk,/,( k ,1) = { - i N , . . . , i N - 1) is a complex stochastic process ?((, q), defined in Eq. (88)
We abbreviate the factor 1,TN-' by 1,. The measurements (counted electrons) iik,l are Poisson-distributed random variables with parameter j.k,l = Lo( 1 + s k , / ) .The autocorrelation function R(., -) of the complex stochastic process [Eq. (252)]
( k ,I ) # (k', I ' )
(254)
301
IMAGE HANDLING IN ELECTRON MICROSCOPY
In the second term on the right-hand side of Eq. (254), the variance of Fk,l appears. For this variance we obtain, using the property of Poissondistributed variables tik,l,that E{tiz,,} = A,& + &, c2(L?k,k,l)
=
E(SIk2.1). - EZ{S^k,l}= J?{Aiz(6k,l
+ A;]
- A,)'} - s&
= 1,2[E{ti,2,,} - 2E{tik,l)& = i.0
(1
+
- sE,I
(255)
Sk,J)
Since S k , ! << 1, we have to a good approximation that c2(gk,k,l) N
i.0'
(256)
With Eq. (256), we obtain for the autocorrelation function of Eq. (254) ~ ( 5 1 ~ t Z . ~ l ~=V E{c^(tl, Z) Vl)I~{c^*(52?~Z)}
+ j.,'CCexP(2ni(2&)-'Ck(51 k
f
-
52)
+ 4rll v1Z)l) -
(257)
The summations over k and I in Eq. (257) can be evaluated (by a geometric series) as
Choosing sampling points (,,, = (2d)-'m and qr = (2d)-'r, such that t1- t2 and q 1 - q z are multiples of (2d)-', we find for the autocorrelation function Nt,, m t q r , V S ) that
+
R ( t m , t n , q r > y / s ) = E(c^((5,,~r)}E{c^*(5,,~s)}A i 1 N 2 d m , n d r , s
(259)
For this choice of discrete values C(t,,,,q,) and c^((5,,qs), (m,n)# (r,s) are uncorrelated. This situation will occur when we apply Whittaker-Shannon sampling in the exit pupil. This will always be the case in practice, where E(-, -) is calculated by means of a fast Fourier transform algorithm. In the following analysis, t(.,-) i s treated for a continuum of 5 and q values, but the discrete representation is kept in mind when necessary. We examine the characteristic function Q5(w) of C((,q), since Q5(w) completely determines the probability distribution of i?(t,q). From Eq. (252) we have that
{
(1
+
@?(w)= E exp iwCCs^,,,exp[2ni(2~)-~(k5 Iq)])] k
I
(260)
302
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
Substituting Eq. (248) leads to exp[2ni(2~)-'(k5
+ Ir)]
+
$)])I
-liO~~'C~~k,,eXp[271i(2&)-1(k~ k
I
(261)
Straightforward computation, making use of the characteristic function of the random variable Ck,/ [Eq.(237)], results in
+ xxA,,,(exp(io&' k 1
+
)
exp[2ni(2~)-'(k< l q ) ] } - 1)
(262)
Fourier transformation of Eq. (262) yields the probability distribution of ?(-,-). We restrict our attention, however, to the first two moments of this distribution, which can be calculated most easily from the logarithm of the characteristic function [cf. Eq. (242)l. We obtain for the expectation value of ?(a,
E(C((5,q)) = CCsk,,expC2ni(2E)-1(k5+ 11111 k
f
(263)
a relation which has been used in the derivation of Eq. (89). We see that the expectation value of C(-, .) is equal its true value, which proves that Eq. (248)is an unbiased statistic. For the variance of ?(-,-) we obtain c2(C^(5,q)) =
Cc(l+ k
I
sk,l)
(264)
As the sample values sk,f << 1, we approximate Eq. (264) as o2(CI((r,q)) 'v & ' N 2
(265) SO the variance of ?(-,-) is to a good approximation equal to a microscope parameter N4(IbsT)-'. Instead of calculating higher-order moments, we examine Eq. (262). In electron microscopy practice, Lo >> 1, even for modest resolution requirements, since &, denotes the average number of electrons available for imaging per sampling cell. When Lo >> 1, the term exp{io&' exp [2xi (2~)-'( k t
+ Iq)]}
of Eq. (262) can be approximated by the first terms of its Taylor expansion
+ lq)]) exp(iw&' exp[2~i(2~)-'(k< 1+io&' e x p [ 2 n i ( 2 ~ ) - ' ( k 5 + 1 q ) ] - ~ w ~exp[4~i(2&)-'(kt E.~~ +1q)]
(266)
IMAGE HANDLING IN ELECTRON MICROSCOPY
303
With Eq. (266) we obtain for @?(o) to a good approximation exp [2ni(2~)-'(k[
+ lr])]
We define the real functions a^((, q ) and 6((, r ] ) to be the real and imaginary part of Z(5, r ] )
Z(5, r ] )
+ i&5, r ] )
= a^([, YI)
(268)
From Eq. (267) the characteristic functions of the real part a^(.,.) and the imaginary part ,.) are easily derived. These functions have the structure of the characteristic function of a Gaussian random function with mean a(.,.) and b(.,.) with variances given by
&.
From this we conclude that, to a good approximation (if we restrict our attention to the two lowest-order moments, mean and variance), ?(-,.) is a Gaussian random function with parameters given by Eqs. (263) and (265). B. Part 2
In the second part of this appendix we examine the probability density function of the auxiliary variable .?k,k,ldefined in Eq. (130). In order to simplify the notation we will drop the subscripts (k, 1) and abbreviate also here the mean number of arriving electrons per image cell AsTN - 2 by 2,. From Eq. (251) we have that E{G}
and s is estimated by
= &(l
s^ = & 1 / 2 ( @ 1 / 2
+ sy
-
(270)
1-y)
Since n^ is distributed according to the Poisson distribution we have P{$}
= exp(-A)A'/$!
(272)
with
A = &(1
+
s)2
(273)
304
CORNELIS H. S L U M P A N D HEDZER A. FERWERDA
Applying the transformation Eq. (271) which has Jacobian &,fi)-”’ using Eq. (272) in the probability density function of s^ p(?) = 2A0(1
+ s^)exp[-/20(l + S ) ~ ] [ A ~ ( +I s)2]1~(1+1)2/[A,(l + :)’I!
results (274)
With Stirling’s approximation to the factorial n!,
n!
N
(2nr1)’/~n“exp(-n)
(275)
we obtain for Eq. (274)
(276) Expanding the logarithm in Eq. (276) into a power series, we obtain to third order in s and s^ p(s^) N (2n)-1’2(4ivo)’’Z exp{ -2i0(?
-
s)’
-
2i,[+(i3
-
s3) + ss^(s^ - s)]} (277)
Defining a’ as follows,
-
o’ = (42,)-1
(278)
Eq. (277) can be written p(?)
(2na’)-’’’ exp[ -+o-’(S^ - s)’] exp{ -40-~[+(s^~- s3) + ss^(sI - s)]} (279)
The second exponential in Eq. (279) is close to unity because s << 1 (and consequently i<< 1) because we restricted the imaging to weak objects; so we have to a good approximation p(?)
N
( 2 7 c ~ r ~ )exp[-fo-’(? -”~
-
s)’]
(280)
From Eq. (280) we observe that the probability density function of s^ is Gaussian with mean equal to the true value s and with variance (4jbO)-’.This result is in agreement with the results obtained by Anscombe (1948), used in Slump and Ferwerda (1982). Anscombe applied the square-root transformation = (n
+ c)’”
(28 1)
with c an arbitrary positive constant and where 6 is a Poisson-distributed random variable with intensity parameter A. Anscombe observed that 9 is distributed Gaussian, asymptotically in R; in particular he obtained the
IMAGE HANDLING IN ELECTRON MICROSCOPY
305
following results
E { j ) 2 (2 + c)'I2 - (8E.'12)-'
+ (128A312)-'(24c 7) (282) var(9) e $[l + (8;.)-'(3 - 8c) + (3222)-'(32c2 52c + 17)] Substituting for the arbitrary constant c the value 4 results in a nearly constant -
-
variance, already for moderate values of 2.
APPENDIX C : THECRAMER-RAO BOUND In this appendix we derive the Cramer-Rao bound for the variance of the parameters to be estimated: a = (ao,o,.. . ,am,".. . , a N - l , N - l )in Eq. (127). This Cramtr-Rao bound is the minimum value for the variance which is achievable by whatever estimation method from the recorded image 6. For an intuitively appealing interpretation of the Cramer-Rao inequality see Gardner (1979). We first calculate the amount of information that is contained in the data ii about the parameters a. According to Fisher (1950) this amount of information is defined by the following matrix
i"
a
L(ii, a)-log L(ii, a) daP4 where L(6,a) is the likelihood function of Eq. (1 28). Under the assumption that L(6,a) is regular enough to allow the interchanging of differentation and integration, Eq. (283) is equivalent with F(a)r,s,p,q = E -log 8ar.s
Substitution of Eq. (128) into Eq. (284) results with Eq. (127) in the following Fisher information matrix
= 42, TN-26r,,6s,,
(285)
The Cramer-Rao bound for the variance of an unbiased statistic is equal to the inverse of the Fisher information matrix [Eq. (283)] (see for example Kendall and Stuart, 1967; Van der Waerden, 1969; and Van Trees, 1968). From Eq. (285) we observe that the parameters a all have the same CramerRao bound and that they are not interconnected. That means that information about one specific parameter ak,ldoes not give information about any other parameter
306
CORNELIS H. S L U M P A N D HEDZER A. FERWERDA
ACKNOWLEDGMENTS The authors gratefully acknowledge the support of the Netherlands Organization for the Advancement of Pure Research (Z.W.O.). We thank Dr. B. J. Hoenders, Dr. E. J. Boekema, and Dr. W. Keegstra for their help at the time when this research was carried out. The processing of the images in this article has been performed at the Information Theory Group of the Department of Electrical Engineering at Delft University of Technology, for which help Dr. J. J. Gerbrands is gratefully acknowledged. We thank Mrs. E. Boswijk for her patience with typing the many changes in the manuscript, which was originally submitted as a thesis to the Rijksuniversiteit Groningen by one of us (C.H.S.). We thank Mr. 0. T. Schutter for skillfully drawing the figures.
REFERENCES Aczel, J., and Daroczy, Z. (1975). “On Measures of Information and Their Characterization.” Academic Press, New York. Anscombe, F. J. (1948). Biometrika 35,246-254. Bertero, M., De Mol, C., and Viano, C. (1980). I n “Inverse Scattering Problems in Optics” (H P. Baltes, ed.), Springer-Verlag, Berlin and New York. Boekema, E. J., Van Breemen, J. F. L., Keegstra, W., Van Bruggen, E. F. J., and Albracht, S. P. J. (1982). Biochim. Biophys. Acta 679, 7-1 1. Box, M. J. (1966). Comput. J . 9, 67-77. Davenport, W. B., and Root, W. L. (1958). “Random Signals and Noise.” McGraw-Hill, New York. Davidoglou, A. (1901). C. R. Hebd. 133,7844786,860-863. Dorignac, B., and Jouffrey, B. (1980). Proc. Eur. Congr. EIectron Microsc. 7th 1, 112. Eadie, W. T., Drijard, D., James, F. E., Roos, M., and Sadoulet, B. (1971). “Statistical Methods in Experimental Physics.” North-Holland Publ., Amsterdam. Egerton, R. F. (1982). Int. Congr. Electron Microsc., IOth, Hamburg 1, 151-158. Egerton, R. F., Philip, J. G., Turner, P. S., and Whelan, H. J. (1975). J . Phys. E . 8, 1033-1037. Ferwerda, H. A. (1978). I n “Inverse Source Problems in Optics” (H. P. Baltes, ed.), Chap. 2. Springer-Verlag, Berlin and New York. Ferwerda, H. A. (1981). Optics in Four Dimensions (ICO-Ensenada, 1980) (M. A. Marchado and L. M. Narducci, eds.). A I P Conf. Proc. (65), 402-41 1. Ferwerda, H. A. (1983). “Technical Digest of the Topical Meeting on Signal Recovery and Synthesis with Incomplete Information and Partial Constraints, Incline Village, Nevada” Paper ThAI. Optical Society of America. Fisher, R. A. (1950). “Contributions to Mathematical Statistics.” Wiley, New York. Fletcher, R. (1970). Comput. J . 13, 317-322. Fletcher, R., and Powell, M. J. D. (1963). Comput. J . 6, 163-168. Frank, J. (1973). Optik 38, 582-584. Frank, J. (1980). I n “Computer Processing of Electron Microscope Images” (P. W. Hawkes, ed.). Springer-Verlag, Berlin and New York. Frieden, B. R. (1971). “Progress in Optics,” Vol. 9, pp. 311-407 (E. Wolf, ed). NorthHolland, Amsterdam. Gerchberg, R. W., and Saxton, W. 0.(1972) Optik 35,237-246.
IMAGE HANDLING IN ELECTRON MICROSCOPY
307
Glauber, R. J. (1959). In “Lectures in Theoretical Physics, Summer Institute, University of Colorado, Boulder” (W. E. Britten and L. G. Dunham, eds.), Vol. 1, pp. 315-414. Wiley (Interscience),New York. Gradshteyn, 1. S., and Ryzhik, I. M. (1980). “Table of Integrals, Series, and Products.” Academic Press, New York. Haine, M. E. (1961). “The Electron Microscope.” Spon, London. Hawkes, P. W. (1978). Comput. Graph Image Process 8,406-446. Hawkes, P. W. (1980a). Optik 55, 207 -212. Hawkes, P. W. (1980b). In “Computer Processing of Electron Microscope Images” (P. W. Hawkes, ed.), Chap. 1, Springer-Verlag, Berlin and New York. Heidenreich, R. D. (1964). “Fundamentals of Transmission Electron Microscopy.” Wiley, New York. Henkelman, R. W., and Ottensmeyer, F. P. (1974).J . Microsc. (Oxford) 102,79-94. Hoenders, B. J., and Slump, C. H. (1983).Computing 30, 137-147. Iijima, S. (1977).Optik 38, 193. Isaacson, M., Langmore, J. P., and Rose, H. (1974).Optik 41,92. James, F., and Roos, M. (1975). Comput. Phys. Commun. 10,343-367. Kellenberger, E. (1980).Eur. Conyr. Electron Microsc. 7th, The Hague 2,632-637. Kendall, M. G., and Stuart, A. (1963).“The Advanced Theory of Statistics,” Vol. 1. Charles Griffin, London. Kendall, M. G.. and Stuart, A. (1967).“The Advanced Theory of Statistics,” Vol. 2. Charles Griffin, London. Kirkland, E. J.. and Siegel, B. M. (1981). Ultramicroscopy 6, 169. Koopman, B. 0.(1936). Trans. Am. Math. SOC.39,399. Kronecker, L. (1878).“Monatsberichte der Koniglich Preuszischen Akademie der Wissenschaften zu Berlin,” pp. 145-1 52 (Reprinted 1897in “L. Kronecker Werke,” Vol. 2, pp. 71-82. Teubner, Leipzig). Lehmann, E. L. (1959). “Testing Statistical Hypotheses.” Wiley, New York. Lenz, F. A. (1954). Z . Naturforsch. 9a. 185-204. Lenz, F. A. (1971).In “Electron Microscopy in Material Science” (U. Valdre, ed.). Academic Press, New York. Misell, D. L. (1973a).Adu. Electron. Electron Phys. 32,63. Misell, D. L. (1973b).J . Phys. D 6,2200-2216,2217-2225. Munch, J. (1975).Optik 43,79-99. Niehrs, H. (1969). Optik 30,273. Niehrs, H . (1970). Optik 31, 51. Papoulis, A. (1965). “Probability, Random Variables and Stochastic Processes.” McGraw-Hill, New York. Picard, E. (1892). Math. Pure Appl. (4me Ser.) 8, 5-24. Pitman, E. J. G . (1Y36). Proc. Cambridge Philos. Soc. 32, 567. Powell, M. J. D. (1970). In “Numerical Methods for Non-linear Algebraic Equations” (P. Rabinowitz, ed.). Gordon & Breach, New York. Reimer, L., and Gilde, H. (1973). In “Image Processing and Computer-aided Design in Electron Optics” (P. W. Hawkes, ed.), pp. 138-167. Academic Press, New York. Roger, A. (1981). IEEE Trans. Antennas Propagat. AP-29,232-238. Rose, A. (1973).“Vision: Human and Electronic.” Plenum, New York. Rose, A. (l948a). In “Advances in Electronics”(L. Marton, ed.), Vol. 1. Academic Press, New York. Rose, A. (1948b).J . Opt. Soc. Am. 38, 196-208. Saxton, W. 0.(1980). In “Computer Processing of Electron Microscope Images” (P. W. Hawkes, ed.), Chap. 2. Springer-Verlag, Berlin and New York.
308
CORNELIS H. S L U M P AND HEDZER A. FERWERDA
Saxton, W. O., and Frank, J. (1977). Ultramicroscopy 2,219-227. Scherzer, 0 .(1949). J . Appl. Phys. 20, 20-29. Slepian, D. (1964). Bell Syst. Technol. J . 43, 3009-3057. Slepian, D., and Pollack, H. 0.(1961). Bell Syst. Tech. J . 40,43-63. Slump. C. H., and Ferwerda, H. A. (1982). Optik 62,93-104, 143-168. Slump, C. H., and Hoenders, B. J. (1985). I E E E Trans. IT-31,490-497, Snyder, D. L. (1975). “Random Point Processes.” Wiley, New York. Unwin, P. N. T., and Henderson, R. (1975). J . Mol. Biol. 94,425-440. Van der Lubbe, J. C. A. (1981). A generalized probabilistic theory of the measurement of certainty and information. Thesis, Delft University of Technology. Van der Waerden, B. L. (1969).“Mathematical Statistics.” Springer-Verlag, Berlin and New York. Van Heel, M. G. (1982). Ultramicroscopy 8,331-342. Van Heel, M. G., and Frank, J. (1981). Ultramicroscopy 6, 181-194. Van Toorn, P., and Ferwerda, H. A. (1976). Opt. Acta 23,457-468,469-481. Van Toorn, P., Huiser, A. M. J., and Ferwerda, H. A. (1978). Optik 51, 309-326. Van Trees, H. L. (1968). “Detection, Estimation and Modulation Theory,” Part I . Wiley, New York. Wade, R . H. (1980). In “Computer Processing of Electron Microscope Images” (P. W. Hawkes, ed.), p. 233. Springer-Verlag, Berlin and New York.