Estimation of unknown structure parameters from high-resolution (S)TEM images: What are the limits?


Ultramicroscopy 134 (2013) 34–43


Ultramicroscopy journal homepage: www.elsevier.com/locate/ultramic

A.J. den Dekker a,*, J. Gonnissen b, A. De Backer b, J. Sijbers c, S. Van Aert b

a Delft Center for Systems and Control, Delft University of Technology, Mekelweg 2, 2628 CD Delft, The Netherlands
b Electron Microscopy for Materials Science, University of Antwerp, Groenenborgerlaan 171, 2020 Antwerp, Belgium
c Vision Lab, University of Antwerp, Universiteitsplein 1, N.1, 2610 Wilrijk, Belgium

Article info

Available online 1 June 2013

Keywords: High resolution transmission electron microscopy (HRTEM); Electron microscope design and characterization; Data processing/image processing

Abstract

Statistical parameter estimation theory is proposed as a quantitative method to measure unknown structure parameters from electron microscopy images. Images are then purely considered as data planes from which structure parameters have to be determined as accurately and precisely as possible using a parametric statistical model of the observations. For this purpose, an efficient algorithm is proposed for the estimation of atomic column positions and intensities from high angle annular dark field (HAADF) scanning transmission electron microscopy (STEM) images. Furthermore, the so-called Cramér–Rao lower bound (CRLB) is reviewed to determine the limits to the precision with which continuous parameters such as atomic column positions and intensities can be estimated. Since this lower bound can only be derived for continuous parameters, alternative measures using the principles of detection theory are introduced for problems concerning the estimation of discrete parameters such as atomic numbers. An experimental case study is presented to show the practical use of these measures for the optimization of the experiment design if the purpose is to decide between the presence of specific atom types using STEM images. © 2013 Elsevier B.V. All rights reserved.

1. Introduction

Ever since the construction of the first electron microscope, a lot of effort has been made to improve its resolution. By now, aberration correctors and image reconstruction methods have pushed the point resolution of transmission electron microscopy (TEM) to about 0.5 Angstrom, which allows one to visually resolve individual atomic columns in projection [1–4]. Approaching the point at which the resolution is fundamentally limited by the intrinsic “width” of the atoms, the focus in TEM research has moved gradually from obtaining a better point resolution to improving the precision with which structural (and chemical) information can be extracted from TEM data [5,6]. Note that there is a clear difference between resolution and precision. Whereas resolution, as defined by the classical resolution criteria such as the Rayleigh resolution criterion [7,8], expresses the ability to visually distinguish neighboring components, precision corresponds with the variance with which (structure) parameters can be measured from data. Van Dyck was among the first to emphasize the importance of precise structure determination for materials science and technology [9]. This importance can be

* Corresponding author. Tel.: +31 15 2781823; fax: +31 15 2786679. E-mail address: [email protected] (A.J. den Dekker).

0304-3991/$ - see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.ultramic.2013.05.017

motivated as follows [5]. A complete understanding of the relation between the properties of nanomaterials and their structure, combined with recent progress in building nanomaterials atom by atom, will enable materials science to evolve into materials design. Quantum mechanical ab initio calculations allow one to predict relations between structure and physical properties of nanomaterials, but validation and further improvement of these calculations are only possible by interaction with experiments. This requires experimental characterization methods yielding local structure information with sufficiently high precision. In fact, atom positions have to be measured with a precision of the order of 1 pm, since a displacement of the atoms over such a length may have a considerable effect on the material's properties [10,11]. For example, strain induced by the lattice mismatch between a substrate and the superconducting layer grown on top may double its critical temperature [12]. Van Dyck was also among the first to emphasize that TEM is the most appropriate technique to provide the required precision, since from all possible imaging particles, electrons interact most strongly with matter [6]. Van Dyck initiated the first studies on the feasibility of precise structure determination through TEM, which were based on simplified models and simulations [9,13,14]. These studies showed that TEM has indeed the potential to provide structure characterization with a precision that is orders of magnitude better than the resolution of the microscope, but this requires a quantitative

A.J. den Dekker et al. / Ultramicroscopy 134 (2013) 34–43

model-based interpretation of the images. A merely visual interpretation is inadequate. The methodology for the quantitative approach required is provided by statistical parameter estimation theory. Meanwhile, the feasibility of this approach has been demonstrated not only in simulations, but also in practice [15–18]. The starting point of the quantitative approach mentioned above is the notion that we are not so much interested in the electron microscopy images as such, but rather in the (structural and chemical) information of the sample under study. Image quality and resolution are therefore of subordinate importance. Images are to be considered as data planes, from which sample structure parameters, such as atom positions, particle sizes and fiber diameters, have to be estimated as precisely as possible. For this, we need a parametric model of the images that includes all ingredients to perform a computer simulation of the images (such as electron–sample interaction, microscope transfer and image detection). If first principles based models cannot be derived, or are too complex for their intended use, simplified models may be used. The model is parametric in the unknown sample structure parameters, which are estimated by fitting the model to the experimental images using a criterion of goodness of fit, such as least squares or maximum likelihood [19]. Structure determination thus becomes a parameter estimation problem. Using this statistical parameter estimation based methodology, not only structural information, but also chemical information can be extracted from electron microscopy data. Indeed, it has been shown to be possible to quantify electron energy loss spectra (EELS) [20] and to relatively quantify the chemical composition of atomic columns from HAADF STEM images [21] using the same methodology. 
The quantification of HAADF STEM images in terms of chemical composition could efficiently be solved using a simple parametric incoherent imaging model to describe the image intensities. The parameters of this model are then estimated using least squares estimation and interpreted in terms of total intensities scattered by the atomic columns. Recent work has shown that such scattered intensities can be further explored in order to count the number of atoms that are present in an atomic column [22]. In order to quantify larger fields of view of experimental HAADF STEM images, some practical problems concerning the implementation of the least squares estimator need to be solved. In this paper, an efficient algorithm that suits this purpose will be introduced. Ultimately, the precision with which the parameters can be estimated is limited by noise. Indeed, due to noise, the pixel values that constitute the experimental images will fluctuate randomly from experiment to experiment. These pixel values, which we will from now on refer to as observations, can therefore be modeled as random variables, characterized by a joint probability density function (PDF) (in the case of continuous observations) or a joint probability function (PF) (in the case of discrete observations, such as Poisson counting results). The parametric image model introduced above describes the expectations (i.e., mean values) of these observations. Use of the concept of Fisher information [23] allows one to derive an expression for the highest attainable precision with which the structure parameters of the sample under study can be estimated in an unbiased way. This expression defines a lower bound on the parameter variance. This bound, which is known as the Cramér–Rao lower bound (CRLB), can be derived from the P(D)F described above. The CRLB relies on weak regularity conditions on the P(D)F of the observations [23]. 
One of these conditions is that the P(D)F should be continuously differentiable with respect to the parameters. The CRLB is generally a function of the sample parameters, the microscope parameters, and the electron dose. It provides quantitative insight into what precision might be achieved from the available image(s). It also provides insight into the sensitivity of the precision to the parameter values. An important application for which the CRLB can


be used is statistical experiment design. Experiment design is the selection of free variables in an experiment to improve the precision of the estimated parameters. By calculating the CRLB as a function of the microscope settings, these experimental settings can be optimized so as to attain the highest precision [5,13,14,24–30]. The approach for experiment design also provides the possibility to decide if new instrumental developments result in significantly higher attainable precisions. In this sense, it provides the framework to improve the balance between precision on the one hand and cost, complexity and size of the instrument on the other hand. So far, studies on the precision of atomic scale measurements from HRTEM images considered the estimation of the position [5,6,9,13–15,19,25,26,31,32] and thickness [33] of atoms (or atomic columns in projection). In the present paper, we will consider the problem of identifying the chemical composition. Inspired by the pioneering work of Van Dyck et al. on the feasibility of precise estimation of atom positions [9,13,14], we will follow a similar approach and start in a simple way by studying the problem of estimating the atomic number (Z) from a STEM image of a single atom. Note that chemical theory restricts the unknown atomic number to be a positive integer. Therefore, the atomic number Z is a so-called restricted (or discrete) parameter. As a consequence, the P(D)F of the observations is not continuously differentiable with respect to Z and hence the CRLB is not defined. In this paper, we will therefore propose an alternative solution using the principles of detection theory [34]. In the present problem of identifying the chemical composition, a priori knowledge concerning possible solutions for the atomic numbers is usually available. In such cases, the question reduces to distinguishing between a finite plausible set of values for the atomic numbers Z given the experimental STEM observations.
Detection theory provides the tools to decide between two or more hypotheses, where each hypothesis corresponds to the assumption of a specific Z value, and to compute the experimental settings that minimize the probability of choosing an incorrect hypothesis. For a binary hypothesis problem, we will derive an expression describing the probability of error for the problem of estimating the atomic number Z from a STEM image of a single atom using a simplified model and assuming Poisson noise statistics. In this way, we derive an expression that gives insight into the ability to make a correct decision and the sensitivity of this detection performance to the experimental settings. Moreover, an alternative performance measure is proposed, circumventing the need for excessive image simulations. In particular, we will analyze the sensitivity of the detection performance to the inner detector radius.

The organization of this paper is as follows. Section 2 discusses the derivation of a parametric statistical model of the observations. Two different cases in the field of STEM imaging will be considered. Next, the maximum likelihood estimator will be reviewed in Section 3. An efficient algorithm for the estimation of atomic column positions and intensities from HAADF STEM images will be introduced. In Section 4, the concept of Fisher information will be discussed, together with how it can be used to optimize the experiment design in terms of the attainable precision with which unknown continuous structure parameters can be estimated. For the estimation of discrete parameters, alternative performance measures for the evaluation of the design will be proposed. These measures will be tested in an experimental case study in Section 5. In Section 6, conclusions are drawn.

2. Parametric statistical model of the image

In this section, we will consider two distinct experiments that consist of identifying (i.e., estimating) unknown structure parameters from STEM observations (pixel values). In the first experiment, we will consider the problem of estimating the atomic column positions and total intensities scattered by atomic columns from HAADF STEM images. In the second experiment, we will consider the problem of estimating the atomic number Z of an atom from a set of observations obtained by imaging this single-atom object using STEM. In contrast to the first experiment, the second experiment involves the estimation of discrete rather than continuous parameters.

In what follows, we will consider 2D images of $K \times L$ pixels. The usual way to describe the fluctuating behavior of images in the presence of noise is by modeling the noisy pixel values $\{w_{kl};\ k = 1, \ldots, K;\ l = 1, \ldots, L\}$, where the index $kl$ of $w_{kl}$ corresponds with the pixel at position $(x_k, y_l)^T$, as stochastic variables [23]. By definition, a set of observations, $w = (w_{11}, \ldots, w_{KL})^T$, is characterized by its joint P(D)F $p_w(\omega)$, where the independent variables $\omega$ correspond with $w$. The joint P(D)F defines the expectations, i.e. the mean value of each observation, and the fluctuations about these mean values. The expectation values $E[w_{kl}] \equiv \lambda_{kl}$ are described by parametric models $f_{kl}(\Theta)$, which will be introduced in Sections 2.1 and 2.2 for the two examples considered in this paper. The availability of such a model makes it possible to parameterize the P(D)F of the observations, which is of vital importance for quantitative structure determination as will be shown in the remainder of this paper. Obviously, it is important to test the validity of the expectation model before attaching confidence to the structure determination results obtained using the model. If the model is inadequate, it must be modified and the analysis continued until a satisfactory result is obtained. A review of statistical model assessment methods can be found in [19].

2.1. Empirical HAADF STEM model parametric in the atomic column positions and column intensities

In this example, we will describe the expectations of the HAADF STEM observations using an empirical incoherent imaging model. This model has been introduced in [21] as a tool for quantitative composition analysis using statistical parameter estimation theory and has been shown to be a promising technique for atom counting [17,22]. Assuming a HAADF detector, an incoherent STEM image will be formed, of which the expected value $f_{kl}(\Theta)$ can be written as a convolution between an object function $O(\mathbf{r}_{kl}; \Theta)$ and the probe intensity profile $|P(\mathbf{r})|^2$ [35,36]

\[
f_{kl}(\Theta) = f(\mathbf{r}_{kl}; \Theta) = O(\mathbf{r}_{kl}; \Theta) * |P(\mathbf{r}_{kl})|^2, \tag{1}
\]

where the pixels (k, l) correspond to the STEM probe at position $\mathbf{r}_{kl} = (x_k, y_l)^T$. The object function depends on a set of unknown structure parameters $\Theta$ and the probe profile depends on the probe parameters including the acceleration voltage, the objective aperture semi-angle, defocus, spherical aberration constant, and higher-order aberration coefficients. The object function is assumed to be sharply peaked at the atomic column positions and is modeled as a superposition of Gaussian peaks

\[
O(\mathbf{r}_{kl}; \Theta) = \zeta + \sum_{n=1}^{N} \eta_n \exp\left( -\frac{(x_k - \beta_{x_n})^2 + (y_l - \beta_{y_n})^2}{2\rho^2} \right) \tag{2}
\]

with $\zeta$ a constant background, $\rho$ the width of the Gaussian peak, $\eta_n$ the column intensity of the nth Gaussian peak, $\beta_{x_n}$ and $\beta_{y_n}$ the x- and y-coordinate of the nth atomic column, respectively, and N the total number of atomic columns. An incoherent imaging model has been proposed in which the scattered intensities of each atomic column are treated separately. As discussed in [35,37,38], this implies the assumption of transversal or intercolumn incoherence. Although it has for a long time been expected that transversal incoherence required the detection of thermally diffuse scattered electrons at high angles [39], later analysis showed

that the detector geometry could destroy many of the coherent interference effects [40]. The annular detector preferentially selects 1s states, which are tightly bound to the atomic columns [38]. As such, an image becomes a direct representation of these 1s states. In this sense, the Gaussian functions in the object function, which are localized at the atomic column positions, could be related to the column specific 1s states. It has thus been assumed that the probe channels along the atomic columns when propagating through a sample oriented in zone-axis. So-called cross-talk is the most important effect that could have implications on this channeling behavior [40–42]. It refers to the effect that when a small probe is incident over an atomic column, channeling initially causes the intensity to be focussed tightly on that column but as the thickness increases some intensity can appear on neighboring columns as the wave function apparently tunnels between them [40]. Furthermore, in the proposed imaging model, the dependence of the intensity as a function of thickness is accounted for by the column specific parameters $\eta_n$. These parameters therefore also describe the effect of thermal vibrations, which have been shown to be very effective in breaking the coherence along a column and removing strong dynamical oscillations. This leads to so-called intracolumn incoherence [38,43]. The unknown parameters of the imaging model are given by the parameter vector

\[
\Theta = (\beta_{x_1}, \ldots, \beta_{x_N}, \beta_{y_1}, \ldots, \beta_{y_N}, \rho, \eta_1, \ldots, \eta_N, \zeta)^T. \tag{3}
\]
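As an illustration of the object function of Section 2.1 (a constant background plus a sum of equal-width Gaussian peaks, one per atomic column), the following minimal Python sketch evaluates it on a pixel grid. It is not the authors' implementation: the function name `object_function`, the grid and all numerical values are hypothetical, and the convolution with the probe intensity profile of Eq. (1) is deliberately left out (assumption: for a roughly Gaussian probe its effect could be absorbed into the fitted width).

```python
import numpy as np

def object_function(x, y, beta_x, beta_y, rho, eta, zeta):
    """Background zeta plus one Gaussian of width rho and height eta[n] per column.

    x, y           : 1D pixel coordinates (lengths K and L)
    beta_x, beta_y : column coordinates (length N each)
    Returns a (K, L) array of expected pixel values (probe convolution omitted).
    """
    xx, yy = np.meshgrid(x, y, indexing="ij")      # pixel positions r_kl
    image = np.full(xx.shape, float(zeta))         # constant background
    for bx, by, height in zip(beta_x, beta_y, eta):
        r2 = (xx - bx) ** 2 + (yy - by) ** 2       # squared distance to column n
        image += height * np.exp(-r2 / (2.0 * rho ** 2))
    return image

# Hypothetical example: two columns on a 40 x 40 grid (arbitrary length units).
x = 0.1 * np.arange(40)
y = 0.1 * np.arange(40)
model = object_function(x, y, beta_x=[1.0, 3.0], beta_y=[2.0, 2.0],
                        rho=0.3, eta=[5.0, 8.0], zeta=0.5)
print(model.shape, model.max())
```

Because the two example columns are far apart compared with the width, the value at each column position is close to the background plus that column's height.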

Furthermore, if we consider independent normally distributed observations $w_{kl}$ with equal variance $\sigma^2$, the joint PDF is given by

\[
p_w(\omega; \Theta) = \prod_{k=1}^{K} \prod_{l=1}^{L} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{ -\frac{1}{2} \left( \frac{\omega_{kl} - \lambda_{kl}}{\sigma} \right)^2 \right\}. \tag{4}
\]

Appealing to the central limit theorem, the assumption of normally distributed observations is often justified in practical cases where also disturbances other than pure counting statistics contribute [19]. Since the expectations $\lambda_{kl}$ are described by the functional model $f_{kl}(\Theta)$, substitution of Eq. (1) in Eq. (4) shows how the PDF depends on the unknown parameters to be measured.

2.2. Single-atom STEM model parametric in the atomic number

As a second example, we consider single-atom imaging in STEM. Single atoms can be considered as weakly scattering samples which are often approximated as being phase objects. Therefore, we assume the phase object approximation to be valid in the derivation of a parametric PF for the resulting observations. The wave function in the exit plane of the object is then given by [44]

\[
\Psi(\mathbf{r}; Z) = \exp(i\sigma V(\mathbf{r}))\, P(\mathbf{r} - \mathbf{r}_{kl}) \tag{5}
\]

with $V(\mathbf{r})$ the total projected atomic potential, $\sigma$ the interaction parameter, and $P(\mathbf{r} - \mathbf{r}_{kl})$ the wave function of the focused probe incident upon the specimen at the position $\mathbf{r}_{kl}$. The atomic potential can be described using the parameterization of Kirkland [44], where for every atomic number Z, a set of 12 parameters determines the potential. The wave function is then diffracted into the diffraction plane, where the detector is positioned. Mathematically, this is described by a Fourier transform of Eq. (5). The detector integrates this wave function, $\mathcal{F}_{\mathbf{r} \to \mathbf{g}} \Psi(\mathbf{r}; Z)$, to form the STEM intensity at the pixel (k, l) corresponding to the STEM probe at position $\mathbf{r}_{kl}$

\[
I(\mathbf{r}_{kl}; Z) = \int |\mathcal{F}_{\mathbf{r} \to \mathbf{g}} \Psi(\mathbf{r}; Z)|^2 D(\mathbf{g})\, d\mathbf{g}. \tag{6}
\]

The detector function $D(\mathbf{g})$ is assumed to be equal to one in the detected field and to zero elsewhere. Based on the intensity distribution, described by Eq. (6), we can now derive an expression for the expected number of detected electrons

\[
f_{kl}(Z) = N_e \frac{I(\mathbf{r}_{kl}; Z)}{I_{D=1}}, \tag{7}
\]

with $N_e$ the number of incident electrons at each position of the probe and $I_{D=1}$ the constant intensity obtained from Eq. (6) when the detector function is uniform. When assuming that the STEM observations are independent electron counting results, which can be modeled as a Poisson distribution, the joint PF is given by

\[
p_w(\omega; Z) = \prod_{k=1}^{K} \prod_{l=1}^{L} \frac{\lambda_{kl}^{\omega_{kl}}}{\omega_{kl}!} \exp(-\lambda_{kl}). \tag{8}
\]

Similar to the example discussed in Section 2.1, this PF depends on the unknown parameter Z when replacing the expectations $\lambda_{kl}$ in Eq. (8) by the functional model, given by Eq. (7).

3. The maximum likelihood estimator

The observations are considered as a data plane from which the unknown parameters have to be estimated in a statistical way. Different estimators can be used to estimate the same unknown parameters of the proposed parametric models. Each estimator will have a different precision; however, the variance of unbiased estimators will never be lower than the CRLB, which is a theoretical lower bound on the variance and will be described in Section 4.1. The maximum likelihood (ML) estimator achieves this theoretical lower bound asymptotically, i.e. for an increasing number of observations, and is therefore of practical importance. From the joint P(D)F of the observations, discussed in Section 2, the ML estimator may be derived. The maximum likelihood estimates $\hat{t}_{ML}$ of the parameters $\Theta$ are given by the values of $t$ that maximize the likelihood function $p_w(w; t)$, with $t$ independent variables replacing the true parameters and the observations replacing the stochastic variables in the joint P(D)F

\[
\hat{t}_{ML} = \arg\max_{t}\, p_w(w; t) = \arg\max_{t}\, \ln p_w(w; t). \tag{9}
\]

The joint P(D)F $p_w(\omega; \hat{t}_{ML})$ with the ML estimates inserted generates the observations with higher probability than the joint probability distribution with any other set of parameter values $t$.

3.1. Maximum likelihood estimation of atomic column positions and column intensities

The ML estimator equals the least squares estimator for independent normally distributed observations. Therefore, for the parametric statistical model described in Section 2.1, it is given by

\[
\hat{\theta}_{LS} = \arg\min_{t} \sum_{k} \sum_{l} \left( w_{kl} - f_{kl}(t) \right)^2, \tag{10}
\]

where $f_{kl}$ is given by Eq. (1). Direct implementation of the least squares estimator as described by Eq. (10), in which all parameters are estimated at the same time, is computationally very intensive and feasible only for images containing a limited number of projected atomic columns in the HAADF STEM image. Therefore, a more efficient algorithm is proposed for obtaining estimates of the unknown structure parameters of the parametric model given by Eq. (1). The use of this new algorithm enables one to analyze larger fields of view. Two assumptions are made for the algorithm. First, it is assumed that an estimate for the background parameter $\zeta$ in the empirical model can be obtained from an area in the HAADF STEM image where the intensity is not related to the scattered intensities of the projected atomic columns of the structure of interest. The estimated value for the background parameter is then given by the mean value of this area. Secondly,


it is assumed that proper starting values are available for the parameters of the empirical model. Good starting values are necessary in order to avoid ending up in a local minimum during the model estimation procedure. Initial values for the width $\rho$ of the Gaussian are easy to provide and starting coordinates for the positions $\beta_x$ and $\beta_y$ of projected atomic columns can be given using e.g. a peak finding routine. The parameters $\beta_x$, $\beta_y$ and $\rho$ are the non-linear parameters of the model, and will be denoted as $\alpha$. An initial guess for the linear parameters, i.e. the heights $\eta$, is not needed, since the linear parameters in the model can be replaced by their linear least squares estimates given the values of the non-linear parameters of the model [45]. These linear least squares estimates are the best companion values for the linear parameters and the computation involves only a simple linear regression. Apart from removing the need for starting guesses for the linear parameters, this approach has a second benefit: the number of parameters that need to be estimated in an iterative procedure is reduced, since the best companion values for the linear parameters are used in each iteration, which speeds up the convergence of the model estimation.

Schematically, the proposed algorithm for model estimation is outlined in Fig. 1. The basic idea of this algorithm is the segmentation of the image into smaller sections containing individual atomic columns. Next, the parameters of these columns are estimated column-by-column without ignoring overlap between neighboring atomic columns. In the first step (B.), the estimated background is subtracted from the input image (A.). When initial guesses are available for the non-linear parameters $\alpha_1$ (C.), the iterative part of the fitting procedure can be executed (shown in gray in Fig. 1).

Based on the values of the non-linear parameters $\alpha_j$, starting values for the linear parameters $\eta_j$ are calculated by means of linear regression (D.), where j corresponds to the number of the iteration. Having starting values for the linear and non-linear parameters of the jth iteration available (E.), the actual estimation of the parameters can be done. To this end, regions of interest containing single atomic columns are selected from the HAADF STEM image (F.). Next, the parameters of the neighboring projected atomic columns are selected (G.) in order to take overlap between neighboring Gaussian shapes into account. The values of the non-linear and linear parameters are given by the values that resulted from, respectively, step C. and step D. in the first iteration and by the values that resulted from, respectively, step N. and step D. in further iterations. The contributions of the tails of these neighboring Gaussians are calculated in the region of interest of the single atomic column (H.). Then, these contributions are subtracted from the image of the single atomic column (I.). Ideally, the image thus obtained only contains scattered intensities resulting from this single projected atomic column and can be used for a non-linear least squares estimation of the parameters of this single Gaussian (J.). Steps (F.) through (J.) are repeated for each projected atomic column in the HAADF STEM image in order to obtain a complete set of estimated parameters $\hat{\Theta}_j$ for the jth iteration (K.).

Once the parameters are estimated, convergence is evaluated (L.). If convergence is not yet obtained (M.), the non-linear starting values for the (j+1)th iteration are calculated (N.). The updates for the non-linear parameters are computed based on the starting and estimated values for the non-linear parameters of the jth iteration. The updates are modified by a scaling factor $\lambda$ in order to avoid divergence of the parameters to be estimated. Having a new set of non-linear parameters available, the iterative part of the fitting algorithm can be executed again starting from step (D.). The iterative approach allows a better modeling of the overlap between neighboring Gaussians. If convergence is attained (O.), the model can be calculated based on the estimated parameters (P.). This algorithm allows one to efficiently evaluate larger nanostructures in a quantitative way.
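The linear regression of step (D.) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the helper names `gaussian_basis` and `linear_heights` and all numerical values are hypothetical, the non-linear parameters (column positions and width) are taken as given, and the probe convolution of Eq. (1) is ignored. Each column of the design matrix is a unit-height Gaussian for one atomic column, so overlapping neighbors are handled jointly by the regression.

```python
import numpy as np

def gaussian_basis(x, y, beta_x, beta_y, rho):
    """Design matrix with one unit-height Gaussian per atomic column (K*L rows, N columns)."""
    xx, yy = np.meshgrid(x, y, indexing="ij")
    columns = [
        np.exp(-((xx - bx) ** 2 + (yy - by) ** 2) / (2.0 * rho ** 2)).ravel()
        for bx, by in zip(beta_x, beta_y)
    ]
    return np.column_stack(columns)

def linear_heights(image, zeta, x, y, beta_x, beta_y, rho):
    """Linear least squares heights for fixed non-linear parameters."""
    design = gaussian_basis(x, y, beta_x, beta_y, rho)
    target = (image - zeta).ravel()                 # background-subtracted image, step (B.)
    eta, _, _, _ = np.linalg.lstsq(design, target, rcond=None)
    return eta

# Hypothetical noise-free test image with two overlapping columns.
x = 0.1 * np.arange(40)
y = 0.1 * np.arange(40)
beta_x, beta_y, rho, zeta = [1.0, 1.6], [2.0, 2.0], 0.3, 0.5
true_eta = np.array([5.0, 8.0])
image = zeta + (gaussian_basis(x, y, beta_x, beta_y, rho) @ true_eta).reshape(len(x), len(y))

eta_hat = linear_heights(image, zeta, x, y, beta_x, beta_y, rho)
print(eta_hat)
```

In a full iteration these heights would serve as the companion values for the current non-linear parameters, after which the positions and width are refined column by column.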


Fig. 1. Schematic of the proposed model estimation algorithm.

3.2. Maximum likelihood estimation of atomic numbers

As a second example, we consider the identification (i.e. estimation) of the atomic number Z of a single atom from a STEM image. As shown in Section 2.2, the joint PF for such an experiment is given by Eq. (8) where the expectations are described by Eq. (7). In this example, the unknown parameter $\Theta$ is given by the atomic number Z. Following Eq. (9), the ML estimator is then given by

\[
\hat{Z}_{ML} = \arg\max_{Z}\, \ln p_w(w; Z) = \arg\max_{Z} \sum_{k} \sum_{l} \left[ w_{kl} \ln f_{kl}(Z) - f_{kl}(Z) \right]. \tag{11}
\]

As compared to the example discussed in Section 3.1, the ML estimator Z^ ML can be computed straightforwardly given the discrete nature of the atomic number and the fact that only one parameter is estimated in this example. For that reason, it does not require advanced computing algorithms.
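Because Z only takes a few candidate values, the ML estimate can be found by evaluating the Poisson log-likelihood criterion, the sum over all pixels of w_kl ln f_kl(Z) - f_kl(Z), for each candidate and keeping the largest. The sketch below illustrates this under loud assumptions: the expectation models are toy Gaussian blobs whose height scales with Z, not the Kirkland-based model of Section 2.2, and the names `poisson_criterion` and `ml_atomic_number`, the candidate set and all numbers are hypothetical.

```python
import numpy as np

def poisson_criterion(w, f):
    """Sum over pixels of w_kl * ln f_kl(Z) - f_kl(Z); the ln(w_kl!) term does not depend on Z."""
    return np.sum(w * np.log(f) - f)

def ml_atomic_number(w, expectations):
    """Return the candidate Z whose expectation model maximizes the criterion."""
    return max(expectations, key=lambda z: poisson_criterion(w, expectations[z]))

# Toy expectation models (hypothetical, for illustration only).
grid = np.linspace(-1.0, 1.0, 16)
xx, yy = np.meshgrid(grid, grid, indexing="ij")
blob = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * 0.3 ** 2))
expectations = {z: 1.0 + 50.0 * (z / 79.0) * blob for z in (29, 47, 79)}

# Simulated Poisson counting results for a true atomic number of 47.
rng = np.random.default_rng(0)
w = rng.poisson(expectations[47])

z_hat = ml_atomic_number(w, expectations)
print(z_hat)
```

With noise-free data (the observations equal to one of the expectation models) the criterion is maximized by that same model, which gives a simple sanity check of the implementation.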

4. Experiment design

In general, the purpose of experiment design in electron microscopy is to set up experiments in such a way that the data obtained can be analyzed to yield the most information concerning the structure or chemistry of the sample under study. When applying the apparatus of statistical parameter estimation theory, this information is provided in terms of estimates for the unknown structure parameters using, for example, the ML estimator discussed in Section 3. Ultimately, the precision with which the parameters can be estimated is limited by noise. In the framework of statistical parameter estimation theory, the goal of experiment design is therefore to answer the question: which microscope settings are expected to yield the highest precision with which structure parameters, such as atomic column positions, particle size, and thickness, can be estimated? Previous work has shown that the CRLB is a very efficient way to answer this question [5,6,9,13–15,19,25,26,30–33]. The CRLB provides a theoretical lower bound on the variance of unbiased estimators of these parameters and can be computed from the parameterized P(D)F. By calculating the CRLB, the experimenter is able to compute the design so as to attain maximum precision. So far, studies on the precision of atomic scale measurements from (S)TEM images considered the estimation of the position of atoms or atomic

columns in projection [5,6,9,13–15,19,25,26,31,32], the atomic column thickness [33], and nanoparticle sizes [30]. In these papers, it has been shown how optimizing the design of quantitative electron microscopy experiments may substantially enhance the precision of the structure parameter estimators. The common aspect in these studies is the continuous differentiability of the P(D)F with respect to the parameters. For so-called restricted or discrete parameters, the P(D)F is no longer continuously differentiable and hence the CRLB is not defined. We will therefore propose an alternative solution using the principles of detection theory [34].

4.1. Attainable precision: the Cramér–Rao lower bound

Different estimators of the same parameters generally have a different precision. The question then arises what precision may be achieved ultimately from a particular set of observations. For the class of unbiased estimators (bias equals zero), this answer is given in the form of a lower bound on their variance, the CRLB [46–48]. Let $p_w(\omega; \Theta)$ be the joint P(D)F of a set of observations $w = (w_{11}, \ldots, w_{KL})^T$. An example of this function is given in Section 2.1. The dependence of $p_w(\omega; \Theta)$ on the $R \times 1$ parameter vector $\Theta$ can now be used to define the so-called Fisher information matrix

\[
F = -E\left[ \frac{\partial^2 \ln p_w(w; \Theta)}{\partial \Theta\, \partial \Theta^T} \right], \tag{12}
\]

which is an $R \times R$ matrix. The expression between square brackets represents the Hessian matrix of the logarithm of the joint P(D)F of which the (r, s)th element is defined by $\partial^2 \ln p_w(\omega; \Theta) / \partial \Theta_r \partial \Theta_s$. The Fisher information expresses the ‘inability to know’ a measured quantity [48]. Indeed, use of the concept of Fisher information allows one to determine the highest precision, that is, the lowest variance, with which a parameter can be estimated unbiasedly. Suppose that $\hat{\Theta}$ is any unbiased estimator of $\Theta$, that is, $E[\hat{\Theta}] = \Theta$. Then it can be shown that under general conditions the covariance matrix $\mathrm{cov}(\hat{\Theta})$ of $\hat{\Theta}$ satisfies

\[
\mathrm{cov}(\hat{\Theta}) \geq F^{-1}, \tag{13}
\]

so that $\mathrm{cov}(\hat{\Theta}) - F^{-1}$ is positive semi-definite. A property of a positive semi-definite matrix is that its diagonal elements cannot be negative. This means that the diagonal elements of $\mathrm{cov}(\hat{\Theta})$, that is, the actual variances of $\hat{\Theta}_1, \ldots, \hat{\Theta}_R$, are larger than or equal to the

A.J. den Dekker et al. / Ultramicroscopy 134 (2013) 34–43

corresponding diagonal elements of F −1 ^ r Þ≥½F varðΘ

−1

rr ;

ð14Þ −1

where r ¼ 1; …; R and ½F rr is the rth diagonal element of the inverse of the Fisher information matrix. In this sense, F −1 repre^ sents a lower bound for the variances of all unbiased estimators Θ. ^ The matrix F −1 is the CRLB on the variance of Θ. As was mentioned above, the attainable precision will depend on the microscope settings. As a consequence, it will also depend on the microscope's resolution. This relation is clearly illustrated by Van Dyck et al., who derived simplified expressions for the precision with which the distance between two neighboring atoms (or atomic columns in projection) can be estimated from a noisy HRTEM image [5,14,26]. It was shown that the highest attainable precision, in terms of the standard deviation s, with which the distance d can be measured, is given by the following rule of thumb [5,14,26]: pffiffiffi 2ρ2 s≈ pffiffiffiffiffiffi if d ≤ 2ρ d Ne pffiffiffi pffiffiffi 2ρ s≈ pffiffiffiffiffiffi if d≥ 2ρ Ne

ð15Þ

where ρ denotes the width of the image of the atom and Ne the total number of detected electrons available to image this atom. The width ρ in fact represents the resolution in the sense of Lord Rayleigh [7]. It depends on both object and microscope parameters. In fact, the standard deviation of the distance measurepffiffiffiffiffiffi ments is inversely pffiffiffi proportional to N e in any case. For distances smaller than 2ρ, the precision increases pffiffiffi proportionally with the distance. For distances larger than 2ρ, the precision becomes independent of the distance. In that case, it can be shown that the variance is directly proportional to the variance of the estimated position of an isolated atom. These expressions clearly show that the attainable precision depends on both the resolution and signal-to-noise ratio (i.e., electron dose). The principle of precisionbased optimization of the experiment design is in fact illustrated by Eq. (15). For example, it follows from these equations that if a higher resolution can only be attained at the expense of a decrease of the number of detected electrons, both effects have to be balanced under the existing physical constraints so as to produce the highest precision [25]. Note that the CRLB is not related to a particular estimation method and that the existence of a lower bound on the parameter variance does not imply that an estimator can be found that reaches this lower bound. However, it is known that there exists an estimator that achieves the CLRB at least asymptotically, that is, for an increasing number of observations. This estimator is the ML estimator, which has been discussed in Section 3. It is known to be asymptotically normally distributed with a mean equal to the true value of the parameter and a covariance matrix equal to the CRLB. In electron microscopy the number of observations is usually sufficiently large for the asymptotic properties of the ML estimator to apply. 
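As a rough illustration of Eqs. (12)–(15), the following Python sketch evaluates the Fisher information and the CRLB numerically. It assumes independent Poisson-distributed pixel counts with expectations $f_{kl}$, for which Eq. (12) reduces to $F_{rs} = \sum_{k,l} f_{kl}^{-1}\,(\partial f_{kl}/\partial\Theta_r)(\partial f_{kl}/\partial\Theta_s)$. The one-dimensional Gaussian peak, the function names and all numerical values are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def poisson_fisher_information(f, jac):
    """Fisher matrix of Eq. (12) for independent Poisson counts.

    f   : expected counts per pixel, shape (n,)
    jac : derivatives of f with respect to the R parameters, shape (n, R)
    """
    f = np.asarray(f, dtype=float)
    jac = np.asarray(jac, dtype=float)
    return jac.T @ (jac / f[:, None])

def crlb(fisher):
    """Inverse of the Fisher matrix, i.e. the bound of Eq. (13)."""
    return np.linalg.inv(fisher)

def distance_precision(rho, d, n_e):
    """Rule of thumb of Eq. (15) for the standard deviation of a distance d."""
    if d <= np.sqrt(2.0) * rho:
        return 2.0 * rho**2 / (d * np.sqrt(n_e))
    return np.sqrt(2.0) * rho / np.sqrt(n_e)

# Hypothetical 1D example: a single Gaussian peak of width rho at position beta,
# imaged with n_e detected electrons on a finely sampled grid.
rho, n_e, beta = 0.5, 1000.0, 0.0
x = np.arange(-8.0 * rho, 8.0 * rho, 0.05 * rho)
dx = x[1] - x[0]
f = n_e * dx / (np.sqrt(2.0 * np.pi) * rho) * np.exp(-(x - beta) ** 2 / (2.0 * rho**2))
jac = (f * (x - beta) / rho**2)[:, None]  # analytical derivative df/dbeta

sigma_beta = np.sqrt(crlb(poisson_fisher_information(f, jac))[0, 0])
print("CRLB on the peak position:", sigma_beta)
print("rho / sqrt(N_e):          ", rho / np.sqrt(n_e))
```

For this toy peak the numerical bound on the position should be close to $\rho/\sqrt{N_e}$, and `distance_precision` can be called with different values of `rho` and `n_e` to explore the resolution versus dose trade-off described above.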
For this and other reasons, the use of the ML estimator in quantitative electron microscopy is highly recommended. Moreover, approximate confidence regions and intervals for ML parameter estimates can be obtained based on the asymptotic statistical properties of the ML estimator. In this approach, the CRLB is approximated by substituting the ML parameter estimates for the true values of the parameters in expression (12) [19].

Although the CRLB is only defined for continuously differentiable probability functions, a generalization of the CRLB can be derived that will also apply to the case of a restricted parameter, such as the atomic number Z. This bound is known as the Chapman–Robbins lower bound or Hammersley–Chapman–Robbins lower bound (HCRLB) [49,50]. Like the CRLB, the HCRLB is a lower bound on the variance of unbiased estimators, which can be derived from the P(D)F of the observations. It is known to be both tighter and applicable to a wider range of problems than the CRLB. The limits and possibilities of using the HCRLB as an alternative solution to optimize the design when estimating restricted parameters will be discussed in a forthcoming paper.

4.2. Limits to deciding between different hypotheses: detection theory

When considering the problem of identifying the atomic number Z from a STEM image, a priori knowledge about the atom types that may be present in the sample and their concentration ratios is usually available. In such cases, the question reduces to distinguishing between a finite plausible set of values for the atomic numbers Z given the experimental STEM observations. In this paper, we will restrict ourselves to the problem of deciding between two hypotheses, where each hypothesis corresponds to the assumption of a specific Z value. Note, however, that detection theory also provides the tools to generalize this problem to a choice between more than two hypotheses [34]. For the problem at hand, the hypotheses are summarized as follows:

$$H_0: Z = Z_0, \qquad H_1: Z = Z_1, \tag{16}$$

where $H_0$ is referred to as the null hypothesis and $H_1$ as the alternative hypothesis. With these hypotheses, prior probabilities $P(H_0)$ and $P(H_1)$ are assumed known. In this way, we express a prior belief in the likelihood of the hypotheses. Throughout, we will assume that prior knowledge assures that only $Z_0$ and $Z_1$ are possible atomic numbers for the atom in the sample so that one of both hypotheses is always correct. If the presence of an atom of atom type $Z_0$ or $Z_1$ is equally likely, then it is reasonable to assign equal probabilities of 1/2. In this so-called Bayesian approach, where we assign prior probabilities, we can define a probability of error $P_e$ as

$$P_e = \Pr\{\text{decide } H_0;\ H_1 \text{ true}\} + \Pr\{\text{decide } H_1;\ H_0 \text{ true}\} = P(H_0|H_1)P(H_1) + P(H_1|H_0)P(H_0), \tag{17}$$

where $P(H_i|H_j)$ is the conditional probability that indicates the probability of deciding $H_i$ when $H_j$ is true. Using criterion (17), the two possible errors are weighted appropriately to yield an overall error measure. Decision rules are now defined such that the probability of error is minimized. For this purpose, it is shown in [34] that we should then decide $H_1$ if

$$\frac{p_w(w;Z_1)}{p_w(w;Z_0)} > \frac{P(H_0)}{P(H_1)} = \gamma, \tag{18}$$

with $p_w(w;Z_i)$ the conditional joint PF $p_w(\omega;Z_i)$ evaluated at the available observations $w$. For the problem at hand, the conditional joint PF is given by Eq. (8). As is commonly the case, the prior probabilities are assumed to be equal such that $\gamma$ corresponds to 1. Then, we decide $H_1$ if

$$LR(w) = \frac{p_w(w;Z_1)}{p_w(w;Z_0)} > 1. \tag{19}$$

The function $LR(w)$ is called the likelihood ratio since it indicates for each set of observations $w$ the likelihood of $H_1$ versus the likelihood of $H_0$. This test is therefore also known as the likelihood ratio test. Equivalently, decision rule (19) corresponds to deciding $H_1$ if

$$\ln LR(w) = \ln p_w(w;Z_1) - \ln p_w(w;Z_0) > 0. \tag{20}$$

Otherwise $H_0$ is decided. Following Section 3, this corresponds to choosing the hypothesis for which the log-likelihood function is maximal. The left-hand side of (20) is termed the log-likelihood ratio. Using Eq. (11), it follows that

$$\ln LR(w) = \sum_k \sum_l \left[ w_{kl}\left(\ln f_{kl}(Z_1) - \ln f_{kl}(Z_0)\right) + \left(f_{kl}(Z_0) - f_{kl}(Z_1)\right) \right]. \tag{21}$$

Given decision rule (20), the probability of error $P_e$, given by Eq. (17), can be reformulated as follows:

$$P_e = P(\ln LR(w) < 0 \mid H_1)\,\tfrac{1}{2} + P(\ln LR(w) > 0 \mid H_0)\,\tfrac{1}{2}. \tag{22}$$

In Section 5, it will be shown how repetitive simulations can be used in order to compute this probability of error and how to compute the experimental settings minimizing the probability to assign an incorrect hypothesis.

A tightly connected performance measure that will be investigated as a possible alternative to optimize the design is based on the so-called Kullback–Leibler divergence [51,52]. This measure quantifies the difference between two probability distributions. The Kullback–Leibler divergence from $p_{H_1} = p_w(\omega;Z_1)$ to $p_{H_0} = p_w(\omega;Z_0)$ is defined as

$$D(p_{H_1}, p_{H_0}) \equiv E_{p_{H_1}}\left[\ln \frac{p_w(w;Z_1)}{p_w(w;Z_0)}\right] = E_{p_{H_1}}[\ln LR(w)]. \tag{23}$$

It corresponds to the expected or mean log-likelihood ratio assuming $H_1$ to be true. Similarly, the Kullback–Leibler divergence from $p_{H_0}$ to $p_{H_1}$ equals

$$D(p_{H_0}, p_{H_1}) = E_{p_{H_0}}\left[\ln \frac{p_w(w;Z_0)}{p_w(w;Z_1)}\right] = -E_{p_{H_0}}\left[\ln \frac{p_w(w;Z_1)}{p_w(w;Z_0)}\right] = -E_{p_{H_0}}[\ln LR(w)]. \tag{24}$$

From Eqs. (23) and (24), it follows that

$$D(p_{H_1}, p_{H_0}) + D(p_{H_0}, p_{H_1}) = E_{p_{H_1}}[\ln LR(w)] - E_{p_{H_0}}[\ln LR(w)]. \tag{25}$$

The sum of Kullback–Leibler divergences thus corresponds to the difference of the mean log-likelihood ratio under $H_1$ and the corresponding value when assuming $H_0$ to be true. In this paper, we will investigate if this measure can be used as an alternative to the above-mentioned probability of error to compute the optimal experiment design. Indeed, based on decision rule (20), which can be used to choose between $H_1$ and $H_0$, it follows that the probability to assign the wrong hypothesis will decrease when the distributions of the log-likelihood ratio under these hypotheses are better separated. A measure of this separation is given by Eq. (25). For that reason, it is reasonable to assume that the probability to assign a wrong hypothesis will decrease when the sum of Kullback–Leibler divergences increases. An explicit expression for Eq. (25) can be derived from Eq. (21) using the fact that the expectations of the observations are described by means of the expectation model given by Eq. (7), that is, $E_{p_{H_0}}[w_{kl}] = f_{kl}(Z_0)$ and $E_{p_{H_1}}[w_{kl}] = f_{kl}(Z_1)$:

$$D(p_{H_1}, p_{H_0}) + D(p_{H_0}, p_{H_1}) = \sum_k \sum_l \left[\left(f_{kl}(Z_1) - f_{kl}(Z_0)\right)\left(\ln f_{kl}(Z_1) - \ln f_{kl}(Z_0)\right)\right]. \tag{26}$$

In Section 5, this expression will be calculated explicitly. It will be investigated if an increase of the sum of Kullback–Leibler divergences leads to a decrease of the probability of error. In this way, it will be possible to decide if this measure can be used as an alternative performance measure to optimize the experiment design.
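The decision rule and the divergence measure of this section can be sketched in a few lines of Python. The sketch below assumes Poisson-distributed counts, as in Eq. (21); the three-pixel expectation models `f0` and `f1` and all function names are hypothetical and serve only to check that the sum of Kullback–Leibler divergences of Eq. (26) equals the difference of mean log-likelihood ratios of Eq. (25).

```python
import numpy as np

def log_likelihood_ratio(w, f1, f0):
    """ln LR(w) of Eq. (21) for counts w with expectations f1 (H1) and f0 (H0)."""
    w, f1, f0 = (np.asarray(a, dtype=float) for a in (w, f1, f0))
    return float(np.sum(w * (np.log(f1) - np.log(f0)) + (f0 - f1)))

def decide_h1(w, f1, f0, prior_h0=0.5, prior_h1=0.5):
    """Decision rule of Eq. (18) in logarithmic form: decide H1 if ln LR(w) > ln(gamma)."""
    gamma = prior_h0 / prior_h1
    return log_likelihood_ratio(w, f1, f0) > np.log(gamma)

def kl_sum(f1, f0):
    """Sum of Kullback-Leibler divergences of Eq. (26)."""
    f1, f0 = np.asarray(f1, dtype=float), np.asarray(f0, dtype=float)
    return float(np.sum((f1 - f0) * (np.log(f1) - np.log(f0))))

# Hypothetical expectation models on a three-pixel "image" (not from the paper).
f0 = np.array([4.0, 10.0, 4.0])
f1 = np.array([5.0, 13.0, 5.0])

# ln LR(w) is linear in w, so its mean under H1 (H0) follows from inserting
# E[w] = f1 (f0) into Eq. (21).
mean_h1 = log_likelihood_ratio(f1, f1, f0)
mean_h0 = log_likelihood_ratio(f0, f1, f0)
print("E_H1[ln LR] - E_H0[ln LR]:", mean_h1 - mean_h0)
print("Eq. (26):                 ", kl_sum(f1, f0))
```

With equal priors, `decide_h1` reduces to the test of Eq. (20); unequal priors only shift the threshold to $\ln\gamma$.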

5. Simulation experiments

In this section, we will present the results of an explorative simulation study that has been performed in order to investigate the two criteria introduced in Section 4.2 as possible alternatives to evaluate and optimize the inner detector radius of a quantitative STEM experiment. The importance of enhancing the imaging

power to detect light atoms, such as lithium and hydrogen, and to visualize mono-atomic-layer membranes, such as graphene, has re-attracted interest in the optimization of the STEM detector for such applications. Findlay et al. have shown in [53] that so-called annular bright field (ABF) STEM, whereby an annular detector is used with detector collection range lying within the cone of illumination, performs well to detect light elements. Another strength of ABF STEM is that both light and heavy atomic columns are visible simultaneously in contrast to HAADF STEM which tends to render columns of light elements invisible when in proximity to heavier elements. Hovden and Muller [54] have shown, by means of image simulations in combination with an analysis of the detection efficiency, that the low angle annular dark field (LAADF) detector can provide a significant increase in signal-to-noise ratio for well-resolved and atomically thin specimens. Here, we will investigate the possibilities of using the probability of error and the Kullback–Leibler divergence as alternative quantitative criteria to optimize STEM detector settings in terms of identifying the atomic number of a single atom. The simulation experiment that will be described here is based on an earlier problem considered in [21] where the question was to decide between the presence of Ti and Mn atomic columns at an interface. In that case, the difference in atomic number Z is only 3. Because of this small difference in Z in combination with the presence of heavy atomic columns surrounding the Ti and Mn columns, this question could not be solved visually. Instead the use of statistical parameter estimation theory has been proposed. Here we will reconsider this problem in terms of optimizing STEM detector settings for the simplified problem of deciding between the presence of a single Ti or Mn atom. The hypotheses of interest can therefore be formulated as H0 : Z ¼ Z 0 ¼ 22 and H1 : Z ¼ Z 1 ¼ 25. 
The goal is now to minimize the probability of error as a function of the inner radius of an annular detector, assuming a constant infinitely large outer detector radius. This means that in Eq. (6) all contributions outside the so-formed detector hole are summed up. The inner detector radius will be varied over a broad range, covering the ABF to the HAADF regime. If we assume that the probabilities of the presence of a Ti atom or Mn atom are equal, an expression for the probability of error is given by Eq. (22). Furthermore, the sum of Kullback–Leibler divergences, for which an expression is given by Eq. (26), will be computed for the given hypotheses and maximized as a function of the inner detector radius. As discussed in Section 4.2, the detector radius minimizing the probability of error is expected to correspond to the radius maximizing the sum of Kullback–Leibler divergences. The probability of error can only be computed using repetitive image simulations under both hypotheses. Therefore, the parametric statistical model described in Section 2.2 is assumed. Given the simulation parameters listed in Table 1, the expectation models described by Eq. (7) have been computed for a single Ti and Mn atom. Note that, to ensure accuracy, the projected potential, and thus the wave function in Eq. (5), were calculated using a three times smaller sampling distance than the probe sampling distance. Next, Poisson distributed observations w have been generated following the joint PF given by Eq. (8). In this way, 10 000 images of size 60 × 60 pixels have been simulated under H0 and another 10 000 under H1. For every simulation experiment, the log-likelihood ratio ln LR(w), given by Eq. (21), is calculated. In this way, the results shown in Figs. 2–5 are obtained for four different inner detector radii (0.55 Å−1, 1.00 Å−1, 1.25 Å−1 and 2.50 Å−1). The histograms shown in light gray and dark gray result from simulations assuming the presence of a Ti and Mn atom, respectively.
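The simulation procedure described above can be sketched as follows. This is a minimal Monte Carlo illustration under stated assumptions, not the authors' code: the expectation models are hypothetical Gaussian peaks on a flat background rather than multislice results from Eq. (7), the function names are invented for the example, and the number of images per hypothesis is reduced from the 10 000 used in the paper to keep memory use small.

```python
import numpy as np

def simulate_log_lr(f_true, f1, f0, n_images, rng):
    """Draw Poisson images with expectation f_true and return ln LR(w), Eq. (21), per image."""
    w = rng.poisson(f_true, size=(n_images,) + f_true.shape)
    per_pixel = w * (np.log(f1) - np.log(f0)) + (f0 - f1)
    return per_pixel.reshape(n_images, -1).sum(axis=1)

def probability_of_error(f1, f0, n_images=2000, seed=0):
    """Monte Carlo estimate of Eq. (22), assuming equal prior probabilities."""
    rng = np.random.default_rng(seed)
    llr_h0 = simulate_log_lr(f0, f1, f0, n_images, rng)
    llr_h1 = simulate_log_lr(f1, f1, f0, n_images, rng)
    miss = np.mean(llr_h1 <= 0.0)        # H1 true, but H0 decided by rule (20)
    false_alarm = np.mean(llr_h0 > 0.0)  # H0 true, but H1 decided by rule (20)
    return 0.5 * miss + 0.5 * false_alarm, llr_h0, llr_h1

# Hypothetical 60 x 60 expectation models (illustrative values only).
y, x = np.mgrid[-30:30, -30:30]
peak = np.exp(-(x**2 + y**2) / (2.0 * 6.0**2))
f0 = 0.05 + 0.5 * peak  # stand-in for the Ti expectation model
f1 = 0.05 + 0.6 * peak  # stand-in for the Mn expectation model

p_e, llr_h0, llr_h1 = probability_of_error(f1, f0)
print("estimated probability of error:", p_e)
print("mean ln LR under H0 and H1:", llr_h0.mean(), llr_h1.mean())
```

Histograms of `llr_h0` and `llr_h1` play the role of the light gray and dark gray histograms described above, and their overlap determines the estimated probability of error.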
Table 1. Microscope parameter values used in the simulation study.

| Parameter | Symbol | Value |
|---|---|---|
| Electron dose | $N_e$ | 100 |
| Acceleration voltage | $V$ | 300 kV |
| Electron wavelength | $\lambda$ | 0.0197 Å |
| Defocus | $\varepsilon$ | −83.01 Å |
| Spherical aberration | $C_s$ | 0.035 mm |
| Spherical aberration of fifth order | $C_5$ | 0 mm |
| Semi-convergence angle | $\alpha$ | 1.1056 Å−1 |
| Interaction parameter | $\sigma$ | 5.3 × 10−4 |
| Probe sampling distance | dx | 0.1 Å |
| Total number of scanned pixels | $K \times L$ | 60 × 60 |

Fig. 2. Histograms of the log-likelihood ratio for inner detector radius 0.55 Å−1 for Ti and Mn.

Fig. 3. Histograms of the log-likelihood ratio for inner detector radius 1.0 Å−1 for Ti and Mn.

Fig. 4. Histograms of the log-likelihood ratio for inner detector radius 1.25 Å−1 for Ti and Mn.

Fig. 5. Histograms of the log-likelihood ratio for inner detector radius 2.5 Å−1 for Ti and Mn.

Based on the computed log-likelihood ratios, hypothesis H1 or H0 is decided following decision rule (20) for each simulation experiment. From the fraction of wrongly assigned atomic numbers, the probability of error, given by Eq. (22), can then be estimated. By repeating this procedure for a broad range of inner detector radii, it can be investigated which settings minimize the probability of error. The results are given by the circles shown in Fig. 6. From this figure, it can be seen that the probability of error is smallest for an inner detector radius of

1.25 Å−1. At this detector radius, the histograms of the log-likelihood ratios are best separated, as can be observed from the comparison of the different results presented in Figs. 2–5. As shown in Section 4.2, a criterion measuring the distance between the distributions of the log-likelihood ratio is given by the sum of Kullback–Leibler divergences, for which an expression is given by Eq. (26). This sum has been computed as a function of the inner detector radius, resulting in the dotted curve shown in Fig. 6. Note that the computation of the Kullback–Leibler divergence does not require repetitive image simulations and is therefore easier to calculate. As expected, the inner detector radius maximizing the sum of Kullback–Leibler divergences corresponds to the radius minimizing the probability of error. A maximum divergence is indeed found for an inner detector radius also corresponding to 1.25 Å−1. Furthermore, as follows from Fig. 6, an increase of the sum of Kullback–Leibler divergences in general corresponds to a decrease of the probability of error. These results give a strong indication of the possibility of using the sum of Kullback–Leibler divergences as a useful alternative performance measure for the optimization of the experiment design. It is interesting to note that, based on this preliminary simulation study, the optimal inner detector radius slightly exceeds the circular objective aperture radius of 1.1 Å−1. This corresponds to the low angle annular dark field (LAADF) regime also suggested in [54]. The detector radius resulting from the principles of detection theory seems to correspond to the settings for which a trade-off between signal-to-noise ratio and contrast is found. Indeed, at small detector radii (ABF regime), the signal-to-noise ratio is high, whereas at large detector radii (HAADF regime), the contrast is large. Note that similar conclusions have been found in [31] where the design has been optimized in terms of the precision with which atomic column positions can be determined.

Fig. 6. The probability of error and sum of Kullback–Leibler divergences as a function of the inner detector angle. All other parameters are fixed (see Table 1). [Axes: probability of error (left) and sum of Kullback–Leibler divergences (right) versus inner detector radius (Å−1).]
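Because Eq. (26) needs only the two expectation models, a design scan based on the sum of Kullback–Leibler divergences reduces to a simple loop over candidate settings. The Python sketch below illustrates this with a purely hypothetical one-pixel expectation model, `toy_expectation`, in which the collected signal falls and the Z-contrast rises with the inner detector radius. It is not a physical scattering model and does not reproduce the values of Fig. 6; it only shows the mechanics of the scan.

```python
import numpy as np

def kl_sum(f1, f0):
    """Sum of Kullback-Leibler divergences of Eq. (26)."""
    f1, f0 = np.asarray(f1, dtype=float), np.asarray(f0, dtype=float)
    return float(np.sum((f1 - f0) * (np.log(f1) - np.log(f0))))

def toy_expectation(z, inner_radius, dose=100.0):
    """Hypothetical stand-in for the expectation model of Eq. (7); not a physical model."""
    collected = np.exp(-1.5 * inner_radius)               # signal drops with radius
    exponent = 2.0 * inner_radius / (1.0 + inner_radius)  # Z-contrast grows with radius
    return np.array([dose * collected * (z / 20.0) ** exponent])

# Scan candidate inner detector radii for H0: Z = 22 (Ti) versus H1: Z = 25 (Mn).
radii = np.linspace(0.0, 5.0, 51)
scores = [kl_sum(toy_expectation(25, r), toy_expectation(22, r)) for r in radii]
best_radius = radii[int(np.argmax(scores))]
print("radius maximizing the KL sum in the toy model:", best_radius)
```

In an actual study, `toy_expectation` would be replaced by the simulated expectation models for each detector setting, and the resulting optimum could be cross-checked against a Monte Carlo estimate of the probability of error at a few selected radii.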

6. Conclusions

In this paper, the benefits of statistical parameter estimation theory in the field of electron microscopy as a method to quantify atomic column positions, number of atoms, nanoparticle radius, chemical compositions, and other structure parameters, have been reviewed. Ultimately these parameters need to be estimated as accurately and precisely as possible from the available observations. To reach this goal, the ML estimator has been described. When assuming independent normally distributed observations, this estimator equals the least squares estimator. An efficient algorithm has been proposed for least squares estimation of structure parameters from HAADF STEM observations. This algorithm allows one to analyze larger nanostructures in a quantitative way. Furthermore, it has been illustrated how to derive the CRLB from the joint P(D)F. This expression can be used for the optimization of the experiment design so as to attain the highest precision. However, for discrete parameters, such as the atomic number Z, this lower bound is no longer applicable. Therefore, alternative solutions using the principles of detection theory have been proposed. This theory provides the necessary tools to decide between two or more hypotheses, where each hypothesis corresponds to the assumption of a specific Z value. For a binary hypothesis problem, it has been shown how repetitive simulations can be used to compute the probability of error to assign an incorrect hypothesis. Given an experimental case, in which the question was to decide between the presence of a Mn or Ti atom, it has been shown how to minimize this error as a function of the inner detector radius of an annular detector using image simulations. Furthermore, the Kullback–Leibler divergence has been proposed as an alternative measure for experiment design which does not require the use of repetitive simulations. A maximum in the Kullback–Leibler divergence as a function of the experimental settings corresponds to a minimum in the probability of error. From the optimization of the experiment design, it turns out that for the experimental case at hand, the optimal inner detector radius lies in the LAADF regime, thus corresponding to a trade-off between signal-to-noise ratio and contrast.

The concept that has been introduced here will be extended in a forthcoming paper toward the optimization of all experimental settings simultaneously, including settings of different detector types and probe settings. Furthermore, we will investigate the effect of the electron dose on the identification of atomic numbers and will consider the problem of detecting light element atoms in the presence of heavy atoms. In addition, we will expand the problem from binary to multi-level hypotheses. Finally, since the experimental guidelines are determined by the parametric statistical model, we will extend the model including effects such as thermal diffuse scattering. We will also expand the model for the case of a crystal instead of isolated atoms where diffracted beams overlap with the central diffraction disc.

Acknowledgments

The authors acknowledge financial support from the Research Foundation Flanders (FWO, Belgium) through project funding (G.0393.11, G.0064.10 and G.0374.13) and a Ph.D. research grant to A.D.B. The research leading to these results has received funding from the European Union Seventh Framework Programme [FP7/2007–2013] under Grant agreement no. 312483 (ESTEEM2). The authors would like to thank Professor Dirk Van Dyck for sharing his knowledge and insights and for inspiring them for so many years.

References

[1] K. Urban, Studying atomic structures by aberration-corrected transmission electron microscopy, Science 321 (2008) 506–510. [2] C.L. Jia, S.B. Mi, K. Urban, I. Vrejoiu, M. Alexe, D. Hesse, Atomic-scale study of electric dipoles near charged and uncharged domain walls in ferroelectric films, Nature Materials 7 (2008) 57–61. [3] C.L. Jia, S.B. Mi, M. Faley, U. Poppe, J. Schubert, K. Urban, Oxygen octahedron reconstruction in the SrTiO3/LaAlO3 heterointerfaces investigated using aberration-corrected ultrahigh-resolution transmission electron microscopy, Physical Review B 79 (2009) 081405. [4] R. Erni, M.D. Rossell, C. Kisielowski, U. Dahmen, Atomic resolution imaging with a sub-50 pm electron probe, Physical Review Letters 102 (2009) 096101. [5] S. Van Aert, A.J. den Dekker, A. van den Bos, D. Van Dyck, High-resolution electron microscopy: from imaging toward measuring, IEEE Transactions on Instrumentation and Measurement 51 (4) (2002) 611–615. [6] D. Van Dyck, S. Van Aert, A.J. den Dekker, A. van den Bos, Is atomic resolution transmission electron microscopy able to resolve and refine amorphous structures? Ultramicroscopy 98 (1) (2003) 27–42. [7] Lord Rayleigh, Wave theory of light, in: Scientific Papers by John William Strutt, Baron Rayleigh, vol. 3, Cambridge University Press, Cambridge, 1902, pp. 47–189. [8] A.J. den Dekker, A. Van den Bos, Resolution: a survey, Journal of the Optical Society of America A 14 (3) (1997) 547–557. [9] D. Van Dyck, E. Bettens, J. Sijbers, M. Op de Beeck, A. Van den Bos, A.J. den Dekker, From high resolution image to atomic structure: how far are we? Scanning Microscopy 11 (1997) 467–478. [10] D.A. Muller, Why changes in bond lengths and cohesion lead to core-level shifts in metals, and consequences for the spatial difference method, Ultramicroscopy 78 (1999) 163–174. [11] C. Kisielowski, E. Principe, B. Freitag, D. Hubert, Benefits of microscopy with super resolution, Physica B 308–310 (2001) 1090–1096. [12] J.P. Locquet, J. Perret, J. Fompeyrine, E. Machler, J.W. Seo, G. Van Tendeloo, Doubling the critical temperature of La1.9Sr0.1CuO4, Nature 394 (1998). [13] A.J. den Dekker, J. Sijbers, D. Van Dyck, How to optimize the design of a quantitative HREM experiment so as to attain the highest precision? Journal of Microscopy 194 (1999) 95–104. [14] E. Bettens, D. Van Dyck, A.J. den Dekker, J. Sijbers, A. van den Bos, Model-based two-object resolution from observations having counting statistics, Ultramicroscopy 77 (1999) 37–48. [15] S. Van Aert, A.J. den Dekker, A. van den Bos, D. Van Dyck, J.H. Chen, Maximum likelihood estimation of structure parameters from high resolution electron microscopy images, part II: a practical example, Ultramicroscopy 104 (2) (2005) 107–125.


[16] S. Bals, S. Van Aert, G. Van Tendeloo, D. Ávila Brande, Statistical estimation of atomic positions from exit wave reconstruction with a precision in the picometer range, Physical Review Letters 96 (2006) 096106. [17] S. Van Aert, K.J. Batenburg, M.D. Rossell, R. Erni, G. Van Tendeloo, Three dimensional atomic imaging of crystalline nanoparticles, Nature 470 (2011) 374–377. [18] S. Van Aert, S. Turner, R. Delville, D. Schryvers, G. Van Tendeloo, E.K.H. Salje, Direct observation of ferrielectricity at ferroelastic domain boundaries in CaTiO3 by electron microscopy, Advanced Materials 24 (2012) 523–527. [19] A.J. den Dekker, S. Van Aert, D. Van Dyck, A. van den Bos, Maximum likelihood estimation of structure parameters from high resolution electron microscopy image. Part I: a theoretical framework, Ultramicroscopy 104 (2) (2005) 83–106. [20] J. Verbeeck, S. Van Aert, Model based quantification of EELS spectra, Ultramicroscopy 101 (2004) 207–224. [21] S. Van Aert, J. Verbeeck, R. Erni, S. Bals, M. Luysberg, D. Van Dyck, G. Van Tendeloo, Quantitative atomic resolution mapping using high-angle annular dark field scanning transmission electron microscopy, Ultramicroscopy 109 (2009) 1236–1244. [22] S. Van Aert, A. De Backer, G.T. Martines, B. Goris, S. Bals, G. Van Tendeloo, A. Rosenauer, Procedure to count atoms with trustworthy single-atom sensitivity, Physical Review B 87 (2013) 064107. [23] A. van den Bos, A.J. den Dekker, Resolution Reconsidered—Conventional Approaches and an Alternative, Advances in Imaging and Electron Physics, vol. 117, Academic Press, San Diego, USA241–360. [24] S. Van Aert, D. Van Dyck, Do smaller probes in a STEM result in more precise measurement of the distances between atom columns? Philosophical Magazine B 81 (2001) 1833–1846. [25] A.J. den Dekker, S. Van Aert, D. Van Dyck, A. van den Bos, P. Geuens, Does a monochromator improve the precision in quantitative HRTEM? Ultramicroscopy 89 (2001) 275–290. [26] S. Van Aert, A.J. den Dekker, D. 
Van Dyck, A. van den Bos, High-resolution electron microscopy and electron tomography: resolution versus precision, Journal of Structural Biology 138 (2002) 21–33. [27] S. Van Aert, Statistical Experimental Design for Quantitative Atomic Resolution Transmission Electron Microscopy, Ph.D. Thesis, Delft University of Technology, 2003. [28] S. Van Aert, A.J. den Dekker, D. Van Dyck, How to optimize the experimental design of quantitative atomic resolution TEM experiments? Micron 35 (2004) 425–429. [29] S. Van Aert, A.J. den Dekker, A. van den Bos, D. Van Dyck, Statistical Experimental Design for Quantitative Atomic Resolution Transmission Electron Microscopy, Advances in Imaging and Electron Physics, vol. 130, Academic Press, San Diego, USA, 2004, pp. 1-164. [30] W. Van den Broek, S. Van Aert, P. Goos, D. Van Dyck, Throughput maximization of particle radius measurements through balancing size versus current of the electron probe, Ultramicroscopy 111 (2011) 940–947. [31] S. Van Aert, A.J. den Dekker, D. Van Dyck, A. van den Bos, Optimal experimental design of STEM measurement of atom column positions, Ultramicroscopy 90 (4) (2002) 273–289.


[32] S. Van Aert, D. Van Dyck, A.J. den Dekker, Resolution of coherent and incoherent imaging systems reconsidered—classical criteria and a statistical alternative, Optics Express 14 (2006) 3830–3839. [33] A. Wang, S. Van Aert, P. Goos, D. Van Dyck, Precision of three-dimensional atomic scale measurements for HRTEM images: what are the limits? Ultramicroscopy 114 (2012) 20–30. [34] S.M. Kay, Fundamentals of Statistical Signal Processing, Detection Theory, vol. II, Prentice-Hall Inc., New Jersey, 2009. [35] R.F. Loane, P. Xu, J. Silcox, Ultramicroscopy 40 (1992) 121. [36] P.D. Nellist, Scanning transmission electron microscopy, in: P.W. Hawkes, J.C. H. Spence (Eds.), Science of Microscopy, vol. 1, Springer, New York, 2007, pp. 65–132. [37] S.J. Pennycook, D.E. Jesson, High-resolution z-contrast imaging of crystals, Ultramicroscopy 37 (1991) 14–38. [38] S.J. Pennycook, B. Rafferty, P.D. Nellist, Z-contrast imaging in an aberrationcorrected scanning transmission electron microscope, Microscopy and Microanalysis 6 (2000) 343–352. [39] M.M.J. Treacy, A. Howie, C.J. Wilson, Z contrast imaging of platinum and palladium catalysts, Philosophical Magazine A 38 (1978) 569–585. [40] P.D. Nellist, S.J. Pennycook, Incoherent imaging using dynamically scattered coherent electrons, Ultramicroscopy 78 (1999) 111–124. [41] J. Fertig, H. Rose, Resolution and contrast of crystalline objects in high-resolution scanning transmission electron microscopy, Optik 59 (1981) 407–429. [42] L.J. Allen, S.D. Findlay, M.P. Oxley, C.J. Rossouw, Lattice-resolution contrast from a focused coherent electron probe. Part I, Ultramicroscopy 96 (2003) 47–63. [43] P. Hartel, H. Rose, C. Dinges, Conditions and reasons for incoherent imaging in STEM, Ultramicroscopy 63 (1996) 93–114. [44] E.J. Kirkland, Advanced Computing in Electron Microscopy, Plenum Press New York, 1998. [45] W.H. Lawton, E.A. Sylvestre, Elimination of linear parameters in nonlinear regression, Technometrics 13 (3) (1971) 461–467. 
[46] R.I. Jennrich, An Introduction to Computational Statistics—Regression Analysis, Prentice Hall, Englewood Cliffs, NJ, 1995. [47] A. Stuart, K. Ord, Kendall's Advanced Theory of Statistics, Arnold, London, 1994. [48] B.R. Frieden, Physics from Fisher Information—A Unification, Cambridge University Press, Cambridge, United Kingdom, 1998. [49] J.M. Hammersley, Estimating restricted parameters, Journal of the Royal Statistical Society. Series B (Methodological) 12 (2) (1950) 192–240. [50] D.G. Chapman, H. Robbins, Minimum variance estimation without regularity assumptions, Annals of Mathematical Statistics 22 (4) (1951) 581–586. [51] S. Kullback, R.A. Leibler, On information and sufficiency, Annals of Mathematical Statistics 22 (1951) 79–86. [52] S. Kullback, Information Theory and Statistics, John Wiley and Sons, 1959. [53] S.D. Findlay, N. Shibata, H. Sawada, E. Okunishi, Y. Kondo, T. Yamamoto, Y. Ikuhara, Robust atomic resolution imaging of light elements using scanning transmission electron microscopy, Applied Physics Letters 95 (2009) 191913. [54] R. Hovden, D.A. Muller, Efficient elastic imaging of single atoms on ultrathin supports in a scanning transmission electron microscope, Ultramicroscopy 123 (2012) 59–65.