ADVANCES IN IMAGING AND ELECTRON PHYSICS, VOL. 151
Lie Algebraic Methods in Charged Particle Optics ˇ TOMÁŠ RADLICKA Institute of Scientific Instruments AS CR, Královopolská 147, 612 64 Brno, Czech Republic
I. Introduction . . . . . . . . . . . . . . . . II. Trajectory Equations . . . . . . . . . . . . . . A. Newton Formulation . . . . . . . . . . . . . B. Lagrangian Approach . . . . . . . . . . . . C. Hamiltonian Approach . . . . . . . . . . . . D. Hamilton–Jacobi Approach . . . . . . . . . . . III. The Field Computation . . . . . . . . . . . . . A. The Basic Equations . . . . . . . . . . . . . B. The Field in the Vicinity of the Optic Axis . . . . . . IV. Trajectory Equations: Solution Methods . . . . . . . . A. Paraxial Approximation . . . . . . . . . . . . 1. Paraxial Approximation from the Trajectory Equation . . 2. The Hamiltonian Formulation of the Paraxial Approximation B. The Paraxial Transformation: General Dispersion Case . . . C. Polynomial Form of Solution . . . . . . . . . . D. Numerical Methods . . . . . . . . . . . . . V. The Analytic Perturbation Method . . . . . . . . . . A. The Differential Algebraic Method . . . . . . . . . B. The Trajectory Method . . . . . . . . . . . . C. The Eikonal Method . . . . . . . . . . . . . 1. The First-Order Perturbation . . . . . . . . . . 2. The Second-Order Perturbation . . . . . . . . . D. The Lie Algebraic Method . . . . . . . . . . . E. Dispersion and Chromatic Aberration . . . . . . . . 1. The Trajectory Method . . . . . . . . . . . 2. The Lie Algebraic Method . . . . . . . . . . 3. The Differential Algebraic Method . . . . . . . . F. Example: Round Magnetic Lens . . . . . . . . . 1. The Eikonal Method . . . . . . . . . . . . 2. The Lie Algebraic Method . . . . . . . . . . 3. The Trajectory Method . . . . . . . . . . . 4. Comparison of Methods . . . . . . . . . . . VI. The Symplectic Classification of Geometric Aberrations . . . A. Aberrations and Lie Transformations . . . . . . . . B. Aberrations and Paraxial Approximation . . . . . . . C. Lie Algebra h2 . . . . . . . . . . . . . . D. Representation of h2 on the Space of Complex Polynomials . E. Example: Decomposition of Polynomial Space up to Fourth Order 1. Polynomials of the Third Order . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
242 245 245 248 249 250 251 251 252 257 257 258 263 265 267 272 273 273 276 280 282 284 288 295 295 296 297 297 298 302 304 309 310 311 312 315 316 322 322
241 ISSN 1076-5670 DOI: 10.1016/S1076-5670(07)00404-1
Copyright 2008, Elsevier Inc. All rights reserved.
242
ˇ RADLI CKA
2. Polynomials of the Fourth Order . . . . . . . . . . . . . . F. Representation of h2 on the Real Space of Polynomials . . . . . . . . G. Example: Real Polynomials up to the Fourth Order . . . . . . . . . 1. The Third-Order Polynomials . . . . . . . . . . . . . . 2. Polynomials of the Fourth Order . . . . . . . . . . . . . . . . . . . . . . H. Decomposition of Polynomials with Respect to M1 I. The Third-Order Axially Symmetric Aberrations . . . . . . . . . J. Reflection Symmetry . . . . . . . . . . . . . . . . . . K. Changing Parameterization . . . . . . . . . . . . . . . . VII. Axial Symmetric Aberrations of the Fifth Order . . . . . . . . . . . A. Axial Symmetric Polynomial of the Sixth Order . . . . . . . . . . B. Algebra of the Axial Symmetric Polynomials up to the Sixth Order . . . . C. Interaction Hamiltonian of a Round Magnetic Lens . . . . . . . . . D. Calculation of g4 and g6 . . . . . . . . . . . . . . . . . E. Spherical Aberration of the Fifth Order . . . . . . . . . . . . F. Distortion: Seidel weight −5/2 . . . . . . . . . . . . . . . G. Coma: Seidel Weight 3/2 . . . . . . . . . . . . . . . . H. Peanut Aberration: Seidel Weight 1/2 . . . . . . . . . . . . . I. Elliptical Coma: Seidel Weight −1/2 . . . . . . . . . . . . . J. Astigmatism and Field Curvature: Seidel Weight −3/2 . . . . . . . . Appendix A. The Hamiltonian Transformation . . . . . . . . . . . Appendix B. The Form of the Interaction Hamiltonian for the Round Magnetic Lens References . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . .
323 324 326 326 327 328 330 332 334 338 339 340 342 344 347 349 351 353 354 355 355 357 360
I. I NTRODUCTION Calculations of electron optical systems are important for the design of electron microscopes, lithography, or other devices. The optical properties are calculated in the approximation of geometrical optics; quantum effects are in most cases negligible. The standard computation methods come from either the trajectory equation, which is derived from the equation of motion for a particle in an electromagnetic field, or from the eikonal equation. Unfortunately, neither equation is analytically solvable for other than trivial problems. The paraxial approximation provides the first insight into the properties of an optical system. This linear approximation of the trajectory equation is also used in light optics. The description of the paraxial properties is essential for the basic system design. Unfortunately, the omission of higher-order terms in the trajectory equation is much more problematic than in light optics. In light optics the maximal resolution is given by the wavelength of light and it varies between 400 and 800 nm. Objects of these dimensions cannot be observed with this technology. On the other hand, the wavelength of electrons is on the order of magnitude of pm depending on energy. Hence, in electron optics the resolution is created mainly by the influence of the nonlinear terms in the trajectory equation.
LIE ALGEBRAIC METHODS
243
A consequence of the nonlinear terms in the trajectory equation is that a point is not imaged on a point by the optical system. This property of optical systems is already known from light optics, where it was described by Seidel in 1856 (Seidel aberrations). However, the systems in electron optics contain more complicated elements; therefore, the nonlinear properties of these systems are also more complicated: elements with lower than axial symmetry often are used in electron optical systems, and because of the magnetic field the electron rays are not perpendicular to the wave surfaces. As a consequence, aberrations with no analogy in light optics exist in the field of electron optics. For these reasons it was necessary to develop new perturbation methods that are suitable for the description of nonlinear properties of electron optical systems. These methods are represented by the trajectory method, the eikonal method, and the Lie algebraic method. All methods are analytical perturbation methods, each of which is from one point of view advantageous and from the other disadvantageous. The trajectory method is based on an iterative solution of trajectory equations, the eikonal method is derived from the perturbation method for the eikonal, and the Lie algebraic method is based on the canonical perturbation method. The trajectory method and the eikonal method have been used from the early days of electron optics. Both methods were introduced into electron optics in the 1930s—the trajectory method by Scherzer (1933) and the eikonal method by Glaser (1935). The eikonal method is used for optical calculation of more complicated systems; the method was pedagogically explained by Hawkes and Kasper (1989) in the textbook of electron optics; the practical use of the method was shown by Rose (2004) in the calculation of aberration correctors. An alternative method was developed by Dragt et al. at the University of Maryland in 1980s. The basic concepts of the Lie algebraic method were published in a series of articles (Dragt, 1987; Dragt and Forest, 1986a, 1986b; Dragt et al., 1988). The method was developed as a tool for description of nonlinear behavior in accelerators where the global stability of the system is the most important property. This problem is connected with the theory of dynamic systems in which the Hamiltonian formalism is a very common and powerful tool. The stability of the phase space according to the transfer map is investigated and the normal form method is the standard tool used for the description of the stability of the system (Bazzani, 1988; Bazzani et al., 1994). Even if the method is used primarily in the accelerator community, its good properties in high-order aberration calculations led to its application in electron optics. First, the axially symmetric magnetic field was calculated (Dragt, 1987; Dragt and Forest, 1986a); later still came the deflection systems (Hu and Tang, 1998, 1999). The goal of all these works was to find new
244
ˇ RADLI CKA
results, mainly for high-order aberrations. There is no work that compares the Lie algebra method with the standard methods in electron optics. This is the main aim of our work here. First, we introduce each method and then compare the procedures for solving a simple round lens. The second chapter contains basic approaches in description of the properties of optical systems, particularly different forms of the trajectory equation and its derivations. In the third chapter the basic properties of fields that are used in charged particle optics are summarized. Particular forms of series expansions of fields in the vicinity of the optic axis are also presented. The next chapter deals with the method of solution of trajectory equations. The majority of the chapter describes the paraxial approximation for monoenergetic and general dispersion case. The form of the solution is described both in Lagrangian and Hamiltonian variables. A procedure for a general case is outlined at the end of the chapter. The fifth chapter provides an overview of the perturbation methods. After a short introduction of the differential algebraic method, the trajectory eikonal and Lie algebraic methods are presented in more details. We show that the Lie algebraic method is based on the canonical perturbation theory, and we demonstrate a significant relationship between the factorization theorem and the canonical perturbation theory. We also extend the Lie algebraic method to the parameterization by the position in the object and aperture plane, which often is used in electron microscopy. All mentioned perturbation methods are applied to a simple but important example—a round magnetic lens. The sixth chapter describes the structure of aberration polynomials according to representation of Sp(2, R). The way in which the classification is connected to the form of the paraxial approximation affects the relationship among the aberration coefficients. Because of the complexity of general system classification, we concentrate on the classification of stigmatic systems. The classification of the aberration polynomials is described as a representation of the Lie group adjoint to the algebra of the quadratic polynomials that is determined by the quadratic part of Hamiltonian. This representation is explicitly presented. The last sections cover the description of the symmetry of the aberration polynomials with respect to the reflection and changing of the parameterization. The last chapter deals with description and calculation of the fifth-order geometric aberrations of an axial symmetric magnetic lens. We present the analytical form of all coefficients. The results are applied to a simple magnetic lens, and the coefficient value for the spherical aberration is compared with results calculated by ray tracing.
LIE ALGEBRAIC METHODS
245
II. T RAJECTORY E QUATIONS The motion of particles in an electromagnetic field is completely determined by the equations of motion. Unfortunately, their mathematical form does not describe the optical properties in an appropriate manner. The use of the time parameterization when only the trajectories of particles are relevant leads to clumsy formulations without direct optical meaning. For this reason, the time parameterization of trajectories is replaced by other parameterization (e.g., by the optic axis position). The equations of motion that use time derivatives are then replaced by the trajectory equation, which contains derivatives with respect to the new independent variable. The solutions of such equations are just particle trajectories—not the motion of particles; however, if the rays are known, the velocity of a particle in a given point can be easily calculated. The coordinate axis z of the used coordinate system coincides with the optic axis, and coordinates x and y form the transversal space. The optical devices are constructed so that the particle with given energy that starts to move on the optic axis with 0 slope remains on the axis. Such a particle is known as a design particle and its energy as design energy. The design trajectory term is used for the trajectory of the design particle. Unless stated otherwise, we restrict ourselves to systems with straight optic axis. The treatment of the trajectory equations differs according to which formulation of mechanics is used. We will review some of them. A. Newton Formulation Newton’s second law, relativistically modified for an electron in an electromagnetic field (Jackson, 1998), d (γ r˙ ) = −eE − e˙r × B, (1) dt completely describes the system evolution. The standard notation for time derivative is used, γ = (1 − v 2 /c2 )−1/2 , r = (x, y, z), the electron charge −e = −1.602 · 10−19 C, and mass of the electron m = 9.109534 · 10−31 kg. Now we switch to the parameterization of the trajectory by the axis coordinate z. Only time independent forces are considered. We begin with the time-parameterized trajectory ζ (t) = (x(t), y(t), z(t)). Let us suppose that it can be reparameterized by the axis coordinate z as ζ (z) = (x(z), y(z), z). The time derivative of the trajectory is then expressed as dζ dz(t) dζ dζ = = vz , dt dz dt dz m
246
ˇ RADLI CKA
or more generally, the time derivative along a trajectory takes the form d d = vz . dt dz The z-component of velocity reads
(2)
v vz = , 1 + q 2
(3)
with notation q = (x, y)T for the vector of transversal deviations of the d trajectory from the optic axis and q = dz q. The velocity v is in direct relationship with the kinetic energy of the particle. It can be expressed using scalar potential Φ. If we suppose that all particles have the same energy, the additive constant in Φ can be chosen so that the value eΦ coincides with the kinetic energy of the particle, T = eΦ.
(4)
Because the total energy can be written in the form E = γ mc2 = T + mc2 ,
(5)
using Eq. (4) the relativistic factor γ may be written eΦ . mc2 The kinetic momentum and energy are connected by the relation γ =1+
E2 − g 2 = m2 c 2 ; c2 hence, using Eqs. (7), (5), and (4), we can write g= where η =
e ∗1 Φ 2, η
(6)
(7)
(8)
√ e/2m and the acceleration potential e Φ Φ∗ = Φ 1 + 2mc2
(9)
were introduced. Now we can express the velocity from the kinetic momentum 1
eΦ ∗ 2 g = v= mγ ηγ m
(10)
LIE ALGEBRAIC METHODS
247
and using Eq. (3), the z component of velocity can be found: 1
eΦ ∗ 2 . vz = mηγ (1 + q 2 ) From Eq. (2), we find 1
d eΦ ∗ 2 d = . dt mηγ 1 + q 2 dz
(11)
Substituting into Eq. (1) yields (after some trivial calculations) the trajectory equations 1 1 2 d Φ∗ 1 + q 2 2 1 By − y Bz . (12) q = γ ∇Φ + η −Bx + x Bz dz 2 Φ∗ 1 + q 2 Note that this form of trajectory equations can be used only if the trajectory can be parameterized by the axis coordinate; otherwise, as in case of an electric mirror, some other parameterization must be adopted (Hawkes and Kasper, 1989). A new quantity δ with the meaning of the particle energy deviation from the design particle must be introduced for a beam of particles with different energy. As the system is time independent, δ remains constant along each ray. For the scalar potential to be determined uniquely, eΦ must coincide with the kinetic energy of the particle that has design energy. The relationship between the relativistic factor γ and the scalar potential Φ is then modified to γ =1+
δ δ eΦ + = γ0 + 2 2 mc mc mc2
(13)
and Eq. (8) similarly 1 2 δ2 e γ0 δ ∗ Φ + + g= . η e 2mec2
(14)
The subscript 0 denotes the value for particles with the same energy as the design particle, that is, eΦ . mc2 The form of the trajectory equation (12) remains unchanged; only the acceleration potential must be replaced by γ0 = 1 +
δ2 γ0 δ + , e 2mec2 and the relativistic factor γ by Eq. (13). Φ∗ +
(15)
ˇ RADLI CKA
248
B. Lagrangian Approach The particle trajectories in time parameterization are found as extremals of the functional t2 S=
L r(t), r˙ (t) dt,
(16)
t1
called action. In the case of an electromagnetic field the function L, the Lagrangian, reads (Jackson, 1998)
2 v L = mc2 1 − 1 − 2 − e(vA − Φ), (17) c where A denotes the vector potential. Because we assume that the Lagrangian does not explicitly depend on time, instead of direct reparameterization, the Maupertius principle (Goldstein, 1980) can be used. The trajectories are then found as the extremals of the functional t2 S=
p˙r dt =
t1
p dq,
(18)
ζ
where the canonical momentum is defined as ∂L(r, v) = g − eA. (19) ∂v Substituting Eq. (19) into Eq. (18) and using z-parameterization yields p=
zi S=
1 1 Φ ∗ 2 1 + q 2 2 − η Ax x + Ay y + Az dz.
(20)
zo
The physical interpretation of the integrand is an index of refraction, which is effectively appropriate for an anisotropic nonhomogeneous medium. Let us denote it 1 1 M(q, q , z) = Φ ∗ 2 1 + q 2 2 − η(Ax x + Ay y + Az ). (21) The trajectory equations found as equations for the extremals of Eq. (20) ∂M d ∂M =0 − dz ∂q ∂q
(22)
LIE ALGEBRAIC METHODS
249
take the form 1 1 2 1 Φ∗ 1 + q 2 2 d By − y Bz , q = γ ∇Φ + η −Bx + x Bz dz 2 Φ∗ 1 + q 2 which are equivalent to Eq. (12). Even though the final trajectory equation is identical to the trajectory equation in Newtonian formulation, the advantage of this approach is that it shows the analogy between light optics represented by Fermat’s principle and electron optics. Moreover, the Lagrangian perturbation methods can be used in the aberration computation. The extension to the case in which the particles have different energy is completely analogous to the description of such systems in Newton formulation. The action to minimalize takes the form 1 zi 2 1 δ2 γ0 δ ∗ 2 2 S= Φ + + − η(Ax x + Ay y + Az ) dz, 1+q e 2mec2 zo (23) and its extremals are solutions of the equations equivalent to the trajectory equation extended to cases when the particles have different energy. Note that setting δ = 0 reduces Eq. (23) to Eq. (20). C. Hamiltonian Approach The relativistic electron moving through an electromagnetic field is described by the Hamiltonian (Jackson, 1998) H = m2 c4 + c2 (p + eA)2 − eΦ − mc2 . (24) The value of the Hamiltonian along a phase-space trajectory is equal to δ—the deviation of the particle energy from the design energy. The phase-space trajectories in time parameterization can be found as extremals of the action t2 S= (25) p˙r − H (r, p, t) dt = p dr − H (r, p) dt. t1
ζ
With notation pt = −H the action in z-parameterization of the trajectory reads zi S= (26) pq + pt t − K(q, p, pt , z) dz, zo
ˇ RADLI CKA
250
where henceforth p and A denote transversal part of momentum p = (px , py )T and vector potential A = (Ax , Ay )T , respectively. The function K, the solution of
H(px , py , −K, x, y, z) = −pt , has the meaning of the Hamiltonian in z parameterization and takes the form e2 p2 (27) K = − 2 Φ ∗ + 2t − 2mpt γ0 − (p + eA)2 + eAz . η c The time and pt play role of canonical variables in z-parameterization. It is convenient to choose phase-space variables so that they all vanish on the design trajectory. Evidently this is true for all of them except t, which must be replaced by new variable τ = c(t − z/v0 ). It causes a change of the canonical conjugate variable pt to pτ = pt /c. This change is represented by the canonical transformation described by the generating function pτ z . (28) F2 = qp˜ + cpτ t − β0 The canonical variables q and p do not change with the transformation, and for convenience we drop the use of tilde ( ˜ ). The new Hamiltonian denoted H reads e2 pτ . (29) H = − 2 Φ ∗ + pτ2 − 2mcpτ γ0 − (p + eA)2 + eAz − β0 η The trajectory equations can be found as equations for extremals of Eq. (26), which take the form of standard Hamilton equations ∂H , ∂p ∂H , τ = ∂pt
q =
∂H , ∂q ∂H = 0. pτ = − ∂t p = −
(30a) (30b)
The case when the energy of particles does not differ and forces do not depend on time is described by the Hamiltonian with pτ = 0 substituted into Eq. (29). D. Hamilton–Jacobi Approach The Hamilton–Jacobi approach is based on the fact that motion can be described as a canonical transformation that transforms the initial state into the state with given z. Finding such a transformation is equivalent to solving the trajectory equations. In transformed coordinates the evolution is
LIE ALGEBRAIC METHODS
251
an identity, the consequence of which is vanishing of the Hamiltonian. This leads to the Hamilton–Jacobi equation (Goldstein, 1980) ˜ z) ∂F2 (q, p, ∂F2 ,z + = 0, (31) H q, ∂q ∂z the solution of which is the generating function of the desired canonical ˜ p, ˜ z) of the transformation and the trajectories are solutions q = q(q, algebraic equation q˜ =
∂F2 , ∂ p˜
p=
∂F2 , ∂q
(32)
where q˜ and p˜ play the role of the initial conditions. Generally, solving the partial differential equation (31) is more complicated than solving the trajectory equation; however the formalism of canonical transformations is very suitable for the perturbation calculus. The idea is not to compensate the entire Hamiltonian at once [as in Eq. (31)], but to do so step by step, for only those parts that contribute to a given perturbation order. With slight modification, we use this approach in the Lie algebraic method.
III. T HE F IELD C OMPUTATION Except for some trivial assumptions, we have not described the field in which the electrons are moving; therefore, we first summarize basic properties and forms of the electromagnetic field. Because the applied fields are static in the majority of practical optical devices, we can restrict our attention here to the stationary field. A. The Basic Equations The electromagnetic field is described by the Maxwell equations, which in the case of stationary fields reduce to (Jackson, 1998) ∇ × E = 0,
∇ × H = j,
(33a)
∇D = ρ,
∇B = 0.
(33b)
These equations are completed by the material equations D = E,
B = μH (H = νB).
(34)
In ferromagnetic materials the reluctance ν is a function of B = |B|. The space charge density and current density are regarded as functions of position.
ˇ RADLI CKA
252
The source-free Maxwell equations allow us to introduce electromagnetic potentials E = −∇Φ,
B = ∇ × A.
The scalar potential can then be found as the solution of −∇ (r)∇Φ = ρ,
(35)
(36)
which reduces to ∇ 2Φ = 0
(37)
in domains free of space charge. Similarly, the equation for the vector potential can be found as ∇ × ν |∇ × A| ∇ × A = j, (38) which reduces for constant ν and gauge ∇A = 0 to ∇ 2 A = μj.
(39)
Yet another simplification is possible in vacuum current-free domains, where ∇ × H = 0. In such cases, it is always permissible to write B(r) = −∇W (r),
(40)
where W (r) is the scalar magnetic potential. Since ∇B = 0, the scalar magnetic potential satisfies the Laplace equation ∇ 2 W = 0.
(41)
The simplification achieved lies in the fact that only one scalar differential equation is to be solved instead of three coupled equations. B. The Field in the Vicinity of the Optic Axis The region where the field will be computed is a vacuum source-free domain; hence, Eqs. (37) and (41) can be used for the field calculation. The procedure is similar for the electric and magnetic fields; therefore, we show it for the electric case, whereas the magnetic field is only briefly summarized. Using the standard separation of variables the solution of the Poisson equation in cylindrical coordinates can be found in the form Φ=
∞ m=0
Φm (r, z, ϕ) =
∞ m=0
Φm,s (r, z) sin(mϕ) + Φm,c (r, z) cos(mϕ), (42)
253
LIE ALGEBRAIC METHODS
where Φ0 is an axial symmetric field, Φ1 is a dipole field, Φ2 is a quadrupole field, and so on. For each Φm,c and Φm,s can be found (Venturini and Dragt, 1999) ∞ Φm,α (r, z) =
(α = c, s),
dk eikz Im (kr)am,α (k)
(43)
−∞
with Im is the modified Bessel function Im (x) = −i−m Jm (ix) =
∞
x 2n+m
n=0
22n+m n!(n + m)!
(44)
and am,α represents some function of k. Hence, Φm,α =
∞ n=0
r 2n+m 2n+m 2 n!m!
∞ dk eikz k 2n+m am,α (k).
(45)
−∞
When we introduce the functions 1 cm,α (z) = m 2
∞ dk eikz k m am,α (k),
(46)
dk eikz k 2n+m am,α (k),
(47)
−∞
the derivatives of which read (2n) (z) cm,α
(−1)n = 2m
∞ −∞
Eq. (45) can be written in the form Φm,α
∞ (2n) (−1)n cm,α (z) 2n+m = r . n!(n + m)!22n
(48)
n=0
We will next show the form of the standard multipole fields, starting from the axially symmetric field. In this case, the sum in Eq. (42) consists of only one member with m = 0, that is, Φ = Φ0 =
(2n) ∞ (−1)n c0,c (z) n=0
n!2 22n
r 2n .
(49)
(2n)
The meaning of the coefficient c0,c can be easily found φ(z) = Φ(0, z, φ) = c0,c ;
(50)
ˇ RADLI CKA
254 hence, we can write Φ = Φ0,c =
∞ (−1)n φ (2n) (z) n=0
n!2 22n
r 2n .
(51)
The form of the dipole field is more complicated; in this case m = 1 and Eq. (42) reduces to Φ = Φ1,c cos(ϕ) + Φ1,s sin(ϕ) ∞ (2n) (−1)n (2n) r 2n+1 c1,c cos(ϕ) + c1,s sin(ϕ) = 2n n!(n + 1)!2 n=0
=
∞ n=0
(2n) (−1)n (2n) r 2n c1,c x + c1,s y . 2n n!(n + 1)!2
The meaning of the coefficients can be found similarly ∂Φ1 F1 = Ex (z) = − = −c1,c , ∂x x=y=0,z ∂Φ1 = −c1,s . F2 = Ey (z) = − ∂y
(52)
(53)
x=y=0,z
For quadrupole fields, m = 2 and Φ(r, z, ϕ) = Φ2 (r, z, ϕ) ∞ (2n) (−1)n (2n) = r 2n+2 c2,c cos(2ϕ) + c2,s sin(2ϕ) . 2n n!(n + 2)!2
(54)
n=0
Because
and
r 2 sin(2ϕ) = 2r 2 sin ϕ cos ϕ = 2xy
(55)
r 2 cos(2ϕ) = r 2 cos2 ϕ − sin2 ϕ = x 2 − y 2 ,
(56)
Eq. (54) reads Φ2 (r, z, ϕ) =
∞ (−1)n (x 2 + y 2 )n n=0
Because
n!(n + 2)!22n
∂ 2 Φ2 p2 = = 2c2,c , ∂x 2 x=y=0,z
(2n) (2n) c2,c x 2 − y 2 + 2c2,s xy . (57)
∂ 2 Φ2 q2 = = 2c2,s , (58) ∂x∂y x=y=0,z
255
LIE ALGEBRAIC METHODS
the quadrupole field reads Φ2 (r, z, ϕ) =
∞ (−1)n (x 2 + y 2 )n 1 n=0
n!(n + 2)!22n
p2(2n) x 2 2
−y
2
+ q2(2n) xy
.
(59)
A similar approach can be used for higher-orders multipoles. The magnetic scalar potential also satisfies the Poisson equation; hence, the form of the expansion will be similar to the expansion for the electrostatic potential, that is, W (r, z, ϕ) =
∞
Wn (r, z, ϕ),
(60)
n=0
where Wm (r, z, ϕ) ∞ (2n) (−1)n (2n) = r 2n+m dm,c (z) cos(mϕ) + dm,s (z) sin(mϕ) . (61) 2n n!(n + m)!2 n=0
The meaning of the coefficients dm,α can be simply found d0,s = 0, d0,c = − B(0, 0, z) dz, d1,c = −Bx (0, 0, z) =: −B1 , 1 ∂ 2 W d2,c = =: P2 , 2 ∂x 2 (0,0,z)
(62a)
d1,s = −By (0, 0, z) =: −B2 , (62b) 1 ∂ 2 W d1,s = =: Q2 . (62c) 2 ∂x∂y (0,0,z)
A knowledge of the vector potential is necessary for Lagrangian and Hamiltonian formulation. We will derive the form for only the axial symmetric case. The vector potential in the general case in appropriate gauge can be found in Hawkes and Kasper (1989): 1 2 y 2 Ax = − B − x + y B 2 8 1 4 xy 1 3 1 x − y 4 B2 − B1 + x y + xy 3 B1 + x 2 − y 2 B2 − 4 48 2 24 1 3 1 3 − (63a) x − 3xy 2 Q2 − y − 3x 2 y P2 , 12 12 1 x B − x 2 + y 2 B Ay = 2 8 1 4 xy 1 3 1 2 B2 − + x − y 2 B1 − x − y 4 B1 + x y + xy 3 B2 4 48 2 24
ˇ RADLI CKA
256
1 3 1 3 x − 3xy 2 P2 + y − 3x 2 y Q2 , (63b) 12 12 1 Az = −xB2 + yB1 + x 2 + y 2 (xB2 − yB1 ) 8 1 4 1 3 1 2 + x − y 2 Q2 − xyP2 − x − y 4 Q2 + x y + xy 3 P2 2 24 12 1 3 1 3 2 2 − x − 3xy Q3 − y − 3x y P3 . (63c) 6 6 Now we derive the axially symmetric vector potential up to higher orders, because it will be needed later. The axial symmetry of the system implies that all components of the vector potential except Aϕ vanish; that is, A = Aϕ eϕ ; hence, using Eq. (35) −
Br = −
∂Aϕ ∂z
(64)
and using Eqs. (40) and (61) ∞ n(−1)n 2n−1 (2n−1) r d0,c (z) Aϕ = − Br dz = ∂r W0 dz = (n!)2 22n−1 n=1
=
∞
(−1)n
n=0
(n + 1)!n!22n+1
r 2n+1 B (2n) (z) =
1 rΠ (z, r), 2
(65)
where Π (z, r) =
∞ (−1)n B (2n) 2n r . (n + 1)!n!22n
(66)
n=0
In Cartesian coordinates the vector potential reads 1 Ax = − sin ϕAϕ = − yΠ z, q2 , 2 1 Ay = cos ϕAϕ = xΠ z, q2 , 2 Az = 0.
(67a) (67b) (67c)
The last yet not the least trivial issue relates to determining the axial field distribution. It can be obtained from numerical solution of Maxwell equations. Three common methods exist—the boundary elements method (BEM) (Harrington, 1967); the finite difference method (FDM) (Rouse and Munro, 1989); and the finite elements method (FEM) (Khursheed, 1999; Lencová, 1995). Each method computes the field values in nodal points; the approximate field value in any point can then be computed using some
LIE ALGEBRAIC METHODS
257
of the spline methods (Barth et al., 1990; Press et al., 1986). However, the computation of potential derivatives is more sophisticated; in fact, the accuracy of the methods used is not good. This creates other sources of errors in analytical calculation of aberrations. Several methods of calculating the high-order derivatives of the axial potential exist. Munro et al. (MEBS, 2007; Wang et al., 2004) use the expansion into the basis of Hermite functions. Matsuya, Saito and Nakagawa (1995) used the expansion into another basis. Berz (1999) introduced the method based on Gaussian wavelet transformation, which Liu (2006) used for the calculation of an electrostatic lens. The methods mentioned do not use any property that the field fulfills. Conversely, Venturini and Dragt (1999) proposed for magnetic fields a way of using the fact that the field is a solution of the Poisson equation. A similar method was introduced by Manikonda and Berz (2006) and was implemented in the computer code COSY INFINITY (Michigan State University, Department of Physics and Astronomy, East Lansing, MI, USA) (COSY INFINITY, 2007).
IV. T RAJECTORY E QUATIONS : S OLUTION M ETHODS We have shown that the forms of the trajectory equations are similar independent of the manner in which they are derived or of the coordinates used. From the mathematical point of view, they are the set of two nonlinear ordinary differential equations of the second order with variable coefficients in the case of Lagrangian coordinates, which is equivalent to the set of four ordinary differential equations of the first order in the Hamiltonian case. In general, no analytic form of the solution exists; hence, perturbation or numerical methods must be used. Although the numerical solution relying mostly on the Runge–Kutta algorithm is the easiest and most exact, the methods that allow the solution to be found in the form of a polynomial in initial conditions are very suitable. We describe the features of these two classes of methods in this chapter. A. Paraxial Approximation The particles moving close to the optic axis are well described by the paraxial approximation. This approach is based on the fact that the deviation of the particle trajectories from the axis and their slopes are so small that their second and higher powers can be neglected. It is the standard assumption of Gaussian optics. First, we will presuppose that the energy of particles does not differ. Similar to the case of the trajectory equation, two approaches can be used here: the direct linearization of the trajectory equation or the Hamiltonian approach. We review both of them.
ˇ RADLI CKA
258
1. Paraxial Approximation from the Trajectory Equation The paraxial trajectory equations emerge by linearization of general trajectory Eq. (12), d ∗ 1 F1 (F1 x + F2 y) γ Ex + η(By − y Bz ) − , φ 2x = − 1 3 dz 2φ ∗ 2 4φ ∗ 2
(68a)
γ Ey d ∗ 1 F2 (F1 x + F2 y) + η(x Bz − Bx ) − , φ 2y = − 1 3 ∗ dz 2φ 2 4φ ∗ 2
(68b)
where 1 φ x + F1 − p2 x − q2 y, 2 1 Ey = φ y + F2 + p2 y − q2 x, 2 Ez = −φ
Ex =
and 1 Bx = − B x + B1 − Q2 y − P2 x, 2 1 By = − B y + B2 − Q2 x + P2 y, 2 Bz = B. eφ Henceforward γ relates to the design particle, γ = 1 + mc 2 . Having the orientation of dipoles and quadrupoles selected such that x–z plane is the symmetry plane of the electric dipole and quadrupole fields and the antisymmetry plane of the magnetic dipole and quadrupole fields, F2 , q2 , B1 , and P2 vanish. This reduces the paraxial approximation to F12 η 1 γ φ γφ γp2 ηQ2 x + B − + + y + y B x + ∗ x + 1 1 2φ 4φ ∗ 2φ ∗ 4φ ∗2 φ∗ 2 φ∗ 2 2 γ F1 ηB2 =− ∗ + 1, (69a) 2φ φ∗ 2 η 1 γ φ γφ γp2 ηQ2 B y + ∗ y + y − + − x + x B = 0. 1 1 2φ 4φ ∗ 2φ ∗ φ∗ 2 φ∗ 2 2 (69b) The linear differential equations of the second-order (69a) and (69b) are not separated and homogeneous. The term that causes the mixing of coordinates generates the rotation of particles around the z-axis; it can be eliminated by
LIE ALGEBRAIC METHODS
transition to rotating coordinates X x = Rˆ , Y y where Rˆ =
cos Θ − sin Θ
sin Θ cos Θ
259
(70) (71)
represents the rotation through an angle η Θ(z) = 2
z
B(z) 1
zi
φ ∗ 2 (z)
dz.
In such coordinates the paraxial trajectory equations read F12 γ φ γ φ η2 B 2 X + + X + ∗ X + 2φ 4φ ∗ 4φ ∗2 8φ ∗2 2 F1 γp2 ηQ2 cos(2Θ)X − sin(2Θ)Y + − ∗ + 1 ∗2 2φ 8φ φ∗ 2 γ F1 ηB2 cos Θ, − = ∗ 12 2φ ∗ φ F12 γ φ γ φ η2 B 2 Y + ∗Y + + + ∗2 Y 2φ 4φ ∗ 4φ ∗2 8φ 2 F1 γp2 ηQ2 − sin(2Θ)X + cos(2Θ)Y − + 1 2φ ∗ 8φ ∗2 φ∗ 2 ηB2 γ F1 =− sin Θ. − 1 2φ ∗ φ∗ 2
(72a)
(72b)
Unfortunately, the equations are still neither separated nor homogeneous, because the field considered is too general for real optic devices, which contain fields restricted by given requirements. Some of the requirements for the use of most of these devices are considered here. The stability condition for the design particle trajectory is the first such requirement. It is conditioned by vanishing of the right-hand sides of Eqs. (72a) and (72b)—that is, B2 −
γ F1 2ηφ
∗ 12
= B2 −
F1 = 0, vo
(73)
ˇ RADLI CKA
260
F IGURE 1.
The paraxial rays in a magnetic lens using standard and rotating coordinates.
which forms a relationship between magnetic and electric dipole fields—the Wien condition. Because the paraxial trajectory equations (72a) and (72b) are not separated, another natural restriction is to have a field in which they would be. Two such situations exist. The first represents systems in which no axial magnetic field is present. Thus, Θ = 0, which leads to the paraxial trajectory equations X +
F12 γ φ γp2 ηQ2 X = 0, + − + 1 4φ ∗ 2φ ∗ 4φ ∗2 φ∗ 2 γ φ γφ γp2 ηQ2 Y + ∗ Y + Y = 0. + − 1 2φ 4φ ∗ 2φ ∗ φ∗ 2
γ φ X + 2φ ∗
LIE ALGEBRAIC METHODS
261
The second and more interesting situation occurs when F12 γp2 ηQ2 − ∗ + = 0. (74) 1 2φ 8φ ∗2 φ∗ 2 The paraxial trajectory equations are then separated; moreover, their form is the same for X and Y coordinates: F12 γ φ γ φ η2 B 2 (75) + + ∗2 Q = 0, Q + ∗Q + 2φ 4φ ∗ 4φ ∗2 8φ where the vector Q defined as Q = (X, Y )T was used. A consequence is that an electron initially traveling on any surface αX + βY = 0 remains on this surface. Such types of systems are called stigmatic, and the condition expressed in Eq. (74) is known as the stigmatic condition. We describe only stigmatic systems fulfilling the Wien condition unless stated otherwise. This form of the paraxial equations occurs later; therefore, it is suitable to define a linear operator Pˆ1 : F12 γ φ γ φ η2 B 2 f. (76) + + Pˆ1 (f ) = f + ∗ f + 2φ 4φ ∗ 4φ ∗2 8φ ∗2 Eq. (75) thus is then equivalent to Pˆ1 (Q) = 0. The trajectory equations may be written in more compact form using Picht’s transformation 1
Qp = φ ∗ 4 Q,
(77)
which transforms the trajectory equations into F12 (2 + γ 2 )φ 2 η2 B 2 Qp + + + ∗2 Qp = 0. (78) 4φ ∗ 16φ ∗2 8φ The result is of great interest for two reasons. First, it is simpler to perform numerical calculations with this approach than with Eqs. (69a) and (69b). Secondly, as φ ∗ 0, the coefficient F12 (2 + γ 2 )φ 2 η2 B 2 + + 4φ ∗ 16φ ∗2 8φ ∗2 in Eq. (78) is essentially nonnegative, which imposes an interesting restriction on stigmatic electron lenses: they always exert a converging action. Using numerical calculations the solution for Eq. (75) can be found in the form of a block matrix g(z)1ˆ h(z)1ˆ Qo Q(z) , (79) = Qo Q (z) g (z)1ˆ h (z)1ˆ G(z) =
ˇ RADLI CKA
262
where 1ˆ = 10 01 , subscript o means that the value of a function it evaluated in the object plane z = zo , and g(z) or h(z) fulfill Pˆ1 (g) = 0, Pˆ1 (h) = 0,
g(zo ) = 1, g (zo ) = 0, h(zo ) = 0, h (zo ) = 1.
The solution in the original coordinates then takes form g(z)1ˆ h(z)1ˆ qo 0 Rˆ −1 q = , qo −Θ JˆRˆ −1 Rˆ −1 q g (z)1ˆ h (z)1ˆ where Jˆ is the standard symplectic matrix 0 1 ˆ J = . −1 0
(80)
(81)
The solutions in Eqs. (79) and (80) are parameterized by the position of the ray in object plane and its slopes. Similarly, we can parameterize the rays using the position in the object plane and the position in the aperture plane z = za . Let us choose the pair of independent solutions s(z) and t (z) that fulfil s(zo ) = 1, s(za ) = 0, Pˆ1 (s) = 0, t (zo ) = 0, t (za ) = 1. Pˆ1 (t) = 0, The solution in rotating coordinates then reads Q(z) s(z)1ˆ t (z)1ˆ Qo , = Q (z) s (z)1ˆ t (z)1ˆ Qa and in original coordinates takes form q 0 Rˆ −1 s(z)1ˆ t (z)1ˆ qo = . q −Θ JˆRˆ −1 Rˆ −1 s (z)1ˆ t (z)1ˆ qa In the first case, we can write the Wronskian function as ∗ − 1 ∗ − 1 2 2 φ φo Wg = g(z)h (z) − h(z)g (z) = Wg (zo ) o∗ = , ∗ φ φ whereas in the second case, the Wronskian reads ∗ − 1 2 φ W = s(z)t (z) − t (z)s (z) = Ws (zo ) o∗ . φ In both cases the quality 1
φ ∗ 2 (z)W (z) = const is invariant.
(82)
(83)
(84)
(85)
(86)
LIE ALGEBRAIC METHODS
263
2. The Hamiltonian Formulation of the Paraxial Approximation Unlike the expansion of the trajectory equation used in the previous paragraph, the Hamiltonian approach is based on the expansion of the Hamiltonian into polynomials in canonical variables. The second-order terms of the expanded Hamiltonian completely describe the paraxial properties. Because we are considering the monochromatic systems, pτ = 0 is substituted into Eq. (29). Using the same restriction on dipole and quadrupole fields as in the previous paragraph, the expansion up to the second order takes the form e 1 (87a) H0 = − φ ∗ 2 , η F1 (87b) − B2 x, H1 = e v0 eF12 η ηB 1 eγ0 φ eηB 2 2 H2 = q2 p + Lz + + + ∗ 12 ∗ 12 ∗ 12 ∗ 12 ∗ 32 2 2eφ 2φ 4ηφ 4φ 8ηφ 2 1 eF1 eγ0 p2 (87c) + − + eQ2 x 2 − y 2 . ∗ 12 2 8ηφ ∗ 32 2ηφ The notation Lz = xpy − ypx for the z-component of angular momentum L was used. The zero-order part of the Hamiltonian does not contribute to the equations of motion, and vanishing of the term H1 is equivalent to fulfilling Wien’s condition. The transition into rotating coordinates is represented by the extended canonical transformation ˆ Q = Rq, η ˆ P = Rp, e
(88)
where Rˆ is given in Eq. (71), which transforms the Hamiltonian according to ∂F2 η ˜ H q(Q, P, z), q(Q, P, z), z + , H = e ∂z where the generation function F2 reads ˆ F2 = (Rq)P. Applying this to the quadratic part of the Hamiltonian yields F12 1 1 γ φ η2 B 2 2 ˜ Q2 P + + + H2 = ∗ 12 ∗ 12 ∗ 12 ∗ 32 2 2φ 4φ 4φ 8φ 2 F 1 γ 0 p2 ηQ2 cos(2Θ) sin(2Θ) T 1 Q Q, + − + 1 1 sin(2Θ) − cos(2Θ) 2 8φ ∗ 32 2φ ∗ 2 φ∗ 2 (89)
ˇ RADLI CKA
264
which for stigmatic systems reduces to 2B 2 F12 γ φ 1 1 η 2 Q2 . H˜ 2 = P + + + ∗ 12 ∗ 12 ∗ 12 ∗ 32 2 2φ 4φ 4φ 8φ
(90)
The quadratic part of such a Hamiltonian is axially symmetric, which implies that Lz is the integral of motion in the paraxial approximation. The Hamilton equations of motion can be used to find the paraxial trajectory equations 1
Q = φ ∗− 2 P, F12 γ φ η2 B 2 P =− Q, + + 1 1 3 4φ ∗ 2 4φ ∗ 2 8φ ∗ 2
(91a) (91b)
which, after elimination of P, can be written in the form of ordinary differential equations of the second order F12 γ φ γ φ η2 B 2 (92) + + ∗2 Q = 0, Q + ∗Q + 2φ 4φ ∗ 4φ ∗2 8φ in agreement with the results of the previous paragraph. The solution then takes a form analogous to Eq. (79) ⎛ ⎞ ∗− 12 ˆ ˆ g(z)1 φo h(z)1 Q Qo ⎝ ⎠ = (93) 1 φ∗ P Po φ ∗ 2 g (z)1ˆ h (z)1ˆ φo∗
and in original coordinates ˆ −1 R q = p 0
0 e ˆ −1 R
η
⎛ ⎝
g(z)1ˆ φ ∗ 2 g (z)1ˆ 1
∗− 1 φo 2 h1ˆ
φ∗ ˆ φo∗ h (z)1
⎞ ⎠
qo po
.
(94)
When the position in the object and aperture planes is used to parameterize the rays, the solution in rotating coordinates takes a similar form s(z)1ˆ t (z)1ˆ Qo Q (95) = 1 1 P Qa φ ∗ 2 s (z)1ˆ φ ∗ 2 t (z)1ˆ and in the original coordinates ˆ −1 s(z)1ˆ t (z)1ˆ 0 R 1 q = e ˆ −1 ∗ 12 ∗ 12 0 p ˆ ˆ 0 R φ s (z)1 φ t (z)1 η
0 ˆ a) R(z
qo qa
.
(96)
LIE ALGEBRAIC METHODS
265
The Picht transformation corresponds to the canonical transformation 1
Qp = φ ∗ 4 Q, 1
Pp = φ ∗− 4 P,
(97)
which transforms the quadratic part of the Hamiltonian into the form of the Hamiltonian for a linear oscillator F12 1 2 1 (2 + γ 2 )φ 2 η2 B 2 ˜ H2 = P + + + ∗2 Q2 . (98) 2 2 4φ ∗ 16φ ∗2 8φ The equations of motion then agree with trajectory equations (78). In both approaches the paraxial approximation is represented by the linear transformation described by the transfer matrix. The main consequence is that a point in the object plane is imaged into a point in the image plane. This is an important property of the paraxial monochromatic optics. For detailed description of the paraxial properties, cardinal elements, and so forth, see Hawkes and Kasper (1989), for example. B. The Paraxial Transformation: General Dispersion Case We can now abandon the assumption that all electrons have the same energy. According to the previous description, the new variable δ describing the energy deviation must be introduced. Just as for the space variables, only the first power of δ will contribute to the paraxial approximation. The paraxial trajectory equation is then obtained by linearization of the general trajectory equation described previously. If the same assumptions as in the previous subsection are used for the field, the trajectory equations read F12 η 1 γ φ γφ γp2 ηQ2 B x + ∗ x + x + − + + y + y B 1 1 2φ 4φ ∗ 2φ ∗ 4φ ∗2 φ∗ 2 φ∗ 2 2 ηB2 δF1 γ F1 , (99a) =− ∗ + 1 + 2φ 4eφ ∗2 φ∗ 2 η 1 γ φ γφ γp2 ηQ2 B x + x B = 0. (99b) y− 1 + ∗ − y + ∗y + 1 2φ 4φ ∗ 2φ φ∗ 2 φ∗ 2 2 The equations must be completed by the equation δ = 0. In the rotation coordinates the trajectory equations of the stigmatic systems fulfilling the Wien condition can be written F12 δF1 γ φ γ φ η2 B 2 X + ∗X + + + ∗2 X = cos(Θ), (100a) ∗ ∗2 2φ 4φ 4φ 8φ 4eφ ∗2
ˇ RADLI CKA
266 γ φ Y + ∗Y + 2φ
F12 δF1 γ φ η2 B 2 + + ∗2 Y = − sin(Θ). (100b) 4φ ∗ 4φ ∗2 8φ 4eφ ∗2
These are two separate differential equations but, unlike those of the previous chapter, they are inhomogeneous. The method of variation of parameters can be used to find the general solution
X g(z) h(z) −g(z)μ(h) ˆ + h(z)μ(g) ˆ Xo X = g (z) h (z) −g (z)μ(h) ˆ + h (z)μ(g) ˆ Xo , (101) δ δ 0 0 1
Y g(z) h(z) −g(z)νˆ (h) + h(z)ν(g) ˆ Yo (102) Yo , Y = g (z) h (z) −g (z)νˆ (h) + h (z)νˆ (g) δ δ 0 0 1 where the functionals are defined as follows: 1 F1 f cos Θ μ(f ˆ )= dz, 1 3 ∗ φ∗ 2 4eφo 2 1 F1 f sin Θ νˆ (f ) = − dz. 3 ∗ 12 φ∗ 2 4eφo
(103)
Finding the solution in the original coordinates x, y is trivial. Note that both μ(f ˆ ) and ν(f ˆ ) vanish when the dipole field is not present in the system; when the system does not contain a dipole field, there is no dispersion. In the Hamiltonian approach the dispersion case is described by the quadratic part of the Hamiltonian (29), in which, unlike Eq. (87c), pτ is not neglected. When we consider the stigmatic systems in which the Wien condition is satisfied, the quadratic part of the Hamiltonian reads eF12 η ηB 1 eγ0 φ eηB 2 2 q2 H2 = p + Lz + + + ∗ 12 ∗ 12 ∗ 12 ∗ 12 ∗ 32 2 2eφ 2φ 4ηφ 4φ 8ηφ +
ηmc2 4e2 φ
∗ 32
pτ2 +
F1 mcη 3
2eφ ∗ 2
(104)
xpτ .
The transition into rotating coordinates and using Pτ = ηe pτ transforms the Hamiltonian into F12 1 1 γ φ η2 B 2 2 ˜ Q2 H2 = P + + + 1 1 3 2 4φ ∗ 12 2φ ∗ 2 4φ ∗ 2 8φ ∗ 2 +
mc2 4eφ
∗ 32
Pτ2 +
F1 mcη 2eφ
∗ 23
cos ΘXPτ −
F1 mcη 3
2eφ ∗ 2
sin ΘY Pτ . (105)
LIE ALGEBRAIC METHODS
267
We start from results of the previous subsection. First a canonical transformation, which compensates the geometrical part of Eq. (105), is applied. It coincides with Eq. (93) and the transformed Hamiltonian takes the form mc2 ˜ 2 F1 mcη ∗− 1 H¯ 2 = cos Θ P˜τ g(z)X˜ + h(z)φo 2 P˜x Pτ + 3 3 4eφ ∗ 2 2eφ ∗ 2 F1 mcη ∗− 1 − sin Θ P˜τ g(z)Y˜ + h(z)φo 2 P˜y . 3 2eφ ∗ 2
(106)
In these coordinates, the trajectory equations are trivial X˜ =
F1 mcη 3
∗ 12
2eφ ∗ 2 φo
P˜x = −
F1 mcη 2eφ
∗ 32
h(z) cos Θ P˜τ ,
g(z) cos Θ P˜τ ,
F1 mcη Y˜ = − h(z) sin Θ P˜τ , 1 ∗ 32 ∗ 2 2eφ φo P˜y =
F1 mcη 2eφ
∗ 32
g(z) sin Θ P˜τ ,
P˜τ = 0.
(107a) (107b) (107c)
The solution is found after integration: ⎛1 0 0 0 ⎞ 2mcημ(h) ˆ ⎛ ˜ ⎞ ⎛ ˜ ⎞ Xo X 0 1 0 0 2mcη ν ˆ (h) ⎜ ⎟ Y˜o ⎟ ⎜ Y˜ ⎟ ⎜ ⎟ ⎜ ∗ 12 ⎜ P˜ ⎟ ⎜ ⎟ ⎜ P˜ ⎟ = μ(g) ˆ 0 0 1 0 −2mcηφ ⎟ ⎜ xo ⎟ . ⎜ x⎟ ⎜ o ⎟ ⎝ P˜ ⎠ ⎝ P˜ ⎠ ⎜ 1 y yo ⎝ ⎠ ∗2 ˆ 0 0 0 1 −2mcηφo ν(g) P˜τ P˜τ o 0 0 0 0 1
(108)
η δ verifies that the Transforming to rotating coordinates and using Pτ = ec result is in agreement with Eqs. (101) and (102). As in the monochromatic case, the dispersion paraxial approximation is described by a linear transformation. But in contrast to the monochromatic case, the added dimension causes the following result—in presence of the dipole field, a point in the object plane is not imaged into a point in the image plane. In fact, the trajectories of electrons with energy about δ higher than the design energy are shifted about a value proportional to δ. An example of imaging with two different energies of electrons can be seen in Figure 2.
C. Polynomial Form of Solution A direct extension of the paraxial approximation, in which the second and higher powers of the trajectory deviations and slopes are neglected, results in methods in which only powers of higher order than a given one can
ˇ RADLI CKA
268
F IGURE 2.
Focus of the rays with different energy.
be neglected. The computed solution is then in the form of a k-order polynomial, where k is the highest power that is not neglected. Unfortunately, this extension leads to great difficulties in solving the trajectory equation. Adding the higher-orders terms causes the equations to become nonlinear. The trajectory equations in rotating coordinates now read δF1 Pˆ1 (X) − cos Θ = f2 (Q, Q , Q , δ, z) + · · · + fn (Q, Q , Q , δ, z), 4eφ ∗2 (109a) δF 1 Pˆ1 (Y ) + sin Θ = g2 (Q, Q , Q , δ, z) + · · · + gn (Q, Q , Q , δ, z), 4eφ ∗2 (109b) where fl (Q, Q , Q , δ, z) and gl (Q, Q , Q , δ, z) are homogeneous polynomials of l-th order in the variables Q, Q , Q , and δ with z-dependent coefficients. These equations cannot be solved analytically. Fortunately, we do not need the exact solution but only its part up to the n-th order polynomial in the variables mentioned: x(z) =
i+j +k+ln
aij kl x i y j x k y l
(110)
LIE ALGEBRAIC METHODS
in the case of Lagrangian coordinates, or bij kl x i y j pxk pyl x(z) =
269
(111)
i+j +k+ln
in the case of Hamiltonian coordinates. A number of perturbation methods exist for the computation of such a solution. They are based on an interactive solution of the trajectory equations, Lagrangian or Hamiltonian perturbation method, or on the solution of the trajectory equations in the higher-degree polynomial space where they become linear. Some of these methods are described in the next section. What effect do the higher-order terms have from a physical point of view? They change the image properties rapidly. In brief, they cause an object plane point not to be imaged into a point in the image plane; the focusing is perturbed. We observe it as image imperfections. The benefits of this form of solution compared to direct numerical solution of the equations of motion lie in the possibility of obtaining a qualitative description of the imperfection. Each coefficient of the polynomial represents an aberration, a basic optical characteristic of the device. Such coefficients are known as aberration coefficients. Knowledge of the aberration coefficients is sufficient to classify the imperfections and to find ways of compensating for them. The perturbation methods compute the coefficients of the polynomial solution either in analytical or numerical form. The analytical form of the coefficients includes most information about the optical elements; for example, the spherical aberration coefficient of the axial symmetric magnetic lens reads (Hawkes and Kasper, 1989) CS =
∗− 1 φo 2
zi
L1 h4 + 2L2 h2 h 2 + L3 h 4 dz,
(112)
zo
where L1 =
η2 B B
+
η4 B 4
, (113) 1 3 8φ ∗ 2 32φ ∗ 2 η2 B 2 L2 = , (114) 1 8φ ∗ 2 1 1 L3 = φ ∗ 2 . (115) 2 The influence of the field is readily apparent from this form of solution, but the mathematical form of the term is complicated even if the system is very
270
ˇ RADLI CKA
F IGURE 3. Paraxial beam focus and focus of the same beam influenced by the spherical aberration. zi is the position of the Gaussian image.
271
LIE ALGEBRAIC METHODS
simple. In fact, the complexity of such terms increases with the polynomial order and number of elements in the device. In general, its use for higherorder aberrations is very difficult. On the other hand, there are the numerical methods, which allow us to express the numerical solution in polynomial form. The aberration coefficients are then computed numerically. They do not include as much information about the system, but the method can easily be used for higher-order aberrations. In general, the coefficients differ depending on the choice of the parameterization method. We now show the relationship between the coefficients in parameterization by qo , qo and the coefficients in parameterization by canonically conjugate variables qo , po . The relationship between these two parameterizations and parameterization by the object and aperture position is discussed later in this chapter. The relation between p and q is given by p=
∂M , ∂q
where M is defined in Eq. (21), that is, p=Φ
∗ 12
(116)
q 1 + q 2
−η
Ax Ay
.
(117)
In real devices, the object is often outside the field and Eq. (117) is then reduced to 1
p = φ∗ 2
q 1 + q 2
.
(118)
In this case, x i y j pxk pyl = φ ∗
k+l 2
= φ∗
k+l 2
− k+l x i y j x k y l 1 + q 2 2
xi yj x ky l 2 2 k+l 1 1 2 +1 q × 1 − (k + l)q + (k + l) + ··· . 2 4 2 (119)
Substituting in Eq. (111) and calculating the coefficients at x i y j x k y l , we can easily find the desired relationship. As an example we show how to proceed in the case of a lens, which has only spherical aberration. The ray is then described by 2 3 (120) q = M q + C3 p2 p + C5 p2 p + C7 p2 p + · · ·
ˇ RADLI CKA
272 or by
2 3 q = M q + C˜ 3 q 2 q + C˜ 5 q 2 q + C˜ 7 q 2 q + · · · .
(121)
With the aid of Eq. (119), Eq. (120) can be transformed to 3 3 15 2 2 q = M q + C3 φ ∗ 2 q 2 q 1 − q 2 + q 2 8 3 2 ∗ 52 2 2 ∗ 72 2 3 + C7 φ q q 1− q q + o(9), (122) + C5 φ q 2 and by comparing Eqs. (121) and (122), we derive C˜ 3 = φ ∗ 2 C3 , 3
(123a) 3 3 C˜ 5 = φ C5 − φ ∗ 2 C3 , (123b) 2 7 3 5 15 3 C˜ 7 = φ ∗ 2 C7 − φ ∗ 2 C5 + φ ∗ 2 C3 , (123c) 2 8 where coefficients with tildes denote the spherical aberration coefficients in the parameterization by object position and slope. ∗ 52
D. Numerical Methods The numerical methods, which are based primarily on the standard Runge– Kutta numerical algorithm, compute the ray from its position and slope in the object plane. The methods do not use the analytical form of the field in the vicinity of the optic axis, but the values in an arbitrary point are calculated by a spline method (Barth et al., 1990; Zhu and Munro, 1989) from the values of the field at nodal points computed by (FEM, BEM, FDM). The advantage of this approach is that the field derivatives need not be computed, hereby avoiding a source of errors. The computation usually does not use the trajectory equations but it is based on the solution of the equation of motion in time parameterization d (mγ r˙ ) = −eE − e˙r × B, (124) dt where, in contrast to analytical methods, the values of γ , E, and B are given in every point in the manner mentioned previously. The accuracy of such methods is determined by the choice of numerical algorithm; generally it is very high, especially for small z. Nevertheless, methods have two disadvantages. The first restriction is common to any numerical method. Hence the computation of one ray does not include any information about neighboring
LIE ALGEBRAIC METHODS
273
rays, each ray of interest must be computed separately. This means that we cannot qualitatively describe the relationship between object ray conditions and the image properties—the aberrations coefficients—as we could from the polynomial form of solutions. Although the polynomial regression of numerical results is the method used for such problems, it is advisable for low orders of aberrations only (Hawkes and Kasper, 1989). The second disadvantage lies in the fact that the method forms a relationship between only the initial and final coordinates without any parameter adopted. This means that the solution itself does not include any information about the system; hence, it is not possible to recognize which elements must be changed to improve it. Despite these issues, the method is a powerful tool for exact ray computation when the field distribution is known. Moreover, it is useful for checking the accuracy of the polynomial form of solution, but its use in designing optical devices is limited.
V. T HE A NALYTIC P ERTURBATION M ETHOD The analytical perturbation methods are used to describe the nonlinear properties of electron optical systems. They are represented by the trajectory method, the eikonal method, and the Lie algebraic method. The differential algebraic method also can be used for calculation of aberration integrals, but it is used for frequently for numerical evaluation of the aberration coefficients. The first calculation of aberrations was done in the early 1930s by Scherzer (1933), who used the trajectory method, and by Glaser (1935), who introduced the eikonal method into electron optics. The other methods were introduced much later: the Lie algebraic method by Dragt (Dragt and Forest, 1986b) in the 1980s and the differential algebraic method by Berz (1999) in the late 1990s. We first briefly describe each of the perturbation methods and then show the application of the trajectory method, the eikonal method, and the Lie algebraic method to the simple system of the round magnetic lens. A. The Differential Algebraic Method It is not common to start the introduction to perturbation methods used in electron optics with the differential algebraic method (Berz, 1999), which is not as familiar in the electron optics community as the methods described in the following subsections. Even though the polynomial form of the solution is not exact, it is exact up to order of the solution polynomial. For example,
ˇ RADLI CKA
274 the polynomial solution
1 x = xo − f 2 (z)xo2 + g(z)xo 2 is the second-order approximation to the exact solution x = xo + cos f (z)xo − 1 + sin g(z)xo .
(125)
The principle of the differential algebraic method rests in the observation that the nonlinear approximated solution (125) in coordinates xo , xo corresponds to the linear approximation of the map in coordinates xo , xo extended by the polynomial coordinates xo2 , xo xo , and (xo )2 ⎞⎛ ⎞ ⎛ xo ⎛ ⎞ 1 g(z) − 12 f 2 (z) 0 0 x ⎜ ⎟ ⎜ 0 0 ⎟ ⎜ x ⎟ ⎜ 0 g (z) −f (z)f (z) ⎟ ⎜ xo ⎟ ⎜ 2 ⎟ ⎜ ⎟⎜ ⎟ 0 1 2g(z) g 2 (z) ⎟ ⎜ xo2 ⎟ , (126) ⎜ x ⎟ = ⎜0 ⎟ ⎜ ⎟ ⎝ xx ⎠ ⎜ ⎝0 0 0 g (z) g g ⎠ ⎝ xo xo ⎠ 2 (x ) 0 0 0 0 g 2 (xo )2 which is the solution of some linear differential equation in the polynomial space of the second order. Thus, if such equations from the nonlinear differential equations are found in coordinates x and x , it is just the set of the linear differential equations that will be solved, the solution of which can be written in a matrix form similar to Eq. (126). The coefficients in the polynomial solution will be represented by the matrix elements present in the first two rows. Now let us apply it to the trajectory equation. The trajectory equations in accuracy up to k-th order read ∂Hk+1 (q, p, z) ∂H2 (q, p, z) + ··· + , ∂p ∂p ∂Hk+1 (q, p, z) ∂H2 (q, p, z) − ··· − . p = − ∂q ∂q
q =
(127a) (127b)
We will now find the linear approximation to these equations in the polynomial space of the k-th order in the variables q and p. We define the differential algebraic basis element |l1 , l2 , l3 , l4 = x l1 y l2 pxl3 pyl4
(128)
whose derivative can be calculated by Poisson brackets (Goldstein, 1980) [g, h] =
∂g ∂f ∂g ∂f − ∂q ∂p ∂p ∂q
(129)
LIE ALGEBRAIC METHODS
275
in the form d (130) |l1 , l2 , l3 , l4 = |l1 , l2 , l3 , l4 , H2 + · · · + Hl dz and l = k − l1 − l2 − l3 − l4 + 2, which is enough information to describe the k-th aberration order, as degree([g, h]) = degree(g) 4+k+ degree(h) − 2. This procedure can be used to find the set of 4 − 1 linear differential equations of the first order, which determines the solution up to the k-th order. Generally it takes form ˆ = 0, w + A(z)w
(131)
wT = x, y, px , py , x 2 , xy, . . . , py2 , . . . , x k , . . . , pyk .
(132)
where
The evolution operator of this can be formally written z
ˆ dt , M(z, zo ) = T exp − A(t)
(133)
zo
where T represents the time ordering. The usual method of practical computation is based on an iterative method, but unfortunately the matrix Aˆ with block structure ⎞ ⎛ 3+k · · · 4 × 20 4 × 4 × 4 k ⎟ ⎜ ⎟ ⎜ ⎜ 3+k ⎟ ⎟ ⎜ ··· 20 × 20 20 × k 0 ⎟ ⎜ ⎟ ⎜ ˆ A=⎜ (134) ⎟ ⎟ ⎜ · · · · · · 0 0 ⎟ ⎜ ⎟ ⎜ ⎝ 3+k 3+k ⎠ ··· × k 0 0 k is not nilpotent, which implies that Aˆ n = 0 for any n; hence, the iteration method will not converge. The standard calculation method used in the differential algebraic method is to solve Eq. (131) numerically. The method is hence not analytical but it combines approaches of analytical and numerical methods. The method was implemented into a computer code (COSY INFINITY, 2007) and is often used for numerical calculation of higher-order aberrations (Cheng et al., 2006). Such calculations are based on Eq. (131). The result is not the formula for the aberration coefficient, like the formula in Eq. (112), but only the numerical value.
ˇ RADLI CKA
276
The Hamiltonian must be expressed in the DA basis in order to automate the method for any aberration order and any optical system. This task can be solved by using the approach of nonstandard analysis (Berz, 1999). The Hamiltonian in the DA basis takes the form aij kl |i, j, k, l (135) H = ij kl
and Eq. (130) then reads d |l1 , l2 , l3 , l4 = aij kl |l1 , l2 , l3 , l4 , |i, j, k, l . dz
(136)
ij kl
The Poisson brackets can be easily found |l1 , l2 , l3 , l4 , |i, j, k, l = (l1 k − l3 i)|l1 + i − 1, l2 + j, l3 + k − 1, l4 + l + (l2 l − l4 j )|l1 + i, l2 + j − 1, l3 + k, l4 + l − 1.
(137)
Using this procedure the solution of Eq. (131) can be automated for general optical systems and for any order of aberration. The calculation is then done numerically using a method like the Runge–Kutta method. Generally, it takes the form w(z) = M(z, zo )w(zo ) and hence, x=
i
M1i wi (zo ),
y=
M2i wi (zo );
(138)
(139)
i
from these equations the numerical value of the aberration coefficients can be easily found. The previous text was just a brief description of the principles of the method. More information can be found in Berz (1999). The form of the method is independent of the order of aberration, but unfortunately, to know the form of the Hamiltonian the analytical form of the field must be known. Thus, having the exact axial potentials and their derivatives, we can compute aberrations of any order. Use of the method is limited by inaccuracy of the higher-order derivatives of the axial potential in real optical systems. B. The Trajectory Method Although the differential algebraic method allows computation of the polynomial form of solution in a very representative manner, finding the analytical
LIE ALGEBRAIC METHODS
277
form of differential equations in higher polynomial space is too lengthy. In the trajectory method, the polynomial form of solution is computed directly from the trajectory equations, but conversely, it loses the transparency of the differential algebraic method. The computation is commonly based on the iterative solution of the trajectory equation (12), in which the linear approximation is taken as the initial assumption (Hawkes and Kasper, 1989). The equation for the k-th iteration in the rotating coordinates then takes the form Pˆ1 Q[k] = f2 Q[k−1] , Q[k−1] , Q[k−1] , z + f3 Q[k−1] , Q[k−1] , Q[k−1] , z + · · · , (140) where Q[0] = Q[0] (Qo , Qo , z) is the solution of the paraxial equation in rotating coordinates. The highest order of the terms considered on the righthand side coincides with the order of the aberrations computed. Eq. (140) forms a set of two linear inhomogeneous differential equations of the second order, the solution of which can be found using variation of parameters in the case of parameterization by the object position and slopes z 1 h(z) [k] g(t)φ ∗ 2 f2 Q[k−1] , Q[k−1] , Q[k−1] , t Q = 1 ∗ φo 2 zo + f3 Q[k−1] , Q[k−1] , Q[k−1] , t + · · · dt z 1 g(z) − h(t)φ ∗ 2 f2 Q[k−1] , Q[k−1] , Q[k−1] , t ∗ 12 φo zo + f3 Q[k−1] , Q[k−1] , Q[k−1] , t + · · · dt, (141) where Q[k−1] = Q[k−1] (qo , qo , z). In the case of parameterization by position in the object and aperture planes z 1 1 t sφ ∗ 2 f2 Q[k−1] , Q[k−1] , Q[k−1] , α Q[k] = 1 ∗ Wso φo 2 zo + f3 Q[k−1] , Q[k−1] , Q[k−1] , α + · · · dα z 1 − s tφ ∗ 2 f2 Q[k−1] , Q[k−1] , Q[k−1] , α zo
[k−1]
+ f3 Q
,Q
[k−1]
where Q[k−1] = Q[k−1] (Qo , Qa , z).
,Q
[k−1]
, α + · · · dα ,
(142)
278
ˇ RADLI CKA
This procedure commonly used; however, it is not obvious how many iteration steps are needed to evaluate the solution exactly up to k-th order. For this we must examine the trajectory equations more carefully. The best approach is to introduce the perturbation parameter λ. Let us suppose the solution is written in the polynomial form Q = Q1 (Qo , Qo , z) + Q2 (Qo , Qo , z) + Q3 (Qo , Qo , z) + · · · , (143) where Qk (Qo , Qo , z) is a k-th order homogeneous polynomial in Qo and Qo . Similarly, when the parameterization by position in the object and aperture plane is used, the solution will be in the form Q = Q1 (Qo , Qa , z) + Q2 (Qo , Qa , z) + Q3 (Qo , Qa , z) + · · ·
(144)
and Qk (Qo , Qa , z) is a k-th order homogeneous polynomial in Qo and Qa . Using the perturbation parameter, it can be rewritten as Q(λ) = Q1 + λQ2 + λ2 Q3 + · · · ,
(145)
which would be used in the perturbation procedure. Comparing Eqs. (143) and (145) one can easily see that Q(λ = 1) = Q. Similar to Eq. (143), the right-hand side of the trajectory equation (140) must be rewritten (146) Pˆ1 Q(λ) = λf2 + λ2 f3 + · · · , where fk is a k-th order homogeneous polynomial in Qo and Qo or Qo and Qa . Let us now substitute Eq. (145) into the trajectory equation Pˆ1 Q1 + λQ2 + λ2 Q3 + · · · = λf2 (Q1 + λQ2 + · · · , Q1 + λQ2 + · · · , Q1 + λQ2 + · · · , z) + λ2 f3 (Q1 + · · · , Q1 + · · · , Q1 + · · · , z) + · · · .
(147)
Comparing zero-order terms in λ, one finds the paraxial trajectory equations γ 2 F12 γ φ γ φ η2 B 2 ˆ Q1 = 0, (148) + + P1 (Q1 ) = Q1 + ∗ Q1 + 2φ 4φ ∗ 4φ ∗2 8φ ∗2 while comparing the first-order terms leads to equations that determine the second aberration order γ φ γ φ η2 B 2 γ 2 F12 Q2 = f2 (Q1 , Q1 , Q1 , z). (149) + + Q2 + ∗ Q2 + 2φ 4φ ∗ 4φ ∗2 8φ ∗2 As on the right-hand side there is only a function of z (Q1 is the solution of the paraxial approximation), the solution of Eq. (149) can be evaluated by variation of parameters for the case of parameterization by object position and
279
LIE ALGEBRAIC METHODS
slope in the form Q2 =
z
∗− 1 h(z)φo 2
1
φ ∗ 2 g(t)f2 (Q1 , Q1 , Q1 , t) dt
zo
z
∗− 1 − g(z)φo 2
1
φ ∗ 2 h(t)f2 (Q1 , Q1 , Q1 , t) dt
(150)
zo
or in the case of parameterization by position in the object and aperture planes,
1
Q2 =
z t (z)
∗ 12
Wso φo
1
φ ∗ 2 s(α)f2 (Q1 , Q1 , Q1 , α) dt
zo
z − s(z)
φ
∗ 12
t (α)f2 (Q1 , Q1 , Q1 α) dα .
(151)
zo
Computing higher aberration orders requires comparison of the higher orders of λ; for example, the third aberration order is determined by the equations Pˆ1 (Q3 ) =
2 ∂f2 (Q1 , Q , Q , z) 1
1
∂Qi
i=1
Q2i +
∂f2 (Q1 , Q1 , Q1 , z) Q2i ∂Qi
∂f2 (Q1 , Q1 , Q1 , z) + Q2i + f3 (Q1 , Q1 , Q1 , z). ∂Qi
(152)
The solution of the previous equations found by the trajectory method takes a form similar to Eq. (150) or (151). Note that Eqs. (150) and (151) have a simpler form when the aberrations are expressed in the image plane, where g(zi ) = M and h(zi ) = 0, or s(zi ) = M and t (zi ) = 0. In such cases, they simplify to Q2 (zi ) =
∗− 1 −Mφo 2
z
1
φ ∗ 2 h(t)f2 (Q1 , Q1 , Q1 , t) dt
(153)
zo
or Q2 (zi ) = −
M
z
∗1 Wso φo 2 zo
1
φ ∗ 2 t (α)f2 (Q1 , Q1 , Q1 α) dα.
(154)
ˇ RADLI CKA
280
Use of the previous procedure allows computation of the aberrations of any order; moreover, by introducing the perturbation parameter, the calculation becomes more transparent compared to the procedure based on Eq. (140). C. The Eikonal Method Introducing the eikonal method moves into the class of perturbation methods based on variational principles. First introduced into charged particle optics by Glaser (1935), this approach was described by Rose (1987), who used it for calculation of correctors (2004). Thanks to its use by Hawkes and Kasper (1989), the method is very familiar in the electron optics community. Let us start from the Lagrangian formulation of the trajectory equations, namely from Eq. (20), where henceforth we will denote the integrand by M, that is: 1 1 M = Φ ∗ 2 1 + q 2 2 − η(Ax x + Ay y + Az ).
(155)
The paraxial approximation is described by the action, known in geometric optics as the eikonal (0) S12
z2 =
M2 q(0) , q(0) , z dz,
(156)
z1
where M2 is the quadratic part of the Lagrangian (155). Varying such an action yields (0) δS12
z2 =
∂M2 (q(0) , q(0) , z) d ∂M2 (q(0) , q(0) , z) δq(0) dz − dz ∂q(0) ∂q(0)
z1 (0) (0) (0) + p(0) 2 δq2 − p1 δq1 ,
(157)
where p(0) = ∂M2 /∂q . When we assume that q(0) fulfils the paraxial equation of motion d ∂M2 (q(0) , q(0) , z) ∂M2 (q(0) , q(0) , z) − = 0, ∂q dz ∂q
(158)
the equation reduces to (0)
(0)
(0)
(0)
(0)
δS12 = p2 δq2 − p1 δq1 .
(159)
However, the intention here is to describe the situation when higher terms of the Lagrangian are retained. Let us consider that Eq. (155) can be expanded
281
LIE ALGEBRAIC METHODS
to M(q, q , z) = M2 (q, q , z) + λM I (q, q , z) + λ2 M I I (q, q , z) + · · · ; (160) then the paraxial trajectory is changed to q = q(0) + λq(1) + λ2 q(2) + · · · ,
(161)
where parameter λ is perturbation parameter. The variation of the action z2 S12 =
M q, q , z dz
(162)
z1
can be found by using either a method similar to that used to derive Eq. (159) δS = p2 δq2 − p1 δq1 (0) (0) (1) (2) (1) (2) = p2 + λp2 + λ2 p2 + · · · δ q2 + λq2 + λ2 q2 + · · · (0) (0) (1) (2) (1) (2) − p1 + λp1 + λ2 p1 + · · · δ q1 + λq1 + λ2 q1 + · · · (163) with p(1) = ∂M I /∂q , p(2) = ∂M I I /∂q , etc., or by varying the action (162) into which Eqs. (160) and (161) were substituted S=
z2
M2 q(0) + λq(1) + λ2 q(2) + · · · , q(0) + λq(1) + λ2 q(2) + · · · , z
z1
+ λM I q(0) + λq(1) + λ2 q(2) + · · · , q(0) + λq(1) + λ2 q(2) + · · · , z + λ2 M I I q(0) + λq(1) + λ2 q(2) + · · ·, q(0) + λq(1) + λ2 q(2) + · · ·, z + · · · dz. (164)
The idea is to expand Eqs. (163) and (164) into powers of λ (0) (1) (2) S12 = S12 + λS12 + λ2 S12 + · · · ,
δS12 =
(0) δS12
(1) + λδS12
(2) + λ2 δS12
+ ···
(165a) (165b)
and from a comparison to find the q(i) and p(i) . We will show the procedure on perturbation of the first and second orders.
ˇ RADLI CKA
282 1. The First-Order Perturbation This perturbation is described by (1) S12
z2
∂M2 (q(0) , q(0) , z) (1) M I q(0) , q(0) , z + q ∂q
= z1
∂M2 (q(0) , q(0) , z) (1) , + q ∂q
(166)
which, using the integration by parts and the paraxial equations of motion, is reduced to (0) (0) (1) (0) (1) (0) (1) I q1 , q1 , z + p2 q2 − p1 q1 , S12 = S12 (167) where z2 I S12
=
M I q(0) , q(0) , z dz
(168)
z1
is a given function of q1 , p1 , and z. The variation of Eq. (167) reads (1) (1) (0) (1) (0) (1) (0) (1) I δS12 = δS12 + δp(0) 2 q2 + p2 δq2 − δp1 q1 − p1 δq1 ; (169)
on the other hand, we can express the variation from Eq. (163) (1)
(0)
(1)
(1)
(0)
(0)
(1)
(1)
(0)
δS12 = p2 δq2 + p2 δq2 − p1 δq1 − p1 δq1
(170)
and comparing the two previous equations, the first-order perturbation relation (Sturok, 1955) can be found (1)
(0)
(1)
(0)
(0) (1)
(0) (1)
I δS12 = p2 δq2 − p1 δq1 − δp2 q2 + δp1 q1 .
(171)
The relation is more general than commonly needed; for example, the perturbed rays may be required to fulfil some constraints. Two such cases are relevant. The first one (1)
(1)
q 1 = p1 = 0
(172)
constrains the start position and momenta of perturbed rays to be the same as the unperturbed ones. The perturbed ray is then determined by its position and slope in z1 . The first-order perturbation relation then reads (1)
(0)
(1)
(0)
I = p2 δq2 − q2 δp2 , δS12
(173)
283
LIE ALGEBRAIC METHODS
which leads to the set of equations (0)
(0)
I (1) ∂q2k ∂p2k ∂S12 (1) = p2k − q2k , ∂q1 ∂q1 ∂q1 2
(174a)
k=1
I ∂S12
∂p1
=
2
(0)
(1) p2k
k=1
∂q2k ∂p1
which can be written in the form ⎛ ∂S I ⎞
12
⎝
∂q1 I ∂S12 ∂p1
⎠ = MT1
(0)
(1) − q2k
p(1) 2
∂p2k ∂p1
(174b)
,
(175)
,
(1)
−q2
where M1 is a paraxial matrix in parameterization by qo and po . The (1) (1) coordinates q2 and p2 are easy to calculate, ⎛ ∂S I ⎞ (1)
12 p2 T −1 ∂q1 ⎝ ⎠. (176) = M1 I (1) ∂S12 −q2 ∂p1
When the rotating coordinates are used, the previous result can be simplified for stigmatic systems to ⎞ ⎛ ∂S I ⎞ 1 (1) ⎛ φ ∗ ˆ 12 −φ ∗ 2 g 1ˆ P2 ∂Q1 φo∗ h 1 ⎠ ⎠ ⎝ ⎝ = (177) I ∂S12 ∗− 12 −Q(1) 2 g 1ˆ −φo h1ˆ ∂P 1
and in the image plane, where h(zi ) = 0 and g(zi ) = M, we can write I ∂S12 . ∂Po The second reasonable choice of constraint reads (1)
= −M
(178)
(1)
(1)
(179)
Qi
q1 = q2 = 0,
which means that the perturbed ray is determined by its positions in z1 and z2 . The first-order perturbation relation then takes the form (1)
(0)
(1)
(0)
I δS12 = p2 δq2 − p1 δq1 .
(180)
In practical computations, the rays are determined by their position in the object and aperture planes, which means that the constraints take the form (1) q(1) o = qa = 0
(181)
ˇ RADLI CKA
284
and the first-order perturbation relations read (1)
(0)
(1)
(0)
(182)
(1)
(0)
(1)
(0)
(183)
I (0) = p2 δq2 − q2 δp2 − p(1) δSo2 o δqo , I (0) δSa2 = p2 δq2 − q2 δp2 − p(1) a δqa . (1) The quantities p(1) 2 and q2 can be evaluated from the equations (0)
(0)
I (1) ∂q2k ∂p2k ∂So2 (1) = p2k − q2k , ∂qa ∂qa ∂qa 2
(184a)
k=1
(0)
(0)
I (1) ∂q2k ∂p2k ∂Sa2 (1) = p2k − q2k , ∂qo ∂qo ∂qo 2
(184b)
k=1
I and S I were parameterized by positions in the object where the functions So2 a2 I I (q , q , z). As in the previous case, and aperture plane So2 (qo , qa , z) and Sa2 o a ⎛ ∂S I ⎞ (1)
a2 p2 ∂qo T ⎠ = M1 ⎝ , (185) I (1) ∂S12 −q2 ∂qa
where M1 is a paraxial matrix in parameterization by qo and qa . For a stigmatic system in rotating coordinates ⎛ ∂S I ⎞ (1)
∗1 a2 ∗ 12 ˆ P2 ˆ 2 1 φ t 1 −φ s 1 ⎝ ∂Qo ⎠ = , (186) I ∗ 12 ∂S12 ˆ ˆ −t 1 s 1 −Q(1) Wso φo 2 ∂Qa where Ws = st − s t is Wronskian (85). Thus, in image plane, where t (zi ) = 0 and s(zi ) = M, we can write Q(1) i =−
M ∗1
φo 2 Wso
I ∂So2 . ∂Qa
2. The Second-Order Perturbation This perturbation order is described by the part of the action z2 (2) M I I q(0) , q(0) , z + S12 = z1 +
∂M I (q(0) , q(0) , z) (1) q ∂q
∂M I (q(0) , q(0) , z) (1) ∂M2 (q(0) , q(0) , z) (2) q q + ∂q ∂q
(187)
285
LIE ALGEBRAIC METHODS
+
∂M2 (q(0) , q(0) , z) (2) 1 ∂ 2 M2 (q(0) , q(0) , z) (1) (1) q + qi qj ∂q 2 ∂qi ∂qi i,j
+2
∂ 2M
2
(q(0) , q(0) , z) ∂qi ∂qj
qi(1) qj(1) +
∂ 2 M2 (q(0) , q(0) , z) (1) (1) qi qj ∂qi ∂qj
dz. (188)
Similar to the first-order perturbation, by use of equation of motion we can write z2
∂M2 (q(0) , q(0) , z) (2) ∂M2 (q(0) , q(0) , z) (2) q + q dz ∂q ∂q
z1 (0) (2)
(0) (2)
= p2 q2 − p1 q1 .
(189)
(2)
S12 can be also simplified using z2 (1) (1) ∂M (1) ∂M +q q dz = q(1) 2 p2 − q1 p1 , ∂q ∂q
(190)
z1
where the left-hand side can be expanded into powers of the perturbation parameter λ z2 z2 I I (1) ∂M2 (1) ∂M2 (1) ∂M (1) ∂M +q + q q dz + λ q dz ∂q ∂q ∂q ∂q z1
z1
z2 + z1
2 2 (1) (1) ∂ 2 M2 (1) (1) ∂ M2 (1) (1) ∂ M2 + 2qi qj + qi qj qi qj dz ∂qi ∂qj ∂qi ∂qj ∂qi ∂qj i,j
+ o λ2 ,
(191)
and the right-hand side to (1)
(1)
q2 p2 − q1 p1
(1) (1) (1) (0) (1) (0) (1) (1) = q2 p2 − q1 p1 + λ q2 p2 − q1 p1 + o λ2 .
When linear terms in λ are compared,
(192)
ˇ RADLI CKA
286 z2
(1) ∂M2
q
∂q
∂M2 + q(1) ∂q
dz
z1
=−
z2
qi(1) qj(1)
i,j
z1
(1) (1)
2 ∂ 2 M2 ∂ 2 M2 (1) (1) ∂ M2 + 2qi(1) qj(1) + q q i j ∂qi ∂qj ∂qi ∂qj ∂qi ∂qj
(1) (1)
+ q2 p2 − q1 p1 .
dz (193)
Hence, (2)
(0) (2)
(0) (2)
(1) (1)
(1) (1)
II S12 = S12 + p2 q2 − p1 q1 + p2 q2 − p1 q1 ,
(194)
where the second-order characteristic function z2 II S12
= z1
1 ∂ 2 M2 (q(0) , q(0) , z) (1) (1) M I I q(0) , q(0) , z − qi qj 2 ∂qi ∂qj i,j
∂ 2 M2 (q(0) , q(0) , z) (1) (1) ∂ 2 M2 (q(0) , q(0) , z) (1) (1) +2 qi q j + qi qj ∂qi ∂qj ∂qi ∂qj
dz (195)
q
depends either on z and values of q and in z1 , or on z and positions of the ray in the object and aperture planes, similar to the second and third term on the left-hand side, while the functions, which must be computed, can be found on the right-hand side. Varying Eq. (188) and comparing it with the expansion of Eq. (163), the second-order perturbation relation can be derived (Sturok, 1955) (1) (1) (1) (1) II δS12 − q2 δp2 + q1 δp1 (0) (2) (0) (2) (0) (2) (0) = p(2) 2 δq2 − p1 δq1 − q2 δp2 + q1 δp1 .
(196)
Two perturbation relations can be found using a procedure similar to the first-order perturbation. The first one is for rays determined by the position and slopes in the object plane (1)
(1)
(2)
(0)
(2)
(0)
II δS12 − q2 δp2 = p2 δq2 − q2 δp2 ,
(197)
which leads to the set of the equations (1)
(0)
(0)
II (1) ∂p2k (2) ∂q2k ∂p2k ∂S12 (2) − q2k = p2k − q2k , ∂q1 ∂q1 ∂q1 ∂q1 2
k=1
2
k=1
(198a)
287
LIE ALGEBRAIC METHODS (1)
(0)
(0)
II (1) ∂p2k (2) ∂q2k ∂p2k ∂S12 (2) − q2k = p2k − q2k . ∂p1 ∂p1 ∂p1 ∂p1 2
2
k=1
(198b)
k=1
As in the first-order perturbation, it takes the form ⎛ II (1) ⎞ (2)
∂S12 (1) ∂p2k 2 − q p2 k=1 2k ∂q1 ⎟ ⎜ ∂q1 T = M1 ; ⎝ II (1) ⎠ (2) 2 ∂S12 (1) ∂p2k −q 2 k=1 q2k ∂p1 ∂p1 −
(199)
hence,
p(2) 2 (2) −q2
⎛ T ⎜ = M−1 ⎝ 1
II ∂S12 ∂q1 II ∂S12 ∂p1
− −
2
(1)
(1) ∂p2k ∂q1
k=1 q2k
2
(1) (1) ∂p2k
⎞ ⎟ ⎠
(200)
k=1 q2k ∂p1
and for a stigmatic system in rotating coordinates in the image plane, it can be written
2 (1) II ∂Soi (2) (1) ∂Pik . (201) − Qik Qi = −M ∂Po ∂Po k=1
The second perturbation relation for rays determined by the positions in the object and aperture planes reads (1)
(1)
(2)
(0)
(2)
(0)
II (0) − q2 δp2 = p2 δq2 − q2 δp2 − p(2) δSo2 o δpo ,
(202a)
(1) (2) (0) (2) (0) II (2) (0) − q(1) δSa2 2 δp2 = p2 δq2 − q2 δp2 + qa δpa ,
(202b)
which leads to the set of equations (1)
(0)
(0)
II (1) ∂p2k (2) ∂q2k ∂p2k ∂So2 (2) − q2k = p2k − q2k , ∂qa ∂qa ∂qa ∂qa 2
2
k=1
I ∂Sa2
∂qa
−
2 k=1
(203a)
k=1
(1) ∂p2k (1) q2k ∂qa
=
2 k=1
(0)
(2)
p2k
∂q2k ∂qa
(0)
(2)
− q2k
∂p2k ∂qa
.
(203b)
Using the same procedure as in the previous case, we can find for the aberration in image plane
2 (1) II ∂Soi M (2) (1) ∂Pik . (204) − Qik Qi = − 1 ∗ ∂Qa ∂Qa k=1 φo 2 Wso
288
ˇ RADLI CKA
D. The Lie Algebraic Method This method was introduced by Dragt et al. in the University of Maryland in the 1980s. The basic ideas were published by Dragt and colleagues (Dragt, 1987; Dragt and Forest, 1986a, 1986b; Dragt et al., 1988). It is commonly used in beam mechanics (Dragt et al., 1988; Forest, 1998), where it was employed in the computer code MARYLIE (Department of Physics, University of Maryland, College Park, MD, USA) (MARYLIE, 2006). The use of the method in electron optics is rare (Dragt, 1990; Dragt and Forest, 1986b; Ximen, 1995); the authors have concentrated primarily on the calculation of high-order aberrations of simple systems (Hu and Tang, 1998, 1999). The great advance of the method was its description of stability of Hamiltonian systems (Bazzani, 1988; Dragt, 1982; Meyer, 1974). The method’s name is derived from the mathematical structure used. It is based on Poisson brackets, which form the Lie algebra structure on the phase space, and canonical perturbation theory (Cary, 1981). We use the standard notation ∂f ∂g ∂f ∂g − (205) [f, g] = ∂q ∂p ∂p ∂q for the Poisson brackets of two functions. It is easy to see that the equations of motion can be written consistently in the form w = [w, H ],
(206)
where w is vector in the phase space
q w= . p
(207)
Direct calculation allows easy demonstration of the Lie algebra structure of the phase space generated by the Poisson bracket, [f, αg + βh] = α[f, g] + β[f, h]
(Linearity),
(208a)
[f, g] = −[g, f ] f, [g, h] + h, [f, g] + g, [h, f ] = 0
(Antisymmetry),
(208b)
(Jacobi identity).
(208c)
Because the Poisson brackets exist for all differentiable functions, using this operation it is possible to assign the linear operator to every such function, f → :f :,
:f :g = [f, g].
(209)
This is called a Lie operator. The linearity is the direct consequence of Eq. (208a) :f :(αg + βh) = α:f :g + β:f :h,
(210)
289
LIE ALGEBRAIC METHODS
and the Jacobi identity implies that it acts as the derivative on the algebra of functions with product represented by the Poisson brackets, :f :[g, h] = [:f :g, h] + [g, :f :h].
(211)
A similar property also can be found for the algebra of functions where the product is represented by the standard product of function (Steinberg, 1986), :f :gh = (:f :g)h + g:f :h.
(212)
All of these properties have important consequences as mentioned later. The Lie algebraic method is based on the canonical transformations. The standard description of canonical transformation uses the generating function, in which the new and original coordinates are mixed (Goldstein, 1980). That is, ˜ q, z) ˜ q, z) ∂F (p, ∂F (p, , p= . (213) ∂ p˜ ∂q However, for the purposes of the perturbation theory the description via transformation using only original coordinates is more advisable. It is possible to do so by use of the Lie transformations, which are defined as the exponential of the Lie operator; that is, q˜ =
˜ ˜ ˜ z) = e:g(w,z): w, w(w,
(214)
where 1 1 (215) e:g: = 1ˆ + :g: + :g:2 + :g:3 + · · · . 2 3! It can be shown (Steinberg, 1986) that when Eq. (211) is valid, the transformation so produced is canonical. That is, e:g: [f, h] = e:g: f, e:g: h . (216) Moreover, from Eq. (212) it can be seen that all analytical functions are transformed via (Steinberg, 1986) ˜ p,z): ˜ ˜ p, ˜ z) = f q(q, ˜ p, ˜ z), p(q, ˜ p, ˜ z), z = e:g(q, ˜ p, ˜ z). (217) f˜(q, f (q, Unfortunately, the situation for the Hamiltonian is more complicated because the transformation rule for Hamiltonians differs from the transformation rule for functions (Goldstein, 1980). An expansion in terms of the Lie transformation can be found in the form (Cary, 1977) ˜ p,z): ˜ ˜ p, ˜ z) + ˜ p, ˜ z) = e:g(q, H (q, H˜ (q,
1 0
(See also Appendix A.)
˜ p,z): ˜ eθ:g(q,
∂g dθ. ∂z
(218)
290
ˇ RADLI CKA
In fact, the Lie algebraic method is a just modification of the canonical perturbation methods used in the classical mechanics (Lichtenberg and Lieberman, 1982). Like the Hamilton–Jacobi theory, it is based on the fact that motion is the canonical transformation that compensates the Hamiltonian. But in contrast to the Hamilton–Jacobi theory, the Hamiltonian is not compensated using one transformation but it is compensated term by term. For our case, when the Hamiltonian is easily expanded into polynomial in canonical variables H = H 2 + H3 + H4 + · · · ,
(219)
it is natural to find such a sequence of the canonical transformations, which compensate the Hamiltonian order by order. Let us say that the canonical transformation Mk transforms the system with Hamiltonian (219) into a system with Hamiltonian H = Hk+2 + Hk+3 + · · · .
(220)
The equations of motion then take the form d ˜ = [w, ˜ H ] = fk+1 + fk+2 + · · · , w dz
(221)
that is, the lowest order of terms on the right-hand side is k + 1. The physical meaning of Mk arises from the fact that the solution of such an equation can be evaluated in the form ˜ o , z) + gk+2 (w ˜ o , z) + · · · ; ˜ =w ˜ o + gk+1 (w w
(222)
hence, there is no evolution up to k-th order present. If we do not include terms of higher order than k into the calculations, the evolution described ˜ = w ˜o in the previous equation is approximated by an identity; that is, w ˜ = Mk w ˜ o completely describes the and the canonical transformation w evolution up to k-th order. Before we present the procedure for finding such a transformation, we mention a property that is useful when polynomial orders are compared. If we evaluate the Poisson bracket of two homogeneous polynomials of k-th and l-th order, respectively, the result emerges as the homogeneous polynomial of (k + l − 2)-th order; that is, [gk , gl ] = hk+l−2 .
(223)
The proof is based directly on the definition of Poisson brackets, which consists of multiplication and two derivatives.
291
LIE ALGEBRAIC METHODS
The procedure starts by compensation of the quadratic part of the Hamiltonian and is equivalent to the solution of the paraxial approximation. Two separate approaches can be used. The first is to use parameterization with initial position and momentum. The extended canonical transformation representing such a transformation is described by the linear map (94) ⎞ ⎛ ∗− 12 ˆ −1 ˆ ˆ g(z)1 φo h(z)1 0 R ⎠ w, ⎝ 1 ∗ ˜ = ˜ (224) w = M1 w e ˆ −1 φ ∗2 0 R ˆ φ g (z)1ˆ h (z) 1 η φ∗ o
and the new Hamiltonian—the interaction Hamiltonian—reads η ˜ z), z + H4 w(w, ˜ z), z + · · · H int = H3 w(w, e ˜ z) + H4int (w, ˜ z) + · · · . = H3int (w,
(225)
In the second approach the rays are determined by their position in the object and aperture plane. Such a transformation is the extended canonical transformation; in the rotating coordinates it takes the form ˜ Q s 1ˆ t 1ˆ Q , (226) = ∗ 12 ˆ ∗ 12 ˆ P P˜ sφ 1 tφ 1 where P˜ corresponds to Qa , which represent the generalized momentum. The interaction Hamiltonian in this case reads η ˜ z), z + H4 w(w, ˜ z), z + · · · H3 w(w, H int = 1 ∗ eφo 2 Wso ˜ z) + H4int (w, ˜ z) + · · · , = H3int (w, (227) where Wso = s(zo )t (zo ) − t (zo )s (zo ) is the Wronskian (85). The following procedure is similar for both types of parameterization. We will find the canonical transformation that compensates the third-order part of an interaction Hamiltonian. Let us assume that it takes the form [1] ,z):
˜ = e:g3 (w w
w[1] .
It transforms the Hamiltonian into H
[1]
[1]
w ,z = e
:g3 (w[1] ,z):
H
int
[1]
1
w ,z +
[1] ,z):
eθ:g3 (w
∂g3 dθ . ∂z
(228)
0
Moreover, we require the transformed Hamiltonian not to include the thirdorder part; therefore, H [1] w[1] , z = H4[1] w[1] , z + H5[1] w[1] , z + · · · . (229)
ˇ RADLI CKA
292
Comparing terms of the third order on the right-hand side of Eqs. (228) and (229), one can find H3int +
∂g3 = 0, ∂z
(230)
the solution of which z g3 = −
H3int w[1] , z dz
(231)
zo
completely describes the desired canonical transformation. Such a transformation, together with M1 , describes the system up to the second aberration order [1] ,z):
w(2) (z) = M2 w[1] = M1 e:g3 (w
w[1] ,
(232)
and when we neglect the evolution of the third and higher orders w[1] = wo , we can write w(2) (z) = M2 wo = M1 e:g3 (wo ,z): wo .
(233)
Explicitly, for stigmatic systems in rotating coordinates, it takes the form ⎛ ⎞ ∗− 12 ∂g3
ˆ ˆ Qo − ∂P g(z)1 φo h(z)1 Q(2) o ⎠ ∗ (234) =⎝ 1 ∂g φ 3 P(2) ∗2 ˆ Po + ∂Qo φ g (z)1ˆ φ ∗ h (z)1 o
and in the image plane can be written ∂g3 Q(2) = M Qo − . ∂Po
(235)
When we use the aperture position instead of the initial momentum we must proceed differently. When Eq. (233) is applied to the coordinate Q, we find Q(2) = s(z)e:g3 : Qo + t (z)e:g3 : Qa ,
(236)
Q(2) (za ) = e:g3 (za ): Qa
(237)
which is reduced to
in the aperture plane (s(za ) = 0, t (za ) = 1). This means that the perturbed ray does not intersect the aperture plane in the point Qa ; hence, we modify the value Qa → e−:g3 (za ): Qa
(238)
293
LIE ALGEBRAIC METHODS
and Eq. (233) is modified to −:g3 (z): Qo Qo e Q(2) :g3 (z): = M1 e = M1 , P(2) e−:g3 (za ): Qa e−:g3a (z): Qa
(239)
where z g3a = −
H3int (z1 ) dz1 .
(240)
za
Note that Qa is generalized momentum, and the result expressed up to the third order thus takes the form Q − ∂g3
s(z)1ˆ t (z)1ˆ o ∂Qa Q(2) , (241) = ∗ 12 ∗ 12 ∂g3a P(2) ˆ ˆ φ s (z)1 φ t (z)1 Qa + ∂Qo which reduces to
∂g3 Q(2) = M Qo − ∂Qa
(242)
in the image plane. Using Eq. (228) one can evaluate the transformed Hamiltonian H4[1]
z
1 − 2
= H4
int
dz1 H3int w[1] , z1 , H3int w[1] , z ,
(243)
zo
H5[1] = H5int −
z
dz1 H3int w[1] , z1 , H4int w[1] , z
zo
1 + 3
z
z ds
zo
dt H3int w[1] , t , H3int w[1] , s , H3int w[1] , z . (244)
zo
The previous procedure can be used to find the canonical transformation [2] w[1] w[2] , z = e:g4 (w ,z): w[2] , which compensates the fourth-order part of the Hamiltonian, that is, H
[2]
=e
:g4 (w[2] ,z):
H
[1]
[2]
1
w ,z +
[2] ,z):
eθ:g4 (w
∂g4 (w[2] , z) dθ, (245a) ∂z
0
H [2] = H5[2] + H6[2] + · · · .
(245b)
ˇ RADLI CKA
294
Comparing terms of the fourth order and solving the differential equation yields z g4 = − zo z
=−
H4[1] w[2] , z dz int
H4 zo
1 w[2] , z dz + 2
z
z2 dz2
zo
dz1 H3int w[2] , z1 , H3int w[2] , z2 .
zo
(246) The transformed Hamiltonian is determined by Eq. (245a) and the canonical ˜ 3 w[2] reads ˜ =M transformation to interaction coordinates w ˜ 3 = e:g3 (w[1] ,z): e:g4 (w[2] ,z): = e:g4 (w[2] ,z): e:g3 (w[2] ,z): . M
(247)
The rays are then described up to the third aberration order by the canonical transformation w(3) = M1 e:g4 : e:g3 : wo ,
(248)
which for stigmatic systems in rotating coordinates takes ⎛ ⎞ ∗− 1 g(z)1ˆ φo 2 h(z)1ˆ Q(3) ⎠ ∗ =⎝ 1 φ P(3) ∗2 ˆ ˆ φ (z)g 1 φo∗ h (z)1 ∂g3 ∂g4 ∂g3
− 12 [g3 , ∂P ] Qo − ∂Po − ∂P o o × ∂g3 ∂g4 ∂g3 Po + ∂Q + ∂Q + 12 [g3 , ∂Q ] o o o
(249)
and when the aperture position instead of the initial momentum is used, we find Q − ∂g3 − ∂g4 − 1 [g , ∂g3 ]
s(z)1ˆ t (z)1ˆ o ∂Qa ∂Qa 2 3 ∂Qa Q(3) . = 1 1 ∗ ∗ ∂g ∂g 1 P(3) φ 2 s (z)1ˆ φ 2 t (z)1ˆ Qa + 3a + 4a + [g3a , ∂g3a ] ∂Qo
∂Qo
2
∂Qo
(250) By applying this procedure to higher-order terms it is possible to find the canonical transformation that compensates the Hamiltonian up to a given order, let us say up to the k-th ˜ ˜ ˜ ˜ k−1 = e:gk (w,z): M e:gk−1 (w,z): · · · e:g3 (w,z): .
(251)
The evolution in original coordinates can be then evaluated as ˜ k−1 wo , w(k−1) = M1 M
(252)
LIE ALGEBRAIC METHODS
295
which is the analogue of the factorization theorem presented by Dragt (Dragt and Forest, 1986b). In the case of parameterization by position in the object and aperture planes ˜ Mk−1 Qo , (253) w(k−1) = M1 ˜ (a) Qa M k−1 where ˜ ˜ ˜ ˜ (a) = e:gk,a (w,z): M e:gk−1,a (w,z): · · · e:g3,a (w,z): . k−1
(254)
E. Dispersion and Chromatic Aberration We have described previously the perturbation methods only for systems in which the electron energy does not differ. Let us abandon this assumption and focus on the case when a small energy deviation is present. This subsection describes the changes and modifications of the perturbation methods caused by this extension. 1. The Trajectory Method From Eqs. (12) and (100a) it is clear that the additional dimension changes the trajectory equations to − cos Θ F1 ˆ δ P1 (Q) + 4eφ ∗7/4 sin Θ (255a) = f2 δ, Q, Q , Q , z + f3 δ, Q, Q , Q , z + · · · , δ = const,
(255b)
(δ, Q, Q , Q , z)
where the terms fk on the right-hand side represent the k-th order homogeneous polynomial in variables δ, Q, Q , and Q with z dependent coefficients. According to the procedure used in the monochromatic case, using the perturbation parameter λ one can find − cos Θ F1 2 ˆ δ P1 Q1 + λQ2 + λ Q3 + · · · + 4eφ ∗7/4 sin Θ = λf2 (δ, Q1 + λQ2 + · · · , Q1 + λQ2 + · · · , Q1 + λQ2 + · · · , z) + λ2 f3 (δ, Q1 + · · · , Q1 + · · · , Q1 + · · · , z) from which, by comparing of each order in λ, it is possible to find equations that determine all Q(i) , namely F1 − cos(Θ) ˆ P1 (Q1 ) = − δ, (256a) 7 sin(Θ) 4eφ ∗ 4
ˇ RADLI CKA
296
Pˆ1 (Q2 ) = f2 (δ, Q1 , Q1 , Q1 , z), Pˆ1 (Q3 ) =
∂f2 (δ, Q1 , Q1 , Q1 , z) ∂Qi +
(256b) Q2i +
∂f2 (δ, Q1 , Q1 , Q1 , z) Q2i ∂Qi
∂f2 (δ, Q1 , Q1 , Q1 , z) Q2i + f3 (δ, Q1 , Q1 , Q1 , z) ∂Qi
(256c)
.. . The following procedure is similar to the procedure mentioned in Section IV.A. 2. The Lie Algebraic Method For the description of the dispersion case the phase space must include two more canonically conjugate variables, τ and pτ , and the Poisson brackets therefore read [f, g] =
∂f ∂g ∂f ∂g ∂f ∂g ∂f ∂g − + . − ∂q ∂p ∂p ∂q ∂τ ∂pτ ∂pτ ∂τ
(257)
The definition and properties of the Lie operator and Lie transform remain unchanged except that the Poisson brackets and the phase space do change. Hence, the Hamiltonian must include the variable pτ and its expansion will take the form H (q, p, pτ , z) = H2 (q, p, pτ , z) + H3 (q, p, pτ , z) + H4 (q, p, pτ , z) + · · · . Moreover, if the dipole fields represented by coefficient F1 paraxial approximation takes the form ⎞⎛ ⎛ −1 ∗− 1 Rˆ 0ˆ 0 g 1ˆ φ0 2 1ˆ ∗ ⎟ ⎜ ∗1 ⎜ e ˆ −1 φ ˆ M1 = ⎝ 0ˆ 0⎠⎜ ηR ⎝ φ 2 g 1ˆ φ0∗ h 1 0 0 1 0 0 ⎛ ⎞ 1 0 0 0 2mcημ(h) ˆ ⎜ 0 1 0 0 2mcηνˆ (h) ⎟ ⎜ ⎟ ⎜ ⎟ × ⎜ 0 0 1 0 −2mcημ(g) ˆ ⎟ ⎜ ⎟ ⎝ 0 0 0 1 −2mcηνˆ (g) ⎠ 0
0
0
0
1
(258) are present, the
0
⎞
⎟ ⎟ 0⎠ 1
LIE ALGEBRAIC METHODS
297
which causes the following change of the interaction coordinates
q˜ q (259) p = M1 p˜ p˜ τ pτ and the interaction Hamiltonian as well. The rest of the procedure is unaffected. 3. The Differential Algebraic Method In addition to the image position and momentum, the particles’ trajectories are determined by their energy deviation δ, if one considers the dispersion case. For this method, the differential algebraic basis must include other elements. The general basis elements then read |l1 , l2 , l3 , l4 , d = Xl1 Y l2 Pxl3 Pyl4 Pτd
(260)
and since Pτ remains constant along each ray (i.e., Pτ = 0), the derivative of the general differential algebraic element takes the form d |l1 , l2 , l3 , l4 , d = − H2 + H3 + H4 + · · · , |l1 , l2 , l3 , l4 , d . (261) dz − 1 linear differential equations of the first In this case, one can find 5+k 5 order, which determine the solution up to the k-th order. The procedure that follows is analogous to the monochromatic case. F. Example: Round Magnetic Lens The round magnetic lens is the basic element of most electron microscopes. Our intention is to illustrate the application of the perturbation methods described above rather than to describe the physical properties of the lens. First, we perform the calculation using the eikonal method and the Lie algebraic method. The trajectory method will be applied last. We omit the differential algebraic method because it is not appropriate for analytic calculation of aberrations. We than compare the results of all these methods. The magnetic axially symmetric field is determined by one function— the axial flux density B(z). We use the following expansion of the vector potential 1 1 (262a) Ax = − yB(z) + y x 2 + y 2 B (z), 2 16 1 1 (262b) Ay = xB(z) − x x 2 + y 2 B (z), 2 16 Az = 0. (262c)
ˇ RADLI CKA
298
The paraxial equation of the system has already been studied, so we can start to calculate the aberration. 1. The Eikonal Method The Lagrangian of the system up to the fourth order reads M = M 2 + M4 , where 1 ∗1 2 1 φ 2 q − ηB(z)(xy − x y), (263) 2 2 1 1 2 1 (264) M4 = − φ ∗ 2 q 2 + ηB (xy − x y)q2 ; 8 16 hence, the third-order Lagrangian is not present in the system. We will solve the problem in rotating coordinates, where the Lagrangian reads M2 =
1 ∗ 1 2 η2 B 2 (z) 2 φ 2Q − Q , (265) 1 2 8φ ∗ 2 2 1 2 1 1 M4 = − L1 Q2 − L2 Q2 Q 2 − L3 Q 2 − R(XY − X Y )2 4 2 4 − CP Q2 (XY − X Y ) − CQ Q 2 (XY − X Y ) (266) M2 =
with the coefficients defined as follows: L1 = L2 = L3 = R= CQ = CP =
η4 B 4 ∗ 32
32φ η2 B 2
−
η2 8φ
∗ 12
, 1 8φ ∗ 2 1 ∗1 φ 2, 2 η2 B 2 , 1 8φ ∗ 2 1 ηB, 4 η3 B 3 ηB . − ∗ 16φ 16
BB ,
(267a) (267b) (267c) (267d) (267e) (267f)
When we use the paraxial approximation in parameterization with coordinate and momentum in the object plane after substitution of Eq. (93) for Q and Q
LIE ALGEBRAIC METHODS
299
and using L2z = Q2 P2 − (QP)2 , we derive 2 1 M˜ 4 = − φ ∗−2 L1 h4 + 2L2 h2 h 2 + L3 h 4 P2o 4 3 − φ ∗− 2 L1 gh3 + L2 hh (gh) + L3 g h 3 P2o (Qo Po ) − φ ∗−1 L1 g 2 h2 + 2L2 ghg h + L3 g 2 h 2 − R (Qo Po )2 1 − φ ∗−1 L1 g 2 h2 + L2 g 2 h2 + g 2 h 2 + L3 g 2 h 2 + 2R Q2o P2o 2 1 − φ ∗− 2 L1 hg 3 + L2 gg (hg) + L3 h g 3 Q2o (Qo Po ) 2 1 − L1 g 4 + 2L2 g 2 g 2 + L3 g 4 Q2o 4 3 − φ ∗− 2 CQ h 2 + CP h2 P2o Lz − 2φ ∗−1 (CQ g h + CP gh)Qo Po Lz 1 − φ ∗ 2 CQ g 2 + CP g 2 Q2o Lz .
(268)
Because there is no electric field in the system, that is, φ ∗ = φo∗ , a slightly simplified form of the Wronskian remains constant gh − g h = g(zo )h (zo ) − g (zo )h(zo ) = 1,
st − s t = s(zo )t (zo ) − s (zo )t (zo ) = Wso .
(269a) (269b)
When we introduce the notation C=φ
∗− 12
z
L1 h4 + 2L2 h2 h 2 + L3 h 4 dz,
(270a)
L1 gh3 + L2 hh (gh) + L3 g h 3 dz,
(270b)
L1 g 2 h2 + 2L2 ghg h + L3 g 2 h 2 − R dz,
(270c)
zo
K=φ
∗− 12
z zo
A=φ
∗− 12
z zo
F =φ
∗− 12
z zo
L1 g 2 h2 + L2 g 2 h2 + g 2 h 2 + L3 g 2 h 2 + 2R dz, (270d)
ˇ RADLI CKA
300 D=φ
∗− 12
1
E = φ ∗− 2
z zo z
L1 hg 3 + L2 gg (hg) + L3 h g 3 dz,
L1 g 4 + 2L2 g 2 g 2 + L3 g 4 dz,
(270e)
(270f)
zo
k=φ
∗− 12
1
a = φ ∗− 2
1
d = φ ∗− 2
z zo z
zo z
CQ h 2 + CP h2 dz,
(271a)
(CQ g h + CP gh) dz,
(271b)
CQ g 2 + CP g 2 dz
(271c)
zo
the action takes the form of 2 3 1 1 I Soi = − φ ∗− 2 C P2o − φ ∗−1 KP2o (Qo Po ) − φ ∗− 2 A(Qo Po )2 4 1 1 1 1 2 − φ ∗− 2 F Q2o P2o − DQ2o (Qo Po ) − φ ∗ 2 E Q2o − φ ∗−1 kP2o Lz 2 4 1
− 2φ ∗− 2 aQo Po Lz − dQ2o Lz
(272)
and using Eq. (178) the perturbation in the image plane reads ∗− 3 2 CP2 Po + φ ∗−1 K P2 Qo + 2(Qo Po )Po Q(1) o o i =M φ 1
1
+ 2φ ∗− 2 A(Qo Po )Qo + φ ∗− 2 F Q2o Po + DQ2o Qo + kφ ∗−1 2Lz Po − P2o JˆQo 1 + 2aφ ∗− 2 Lz Qo − (Qo Po )JˆQo − dQ2o JˆQo ,
(273)
1 where Jˆ was defined in Eq. (81). When we use Po = φ ∗ 2 Qo , Q(1) = M CQo2 Qo + K Qo2 Qo + 2(Qo Qo )Qo + 2A(Qo Qo )Qo + F Q2o Qo + DQ2o Qo + k 2Kz Qo − Qo2 JˆQo (274) + 2a Kz Qo − (Qo Qo )JˆQo − dQ2o JˆQo
with Kz = Xo Yo − Yo Xo .
LIE ALGEBRAIC METHODS
301
In the case of parameterization by position in the object and aperture plane, we proceed similarly 2 1 M˜ 4 = − L1 t 4 + 2L2 t 2 t 2 + L3 t 4 Q2a 4 − L1 st 3 + L2 tt (st) + L3 s t 3 Q2a (Qo Qa ) − L1 s 2 t 2 + 2L2 sts t + L3 s 2 t 2 − RWs2 (Qo Qa )2 1 L1 s 2 t 2 + L2 s 2 t 2 + s 2 t 2 + L3 s 2 t 2 + 2RWs2 Q2o Q2a 2 − L1 ts 3 + L2 ss (ts) + L3 t s 3 Q2o (Qo Qa ) −
2 1 L1 s 4 + 2L2 s 2 s 2 + L3 s 4 Q2o 4 − Ws CQ t 2 + CP t 2 Q2a Lz
−
− 2Ws (CQ s t + CP st)Qo Qa Lz − Ws CQ s 2 + CP s 2 Q2o Lz (275) where now Lz = Xo Ya − Xa Yo . Using coefficients C=φ
∗− 12
Ws−1
z
4 L1 t + 2L2 t 2 t 2 + L3 t 4 dz,
(276a)
L1 st 3 + L2 tt (st) + L3 s t 3 dz,
(276b)
L1 s 2 t 2 + 2L2 sts t + L3 s 2 t 2 − RWs2 dz,
(276c)
zo
K=φ
∗− 12
Ws−1
z zo
A=φ
∗− 12
Ws−1
z zo
F =φ
∗− 12
Ws−1
1
D = φ ∗− 2 Ws−1
z zo z
L1 s 2 t 2 + L2 s 2 t 2 + s 2 t 2 + L3 s 2 t 2 + 2RWs2 dz, (276d) L1 ts 3 + L2 ss (ts) + L3 t s 3 dz,
(276e)
L1 s 4 + 2L2 s 2 s 2 + L3 s 4 dz,
(276f)
zo
E=φ
∗− 12
Ws−1
z zo
ˇ RADLI CKA
302 k=φ
∗− 12
1
a = φ ∗− 2
1
d = φ ∗− 2
z zo z
zo z
CQ t 2 + CP t 2 dz,
(277a)
(CQ s t + CP st) dz,
(277b)
CQ s 2 + CP s 2 dz
(277c)
zo
the action takes the form of C 2 F I = − Q2a − KQ2a (Qo Qa ) − A(Qo Qa )2 − Q2o Q2a − DQ2o (Qo Qa ) Soi 4 2 E 2 2 2 2 (278) − Q − kQa Lz − 2aQo Qa Lz − dQo Lz 4 o and using Eq. (187) yields Q(1) = M CQ2a Qa + K Q2a Qo + 2(Qo Qa )Qa + 2A(Qo Qa )Qo + F Q2o Qa + DQ2o Qo + k 2Lz Qa − Q2a JˆQo + 2a Lz Qo − (Qo Qa )JˆQo − dQ2o JˆQo . (279) 2. The Lie Algebraic Method The Hamiltonian up to the fourth order takes the form H = H2 + H 4 ,
(280)
where H2 =
η ∗ 12
2eφ η3
p2 +
ηB 2φ
∗ 12
Lz +
ηeB 2 4φ
∗ 12
q2 ,
(281)
2 2 2 2 η3 B 2 2 2 eη3 eη 4 H4 = q p q + B − BB p + 3 3 3 1 ∗ ∗ ∗ ∗ 8e3 φ 2 16eφ 2 128φ 2 32φ 2 3 3 η3 B 2 2 η B ηB η3 B 2 L + L + − q + Lz p2 (282) z ∗ 32 z ∗ 32 ∗ 12 ∗ 32 2 8eφ 16φ 16φ 4e φ which in rotating coordinates takes the form H2 =
P2 2φ
∗ 12
+
η2 B 2 8φ
∗ 12
Q2 ,
(283)
303
LIE ALGEBRAIC METHODS
H4 =
2 1 2 2 1 ∗−1 1 L1 Q + φ L2 Q2 P2 + φ ∗−2 L3 P2 − φ ∗−1 RL2z 4 2 4 1
3
+ φ ∗− 2 CP Q2 Lz + φ ∗− 2 CQ P2 Lz .
(284)
In the case of parameterization with coordinates and momentum in the object plane, the interaction Hamiltonian can be calculated using ˜ P), ˜ P(Q, ˜ P) ˜ , H4int = H4 Q(Q, (285) ˜ P) ˜ in the paraxial approximation is where the transformation (Q, P) → (Q, described in Eq. (93); hence, H4int =
2 1 ∗−2 φ L1 h4 + 2L2 h2 h 2 + L3 h 4 P˜ 2 4 3 ˜ P) ˜ + φ ∗− 2 L1 gh3 + L2 hh (gh) + L3 g h 3 P˜ 2 (Q ˜ P) ˜ 2 + φ ∗−1 L1 g 2 h2 + 2L2 ghg h + L3 g 2 h 2 − R (Q 2 2 1 ˜ P˜ + φ ∗−1 L1 g 2 h2 + L2 g 2 h2 + g 2 h 2 + L3 g 2 h 2 + 2R Q 2 2 1 ˜ P) ˜ ˜ (Q + φ ∗− 2 L1 hg 3 + L2 gg (hg) + L3 h g 3 Q 2 2 1 ˜ + L1 g 4 + 2L2 g 2 g 2 + L3 g 4 Q 4 3 + φ ∗− 2 CQ h 2 + CP h2 P˜ 2 Lz ˜ PL ˜ z + 2φ ∗−1 (CQ g h + CP gh)Q 1 ˜ 2 Lz + φ ∗ 2 CQ g 2 + CP g 2 Q
(286)
where we used L2z = p2 q2 − (qp)2 . Eqs. (231), (243) and (246) can be used to write g3 = 0,
(287) z
g4 = −
H4int Q[2] , P[2] , z dz;
(288)
zo
that is, 2 2 3 1 1 g4 = − φ ∗− 2 C P[2]2 − φ ∗−1 KP[2]2 Q[2] P[2] − φ ∗− 2 A Q[2] P[2] 4 1 1 − φ ∗− 2 F Q[2]2 P[2]2 − DQ[2]2 Q[2] P[2] 2 2 1 1 1 − φ ∗ 2 E Q[2]2 − φ ∗−1 kP[2]2 Lz − 2φ ∗− 2 aQ[2] P[2] Lz − dQ[2]2 Lz . 4
ˇ RADLI CKA
304
The position in the image plane can be calculated using Eq. (249) ∂g4 (Qo , Po ) Q(3) = M Qo − , ∂Po
(289)
4 (wo ) where the second term −M ∂g∂P agrees with Q(1) from Eq. (273). o When the parameterization with position in image and aperture plane is used, the interaction Hamiltonian reads 2 2 1 ˜a H4int = L1 t 4 + 2L2 t 2 t 2 + L3 t 4 Q 4 2 ˜Q ˜ a) ˜ a (Q + L1 st 3 + L2 tt (st) + L3 s t 3 Q ˜Q ˜ a )2 + L1 s 2 t 2 + 2L2 sts t + L3 s 2 t 2 − R (Q
2 2 1 ˜ Q ˜a L1 s 2 t 2 + L2 s 2 t 2 + s 2 t 2 + L3 s 2 t 2 + 2R Q 2 ˜Q ˜ a) ˜ 2 (Q + L1 ts 3 + L2 ss (ts) + L3 t s 3 Q 2 2 2 1 ˜ ˜ a Lz + CQ t 2 + C P t 2 Q + L1 s 4 + 2L2 s 2 s 2 + L3 s 4 Q 4 2 ˜Q ˜ a Lz + C Q s 2 + C P s 2 Q ˜ Lz , − 2(CQ s t + CP st)Q (290) +
2 [2] [2] 2 1 g4 = − C Q[2]2 Q Qa − A Q[2] Q[2] − KQ[2]2 a a a 4 1 [2]2 2 1 − F Q[2]2 Q[2]2 − DQ[2]2 Q[2] Q[2] − E Q a a 2 4 [2]2 [2] [2] [2]2 − kQa Lz − 2aQ Qa Lz − dQ Lz
(291)
and using Eq. (250), we derive
∂g4 (Qo , Qa ) , Q(3) = M Qo − ∂Qa
(292)
which corresponds to Eq. (279). 3. The Trajectory Method The trajectory equation can be derived from the Lagrangian equations d ∂M ∂M = 0, − dz ∂Q ∂Q which leads to 1
φ ∗ 2 Q +
η2 B 2 4φ
∗ 12
ˆ Q = (L3 Q − L2 Q − CQ J2 Q − 2CQ Jˆ2 Q )Q 2
(293)
LIE ALGEBRAIC METHODS
305
+ (L2 Q + L2 Q − L1 Q − CP Jˆ2 Q − 2Cp Jˆ2 Q )Q2 + 2(L2 Q − CP Jˆ2 Q)QQ + 2(L3 Q − CQ Jˆ2 Q)Q Q + 2(CQ Q − R Jˆ2 Q)Kz + 2(CQ Q + CQ Q − CP Q − 2R Jˆ2 Q − R Jˆ2 Q)Kz ,
(294)
where Kz = XY − Y X . Hence, the trajectory equation takes the form of Q +
η2 B 2 Q = f3 (Q, Q , Q , z); 4φ
(295)
that is, there is no polynomial of the second order on the right-hand side. Hence, Q2 = 0 [Eq. (150)] and Eq. (152) is reduced to Q3 +
η2 B 2 Q3 = f3 (Q1 , Q1 , Q1 , z). 4φ
(296)
Using the method of variation of parameters yields Q3 =
∗− 1 −Mφo 2
zi
1
φ ∗ 2 h(t)f3 (Q1 , Q1 , Q1 , t) dt
zo
zi = −M
h(t)f3 (Q1 , Q1 , Q1 , t) dt,
(297)
zo
or Q3 = −
M
zi
∗1 Wso φo 2 zo
M =− Wso
zi
1
φ ∗ 2 t (α)f3 (Q1 , Q1 , Q1 α) dα
t (α)f3 (Q1 , Q1 , Q1 α) dα,
(298)
zo
depending on the parameterization used. Because the practical calculation is lengthy, we show it only for the case of parameterization by position and slopes in the object plane. In such a case, 2 Q1 = gQo + hQo , Wg = gh − hg = 1, and Q1 = − L L3 Q1 ; hence, Eq. (296)
306
ˇ RADLI CKA
reads η2 B 2 Q3 4φ ∗ 1 = φ ∗− 2 I1 Qo 2 Qo + I2 Qo2 Qo + I3 (Qo Qo )Qo + I4 (Qo Qo )Qo + I5 Q2o Qo + I6 Q2o Qo + A1 Kz Qo + A2 Qo 2 Jˆ2 Qo + A3 Kz Qo + A4 (Qo Qo )Jˆ2 Qo + A5 Q2o Jˆ2 Qo + N1 (Qo Qo )Jˆ2 Q + N2 Qo2 Jˆ2 Qo + N3 Q2o Jˆ2 Qo (299)
Q3 +
where the coefficients take the forms L22 3 h − 2L2 h 2 h + L2 h2 h , I1 = − L1 + L3 L2 I2 = − L1 + 2 gh2 − 2L2 h 2 g + L2 g h2 − 2R h − 4Rh , L3 L22 I3 = −2 L1 + gh2 − 4L2 hg h + 2L2 ghh + 2R h + 4Rh , L3 L22 I4 = −2 L1 + hg 2 − 4L2 gg h − 4Rg − 2R g + 2L2 ghg , L3 L2 I5 = − L1 + 2 g 2 h − 2L2 hg 2 + L2 g 2 h + 4Rg + 2R g, L3 L22 3 g − 2L2 g 2 g + L2 g 2 g , I6 = − L1 + L3 L2 h − 2 CP + CQ h, A1 = 2CQ L3
(300a) (300b) (300c) (300d) (300e) (300f)
(301a)
2 h g − CP h2 g − 2CQ h 2 g − 2CP h2 g A2 = −CQ L2 − 2 CP − CQ (301b) ghh , L3 L2 g, (301c) g + 2CQ A3 = −2 CP + CQ L3 L2 2 2 ghg A4 = −2CQ gg h − 2CP g h − 4CQ g h − 6CP − 2CQ L3 L2 2 − 2 CP − CQ (301d) g h, L3 L2 2 gg 2 − 2 2CP − CQ g g , (301e) A5 = −2CQ g 3 − CP g 3 − CQ L3
LIE ALGEBRAIC METHODS
307
L2 L2 2 ghh − 2 CP − CQ h g − 4CQ g h 2 N1 = − 6CP − 2CQ L3 L3 − 2CQ hg h − 2CP gh2 , (302a) L2 2 h h − 2CQ h 3 − CP h3 − CQ hh 2 , (302b) N2 = −2 2CP − CQ L3 L2 N3 = −2 CP − CQ ghg − 2CP g 2 h − 2CQ g 2 h − CP g 2 h L3 − CQ hg 2 .
(302c)
We used Kz Jˆ2 Q = Q 2 Q − (QQ )Q ,
(303a)
Kz Jˆ2 Q = (QQ )Q − Q2 Q .
(303b)
Using Eq. (297), we can write Q3 in the form of Q3 = M I˜1 Qo 2 Qo + I˜2 Qo2 Qo + I˜3 (Qo Qo )Qo + I˜4 (Qo Qo )Qo + I˜5 Q2o Qo + I˜6 Q2o Qo + A˜ 1 Kz Qo + A˜ 2 Qo2 Jˆ2 Qo + A˜ 3 Kz Qo + A˜ 4 (Qo Qo )Jˆ2 Qo + A˜ 5 Q2o Jˆ2 Qo + N˜ 1 (Qo Qo )Jˆ2 Q + N˜ 2 Qo2 Jˆ2 Qo + N˜ 3 Q2o Jˆ2 Qo (304) where coefficients with tildes are given by I˜1 =
∗− 1 −φo 2
zi h(t)I1 (t) dt,
....
(305)
zo
It is clear that Eq. (304) must correspond to Eq. (274); hence, we express the coefficients from Eq. (304) as I˜1 =
∗− 1 φo 2
zi L1 + zo
=
∗− 1 φo 2
zi
L22 4 h − L2 h h3 + 2L2 h 2 h2 dz L3
d 3 3 L1 h + 2L2 h h + L3 h − L3 h h + L2 h h dz dz 4
2 2
4
zo ∗− 12
= C − φo
L3 h 3 h + L 2 h 3 h
zi zo
=C
(306)
ˇ RADLI CKA
308
2 where we used h = − L L3 h and h(zo ) = h(zi ) = 0. For evaluation of the other coefficients, we proceed similarly:
I˜2 = K − φ
∗− 12
zi
d L3 hh 2 g + L2 h3 g − 2Rh2 dz = K, dz
(307a)
zo
I˜3 = 2 K − φ
∗− 12
zi
d L3 hh 2 g + L2 gh2 h + Rh2 dz = 2K, (307b) dz
zo
I˜4 = 2 A − φ
∗− 12
zi zo
I˜5 = F − φ
∗− 12
zi
d 2 2 −Rgh + L2 gg h + L3 g hh dz = 2A, dz (307c)
d 2Rgh + L2 hg 2 h + L3 hh g 2 dz = F, dz
(307d)
d L3 hg 3 + L2 hg 2 g dz = D, dz
(307e)
zo
I˜6 = D − φ ∗− 2 1
zi zo
A˜ 1 = 2k − 2φ
zi
∗− 12
d (CQ h h) dz = 2k, dz
(308a)
d CQ h 2 hg + CP h3 g dz = −k, dz
(308b)
d (CQ g h) dz = 2a, dz
(308c)
zo
zi
A˜ 2 = −k + φ ∗− 2 1
zo
A˜ 3 = 2a − 2φ
∗− 12
zi zo
A˜ 4 = −2a + 2φ
zi
∗− 12
d CP g 2 h2 + CQ ghg h dz = −2a, (308d) dz
zo
A˜ 5 = −d + φ ∗− 2 1
zi zo
d CP g 3 h + CQ ghg 2 dz = −d, dz
(308e)
LIE ALGEBRAIC METHODS
N˜ 1 = 2φ ∗− 2 1
zi
d CP gh3 + CQ h2 g h dz = 0, dz
309
(309a)
zo
N˜ 2 = φ ∗− 2 1
N˜ 3 = φ ∗− 2 1
zi zo zi
d CP h4 + CQ h2 h 2 dz = 0, dz
(309b)
d CP g 2 h2 + CQ h2 g 2 dz = 0. dz
(309c)
zo
The third-order aberration polynomial then takes the form Q3 = M CQo2 Qo + K Qo2 Qo + 2(Qo Qo )Qo + 2A(Qo Qo )Qo + F Q2o Qo + DQ2o Qo + k 2Kz Qo − Qo2 Jˆ2 Qo + 2a Kz Qo − (Qo Qo )Jˆ2 Qo − dQ2o Jˆ2 Qo (310) which coincides with Eq. (274). 4. Comparison of Methods In this section we applied all the analytical perturbation methods described to the simple example of a round magnetic lens. The calculation is not new, but our purpose was to compare the different approaches of the perturbation methods. The comparison shows that even though the formulation of the trajectory method is very straightforward, the practical application is very demanding. On the other hand, there are no significant differences in complexity between the eikonal and the Lie algebraic method. We showed that even though the numerical values of the aberration coefficients are independent of the method used, this does not mean that the form of the aberration integrals is also identical. However, they can be transformed into the same form by using integration by parts, as shown in the comparison of the results of the trajectory method with the eikonal or Lie algebraic method. The eikonal method and the Lie algebraic method are much more desirable for the calculation of higher-order aberrations. Because both methods are based on the symplectic structure of the phase space, the relationship among the aberration coefficients is much more apparent. However, whereas the procedure of derivation of the second-order perturbation relations (196) is not transparent, the calculation of higher-order perturbation in the Lie algebraic method is a straightforward procedure. Moreover, the Lie algebraic method provides insight into the structure of the aberration polynomials—using
ˇ RADLI CKA
310
symplectic classification, we can describe the relationship among aberrations and determine which fields influence a given aberration coefficient. Finally, we can state that the trajectory method is not advisable for calculation of higher-order aberration coefficients. However, we were not able to show any great differences between the eikonal and the Lie algebraic methods. We will show that introduction of the Lie algebra structure into the polynomial space offers some advantages, mostly in the aberrations classification or in the study of periodic systems, but the calculation of the aberration coefficients is still a lengthy task that is rarely done without the use of computer programs for symbolic computation such as Maple (Symbolic Computation Group, University of Waterloo, Waterloo, Ontario, Canada) or Maxima.
VI. T HE S YMPLECTIC C LASSIFICATION OF G EOMETRIC A BERRATIONS The theory of aberrations describes the nature of the optical imperfections of various devices. The aberrations describe the deviation of an electron trajectory from the ideal paraxial ray, determined as a linear function of positions and slopes in the object plane or some other parameterization. From the mathematical point of view, the ray is function r = r(z; ro , ro ),
(311)
which can be expanded in a Taylor series in initial conditions r0 and r0 for each z. The aberration part is determined by the nonlinear part of the expansion. Unfortunately, since the ray is a function of four initial variables, the Taylor polynomial contains a large number of members even in relatively low order, 1 (n + 3)(n + 2)(n + 1). (312) 6 Pn denotes the number of members in the Taylor polynomial of n-th order. It gives 20 members of the third order, 56 members of the fifth order, 220 members of the 9-th order, and so on. Two questions arise. The first is the meaning of each member in the aberration polynomial, including its influence on the optical properties of the system. The second is whether some structure exists among the aberration coefficients, that might be useful for understanding the system properties. Both questions are partly answered by the symplectic classification of aberrations. The symplectic classification of the aberration polynomials of axial symmetric systems was presented in Dragt and Forest (1986a); general systems were classified in a series of works (Navarro-Saad and Wolf, 1986; Pn =
311
LIE ALGEBRAIC METHODS
Wolf, 1986, 1987). We present the classification of aberration polynomials of stigmatic systems in terms of the representation of group adjoint to the algebra of quadratic polynomials that are determined by the quadratic part of the Hamiltonian. We also show the essential influence of the form of the paraxial approximation on the relationship among the aberration polynomials. A. Aberrations and Lie Transformations The previous chapter showed the relation between aberrations and Lie transformations. We found the transfer map in the form of a Lie transformation
M = M1 · · · e:gk (wo ,z): · · · e:g3 (wo ,z): .
(313)
Now we can find the aberrations as the expansion of w(wo , z) = M(wo )
(314)
as a Taylor series in powers of wo ; that is, w = f1 (wo , z) + f2 (wo , z) + · · · + fk−1 (wo , z) + · · · ,
(315)
where fl denotes a homogeneous polynomial of l-th order in the phase-space variables w. Hence, we can describe aberrations using the Lie transformation, particularly using the homogeneous polynomials gk . One of the most significant properties of aberrations is the invariance with respect to a canonical transformation e:f : that may, for example, represent the rotation around the optic axis. The invariance of the aberrations with respect to a canonical transformation e:f : is defined as follows: The canonical transformation exp(:g(w):) is invariant with respect to the canonical transformation exp(:f (w):) when the diagram (316) commutes, w
e:g(w):
˜ w
e:f (w):
wI
˜ e:f (w):
e:g(w
I ):
(316)
˜ I; w
that is, ˜ e:g(w ): e:f (w): = e:f (w): e:g(w): . I
(317)
Such a property of the aberration must influence the form of the Lie transformation, namely g(w) e:g(w): = e−:f (w˜ ): e:g(w ): e:f (w): = e:g:(w ) e−f (w ) e−:g(w ): e:g(w ): ef (w ) = exp :e:f (w): g(w): ; (318) I
I
I
I
I
I
I
ˇ RADLI CKA
312 hence, g(w) must satisfy
g(w) = e:f (w): g(w).
(319)
The canonical transformation does not change the form of g. The invariance of aberrations with respect to a canonical transformation imposes a condition on the Lie transformation. We explore the properties of the Lie transformation in the following text. B. Aberrations and Paraxial Approximation The paraxial approximation describes the optimal electron paths, but we have seen previously that it also plays an important role in the aberration theory. It played an important role in all the perturbation methods. We will focus our attention on the Lie algebraic method, where the paraxial approximation is described by the linear operator M1 that satisfies the equation d M1 = −M1 :H2 :. (320) dz Generally, this equation cannot be solved analytically; however, using the Magnus formula (Jagannathan and Khan, 1996) we can find the operator in the form of the Lie transformation, M1 = exp :g2 (w, z): (321) where z g2 (w, z) = −
1 dz1 H2 (w, z1 ) + 2
zo
z
z2 dz2
zo
dz1 H2 (z1 ), H2 (z2 ) + · · · .
zo
(322) It is a member of subalgebra h2 of the Lie algebra of the quadratic polynomials in phase-space variables with z-dependent coefficients, where the Poisson bracket plays the role of Lie brackets. Mathematically, h2 is the smallest subalgebra of the algebra that contains the quadratic part of Hamiltonian. In the case when [H2 (z1 ), H2 (z2 )] = 0 for each z1 and z2 , h2 is reduced to a one-dimensional (1D) subspace and Eq. (322) is reduced to z g2 (w, z) = −
dz1 H2 (w, z1 ),
if H2 (z1 ), H2 (z2 ) = 0 ∀z1 , z2 .
zo
Unfortunately, this condition is not generally satisfied, and the previous equation is no longer valid. However, if the structure of h2 is known, it is also
313
LIE ALGEBRAIC METHODS
possible to describe the general structure of action of M1 on the polynomial subspace. Let us denote by P the vector space of polynomials in the phase-space variables. The map that assigns for each element g ∈ h2 the adjoint Lie operator :g: ∈ gl(P ) is a representation of the algebra h2 . Using standard procedures we can decompose P into the irreducible subspaces according to the representation of h2 . Let us consider any polynomial f0 that is an element of the invariant subspace U ⊂ P and the series f0 ,
f1 = :g2 :f0 ,
f2 = :g2 :f1 ,
....
Because the first member of the series f0 ∈ U and U is invariant according to the representation of h2 , f1 ∈ U , and similarly for each fi ∈ U (i.e., all members of the series are elements of U ). Moreover, U is a vector space; thus, any linear combination of members of U is member of U as well. Hence,
M1 f = exp(:g2 :)f =
∞ :g2 :n n=0
n!
f ∈ U.
(323)
If the space is invariant with respect to the representation of h2 , then it is invariant with respect to the action M1 = exp(:g2 :), where g2 ∈ h2 . We have thus shown that, if we find the decomposition of the polynomial subspace on the irreducible subspaces under the representation of h2 , these subspaces are also invariant under the action of M1 . But why should we do that? We can provide three reasons. The n-th order part of the Hamiltonian takes the form Hn = aij kl x i y j pxk pyl . i+j +k+l=n
The transition to the interaction Hamiltonian (225) ˜ z), z = M1 Hn (w, ˜ z) = Hn w(w, ˜ z) H int (w,
(324)
is of a complicated form in these coordinates; however, when we decompose the n-th order polynomials on irreducible subspaces Vn = Vn1 ⊕ · · · ⊕ Vnk , the parts of the Hamiltonian belonging to different invariant subspaces do not mix. This simplifies the transition. The action of M1 also arises in two other situations. The first one is on recalculation of the aberration coefficients expressed in the object coordinates to those expressed in the paraxial coordinates in the image. Generally, the coordinates in the image can be expressed by wi = M1 wo + f2 (wo ) + f3 (wo ) + · · · −1 p p p = wi + f2 M−1 1 w i + f 3 M1 w i + · · · ,
(325)
ˇ RADLI CKA
314 p
where wi = M1 wo denotes paraxial coordinates in the image. By using the feature of the Lie transformation (217) the previous equation takes the form of p p −1 p (326) wi = wi + M−1 1 f 2 w i + M1 f 3 w i + · · · , which is in the same form as the previous case. Similarly, we can say that the parts of the aberration polynomials belonging to different irreducible subspaces do not mix during the transformation. The last example concerns the combination of systems. Let us consider two systems, the first of which is described by transition map MI and the second by transition map MI I . The resulting map of these systems, which is formed by a combination of these two maps, is described by the transfer map (Dragt and Forest, 1986b) M = MI MI I . By using decomposition of transfer maps into paraxial and nonlinear parts, we find M = M1 exp :g3 (wo ) + g4 (wo ) + · · · : = MI1 exp :g3I (wo ) + g4I (wo ) + · · · : · MI1I exp :g3I I (wo ) + g4I I (wo ) + · · · : , (327) which can be written by using the manipulations described in the previous section in the form I I )−1 (g
M = MI1 MI1I e:(M1
3 (wo )+g4 (wo )+···):
e:g3
I I (w )+g I I (w )+···: o o 4
. (328)
Using the well-known Baker–Campbell–Hausdorff formula yields −1 I g3 (wo ) = MI1I g3 (wo ) + g3I I (wo ), (329a) −1 I 1 I I −1 I g4 (wo ) + g4I I (wo ) + M1 g3 (wo ), g3I I (wo ) . g4 (wo ) = MI1I 2 (329b) From the last equation, it is clear that it is the aberrations belonging to corresponding irreducible subspaces that mix when combining the transfer maps. These examples show the important role of the paraxial approximation, which can be described by the action of M1 . The decomposition of aberration polynomials into irreducible subspaces shows the structure of the aberration and might be helpful for understanding the optical properties of the system. In particular, we will proceed as follows: • We find the general form of the algebra h2 for stigmatic systems. • We find the decomposition of the polynomial space in irreducible subspaces, according to the representation of h2 . • We describe the structure of invariant subspaces of the polynomial space under the action of M1 .
LIE ALGEBRAIC METHODS
315
C. Lie Algebra h2 In the general case, the quadratic part of the Hamiltonian is described by Eq. (87c); the subalgebra h2 is hence generated by four polynomials q2 , p2 , Lz , and x 2 − y 2 . The algebra h2 is therefore equal to the algebra of all quadratic polynomials, which is via Eq. (312) a 10-dimensional vector space. The structure of such a space is too complicated. Fortunately, the general form of the quadratic part of the Hamiltonian is not necessary, as most optical devices are designed to be stigmatic. When one applies the stigmatic condition (74), the quadratic part of the Hamiltonian reduces to eF12 η ηB 1 eγ0 φ eηB 2 2 q2 . (330) p + L + + + H2 = z ∗ 12 ∗ 12 ∗ 12 ∗ 12 ∗ 32 2 2eφ 2φ 4ηφ 4φ 8ηφ The subalgebra h2 is then formed by four polynomials p2 , q2 , qp, and Lz . It is usual to introduce the notation (Dragt and Forest, 1986a) 1 a + = − p2 , 2 1 a − = q2 , 2 1 1 a0 = a + , a − = qp, 2 2 Lz = xpy − ypx . One can then find the commutation relations Lz , a + = 0, Lz , a − = 0, + − a0 , a + = a + , a , a = 2a0 ,
(331a) (331b) (331c) (331d)
[Lz , a0 ] = 0, a0 , a − = −a − .
The polynomials a + , a − , and a0 form the Lie subalgebra of the quadratic polynomial algebra, which is isomorphic to sp(2, R); moreover, as the polynomial Lz commutes with the others, the structure of h2 can be written h2 ∼ = sp(2, R) ⊕ R.
(332)
This is the structure considered in the following text. Let us denote the adjoint operator to a + , a − , a0 , and Lz as follows: 1 aˆ + = :a + : = − :p2 :, 2 1 aˆ − = :a − : = :q2 :, 2
(333a) (333b)
ˇ RADLI CKA
316
1 1 + − : a , a := :qp:, 2 2 Lˆ z = :Lz : = :xpy − ypx :. aˆ 0 = :a0 : =
(333c) (333d)
These form the basis of the adjoint algebra :h2 :, which has the same algebraic structure as algebra h2 . D. Representation of h2 on the Space of Complex Polynomials When the structure of h2 is known, we can describe the adjoint representation on the polynomial space. The first important property is that the polynomial order remains unchanged under the action of h2 : the homogeneous polynomials of different order do not mix and they can be investigated independently. Hence, the polynomial space can be decomposed to V =
∞
Vn ,
(334)
n=1
where each Vn is a reducible representation of h2 . The order of a polynomial can be calculated as the eigenvalue of a number operator d d d d +y + px Nˆ = x + py . (335) dx dy dpx dpy Because the action h2 conserves the order of homogeneous polynomials, (336) [Nˆ , aˆ 0 ] = Nˆ , aˆ + = Nˆ , aˆ − = [Nˆ , Lˆ z ] = 0. Now we must find a decomposition of each Vn to irreducible subspaces. First, we describe the eigensubspaces of Lˆ z . This operator is the generator of rotation around the axis z (Goldstein, 1980). The transformation has no eigendirection in the real plane perpendicular to the axis z. The standard manner of describing the decomposition on the irreducible subspaces is to extend the real space to a complex space. In the complex extension, it is possible to find the basis of eigenvectors of any linear operator. We introduce the coordinates √ z = (x + iy)/√2, (337a) z¯ = (x − iy)/ 2 (337b) and canonically adjoint momenta found by a standard procedure (Goldstein, 1980), read √ pz = (px − ipy )/ 2, (337c)
LIE ALGEBRAIC METHODS
√ pz¯ = (px + ipy )/ 2.
317 (337d)
The z-coordinate of angular momentum takes the form Lz = i(zpz − z¯ pz¯ ),
(338)
the action of which is described by Lˆ z zi1 z¯ i2 pzi3 p¯ zi4 = i(i3 − i4 − i1 + i2 )zi1 z¯ i2 pzi3 p¯ zi4 ;
(339)
i that is, the polynomials of form zi1 z¯ i2 pz3 p¯ zi4 are eigenvectors of Lˆ z . Let us denote
l = (i3 − i4 − i1 + i2 ). Using n = i1 + i2 + i3 + i4 , one can find that l = n − 2(i1 + i4 ). Because i1 + i4 can take any value from 0 to n for a given polynomial order n l ∈ {−n, −n + 2, . . . , n − 2, n}.
(340)
The eigenvectors with the same value of l form a subspace of Vn . As [sp(2, R), Lz ] = 0, these subspaces are invariant under the action of h2 and one can write Vn =
n
Vn,n−2k .
(341)
k=0 i
Let us consider the polynomial u = zi1 z¯ i2 pz3 pzi¯4 ∈ Vn,l , then the polynomial i u¯ = zi2 z¯ i1 pzi4 pz¯3 ∈ Vn,l must exist in the subspace Vn . Because Lˆ z u = ilu, the action of Lˆ z on u¯ takes the form ¯ Lˆ z u¯ = −i(i3 − i4 − i1 + i2 )u¯ = −il u; hence, u¯ ∈ Vn,−l . Using the linearity of Lˆ z one can find, for each u ∈ Vn,l , a complex conjugate polynomial u¯ ∈ Vn,−l . Thus, it was shown that subspaces Vn,l and Vn,−l are complex conjugate, that is, Vn,l = V¯n,−l .
(342)
The next step is to describe the action of sp(2, R) on the polynomial subspace Vnl . The polynomials a + , a − , and a0 transform as follows: a + = −pz pz¯ ,
(343a)
a = z¯z, 1 a0 = (zpz + z¯ pz¯ ). 2
(343b)
−
(343c)
ˇ RADLI CKA
318
Let us note that all the polynomials Lz , a + , a − , and a0 are real polynomials expressed in complex coordinates. First, we describe general properties of a finite dimensional complex representations of sp(2, R), which is for our case represented by operators aˆ + , aˆ − , and aˆ 0 . Here it is customary to introduce the Casimir operator (Bäuerle and Kerf, 1999), aˆ 2 = aˆ − aˆ + + aˆ 02 + aˆ 0 = aˆ + aˆ − + aˆ 02 − aˆ 0 .
(344)
This commutes with all elements of sp(2, R) 2 − 2 + 2 aˆ , aˆ = aˆ , aˆ = aˆ , aˆ 0 = 0, hence, in the irreducible subspace there exists a basis formed by common eigenvectors of aˆ 2 and aˆ 0 . Let us denote them |j, m, aˆ 2 |j, m = j (j + 1)|j, m,
(345)
aˆ 0 |j, m = m|j, m.
(346)
From the commutation relations the meaning of aˆ + and aˆ − as raising or descending operator, respectively, can be desired aˆ 0 aˆ ± |j, m = aˆ ± aˆ 0 ± a ± |j, m = (m ± 1)a ± |j, m, ±
a |j, m ∼ |j, m ± 1,
(347) (348)
aˆ +
transforms a vector with an eigenvalue m relative to that is, the action of aˆ 0 to a vector with the eigenvalue m + 1; similarly the action of aˆ − transforms a vector with the eigenvalue m to a vector with the eigenvalue m − 1. The representation is finite, the consequence of which is that there must exist a vector |j, m1 = 0 that vanishes under the action of aˆ + , i.e. a + |j, m1 = 0. Using Eq. (344) it can be known aˆ 2 |j, m1 = m21 + m1 |j, m1 and comparing with Eq. (345) m1 = j . Hence, in the irreducible subspace, there exists a vector |j, j = 0 that vanishes under the action of aˆ + . Applying the operator aˆ − , we can generate a chain of vectors, k (349) |j, j − k ∼ aˆ − |j, j , which is invariant under the action of sp(2, R) and forms an irreducible subspace. For the representation to be finite, the chain must be finite as well; hence, there must exist km such that |j, j − km = 0 that vanishes under the action of aˆ − ; that is, aˆ − |j, j − km ∼ |j, j − km − 1 = 0
(350)
LIE ALGEBRAIC METHODS
and when we calculate the action of aˆ 2 aˆ 2 |j, j − km = aˆ + aˆ − + aˆ 02 − aˆ 0 |j, j − km = (j − km )2 − j + km |j, j − km = j 2 + j |j, j − km
319
(351)
and compare the coefficients in the last equation, we find km = 2j . Hence, we can write the possible eigenvalues of aˆ 0 m ∈ {j, j − 1, . . . , −j + 1, −j }.
(352)
Moreover, because km is an integer, the possible values of j are constrained to 3 1 (353) j ∈ 0, , 1, , . . . . 2 2 The irreducible representation of sp(2, R) can be characterized by the eigenvalues of aˆ 2 . We will denote them Dj , (354) Dj = |j, j , |j, j − 1, . . . , |j, −j + 1, |j, −j . We have not yet completely defined the polynomial |j, m; we have only noted that it is proportional to (a − )j −m |j, j ; now we determine the multiplicative factor. We will use a different normalization from that used in quantum mechanics, where the representation of sl(2, R) ∼ = sp(2, R) is used for the description of the angular momentum operator. We will use the normalization introduced by Dragt and Forest (1986a) 1 aˆ − |j, m, m+j 1 aˆ + |j, m. |j, m + 1 = m−j |j, m − 1 =
(355a) (355b)
It is well known (Bäuerle and Kerf, 1999) that the representation of sp(2, R) is completely reducible, which for our case means that the space Vn,l can be written as a direct sum Vn,l,j = Dj , (356) Vn,l = j ∈K
j ∈K
where K is an index set. Now two questions arise: (1) what is the form of elements in Dj and (2) how can we determine the index set K? It was shown that each irreducible space Dj is determined by the appropriate vector |j, j ; hence, if one wants to describe all Dj ∈ Vnl , one must find all the polynomials u ∈ Vnl that vanish under the action of aˆ + and are
ˇ RADLI CKA
320
eigenvectors of aˆ 0 and aˆ 2 . The polynomials c u = pza pzb¯ i(zpz − z¯ pz¯ ) = pza pzb¯ Lcz ,
a + b + 2c = n
fulfil such conditions. In fact, aˆ + u = 0, aˆ 0 u = 12 (a + b)u, and aˆ 2 u = ( 14 (a + b)2 + 12 (a + b))u. Moreover, we require the polynomial to be an element of Vnl . Let us find the connection between a, b, c, and n, l, and j .
Nˆ u = (a + b + 2c)u = nu, Lˆ z u = i(a − b)u = ilu, 1 1 2 aˆ u = (a + b) (a + b) + 1 u = j (j + 1)u. 2 2
(357a) (357b) (357c)
Comparing the results and solving the algebraic equation results in the u in form j + 12 l j − 21 l 12 n−j pz¯ Lz
u = pz
j ;l
=: n Pj ,
(358)
where the notation for the polynomial using its eigenvalues relative to j ;l operators N , Lˆ z , aˆ 2 , and aˆ 0 was introduced; thus n Pm satisfies
Nˆ n Pmj ;l = n Pmj ;l , j ;l j ;l Lˆ z n Pm = il n Pm , j ;l aˆ Pm j ;l aˆ 0 n Pm 2n
= j (j + 1) =
(359a) (359b) n
Pmj ;l ,
j ;l m n Pm .
(359c) (359d)
Moreover, it can easily be shown that n
n
Pjj ;l = Lz2
−j 2j
Pjj ;l .
(360)
Because the exponent of pz , pz¯ , and Lz in Eq. (358) cannot be negative, the value of j is constrained with the others, that is, j
1 n, 2
hence,
j
j∈
1 l, 2
1 j − l, 2
|l| |l| n , + 1, . . . , 2 2 2
(361)
(362)
and subspace Vn,l ∼ = D |l| ⊕ D |l| +1 ⊕ · · · ⊕ D n2 ⊕ Un,l 2
2
(363)
321
LIE ALGEBRAIC METHODS j ;l
where each Dj is generated by the action of aˆ − on the polynomial n Pj , n
1
Pmj ;l = α(j, m)Lz2
n−j − j −m j + 12 l j − 21 l pz pz¯ aˆ
1
= Lz2
n−j 2j
Pmj ;l . (364)
Unl is some rest subspace and the multiplicative factor α(j, m) j i=m+1 1/(i + j ). Using Eq. (364) it is easy to see that n
Vnlj = Lz2
−j
· V2j,l,j .
=
(365)
Now we will show that Unl = 0 by comparing the subspace dimensions. The dimension of the n-th order polynomial subspace can be evaluated as 1 (n + 3)! = (n + 3)(n + 2)(n + 1). (366) n!3! 6 Conversely, the dimension can be found
n/2 n n dim Vn = Vn,n−2k = dim Dj + dim Un,n−2k dim Vn =
k=0
k=0
j =|n−2k|/2
and as the dimension of Dj is 2j + 1,
n/2 n (2j + 1) + dim Un,n−2k dim Vn = k=0
j =|n−2k|/2
n |n − 2k| n + 1 + |n − 2k| + 1 − +1 = 2 2 2 k=0 2 n n n (n − 2k)2 +1 − + dim(Un,n−2k ) + dim Un,n−2k 2 4 n 1
k=0
=
k=0
1 (n + 3)(n + 2)(n + 1) + 6
n
dim(Un,n−2k ).
(367)
k=0
Comparing Eqs. (366) and (367) it shows that dim Unl = 0 and Vn,l ∼ = Vn,l, |l| ⊕ Vn,l, |l| +1 ⊕ · · · ⊕ Vn,l, n2 , 2
(368)
2
where each Vn,l,j ∼ = Dj . We have shown that Vn,l = V¯n,−l ; when we use the decomposition in Eq. (368) it is easy to see that each of the subspaces Vn,l,j is the complex conjugate of the subspace Vn,−l,j , or more particularly, n
¯ Pmj ;l = n Pmj ;−l .
(369)
ˇ RADLI CKA
322
The description of the decomposition of polynomial space to irreducible subspaces is done by combining Eqs. (334), (341) and (368). The particular form of the polynomials is described by Eq. (364). E. Example: Decomposition of Polynomial Space up to Fourth Order We use the method explained in the previous subsection. Since the irreducible subspaces of different orders do not mix, we can describe the decomposition of each order independently. 1. Polynomials of the Third Order For the subspace V3 , n = 3 and using Eq. (340), l ∈ {−3, −1, 1, 3}. Hence, V3 = V3,−3 ⊕ V3,−1 ⊕ V3,1 ⊕ V3,3 .
(370)
For V3,−3 , n = 3 and l = −3 using Eq. (362), j ∈ { 32 }; hence, V3,−3 = V3,−3, 3 ∼ = D3/2 . 2
(371)
The irreducible subspace D3/2 is generated from polynomial (358); that is, 3
3
P 32
;−3
2
= pz3¯
(372a)
and, applying the operator aˆ − , 3
3
P 12
3
P
3
P
;−3
2 3 2 ;−3 − 12 3 2 ;−3 − 32
hence,
1 − 3 32 ;−3 1 − 3 aˆ P 3 = aˆ pz¯ = zpz2¯ , 3 3 2 1 − 3 12 ;−3 1 − 2 = aˆ P 3 = aˆ zpz¯ = z2 pz¯ , 2 2 2 =
− 21 ;−3
= aˆ − 3 P 3 2
= aˆ − z2 pz¯ = z3 ;
V3,−3 = pz3¯ , zpz2¯ , z2 pz¯ , z3 .
(372b) (372c) (372d)
(373)
Using the same procedure one can find for l = −1, j ∈ { 12 , 32 }, and V3,−1 ∼ = D1/2 ⊕ D3/2 , with D1/2 formed by polynomials 3 3
1
P 12 P
;−1
2 1 2 ;−1 − 12
= ipz¯ (zpz − z¯ pz¯ ) = pz¯ Lz ,
(374a)
= iz(zpz − z¯ pz¯ ) = zLz
(374b)
LIE ALGEBRAIC METHODS
323
and D3/2 by 3
1
P 32
3
P
3
P
3
P
;−1
2 1 2 ;−1 1 2 1 2 ;−1 − 21 1 2 ;−1 − 23
= pz pz2¯ , 1 2 2 z¯ p + zpz¯ pz , 3 z¯ 3 2 1 = z¯ zpz¯ + z2 pz , 3 3 =
= z¯ z2 .
(375a) (375b) (375c) (375d)
The subspace V3,1 is complex conjugate to V3,−1 , i.e., V3,1 ∼ = D1/2 ⊕ D3/2 with D1/2 = pz Lz , z¯ Lz , ! " 2 1 2 1 2 2 2 2 D3/2 = pz pz¯ , z¯ pz pz¯ + zpz , z¯ pz¯ + z¯zpz , z¯z 3 3 3 3 ∼ and the subspace V3,3 = D3/2 , V3,3 = pz3 , z¯ pz2 , z¯ 2 pz , z¯ 3
(376a) (376b)
(377)
is complex conjugate to V3,−3 . 2. Polynomials of the Fourth Order For the fourth-order polynomials, the values of l are elements of the set {−4, −2, 0, 2, 4}. We show the procedure only for the case l = 0; for the others, we present only the results. When l = 0, j goes through the set {0, 1, 2}; thus, V4,0 ∼ = V4,0,0 ⊕ V4,0,1 ⊕ V4,0,2 . The irreducible space V4,0,0 ∼ = D0 is a 1D vector space generated by the polynomial
P00;0 = L2z = −(zpz − z¯ pz¯ )2 . (378) The irreducible subspace V4,0,1 ∼ = D1 is generated by the polynomial 4
4
and applying operator
aˆ − ,
P11;0 = pz pz¯ Lz
we generate the basis
1 1 P01;0 = aˆ − 4 P11;0 = (zpz + z¯ pz¯ )Lz , 2 2 4 1;0 P−1 = aˆ − 4 P01;0 = z¯zLz . 4
(379a)
(379b) (379c)
ˇ RADLI CKA
324
The subspace V4,0,2 ∼ = D2 , the descending series determined by the polynomial 4
P22;0 = (pz pz¯ )2 ,
(380a)
takes the form 1 P12;0 = pz pz¯ (zpz + z¯ pz¯ ), 2 1 1 4 2;0 P0 = (zpz + z¯ pz¯ )2 + z¯zpz pz¯ , 6 3 1 2;0 4 P−1 = z¯z(zpz + z¯ pz¯ ), 2 4 2;0 P−2 = z¯ 2 z2 . 4
(380b) (380c) (380d) (380e) ∼ = D2 ,
In the subspace V4,−4 j must be equal to 2; hence, V4,−4 = V4,−4,2 which is generated by (381) V4,−4,2 = pz4¯ , zpz3¯ , z2 pz2¯ , z3 pz¯ , z4 . For V4,−2 , j goes through the set {1, 2}; hence, V4,−2 = V4,−2,1 ⊕ V4,−2,2 ∼ = D1 ⊕ D2 , where V4,−2,1 = pz2¯ Lz , zpz¯ Lz , z2 Lz (382) and V4,−2,2
! 1 1 = pz pz3¯ , pz2¯ (¯zpz¯ + 3zpz ), zpz¯ (zpz + z¯ pz¯ ), 4 2 " 1 2 z (zpz + 3¯zpz¯ ), z¯ z3 . 4
(383)
The subspace V4,2 is complex conjugate to V4,−2 , and V4,4 is complex conjugate to V4,−4 . F. Representation of h2 on the Real Space of Polynomials We have described the representation of h2 on the space of complex polynomials, but the real representation is more complicated. We have noted that, except for the axially symmetric polynomials, there exist no eigenvectors of Lz in the real polynomial space. Fortunately, using the decomposition of the space of complex polynomials, we can find the irreducible subspaces of real polynomials. Let us consider the irreducible subspaces Vn,l,j and Vn,−l,j . We showed that j ;l these subspaces are complex conjugate, and each element n Pm ∈ Vn,l,j is
325
LIE ALGEBRAIC METHODS j ;−l
complex conjugate to n Pm
∈ Vn,−l,j . Let us define two real polynomials
1 n j ;l n j ;−l Pm + P m , 2 i j ;l j ;−l η = n Pm − n P m . 2 ξ=
(384a) (384b)
Even though these two polynomials are not eigenvectors of Lˆ z , the subspace ξ, η is invariant under the action; that is, 1 j ;l j ;−l Lˆ z ξ = il n Pm − il n Pm = lη, (385a) 2 i j ;l j ;−l Lˆ z η = il n Pm + il n Pm = −lξ. (385b) 2 Now we will apply this approach to the whole space V . Let us define the real vectors n j ;l Cm = 2j −1 n Pmj ;l + n Pmj ;−l , (386a) n j ;l j −1 n j ;l n j ;−l Sm = i2 Pm − P m (386b) and for each n, l > 0, and j the real subspaces j ;l j ;l + = n Cj , . . . , n C−j , Un,l,j j ;l j ;l − = n Sj , . . . , n S−j . Un,l,j
(387a) (387b)
Using the properties of representation h2 on the space of complex polynomials, we find − a − n j ;−l aˆ − n j ;l a n j ;l Cm = 2j −1 Pm + Pm j +m j +m j +m n j ;l j ;l j ;−l = 2j −1 n Pm−1 + n Pm (388) = Cm−1 and similarly aˆ − n j ;l n j ;l Sm = Sm−1 . j +m
(389) j ;l
+ can also be generated from n Cj by lowering with The subspace Un,l,j − aˆ − and the subspace Un,l,j can likewise be generated from the polynomial n S j ;l . m
Hence, these subspaces are irreducible under the action of sp(2, R) and isomorphic to Dj . Unfortunately, they are not invariant under the action of Lˆ z , j ;l j ;l Lˆ z n Cm = l n Sm ,
(390a)
ˇ RADLI CKA
326
j ;l j ;l Lˆ z n Sm = −l n Cm ,
(390b)
which causes mixing these two subspaces. The irreducible subspace under the action of h2 is then + − ⊕ Un,l,j , Un,l,j = Un,l,j
(391)
where ⊕ for this one instance means just vector space addition. When l = 0, the situation is more straightforward because all the polynoj ;0 mials n Pm are real polynomials expressed in complex coordinates. We need only to express the polynomials in real coordinates and normalize; that is, n j ;0 Cm
j ;0
= 2 j n Pm .
(392)
Hence, the decomposition of the real polynomial space n
U=
1
2 2
Un,2k,j
(393)
n k=0 j =k
is the decomposition on the irreducible subspaces according to the action of h2 . G. Example: Real Polynomials up to the Fourth Order We have calculated the decomposition of the space of complex polynomials to irreducible subspaces according to the action of h2 . Now we will use that result and, applying the procedure described in the last section, we will show the decomposition of the real polynomial space up to the fourth order. 1. The Third-Order Polynomials We will show the procedure for construction of the space U3,3,3/2 . Using + can be found: Eqs. (373), (377) and (386b), the basis of U3,3,3/2
3
C
3 3 1 ;3 ;−3 = 2 2 3 P 32 + 3 P 32 2 √ 32 3 = 2 pz + pz¯ = px 3 − 3px py2 , 1 = 2 2 (¯zpz + zpz¯ ) = x px2 − py2 − 2ypx py ,
3
C
1 = 2 2 pz z¯ 2 + pz¯ z2 = px x 2 − y 2 − 2xypy ,
(394c)
3
C
1 = 2 2 z3 + z¯ 3 = x 3 − 3xy 2 .
(394d)
3
3
C 32
;3
2
3 2 ;3 1 2 3 2 ;3 − 12 3 2 ;3 − 32
(394a) (394b)
327
LIE ALGEBRAIC METHODS
Alternatively, the basis can be found as the descending series generated by the 3/2;3 − takes the form polynomial 3 C3/2 . The basis of U3,3,3/2
3
S
3 3 1 ;3 ;−3 = i2 2 3 P 32 − 3 P 32 2 √ 32 3 = i 2 pz − pz¯ = 3px2 py − py3 , 1 = i2 2 (¯zpz − zpz¯ ) = y px2 − py2 + 2xpx py ,
3
S
1 = i2 2 pz z¯ 2 − pz¯ z2 = py x 2 − y 2 + 2xypx ,
(395c)
3
S
1 = i2 2 z3 − z¯ 3 = 3x 2 y − y 3 .
(395d)
3
3
S 32
;3
2
3 2 ;3 1 2 3 2 ;3 − 12 3 2 ;3 − 32
Hence, U+
3,3, 32
U−
3,3, 32
= px 3 − 3px py2 , x px2 − py2 − 2ypx py , px x 2 − y 2 − 2xypy , x 3 − 3xy 2 , = 3px2 py − py3 , y px2 − py2 + 2xpx py , py x 2 − y 2 + 2xypx , 3x 2 y − y 3 .
(395a) (395b)
(396a)
(396b)
Now we present the results for the other subspaces. U+
3,1, 12
U+
3,1, 32
U−
3,1, 32
= px Lz , xLz ,
! 1 = px p2 , xp2 + 3 ! 1 = py p2 , yp2 + 3
U−
3,1, 12
= py Lz , yLz ,
2 1 px qp, px q2 + 3 3 2 1 py qp, py q2 + 3 3
" 2 xqp, xq2 , 3 " 2 2 yqp, yq . 3
(397)
(398a) (398b)
2. Polynomials of the Fourth Order Apart from the case l = 0, the procedure is similar to the case for polynomials of the third order. Only the results are presented here. + − ⊕ U4,4,2 , where U4,4,2 = U4,4,2 4 2;4 C2 4 2;4 C1 4 2;4 C0
2 = px2 − py2 − 4px2 py2 , = (xpx − ypy ) px 2 − py2 − 2px py (xpy + ypx ), = x 2 − y 2 px2 − py2 − 4xypx py ,
(399a) (399b) (399c)
ˇ RADLI CKA
328 4 2;4 C−1 4 2;4 C−2
= (xpx − ypy ) x 2 − y 2 − 2xy(xpy + ypx ), 2 = x 2 − y 2 − 4x 2 y 2
+ and polynomials are bases of U4,4,2 4 2;4 S2 = 4px py px2 − py2 , 4 2;4 S1 = (xpy + ypx ) px2 − py2 + 2px py (xpx − ypy ), 4 2;4 S0 4 2;4 S−1 4 2;4 S−2
= 2(xpx − ypy )(xpy + ypx ), = (xpy + ypx ) x 2 − y 2 + 2xy(xpx − ypy ), = 4xy x 2 − y 2
(399d) (399e)
(400a) (400b) (400c) (400d) (400e)
− . form bases of U4,4,2 The subspace U4,2 is formed from subspaces + U4,2,2 = px2 − py2 p2 , xpx3 − ypy3 , (xpx − ypy )qp, x 3 px − y 3 py , 2 x − y 2 q2 , (401a) ! 1 − = 2px py p2 , p2 (xpy + ypx ) + px py qp, (xpy + ypx )qp, U4,2,2 2 " 1 (401b) (xpy + ypx )q2 + xyqp, 2xyq2 , 2
+ U4,2,1 = px2 − py2 Lz , (xpx − ypy )Lz , x 2 − y 2 Lz , − = 2px py Lz , (xpy + ypx )Lz , 2xyLz . U4,2,1
(402a) (402b)
The subspace belonging to l = 0 is formed from irreducible subspaces ! " 2 2 2 2 2 2 1 2 2 2 U4,0,2 = p , p qp, p q + 2(qp) , q qp, q , (403a) 3 (403b) U4,0,1 = p2 Lz , qpLz , q2 Lz , 2 U4,0,0 = Lz . (403c) H. Decomposition of Polynomials with Respect to M1 As shown previously, the irreducible subspaces in real polynomial subspace according to the action of h2 are also invariant with respect to
M1 = e:g2 : ,
g2 ∈ h2 .
(404)
LIE ALGEBRAIC METHODS
329
Moreover, if B(z) = 0, using Eqs. (322) and (330) g2 = a(z)p2 + b(z)qp + c(z)q2 + d(z)Lz ,
(405)
where each coefficient is nonzero, these invariant subspaces are irreducible. The decomposition into irreducible subspaces is then described by Eq. (393). Now we can easily find the physical meaning of the numbers that characterize the basis vectors of irreducible subspaces. First, we will show the meaning of number l. In complex polynomial space, it determines the eigenvalue of Lz , that is, j ;l j ;l Lˆ z n Pm = il n Pm ;
(406)
the meaning of l cannot be found in the real polynomial because, except for the axial symmetric polynomials, there are no eigenvectors of Lz . Let us consider the polynomial n j ;l Cm = 2j −1 n Pmj ;l + n Pmj ;−l , (407) which, according to the action of exp(ϕ:Lz :), takes the form j ;l j ;l j ;−l exp(ϕ:Lz :) n Cm = 2j −1 eϕ:Lz : n Pm + eϕ:Lz : n Pm j ;l j ;−l = 2j −1 eiϕl n Pm + e−iϕl n Pm .
(408)
j ;l
When ϕl = 2π k, the polynomial n Cm is invariant under the Lie transformation exp(ϕ:Lz :). This transformation represents the rotation in phase space j ;l around the optic axis through an angle ϕ = 2π k/ l. The polynomials n Cm j ;l n and Sm remained unchanged when the phase space is rotated around the origin through an angle 2π k/ l. The next interesting characteristic of the homogeneous polynomials is the eigenvalue m of Lie operator aˆ 0 . The operator a0 generates the Lie transformation exp(τ :a0 :), which acts as pure magnification in four-dimensional (4D) phase space, −τ e q q q exp(τ :a0 :) . (409) = exp(τ :qp:) = p p eτ p Thus, the value m describes how the influence of aberrations changes with magnification, or more trivially, by how it expresses excess of p’s over q’s; m is known as the Seidel weight (Navarro-Saad and Wolf, 1986). Aberrations with the highest weight are most important in practical calculations in magnifying systems because they contain just px or py and their influence does not change with distance from the axis in the object plane. The eigenvalue j of operator aˆ 2 has the worst-explained meaning. It expresses the order of skewness variable Lz = xpy −ypx in the homogeneous
ˇ RADLI CKA
330 j ;l
polynomial n Cm , which takes n/2 − j . In the axial symmetric case, when the order of Lz is odd, the polynomial is not invariant under the discrete space reflection x → −x, px → −px , y → y, py → py or y → −y, py → −py , x → x, px → px , respectively, and the aberrations that it describes are anisotropic. I. The Third-Order Axially Symmetric Aberrations The third-order axial symmetric aberrations are described by the Lie transformation exp(:g4 :), where g4 ∈ V4,0 , that is, g4 =
j 2
j ;0
αj,m 4 Cm .
(410)
j =0 m=−j
This type of aberration is well described and classified; we now mention the standard meaning of each member in Eq. (410). The term α2,2 4 C22;0 describes the spherical aberration 2 exp α2,2 :4 C22;0 : q = exp α2,2 : p2 : q = q − 4α2,2 p2 p + o(5) = q + C3 p2 p + o(5)
(411)
where we introduced the coefficient of third-order spherical aberration C3 = −4α2,2 . The next term describes the coma exp α2,1 :4 C12;0 : q = exp −α2,1 :p2 qp: q (412) = q + K p2 q + 2(qp)p + o(5), where the coefficient of the third-order coma K = −α2,1 was introduced. The 2;0 generates the distortion; that is, using D = −α2,−1 polynomial α2,−1 4 C−1 2;0 exp α2,1 :4 C−1 : q = exp −D:q2 qp: q = q + Dq2 q + o(5). (413) 2;0 The polynomial 4 C−2 has no effect on the change of coordinates; it affects only the final slopes. It is necessary to describe the effect of 4 C02;0 and 4 C00;0 together; they generate the astigmatism and the field curvature. We use the definition used in Dragt and Forest (1986b) rather than definition in Hawkes and Kasper (1989): 1 2 2 2 2 4 2;0 4 0;0 α2,0 C0 + α0,0 C0 = α2,0 p q + (qp) + α0,0 q2 p2 − (qp)2 3 3 1 (414) = −A(qp)2 − F p2 q2 , 2
LIE ALGEBRAIC METHODS
331
where the coefficients of astigmatism A = α0,0 − 2/3α2,0 and field curvature F = −2/3α2,0 − 2α0,0 were introduced. The effect of astigmatism then reads exp −A:(qp)2 : q = q + 2A(qp)q + o(5) (415) and similarly, the effect of the field curvature 1 2 2 exp − F :q p : q = q + F p2 q + o(5). 2
(416)
Both aberrations mix the effect of the action of the polynomials 4 C02;0 and 4 C 0;0 . In fact, these polynomials differ only in value of spin; their other 0 characteristics are identical. The polynomials from the space V4,0,1 are not invariant under reflection with respect to any plane that contains the optic axis. Thus, the aberrations that are described by these polynomials are anisotropic. The first is anisotropic coma exp α1,1 :4 C11;0 : q = exp −k:Lz p2 : q = q + k 2Lz p − p2 Jˆ2 q + o(5) = 2kqpJˆ2 p − 3kp2 Jˆ2 q,
(417)
with k = −α1,1 ; the second one is the anisotropic astigmatism exp α1,0 :4 C01;0 : q = exp(−a:Lz pq:)q = q + a(Lz q − qpJˆ2 q) + o(5) (418) = a q2 Jˆ2 p − 2qpJˆ2 q , where a = −α1,0 ; and the last is the anisotropic distortion 1;0 exp α1,−1 :4 C−1 : q = exp −d:Lz q2 : q = q − dq2 Jˆ2 q + o(5). (419) These coefficients have no analogue in light optics; their existence is caused by the presence of a magnetic field (Hawkes and Kasper, 1989). The structure of aberration polynomials merits further mention. They are represented by the third-order homogeneous polynomials as the result of the Poisson bracket [V4,0 , q] and because each g ∈ V4,0 satisfies Lˆ z [g, q] = [Lˆ z g, q] + [g, Lˆ z q] = [g, q]
(420)
where [V4,0 , q] ∈ V3,1 . The polynomial space V3,1 is formed by 12 polynomials and considering the phase-space dimension, it takes 124 possible combinations. However, we showed that there are just nine independent aberration coefficients. This is because the transformation is canonical and can be described by g4 ∈ V4,0 , like the Lie transformation exp(:g4 :). Hence,
ˇ RADLI CKA
332
the general axial symmetric aberration polynomial reads 2 q˜ qo po po qo = +C = exp(:g4 :) po po 0 p˜ 2 2 2(qo po )po + po qo qo po (qo po )qo +K + 2A +F −(qo po )po −p2o po −p2o qo 2 qo qo 0 +D +E −q2o qo −2(qo po )qo − q2o po Lz po − po2 Jˆ2 qo Lz qo − (qo po )Jˆ2 qo +k + a −Lz qo − qo po Jˆ2 po −po2 Jˆ2 po q2o Jˆ2 qo . (421) −d 2Lz qo + q2o Jˆ2 qo J. Reflection Symmetry We have noted that the reflection symmetry has interesting consequences in properties of aberration in the axial symmetric case. Therefore, we will describe the general reflection symmetry of the aberrations polynomials. Let us denote by Ωˆ α the reflection with respect to the plane −x sin α + y cos α = 0, the plane that arises from the plane xz by the rotation around the z-axis about the angle α. Such a transformation can be described by composition of transformations Ωα = Rˆ −α Ωˆ 0 Rˆ α .
(422)
Next we find the transformation property for the complex polynomial
n P j ;l . m
It is readily seen that Ωˆ 0 Lz = −Lz ; hence, n n −j n −j j ;l j ;l j ;l Ωˆ 0 n Pm = Ωˆ 0 Lz2 2j Pm = (−1) 2 −j Lz2 Ωˆ 0 2j Pm .
(423)
The transformation Ωˆ 0 , which in fact changes the sign of y and py , is l;j equivalent to complex conjugation for the complex polynomials 2j Pm . Hence, using Eq. (369) j ;l j ;−l Ωˆ 0 2j Pm = 2j Pm
(424)
and combining the previous results, we find n
−j j ;l j ;−l j ;−l Ωˆ 0 n Pm = (−1) 2 −j Lz2 2j Pm = (−1) 2 −j n Pm . n
n
(425)
333
LIE ALGEBRAIC METHODS j ;l
The transformation property of n Pm according to Ωˆ α can be found j ;l j ;l j ;l j ;−l Ωˆ α n Pm = Rˆ −α Ωˆ 0 Rˆ α n Pm = Rˆ −α Ωˆ 0 eilα n Pm = (−1) 2 −j eilα Rˆ −α n Pm n
n
j ;−l
= (−1) 2 −j e2ilα n Pm
(426)
.
There are two interesting cases, the first one—α = kπ/ l—then j ;l j ;−l Ωˆ k πl n Pm = (−1) 2 −j n Pm n
(427)
and the second—α = π/2l + kπ/ l—then j ;l j ;−l Ωˆ 2lπ +k πl n Pm = −(−1) 2 −j n Pm . n
(428)
These properties can be used to describe the transformation properties of real j ;l polynomials. First, we direct our attention to polynomial n Cm j ;l j ;l j ;−l Ωˆ α n Cm = Ωˆ α 2j −1 n Pm + n Pm n j ;−l j ;l (429) = (−1) 2 −l e2ilα n Pm + e−2ilα n Pm , hence, j ;l j ;l Ωˆ k πl n Cm = (−1) 2 −j n Cm , n
j ;l j ;l Ωˆ 2lπ +k πl n Cm = −(−1) 2 −j n Cm , n
(430a) (430b)
j ;l
and similarly for the polynomial n Sm j ;l j ;l j ;−l Ωˆ α n Sm = Ωˆ α i2j −1 n Pm − n Pm n j ;−l j ;l = i(−1) 2 −l e2ilα n Pm − e−2ilα n Pm ,
(431)
hence, j ;l j ;l Ωˆ k πl n Sm = −(−1) 2 −j n Sm , n
j ;l j ;l Ωˆ 2lπ +k πl n Sm = (−1) 2 −j n Sm . n
(432a) (432b)
The previous equations completely describe the reflection symmetry of polynomials. Unfortunately, from the reflection symmetry of the optical system, it is not possible to find the refraction symmetry of the aberration coefficients. This is a consequence of the anisotropic effect magnetic field, j ;l j ;l which mixes the polynomials n Cm with n Sm . We can proceed in this manner only for electrostatic lenses or in light optics.
ˇ RADLI CKA
334
K. Changing Parameterization We have described how the aberration coefficients change when the paraxial image coordinates are used instead of the object ones. Now we describe the change of the aberration coefficients, when the parameterization by the position in the object and aperture planes is used or when the position of the aperture plane is changed. Let us denote the interaction coordinates in case of parameterization by the object position and momenta q1 and p1 , in case of the parameterization by the position in the object and aperture plane q2 and p2 = qa . The transitions between rotating coordinates and interaction coordinates are then described by ⎞ ⎛ ∗− 12 ˆ ˆ h 1 g 1 φ o q ⎠ q1 ∗ (433) =⎝ 1 φ ˆ p1 p h1 φ ∗ 2 g 1ˆ φo∗
in the case of parameterization by position and momenta in the object plane or by s 1ˆ t 1ˆ q2 q (434) = ∗ 12 ˆ ∗ 12 ˆ p p φ s1 φ t1 2 in the case of parameterization by the position in the object and aperture plane. It is clear that the meaning of q1 and q2 does not differ, thus, q1 = q2 .
(435)
Comparing the equations for q ∗− 12
q = gq1 + φo
hp1 ,
q = sq2 + tp2 = sq1 + tp2 ,
(436) (437)
we can find ∗− 12
p2 = ga q1 + φo
ha p1
(438)
where we used s=g− t=
ga h, ha
1 h. ha
(439) (440)
The subscript a indicates that the function is evaluated in za , i.e., ha = h(za ).
335
LIE ALGEBRAIC METHODS
The transformation takes the form of an extended canonical transformation ˆ 1 0ˆ q1 q2 1 = ∗− 2 p2 ˆ ˆ p 1 ga 1 φo ha 1 ˆ 1 0ˆ 1 q1 1 . (441) = exp ga :q2 : ∗− p1 2 0ˆ φo 2 ha 1ˆ Now we determine how the transformation acts on the symplectic polynomials n C j ;l , that is, m j ;l j ;l n j ;l Cm (q2 , p2 ) = Mn Cm = n Cm q2 (q1 , p1 ), p2 (q1 , p1 ) . (442) First, we show the effect of the scaling, which is described by the diagonal matrix ˆ 1 0ˆ 1 Ms = . (443) ∗− 0ˆ φo 2 ha 1ˆ ∗− 12
This transformation leaves q1 unchanged and scales p1 by the factor φo j ;l The power of p1 in n Cm is expressed by the eigenvalue m, that is, n ˜j ;l Cm
∗− 1 n +m n j ;l j ;l = Ms n Cm = φo 2 ha 2 Cm .
ha .
(444)
The action of the canonical part of the transformation takes the form ∞ (ga aˆ − )i n ˜j ;l 1 2 n ˜j ;l n ˜j ;l Mc Cm = exp ga :q : Cm = Cm ; (445) 2 i! i=0
using Eq. (388), we find j ;l Mc n C˜m
=
j +m i=0
j + m i n ˜j ;l ga Cm−i . i
(446)
Finally, the transformation is described by the composition of these two maps: j ;l n j ;l Cm (q2 , p2 ) = Mc Ms n Cm j +m ∗− 12 n +(m−i) j + m i n j ;l 2 φo ha ga Cm−i (q1 , p1 ). (447) = i i=0
We can see that all irreducible subspaces Vnlj are invariant under the transformation M and we can describe them independently. Let us consider
ˇ RADLI CKA
336 the polynomial j
gn =
j ;l am n Cm (q1 , p1 )
=
m=−j
j
j ;l
a˜ m n Cm (q2 , p2 ).
(448)
m=−j
Using Eq. (447), we find gn =
j
a˜ m
m=−j
=
j
j +m
n +(m−i) i ∗− 1 φo 2 ha 2 ga
i=0 j
n +k j ;l ∗− 1 φo 2 ha 2 n Ck
k=−j
j +m i
n j ;l Cm−i (q1 , q2 )
gam−k a˜ m
m=k
j +m m−k
(449)
and comparing the previous two equations, it is easy to find the transformation of aberration coefficients in the form j ∗− 12 n +k m−k j + m 2 ak = φo ha . (450) a˜ m ga m−k m=k
Let us now apply the previous results to the transformation of the third-order axial symmetric aberrations. They are described by g4 in Eq. (495a)— j ;0 j ;0 g4 = αj,m 4 Cm (q1 , p1 ) = α˜ j,m 4 Cm (q2 , p2 ) (451) j,m
j,m
and using Eq. (450), we can write for V4,0,2 ⎛ ∗− 1 ⎞ ⎛ (φo 2 ha )−4 α2,2 1 0 0 ⎜ ⎟ ⎜ ∗− 12 ⎟ −3 ⎜ 1 0 ⎜ (φo ha ) α2,1 ⎟ ⎜ 4ga ⎜ ⎟ ⎜ 2 2 ⎜ ∗− 12 ⎟ 6g 3g 1 a a ⎜ (φo ha )−2 α2,0 ⎟ = ⎜ ⎜ ⎟ ⎜ 3 2 ⎜ ∗− 1 ⎟ ⎝ 4ga 3ga 2ga ⎝ (φo 2 ha )−1 α2,−1 ⎠ ga4 ga3 ga2 α2,−2 and similarly for V4,0,1 ⎛ ∗− 1 (φo 2 ha )−3 α1,1 ⎜ ⎜ ∗− 12 ⎜ (φo ha )−2 α1,0 ⎝ ∗− 12
(φo
ha )−1 α1,−1
⎞ ⎟ ⎟ ⎟= ⎠
1 2ga ga2
0 1 ga
1
⎞⎛ ⎞ 0 α˜ 2,2 ⎟ ⎜ 0⎟ ⎟ ⎜ α˜ 2,1 ⎟ ⎟ ⎜ 0 ⎟ ⎜ α˜ 2,0 ⎟ ⎟ (452) ⎟⎜ ⎟ 0 ⎠ ⎝ α˜ 2,−1 ⎠
ga
1
0 0 0
0 0 1
α˜ 2,−2
α˜ 1,1 α˜ 1,0 α˜ 1,−1
(453)
LIE ALGEBRAIC METHODS
337
and finally ∗− 1 −2 α˜ 0,0 . α0,0 = φo 2 ha
(454)
Transformation of the aberration coefficients can be easily found from the transformation property of αj,m using their definition, presented in Section VI.I. The aberration coefficients also vary when the aperture position is changed. Let us consider a system with the object position in zo and the aperture position in za . The rays are parameterized by the position in the object plane q1 and by the position in the aperture plane, which plays the role of the generalized momentum p1 . When the position of the aperture is changed to z = za˜ , the rays are then similarly parameterized by q2 = q1 , p2 . The paraxial ray then reads q = sq1 + tp1 = s˜ q1 + t˜p2 ;
(455)
hence, p2 =
s − s˜ t q1 + p1 , t˜ t˜
(456)
which can be using t˜ =
t , t (za˜ )
s˜ = s −
s(za˜ ) t t (za˜ )
(457)
written in form p2 = s(za˜ )q1 + t (za˜ )p1 .
(458)
The transformation matrix between q1 , p1 and q2 , p2 then describes the extended canonical transformation ˆ ˆ 1 0ˆ 1 0ˆ 1 q2 q1 q1 = = exp sa˜ :q2 : . (459) p2 p1 p1 2 sa˜ 1ˆ ta˜ 1ˆ 0ˆ ta˜ 1ˆ Because the procedure is similar to previous case, we present only the results: j ;l n j ;l Cm (q2 , p2 ) = Mc Ms n Cm j +m n +(m−i) j + m i n j ;l 2 sa˜ Cm−i (q1 , p1 ) (ta˜ ) (460) = i i=0
and analogically to Eq. (450), we can determine j n j +m m−k +k ak = (ta˜ ) 2 . a˜ m sa˜ m−k m=k
(461)
ˇ RADLI CKA
338
When the results are applied to the third-order axial symmetric aberrations, for V4,0,2 we derive ⎞ ⎛ 1 (ta˜ )−4 α2,2 ⎜ (t )−3 α2,1 ⎟ ⎜ 4sa˜ ⎟ ⎜ ⎜ a˜ ⎟ ⎜ 2 ⎜ ⎜ (ta˜ )−2 α2,0 ⎟ = ⎜ 6sa˜ ⎟ ⎜ ⎜ ⎝ (ta˜ )−1 α2,−1 ⎠ ⎝ 4s 3 a˜ α2,−2 s4 ⎛
a˜
0 1 3sa2˜
0 0 1
0 0 0
3sa2˜ sa3˜
2sa˜
1
sa2˜
sa˜
0 ⎞ ⎛ α˜ 2,2 ⎞ ⎟ ⎜ 0⎟ ⎟ ⎜ α˜ 2,1 ⎟ ⎟ ⎜ 0 ⎟ ⎜ α˜ 2,0 ⎟ ⎟. ⎟ ⎟⎜ ⎠ ⎝ 0 α˜ 2,−1 ⎠ α˜ 2,−2 1
(462)
Similarly for V4,0,1 we derive ⎞ ⎛ 1 (ta˜ )−3 α1,1 ⎝ (ta˜ )−2 α1,0 ⎠ = ⎝ 2sa˜ sa2˜ (ta˜ )−1 α1,−1 ⎛
0 1 sa˜
⎞⎛ ⎞ α˜ 1,1 0 0 ⎠ ⎝ α˜ 1,0 ⎠ , 1 α˜ 1,−1
(463)
and finally α0,0 = (ta˜ )−2 α˜ 0,0 .
VII. A XIAL S YMMETRIC A BERRATIONS OF
(464)
THE
F IFTH O RDER
Recently the demand for aberration calculations has increased, since the exact calculations of the field and ray-tracing lead to better designed electron systems. Moreover, the possibility of eliminating all primary aberrations (Rose, 1987) makes a knowledge of the aberrations of fifth order necessary. Therefore, we present a calculation of the fifth-order aberration of a round magnetic lens based on the Lie algebraic method. The value of the spherical aberration will be compared with the value that is calculated by fitting of raytracing results. The analytical calculation of the fifth-order aberration is not an easy task because the symbolic calculation becomes very lengthy. We used the mathematical program Maple (Maple, 2006) for symbolic calculation. The fifth-order aberrations are described by g4 and g6 ; hence, for the calculation the symplectic structure of the axial symmetric polynomials up to sixth order must be known. This is solved in the first two sections. Next we show the expansion of the interaction Hamiltonian up to the sixth order and the calculation of the aberration using the Lie algebraic method.
LIE ALGEBRAIC METHODS
339
A. Axial Symmetric Polynomial of the Sixth Order According to Eq. (368) the space of the sixth-order axially symmetric polynomials is formed by V6,0 ∼ = V6,0,0 ⊕ V6,0,1 ⊕ V6,0,2 ⊕ V6,0,3 .
(465)
The space V6,0,3 is a descending series of the polynomials determined by the polynomial (p2 )3 , 6 3;0 C3 6 3;0 C2 6 3;0 C1 6 3;0 C0 6 3;0 C−1 6 3;0 C−2 6 3;0 C−3
3 = p2 , 2 1 3 = aˆ − p2 = qp p2 , 6 1 1 2 = aˆ − 6 C23;0 = q2 p2 + 4(qp)2 p2 , 5 5 1 − 6 3;0 2 3 = aˆ C1 = (qp)3 + q2 p2 qp, 4 5 5 1 − 6 3;0 1 2 2 2 4 2 = aˆ C0 = p q + q (qp)2 , 3 5 5 1 − 6 3;0 = aˆ C−1 = q2 qp, 2 3 3;0 = aˆ − 6 C−2 = q2 .
(466a) (466b) (466c) (466d) (466e) (466f) (466g)
Using Eq. (365) we can easily compute others subspaces from the axial symmetric polynomials of lower orders. The first is the anisotropic subspace of the axial symmetric polynomials of the sixth order, V6,0,2 = Lz V4,0,2 ,
(467)
6 2;0 Cm
(468)
that is 2;0 = Lz 4 Cm .
The next subspace is formed by the isotropic polynomials V6,0,1 = L2z V2,0,1 = L2z p2 , L2z qp, L2z q2 .
(469)
The last one is formed by one anisotropic polynomial V6,0,0 = L3z .
(470)
340
ˇ RADLI CKA
B. Algebra of the Axial Symmetric Polynomials up to the Sixth Order The subspace of the axially symmetric polynomials up to the sixth order takes the form
O6 = h2 ⊕ V4,0,0 ⊕ V4,0,1 ⊕ V4,0,2 ⊕ V6,0,0 ⊕ V6,0,1 ⊕ V6,0,2 ⊕ V6,0,3 , (471) where all the subspaces were explicitly described previously. We introduce the algebraic structure of O6 , which is necessary for the description of the fifth-order axially symmetric aberrations. We have shown that each subspace in the decomposition in Eq. (471) is irreducible with respect to the action h2 . However, the space O6 is not an irreducible subspace of the axially symmetric polynomial with respect to the standard Poisson bracket, because e.q. the Poisson bracket [6 C03;0 , 6 C23;0 ] is a polynomial of the tenth order. To avoid the difficulty, we modify the definition of the Lie bracket on the space O6 following j ;0 j ;0 n j1 ;0 n j2 ;0 [n1 Cm11 , n2 Cm22 ] if n1 + n2 − 2 6, (472) 1C 2 m1 , Cm2 6 = 0 else. This definition ensures that the Poisson brackets, which would be of higher order than six, vanish. This means that the Lie brackets of any two polynomials from V6,0 or one polynomial from V4,0 and one from V6,0 vanish. Because the action of h2 was described previously, we need only describe the Lie bracket of any two polynomials from V4,0 . Because the polynomials are axially symmetric the Lie brackets [V4,0,0 , O6 ] vanish. Now we calculate the Lie brackets [V4,0,1 , V4,0,1 ⊕ V4,0,2 ] 4 1;0 4 j ;0 Cm1 , Cm2 3−j 1;0 2j j ;0 = Lz 2 Cm , Cm2 1 ⎧ 3−j + 2j j ;0 6 j ;0 ⎪ ⎪ −2Lz [a , Cm2 ] = −2(m2 − j ) Cm2 +1 if m1 = 1, ⎨ 6 j ;0 2j j ;0 = 2L3−j if m1 = 0, (473) z [a0 , Cm2 ] = 2m2 Cm2 ⎪ ⎪ ⎩ 3−j − 2j j ;0 j ;0 2Lz [a , Cm2 ] = 2(m2 + j ) 6 Cm2 −1 if m1 = −1. The calculation of the Lie brackets [V4,0,2 , V4,0,2 ] is more complicated. We will use several properties to shed light on the algebraic structure. First, we describe the eigenvalues of the Lie bracket of two polynomials from V4,0 with respect to aˆ 0 , Lˆ z , Nˆ and aˆ 2 . Because the Lie bracket consists of a multiplication and two derivatives, we can write for the number operator j ;0 j ;0 j ;0 j ;0 Nˆ 4 Cm11 , 4 Cm22 = 6 4 Cm11 , 4 Cm22 . (474)
LIE ALGEBRAIC METHODS
341
Lˆ z and aˆ 0 are Lie operators; hence, j ;0 j ;0 j ;0 j ;0 j ;0 j ;0 Lˆ z 4 Cm11 , 4 Cm22 = Lˆ z 4 Cm11 , 4 Cm22 + 4 Cm11 , Lˆ z 4 Cm22 = 0, (475a) j ;0 j ;0 j ;0 j ;0 j ;0 j ;0 aˆ 0 4 Cm11 , 4 Cm22 = aˆ 0 4 Cm11 , 4 Cm22 + 4 Cm11 , aˆ 0 4 Cm22 j ;0 j ;0 = (m1 + m2 ) 4 Cm11 , 4 Cm22 . (475b) The first equation shows that the Lie bracket of any two axially symmetric polynomials is again the axially symmetric polynomial. The consequence of the second equation is 4 j1 ;0 4 j2 ;0 Cm1 , Cm2 = 0 for |m1 + m2 | > 3. (476) The situation is more complicated for aˆ 2 because it is not a Lie operator and j ;0 j ;0 the Lie bracket [4 Cm11 , 4 Cm22 ] need not be the eigenvector of aˆ 2 ; hence, 4 j1 ;0 4 j2 ;0 6 j ;0 Cm1 , Cm2 = cj Cm1 +m2 . (477) j ∈J
However, we can at least reduce the set J . We have shown that eigenvalue m is connected with eigenvalue j , i.e., m ∈ {−j, . . . , j }; hence, possible values of j are constrained to j ∈ {|m1 + m2 |, . . . , 3}. Because the reflection with respect to the plane that contains the optic axis is the canonical transformation, the Lie brackets satisfy j ;0 j ;0 j ;0 j ;0 Ω0 4 Cm11 , 4 Cm22 = Ω0 4 Cm11 , Ω0 4 Cm22 j ;0 j ;0 = (−1)2−j1 (−1)2−j2 4 Cm11 , 4 Cm22 . (478) The result of the Lie bracket of any two isotropic or anisotropic polynomials is an isotropic polynomial, but when these polynomials are mixed the result is an anisotropic polynomial. In our case of j1 = j2 = 2, this restricts the possible value of j in Eq. (477) to j ∈ {1, 3}. Summarizing the previous results, we can find ' 3;0 1 < |m1 + m2 | 3, c3 6 Cm 4 2;0 4 2;0 1 +m2 (479) Cm1 , Cm2 = 1;0 6 C 3;0 + c |m + m | < 2. c1 6 Cm 3 1 2 m1 +m2 1 +m2 However, the concrete values of the coefficients c1 and c3 must be determined by standard calculation of Poisson brackets. We present only the results: 3 4 2;0 4 2;0 C2 , C1 = −4 p2 = −46 C33;0 , (480a) 4 2;0 4 2;0 2 2 6 3;0 C2 , C0 = −8 p qp = −8 C2 , (480b) 2 4 2;0 4 2;0 2 8 C2 , C−1 = 8 p2 qp − 4 p2 q2 = − 6 C11;0 − 126 C13;0 , (480c) 5
ˇ RADLI CKA
342 4
2 32 2;0 = −16p2 (qp)2 − 4 p2 q2 = − 6 C01;0 − 166 C03;0 , C22;0 , 4 C−2 5 4 2;0 4 2;0 4 6 1;0 2 2 2 6 3;0 C1 , C0 = −4p q = C1 − 4 C1 , 5 4 2;0 4 2;0 4 C1 , C−1 = −4(qp)3 − 4q2 p2 qp = 6 C01;0 − 86 C03;0 , 5 4 2;0 4 2;0 28 1;0 2 3;0 C1 , C−2 = −8(qp)2 q2 − 4p2 q2 = − 6 C−1 − 126 C−1 , 5 4 2;0 4 2;0 4 1;0 3;0 C0 , C−1 = −4q2 (qp)2 = 6 C−1 − 46 C−1 , 5 4 2;0 4 2;0 2 3;0 C0 , C−2 = −8 q2 qp = −86 C−2 , 4 2;0 4 2;0 2 3 3;0 C−1 , C−2 = −4 q = −46 C−3 .
(480d) (480e) (480f) (480g) (480h) (480i) (480j)
These commutation relations completely describe the algebraic structure of O6 . C. Interaction Hamiltonian of a Round Magnetic Lens The vector potential of the axial symmetric magnetic field reads (Hawkes and Kasper, 1989) 1 Ax = − yΠ z, q2 , 2 1 Ay = xΠ z, q2 , 2 Az = 0
(481) (482) (483)
where Π takes the form 1 2 2 (4) 1 (484) q B + ···. Π = B(z) − q2 B + 8 192 When the potential is substituted to Eq. (29) with pτ = 0 and Φ ∗ = φ, the Hamiltonian reads 1 e2 ∗ 2 − eL Π z, q2 − e2 q2 Π 2 z, q2 . H = φ − p (485) z 4 η2 In rotating coordinates (88) we can find the expansion of the Hamiltonian in the form H2 =
1 ∗− 1 2 1 ∗ 1 2 2 φ 2p + φ 2α q , 2 2
(486)
343
LIE ALGEBRAIC METHODS
2 1 ∗− 3 2 2 1 ∗− 1 2 2 2 1 ∗ 1 4 φ 2 p + φ 2 α q p + φ 2 α − αα q2 8 4 8 1 ∗− 1 2 2 1 ∗−1 2 1 3 1 α − α Lz q2 , + φ 2 α Lz + φ αp Lz + (487) 2 2 2 8 2 2 3 1 1 ∗− 5 2 3 3 1 H6 = φ 2 p + φ ∗− 2 α 2 q2 p2 + φ ∗− 2 3α 4 − αα p2 q2 16 16 16 2 3 1 ∗1 6 1 1 3 2 (4) q + φ 2 α − α α + α − αα 16 8 12 1 ∗− 1 1 2 2 3 2 ∗− 3 2 2 4 2 3α − αα Lz q + α φ 2 Lz p + φ 4 2 4 3 ∗−2 2 2 1 ∗−1 1 2 2 3 3α − α q p Lz + αφ p Lz + φ 8 4 4 1 1 3 1 2 3α 5 + α (4) − α 2 α Lz q2 + φ ∗−1 α 3 L3z + (488) 8 24 2 2 H4 =
where α(z) =
η 1
2φ ∗ 2
(489)
B(z).
In the calculation we use the parameterization by the positions in the object and aperture plane; thus the paraxial approximation takes the form q(z) = s(z)qo + t (z)qa , p(z) = φ
∗ 12
s (z)qo + φ
∗ 12
(490) t (z)qa .
(491)
The interaction coordinates are then in the form of Eq. (226) and the interaction Hamiltonian can be calculated using Eq. (227): H int =
˜ z), z + H6 w(w, ˜ z), z + · · · . H4 w(w,
η ∗ 12
(492)
eφo Wso After lengthy but trivial calculation, H4int and H6int can be written in the form H4int =
j 2 j =0 m=−j
j ;0
aj,m 4 Cm ,
H6int =
j 3
j ;0
bj,m 4 Cm .
j =0 m=−j
The form of the coefficients aj,m and bj,m is show in Appendix B.
(493)
ˇ RADLI CKA
344
D. Calculation of g4 and g6 The Lie transformation that describes the aberration up to the fifth order for axial symmetric systems is described by the polynomials g4 and g6 ,
M1 e:g6 : e:g4 : .
(494)
The axial symmetry determines the form of g4 and g6 ; generally they are written as g4 =
j 2
j ;0
αj,m 4 Cm
(495a)
j =0 m=−j
or g6 =
j 3
j ;0
βj,m 6 Cm .
(495b)
j =0 m=−j
Now the form of the coefficients αj,m and βj,m must be determined. Because zi g4 (zi ) = −
H4 (qo , qa , z) dz = − int
zi j 2
j ;0
aj,m (z) dz 4 Cm
(496)
j =0 m=−j zo
zo
using Eq. (495a), we can easily show that zi αj,m = −
aj,m (z) dz.
(497)
zo
The situation for βj,m is more complicated because the term g6 is calculated from zi zi z 1 dz dz1 H4 (qo , qa , z1 ), H4 (qo , qa , z) g6 = − H6int (qo , qa , z) dz + 2 zo
=−
zo
j 3
zi
zo
j ;0
bj,m (z) dz 6 Cm
j =0 m=−j zo 3 1 + 2
×
4
j1
zi j2
j1 ,j2 =0 m1 =−j1 m2 =−j2 zo
j1 ;0 4 j2 ;0 Cm . 1 , Cm2
z dz
dz1 aj1 ,m1 (z1 )aj2 ,m2 (z) zo
(498)
LIE ALGEBRAIC METHODS j ;0
345
j ;0
The Lie brackets [4 Cm11 , 4 Cm22 ] render the calculation difficult. We will calculate the contribution of each Seidel weight independently. First, we describe the contribution with Seidel weight m = 3; we show later that this polynomial describes the spherical aberration of the fifth order. Using the algebraic structure of O6 described previously, we find that the contribution to 6 C33;0 yields only the bracket [4 C22;0 , 4 C12;0 ] = −4 6 C33;0 ; hence, zi β3,3 = −
zi b3,3 (z) dz − 2
zo
z dz
zo
dz1 a2,2 (z1 )a2,1 (z) − a2,2 (z)a2,1 (z1 ) .
zo
(499)
To simplify the notation let us define the bilinear operator zi
I (f, g) =
z dz
zo
dz1 f (z1 )g(z) − f (z)g(z1 ) .
(500)
zo
Eq. (499) now reads zi b3,3 (z) dz − 2I (a2,2 , a2,1 ).
β3,3 = −
(501)
zo
Next, we find the part with Seidel weight m = −2, which describes the distortion; that is, we must find the form of β3,−2 and β2,−2 , which are the 3;0 2;0 or 6 C−2 , respectively. We use Eq. (498) and commutation coefficients of 6 C−2 relations in the space O6 . The contribution of the Poisson bracket in Eq. (498) 3;0 to 6 C−2 is produced by 4
2;0 4 2;0 2 2;0 3;0 C02;0 , 4 C−2 , C0 = −8 q2o (qo qa ) = −86 C−2 , = − 4 C−2
hence, zi b3,−2 (z) dz + 4I (a2,−2 , a2,0 ).
β3,−2 = − zo
2;0 is given by the Lie brackets The contribution to 6 C−2
4
2;0 4 1;0 2;0 2;0 C−2 , C0 = − 4 C01;0 , 4 C−2 , = 4 6 C−2 4 1;0 4 1;0 4 2;0 4 1;0 2;0 C−1 , C−1 = − C−1 , C−1 = −2 6 C−2
(502)
ˇ RADLI CKA
346 and β2,−2 then takes the form zi
b2,−2 (z) dz + 2I (a2,−2 , a1,0 ) − I (a2,−1 , a1,−1 ). (503)
β2,−2 = − zo
Using a similar approach we find for m = 2 zi β3,2 = −
b3,2 (z) dz − 4I (a2,2 , a2,0 ),
(504)
b2,2 (z) dz + 2I (a1,0 , a2,2 ) + I (a2,1 , a1,1 ),
(505)
zo
zi β2,2 = − zo
for m = 1 zi b3,1 (z) dz + 6I (a2,−1 , a2,2 ) + 2I (a2,0 , a2,1 ),
β3,1 = −
(506)
zo
zi β1,1 = − zo
4 2 b1,1 (z) dz + I (a2,−1 , a2,2 ) + I (a1,0 , a1,1 ) + I (a2,1 , a2,0 ), 5 5 (507)
zi b2,1 (z) dz + 4I (a1,−1 , a2,2 ) + 2I (a2,0 , a1,1 ) + I (a1,0 , a2,1 ),
β2,1 = −
(508)
zo
for m = 0 zi b3,0 (z) dz + 4I (a2,−1 , a2,1 ) + 8I (a2,−2 , a2,2 ),
β3,0 = −
(509)
zo
zi β1,0 = −
6 32 b1,0 (z) dz + I (a2,−1 , a2,1 ) + I (a2,−2 , a2,2 ) 5 5
zo
β2,0
+ 2I (a1,−1 , a1,1 ), zi = − b2,0 (z) dz + 3I (a1,−1 , a2,1 ) + 3I (a2,−1 , a1,1 ), zo
(510) (511)
LIE ALGEBRAIC METHODS
zi β0,0 = −
2 2 b0,0 (z) dz + I (a1,−1 , a2,1 ) + I (a2,−1 , a1,1 ), 5 5
347 (512)
zo
and for m = −1 zi b3,−1 (z) dz + 6I (a2,−2 , a2,1 ) + 2I (a2,−1 , a2,0 ), (513)
β3,−1 = − zo zi
β1,−1 = −
4 b1,−1 (z) dz + I (a2,−2 , a2,1 ) + I (a1,−1 , a1,0 ) 5
zo
β2,−1
2 + I (a2,0 , a2,−1 ), 5 zi = − b2,−1 (z) dz + 2I (a1,−1 , a2,0 ) + 4I (a2,−2 , a1,1 )
(514)
zo
+ I (a2,−1 , a1,0 ).
(515)
The formulas are completed by substituting the terms aj,m and bj,m , which can be found in Appendix B. E. Spherical Aberration of the Fifth Order This subsection present the calculation of the spherical aberration of the fifth order. The ray is described by the Lie transformation q(z) = M1 e:g6 (qo ,po ,z): e:g4 (qo ,po ,z): qo .
(516)
In the image plane and in the rotating coordinates, this can be expanded into qi = Mqo + M g4 (zi ), qo + M g6 (zi ), qo 1 + M g4 (zi ), g4 (zi ), qo + · · · . (517) 2 The first term describes the paraxial approximation, the second one determines the third-order aberrations, and the last two terms cover the fifth-order aberrations. The paraxial approximation and the third-order aberrations were described previously; here we describe only the fifth-order aberration. The spherical aberration is determined by the polynomials with the highest Seidel
ˇ RADLI CKA
348
weight (m = 5/2 for fifth-order polynomials). We separate the effect of g6 and g4 . The effect of g6 consists of the summation of Lie brackets of the form ⎛ 1 ;1 ⎞⎤ ⎡ 1 2 ⎢6 j ;0 ⎜ C− 12 ⎟⎥ 6 j ;0 (518) Cm , qo = ⎣ Cm , ⎝ 1 ⎠⎦ , 1 S 2 ;1 1 −2
1 2 ;1 − 12
where we used x = 1 C
1 2 ;1 − 12
and y = 1 S
. If this term describes the spherical
aberration, it must have the Seidel weight 5/2 (the highest weight in fifthorder polynomials). On the other hand, because aˆ 0 is a Lie operator the Seidel weight of the Lie bracket in Eq. (518) is m − 1/2; thus m = 3. Hence, the effect of g6 on the spherical aberrations is represented by 3 2 j ;0 M βj,3 6 C3 , qo = β3,3 q2a , qo = −6Mβ3,3 q2a qa . (519) j
In the case of g4 the contribution consists of a summation of double Lie brackets of the form ⎡ ⎛ 1 ;1 ⎞⎤⎤ ⎡ 1 2 4 j1 ;0 4 j2 ;0 ⎢4 j1 ;0 ⎢4 j2 ;0 ⎜ C− 12 ⎟⎥⎥ (520) Cm1 , Cm2 , qo = ⎣ Cm1 , ⎣ Cm2 , ⎝ 1 ⎠⎦⎦ , 1 S 2 ;1 1 −2
which has the Seidel weight m1 + m2 − 1/2. Because the Seidel weight must be 5/2, we can easy find constraints such that m1 + m2 = 3. Hence, the contribution of g4 to the spherical aberration of the fifth order takes the form j ;0 j ;0 1 M αj1 ,m1 αj2 ,m2 4 Cm11 , 4 Cm22 , qo , (521) 2 m1 +m2 =3 j1 ,j2
which can be evaluated thus: 1 Mα2,2 α2,1 4 C22;0 , 4 C12;0 , qo + 4 C12;0 , 4 C22;0 , qo 2 1 + Mα2,2 α1,1 4 C22;0 , 4 C11;0 , qo + 4 C11;0 , 4 C22;0 , qo . (522) 2 The first row represents the isotropic spherical aberration and the second the anisotropic spherical aberration. After a short calculation, the first row vanishes and the contribution is reduced to 2 2 ya . (523) −4Mα2,2 α1,1 qa −xa
349
LIE ALGEBRAIC METHODS
The fifth-order spherical aberration then becomes 2 2 −6Mβ3,3 q2a qa − 4Mα2,2 α1,1 q2a
ya −xa
.
(524)
Summarizing, the coefficient of the isotropic spherical aberration reads C5 = −6β3,3
(525)
c5 = −4α2,2 α1,1 = −C3 k3 .
(526)
and the anisotropic one
The results were tested on the simple magnetic lens shown in Figure 4. The field was calculated in the program EOD (Electronic Optic Design) (Lencová and Zlámal, 2006). We used a regular mesh to establish the axial density at equally spaced points. The derivatives of the axial flux density up to fourth order were calculated using Gaussian wavelets (Berz, 1999). The object plane z = −0.04 m is focused to the image plane z = 0.01 m. The aperture plane was chosen to be located at z = −0.02 m. The coefficients of the spherical aberration of the third and fifth order are calculated from the aberration integrals derived. The spherical aberrations also can be easily calculated by fitting ray-tracing results. Let us assume that all rays start from the axis with ya = 0. For the ray intersections with the image plane xi = M C3 xa3 + C5 xa5 + C7 xa7 + · · · , (527) 5 7 yi = −M c5 xa + c7 xa + · · · . (528) The aberration coefficients can be easily computed by fitting the ray-tracing results to these expressions. These results are compared with results computed using aberration integral in Table 1. The results show very good agreement between the results derived by fitting and those obtained by calculation of the aberration integrals. We showed that the spherical aberration also has the anisotropic part. This is a surprising property of fifth-order aberrations. It shows that finding the general structure of the fifth-order aberrations of axially symmetric systems is a very important and difficult task. We will proceed for each Seidel weight independently using a similar procedure to that used above. F. Distortion: Seidel weight −5/2 The distortion depends only on position in the object plane and is hence described by the polynomials that do not depend on the aperture position.
ˇ RADLI CKA
350
F IGURE 4.
Testing a round magnetic lens and the field of the lens on the axis.
LIE ALGEBRAIC METHODS
351
TABLE 1
Coefficient
Aberration integral
Fitting
C3 C5 c5
1.532 · 106 m−2 2.889 · 1011 m−4 7.649 · 1010 m−4
1.532 · 106 m−2 2.831 · 1011 m−4 7.650 · 1010 m−4
In the case of fifth-order polynomials, they have the Seidel weight −5/2. The effect of g6 can be expressed as j ;0 M βj,−2 6 C−2 , qo j
3;0 2;0 , qo + β2,−2 6 C−2 , qo = M β3,−2 6 C−1 2 2 = Mβ3,−2 q2o (qo qa ), qo + Mβ2,−2 Lz q2o , qo .
(529)
The effect of g4 has the form of Eq. (520), but the Seidel weight is now − 52 ; thus, m1 + m2 = −2, 1 M 2
j ;0 j ;0 αj1 ,m1 αj2 ,m2 4 Cm11 , 4 Cm22 , qo .
(530)
m1 +m2 =−2 j1 ,j2
After a short calculation we can easily find 2 2 3 2 1 2 −M β3,−2 + 4α2,0 α2,−2 − α2,−1 qo qo + α1,−1 2 2
(531)
for the isotropic distortion of the fifth order and 2 M(β2,−2 − 2α1,−1 α2,−1 + 2α2,−2 α1,0 ) q2o Jˆqo
(532)
for the anisotropic distortion. G. Coma: Seidel Weight 3/2 We now describe the aberrations that are linear in the distance from the optic axis in the object plane. This aberration is known as coma. Note that we use terminology proposed by Liu (2002) rather than by Zhu et al. (1997). These aberrations must have the Seidel weight m = 3/2. Using a similar approach to that followed for spherical aberration, we can write for the effect of g6 on
ˇ RADLI CKA
352
fifth-order coma as j ;0 Mβj,2 6 C2 , qo j
= Mβ3,2 6 C23;0 , qo + Mβ2,2 6 C22;0 , qo 2 2 = Mβ3,2 q2a (qo qa ), qo + Mβ2,2 q2a Lz , qo .
(533)
Similarly, for the effect of g4 we can constrain m1 + m2 = 2; hence, 1 M 2
j ;0 j ;0 αj1 ,m1 αj2 ,m2 4 Cm11 , 4 Cm22 , qo ,
(534)
m1 +m2 =2 j1 ,j2 j1 ,j2
which can be divided into an isotropic part M α2,2 α2,0 4 C22;0 , 4 C02;0 , qo + 4 C02;0 , 4 C22;0 , qo 2 M 2 4 1;0 4 1;0 M 2 4 2;0 4 2;0 C1 , C1 , qo + α1,1 C1 , C1 , qo (535) + α2,1 2 2 and an anisotropic part M α2,2 α1,0 4 C22;0 , 4 C01;0 , qo + 4 C01;0 , 4 C22;0 , qo 2 M 4 1;0 4 2;0 M + α2,1 α1,1 4 C12;0 , 4 C11;0 , qo + C1 , C1 , qo . 2 2
(536)
Combining the previous equations, we can find for the isotropic part of fifthorder coma 4 5 2 1 2 2 2 + α2,1 qa qo −M β3,2 − α2,2 α2,0 + 8α2,2 α0,0 + α1,1 3 2 2 4 1 2 1 2 − 4M β3,2 + α2,2 α2,0 − 2α2,2 α0,0 − α2,1 − α1,1 (qo qa )q2a qa 3 2 2 (537) and for the anisotropic part 2 M(5β2,2 + 2α2,2 α1,0 − 2α2,1 α1,1 ) q2a Jˆ2 qo − 4M(β2,2 + 2α2,2 α1,0 )q2o (qo qa )Jˆ2 qa .
(538)
LIE ALGEBRAIC METHODS
353
H. Peanut Aberration: Seidel Weight 1/2 The peanut aberration has the Seidel weight 1/2. Using the same similar approach, we find the effect of g6 to be M
j ;0 βj,1 6 C1 , qo
j
= Mβ3,1 6 C13;0 , qo + Mβ1,1 6 C11;0 , qo + Mβ2,1 6 C12;0 , qo . (539) The constraint for m1 and m2 in the double Lie bracket (520), which describes the effect of g4 , reads m1 + m2 = 1; hence, 1 M 2
j ;0 j ;0 αj1 ,m1 αj2 ,m2 4 Cm11 , 4 Cm22 , qo .
(540)
m1 +m2 =1 j1 ,j2
After a short calculation, we derive 4 −2M β3,1 − β1,1 + 3α2,1 α0,0 + 2α1,1 α1,0 q2a (qo qa )qo 5 4 2 − M β3,1 + 4β1,1 − α2,1 α2,0 + 4α2,2 α2,−1 − 2α2,1 α0,0 − α1,1 α1,0 5 3 4 2 × q2o q2a qa − 2M β3,1 − 2β1,1 + 4α2,2 α2,−1 − α2,1 α2,0 5 3 − 2α2,1 α0,0 − 2α1,1 α1,0 (qo qa )2 qa (541)
for the isotropic part and 2 2M 2β2,1 − α2,1 α1,0 − α1,1 α2,0 + 4α2,2 α1,−1 + α1,1 α0,0 q2a (qo qa )Jˆqo 3 − M(β2,1 + 6α1,1 α0,0 + 8α2,2 α1,−1 )q2a q2o Jˆqa 2 − 2M β2,1 + α2,1 α1,0 + 4α2,2 α1,−1 − α1,1 α2,0 − 2α1,1 α0,0 3 × (qo qa )2 Jˆqa for the anisotropic part.
(542)
ˇ RADLI CKA
354
I. Elliptical Coma: Seidel Weight −1/2 The fifth-order polynomials with Seidel weight −1/2 describe the elliptical coma. The contribution of g6 then takes the form j ;0 Mβj,0 6 C0 , qo = M β3,0 6 C03;0 , qo + β1,0 6 C01;0 , qo j
+ β2,0 6 C02;0 , qo + β0,0 6 C00;0 , qo . (543)
We now find the constraint on m1 and m2 in the form m1 + m2 = 0; hence, the contribution of g4 reads j ;0 j ;0 1 M αj1 ,m1 αj2 ,m2 4 Cm11 , 4 Cm22 , qo . (544) 2 m1 +m2 =0 j1 ,j2
Substituting the forms of polynomials and calculating the Lie bracket, we derive 3 2 2 −M β1,0 + β3,0 + 8α2,2 α2,−2 − α2,1 α2,−1 + α2,0 + 5α1,1 α1,−1 5 9 4 1 2 + α2,0 α0,0 − α1,0 + 2α0,0 q2o q2a qo 3 2 6 8 2 8 2 − M β3,0 − 3β1,0 + 2α1,0 − α2,0 + 2α2,1 α2,−1 + α2,0 α0,0 5 9 3 2 − 2α1,1 α1,−1 − 2α0,0 (qo qa )2 qo 6 2 − M 2β1,0 + β3,0 − 2α1,1 α1,−1 − 4α2,0 α0,0 − α1,0 + 16α2,2 α2,−2 5 4 2 + 2α2,1 α2,−1 − α2,0 (545) q2o (qo qa )qa 3 for the isotropic elliptical coma of the fifth order and 2 M β2,0 + 3β0,0 − 2α1,0 α0,0 − α1,0 α2,0 q2o q2a Jˆqo 3 + M 2β2,0 − 3β0,0 + 4α1,0 α0,0 + 2α2,1 α1,−1 2 8 − α1,0 α2,0 + 2α1,1 α2,−1 (qo qa ) Jˆqo 3 2 + 2M −β2,0 − 4α2,1 α1,−1 + α1,0 α2,0 − α1,0 α0,0 q2o (qo qa )Jˆqa (546) 3 for the anisotropic elliptical coma.
LIE ALGEBRAIC METHODS
355
J. Astigmatism and Field Curvature: Seidel Weight −3/2 The astigmatism and the field curvature are described by polynomials linear in p; in the case of fifth-order polynomials, they have the Seidel weight −3/2. Hence, the effect of g6 reads j ;0 M βj,−1 6 C−1 , qo j
3;0 1;0 2;0 , qo + β1,−1 6 C−1 , qo + β2,−1 6 C−1 , qo = M β3,−1 6 C−1
(547)
and the effect of g4 1 M 2
j ;0 j ;0 αj1 ,m1 αj2 ,m2 4 Cm11 , 4 Cm22 , qo .
(548)
m1 +m2 =−1 j1 ,j2
After calculation, one can find 4 2 −2M β3,−1 − β1,−1 − α2,0 α2,−1 + 4α2,1 α2,−2 + α1,0 α1,−1 5 3 + α0,0 α2,−1 q2o (qo qa )qo 2 − M β3,−1 + β1,−1 − α1,0 α1,−1 − 2α0,0 α2,−1 + 4α2,1 α2,−2 5 2 2 − α2,0 α2,−1 q2o qa (549) 3 for the isotropic part of the astigmatism and field curvature and 2 2M β2,−1 − α2,0 α1,−1 + 4α1,1 α2,−2 − α1,0 α2,−1 + α0,0 α1,−1 3 × q2o (qo qa )Jˆqo 8 − M β2,−1 + α2,0 α1,−1 + 4α1,1 α2,−2 − 2α1,0 α2,−1 + 2α0,0 α1,−1 3 2 2 ˆ (550) × qo J qa for the anisotropic part.
A PPENDIX A. T HE H AMILTONIAN T RANSFORMATION Although the transformation rule for the Hamiltonian under a z-dependent canonical transformation has been derived by Cary (1977), for completeness it is summarized here. We start from the rule valid in the z-independent case
ˇ RADLI CKA
356
and the extension to the z-dependent case will be derived using the extended phase-space formalism. ˜ with 2 degrees of Let us consider two canonical coordinate systems w, w ˜ ˜ describing w freedom and an s-dependent Lie transformation w = e:g(w,z): ˜ :g( w,z): ˜ Although the transformation rule for the w. their relationship w = e Hamiltonian in the z-independent case does not differ from the transformation rule for functions, in this case it might not be obvious. Fortunately, each canonical z-dependent system with 2 degrees of freedom is equivalent to the canonical z-independent system with 2 degrees of freedom known as the extended phase space (Goldstein, 1980), in which z is one of canonical variables with canonical conjugate momentum pz . A vector in the extended phase space then reads W = (w, z, pz ), and the Hamiltonian takes the form ˜ is represented by a H¯ = H + pz . The canonical transformation e:g(w,z): ˜ E :g( W): , which does not depend on the independent canonical transformation e variable t in the extended phase space. For the Lie operator in the extended phase space the notation :f :E —with the Poisson bracket defined according to ∂g ∂h ∂h [g, h]E = [g, h] + ∂g ∂z ∂pz − ∂pz ∂z —was used. The transformed Hamiltonian then is written as ˜ E ¯ ˜ ˜ = e:g(W): H¯˜ (W) H (W) (551) and using the coordinates in the original phase space can be written as ˜ s) + p˜ z H˜ (w,
˜ E H (w, ˜ s) + pz = e:g(w,s):
∂g 1 ∂g ∂g 1 ˜ :g(w,z): ˜ z) + =e + g, + g, g, + · · · + p˜ z , H (w, ∂z 2 ∂z 6 ∂z 1 ˜ z) ∂g(w, ˜ ˜ :g( w,z): ˜ z) = e ˜ z) + dθ eθ:g:(w,z) H˜ (w, , (552) H (w, ∂z 0
which is the desired expression. The situation is described by the diagram H¯ (W)=H (w,z)+pz
(w, z) ˜ e:g(w,z):
˜ z) (w,
H (w, z) ?
˜ z) H˜ (w,
(W, t) ˜
e:g(W):E
˜ t) (W, H˜ (w,z)=−p ˜ z
H¯ (W) ˜
e:g(W):E
˜ H˜¯ (W).
(553)
LIE ALGEBRAIC METHODS
357
A PPENDIX B. T HE F ORM OF THE I NTERACTION H AMILTONIAN FOR THE ROUND M AGNETIC L ENS This appendix presents the form of the interaction Hamiltonian of the round magnetic lens. We use the notation introduced in Section VII.C. The direct calculations were done in the mathematical program Maple (Maple, 2006). We suppose that H4int and H6int are in the form as in Eq. (493); the coefficients in the fourth-order Hamiltonian then read 2 1 a2,2 = W −1 −αα t 4 + α 2 t 2 + t 2 , (554a) 8 1 a2,1 = W −1 α 4 − αα t 3 s + α 2 stt 2 + α 2 s t t 2 + s t 3 , (554b) 2 1 a2,0 = W −1 4α 2 sts t + −3αα + 3α 4 t 2 s 2 + 3s 2 t 2 4 (554c) + α2t 2s 2 + α2s 2t 2 , 1 a0,0 = W −1 α 2 s 2 t 2 − 2α 2 sts t + α 2 t 2 s 2 + 3α 2 W 2 , (554d) 6 1 a2,−1 = W −1 s 3 t + α 2 s t s 2 + α 2 sts 2 + α 4 − αα ts 3 , (554e) 2 1 −1 4 4 a2,−2 = W s + α − αα s 4 + 2 α 2 s 2 s 2 , (554f) 8 1 3 1 2 1 2 α − α t + αt , (555a) a1,1 = 8 2 2 1 (555b) a1,0 = − α + α 3 ts + s t α, 4 1 3 1 2 1 2 α − α s + αs (555c) a1,−1 = 2 8 2 and for the sixth-order Hamiltonian, they take the form 4 1 3 α − αα t 2 t 4 + 3α 2 t 2 t 4 b3,3 = 16W 1 1 2 (4) 3 6 6 6 , (556) αα − α α + α + α t + t + 12 8 1 b3,2 = 12α 2 s t 3 t 2 + 6α 2 stt 4 + −2αα + 6α 4 t 4 t s 16W i + 12α 4 − 4αα t 2 t 3 s 1 (4) 3 2 5 3 6 5 t s + 6s t , (557) + −6α α + 6α + αα + α 2 4
ˇ RADLI CKA
358
1 1 3 4 4 2 3 2 2 4 15 2 4 = − αα + α t s + α s t + s t + 6α 2 sts t 3 4W 4 4 4 4 9 4 3 2 2 2 9 2 2 2 2 α − αα s t t + α s t t + 6α 4 − 2αα st 3 s t + 2 2 2 5 15 15 15 αα (4) + α 2 + α 6 − α 3 α s 2 t 4 , + (558) 16 32 4 4
b3,1
b1,1 =
b3,0
1 1 15 − α −30W 2 α 3 + 5W 2 α t 2 + α 2 W 2 t 2 5W 8 4 1 3 1 − α −6α 3 + 2α t 2 s 2 t 2 + α 2 s 2 t 4 − α 12α 3 − 4α s t st 3 8 4 8 3 2 3 1 3 3 2 4 2 2 2 2 − α sts t − α −6α + 2α s t + α s t t , (559) 2 8 4
1 5s 3 t 3 + 9α 2 sts 2 t 2 + −3αα + 9α 4 s t s 2 t 2 = 4W 5 5 2 3 3 4 6 (4) 3 + 5α + αα − 5α α + α s t + 3α − αα st 3 s 2 12 8 4 2 3 2 3 2 2 3 2 + 3α s t t + 3α − αα s tt + 3α s t s , (560)
b1,0
b3,−1
1 1 = − α −30W 2 α 3 + 5W 2 α ts + 15α 2 W 2 s t 10W 2 3 − α 3α − α tt 2 s 3 − 2α α 3 − α t s s 2 t 2 + 3α 2 s t 3 s 2 3 2 3 2 2 2 2 3 2 − α 3α − α s st − 6α sts t + 3α s t t , (561)
1 15 4 2 s t + 4(3α 4 − αα )s 3 ts t + 12α 2 ts 3 t s = 8W 2 3 4 1 4 2 + α − αα s t + 9α 2 s 2 t 2 s 2 + 3 3α 4 − αα s 2 t 2 s 2 2 2 15 6 15 2 5 (4) 15 3 4 2 3 2 2 4 α + α + αα − α α s t + α t s , + 2 16 8 2 2 (562)
LIE ALGEBRAIC METHODS
b1,−1 =
b3,−2
359
W 2 1 − α 5α − 30α 3 s 2 + 15α 2 W 2 s 2 − α α − 3α 3 t 2 s 4 20W 2 + 2α α − 3α 3 ts t s 3 − α α − 3α 3 s 2 s 2 t 2 + 3α 2 s 2 t 2 s 2 2 3 2 2 4 − 6α ts t s + 3α t s , (563)
1 = 3s 5 t + 2 3α 4 − αα s 2 ts 3 + 3α 2 sts 4 + 6α 2 s 3 t s 2 8W 1 (4) 3 2 3 6 5 4 4 + α − 3α α + αα + 3α ts + −αα + 3α s t s , 8 4 (564) 1 b3,−3 = −αα + 3α 4 s 2 s 4 + 3α 2 s 2 s 4 + s 6 16W 1 2 1 3 6 (4) + s6 , (565) α − α α + α + αα 8 12
b2,2 =
3 5 3 1 (4) 4 1 3 α − α 2 α + α t + − α + α 3 t 2 t 2 , 8 16 192 16 4 (566) 3 3 1 2 3 3 1 2 3 α − α t ts + α − α t ts + αs t 3 = 2 8 2 8 2 1 (4) 3 5 3 2 3 α + α − α α t s, + (567) 48 2 4
3 4 αt + 8
b2,1
1 3 1 (4) 9 2 9 5 2 2 b2,0 = − α + α 3 s 2 t 2 + α − α α + α t s 16 4 32 8 4 1 3 3 2 2 9 2 2 1 3 + − α + α t s + αs t + − α + 3α t s ts, 16 4 4 4 (568) 1 3 3 3 1 (4) α − α 2 α + α 5 t 2 s 2 b0,0 = − α + α 3 s 2 t 2 + 20 5 240 20 10 1 3 3 2 2 3 2 2 1 3 3 α − α t s ts, + − α + α t s + αs t + 20 5 10 20 5 (569)
ˇ RADLI CKA
360
3 3 1 2 3 3 1 2 3 α − α s ts + α − α s ts + s 3 t α b2,−1 = 2 8 2 8 2 1 (4) 3 5 3 2 α + α − α α ts 3 , (570) + 48 2 4 3 5 3 2 1 (4) 4 3 5 4 b2,−2 = s + α s α − α α + α 8 16 192 8 1 3 3 2 2 + − α + α s s . (571) 16 4
R EFERENCES Barth, J., Lencová, B., Wisselink, G. (1990). Field-evaluation from potentials calculated by the finite-element method for ray tracing—The slice method. Nucl. Instr. Methods Phys. Res. A 298, 263–268. Bäuerle, G., Kerf, E. (1999). Lie Algebras, Part 1—Finite and Infinite Dimensional Lie Algebras and Applications in Physics. Studies in Mathematical Physics. Elsevier, New York. Bazzani, A. (1988). Normal forms for symplectic maps of R2n . Celest. Mech. 42, 107–128. Bazzani, A., Todesco, E., Turchetti, G., Servizi, G. (1994). A normal form approach to the theory of nonlinear betatronic motion. CERN. Yellow Report 94/02, Geneva. Berz, M. (1999). Modern map methods in particle beam physics. Adv. Imaging Electron Phys. 108. Cary, J. (1977). Time-dependent canonical transformations and the symmetry-equals-invariant theorem. J. Math. Phys. 18, 2432–2435. Cary, J. (1981). Lie transform perturbation-theory for Hamiltonian systems. Phys. Rep. 79 (2), 129–159. Cheng, M., Shanfang, Z., Yilong, L., Tiantong, T. (2006). Differential algebraic method for arbitrary-order chromatic aberration analysis of electrostatic lenses. Optik 117, 196–198. COSY INFINITY (2007). COSY INFINITY, computer code. http://bt.pa. msu.edu/index_files/cosy.htm. Dragt, A. (1982). Lectures on nonlinear dynamics in 1981 Fermilab Summer School. AIP Conf. Proc. 7, 147. Dragt, A., Forest, E. (1986a). Foundations of a Lie algebraic theory of geometrical optics. In: Sánchez-Mondragón, J., Wolf, K.B. (Eds.), Lie Methods in Optics. Springer-Verlag, Berlin, pp. 45–103. Dragt, A., Forest, E. (1986b). Lie algebraic theory of charged-particle optics and electron-microscopes. Adv. Electr. Electron Phys. 67, 65.
LIE ALGEBRAIC METHODS
361
Dragt, A., Neri, F., Rangarajan, G., Douglas, D., Healy, L., Ryne, R. (1988). Lie algebraic treatment of linear and nonlinear beam dynamics. Annu. Rev. Nucl. Sci. 38, 455–496. Dragt, A.J. (1987). Elementary and advanced Lie algebraic methods with applications to accelerator design, electron microscopes and light optics. Nucl. Inst. Meth. Phys. Res. A 258, 339–354. Dragt, A.J. (1990). Numerical third-order transfer map for solenoid. Nucl. Inst. Meth. Phys. Res. A 298, 441–459. Forest, E. (1998). Beam Dynamics—A New Attitude and Framework. Harwood Academic Publishers, New York. Glaser, W. (1935). Zur Bildfehlertheorie des Elektronenmikroskops. Z. Physik 97, 177–201. Goldstein, H. (1980). Classical Mechanics. Addison-Wesley. Harrington, R. (1967). Matrix methods for field problems. Proc. Inst. Electr. Electron. Eng. 55 (2), 136–149. Hawkes, P., Kasper, E. (1989). Principles of Electron Optics. Academic Press, London. Hu, K., Tang, T.T. (1998). Lie algebraic aberration theory and calculation method for combined electron beam focusing-deflection systems. J. Vac. Sci. Technol. B 16 (6), 3248–3255. Hu, K., Tang, T.T. (1999). Lie algebra deflection aberration theory for magnetic deflection systems. Optik 110 (1), 9–16. Jackson, J. (1998). Classical Electrodynamics. John Wiley & Sons, New York. Jagannathan, R., Khan, S. (1996). Quantum theory of the optics of charged particles. Adv. Imaging Electr. Phys. 97, 257. Khursheed, A. (1999). The Finite Element Method in Charged Particle Optics. Kluwer Academic. Lencová, B. (1995). Computation of electrostatic lenses and multipoles by the first-order finite-element method. Nucl. Instr. Meth. Phys. Res. A 363, 190–197. Lencová, B., Zlámal, J. (2006). A new program for the design of electron microscopes. In: Abstracts—CPO7, pp. 71–72. Available at http://www.mebs.co.uk. Lichtenberg, A., Lieberman, M. (1982). Regular and Stochastic Motion. Springer-Verlag, New York. Liu, Z.X. (2002). Fifth-order canonical geometric aberration analysis of electrostatic round lenses. Nucl. Instr. Methods Phys. Res. A 488, 42–50. Liu, Z.X. (2006). Differential algebraic method for aberration analysis of typical electrostatic lenses. Ultramicroscopy 106, 220–232. Manikonda, S., Berz, M. (2006). Multipole expansion solution of the Laplace equation using surface data. Nucl. Inst. Meth. Phys. Res. A 558, 175–183. Maple (2006). Symbolic Computation Group. University of Waterloo, Waterloo, Ontario, Canada. Available at maplesoft.com.
362
ˇ RADLI CKA
MARYLIE (2006). Available at www.physics.umd.edu/dsat/dsatmarylie.html. Matsuya, M., Saito, M., Nakagawa, S. (1995). A semianalytical aberration calculation method using polynomials and Lie algebra. Nucl. Instr. Meth. Phys. Res. A 363, 261–269. MEBS (2007). Munro Electron Beam Software, Ltd, London, England. Meyer, K. (1974). Normal form for Hamiltonian. Celest. Mech., 517–522. Navarro-Saad, M., Wolf, K. (1986). The group theoretical treatment of aberrating systems 1. J. Math. Phys. 27 (5). Press, W., Teukolsky, S., Vetterling, W.T., Flannery, B. (1986). Numerical Recipes in Fortran. Cambridge University Press. Rose, H. (1987). Hamiltonian magnetic optics. Nucl. Instr. Meth. A 258, 374– 401. Rose, H. (2004). Outline of an ultracorrector compensating for all primary chromatic and geometrical aberrations of charged-particle lenses. Nucl. Instr. Meth. Phys. Res. A 519, 12–27. Rouse, J., Munro, E. (1989). 3-dimensional computer modeling of electrostatic and magnetic electron—Optical components. J. Vac. Sci. Tech. B 7 (6), 1891–1897. Scherzer, O. (1933). Zur Theorie der elektronenoptischen Linsenfehler. Z. Physik 80, 193–202. Steinberg, S. (1986). Lie series, Lie transformations, and their applications. In: Sánchez-Mondragón, J., Wolf, K.B. (Eds.), Lie Methods in Optics. Springer-Verlag, Berlin, pp. 45–103. Sturok, P. (1955). Static and Dynamic Electron Optics. Cambridge University Press, Cambridge, MA. Venturini, M., Dragt, A. (1999). Accurate computation of transfer maps from magnetic field data. Nucl. Instr. Meth. Phys. Res. A 427, 387–392. Wang, L., Rouse, J., Liu, H., Munro, E., Zhu, X. (2004). Simulation of electron optical systems by differential algebraic method combined with Hermite fitting for practical lens fields. Microelectron. Eng. 90, 73–74. Wolf, K. (1986). The group theoretical treatment of aberrating systems 2. J. Math. Phys. 27 (5). Wolf, K. (1987). The group theoretical treatment of aberrating systems 3. J. Math. Phys. 28 (10). Ximen, J. (1995). Canonical aberration theory in electron optics up to ultrahigh–order approximation. Adv. Imaging Electr. Phys. 91, 1–36. Zhu, X., Liu, H., Munro, E., Rouse, J. (1997). Analysis of off-axis shaped beam systems for high throughput electron beam lithography. In: Proceedings of Charged Particle Optics III. SPIE, pp. 47–61. Zhu, X., Munro, E. (1989). A computer-program for electron-gun design using second-order finite–elements. J. Vac. Sci. Technol. B 7 (6), 1862– 1869.