I The Modern Development of Hamiltonian Optics


I

THE MODERN DEVELOPMENT OF HAMILTONIAN OPTICS

BY

R. J. PEGIS

Bausch & Lomb Inc., Rochester, N.Y.

CONTENTS

§ 1. INTRODUCTION
§ 2. THE CHARACTERISTIC FUNCTIONS
§ 3. THE DEPENDENCE OF THE ABERRATIONS UPON OBJECT AND STOP POSITION
§ 4. CONCLUSION
REFERENCES

§ 1. Introduction

The method of Sir William Hamilton in mechanics and geometrical optics was undoubtedly one of the most profound mathematical discoveries to come from the nineteenth century. From the time of the communication of his “Theory of Systems of Rays” to the Royal Irish Academy in 1827, Hamilton continued to startle the scientific world with his new idea of a “characteristic function” in physics. He died in 1865. In the field of mechanics the new theory took hold immediately, so that today no one doubts its place in theoretical and applied science. But the theory was intended for use in geometrical optics as well as in mechanics, and the task of further developing Hamilton’s ideas along these lines was left to a small handful of followers whose work is almost exclusively confined to this century. We mention STEWARD [1928], SYNGE [1937], LUNEBERG [1944], HERZBERGER [1958].

Probably the most prolific and difficult writer on Hamiltonian optics in this century is T. Smith, an English mathematician who has spent most of his life adapting Hamilton’s methods to modern lens design. His basic articles appeared in the 1920’s, though related articles continue through the 1940’s. Unfortunately, these articles have for the most part been neglected or misunderstood. The reason for this lies partly in the inherent difficulty of the material, and partly in the enormous economy of expression exercised by their author. There is considerable need today for an understandable presentation of Hamiltonian optics to the contemporary scientific world, with special attention to the ideas of T. Smith, which in practice would be otherwise unavailable.

This article is intended as an introduction to the modern developments of Hamiltonian optics. Section 2 develops the more basic and classical ideas, while Section 3 introduces the more radical algebra of aberrations, first discussed by T. SMITH [1922]. It is hoped at some future date to discuss the rest of Smith’s work.


§ 2. The Characteristic Functions

2.1. PRELIMINARY REMARKS

The distinguishing feature of Hamilton’s method is the use of a “characteristic function” to describe the performance of an optical system. This is not to be confused with the current use of a “merit function” in lens design, for the latter is a performance function defined by itself and applicable to any system, while the characteristic function is actually a function of the system and completely describes the geometric optical properties of that system. Several types of characteristic function are possible, for the properties of a system can be described in terms of points or rays or both. HAMILTON [1828] was the first to use such functions and the originator of the idea, though it was BRUNS [1895] who independently singled out the so-called angular characteristic or “eikonal” as basic for aberration theory.

In this second part we discuss what is commonly known about the theory of two of the characteristic functions, the point characteristic and the eikonal. There is also a “mixed characteristic” discussed by SYNGE [1937] and LUNEBERG [1944], but is similar in properties to the other two and will not be discussed here. We shall show, in fact, that only the eikonal has certain special advantages and that because of these its use is almost always preferable.

2.2. FERMAT’S PRINCIPLE

We are given a general optical system which images one space (called the object space) into another (called the image space). No special assumptions are made about the transformation between the spaces: it may not be one-to-one, so that a point may be imaged into a spot, or vice versa. We retain only the physically obvious assumption that a straight line ray entering the system is imaged into a straight line ray leaving the system. This implicitly involves us in another assumption which we make about the spaces: they are homogeneous and isotropic.

In the object space, whose refractive index we denote by $n$, we choose a right-handed system of perpendicular axes $x, y, z$, and similarly in the image space of index $n'$ we choose axes $x', y', z'$. Let the direction cosines of a general ray in the object space be $L, M, N$ and the direction cosines of the optically corresponding ray in the image space be $L', M', N'$. All quantities in the object space are measured with respect to the $x, y, z$ system, all quantities in the image space with respect to the $x', y', z'$ system. The two systems may be arbitrarily oriented with respect to each other, though in most applications we make them parallel or even coincident. Let $P(x, y, z)$ be a general point in the object space and $P'(x', y', z')$ a general point in the image space.

All the laws of geometrical optics are contained in Fermat’s Variational Principle, which states that the path taken by light from $P$ through the system to $P'$ will be such that the time of propagation along it is stationary in the Calculus of Variations sense; this means that the path is such that if it were altered infinitesimally, the resulting infinitesimal change in time of propagation would be zero. Now we know that the time of propagation through a medium is proportional to the optical path (refractive index multiplied by geometrical path) taken through the medium; hence by Fermat’s principle the optical path must be stationary.

2.3. ILLUSTRATIVE EXAMPLE

In many statements of Fermat’s principle the phrase “optical path must be stationary” is replaced by “optical path must be a minimum”. We give here an example from LUNEBERG [1944] pp. 96-97, which demonstrates that the stationary optical path need not be a minimum.

Fig. 2.1
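The spherical-mirror example of this subsection can also be checked numerically. The sketch below is only an illustration under assumed values: the mirror radius `R`, the offset `h` of $P_0$ and $P_1$ from the center $M$, and the function name `path_via` are hypothetical and do not come from the text. It places $M$ at the origin, the vertex $Q$ at $(R, 0)$ and $P_0$, $P_1$ at $(0, \pm h)$, and compares the path $P_0Q'P_1$ for neighbouring mirror points $Q'$ with the true reflected path through $Q$.

```python
import math

def path_via(theta, R=1.0, h=0.3):
    """Length P0-Q'-P1 for the mirror point Q' at polar angle theta.

    Assumed layout (not from the text): mirror center M at the origin,
    vertex Q at (R, 0), P0 = (0, h), P1 = (0, -h), refractive index unity.
    """
    qx, qy = R * math.cos(theta), R * math.sin(theta)
    return math.hypot(qx, qy - h) + math.hypot(qx, qy + h)

true_path = path_via(0.0)  # reflection at the vertex Q obeys the reflection law
thetas = [k * 0.01 for k in range(-50, 51) if k != 0]
others = [path_via(t) for t in thetas]

# Every neighbouring path is shorter, so the true path is a maximum here.
print(all(p < true_path for p in others))

# The path is nevertheless stationary at Q: the central difference vanishes.
eps = 1e-6
slope = (path_via(eps) - path_via(-eps)) / (2 * eps)
print(abs(slope) < 1e-8)
```

Both checks are expected to hold for any $0 < h < R$ in this layout, since the path length depends on $\sin\theta$ only through a sum of two square roots that is concave about $\theta = 0$.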

In a medium of air (index unity) consider a spherical mirror with center $M$ and vertex $Q$, as shown in Fig. 2.1. Let $P_0$ and $P_1$ be symmetrically located about $M$ with respect to the mirror axis. We know in advance that the ray path which will be taken between $P_0$ and $P_1$ via the mirror is $P_0QP_1$, since it is the only path fulfilling the reflection law. Since the medium is air the optical path here is $P_0Q + QP_1$. We show that if $Q'$ be any point on the mirror and in the plane of $P_0$, $Q$, $P_1$, then the optical path via $Q'$ is shorter than that via $Q$. To do this we construct the ellipse $E$ through $Q$ with $P_0$ and $P_1$ as focal points. Its radius of curvature at $Q$ is certainly greater than $MQ$, the radius of the mirror, so it will lie outside the mirror. Extend $P_0Q'$ till it intersects the ellipse, say at $Q''$. Then

$$P_1Q'' + Q''Q' > P_1Q',$$

$$\therefore\quad P_1Q'' + Q''Q' + Q'P_0 > P_1Q' + Q'P_0,$$

$$\therefore\quad P_1Q'' + Q''P_0 > P_1Q' + Q'P_0.$$

But on the other hand

$$P_1Q'' + Q''P_0 = P_1Q + QP_0,$$

$$\therefore\quad P_1Q + QP_0 > P_1Q' + Q'P_0,$$

which demonstrates our assertion that the true optical path need not be a minimum.

2.4. SNELL'S LAW

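To further illustrate and confirm Fermat's principle we use it to deduce Snell's law. Consider the case of refraction by a single surface whose equation is given in the form

$$x = f(y, z). \tag{2.1}$$

Let the $x, y, z$ and $x', y', z'$ coordinate systems be coincident, so that all coordinates are measured in the same system. Fig. 2.2 shows a proposed ray path from $P$ to $P'$, where all symbols have the meanings assigned to them in subsection 2.2. If $\bar{P}(\bar{x}, \bar{y}, \bar{z})$ denote a point on the surface, then the optical path from $P$ to $P'$ via $\bar{P}$ is given by

$$nD + n'D', \tag{2.2}$$

where we have written

$$D = \left[(\bar{x} - x)^2 + (\bar{y} - y)^2 + (\bar{z} - z)^2\right]^{1/2}, \tag{2.3}$$

$$D' = \left[(x' - \bar{x})^2 + (y' - \bar{y})^2 + (z' - \bar{z})^2\right]^{1/2}. \tag{2.4}$$

Fig. 2.2

By Fermat's principle we shall have the optically correct ray joining $P$ and $P'$ if infinitesimal alterations in the ray leave (2.2) unaffected. Such alterations are accomplished by varying $(\bar{x}, \bar{y}, \bar{z})$ slightly, maintaining $\bar{x} = f(\bar{y}, \bar{z})$ so that $\bar{P}$ will remain on the surface. In effect, then, we are requiring that the derivatives of (2.2) with respect to $\bar{y}$ and $\bar{z}$, where $\bar{x} = f(\bar{y}, \bar{z})$, must be zero. Performing the differentiations and equating to zero we have

$$0 = \frac{n(\bar{y} - y)}{D} + \frac{n(\bar{x} - x)}{D}\frac{\partial \bar{x}}{\partial \bar{y}} + \frac{n'(\bar{y} - y')}{D'} + \frac{n'(\bar{x} - x')}{D'}\frac{\partial \bar{x}}{\partial \bar{y}},$$

$$0 = \frac{n(\bar{z} - z)}{D} + \frac{n(\bar{x} - x)}{D}\frac{\partial \bar{x}}{\partial \bar{z}} + \frac{n'(\bar{z} - z')}{D'} + \frac{n'(\bar{x} - x')}{D'}\frac{\partial \bar{x}}{\partial \bar{z}}, \tag{2.5}$$

where $D$ and $D'$ are as defined in eqs. (2.3) and (2.4). Now from Fig. 2.2 we have

$$L = \frac{\bar{x} - x}{D},\quad M = \frac{\bar{y} - y}{D},\quad N = \frac{\bar{z} - z}{D},\qquad L' = \frac{x' - \bar{x}}{D'},\quad M' = \frac{y' - \bar{y}}{D'},\quad N' = \frac{z' - \bar{z}}{D'}, \tag{2.6}$$

so that our derivative equations may be written

$$nM - n'M' = (n'L' - nL)f_{\bar{y}}, \qquad nN - n'N' = (n'L' - nL)f_{\bar{z}}, \tag{2.7}$$

where we have used the notation $\partial \bar{x}/\partial \bar{y} = f_{\bar{y}}$, $\partial \bar{x}/\partial \bar{z} = f_{\bar{z}}$.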
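The passage from Fermat's principle to eq. (2.7) can be illustrated numerically. The following is a minimal sketch, not the author's procedure: it restricts the problem to the meridional plane $z = 0$, assumes a hypothetical parabolic profile `f`, arbitrary end points and indices, and uses a derivative-free golden-section search (an assumed choice of minimiser) to locate the stationary path. It then evaluates the residual of the first equation of (2.7) and of the cross-product form $n(\mathbf{s} \times \mathbf{p}) - n'(\mathbf{s}' \times \mathbf{p})$, where $\mathbf{p}$ is the unit surface normal.

```python
import math

def f(y):
    """Hypothetical refracting profile x = f(y): a shallow parabola."""
    return 0.02 * y * y

def f_prime(y):
    return 0.04 * y

def optical_path(yb, P, Pp, n, n_p):
    """n*D + n'*D' of eq. (2.2) for the surface point (f(yb), yb)."""
    xb = f(yb)
    D = math.hypot(xb - P[0], yb - P[1])
    D_p = math.hypot(Pp[0] - xb, Pp[1] - yb)
    return n * D + n_p * D_p

def golden_section_min(func, a, b, tol=1e-10):
    """Derivative-free 1-D minimiser; assumes func is unimodal on [a, b]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    while b - a > tol:
        c = b - g * (b - a)
        d = a + g * (b - a)
        if func(c) < func(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)

def cross(sx, sy, px, py):
    """z-component of the cross product s x p for vectors in the plane."""
    return sx * py - sy * px

n, n_p = 1.0, 1.5                  # assumed indices
P, Pp = (-5.0, 1.0), (8.0, -1.0)   # assumed end points (x, y)

yb = golden_section_min(lambda t: optical_path(t, P, Pp, n, n_p), -3.0, 3.0)
xb = f(yb)
D = math.hypot(xb - P[0], yb - P[1])
D_p = math.hypot(Pp[0] - xb, Pp[1] - yb)
L, M = (xb - P[0]) / D, (yb - P[1]) / D
L_p, M_p = (Pp[0] - xb) / D_p, (Pp[1] - yb) / D_p

# Residual of the first equation of (2.7); it should vanish for the true ray.
residual_27 = (n * M - n_p * M_p) - (n_p * L_p - n * L) * f_prime(yb)

# Unit normal p along (-1, f'), then the cross-product form of Snell's law.
norm = math.hypot(1.0, f_prime(yb))
px, py = -1.0 / norm, f_prime(yb) / norm
residual_sine = n * cross(L, M, px, py) - n_p * cross(L_p, M_p, px, py)

print(abs(residual_27) < 1e-5, abs(residual_sine) < 1e-5)
```

The minimiser never uses the derivative of the path, so the vanishing residuals are a genuine consequence of stationarity rather than a restatement of it.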


For reasons of symmetry we consider in conjunction with the two equations of (2.7) a trivial third equation

$$nL - n'L' = (n'L' - nL)(-1). \tag{2.8}$$

We now regard $(L, M, N)$ as components of a unit vector $\mathbf{s}$ in the incident direction and $(L', M', N')$ as components of a unit vector $\mathbf{s}'$ in the refracted direction. Also, since the equation of the surface may be written in the form

$$-\bar{x} + f(\bar{y}, \bar{z}) = 0$$

and since the direction of the normal to any surface $g(x, y, z) = 0$ is given by the vector

$$\left(\frac{\partial g}{\partial x}, \frac{\partial g}{\partial y}, \frac{\partial g}{\partial z}\right),$$

it is clear that $(-1, f_{\bar{y}}, f_{\bar{z}})$ are direction numbers for the normal to our surface. Denote this vector by $\lambda\mathbf{p}$, where $\lambda$ is a length and $\mathbf{p}$ is the unit normal. In terms of $\mathbf{s}$, $\mathbf{s}'$, $\mathbf{p}$ our eqs. (2.7) and (2.8) may be written in the convenient vector form

$$n\mathbf{s} - n'\mathbf{s}' = (n'L' - nL)\lambda\mathbf{p}.$$

Taking the vector cross-product of this equation with $\mathbf{p}$ we have

$$n(\mathbf{s} \times \mathbf{p}) - n'(\mathbf{s}' \times \mathbf{p}) = 0, \tag{2.9}$$

since the cross-product of the vector $\mathbf{p}$ with itself is zero. It is easily seen that (2.9) is Snell's law. For from the directions of the vectors we see that the plane defined by $\mathbf{s}$ and $\mathbf{p}$ is parallel to (and therefore coincident with) the plane defined by $\mathbf{s}'$ and $\mathbf{p}$; and from the magnitudes of the vectors we have

$$n \sin(\mathbf{s}, \mathbf{p}) = n' \sin(\mathbf{s}', \mathbf{p}), \tag{2.10}$$

where the symbol $(\mathbf{s}, \mathbf{p})$ means the angle between $\mathbf{s}$ and $\mathbf{p}$.

2.5. THE POINT CHARACTERISTIC

If in the previous discussion we had actually carried out a solution for the $(\bar{x}, \bar{y}, \bar{z})$ of an optically correct ray from $P$ to $P'$ and substituted these values in the expression (2.2), we would have found $nD + n'D'$ as a function of the initial and final points alone, i.e. a function of $P$ and $P'$. This function would be the true optical path from $P$ to $P'$ and a function of their six coordinates. We denote it by

$$V(x, y, z, x', y', z') \tag{2.11}$$

and call it the point characteristic function of the system. If the system consisted of several surfaces, we would have to impose the conditions for stationary path at each surface, eliminate all the intermediary coordinates, and end up with a function $V$ of the initial and final coordinates alone. There are special difficulties and special methods associated with carrying out this scheme; meanwhile we only wish to examine the usefulness of the function $V$ on the supposition that we could obtain it.

We apply Fermat’s principle to an arbitrary optical system, using the same symbols $P, x, y, z, L, M, N, P', x', y', z', L', M', N'$ with the same meanings as before, now applied to the system as a whole. For greatest generality we take the coordinate systems $x, y, z$ and $x', y', z'$ to be unrelated.

The most important property of the point characteristic comes to light when we investigate the derivatives of $V$ with respect to its six variables. In the literature see, for instance, SYNGE [1937] pp. 17-24, STEWARD [1928] pp. 19-20. With reference to Fig. 2.3, let $PQ$ be a ray entering the system and let $Q'P'$ be the corresponding emerging ray.

Fig. 2.3

We know that small changes in $Q$ and $Q'$ do not affect $V$, so now we consider the effect on $V$ of small changes in $P$ and $P'$. Define a point $P + \delta P$ near $P$ with coordinates $(x + \delta x, y + \delta y, z + \delta z)$ and a point $P' + \delta P'$ near $P'$ with coordinates $(x' + \delta x', y' + \delta y', z' + \delta z')$. Let $Q + \delta Q$ and $Q' + \delta Q'$ be points on the ray defined by $P + \delta P$ and $P' + \delta P'$, near $Q$ and $Q'$ respectively. To facilitate the writing of equations, a distance enclosed in square brackets, e.g. $[P, Q]$, shall denote an optical path. Hence $V$, the optical path from $P$ to $P'$, is given by

$$V = [P, Q] + [Q, Q'] + [Q', P']. \tag{2.12}$$


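If now we denote by $V + \delta V$ the point characteristic (optical path) for the points $P + \delta P$ and $P' + \delta P'$, we have

$$V + \delta V = [P + \delta P, Q + \delta Q] + [Q + \delta Q, Q' + \delta Q'] + [Q' + \delta Q', P' + \delta P'].$$

Now by Fermat's principle the change in $V$ must be due to the change in $P$ and $P'$ alone, for if the optical path is stationary, our diversion of the intermediary points $Q$ and $Q'$ to the nearby points $Q + \delta Q$ and $Q' + \delta Q'$ produces no change in $V$. Hence we may ignore the diversion of $Q$ and $Q'$ and write

$$V + \delta V = [P + \delta P, Q] + [Q, Q'] + [Q', P' + \delta P'].$$

Subtracting eq. (2.12) from this we have

$$\begin{aligned}
\delta V &= [P + \delta P, Q] + [Q', P' + \delta P'] - [P, Q] - [Q', P'] \\
&= \{[P + \delta P, Q] - [P, Q]\} + \{[Q', P' + \delta P'] - [Q', P']\} \\
&= \{[P + \delta P, Q] - [P, Q]\} - \{[P' + \delta P', Q'] - [P', Q']\}.
\end{aligned}$$

But $[P + \delta P, Q] - [P, Q] = \delta[P, Q]$ taken with respect to $x, y, z$, and $[P' + \delta P', Q'] - [P', Q'] = \delta[P', Q']$ taken with respect to $x', y', z'$, so that we have

$$\delta V = \left(\frac{\partial [P, Q]}{\partial x}\delta x + \frac{\partial [P, Q]}{\partial y}\delta y + \frac{\partial [P, Q]}{\partial z}\delta z\right) - \left(\frac{\partial [P', Q']}{\partial x'}\delta x' + \frac{\partial [P', Q']}{\partial y'}\delta y' + \frac{\partial [P', Q']}{\partial z'}\delta z'\right). \tag{2.13}$$

The various derivatives of $[P, Q]$ and $[P', Q']$ may be worked out as follows. Let $Q$ have coordinates $(u, v, w)$. Then

$$[P, Q] = n\left[(x - u)^2 + (y - v)^2 + (z - w)^2\right]^{1/2}.$$

Taking the partial derivative of this with respect to $x$ we have

$$\frac{\partial}{\partial x}[P, Q] = n(x - u)\left[(x - u)^2 + (y - v)^2 + (z - w)^2\right]^{-1/2}.$$

But in Fig. 2.3

$$(x - u)\left[(x - u)^2 + (y - v)^2 + (z - w)^2\right]^{-1/2} = -L.$$

Hence we may write

$$\frac{\partial}{\partial x}[P, Q] = -nL,$$

and similarly

$$\frac{\partial}{\partial y}[P, Q] = -nM, \qquad \frac{\partial}{\partial z}[P, Q] = -nN.$$

In the same way, letting the coordinates of $Q'$ be $(u', v', w')$ we find from Fig. 2.3 (since $[P', Q']$ is negative)

$$\frac{\partial}{\partial x'}[P', Q'] = -\frac{\partial}{\partial x'}[Q', P'] = -n'L', \qquad \frac{\partial}{\partial y'}[P', Q'] = -n'M', \qquad \frac{\partial}{\partial z'}[P', Q'] = -n'N'.$$

Hence eq. (2.13) may be written in the striking form

$$\delta V = -n(L\,\delta x + M\,\delta y + N\,\delta z) + n'(L'\,\delta x' + M'\,\delta y' + N'\,\delta z'). \tag{2.14}$$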
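As a simple check of eq. (2.14), consider the degenerate case in which the "system" is just a homogeneous medium of index $n$ and the two coordinate systems coincide. This case is an added illustration and is not part of the original derivation. The point characteristic is then $n$ times the distance $d$ from $P$ to $P'$, the direction cosines of the single straight ray are $L = (x' - x)/d$ and so on, and direct differentiation reproduces (2.14) with $n' = n$ and $(L', M', N') = (L, M, N)$:

```latex
\begin{aligned}
V &= n\,\bigl[(x'-x)^2 + (y'-y)^2 + (z'-z)^2\bigr]^{1/2} \equiv n\,d, \\
\frac{\partial V}{\partial x} &= -\,n\,\frac{x'-x}{d} = -nL, \qquad
\frac{\partial V}{\partial x'} = +\,n\,\frac{x'-x}{d} = nL, \\
\delta V &= -\,n\,(L\,\delta x + M\,\delta y + N\,\delta z)
          + n\,(L\,\delta x' + M\,\delta y' + N\,\delta z').
\end{aligned}
```

The derivatives with respect to $y, z, y', z'$ follow in the same way with $M$ and $N$ in place of $L$.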

From this we have all the derivatives of $V$. Denoting partial derivatives by subscript letters here and henceforth in this article we may write

$$V_x = -nL,\quad V_y = -nM,\quad V_z = -nN,\qquad V_{x'} = n'L',\quad V_{y'} = n'M',\quad V_{z'} = n'N'. \tag{2.15}$$

It may be noted that $V$ satisfies Hamilton’s partial differential equation in $(x, y, z)$ and in $(x', y', z')$, as discussed by LUNEBERG [1944] pp. 103-110, STEWARD [1928] pp. 19-20, SYNGE [1937] pp. 18-19:

$$V_x^2 + V_y^2 + V_z^2 = n^2, \qquad V_{x'}^2 + V_{y'}^2 + V_{z'}^2 = n'^2. \tag{2.16}$$
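The derivative relations for $V$ and Hamilton's equation (2.16) can be checked on a system simple enough to compute by brute force. The sketch below is an illustration under stated assumptions, not a method taken from the text: it works in the meridional plane only, takes a single plane refracting surface at $x = 0$ with assumed indices `N1` and `N2`, obtains $V$ by minimising the optical path over the crossing height with a golden-section search, and differentiates $V$ by central differences.

```python
import math

N1, N2 = 1.0, 1.5   # assumed indices n and n' on either side of the plane x = 0

def golden_section_min(func, a, b, tol=1e-11):
    """Derivative-free 1-D minimiser; assumes func is unimodal on [a, b]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    while b - a > tol:
        c, d = b - g * (b - a), a + g * (b - a)
        if func(c) < func(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)

def crossing_height(x, y, xp, yp):
    """Height at which the true ray from (x, y) to (xp, yp) meets x = 0."""
    def path(t):
        return N1 * math.hypot(x, y - t) + N2 * math.hypot(xp, yp - t)
    return golden_section_min(path, min(y, yp) - 1.0, max(y, yp) + 1.0)

def V(x, y, xp, yp):
    """Point characteristic: the stationary (here minimal) optical path."""
    t = crossing_height(x, y, xp, yp)
    return N1 * math.hypot(x, y - t) + N2 * math.hypot(xp, yp - t)

x, y, xp, yp = -4.0, 1.0, 6.0, -0.5   # arbitrary object and image points
h = 1e-4
V_x = (V(x + h, y, xp, yp) - V(x - h, y, xp, yp)) / (2 * h)
V_y = (V(x, y + h, xp, yp) - V(x, y - h, xp, yp)) / (2 * h)
V_xp = (V(x, y, xp + h, yp) - V(x, y, xp - h, yp)) / (2 * h)
V_yp = (V(x, y, xp, yp + h) - V(x, y, xp, yp - h)) / (2 * h)

# Direction cosines of the incident ray, from the traced crossing point.
t = crossing_height(x, y, xp, yp)
D = math.hypot(x, y - t)
L, M = -x / D, (t - y) / D

print(abs(V_x + N1 * L) < 1e-6, abs(V_y + N1 * M) < 1e-6)   # V_x = -nL, V_y = -nM
print(abs(V_x**2 + V_y**2 - N1**2) < 1e-6)                  # first of (2.16)
print(abs(V_xp**2 + V_yp**2 - N2**2) < 1e-6)                # second of (2.16)
```

With $z$ suppressed, the two equations of (2.16) reduce to $V_x^2 + V_y^2 = n^2$ and $V_{x'}^2 + V_{y'}^2 = n'^2$, which is what the last two lines test.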

Interesting as these relations seem, they constitute in reality a serious disadvantage in the use of the function $V$. For because of eq. (2.16) not every function of our six coordinate variables can be the point characteristic of an optical system, but rather, only those functions satisfying two given non-linear partial differential equations. For the analogous situation in mechanics see LANCZOS [1949] pp. 222-228.

2.6. THE EIKONAL

The angular characteristic function or “eikonal” may be defined geometrically in the following way. In Fig. 2.4 let $O$ and $O'$ be origins for the $(x, y, z)$ and $(x', y', z')$ coordinate systems. Let $P$ and $P'$ be the points where two optically corresponding rays cross the $(y, z)$ and $(y', z')$ planes; then

$$V(0, y, z, 0, y', z') = [P, P']. \tag{2.17}$$

Let perpendiculars from $O$ and $O'$ meet the entering and departing rays in $S$ and $S'$, and define

$$E = [S, S']. \tag{2.18}$$

Fig. 2.4

Then the optical distance $E$ is called the eikonal. Now if we project $OP$ and $O'P'$ upon the two portions of the ray we have

$$SP = My + Nz, \qquad P'S' = -S'P' = -M'y' - N'z',$$

where $L, M, N, L', M', N'$ are defined as before. Hence the eikonal $E$ is given by

$$E = V + n(My + Nz) - n'(M'y' + N'z'). \tag{2.19}$$

Taking the first variation of this we have

$$\delta E = \delta V + n(M\,\delta y + y\,\delta M + N\,\delta z + z\,\delta N) - n'(M'\,\delta y' + y'\,\delta M' + N'\,\delta z' + z'\,\delta N').$$

Substituting from eq. (2.14) for $\delta V$ with $\delta x = \delta x' = 0$ (since $P$ and $P'$ are confined to the planes $x = 0$ and $x' = 0$ respectively) we have the simplification

$$\delta E = ny\,\delta M + nz\,\delta N - n'y'\,\delta M' - n'z'\,\delta N'. \tag{2.20}$$

Hence for the derivatives of $E$ when it is regarded as a function of $M, N, M', N'$ we may write

$$E_M = ny, \qquad E_N = nz, \qquad E_{M'} = -n'y', \qquad E_{N'} = -n'z'. \tag{2.21}$$
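Eq. (2.21) can be tried out on a single spherical refracting surface. The sketch below rests on assumptions that are not in the text: it works in the meridional plane, places both origins $O$ and $O'$ at the center of curvature, and uses the closed form $E = r\,[n^2 + n'^2 - 2nn'(LL' + MM')]^{1/2}$, which follows from the definition (2.18) for this configuration with $n' > n$ and the surface convex toward the incoming light. The values of `n`, `n_p`, `r` and the test ray are arbitrary. A ray is traced with the vector form of Snell's law and the traced heights are compared with finite-difference derivatives of $E$.

```python
import math

n, n_p, r = 1.0, 1.5, 10.0   # assumed indices and radius of curvature

def eikonal(M, M_p):
    """Assumed closed form for one spherical surface, origins at its center."""
    L, L_p = math.sqrt(1 - M * M), math.sqrt(1 - M_p * M_p)
    return r * math.sqrt(n * n + n_p * n_p - 2 * n * n_p * (L * L_p + M * M_p))

def trace(y0, M):
    """Trace a meridional ray whose line crosses the plane x = 0 at height y0."""
    L = math.sqrt(1 - M * M)
    # First intersection with the circle x^2 + y^2 = r^2 (its left-hand side).
    b = y0 * M
    t = -b - math.sqrt(b * b - y0 * y0 + r * r)
    Rx, Ry = t * L, y0 + t * M
    ux, uy = Rx / r, Ry / r              # unit normal, pointing toward the source
    cos_i = -(L * ux + M * uy)
    k = n / n_p
    cos_t = math.sqrt(1 - k * k * (1 - cos_i * cos_i))
    L_p = k * L + (k * cos_i - cos_t) * ux   # vector form of Snell's law
    M_p = k * M + (k * cos_i - cos_t) * uy
    y0_p = Ry - Rx * M_p / L_p               # refracted ray line at x' = 0
    return M_p, y0_p

y, M = 1.0, 0.05                 # arbitrary incident ray
M_p, y_p = trace(y, M)

h = 1e-6
E_M = (eikonal(M + h, M_p) - eikonal(M - h, M_p)) / (2 * h)
E_Mp = (eikonal(M, M_p + h) - eikonal(M, M_p - h)) / (2 * h)

print(abs(E_M - n * y) < 1e-6)        # E_M  =  n y
print(abs(E_Mp + n_p * y_p) < 1e-6)   # E_M' = -n' y'
```

The point of the example is that $E$ is differentiated with $M$ and $M'$ treated as independent, while the heights $y$ and $y'$ come from an independent ray trace.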

Thus it is to our advantage to regard $E$ as a function of the four independent direction cosines $M, N, M', N'$ alone, and it will be seen that the properties of the system are completely determined when the form of this function is known. Hence $E$, the eikonal, is also known as the ‘angular characteristic function’ of the system. It is more convenient than $V$, since it does not have to satisfy any given differential equations. If we allow for variation of $x$ and $x'$ as well (which is seldom done) the eqs. (2.21) may easily be shown to assume the slightly more complicated form

$$E_M = n\left(y - \frac{xM}{L}\right),\quad E_N = n\left(z - \frac{xN}{L}\right),\quad E_{M'} = -n'\left(y' - \frac{x'M'}{L'}\right),\quad E_{N'} = -n'\left(z' - \frac{x'N'}{L'}\right), \tag{2.22}$$

as discussed in SYNGE [1937] pp. 29-36.

Finally we note the analytical significance of $E$. Substituting (2.15) in (2.19) for $nM, nN, n'M', n'N'$, we have

$$E = V - yV_y - zV_z - y'V_{y'} - z'V_{z'}.$$

Thus $-E$ is the Legendre transform of $V$ with respect to $y, z, y', z'$, and analytically its new variables $V_y, V_z, V_{y'}, V_{z'}$ are by eq. (2.15) $-nM, -nN, n'M', n'N'$, or equivalently the direction cosines, as we have already chosen for $E$. The connection of $V$ and $-E$ via the Legendre transform is the same as the connection between the Lagrangian and the Hamiltonian in classical mechanics, so that many of the advantages of the Hamiltonian accrue to the eikonal. For an interesting discussion of the situation in mechanics, see LANCZOS [1949] pp. 262-280.

2.7. THE CHOICE OF VARIABLES

When $O$ and $O'$ are chosen, we have seen that $E$ is a function of $M, N, M', N'$. However, our main concern is with the symmetrical optical system, which has an axis of symmetry such that planes normal to it are imaged into other normal planes. We choose the $x$- and $x'$-axes to coincide, and nearly always take the $y$- and $y'$-axes (therefore also the $z$- and $z'$-axes) to be parallel. The origins $O$ and $O'$ on the common $x$-axis are not necessarily optically corresponding.

Because of the symmetry, if the $y$- and $z$-axes are rotated through an angle $\theta$ about the common $x$-axis, and the $y'$- and $z'$-axes rotated through the same angle, there should be no change in the optical path. Hence the point characteristic and the eikonal may be written purely in terms of the invariants of the rotation. In the case of the point characteristic whose variables are $x, y, z, x', y', z'$, if the dependence on $y, z, y', z'$ is invariant under rotation about the common $x$-axis, then these four variables may be replaced by three: the lengths of the vectors $(y, z)$ and $(y', z')$ and the angle between them, or equivalently by $y^2 + z^2$, $yy' + zz'$, $y'^2 + z'^2$. In the case of the eikonal, since the variables are $M, N, M', N'$, we replace them with the three symmetric variables of the rays: the angles made with the axis by the incident and refracted rays and the angle between these two rays, or equivalently, $L$, $L'$, $LL' + MM' + NN'$. But since

$$L^2 + M^2 + N^2 = L'^2 + M'^2 + N'^2 = 1,$$

it is just as correct to choose as symmetric variables the quantities

$$(1) = M^2 + N^2, \qquad (2) = MM' + NN', \qquad (3) = M'^2 + N'^2. \tag{2.23}$$
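The invariance claimed for the quantities (2.23) is easy to confirm numerically. The following minimal sketch uses arbitrary illustrative direction cosines and a rotation angle chosen for the example; the function names are hypothetical. It rotates the $(y, z)$ and $(y', z')$ axes through the same angle about the common axis and compares the three symmetric variables before and after.

```python
import math

def symmetric_variables(M, N, M_p, N_p):
    """The quantities (1), (2), (3) of eq. (2.23)."""
    return (M * M + N * N, M * M_p + N * N_p, M_p * M_p + N_p * N_p)

def rotate(a, b, theta):
    """Components of the vector (a, b) after rotating the axes by theta."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * a + s * b, -s * a + c * b)

M, N, M_p, N_p = 0.10, -0.04, -0.07, 0.02   # arbitrary illustrative ray
before = symmetric_variables(M, N, M_p, N_p)

theta = 0.8                                  # arbitrary rotation angle
after = symmetric_variables(*rotate(M, N, theta), *rotate(M_p, N_p, theta))

print(all(abs(a - b) < 1e-12 for a, b in zip(before, after)))
```

The three quantities are the squared lengths of $(M, N)$ and $(M', N')$ and their scalar product, which is why a common rotation leaves them unchanged.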

The use of these numbers to denote variables was introduced by T. SMITH [1922], and while confusing at first sight leads to great convenience in the writing of subscripts. We denote the derivatives of $E$ with respect to these three variables by $E_1, E_2, E_3$, and we consider $E$ as a function $E[(1), (2), (3)]$ of them. Then we have in eq. (2.21)

$$ny = E_M = 2ME_1 + M'E_2, \qquad nz = E_N = 2NE_1 + N'E_2,$$

$$-n'y' = E_{M'} = ME_2 + 2M'E_3, \qquad -n'z' = E_{N'} = NE_2 + 2N'E_3. \tag{2.24}$$

Let us now take $O$ and $O'$ to be corresponding points in the system. Then the conditions for the plane $x = 0$ to be imaged onto the plane $x' = 0$ without image errors are

$$n'y' = Gny, \qquad n'z' = Gnz, \tag{2.25}$$

where $G$ is the ‘reduced magnification’, or ratio of the sizes of image and object (measured in optical rather than geometrical length). It is convenient at this point to choose the initial and final media to be air, so that $n = n' = 1$, and $G$ may be thought of as a geometrical magnification. From eqs. (2.24) and (2.25) we then have

$$0 = Gy - y' = M(2GE_1 + E_2) + M'(GE_2 + 2E_3),$$

$$0 = Gz - z' = N(2GE_1 + E_2) + N'(GE_2 + 2E_3).$$

Hence we may write

$$\frac{M(2GE_1 + E_2)}{N(2GE_1 + E_2)} = \frac{-M'(GE_2 + 2E_3)}{-N'(GE_2 + 2E_3)}, \qquad \text{i.e.} \quad \frac{M}{N} = \frac{M'}{N'} \quad \text{for all rays.} \tag{2.26}$$


This is easily seen to be a contradiction, for it implies that all rays lie in planes through the axis. The only situation in which the contradiction is avoided is if in (2.26)

$$2GE_1 + E_2 = 0, \qquad GE_2 + 2E_3 = 0. \tag{2.27}$$

These may be thought of as the conditions for freedom from image errors. Multiplying the first by $G$ and adding the second we have after dividing by 2

$$G^2E_1 + GE_2 + E_3 = 0. \tag{2.28}$$

Again, multiplying the first of eqs. (2.27) by an arbitrary constant $S$ and adding the second we have

$$2SGE_1 + (S + G)E_2 + 2E_3 = 0. \tag{2.29}$$

Our last two equations suggest that great simplicity would result from a linear change of variables from (1), (2), (3) to I, II, III, say, in such a way that eqs. (2.28) and (2.29) would become the equations $E_{\mathrm{II}} = E_{\mathrm{III}} = 0$. It turns out more convenient to use $-E_{\mathrm{II}}$, so we define

$$-E_{\mathrm{II}} = 2SGE_1 + (S + G)E_2 + 2E_3,$$

$$E_{\mathrm{III}} = G^2E_1 + GE_2 + E_3.$$

To keep the formulae symmetric in $S$ and $G$ (which will prove advantageous later) we must choose $E_{\mathrm{I}}$ as

$$E_{\mathrm{I}} = S^2E_1 + SE_2 + E_3.$$

To give these differential relations we must have for our linear equations

$$(1) = S^2\,\mathrm{I} - 2SG\,\mathrm{II} + G^2\,\mathrm{III},$$

$$(2) = S\,\mathrm{I} - (S + G)\,\mathrm{II} + G\,\mathrm{III},$$

$$(3) = \mathrm{I} - 2\,\mathrm{II} + \mathrm{III}, \tag{2.30}$$

from which we solve for the equations of transformation, obtaining

$$(S - G)^2\,\mathrm{I} = (1) - 2G(2) + G^2(3),$$

$$(S - G)^2\,\mathrm{II} = (1) - (S + G)(2) + SG(3),$$

$$(S - G)^2\,\mathrm{III} = (1) - 2S(2) + S^2(3). \tag{2.31}$$
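That the transformations (2.30) and (2.31) are inverse to one another can be confirmed with exact rational arithmetic. The sketch below is only a check on the algebra; the function names and the numerical values of $S$, $G$ and of I, II, III are arbitrary choices made for the illustration.

```python
from fractions import Fraction as Fr

def to_numbers(I, II, III, S, G):
    """Eq. (2.30): the variables (1), (2), (3) from I, II, III."""
    return (S * S * I - 2 * S * G * II + G * G * III,
            S * I - (S + G) * II + G * III,
            I - 2 * II + III)

def to_roman(v1, v2, v3, S, G):
    """Eq. (2.31): I, II, III from (1), (2), (3); requires S != G."""
    d = (S - G) ** 2
    return ((v1 - 2 * G * v2 + G * G * v3) / d,
            (v1 - (S + G) * v2 + S * G * v3) / d,
            (v1 - 2 * S * v2 + S * S * v3) / d)

S, G = Fr(3), Fr(-1, 2)                    # arbitrary illustrative magnifications
roman = (Fr(2, 7), Fr(-1, 3), Fr(5, 11))   # arbitrary test values of I, II, III

round_trip = to_roman(*to_numbers(*roman, S, G), S, G)
print(round_trip == roman)
```

The coefficients of I, II, III in each line of (2.30) are, by the chain rule, exactly the multipliers of $E_1, E_2, E_3$ in the definitions of $E_{\mathrm{I}}$, $E_{\mathrm{II}}$, $E_{\mathrm{III}}$, which is how the linear equations were chosen.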

The variables I, II, III, first introduced by T. SMITH [1922], are most convenient for aberration theory, since we know that when $E$ is expressed in terms of them, the conditions for freedom from image errors are

$$E_{\mathrm{II}} = E_{\mathrm{III}} = 0, \tag{2.32}$$

i.e. $E$ must be a function of I alone. Thus if $E$ for a system could be expanded as a power series in I, II, III, the various aberrations could be identified with terms such as I·II (third order distortion), III³ (fifth order spherical aberration) which do not involve I alone. For a discussion of the geometrical aberrations from this point of view, see STEWARD [1926; 1928, pp. 30-49]. The arbitrary constant $S$ in the transformation is carried along for purposes of symmetry, and since it enters into the equations in exactly the same manner that $G$ does we interpret it as a magnification, usually the magnification associated with the pupil planes of the optical system.

§ 3. The Dependence of the Aberrations upon Object and Stop Position

3.1. PRELIMINARY REMARKS

We have seen that in the expansion of the eikonal for a symmetrical optical system working at a magnification $G$ as a power series in the variables I, II, III, all terms save powers of I alone represent image errors. Now the variables I, II, III involve the magnification, so if we change $G$ we obtain new variables I’, II’, III’, defined in the same way as I, II, III except with the old magnification $G$ replaced by the new magnification $G'$, and the new image errors will be represented by the terms in the new eikonal at the new magnification which do not involve I’ alone. Similarly the variables I, II, III involve $S$, so that changes in $S$ also affect the terms in the eikonal.

Our purpose in this third part is to investigate the dependence of the terms of $E$ on $G$ and $S$, where we shall take $S$ to be the magnification associated with the pupil planes of the optical system. The algebra of this dependence may be treated very generally, and all orders of aberration considered. Our primary source is T. SMITH [1922], one of his most difficult and important papers, and it is essential to understand this algebra of aberrations in the interpretation of his later papers.

As a first step, however, we must investigate the significance of $E$ as a power series in I alone. Clearly any such series leads to freedom from image errors, and we should like to find some standard form for the series for $E$, such that any and all departures from it (even in powers of I) may be regarded as aberration, even if not all are errors in the image.

3.2. NOTATION FOR THE EIKONAL

We find it convenient to let the focal points of the symmetrical optical system be origins for the object and image spaces, and as before to choose the $x$- and $x'$-axes coincident with the axis of revolution of the system. In this situation we represent the eikonal by $E$, and call it the focal eikonal. Now if with reference to the given origins we define the symbol $E'$ to represent the eikonal of the same system with axial points $(x, 0, 0)$ in the object space and $(x', 0, 0)$ in the image space, where $x$ and $x'$ are measured positively to the right from their respective focal points, we have

$$E' = E - nLx + n'L'x'. \tag{3.1}$$

Again we assume that the end media are air, so that $n = n' = 1$. Suppose now that the axial points at $x$ and $x'$ are conjugate. Then if $f$ is the focal length of the system and $G$ the magnification at which it is working, we have from Newton’s lens formula as developed, for example, in STEWARD [1928], p. 3,

$$x = f/G, \qquad x' = -fG, \tag{3.2}$$

so that writing $E_G$ to identify the conjugates we have for $E'$

$$E_G = E - \frac{Lf}{G} - L'fG. \tag{3.3}$$

It is customary to write $K = 1/f$, the power of the system, so that

$$E_GK = EK - \frac{L}{G} - L'G. \tag{3.4}$$

Let $S$ be the magnification associated with another pair of conjugate planes perpendicular to the axis, which we shall take to be the pupil planes. For them we have

$$E_SK = EK - \frac{L}{S} - L'S,$$

$$\therefore\quad E_SK = E_GK + (S - G)\left(\frac{L}{SG} - L'\right). \tag{3.5}$$


3.3. ABERRATIONS OF THE STOP

We assume that the image is free from aberrations, so that $E_G$ is a function of I alone. But we should like to determine $E_G$ uniquely and for this purpose we find it convenient to impose the additional condition that any ray passing through the axial point of the stop, i.e. through the axial point of the plane in the object space corresponding to magnification $S$, be refracted through the axial point of the corresponding image plane. This will uniquely determine the coefficients in the power series for $E_G$ in the variable I. Optically, the condition we are imposing means that we would like the form of the eikonal when there is no spherical aberration of any order at the axial points of the pupil planes, i.e. at the center of the stop.

Let $(Y, Z)$ and $(Y', Z')$ be the coordinates of intersection of a general ray with the pupil planes. Then by eq. (2.21) we may write the derivatives of $E_S$ as

$$Y = E_{MS}, \qquad Z = E_{NS}, \qquad Y' = -E_{M'S}, \qquad Z' = -E_{N'S}.$$

Now $E_G$ is given as a function of I alone, so that writing $E_S$ in terms of $E_G$ by means of eq. (3.5) we have

$$Y = E_{MG} + \frac{S - G}{K}\,\frac{\partial}{\partial M}\left(\frac{L}{SG} - L'\right), \qquad Y' = -E_{M'G} - \frac{S - G}{K}\,\frac{\partial}{\partial M'}\left(\frac{L}{SG} - L'\right),$$

with similar equations in $Z$ and $Z'$. But in $E_G$ the differentiation with respect to $M$ and $M'$, $N$ and $N'$ can be written in terms of differentiation with respect to I. For from the definition of I in eqs. (2.31) and (2.32) we have

$$\frac{\partial}{\partial M} = \frac{\partial \mathrm{I}}{\partial M}\frac{\partial}{\partial \mathrm{I}} = \frac{2(M - GM')}{(S - G)^2}\frac{\partial}{\partial \mathrm{I}}, \qquad \frac{\partial}{\partial M'} = \frac{\partial \mathrm{I}}{\partial M'}\frac{\partial}{\partial \mathrm{I}} = -\frac{2G(M - GM')}{(S - G)^2}\frac{\partial}{\partial \mathrm{I}},$$

with similar equations in $N$ and $N'$. Hence, using the relations

$$\frac{\partial L}{\partial M} = -\frac{M}{L}, \qquad \frac{\partial L}{\partial N} = -\frac{N}{L}, \qquad \frac{\partial L}{\partial M'} = 0, \qquad \frac{\partial L}{\partial N'} = 0,$$

and the corresponding relations for $L'$, which follow immediately from the differentiation of $L$ and $L'$ as functions of $M$ and $N$, $M'$ and $N'$ respectively, we have

$$Y(S - G)^2 = 2(M - GM')E_{\mathrm{I}G} - \frac{(S - G)^3}{SGK}\frac{M}{L},$$

$$Y'(S - G)^2 = 2G(M - GM')E_{\mathrm{I}G} - \frac{(S - G)^3}{K}\frac{M'}{L'}, \tag{3.6}$$

with similar equations for $Z$ and $Z'$. Now for freedom from axial aberration at magnification $S$, if $Y$ and $Z$ are zero, $Y'$ and $Z'$ must also be zero, independently of the values of the direction cosines. Using eq. (3.6) and its counterpart in $Z$ and $Z'$ we then have

$$\frac{M}{L} = \frac{2SG(M - GM')}{(S - G)^3}E_{\mathrm{I}G}K = \frac{SM'}{L'},$$

$$\frac{N}{L} = \frac{2SG(N - GN')}{(S - G)^3}E_{\mathrm{I}G}K = \frac{SN'}{L'}. \tag{3.7}$$

If we eliminate $M$ and $M'$ from the first pair of these equations (or $N$ and $N'$ from the second) we find

$$2(SL - GL')GE_{\mathrm{I}G}K = (S - G)^3, \tag{3.8}$$

and from this we could find $E_G$ if we could find the form which $L$ and $L'$ take under these conditions in terms of I alone. To simplify the notation we write $E_{\mathrm{I}}$ for $E_{\mathrm{I}G}$, the $G$ being understood. Then squaring and adding the two equations of (3.7) we have

$$\frac{M^2 + N^2}{L^2} = \frac{4S^2G^2\left[(M - GM')^2 + (N - GN')^2\right]K^2E_{\mathrm{I}}^2}{(S - G)^6} = \frac{S^2(M'^2 + N'^2)}{L'^2},$$

i.e.

$$\frac{1 - L^2}{L^2} = \frac{4S^2G^2\,\mathrm{I}\,K^2E_{\mathrm{I}}^2}{(S - G)^4} = \frac{S^2(1 - L'^2)}{L'^2}.$$

Set $u = 2GKE_{\mathrm{I}}/(S - G)^2$. Then we have

$$1 - L^2 = L^2S^2\,\mathrm{I}\,u^2, \qquad 1 - L'^2 = L'^2\,\mathrm{I}\,u^2,$$

i.e.

$$L = \frac{1}{(1 + S^2\,\mathrm{I}\,u^2)^{1/2}}, \qquad L' = \frac{1}{(1 + \mathrm{I}\,u^2)^{1/2}}.$$

Substituting these values in eq. (3.8) we have

$$2GK\left\{\frac{S}{(1 + S^2\,\mathrm{I}\,u^2)^{1/2}} - \frac{G}{(1 + \mathrm{I}\,u^2)^{1/2}}\right\}E_{\mathrm{I}} = (S - G)^3,$$

i.e.

$$u\left\{\frac{S}{(1 + S^2\,\mathrm{I}\,u^2)^{1/2}} - \frac{G}{(1 + \mathrm{I}\,u^2)^{1/2}}\right\} = S - G. \tag{3.9}$$

This is equivalent to a quartic equation in $u$. If we let ${}_{-1/2}C_n$ be the usual binomial coefficient, i.e. the coefficient of $t^n$ in $(1 + t)^{-1/2}$, we have

$$u\left\{1 + \sum_{n=1}^{\infty} {}_{-1/2}C_n\, e_n\, \mathrm{I}^n u^{2n}\right\} = 1, \tag{3.10}$$

where we have written

$$e_n = \frac{S^{2n+1} - G}{S - G}.$$

The series (3.10) may be solved for $u$ as a series in I by successive approximation or by formal series reversion to give

$$u = 1 + \tfrac{1}{2}e_1\,\mathrm{I} - \tfrac{3}{8}(e_2 - 2e_1^2)\,\mathrm{I}^2 + \dots$$

and since from the definition of $u$ we have

$$GKE_G = \tfrac{1}{2}(S - G)^2\int u\,d\mathrm{I},$$

we may therefore write the series for $E_G$ as

$$GKE_G = \tfrac{1}{2}(S - G)^2\Bigl[\mathrm{I} + \tfrac{1}{4}e_1\,\mathrm{I}^2 - \tfrac{1}{8}(e_2 - 2e_1^2)\,\mathrm{I}^3 + \tfrac{1}{64}(5e_3 - 24e_2e_1 + 24e_1^3)\,\mathrm{I}^4 - \tfrac{1}{128}(7e_4 - 40e_3e_1 - 18e_2^2 + 132e_2e_1^2 - 88e_1^4)\,\mathrm{I}^5\Bigr], \tag{3.11}$$

up to terms of the fifth degree. This is the form which the eikonal must take in the absence of all image errors and all orders of spherical aberration of the pupil. There is no constant term in the aberration since its value is quite arbitrary, only the derivatives of $E$ being significant.
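The successive-approximation step can be reproduced with truncated power series in exact rational arithmetic. The sketch below is an illustration only. It assumes that the series equation for $u$ takes the form $u\{1 + \sum_n C_n e_n \mathrm{I}^n u^{2n}\} = 1$ with $e_n = (S^{2n+1} - G)/(S - G)$ and $C_n$ the coefficient of $t^n$ in $(1 + t)^{-1/2}$, which is what results from expanding the two inverse square roots of eq. (3.9) and dividing by $S - G$. The values of $S$ and $G$ and all function names are arbitrary choices for the example.

```python
from fractions import Fraction as Fr

ORDER = 4  # highest power of I kept in the series for u

def mul(p, q):
    """Product of two truncated power series in I (coefficient lists)."""
    out = [Fr(0)] * (ORDER + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= ORDER:
                out[i + j] += a * b
    return out

def power(p, k):
    """k-th power of a truncated power series."""
    out = [Fr(1)] + [Fr(0)] * ORDER
    for _ in range(k):
        out = mul(out, p)
    return out

def binom_minus_half(n):
    """Coefficient of t^n in (1 + t)^(-1/2)."""
    c = Fr(1)
    for k in range(n):
        c *= (Fr(-1, 2) - k) / (k + 1)
    return c

def solve_u(S, G):
    """Successive approximation for u = 1 - sum_n C_n e_n I^n u^(2n+1)."""
    e = {n: (S ** (2 * n + 1) - G) / (S - G) for n in range(1, ORDER + 1)}
    u = [Fr(1)] + [Fr(0)] * ORDER
    for _ in range(ORDER):          # each pass fixes one more order in I
        new = [Fr(1)] + [Fr(0)] * ORDER
        for n in range(1, ORDER + 1):
            c = binom_minus_half(n) * e[n]
            u_pow = power(u, 2 * n + 1)
            for k in range(ORDER + 1 - n):
                new[k + n] -= c * u_pow[k]
        u = new
    return u, e

S, G = Fr(3), Fr(1, 2)              # arbitrary illustrative magnifications
u, e = solve_u(S, G)
e1, e2, e3, e4 = e[1], e[2], e[3], e[4]

expected_u = [
    Fr(1),
    e1 / 2,
    -Fr(3, 8) * (e2 - 2 * e1 ** 2),
    Fr(1, 16) * (5 * e3 - 24 * e2 * e1 + 24 * e1 ** 3),
    -Fr(5, 128) * (7 * e4 - 40 * e3 * e1 - 18 * e2 ** 2
                   + 132 * e2 * e1 ** 2 - 88 * e1 ** 4),
]
print(u == expected_u)

# Term-by-term integration gives the coefficients of I, I^2, ..., I^5
# in 2*G*K*E_G / (S - G)^2.
integral = [u[k] / (k + 1) for k in range(ORDER + 1)]
print(integral[1] == e1 / 4, integral[2] == -Fr(1, 8) * (e2 - 2 * e1 ** 2))
```

A quick sanity check on the closed forms is the special case $G = 0$, where the equation for $u$ reduces to $u = (1 - S^2\mathrm{I})^{-1/2}$ and the coefficients become the familiar $1, \tfrac12, \tfrac38, \tfrac{5}{16}, \tfrac{35}{128}$ times powers of $S^2$.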


Equation (3.11) gives us a reference for the coefficients of powers of I in the eikonal of any system. When an imperfect system is being considered we subtract eq. (3.11) from its eikonal, and all of the terms which remain, i.e. a power series in I, II, III, will represent aberrations. However, in the transformation theory which follows it is more convenient to transform the full eikonal $E_G$, remembering that when all is done the coefficient of the term in I alone at any order must have a correction applied to it if it is to represent the aberration at the center of the stop.

We may now find the form of the focal eikonal $E$ under the aberration-free conditions described above. To do this we substitute from eq. (3.11) in eq. (3.4), using the latter in the form

$$EK = E_GK + \frac{L}{G} + L'G,$$

and writing for I its value

$$(S - G)^{-2}\left[(1) - 2G(2) + G^2(3)\right].$$

The extra factor $(S - G)^{-2n}$ introduced by $\mathrm{I}^n$ in this substitution is most simply absorbed into the coefficient $e_n$ by writing

$$e_n' = \frac{S^{2n+1} - G}{(S - G)^{2n+1}}.$$

Then the terms in the focal eikonal $E$ of the first three orders when aberrations are absent are

$$EK = -(2) - \frac{1}{8G}\left\{(1)^2 + (3)^2G^2 - e_1'\left[(1) - 2G(2) + G^2(3)\right]^2\right\} - \frac{1}{16G}\left\{(1)^3 + (3)^3G^2 + (e_2' - 2e_1'^2)\left[(1) - 2G(2) + G^2(3)\right]^3\right\}. \tag{3.12}$$

When aberrations are present, however, it is not at all obvious what form E will take when the form of EG is given. This equation, as well as the question of the dependence of the aberration terms on G and S will be discussed with the general transformation theory in what follows. First, however, it might not be out of order to review the terminology used in describing the orders of aberration. If we keep only the linear terms in I, 11, I11 in EG, i.e. if we consider M, N, M', N' as small quantities, the ray will become a paraxial ray


and we shall have Gaussian optics. Since when the system is in focus there are no Gaussian aberrations, we would suspect that the linear part of the eikonal $E_G$ has only a term in I, and this suspicion is indeed correct. Again, the quadratic terms in $E_G$, viz. the terms in $\mathrm{I}^2$, $\mathrm{I\,II}$, $\mathrm{I\,III}$, $\mathrm{II}^2$, $\mathrm{II\,III}$, $\mathrm{III}^2$, are the next to be considered, and of these all but one (the term in $\mathrm{I}^2$) represent aberrations. The five aberration terms are related to the Seidel aberrations, as is shown in a slightly different notation in STEWARD [1928] pp. 30-49.

Steward, Smith and nearly all British writers call these first aberrations first order or primary aberrations, while in America they are called third order aberrations. As the order of the aberrations increases, the British terminology is first, second, third, etc., or primary, secondary, tertiary, while the American (and some more recent British) is third, fifth, seventh, etc. Here we shall adopt the older British terminology, because it is more suited to the variables with which we are dealing. Thus the quadratic terms in the eikonal give the five first order image errors plus the first order spherical aberration of the pupil, and in general the n'th order terms in the eikonal give the aberrations of order n - 1, of which all but one are image errors, and one is an aberration of the pupil. We now go on to discuss the general transformation expressions.

3.4. STATEMENT OF THE TRANSFORMATION

It is desired to express the coefficients in the eikonal at object and stop magnifications G' and S' in terms of those at G and S. In such a transformation we have seen that the old variables I, II, III will become new variables I', II', III', but it should be carefully noted that the quantities (1), (2), (3) in terms of which the old and new variables are defined do not change in the transformation, for they are independent of G and S, being functions of the direction cosines alone. By analogy with eqs. (2.31), the variables I', II', III' are defined by the equations

$$
\begin{aligned}
(S' - G')^2\,\mathrm{I}' &= (1) - 2G'(2) + G'^2(3), \\
(S' - G')^2\,\mathrm{II}' &= (1) - (S' + G')(2) + S'G'(3), \\
(S' - G')^2\,\mathrm{III}' &= (1) - 2S'(2) + S'^2(3).
\end{aligned}
\tag{3.13}
$$

Solving these equations for (1), (2), (3) in terms of I', II', III', either directly or by the equation analogous to eq. (2.30), and substituting the results in eq. (2.31), we obtain the transformation from I, II, III to I', II', III' as

$$
\begin{aligned}
\mathrm{I}\,(S - G)^2 &= \mathrm{I}'(S' - G)^2 - 2\,\mathrm{II}'(S' - G)(G' - G) + \mathrm{III}'(G' - G)^2, \\
\mathrm{II}\,(S - G)^2 &= \mathrm{I}'(S' - G)(S' - S) - \mathrm{II}'\{(S' - G)(G' - S) + (G' - G)(S' - S)\} + \mathrm{III}'(G' - G)(G' - S), \\
\mathrm{III}\,(S - G)^2 &= \mathrm{I}'(S' - S)^2 - 2\,\mathrm{II}'(S' - S)(G' - S) + \mathrm{III}'(G' - S)^2.
\end{aligned}
\tag{3.14}
$$
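The algebra leading to eqs. (3.14) is easy to get wrong by a sign, so a numerical check may be useful. The sketch below is only an illustration under stated assumptions: it takes eqs. (2.31) to have the same form as eqs. (3.13) with unprimed S and G (the middle equation carrying the product SG), and the function names `reduced_vars`, `rhs_3_14` and `check_3_14` are hypothetical. It evaluates both sides of eqs. (3.14) in exact rational arithmetic for random values of (1), (2), (3), S, G, S', G'.

```python
from fractions import Fraction
import random


def reduced_vars(one, two, three, S, G):
    """I, II, III from (1), (2), (3); assumed form of eqs. (2.31)/(3.13)."""
    d = (S - G) ** 2
    I = (one - 2 * G * two + G * G * three) / d
    II = (one - (S + G) * two + S * G * three) / d
    III = (one - 2 * S * two + S * S * three) / d
    return I, II, III


def rhs_3_14(Ip, IIp, IIIp, S, G, Sp, Gp):
    """Right-hand sides of eqs. (3.14); Sp, Gp stand for S', G'."""
    r1 = Ip * (Sp - G) ** 2 - 2 * IIp * (Sp - G) * (Gp - G) + IIIp * (Gp - G) ** 2
    r2 = (Ip * (Sp - G) * (Sp - S)
          - IIp * ((Sp - G) * (Gp - S) + (Gp - G) * (Sp - S))
          + IIIp * (Gp - G) * (Gp - S))
    r3 = Ip * (Sp - S) ** 2 - 2 * IIp * (Sp - S) * (Gp - S) + IIIp * (Gp - S) ** 2
    return r1, r2, r3


def check_3_14(trials=200, seed=0):
    """Return True if eqs. (3.14) hold exactly for every random sample."""
    rng = random.Random(seed)

    def rnd():
        return Fraction(rng.randint(-50, 50), rng.randint(1, 20))

    for _ in range(trials):
        one, two, three, S, G, Sp, Gp = (rnd() for _ in range(7))
        if S == G or Sp == Gp:
            continue  # the variables are undefined when stop and object coincide
        I, II, III = reduced_vars(one, two, three, S, G)
        Ip, IIp, IIIp = reduced_vars(one, two, three, Sp, Gp)
        lhs = tuple(v * (S - G) ** 2 for v in (I, II, III))
        if lhs != rhs_3_14(Ip, IIp, IIIp, S, G, Sp, Gp):
            return False
    return True


if __name__ == "__main__":
    print("eqs. (3.14) consistent with (3.13):", check_3_14())
```

Exact fractions are used rather than floating point so that a failed comparison would indicate a genuine algebraic discrepancy and not rounding.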

These relations may be expressed more concisely in a notation borrowed from invariant theory, the crossed brackets, which we proceed to define. For a more detailed study, see GRACE and YOUNG [1903], pp. 1-20.

3.5. CROSSED BRACKETS

By $(a_0, a_1, \ldots, a_n \between x, y)^n$ we agree to mean

$$
{}^nC_0\,a_0 x^n + {}^nC_1\,a_1 x^{n-1}y + \cdots + {}^nC_n\,a_n y^n,
$$

where ${}^nC_r$ is the usual binomial coefficient. We may describe this expression by saying that $(x + ty)^n$ is to be expanded and $t^r$ replaced by $a_r$ throughout. This description enables us to interpret expressions such as

$$
(a_0, a_1, \ldots, a_n \between x, y)^k\,(x', y')^{n-k}
$$

as long as $n \ge k$. For we simply take $(x + ty)^k (x' + ty')^{n-k}$ and replace $t^r$ by $a_r$ throughout. Another obvious extension is

$$
(b_0, b_1, \ldots, b_{2n} \between x, y, z)^n,
$$

which is defined by the operation of evaluating $(x + ty + t^2 z)^n$ and replacing $t^r$ by $b_r$ throughout. This again may be extended to

$$
(b_0, b_1, \ldots, b_{2n} \between x, y, z)^k\,(x', y', z')^{n-k}
$$

precisely as above.

3.6. THEORY OF THE TRANSFORMATION
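As an aside on the notation of Section 3.5, the rule "expand in the dummy t, then replace each power of t by the coefficient with that index" translates almost word for word into a short routine. The following is a minimal sketch, not a standard library function: the names `poly_mul` and `crossed_bracket` and the calling convention (a list of `(values, power)` pairs, one per bracketed factor) are assumptions made for illustration.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Multiply two polynomials in t given as coefficient lists (lowest power first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def crossed_bracket(coeffs, factors):
    """Evaluate a crossed bracket.

    coeffs  : [a_0, a_1, ...], the quantities that replace t**0, t**1, ...
    factors : list of (values, power); (x, y) stands for x + t*y and
              (x, y, z) stands for x + t*y + t**2*z.
    """
    poly = [Fraction(1)]
    for values, power in factors:
        base = [Fraction(v) for v in values]
        for _ in range(power):
            poly = poly_mul(poly, base)
    if len(poly) != len(coeffs):
        raise ValueError("number of coefficients does not match the degree in t")
    # Replace t**r by coeffs[r] throughout.
    return sum(c * a for c, a in zip(poly, coeffs))


if __name__ == "__main__":
    # (a0, a1, a2 >< x, y)^2 = a0*x^2 + 2*a1*x*y + a2*y^2
    print(crossed_bracket([1, 2, 3], [((2, 5), 2)]))
    # (a0, a1, a2 >< x, y)(x', y') = a0*x*x' + a1*(x*y' + x'*y) + a2*y*y'
    print(crossed_bracket([1, 2, 3], [((2, 5), 1), ((3, 7), 1)]))
```

The binomial coefficients of the definition never have to be written explicitly; they arise from the polynomial multiplication itself, which is why the same routine also covers the mixed and three-variable brackets.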

Returning to our transformation, we see that if we divide eqs. (3.14) through by $(S - G)^2$ we may write the result in crossed bracket form as

$$
\begin{aligned}
\mathrm{I} &= (\mathrm{I}', \mathrm{II}', \mathrm{III}' \between 1 + s,\, -g)^2, \\
\mathrm{II} &= (\mathrm{I}', \mathrm{II}', \mathrm{III}' \between 1 + s,\, -g)(s,\, 1 - g), \\
\mathrm{III} &= (\mathrm{I}', \mathrm{II}', \mathrm{III}' \between s,\, 1 - g)^2,
\end{aligned}
\tag{3.15}
$$

where we have written

$$
s = \frac{S' - S}{S - G}, \qquad g = \frac{G' - G}{S - G},
$$

so that s and g represent the displacements of the stop image and of the object image respectively as fractions of the original separation of these images, as seen from eq. (3.2). The relation of the eikonal $E_G$ (where S is implied as the stop magnification) to the eikonal $E_{G'}$ (where S' is implied as the stop magnification) may be inferred from eq. (3.5) as

$$
E_{G'} = E_G + \frac{G' - G}{K}\left(\frac{L}{GG'} - L'\right).
\tag{3.16}
$$

But to perform the transformation explicitly we must assume that $E_G$ and $E_{G'}$ are expanded as infinite series of some form chosen to simplify the work as much as possible. Ordinary power series in I, II, III and in I', II', III' would lead to hopelessly complicated transformation expressions, so we follow a different approach and investigate the transformation through the structure of its invariants. First we note the identity

$$
(S - G)^2(\mathrm{I\,III} - \mathrm{II}^2) \equiv (1)(3) - (2)^2 = (S' - G')^2(\mathrm{I'\,III'} - \mathrm{II'}^2) = (MN' - M'N)^2.
\tag{3.17}
$$
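The identity (3.17) can be confirmed numerically. The sketch below assumes, consistently with the last member of (3.17) but not stated in this section, that (1) = M² + N², (2) = MM' + NN' and (3) = M'² + N'², and that I, II, III are defined as in eqs. (3.13) with unprimed S and G. The function name `invariant_holds` is hypothetical.

```python
from fractions import Fraction
import random


def invariant_holds(trials=200, seed=0):
    """Check (S-G)^2 (I*III - II^2) == (1)(3) - (2)^2 == (MN' - M'N)^2 exactly."""
    rng = random.Random(seed)

    def rnd():
        return Fraction(rng.randint(-20, 20), rng.randint(1, 9))

    for _ in range(trials):
        M, N, Mp, Np, S, G = (rnd() for _ in range(6))
        if S == G:
            continue
        # Assumed definitions of the rotational invariants (1), (2), (3).
        one = M * M + N * N
        two = M * Mp + N * Np
        three = Mp * Mp + Np * Np
        d = (S - G) ** 2
        I = (one - 2 * G * two + G * G * three) / d
        II = (one - (S + G) * two + S * G * three) / d
        III = (one - 2 * S * two + S * S * three) / d
        inv = d * (I * III - II * II)
        if inv != one * three - two * two:
            return False
        if inv != (M * Np - Mp * N) ** 2:
            return False
    return True


if __name__ == "__main__":
    print("identity (3.17) holds on all samples:", invariant_holds())
```

Since the middle member does not contain S or G at all, the same run also illustrates why the quantity is unchanged when S and G are replaced by S' and G'.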

This shows that $(S - G)^2(\mathrm{I\,III} - \mathrm{II}^2)$ is an invariant of the transformation, and, moreover, vanishes for rays in a plane through the axis of the system, since we have seen that a ray will lie in a plane containing the axis only if

(3.18)

Consider now the terms in $E_G$ of order n, i.e. the aberrations (including stop aberration) of order n - 1. These terms will form a homogeneous expression of order n in the variables I, II, III. Since the


transformation is linear and homogeneous, the new terms of the n'th order will be derived from and only from the old terms of the same order. But in virtue of the identity (3.17), if we represent the n'th order terms as a finite series of powers of $(S - G)^2(\mathrm{I\,III} - \mathrm{II}^2)$ with coefficient polynomials tailored to bring each term up to the n'th degree, then upon transformation the powers of $(S - G)^2(\mathrm{I\,III} - \mathrm{II}^2)$ will be invariant and therefore the old polynomial coefficient of each power will alone determine the new coefficient of the same power of $(S - G)^2(\mathrm{I\,III} - \mathrm{II}^2)$. The decomposition of the n'th order terms into a series of powers of $(S - G)^2(\mathrm{I\,III} - \mathrm{II}^2)$ is not unique if we allow arbitrary coefficient polynomials. However, if we use crossed bracket polynomials the decomposition has been shown by T. SMITH [1922] to be unique, though the original proof is tedious. Writing out the terms in the various series explicitly we have for the n'th order terms

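$$
\begin{aligned}
&(D^n_0, D^n_1, \ldots, D^n_{2n} \between \mathrm{I}, -2\,\mathrm{II}, \mathrm{III})^{n} \\
&\quad + (\mathrm{I\,III} - \mathrm{II}^2)(S - G)^2\,(D^n_2, D^n_3, \ldots, D^n_{2n-2} \between \mathrm{I}, -2\,\mathrm{II}, \mathrm{III})^{n-2} \\
&\quad + (\mathrm{I\,III} - \mathrm{II}^2)^2(S - G)^4\,(D^n_4, D^n_5, \ldots, D^n_{2n-4} \between \mathrm{I}, -2\,\mathrm{II}, \mathrm{III})^{n-4} \\
&\quad + \cdots \\
&\quad + (\mathrm{I\,III} - \mathrm{II}^2)^w(S - G)^{2w}\,(D^n_{2w}, D^n_{2w+1}, \ldots, D^n_{2n-2w} \between \mathrm{I}, -2\,\mathrm{II}, \mathrm{III})^{n-2w} \\
&\quad + \cdots,
\end{aligned}
\tag{3.19}
$$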

where the D's are to be regarded as aberration coefficients and the different "series" are simply the groups of terms involving the different powers of $(S - G)^2(\mathrm{I\,III} - \mathrm{II}^2)$. It should be noted that the terms in eq. (3.19) may have a common factor depending on n applied to them all. This will be important when we consider that in the transformation there will be extra terms of each order arising from the quantity

$$
\frac{G' - G}{K}\left(\frac{L}{GG'} - L'\right)
$$

in eq. (3.16), and all of them will be of series zero. The transformation equations for series zero will of course be more complicated because of the extra terms, so we treat this series and the problem of choosing over-all coefficients for the terms of the various orders a little later. Hence we concentrate on series w, w > 0, at first. From eq. (3.14) we find that for any b we have identically

$$
(\mathrm{I}, \mathrm{II}, \mathrm{III} \between 1,\, -b)^2 = (\mathrm{I}', \mathrm{II}', \mathrm{III}' \between 1 + s - sb,\; -g - b + gb)^2.
\tag{3.20}
$$
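Identity (3.20) carries the whole of the later argument, so it is worth verifying independently. The sketch below writes out eqs. (3.15) term by term (each crossed bracket expanded by the rule of Section 3.5) and compares the two sides of (3.20) for random rational values. The function names `old_from_new` and `identity_3_20_holds` are hypothetical.

```python
from fractions import Fraction
import random


def old_from_new(Ip, IIp, IIIp, s, g):
    """Eqs. (3.15) written out: I, II, III in terms of I', II', III'."""
    I = Ip * (1 + s) ** 2 - 2 * IIp * (1 + s) * g + IIIp * g ** 2
    II = (Ip * (1 + s) * s
          + IIp * ((1 + s) * (1 - g) - g * s)
          - IIIp * g * (1 - g))
    III = Ip * s ** 2 + 2 * IIp * s * (1 - g) + IIIp * (1 - g) ** 2
    return I, II, III


def identity_3_20_holds(trials=200, seed=0):
    """Compare I - 2b II + b^2 III with the right-hand side of (3.20)."""
    rng = random.Random(seed)

    def rnd():
        return Fraction(rng.randint(-30, 30), rng.randint(1, 12))

    for _ in range(trials):
        Ip, IIp, IIIp, s, g, b = (rnd() for _ in range(6))
        I, II, III = old_from_new(Ip, IIp, IIIp, s, g)
        lhs = I - 2 * b * II + b * b * III
        x = 1 + s - s * b          # first argument of the bracket
        y = -g - b + g * b         # second argument of the bracket
        rhs = Ip * x * x + 2 * IIp * x * y + IIIp * y * y
        if lhs != rhs:
            return False
    return True


if __name__ == "__main__":
    print("identity (3.20) holds on all samples:", identity_3_20_holds())
```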


We wish to consider what happens to eq. (3.19) under a change of stop and conjugates from S and G to S' and G'. By eq. (3.16), if we avoid series zero with its extra terms, each of the series in the n'th order of $E_G$ goes directly into the same series in $E_{G'}$. By eq. (3.17) the factor $(S - G)^{2w}(\mathrm{I\,III} - \mathrm{II}^2)^w$ at the head of every series of eq. (3.19) is invariant. We thus only need consider what happens to terms of the form

$$
(D^n_{2w}, D^n_{2w+1}, \ldots, D^n_{2n-2w} \between \mathrm{I}, -2\,\mathrm{II}, \mathrm{III})^{n-2w}
\tag{3.21}
$$

under the transformation. Now the left hand side of eq. (3.20) is $\mathrm{I} - 2b\,\mathrm{II} + b^2\,\mathrm{III}$, which is exactly what we would put for the second bracket of eq. (3.21), with b as the dummy t, in its expansion. This will allow us to find the relation between the D's before and after transformation. The definition of the D's after transformation is of course analogous to that before transformation, so that

$$
(D'^n_{2w}, D'^n_{2w+1}, \ldots, D'^n_{2n-2w} \between \mathrm{I}', -2\,\mathrm{II}', \mathrm{III}')^{n-2w} = (D^n_{2w}, D^n_{2w+1}, \ldots, D^n_{2n-2w} \between \mathrm{I}, -2\,\mathrm{II}, \mathrm{III})^{n-2w}.
$$

The definition of the crossed brackets allows this to be written as

$$
(\mathrm{I}' - 2B\,\mathrm{II}' + B^2\,\mathrm{III}')^{n-2w} = (\mathrm{I} - 2b\,\mathrm{II} + b^2\,\mathrm{III})^{n-2w},
\tag{3.22}
$$

where $B^v$ is to be replaced by $D'^n_{2w+v}$, and $b^v$ is to be replaced by $D^n_{2w+v}$. But $\mathrm{I} - 2b\,\mathrm{II} + b^2\,\mathrm{III}$ is the left hand side of eq. (3.20), and hence equals

$$
\mathrm{I}'(1 + s - sb)^2 - 2\,\mathrm{II}'(1 + s - sb)(g + b - gb) + \mathrm{III}'(g + b - gb)^2.
\tag{3.23}
$$

We thus wish to find the (n - 2w)th power of this expression and equate to the left hand side of eq. (3.22), with the convention on $B^v$ and $b^v$. Coefficients of like powers of the variables could then be equated. Consider the term

$$
\mathrm{I}'^{\,r}(-2\,\mathrm{II}')^{p}\,\mathrm{III}'^{\,t}, \qquad r + p + t = n - 2w.
$$

By eq. (3.22) the particular D' that serves as coefficient for this term depends on p + 2t = v, the power of B. Then for all p, t such that p + 2t = v, the coefficient of $\mathrm{I}'^{\,r}(-2\,\mathrm{II}')^{p}\,\mathrm{III}'^{\,t}$ in the left hand side of eq. (3.22) is $D'^n_{2w+v}$ multiplied by whatever trinomial coefficient is associated with these powers. On the right hand side, obtained from eq. (3.23), the corresponding coefficient will be the same trinomial coefficient multiplied by

$$
(1 + s - sb)^{2r}(1 + s - sb)^{p}(g + b - gb)^{p}(g + b - gb)^{2t},
$$

with our convention on $b^v$. If we write $\binom{n-2w}{p,\,t}$ for the trinomial coefficient of p, t on n - 2w we have with the usual convention on $b^v$

$$
\binom{n-2w}{p,\,t} D'^n_{2w+v} = \binom{n-2w}{p,\,t}(1 + s - sb)^{2r+p}(g + b - gb)^{p+2t},
$$

$$
\therefore\quad D'^n_{2w+v} = (1 + s - sb)^{2r+p}(g + b - gb)^{v} = (1 + s - sb)^{2n-4w-v}(g + b - gb)^{v}.
$$

That is to say

$$
D'^n_{2w+v} = (D^n_{2w}, D^n_{2w+1}, \ldots, D^n_{2n-2w} \between 1 + s,\, -s)^{2n-4w-v}\,(g,\, 1 - g)^{v}.
\tag{3.24}
$$
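A direct way to gain confidence in eq. (3.24) is to apply it to arbitrary coefficients and confirm that the polynomial of series w takes the same value in the old and in the new variables. The sketch below does this in exact arithmetic. It is an illustration only: `bracket` is a hypothetical helper implementing the crossed-bracket rule of Section 3.5, the list `D` holds the coefficients with indices 2w to 2n - 2w in order, and eqs. (3.15) are used in expanded form.

```python
from fractions import Fraction
import random


def poly_mul(p, q):
    """Product of two polynomials in the dummy t (coefficient lists, lowest power first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def bracket(coeffs, factors):
    """Crossed bracket: expand the product of factors in t, replace t**r by coeffs[r]."""
    poly = [Fraction(1)]
    for values, power in factors:
        for _ in range(power):
            poly = poly_mul(poly, [Fraction(v) for v in values])
    assert len(poly) == len(coeffs)
    return sum(c * a for c, a in zip(poly, coeffs))


def transform_D(D, n, w, s, g):
    """Eq. (3.24): new coefficients of series w; entry v is the one with index 2w + v."""
    m = 2 * n - 4 * w
    return [bracket(D, [((1 + s, -s), m - v), ((g, 1 - g), v)]) for v in range(m + 1)]


def old_from_new(Ip, IIp, IIIp, s, g):
    """Eqs. (3.15) written out."""
    I = Ip * (1 + s) ** 2 - 2 * IIp * (1 + s) * g + IIIp * g ** 2
    II = Ip * (1 + s) * s + IIp * ((1 + s) * (1 - g) - g * s) - IIIp * g * (1 - g)
    III = Ip * s ** 2 + 2 * IIp * s * (1 - g) + IIIp * (1 - g) ** 2
    return I, II, III


def check_3_24(n, w, trials=50, seed=1):
    """True if the series-w polynomial agrees before and after transformation."""
    rng = random.Random(seed)

    def rnd():
        return Fraction(rng.randint(-9, 9), rng.randint(1, 5))

    m = n - 2 * w
    for _ in range(trials):
        D = [rnd() for _ in range(2 * m + 1)]
        s, g = rnd(), rnd()
        Dp = transform_D(D, n, w, s, g)
        Ip, IIp, IIIp = rnd(), rnd(), rnd()
        I, II, III = old_from_new(Ip, IIp, IIIp, s, g)
        new_value = bracket(Dp, [((Ip, -2 * IIp, IIIp), m)])
        old_value = bracket(D, [((I, -2 * II, III), m)])
        if new_value != old_value:
            return False
    return True


if __name__ == "__main__":
    for n, w in [(3, 1), (4, 1), (5, 2)]:
        print(n, w, check_3_24(n, w))
```

Note also that series w contributes 2n - 4w + 1 coefficients, and summing this over w = 0, 1, ... gives (n + 1)(n + 2)/2, the number of monomials of degree n in three variables, which is consistent with the uniqueness of the decomposition (3.19).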

This important equation gives the aberration coefficients for S' and G' in terms of those for S and G. We must of course remember that w > 0, since series 0 needs separate treatment. As noted previously, the result (3.24) requires modification for the special case w = 0 owing to the presence of the terms in L and L' in eq. (3.16), all of which contribute to series 0. The additional terms of order n are evidently

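$$
\frac{G' - G}{K}\left\{(-1)^n\,{}^{1/2}C_n\,\frac{(1)^n}{GG'} - (-1)^n\,{}^{1/2}C_n\,(3)^n\right\}
\tag{3.25}
$$

in which $(-1)^n$ is just what it says, but $(1)^n$ and $(3)^n$ are powers of the variables (1) and (3). In terms of the variables I', II', III' the expression (3.25) is

$$
\frac{G' - G}{K}\,(-1)^n\,{}^{1/2}C_n\left\{\frac{(S'^2\,\mathrm{I}' - 2S'G'\,\mathrm{II}' + G'^2\,\mathrm{III}')^n}{GG'} - (\mathrm{I}' - 2\,\mathrm{II}' + \mathrm{III}')^n\right\}.
$$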

Next we evaluate the binomial coefficient, which readily gives

$$
(-1)^n\,{}^{1/2}C_n = -\frac{(2n - 2)!}{2^{2n-1}\,n!\,(n - 1)!}.
\tag{3.26}
$$
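Eq. (3.26) is a standard identity and can be checked mechanically. The sketch below assumes that ${}^{1/2}C_n$ denotes the generalized binomial coefficient with upper index 1/2, as arises in expanding a square root such as $(1 - x)^{1/2}$. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def half_binomial(n):
    """Generalized binomial coefficient C(1/2, n) as an exact fraction."""
    out = Fraction(1)
    for k in range(n):
        out *= (Fraction(1, 2) - k) / (k + 1)
    return out


def closed_form(n):
    """Right-hand side of eq. (3.26), valid for n >= 1."""
    return -Fraction(factorial(2 * n - 2),
                     2 ** (2 * n - 1) * factorial(n) * factorial(n - 1))


def check_3_26(n_max=12):
    """Compare (-1)^n C(1/2, n) with the closed form for n = 1 .. n_max."""
    return all((-1) ** n * half_binomial(n) == closed_form(n)
               for n in range(1, n_max + 1))


if __name__ == "__main__":
    print("eq. (3.26) holds for n = 1..12:", check_3_26())
```

For n = 1, 2, 3 both sides equal -1/2, -1/8 and -1/16 respectively, so every term of order n arising from the square root enters with the same sign.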

Hence the additional terms of order n are

$$
-\frac{G' - G}{K}\,\frac{(2n - 2)!}{2^{2n-1}\,n!\,(n - 1)!}\left\{\frac{(S'^2\,\mathrm{I}' - 2S'G'\,\mathrm{II}' + G'^2\,\mathrm{III}')^n}{GG'} - (\mathrm{I}' - 2\,\mathrm{II}' + \mathrm{III}')^n\right\}.
\tag{3.27}
$$

Now for all terms of the same order we had proposed an over-all coefficient, say $c_{p+q+r}$, where p + q + r = n. Calling this $c_n$ it seems wise to take

(3.28)

for then every term of order n in the eikonals would have the coefficient $c_n$ included in its definition, and this would be the same as in the additional terms of the same order. The factor may then be cancelled in the transformation expressions, order zero included. Again, the quantities expressing the aberrations are preferably dimensionless, so that the factor 1/K which introduces the unit of length must also be excluded. To do this we re-write eq. (3.16) in the form

$$
K E_{G'} = K E_G + (G' - G)\left(\frac{L}{GG'} - L'\right)
$$

and regard all coefficients as coefficients of the dimensionless eikonal KE. Now in the expansion of (3.27) in terms of the form $\mathrm{I}'^{\,r}(-2\,\mathrm{II}')^{p}\,\mathrm{III}'^{\,t}$ the same trinomial coefficients are encountered as in the regular terms of the series, so that again these cancel in the transformation formulae. Note too that aside from the trinomial coefficient, the coefficient of $\mathrm{I}'^{\,r}(-2\,\mathrm{II}')^{p}\,\mathrm{III}'^{\,t}$ in the large parentheses in (3.27) is

$$
\frac{S'^{2r}\cdot S'^{p}G'^{p}\cdot G'^{2t}}{GG'} - 1, \quad\text{i.e.}\quad (S'^{2r+p}\,G'^{v-1}\,G^{-1} - 1).
$$

Thus with all factors accounted for, the relations for the aberrations of the zero series may be written in the form

$$
D'^n_{v} = (D^n_0, D^n_1, \ldots, D^n_{2n} \between 1 + s,\, -s)^{2n-v}\,(g,\, 1 - g)^{v} + (G' - G)(S'^{2n-v}\,G'^{v-1}\,G^{-1} - 1).
\tag{3.29}
$$

Thus we have explicit stop-shift and conjugate-shift expressions for all orders of aberrations.
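The series-zero relation (3.29) can be tested in the same spirit as the other series. The sketch below is an illustration under stated assumptions: the common numerical factor of each order and the factor 1/K are taken as already divided out, so the additional terms of order n are modelled as $(G' - G)\{(1)^n/(GG') - (3)^n\}$ with (1) and (3) expressed through I', II', III' as in the text. The helper names (`bracket`, `check_3_29`) are hypothetical.

```python
from fractions import Fraction
import random


def poly_mul(p, q):
    """Product of two polynomials in the dummy t (coefficient lists, lowest power first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def bracket(coeffs, factors):
    """Crossed bracket: expand the product of factors in t, replace t**r by coeffs[r]."""
    poly = [Fraction(1)]
    for values, power in factors:
        for _ in range(power):
            poly = poly_mul(poly, [Fraction(v) for v in values])
    assert len(poly) == len(coeffs)
    return sum(c * a for c, a in zip(poly, coeffs))


def check_3_29(n, trials=40, seed=2):
    """True if eq. (3.29) reproduces the transformed series-zero polynomial."""
    rng = random.Random(seed)

    def rnd():
        return Fraction(rng.randint(-6, 6), rng.randint(1, 4))

    for _ in range(trials):
        S, G, Sp, Gp = (rnd() for _ in range(4))
        if S == G or G == 0 or Gp == 0:
            continue  # skip degenerate magnifications
        s, g = (Sp - S) / (S - G), (Gp - G) / (S - G)
        D = [rnd() for _ in range(2 * n + 1)]
        # Eq. (3.29): regular part plus the extra series-zero term.
        Dp = [bracket(D, [((1 + s, -s), 2 * n - v), ((g, 1 - g), v)])
              + (Gp - G) * (Sp ** (2 * n - v) * Gp ** (v - 1) / G - 1)
              for v in range(2 * n + 1)]
        Ip, IIp, IIIp = rnd(), rnd(), rnd()
        # Old variables from eqs. (3.15), written out.
        I = Ip * (1 + s) ** 2 - 2 * IIp * (1 + s) * g + IIIp * g ** 2
        II = Ip * (1 + s) * s + IIp * ((1 + s) * (1 - g) - g * s) - IIIp * g * (1 - g)
        III = Ip * s ** 2 + 2 * IIp * s * (1 - g) + IIIp * (1 - g) ** 2
        # (1) and (3) in the primed variables.
        one = Sp * Sp * Ip - 2 * Sp * Gp * IIp + Gp * Gp * IIIp
        three = Ip - 2 * IIp + IIIp
        lhs = bracket(Dp, [((Ip, -2 * IIp, IIIp), n)])
        rhs = (bracket(D, [((I, -2 * II, III), n)])
               + (Gp - G) * (one ** n / (G * Gp) - three ** n))
        if lhs != rhs:
            return False
    return True


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, check_3_29(n))
```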


3.7. RELATION TO THE FOCAL EIKONAL

We now mention the relations between the D's of any order and the coefficients of the standard focal eikonal E. If we define aberration coefficients for E by the relations

$$
(D^n_{2w}, D^n_{2w+1}, \ldots, D^n_{2n-2w} \between \mathrm{I}, -2\,\mathrm{II}, \mathrm{III})^{n-2w} = (E^n_{2w}, E^n_{2w+1}, \ldots, E^n_{2n-2w} \between (1), -2(2), (3))^{n-2w}
\tag{3.30}
$$

then it follows as in T. SMITH [1922] that for all series except series zero we have

$$
D^n_{2w+v} = (E^n_{2w}, E^n_{2w+1}, \ldots, E^n_{2n-2w} \between S,\, -1)^{2n-4w-v}\,(G,\, -1)^{v}
\tag{3.31}
$$

and for series zero the result is modified to

$$
D^n_{v} = (E^n_0, E^n_1, \ldots, E^n_{2n} \between S,\, -1)^{2n-v}\,(G,\, -1)^{v} - G - S^{2n-v}\,G^{v-1}.
\tag{3.32}
$$

These results make it possible to write out at once the form of the focal eikonal in the variables (1), (2), (3) when $E_G$ is given.

§ 4. Conclusion

In the space of one article it is impossible to discuss or even mention many of the remarkable things that have been done this century in geometrical optics, especially by T. Smith. It is hoped that this article will at least draw attention to the fact that a general aberration theory is far from impossible and that the first step towards it seems to be a detailed review of the great work of T. Smith.

References

BRUNS, H., 1895, Saechs. Ber. d. Wiss. 21.
GRACE, J. H. and A. YOUNG, 1903, Algebra of Invariants (Cambridge).
HAMILTON, Sir W., 1931, Collected Mathematical Papers, Vol. I (Cambridge).
HERZBERGER, M., 1958, Modern Geometrical Optics (Interscience, New York).
LANCZOS, C., 1949, The Variational Principles of Mechanics (Toronto).
LUNEBERG, R. K., 1944, Mathematical Theory of Optics (Brown University).
SMITH, T., 1921, 1922, Trans. Opt. Soc. (London) 23 (1921-22); reprinted in "National Physical Laboratory Collected Researches" 17, Paper 13, with an Appendix of Proofs.
STEWARD, G. C., 1926, Trans. Camb. Phil. Soc. 23, No. 9.
STEWARD, G. C., 1928, The Symmetrical Optical System (Cambridge).
SYNGE, J. L., 1937, Geometrical Optics (Cambridge).