Volume 18, number 3    OPTICS COMMUNICATIONS    August 1976

SOME COMMENTS ON THE USE OF THE ZERNIKE POLYNOMIALS IN OPTICS

Eric C. KINTNER*
Department of Physics, University of Edinburgh, Scotland

Received 30 April 1976
The advantages of a modified application of the Zernike polynomials are demonstrated.

* Address from October, 1976: U.S. National Bureau of Standards, Washington, D.C. 20234, USA.

Recent papers by Bezdid'ko [1] and Noll [2] indicate a revival of interest in the properties of the Zernike polynomials. This revival is both appropriate and timely, first because the Zernike polynomials are the natural set of functions for describing any function defined within a unit circle, and second because they are increasingly being applied to novel physical problems (such as atmospheric turbulence). In this paper, I would like to suggest a slight but significant modification to the usual application of the Zernike polynomials.

Using the dimensionless circular co-ordinates $(\rho, \varphi)$, the pupil function of an optical system may be expressed in the form

$$\mathcal{H}(\rho,\varphi) = \begin{cases} \exp\{2\pi i\,\Phi(\rho,\varphi)\}, & \rho \le 1, \\ 0, & \rho > 1, \end{cases} \tag{1a}$$

where

$$\Phi(\rho,\varphi) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \Phi_n^m R_n^m(\rho)\, e^{im\varphi} \qquad (m+n \text{ even}) \tag{1b}$$

and the functions $\{R_n^m(\rho)\}$ are the Zernike polynomials. The function $\Phi(\rho,\varphi)$ represents the phase, or path length variation, over the pupil. This completely general form may be simplified in two obvious ways. First, when the pupil is unapodized, the function $\Phi(\rho,\varphi)$ is entirely real and the series expansion may be expressed in terms of sines and cosines. Second, when the pupil is symmetric about the axis $\varphi = 0$ (as in optical design), the series consists of cosine terms only. In practice, only a limited number of coefficients are employed. This is the form most commonly seen in the literature (e.g. [3]).

Alternatively, and with the same generality, the pupil function may be expressed in the form

$$\mathcal{H}(\rho,\varphi) = \begin{cases} \displaystyle\sum_{n=0}^{\infty} \sum_{m=-n}^{n} k_n^m R_n^m(\rho)\, e^{im\varphi}, & \rho \le 1, \\ 0, & \rho > 1, \end{cases} \qquad (m+n \text{ even}), \tag{2}$$

with $|\mathcal{H}(\rho,\varphi)| \le 1$. When the pupil is unaberrated, the pupil function is real, and again the series expansion reduces to sines and cosines; when the pupil is symmetric, only the cosine terms remain. The disadvantage of this expression, which has hitherto inhibited its application, is that it is not as familiar to opticists as the polynomial expansion of the aberrated wavefront. However, its particular advantage is that it has a simple "Fourier" transform (in circular co-ordinates), and this property leads simply and directly to several powerful applications.

Where the Fraunhofer approximation is valid, the amplitude distribution in the image plane of an optical system is given by the (inverse) Fourier transform of the pupil function. If the pupil function is characterized by eq. (2), then it may be shown that the amplitude point-spread function in the image plane is given, using the dimensionless coordinates $(r, \theta)$, by
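$$A(r,\theta) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} k_n^m \left[ 2\pi (-i)^n \frac{J_{n+1}(2\pi r)}{2\pi r} \right] e^{im\theta}. \tag{3}$$

(The dimensionless coordinates $\rho$ and $r$ may be related to the geometrical coordinates $\rho'$ and $r'$ through the equations

$$\rho = \rho'/\rho_0 \quad \text{and} \quad r = (\rho_0/\lambda R)\, r', \tag{4a, b}$$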
where $\rho_0$ is the radius of the pupil, $R$ the distance from the pupil to the image, and $\lambda$ the mean wavelength of the quasi-monochromatic illumination). This result may be deduced from the similar equation given in [2], or obtained from the equations given in [3]; a complete derivation is given in [4].

The importance of this result is that, unlike the "spectrum" of the wavefront deviations in the pupil [derived from eq. (1)], the "spectrum" of the pupil function itself [derived from eq. (2)] may be associated with a real physical distribution, namely, the amplitude distribution in the image plane. Interestingly, this same result was originally obtained by Zernike [5], using the form of the pupil function in eq. (2), in the same paper on the phase-contrast method wherein he first introduced the Zernike polynomials.

A diagram of the orthogonal functions in the image plane (fig. 1) is useful for interpreting the result in eq. (3). Two important features of this diagram should be noted. First, only the function of degree $n = 0$ is non-zero at the origin; that is, only the $k_0^0$ coefficient contributes to the Strehl intensity at the Gaussian image point. Second, the innermost and largest maximum of the orthogonal function for each successive degree $n$ appears further from the Gaussian image point and is of smaller amplitude; that is, the light becomes increasingly dispersed as the degree $n$ is increased.

Fig. 1. The first four radial functions in the image plane, $f_n(r)$ plotted against $r$. The term $(-i)^n$ has been suppressed.

From these observations, two conclusions may be inferred immediately. An expression for the total energy passing through the circular pupil may be obtained by taking the squared modulus of the expression in eq. (2) and integrating over the unit circle. Using the orthogonality of the Zernike polynomials, as demonstrated in [3], the total energy (within a constant) is found to be

$$E = \int_0^1 \int_0^{2\pi} |\mathcal{H}(\rho,\varphi)|^2\, \rho\, d\varphi\, d\rho = \pi \sum_{n=0}^{\infty} \sum_{m=-n}^{n} |k_n^m|^2/(n+1). \tag{5a}$$
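The orthogonality relation behind eq. (5a) can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes the standard explicit form of the radial polynomials given in [3] (normalized so that $R_n^m(1) = 1$), and the function names `zernike_radial`, `radial_overlap` and `pupil_energy` are illustrative choices. The radial integral of $R_n^m R_{n'}^m \rho$ should equal $1/[2(n+1)]$ when $n = n'$ and vanish otherwise; combined with the factor $2\pi$ from the angular integral, this gives the weight $\pi/(n+1)$ in eq. (5a).

```python
from math import factorial, pi


def zernike_radial(n, m, rho):
    """Radial Zernike polynomial R_n^m(rho), normalized so that R_n^m(1) = 1."""
    m = abs(m)
    if m > n or (n - m) % 2:
        raise ValueError("need |m| <= n and n - m even")
    total = 0.0
    for s in range((n - m) // 2 + 1):
        coeff = ((-1) ** s * factorial(n - s)
                 / (factorial(s)
                    * factorial((n + m) // 2 - s)
                    * factorial((n - m) // 2 - s)))
        total += coeff * rho ** (n - 2 * s)
    return total


def radial_overlap(n1, n2, m, steps=2000):
    """Composite Simpson estimate of the integral over [0, 1] of
    R_n1^m(rho) * R_n2^m(rho) * rho (steps must be even)."""
    h = 1.0 / steps
    acc = 0.0
    for j in range(steps + 1):
        rho = j * h
        weight = 1 if j in (0, steps) else (4 if j % 2 else 2)
        acc += weight * zernike_radial(n1, m, rho) * zernike_radial(n2, m, rho) * rho
    return acc * h / 3.0


def pupil_energy(coeffs):
    """Right-hand side of eq. (5a); coeffs maps (n, m) to the coefficient k_n^m."""
    return pi * sum(abs(k) ** 2 / (n + 1) for (n, _m), k in coeffs.items())
```

For example, `radial_overlap(2, 2, 0)` should be close to 1/6 and `radial_overlap(2, 4, 0)` close to zero, while `pupil_energy({(0, 0): 1.0})` returns the area $\pi$ of the unit circle.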
Since the area of the unit circle is $\pi$, the mean energy density over the pupil is

$$E/\pi = |k_0^0|^2 + \sum_{n=1}^{\infty} \sum_{m=-n}^{n} |k_n^m|^2/(n+1). \tag{5b}$$
When $k_0^0 = 1$ and the other coefficients vanish, $E/\pi = 1$ and the amplitude distribution in the image plane is the Airy diffraction pattern corresponding to an unaberrated and unapodized pupil. If the coefficients $\{k_n^m\}$ are varied while the energy remains constant (aberration), eq. (5b) can be re-arranged to give the normalized Strehl intensity:

$$i = |k_0^0|^2 = 1 - \sum_{n=1}^{\infty} \sum_{m=-n}^{+n} |k_n^m|^2/(n+1). \tag{6}$$
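As a worked illustration of eq. (6), the sketch below evaluates the normalized Strehl intensity from a set of higher-order coefficients. It assumes the constant-energy normalization $E/\pi = 1$ used above; the function names and the dictionary layout mapping `(n, m)` to $k_n^m$ are hypothetical conveniences, not notation from the text.

```python
from math import sqrt


def strehl_intensity(aberration_terms):
    """Normalized Strehl intensity from eq. (6).

    aberration_terms maps (n, m), with n >= 1, |m| <= n and m + n even,
    to the (possibly complex) coefficient k_n^m.
    """
    loss = 0.0
    for (n, m), k in aberration_terms.items():
        if n < 1 or abs(m) > n or (n + m) % 2:
            raise ValueError("invalid index pair (%d, %d)" % (n, m))
        loss += abs(k) ** 2 / (n + 1)
    if loss > 1.0:
        raise ValueError("coefficients exceed the unit mean energy density")
    return 1.0 - loss


def central_coefficient_modulus(aberration_terms):
    """|k_0^0| implied by eq. (6) for the same set of coefficients."""
    return sqrt(strehl_intensity(aberration_terms))
```

With a single term $k_2^0 = 0.3$, the sum in eq. (6) is $0.09/3 = 0.03$, so `strehl_intensity({(2, 0): 0.3})` gives approximately 0.97. The weight $1/(n+1)$ means that a coefficient of a given modulus costs less Strehl intensity at higher degree.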
This exact expression for the Strehl intensity is analogous to the approximate expression, based on eq. (1), which is derived in [3]. By arguments identical to those in [3], eq. (6) can be used to show that an isolated aberration term $k_n^m$ ($n \ne 0$) implicitly includes the compensation necessary to keep the Strehl intensity at a maximum for the given degree of aberration. In other words, any further re-adjustment of the other coefficients $\{k_n^m\}$ necessarily reduces the Strehl intensity. Thus the familiar aberration-balancing feature of the Zernike polynomials, which holds approximately when the pupil function is expressed as in eq. (1), holds exactly when the pupil function is expressed as in eq. (2).

It should be obvious that while these results are exact when eq. (2) is used, they remain approximately true (for small aberrations) when eq. (1) is used, because an exponential may be approximated by a linear term (for small arguments).

Through the familiar van Cittert-Zernike theorem, the results of diffraction theory may be applied to problems of partial coherence. For example, let the (real) intensity distribution over a circular source be given by

$$I(\rho,\varphi) = \begin{cases} \displaystyle\sum_{n=0}^{\infty} \sum_{m=-n}^{+n} k_n^m R_n^m(\rho)\, e^{im\varphi}, & \rho \le 1, \\ 0, & \rho > 1, \end{cases} \qquad (m+n \text{ even}). \tag{7}$$
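Then, where the Fraunhofer approximation holds, the complex degree of coherence is given by

$$\gamma(r,\theta) = (1/\gamma_0) \sum_{n=0}^{\infty} \sum_{m=-n}^{n} k_n^m \left[ 2\pi (-i)^n \frac{J_{n+1}(2\pi r)}{2\pi r} \right] e^{im\theta}. \tag{8a}$$

The constant $\gamma_0$ is set by the requirement $\gamma(0,-) \equiv 1$ at the origin. Since

$$\left[ \frac{J_{n+1}(x)}{x} \right]_{x=0} = \begin{cases} \tfrac{1}{2}, & n = 0, \\ 0, & n > 0, \end{cases}$$

it follows that

$$\gamma(0,-) = 1 = (1/\gamma_0)\, k_0^0\, [2\pi \cdot \tfrac{1}{2}],$$

so

$$\gamma_0 = \pi k_0^0. \tag{8b}$$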
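The limit used for eq. (8b), and the behaviour of the radial functions described for fig. 1, can be examined with a short numerical sketch. It assumes the radial function of degree $n$ is $2\pi J_{n+1}(2\pi r)/(2\pi r)$ with the factor $(-i)^n$ suppressed, and it evaluates the integer-order Bessel functions from Bessel's integral $J_n(x) = (1/\pi)\int_0^\pi \cos(n\tau - x \sin\tau)\, d\tau$ by the trapezoidal rule. The names `bessel_j`, `radial_function` and `first_peak` are illustrative only.

```python
from math import cos, sin, pi


def bessel_j(order, x, steps=400):
    """Integer-order Bessel function J_order(x) from Bessel's integral,
    evaluated with the trapezoidal rule on [0, pi]."""
    def integrand(t):
        return cos(order * t - x * sin(t))

    h = pi / steps
    acc = 0.5 * (integrand(0.0) + integrand(pi))
    for j in range(1, steps):
        acc += integrand(j * h)
    return acc * h / pi


def radial_function(n, r):
    """2*pi * J_{n+1}(2*pi*r) / (2*pi*r), with the factor (-i)^n suppressed."""
    if r == 0.0:
        # Limit of J_{n+1}(x)/x at x = 0: 1/2 for n = 0, zero for n > 0.
        return pi if n == 0 else 0.0
    x = 2.0 * pi * r
    return 2.0 * pi * bessel_j(n + 1, x) / x


def first_peak(n, r_max=2.0, points=400):
    """Grid search for the largest value of the radial function on [0, r_max];
    returns the pair (r, value)."""
    best_r, best_value = 0.0, radial_function(n, 0.0)
    for j in range(1, points + 1):
        r = j * r_max / points
        value = radial_function(n, r)
        if value > best_value:
            best_r, best_value = r, value
    return best_r, best_value
```

Only `radial_function(0, 0.0)` is non-zero at the origin, where it equals $\pi$, which reproduces $\gamma_0 = \pi k_0^0$. Calling `first_peak(n)` for $n = 0, 1, 2, 3$ should show the largest maximum moving outward and shrinking as $n$ increases. The grid search is a crude stand-in for a proper maximization and resolves the peak position only to the grid spacing.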
These examples demonstrate the special advantages of the direct representation for the pupil function given by eq. (2). Clearly, in order to extend the usefulness of this new approach, practical procedures are needed to connect the coefficients $\{\Phi_n^m\}$ of the wavefront representation in eq. (1) with the coefficients $\{k_n^m\}$ of the direct representation in eq. (2). Recent research [6] indicates that an analytic procedure can be developed to accomplish this.

I am grateful for stimulating discussions with R.M. Sillitto and W.J. Tango.

References
[1] S.N. Bezdid'ko, Sov. J. Opt. Technol. 41 (1974) 425.
[2] R.J. Noll, J. Opt. Soc. Am. 66 (1976) 207.
[3] M. Born and E. Wolf, Principles of Optics (Pergamon Press, 1959), Sect. 9.2 and App. VII.
[4] E.C. Kintner and R.M. Sillitto, Optica Acta (1976), in press.
[5] F. Zernike, Physica 1 (1934) 689.
[6] E.C. Kintner, Ph.D. Thesis (University of Edinburgh, 1975); W.J. Tango, private communication.