Application of simulated annealing to deconvolution in astronomical speckle imaging

Application of simulated annealing to deconvolution in astronomical speckle imaging

Volume 80, number 3,4 OPTICS COMMUNICATIONS Application of simulated annealing in astronomical speckle imaging I January 1991 to deconvolution ...

477KB Sizes 0 Downloads 45 Views

Volume 80, number

3,4

OPTICS

COMMUNICATIONS

Application of simulated annealing in astronomical speckle imaging

I January

1991

to deconvolution

E.J. Spillar and G.R. Ayers The W:voming Infrared Obwrvator.v, Department qfPhysical Astronomy, Ciniversrtyof W:voming, Laramle, WY 82071. USA Received

7 June 1990

Monte-Carlo type simulated annealing has been investigated as an optimization procedure for reconstructing high spatial resolution astronomical images. In particular, the annealing method is shown to be a powerful technique for phase retrieval in connection with Knox-Thompson speckle correlation data. An annealing schedule is proposed and optimum annealing parameters identified.

1. Introduction Knox-Thompson (KT) and triple correlation based methods are now well established imaging procedures used in the astronomical speckle interferometry community (for a detailed study see Ayers et al. [ 1 ] and references therein). However, considerable work remains in the area of image deconvolution or reconstruction from the resulting ensemble average correlation spectra. In this paper an image reconstruction procedure based on Monte-Carlo [ 2,3] simulated annealing is described. Although the procedure is computationally expensive, particularly for large scale problems, the method does possess a number of attractive properties and therefore warrants further attention. The attractive properties are: (i) the method is straightforward to code; (ii) known data error estimates can be incorporated into the procedure in a straightforward way; (iii) since local extrema in a search space can be avoided by suitable choice of annealing parameters and schedules, global extrema can be found; (iv) it is straightforward to introduce nonlinear constraints such as image positivity; (v) it is possible to calculate estimates in the errors of measured parameters directly from the reconstruction technique [ 41. In the preliminary work described here, only the 0030-401819

1603.50

0 I99 1 - Elsevier

Science Publishers

first two properties have been directly used. Property (iii) is shown to be true for the KT data used for this work. Properties (iv) and (v) are currently being investigated in detail and results will be presented at a later date. The seminal work of Geman and Geman [ 51 applied simulated annealing to image reconstruction. Geman and Geman [ 51 showed that under certain conditions simulated annealing will always find the global minimum of the energy function. They found upper limits on the time necessary to find the minimum and exploited (iv) and (v) by using “prior knowledge” to constrain reconstructions. The first application of simulated annealing to the problem of phase recovery was that of Nieto-Vesperinas and Mendez [ 61, who applied the technique to reconstruction from autocorrelation functions, using positivity as a constraint. We extend their approach to the reconstruction of images from KT spectra. In section 2 the problem of reconstructing images from KT spectrum data is briefly stated, and the application of simulated annealing to this problem is described. Section 3 discusses the results of applying annealing to the image reconstruction and gives further details on its application. Various annealing parameters, as discussed in section 2, are experimented with and optimum values identified. Conclusions based on the presented work are discussed in section 4.

B.V. (North-Holland)

219

Volume 80, number

OPTICS COMMUNICATIONS

3,4

2. Discussion of the KT method and the simulated annealing procedure The Knox-Thompson spatial correlation function [ 71, IKT( u, AU), of a one-dimensional image intensity distribution, i(x). is defined as IKT(~.A~)=I(~)

P(u+Au)

,

where I( U) is the spatial Fourier transform of i(x).

I(u)=

s

i(x)

exp( -2niux)

(1) function

tit

(2)

Here x is a spatial coordinate, and u and Au are spatial frequency coordinates. The problem to be solved consists of finding i(,u) given I KT. Typically, I KT is estimated from astronomical speckle data by evaluating ensemble averages over the recorded data set. The estimate of IKT thus contains noise terms which make the process of optimum image reconstruction non-trivial. Further details on speckle interferometry and the KT method can be found in Dainty [ 81. The KT spectrum data considered in this work is primarily computationally simulated so that the amount and form of the noise present can be controlled. In order to use the simulated annealing optimization procedure, an energy function must first be identified with its global minimum somehow corresponding to an optimum reconstructed image. We define the energy function, EL, as E, =

IsIIKT(~,A~)--1:‘(~,A~)/2d(AU)dU a( u, Au)’

(3) where a( U, Au) is the standard deviation of the recorded KT data, IKT(u, Au), and the subscripts k serve to indicate the various functions associated with the kth estimate. ik(x), of the reconstructed image. Ek is a x2 description of the fit between the KT spectrum of reconstructed image, and the data. The simulated annealing procedure is used to minimize Ek and hence find an optimum image reconstruction. To understand the simulated annealing algorithm’s origin, consider how a piece of iron is annealed. At a temperature near the melting point, 220

I January

1991

thermal fluctuations continually rearrange the atoms in the material, thus “testing” different configurations of atoms. At high temperatures, enough free energy exists so that worse states are occasionally accepted; this allows the iron to escape local energy minima. As the temperature is lowered the atoms tend toward lower energy arrangements. When the material is cooled slowly enough, imperfections such as cracks or dislocations (associated with local energy minima) have time to disappear. and the global lowest energy rearrangement of atoms results, a crystalline solid. In an analogous fashion, the simulated annealing algorithm evolves a system which has well defined states, each of which has a well defined energy. A random sequence of states is generated as a control parameter T is slowly decreased. To produce the next state from the current state, a new state “near” the old state is randomly generated, and the change in energy AEl, = EL -El, _, is calculated. If the energy has decreased, the new state is automatically accepted. If the energy has increased, the new state is accepted with probability exp( - AE,/T), by analogy to the Boltzmann distribution. Again the k subscript serves to indicate values associated with the kth image estimate or state. After a series of random states have been produced at a given value of the parameter T, the parameter is decreased. T begins at a value high enough so that essentially any perturbation is accepted and is slowly decreased until some termination criterion is reached; the final state of the system is the minimum energy state. Because states of larger energy are occasionally accepted, local minima do not fool the algorithm. As applied to the KT spectrum data, the SA procedure consists of generating a series of images { ik).. Given ik, a trial for iA+, is created by randomly changing ik. If the change in the energy function, AE,, is negative, the new estimate is accepted. If AE,> 0 the new image is accepted with the Boltzmann probability, otherwise it is rejected and a new trial image is created. To allow statistics to be accumulated at a given temperature, T is not changed between every image. At a given T, enough trials are generated such that the sequence of images {in. .... i,,,}, samples all possible values fairly. Then T is decreased and a new chain of images is created.

Volume 80, number

3,4

OPTICS

COMMUNICATIONS

At a particular value of T, with a judicious choice of the energy function, the Markov Chain fi, n, .... i,,,} (known as a “Gibbs sampler”) visits trial images with frequency proportional to their likelihood. This means that the Gibbs sampler can be used to obtain Monte-Carlo estimates of quantities and errors on these quantities [4]. We shall exploit this in future work. Geman and Geman show that if the temperature is lowered sufficiently slowly, the series will converge such that limk_mE(iL)+E,,,, where Emin is the minimum obtainable energy. That is to say ik will converge to i(x), the image with the minimum energy. The procedure for computing T is known as a cooling schedule. Several choices for the cooling schedule have been used in the literature [ 91. The choices of cooling schedule, initial temperature, and method for perturbing ik to obtain ik+1, all affect the time it takes for series of images to converge. Several statistics associated with the pth Markov chain at temperature T, are calculated. In particular, if Ek, k= n, n+ 1, .... n+m are the energies of the trials in the series at a particular temperature T,, then the mean ,u = 1/m,XkE,, and the standard deviation (J=J_ ( Ek - 11,) ) are calculated. The length of individual Markov chains depends on the size of the problem and various values for the present problem are considered in section 4. After a Markov chain has been realized, its characteristics are used to determine the new temperature T,,,. Based on [9] (4) where 6=0.1. Initially, the control parameter T must be chosen high enough so that the space is thoroughly explored, but not so high that a great deal of time is wasted in so doing. Again based on [ 9 1, T is initially set so that 95% of the image trials are accepted. The method of perturbing images is discussed in the following section, which describes the results of applying the annealing procedure to image reconstruction from KT data.

3. Application and results In order to study the annealing

procedure,

com-

I January

1991

putationally generated, noisy, KT spectrum data of a one-dimensional sampled image have been used. The image to be reconstructed is described by N= 128 sample points. The sampling interval is 6x and in Fourier space is 6u= 1/N6x. The KT data has u maX= 386~ and Au can take on the values 0,6u, 2&i, 3&u, which is reasonably typical of one-dimensional near-infrared speckle data. These figures give approximately 150 KT spectrum data points. The scale size of the present problem is therefore in the range 38 to 150 and for ease of discussion is considered to be 100. Instead of generating a Markov chain of images, it is equivalent and computationally more efficient to generate a series of trials, {lk( u)}, for the Fourier transform of the reconstructed image. Each new image transform estimate, Ik+, (u), is generated by perturbing one of the frequency components of Zk( u ). The frequency components are perturbed in turn, starting at u=O with a perturbation value being drawn from a uniform probability distribution between values of ? CN (typically C is around unity). The symbol N is an estimate of the standard deviation of the noise on the Fourier modulus data. A value for N is found by analysing the section ZKT( u, Au = 0 ) of the KT spectrum data. In this preliminary work, N represents an average value over all Fourier frequency components. Different values of the constant C are tried in combination with various sizes of Markov chains. To compare the effect of the varying parameters, a fixed number of total trial images have been generated for each combination of C and chain length and the changing value of the energy function has been recorded. The results for 400000 trial images in each case are shown in fig. 1. The cooling schedule described in section 2 is used in each case. An alternative procedure known as quenching, again by analogy to the methods of ironmongery, consists of setting T= 0 throughout the annealing process. The probability of accepting a perturbation which increases the energy function is always exactly zero in this schedule. Using this procedure, there are no guarantees of avoiding local minima in the energy function and convergence tends to stagnate very quickly. The horizontal lines in fig. 1 correspond to the results of quenching using various values of the constant C. Such a schedule is ob221

Volume 80. number

3.4

OPTICS

I January

COMMUNICATIONS

10

Markov

100

chain

1991

l( 00

length

Fig. I A graph of the final cost or energy obtamed after 400000 trial images in annealing procedures with various combinations of (‘and Markov chain length. Curves (c). (d) and (e) correspond to simulated annealing with C= 1.0. C=O.2 and C~4.0 respectively. while the two horizontal lines, (a) and (b ). correspond to quenching with c‘= I .O and C~4.0 respectively.

viously independent of the Markov chain length. The number used to describe the Markov chain length in this work is not the total number of trials but the number of trials accepted. For example, a Markov chain length of 100 corresponds to generating enough trials at each fixed temperature such that 100 are accepted. Based on the simulations carried out and the results shown in fig. 1, the following observations can be made. All combinations of Markov chain length and perturbation scale give reconstructed images of equivalent quality, if the annealing procedure is allowed to converge. The fastest convergence observed required approximately 400000 perturbations and so for comparison, each combination of annealing parameters was used in an annealing procedure of 400000 perturbations. The perturbation scale (c‘= 1.O) matched to the estimated errors gave fastest convergence. Markov chain lengths comparable to the size of the problem (around 100) gave fastest convergence, independent of the perturbation scale. although the tolerance around the optimum chain length is greatest in the case of the optimum perturbation scale. Quenching is slower to converge than the annealing procedure combined with optimum annealing parameters but is superior in performance 222

to the annealing procedure with sub-optimum annealing parameters. Fig. 2 shows six independently reconstructed images obtained using a Markov chain length of 100, 400000 perturbations, and C= 1.0. The results are typical for all combinations of annealing parameters provided the procedure is allowed to converge.

4. Conclusions In this preliminary work, the process of simulated annealing has been applied to the reconstruction of images from KT spectrum data and optimum annealing parameters have been identified. Perturbations of trial image Fourier components need to be closely matched to the error on the components. Such errors can be estimated from the KT spectrum data. The length of Markov chains used in the annealing procedure need to be matched to the scale size of the problem at hand. Specifically for KT spectrum data, chain lengths in the region between the number of reconstructed Fourier components and the number of KT spectrum data points are required for fastest convergence. Determination of a more precise optimum chain length value is hampered by the presence of correlation between individual data

Volume 80, number

3,4

OPTICS

COMMUNICATIONS

RECONSTRUCTED 0.20-f

v,

2

0.05

k! w

0.00

:

f4

-0.05

i+

-0.10

z

-0.15 0.10 I 48

:

(C = 1.0: I : : :

56

1 January

IMAGES

MARKOV CHAIN LENGTH = 100) I : : : : : : : I : : : I

64

72 PIXEL

1991

80

:

:

88

:

k

96

NUMBER

Fig. 2. Six independently reconstructed images obtained using a Markov chain length of 100, 400000 perturbations, and c’= 1.0. These parameter values are considered to be reasonably optimum for the present problem. The dotted line with open circles for data points is the diffraction limited image plotted for comparison. Note that a number of the images are negative, reflecting the trivial ambiguity of the KT spectrum data that the sign of the reconstructed image is not uniquely defined to be positive or negative.

0.12 t

IRS9

NGC7538

,x

2.2

pm

t

t

0.09

t NORTH

-1

Relative Fig. 3. A reconstructed speckle data.

1

0

position

image of NGC 7538 (IRS 9) obtained

points, between errors on data points, and between individual Fourier components. Fortunately, there exists a reasonable tolerance on the value of the optimum chain length.

2

in arc-seconds

using the described

simulated

annealing

procedure

from near infrared

Based on the above conclusions and simulation results, an optimum annealing procedure has been identified and applied to one-dimensional infrared speckle data obtained at the Wyoming Infrared Ob-

223

Volume 80. number

OPTICS

3.4

COMMUNICATIONS

servatory (WIRO). An image of NGC 7538 (IRS 9), reconstructed from KT spectrum data at 2.2 urn obtained at WIRO in July 1989 is shown in fig. 3 as an example of the use of the annealing technique. The extended two micron emission in the southerly direction confirms (with increased resolution) the slow slit scan observations of Dyck and Staude [ lo]. Such emission has also been observed by Werner et al. [ 1 11, who interpret it as reflection nebulosity. The techniques we have outlined for reconstructing images from KT spectra are easily extended. In a paper currently in preparation, we apply annealing to reconstructing images from the bispectrum in one and two dimensions. We also exploit the flexibility of the technique to apply constraints, such as positivity, to the reconstruction process.

References [ 1) G.R. Ayers, M.J. Northcott Am.A5 (1988)963.

224

and J.C. Dainty,

J. Opt. Sot.

1 January

1991

[2] N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth. A.H. Teller and E. Teller, J. Chem. Phys. 21 ( 1953) 108. [3] N.E. Collins, R.W. Eglese and B.L. Golden, American Journal of Mathematical and Management Sciences 8 (1988) 209. [ 41 R. Szeliski, Bayesian modeling of uncertainty in low-level vision (Kluwer, Boston, I989 ), [ 51 S. Geman and D. Geman, IEEE Transactions on Pattern Analysis and Machine Intelligence 6 ( 1984) 721. [ 61 M. Nieto-Vesperinas and J.A. Mendez. Optics Commun. 59 (1986) 29. [ 71 K.T. Knox and B.J. Thompson, Astrophysical Journal Letters 193 (1974) 45. [ 81 J.C. Dainty, in: Laser Speckle and Related Phenomena, ed. J.C. Dainty (Springer, Berlin, 1975) p. 255. [ 91 E. Aarts and J. Korst, Simulated annealing and Boltzmann machines (Wiley, New York, 1989). [ IO] H.M. Dyck and H.J. Staude, Astron. Astrophys. 109 (1982) 320. [ I 1 ] M.W. Werner, E.E. Becklin, I. Gatley, K. Matthews. G. Neuegebauer and C.G. Wynn-Williams, Mon. Not. Roy. Astron. Sot. 188 (1979) 463.