Efficient method for finding the position of object boundaries to sub-pixel precision

John P Oakley* and Richard T Shann†

*Department of Electrical Engineering, University of Manchester, Manchester M13 9PL, UK
†Department of Medical Illustration, University of Manchester, Oxford Road, Manchester, UK
A new class of algorithms is described for the analysis of boundaries in a discrete image. The simplest algorithm of this class accepts as input an initial estimate for the boundary position and, after a process of iterative refinement, outputs a more accurate estimate. The method is an extension of known methods for edge detection which are based on Gaussian filtering. Instead of using a discrete filter and exhaustive evaluation, the filter output is computed only at isolated points. These points are selected by a numerical optimization routine to converge on the feature of interest. Because in general the points do not coincide with pixels, the filter output must be reconstructed between pixels from the discrete image data. This method permits measurements to be made to sub-pixel accuracy without the need for a mathematical model of the boundary. The reliability of the method is discussed in terms of the pixel size (sampling error) and the size and proximity of clutter relative to the size of the filter. The accuracy is related to the filter size. Possible applications include non-contact measurement, and an example is given. Keywords: numerical optimization, derivative of Gaussian, sub-pixel resolution, reconstruction filter, image reconstruction, metrology, edge detection, frame store, CCD camera.

0262-8856/91 © 1991 Butterworth-Heinemann Ltd

INTRODUCTION

There are many industrial and research situations in which it is attractive to use video cameras for making measurements. Specialized, non-contact, co-ordinate measurement systems are available, but these tend to be cumbersome and expensive. Photography is relatively cheap and convenient, but it is slow because of the need to process the film before taking measurements from an enlarged image. Also, the accuracy of measurement is limited by the stability of the photographic media used. Until recently, the poor performance and high cost of suitable cameras has limited the use of video camera technology in measurement applications. There are now several medium (512 by 512) and high (1024 by 1024) resolution Charge Coupled Device (CCD) cameras available at reasonable cost. By their very nature, these cameras provide an image with low geometrical distortion, and this makes them ideal for dimensional measurements. In this type of application the camera is used with a digital frame store, a microprocessor and specialized software. Measurements are made by finding the distance between chosen points on each boundary in the image frame. The chosen point is (in principle) any point in the case of a straight edge or, in the case of a curved edge, the point where a particular orientation is achieved. A mathematical transformation is then used to relate this distance to the object dimension. The task of finding a mathematical model for the image geometry is known as the camera calibration problem, and it will be assumed here that such a model is available. This leaves the problem of measuring distances in the discrete image. The measurement accuracy required in a given application is often greater than the resolution of the image frame store. There are several known image analysis techniques which can be used in this type of application, but these techniques tend to be either computationally expensive or application-specific, or both. This paper describes a new method, the Iterative Boundary Locator (IBL), for accurate boundary detection which is simpler to use than existing methods and requires a relatively small amount of computer power.

The IBL is particularly suited to measurement applications involving regular (straight or curved) shapes. It is based on a process of iterative refinement of an initial estimate for boundary locations. In interactive systems the initial estimate of boundary positions is easily obtained using a moving cursor or a light pen. In automatic inspection applications an a priori estimate may be available.

The aim of this paper is to introduce this new algorithm in sufficient detail to encourage the interested reader to experiment. A review of the pertinent aspects of the directional derivative edge operator is first given, and the principles behind the new algorithm are described, together with a discussion of the expected accuracy. A simple implementation of the IBL is outlined, and examples of results obtained using it are given. Various possibilities for extending and/or modifying the approach are also discussed.

Current methods for boundary detection

The most widely used method for obtaining precise measurements from discrete image data involves an initial image analysis stage which results in a set of imprecise and possibly inaccurate boundary points. This data is then combined with a mathematical model for the known boundary shape in order to give a precise location for the boundary. For example, if it is required to find the exact centre of an elliptical object, then a numerical model-fitting procedure is used to find the ellipse parameters which best fit a set of edge data. This approach often gives accurate results, even in the presence of high noise levels, because the final parameters are determined by many image points. A review of this class of algorithms is given in Ballard and Brown¹ under the heading 'Searching near an approximate location'. The Hough transform or Generalized Hough Transform (GHT) is another method which can be used for making precise measurements from digital image data. An iterative search in image space can be replaced with an iterative search in GHT space, with the start point being determined from a priori knowledge of the object shape. This method is not so attractive for metrology applications, since a complete mathematical model of object shape is required and the space to be searched may have many dimensions. These two methods can be characterized as non-local methods. More accurate local position information can be obtained without a mathematical model for the boundary shape by using an interpolation algorithm in conjunction with the Laplacian of Gaussian operator²,³. The approach described here is similar to this in that accurate local position information is obtained directly from the distribution of grey levels in the image.
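The non-local, model-fitting approach can be illustrated with a minimal sketch. The fragment below fits a circle (a simpler special case of the ellipse example) to noisy edge points by algebraic least squares; the Kåsa-style formulation, the function names and all parameter values are invented for the illustration and are not the specific procedure used in the references.

```python
import math, random

def solve3(A, b):
    # Gaussian elimination with partial pivoting for a 3x3 system A p = b.
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(3):
        p = max(range(i, 3), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, 3):
            f = M[r][i] / M[i][i]
            for c in range(i, 4):
                M[r][c] -= f * M[i][c]
    x = [0.0] * 3
    for i in (2, 1, 0):
        x[i] = (M[i][3] - sum(M[i][c] * x[c] for c in range(i + 1, 3))) / M[i][i]
    return x

def fit_circle(points):
    """Algebraic least-squares circle fit: minimise
    sum (x^2 + y^2 + D x + E y + F)^2 over D, E, F;
    centre = (-D/2, -E/2), radius = sqrt(D^2/4 + E^2/4 - F)."""
    Sxx = Sxy = Syy = Sx = Sy = Sxz = Syz = Sz = 0.0
    n = len(points)
    for x, y in points:
        z = x * x + y * y
        Sxx += x * x; Sxy += x * y; Syy += y * y
        Sx += x; Sy += y
        Sxz += x * z; Syz += y * z; Sz += z
    A = [[Sxx, Sxy, Sx], [Sxy, Syy, Sy], [Sx, Sy, n]]
    D, E, F = solve3(A, [-Sxz, -Syz, -Sz])
    cx, cy = -D / 2.0, -E / 2.0
    return cx, cy, math.sqrt(cx * cx + cy * cy - F)

# Noisy edge points on a circle of centre (40.3, 25.7) and radius 12.
rng = random.Random(0)
pts = [(40.3 + 12 * math.cos(t) + rng.gauss(0, 0.2),
        25.7 + 12 * math.sin(t) + rng.gauss(0, 0.2))
       for t in [2 * math.pi * k / 60 for k in range(60)]]
cx, cy, r = fit_circle(pts)
```

Because the three parameters are determined by sixty edge points, the recovered centre is much more precise than any individual (noisy) boundary point, which is exactly the strength of the non-local methods; the cost is that the circular model must be known in advance.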

The new method

The boundary finding method which will be described here achieves sub-pixel accuracy using only local information. It does this by evaluating a directional derivative of Gaussian edge detector at points which lie between pixel positions. The output from the IBL is a set of floating point co-ordinates for the best estimate of boundary position. No explicit model for the object is required, just an initial estimate of the position and orientation of the boundary. Typically, the IBL might be used interactively, with the user pointing to the required boundary and the algorithm delivering the position and orientation; alternatively, in an industrial application, a component might be delivered to the system in a roughly constant position and the locator will determine its precise position. The IBL can also be used to seek out the orientation of a boundary by including the direction of the directional derivative as one of the variables to be optimized. A long narrow edge operator will then automatically align itself with a nearby boundary. This is useful when the orientation of straight edges in images must be found.

DIRECTIONAL DERIVATIVE EDGE ENHANCEMENT OPERATORS

Edge enhancement operators basically provide estimates for the partial derivatives of the underlying grey tone intensity surface using the discrete image data. These estimates can only be made under some assumptions concerning the smoothness of the underlying continuous image. Early edge operators, such as the Roberts and Sobel operators, have found many applications, but produce a transformed image which bears little relation to a true gradient image. An important step was taken by Marr and Hildreth, who combined a derivative operator, the Laplacian, with a low-pass filter operator, the Gaussian filter. They found that an accurate Laplacian of Gaussian image could be produced from an arbitrary input image. They argued that a Gaussian filter can be used to define the spatial scale or resolution of the image. If the convolution operation is denoted by * then the filtered image f' of an input image f is given by:

f' = f * G

where G is the kernel of the Gaussian filter and α is the scale parameter*. The Laplacian operator ∇² can then be applied to f' in order to find boundary features at that scale. The computation of ∇²f' is simplified because of the associativity of the convolution operator. This means that

∇²f' = ∇²(f * G) = f * (∇²G)

Thus the Laplacian of a Gaussian filtered image can be obtained from a simple convolution with a kernel function ∇²G, which has a well-known 'Mexican hat' shape. The technique of combining derivative operators and filters can be applied to derivative operators other than the Laplacian. In particular, any partial derivative of the Gaussian filtered image can be estimated in this way. This fact is of particular importance in the present work. Marr and Hildreth used the zero crossing contour of the Laplacian of Gaussian image as an estimate for the boundary. Other workers used the same type of Gaussian filter but with derivative measures other than the Laplacian. Haralick uses the second order directional derivative and finds the zero crossings in the transformed image. Canny⁶ uses the first directional derivative and describes a scheme for edge analysis which involves evaluating the directional derivative in the direction of maximum gradient. Instead of zero crossings he uses local minima/maxima as estimates of boundary position. Canny also discusses the use of Gaussian filters which are not radially symmetric. An example of such a filter would be the function exp(−αx² − βy²) for α ≠ β, which can be decomposed into the product function exp(−αx²)exp(−βy²). This non-radial filter is useful for finding extended edges in noisy images. For example, if the location of a vertical edge is to be found then the directional derivative filter will have the form −2αx exp(−αx²)exp(−βy²). The final term in the product function provides smoothing along the edge direction. When the edge is long, β may be made small, and this will improve detector performance in the presence of noise. Canny calls this smoothing term the 'projector function' and the first term the 'detector function'. The IBL is a boundary finding technique which is related to the Canny scheme in that it involves searching for local minima/maxima of the directional derivative of Gaussian operator. It is important to appreciate the effect of the non-radial filter on curved boundaries, and so this will be illustrated using two-dimensional isometric projection plots. Figure 1a shows an isometric projection of a simple function which, shown as an image, would be a uniformly bright circular disk. Figures 1b and 1c both show directional derivative of Gaussian convolution kernels, but 1b is a radial filter whereas 1c is elliptical in shape, the projector function being longer than the detector function. The results of the two convolutions are shown in Figures 2a and 2b respectively. Both of the convolved plots show minima and maxima at the region on the boundary of the circle which is aligned with the operator direction. However, the peak is much more pronounced with the elliptical edge operator, and the whole function looks smoother. The ragged edge of Figure 2a is a sampling artifact which is reduced with the elliptical operator. In practice, the elliptical edge operator is less sensitive to sampling errors due to its larger support. However, in the present context the main reason for choosing the elliptical edge operator is that, with suitable choice of the projector function, strong minima and maxima can be obtained from a boundary with a very small curvature. As a general edge finding method the elongated directional derivative of Gaussian operator has many advantages over earlier techniques. These advantages are discussed in Canny⁶. Briefly, the method is far less sensitive to noise, it can work at any resolution, and it can take advantage of prior knowledge about boundary length and orientation. However, there are three important practical problems which arise when using the technique in image analysis. These are:

1 Computation of the required image convolution is expensive, even when this is restricted to the area of interest.
2 This computation must be repeated for all the directions of interest.
3 The result is (at best) accurate to the nearest pixel.

The latter point may be somewhat contentious. Some improvement in accuracy can be obtained by applying an interpolation procedure to the transformed image, but the magnitude of this improvement will depend on the individual image and will be hard to predict.

Figure 1. (a) Test image; (b) operator with projection and detection functions the same; (c) operator with elongated projection function

Figure 2. (a) Convolution with the operator of Figure 1(b); (b) convolution with the operator of Figure 1(c)

*In this paper we will use the standard form for the Gaussian function only when discussing the relationship of filter size to positional accuracy, since the use of sigma complicates the algebra. The reader may convert α to the sigma convention using σ = 1/√(2α).
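The three practical problems can be made concrete with a small sketch (illustrative only: the image size, the scale values and the function names are invented for the example). An elongated directional derivative of Gaussian detector of the form −2αx exp(−αx²)exp(−βy²) is evaluated exhaustively on the pixel raster across a synthetic vertical step edge; the evaluation is exhaustive (problem 1), for a single direction (problem 2), and its peak locates the edge only to the nearest pixel (problem 3).

```python
import math

alpha, beta = 0.25, 0.05          # detector scale alpha, projector scale beta

def K(u, v):
    """Elongated directional derivative of Gaussian for a vertical edge:
    detector -2*alpha*u*exp(-alpha*u^2) across the edge, projector
    exp(-beta*v^2) smoothing along it."""
    return -2 * alpha * u * math.exp(-alpha * u * u) * math.exp(-beta * v * v)

def response(img, x0, y0):
    """Convolution output at (x0, y0): sum of f(x, y) * K(x0 - x, y0 - y)."""
    return sum(img[y][x] * K(x0 - x, y0 - y)
               for y in range(len(img)) for x in range(len(img[0])))

# 32 x 32 test image: dark left half, bright right half
# (the true edge lies between the pixel columns x = 15 and x = 16).
img = [[0.0 if x < 16 else 1.0 for x in range(32)] for y in range(32)]

# Exhaustive scan on integer positions only; the peak is pixel-accurate.
resp = [response(img, x0, 16) for x0 in range(8, 25)]
best_x = 8 + max(range(len(resp)), key=lambda i: resp[i])
```

The peak falls on one of the two pixel columns adjacent to the true (half-pixel) edge position; no amount of rescanning on the integer raster can resolve the half-pixel ambiguity, which is the motivation for evaluating the operator between pixels.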

The IBL is an implementation of the derivative of Gaussian operator which avoids the three difficulties mentioned above. The problem of computational cost is avoided by not actually computing the convolved image. Instead, the direction (or approximate direction) and the approximate position of the edge are specified in advance by the user, and the transformed image is analysed at that one point for that one direction. On the basis of this analysis, the algorithm then chooses another point as a (hopefully better) estimate for the boundary location, and the process is iterated until the analysis indicates a boundary has been found. This means that the transformed image is only evaluated at a convergent sequence of points, rather than at all points on the image raster, and this results in a huge saving in computer time, especially for large operators. Moreover, the same convergent process can be applied to the possible orientations of the boundary, rather than exhaustively searching all directions. A floating point world co-ordinate system is used to represent the image raster, and the problem of obtaining high accuracy is solved by reconstructing the value of the transformed image between sample points. The IBL therefore represents a new combination of existing techniques for iteratively searching smooth functions and existing techniques for image enhancement.

ITERATIVE BOUNDARY LOCATOR

Basic principles

The IBL is a combination of an image transformation and a search procedure. Various feature-enhancement transformations could be used with the IBL but, for reasons of clarity, attention will be restricted here to the directional derivative of Gaussian operator described in the previous section. If the direction of the derivative is given by the angle θ and the scales of the detector and projector functions are α and β respectively, then the kernel function K(x, y, θ) for this convolution can be written:

K(x, y, θ) = [x cos θ + y sin θ] exp(ax² + by² + cxy)    (1)

where

a = −α cos²θ − β sin²θ
b = −α sin²θ − β cos²θ
c = 2(β − α) cos θ sin θ

The transformed image produced by convolving an image f(x, y) with the kernel function (1) will be denoted T_θ f(x, y). This will be a smooth surface, with local minima and maxima which correspond to boundaries in the original image. If the direction θ is held fixed then the surface is a function of x and y, but if θ is allowed to vary then we have a three-dimensional surface. The purpose of the IBL is to refine an initial estimate (x₁, y₁) or (x₁, y₁, θ₁) for the position of a local maximum/minimum in the transformed image. The situation is analogous to a mountaineer being placed at some point in a mountain range and being asked to climb the nearest peak. There is no guarantee that the nearest maximum/minimum* will correspond to the required boundary in the image, but this is likely provided the initial position is sufficiently close. There are various methods available for image reconstruction/interpolation and for numerical search. The methods chosen for the IBL seem to complement each other in a technically elegant way. The algorithms are discussed in detail in the following sections.

Evaluating the transformed image

Having selected a suitable image transformation for edge finding, an algorithm is required which can evaluate the transformed image at any point. There are two types of algorithm which can be used: interpolation and reconstruction. The interpolation method is well known, and various interpolation algorithms have been previously used in sub-pixel feature detection²,³. However, in order to use interpolation to evaluate the transformed image at some point (x, y), the discrete transformed image must be previously computed for pixels surrounding (x, y), and this is a considerable disadvantage in this application. The reconstruction method gives an estimate T_R f'(x, y) for the transformed image at (x, y) directly from the image samples in the region of (x, y). This is achieved with a reconstruction sum of the form

T_R f'(x, y) = Σᵢ Σⱼ S_ij R(x − i, y − j)    (2)

where R is a continuous function which is known as the reconstruction filter, and the S_ij are the image samples. In practice, the reconstruction filter R in (2) decays to zero and so the sum is not infinite. The filter R should be chosen so that T_R f'(x, y) is as close as possible to the transformed image T_θ f'(x, y) at all points. Filters are discussed in some detail elsewhere. Here we note that an intuitively reasonable choice for R would be the kernel function K(x, y, θ), and this gives good results in practice. A more detailed argument for this will be given later.
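The reconstruction sum (2) with R = K can be sketched as follows (an illustrative fragment, not the authors' implementation: image size, scales and names are invented, and the kernel sign is flipped relative to (1) so that a dark-to-bright step produces a maximum, in line with the normalization used later in the paper). The key point is that the transformed image is evaluated at points between pixels, directly from the samples.

```python
import math

alpha, beta = 0.2, 0.05           # detector and projector scales, theta = 0

def K(u, v):
    """Kernel (1) for theta = 0, up to sign and normalization; the sign
    is chosen so a dark-to-bright step gives a maximum."""
    return -u * math.exp(-alpha * u * u - beta * v * v)

def T(S, x, y, r=10):
    """Reconstruction sum (2) with R = K: the transformed image evaluated
    at a (generally non-integer) point directly from the samples S."""
    return sum(S[j][i] * K(x - i, y - j)
               for j in range(int(y) - r, int(y) + r + 1)
               for i in range(int(x) - r, int(x) + r + 1))

# Samples of a vertical edge whose true position is x = 15.3; each pixel
# integrates the scene over its square area, so one column is fractional.
def sample(i):
    return max(0.0, min(1.0, (i + 0.5) - 15.3))

S = [[sample(i) for i in range(32)] for _ in range(32)]

# Fine scan of the reconstructed surface between pixel positions.
xs = [14.0 + 0.01 * k for k in range(301)]
peak_x = max(xs, key=lambda x: T(S, x, 16.0))
```

The reconstructed surface peaks close to the true sub-pixel edge position even though no sample lies there; the exhaustive fine scan used here is only for illustration, and is exactly what the Newton search of the next section replaces.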

Search method

An exhaustive search of points (x, y) to find where the function (2) has a minimum would be impractical. Since the intensity surface defined by (2) is a continuous function with continuous partial derivatives, this is unnecessary. Instead, a numerical optimization algorithm can be used to iteratively refine an initial estimate for the edge position. One of the best algorithms for this purpose is known as Newton's method, and this is based on a quadratic model of the underlying function surface. The Newton optimization method requires estimates for the first and second order partial derivatives of the function being searched. These partial derivatives can be obtained by differentiating (2). For example, the first order partial derivative ∂T_R f'/∂x is given by

∂T_R f'(x, y)/∂x = Σᵢ Σⱼ S_ij (∂R/∂x)(x − i, y − j)

Thus each partial derivative can be estimated using the appropriate reconstruction filter (∂R/∂x, ∂²R/∂x∂y, etc.). If θ is fixed, then five partial derivatives must be evaluated for each point. For the free-θ IBL this number increases to nine. However, if R = K then these filters all take the simple form

P exp(ax² + by² + cxy)    (3)

where P is a polynomial in x, y, cos θ and sin θ. Examples of the six reconstruction filters required for a fixed-θ derivative of Gaussian IBL are shown in Figure 3. Here the projector scale β is one quarter of the detector scale α (meaning that the operator is elongated in the y direction). Figure 3a shows the basic transformation kernel K(x, y, θ). Figures 3b and 3c show the two first order partial derivatives ∂K/∂x and ∂K/∂y. Figures 3d, 3e and 3f show the second order partial derivatives.

Figure 3. Operator masks. (a) Edge operator K(x, y); (b) ∂K/∂x; (c) ∂K/∂y; (d) ∂²K/∂x²; (e) ∂²K/∂x∂y; (f) ∂²K/∂y²

An outline of the Newton method is as follows: let the position of the kth estimate of boundary position be denoted by the vector X_k. Let the derivative information be denoted by the vector (called the gradient)

g_k = [∂T_R f'(X_k)/∂x, ∂T_R f'(X_k)/∂y]ᵀ

and the matrix (called the Hessian)

G_k = | ∂²T_R f'(X_k)/∂x²    ∂²T_R f'(X_k)/∂x∂y |
      | ∂²T_R f'(X_k)/∂x∂y   ∂²T_R f'(X_k)/∂y²  |

Then the new estimate X_{k+1} is calculated from the formula

X_{k+1} = X_k − G_k⁻¹ g_k

This is known as the Newton step. Convergence can be detected by monitoring the size of the Newton step. Further details of the Newton optimization method can be found in any introductory text on numerical optimization.

*Obviously a maximum in a function f corresponds to a minimum in the function −f, so we can restrict the discussion in the rest of this paper to finding the maximum.

Sampling error

If it were possible to process the unsampled image directly (for example, by optical methods), then the required convolution operation could be written

T_θ f(x, y) = ∫∫ f(x', y') K(x' − x, y' − y, θ) dx' dy'    (4)

Sampling error will be defined here as the difference between the estimate T_R f' given by (2) and the true function T_θ f which could be obtained directly from (4). The starting point in this analysis is a model for the image sensor. This will be assumed to be a shift-invariant point spread function (PSF) m(x, y), so that the image sample S_ij at (i, j) is given by

S_ij = ∫∫ f(x', y') m(x' − i, y' − j) dx' dy'    (5)

The most probable type of sensor is a CCD camera, in which case the sample values may be assumed to be defined by a uniform integration of the incoming image over each (square) pixel. This is equivalent to a sensor function m(x, y) which is a rectangular top hat. However, in practice the incoming image will be subject to additional blurring factors such as imperfect focus, finite aperture and lens aberrations. The estimated image produced by reconstructing the sampled image can be found by substituting for the pixel values from (5) in the reconstruction sum (2), and reversing the order of integration and summation:

T_R f'(x, y) = ∫∫ f(x', y') Σᵢ Σⱼ m(x' − i, y' − j) R(x − i, y − j) dx' dy'    (6)

The sampling error is given by

ε(x, y) = ∫∫ f(x', y') e(x', y', x, y) dx' dy'    (7)

where e(x', y', x, y) is given by

e(x', y', x, y) = Σᵢ Σⱼ m(x' − i, y' − j) R(x − i, y − j) − K(x' − x, y' − y, θ)
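The Newton refinement can be sketched in one dimension (illustrative only: the synthetic image, the scale value and the function names are invented, and the real IBL works in two dimensions with the six filters above). The derivatives of the reconstructed response are obtained by differentiating the reconstruction filter, exactly as in the outline above, and the Newton step is monitored for convergence.

```python
import math

alpha = 0.2                       # detector scale, sigma = 1/sqrt(2*alpha)

# 1-D detector kernel (sign chosen so a positive step gives a maximum)
def k(u):  return -u * math.exp(-alpha * u * u)
# its first and second derivatives, used as reconstruction filters
def k1(u): return (2 * alpha * u * u - 1) * math.exp(-alpha * u * u)
def k2(u): return 2 * alpha * u * (3 - 2 * alpha * u * u) * math.exp(-alpha * u * u)

# Samples of a pixel-integrated step edge whose true position is x = 15.3.
def sample(i):
    return max(0.0, min(1.0, (i + 0.5) - 15.3))

S = [sample(i) for i in range(32)]

def derivs(x):
    """First and second derivatives of the reconstructed response,
    obtained by differentiating the reconstruction sum (2)."""
    d1 = sum(S[i] * k1(x - i) for i in range(32))
    d2 = sum(S[i] * k2(x - i) for i in range(32))
    return d1, d2

x = 15.0                          # initial estimate, to the nearest pixel
for _ in range(10):
    g, h = derivs(x)
    step = -g / h                 # 1-D Newton step
    x += step
    if abs(step) < 1e-8:          # convergence test on the step size
        break
```

Starting from a pixel-accurate estimate, the iteration converges in a handful of steps to a sub-pixel estimate of the edge position, without ever computing the transformed image on the full raster.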

If the function e is small, then the sampling error ε will also be small. To achieve a sampling error which is low, irrespective of the input image f, the reconstruction filter must be chosen to minimize a norm (for example, the sum of squares) of e. In Appendix A a reconstruction filter is derived which is optimal in an image-dependent sense, and this should give the best results in this system. This filter depends on the sensor PSF m. When m is a rectangular function then the optimal filter is given by the convolution product m*K. In practice, the kernel function K often has a similar shape to m*K and is simpler to calculate.
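The claim that K is close to the optimal filter m*K for a rectangular pixel PSF can be checked numerically. The sketch below (one-dimensional, with an invented scale value) computes m*K by quadrature for a unit-width box m and compares it with K itself.

```python
import math

alpha = 0.2                       # detector scale (illustrative value)

def K(u):                         # 1-D detector kernel
    return -u * math.exp(-alpha * u * u)

def mK(u, n=200):
    """Convolution of K with a unit-width rectangular pixel PSF m,
    by midpoint quadrature: (m*K)(u) = integral_{-1/2}^{1/2} K(u - s) ds."""
    h = 1.0 / n
    return sum(K(u - (-0.5 + (j + 0.5) * h)) * h for j in range(n))

# Compare the optimal filter m*K with the plain kernel K on a grid.
grid = [-5 + 0.1 * k for k in range(101)]
diff = max(abs(mK(u) - K(u)) for u in grid)
peakK = max(abs(K(u)) for u in grid)
```

For filters a few pixels wide the pointwise difference is a few per cent of the peak kernel value, which is why K itself is an acceptable and cheaper substitute for m*K in practice.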

Accuracy of detector with noise

A careful choice of the parameters α and β is required with this method if sub-pixel accuracy is to be obtained. There are a variety of ways in which the sampling error can be estimated from (7). For example, the sampling error might be computed for a specific input image f, such as a step edge, or it might be computed as the expected error when f is a random field. An image-dependent upper bound for sampling error has been formulated using function spaces. In general, the error due to both sampling and noise will decrease when the filter size is increased (i.e. reduced α and β). To minimize sampling error it will generally be sufficient to ensure that the filter is large compared with the pixel dimension. To avoid false edges due to noise we will often want to increase the size further. However, as we let the detector function become wider, the peak due to an edge will become wider, and so its location will become increasingly ill-defined. To get an idea of the magnitudes involved, consider the image f to be a one-dimensional step edge at x = a (see Figure 4a). As our argument here concerns the shape of the Gaussian peak, which is normally thought of in terms of its 'width' σ, we will switch to the notation σ = (2α)^(−1/2) for this section. The normalized* detector function

K(x) = (x/σ²) e^(−x²/2σ²)

gives

Tf(x) = e^(−(x−a)²/2σ²)    (8)

a Gaussian peak at the edge position. Consider now the effect of noise added to f. Suppose we have estimated a maximum deviation of δv on the true Tf due to noise. The noise will have the effect of distorting the shape of the edge response to Tf', so that the peak found by the optimization algorithm will be displaced by some distance δp from the true peak at x = a (see Figure 4b). The magnitude of this displacement will depend on the sharpness of the peak, which will in turn depend on the value chosen for σ. It is reasonable to assume that the value of σ will be large enough so that the detector output will decrease monotonically between a + δp and a (otherwise the search algorithm might find a false maximum, in which case the noise is too great and σ must be increased). Since we have supposed that

|Tf(x) − Tf'(x)| ≤ δv

for all x, we have

|Tf(a) − Tf'(a)| ≤ δv and |Tf(a + δp) − Tf'(a + δp)| ≤ δv

We can subtract the expressions on the left if we add the limits on the right:

|Tf(a) − Tf(a + δp) + (Tf'(a + δp) − Tf'(a))| ≤ 2δv

In this expression both Tf(a) − Tf(a + δp) and Tf'(a + δp) − Tf'(a) are positive, hence it follows that |Tf(a) − Tf(a + δp)| ≤ 2δv (the worst case is shown in Figure 4b). Substituting in our Gaussian Tf(x) we get, to first order in the small quantity δp/σ,

δp ≤ 2σ√(δv)    (9)

Figure 4. The one dimensional case. (a) Response of a step edge; (b) false maxima and maximum displacement (worst case)

This bound tells us that we can keep our positional error δp down if we can decrease the maximum error δv in the transformed image Tf caused by the noise. It is tempting to try to do this by increasing the smoothing σ, but equation (9) warns us that unless δv decreases with σ faster than σ⁻² this may well not work. To get a feel for how δv might decrease with large σ we can model the noise as a stationary random field. In the case of uncorrelated white noise the mean square value of the filtered noise would be proportional to σ⁻¹, that is

r.m.s. noise ∝ σ^(−1/2)    (10)

We would not expect to achieve a better reduction in δv than that for the r.m.s. noise given by (10), which suggests that large values of sigma will not reduce δp. That is, once σ is large enough to avoid the risk of false maxima due to noise or clutter, any further increase is detrimental to the accuracy. Turning then to the question of the critical value of σ where the risk of false maxima becomes serious, this obviously depends on how far we are from the edge we are trying to detect. If our initial estimate x₀ for the edge position is out on the tail of the Gaussian (8), then even a small amount of noise might produce a false peak there. On the steep sides and in the central hump the noise will fail to cause a distinct peak, so only its distorting effect at the true edge position is relevant. It follows that once σ is large enough to place x₀ here, we will lose accuracy (by (9)) if we increase it further. This is a slightly different situation to the usual trade-off between accuracy and robustness in Canny⁶-type schemes, since (at least for sharp edges) it is the error in x₀ which limits the sharpness of filter we can use. For edges that are not sharp (e.g. a ramp extending over several pixels) σ must be large enough for the filter to cover the thickness of the edge. In two dimensions there is the width of the projector function σ_p = 1/√(2β) to consider as well. The noise error δv will be reduced if a larger value of σ_p is used, since this increases the filtering action of the operator. The sampling error will not vary greatly with σ_p for long operators. Thus σ_p should be set to be as large as possible in order to reduce δv, the upper limit being dictated only by the length of the boundary to be detected and the distance to the next (unwanted) feature. For sharply curved boundaries a correction will be needed because the output point will lie off the true edge on the concave side, as with all detectors of this type.

*K is normalized so that a positive going step gives a maximum rather than a minimum, and the maximum response is unity.
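Bound (9) can be exercised numerically. The sketch below perturbs an ideal Gaussian edge response (8) by a bounded disturbance (a smooth sinusoid standing in for noise, an invented choice) and checks that the displaced peak respects δp ≤ 2σ√(δv); being a worst-case bound, it is not expected to be tight for any particular disturbance.

```python
import math

sigma, a = 2.0, 0.0               # peak width and true edge position
dv = 0.01                         # assumed bound on |Tf' - Tf|

def Tf(x):                        # ideal response (8)
    return math.exp(-(x - a) ** 2 / (2 * sigma ** 2))

def Tfp(x):                       # perturbed response, |Tfp - Tf| <= dv
    return Tf(x) + dv * math.sin(3.0 * x)

# Locate the perturbed peak by a fine scan near the true edge position.
xs = [-1 + 0.0005 * k for k in range(4001)]
peak = max(xs, key=Tfp)
dp = abs(peak - a)                # observed peak displacement
bound = 2 * sigma * math.sqrt(dv) # bound (9)
```

With these values the observed displacement is roughly a quarter of the bound; repeating the experiment with larger σ illustrates the warning above, since the bound grows linearly with σ while δv for filtered white noise falls only as σ^(−1/2).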

Interpretation of IBL result

Termination of the search algorithm can be assured by setting a limit on the maximum number of function evaluations or derivative evaluations. When the algorithm stops because this limit is reached, then low confidence should be placed in the output from the IBL. Even when normal termination is achieved, care should be taken in interpreting the result. This is because the effect of large amounts of noise is to add extra local minima and maxima to the transformed image. Sampling error, which is associated with very small operator sizes, will also produce false minima and maxima. Confidence in the IBL result may be increased by examination of the second order partial derivatives at the final point in conjunction with the size of the peak found. The first order derivatives will always be close to zero at this point, but some of the second order derivatives will be large if a sharp edge has been found. A co-ordinate transformation can be applied to the Hessian matrix (the second order partial derivatives) to find the derivatives ∂²Tf/∂z² and ∂²Tf/∂w², in the operator direction z and in the direction w perpendicular to z. The former is a measure of edge sharpness and the latter gives an inverse measure of the length of the boundary.
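The co-ordinate transformation of the Hessian can be sketched as follows (illustrative values throughout; `dzz` and `dww` stand for ∂²Tf/∂z² and ∂²Tf/∂w²). The check builds a Hessian with known principal curvatures in a rotated frame and recovers them.

```python
import math

def rotated_second_derivs(Hxx, Hxy, Hyy, theta):
    """Second derivatives along the operator direction z = (cos t, sin t)
    and the perpendicular direction w = (-sin t, cos t), from the Hessian."""
    c, s = math.cos(theta), math.sin(theta)
    dzz = c * c * Hxx + 2 * c * s * Hxy + s * s * Hyy
    dww = s * s * Hxx - 2 * c * s * Hxy + c * c * Hyy
    return dzz, dww

# Example: a surface with curvature -8 along z and -0.5 along w, where z
# is rotated 30 degrees from the x axis; H = R diag(-8, -0.5) R^T.
theta = math.radians(30.0)
c, s = math.cos(theta), math.sin(theta)
Hxx = -8 * c * c - 0.5 * s * s
Hyy = -8 * s * s - 0.5 * c * c
Hxy = (-8 + 0.5) * c * s
dzz, dww = rotated_second_derivs(Hxx, Hxy, Hyy, theta)
```

A strongly negative `dzz` signals a sharp peak across the edge (high confidence in position), while a `dww` near zero signals a long, straight boundary.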

PROTOTYPE IBL

General

An experimental version of the IBL was implemented in Microsoft Pascal on an IBM PC-AT compatible microcomputer which was fitted with a mathematical coprocessor. An Imaging Technology FG100 video frame grabber was used to acquire and display camera images. The cameras which have been tried are a Bosch TYK 91D tube camera and a Pulnix TM760 CCD camera. For the numerical optimization the authors first used routine VA21AD from the UK Atomic Energy Authority (UKAEA) Harwell subroutine library, which is freely available from UKAEA for research purposes. Another choice is routine E04LBF from the NAG library, which gives similar results. The optimization code was compiled using the Microsoft FORTRAN compiler and linked together with Pascal code for the rest of the algorithm. This proved to be a highly effective way of utilizing the experience of the numerical analysis community without resorting to writing image processing programs in FORTRAN. At each point in the iteration six local operators are evaluated using the reconstruction sum (2) in order to find the value and partial derivatives of the transformed image. It is necessary to store or compute six reconstruction filters for this purpose. These reconstruction filters are essentially continuous functions which are centred on the current point, and whose values must be found at pixel positions around this point. In the prototype IBL it was found convenient to compute these functions as required, rather than attempting to store them. The alternative would have been to sample the reconstruction filters on a lattice much finer than the original pixel lattice, and to store large tables of these values. The compute-on-demand strategy leads to a program which is relatively simple but which runs slowly because of the need to perform large amounts of floating point calculation at each iteration.
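The compute-on-demand evaluation of the six filters can be sketched as below. This is not the Appendix B pseudo-code: the closed forms are derived directly by differentiating kernel (1), with the function name and test values invented, and they are checked against finite differences.

```python
import math

def filters(x, y, theta, alpha, beta):
    """Value of K (equation (1)) and its five partial derivatives at (x, y),
    each of the polynomial-times-Gaussian form (3)."""
    ct, st = math.cos(theta), math.sin(theta)
    a = -alpha * ct * ct - beta * st * st
    b = -alpha * st * st - beta * ct * ct
    c = 2 * (beta - alpha) * ct * st
    u = x * ct + y * st                       # detector coordinate
    E = math.exp(a * x * x + b * y * y + c * x * y)
    Qx, Qy = 2 * a * x + c * y, 2 * b * y + c * x   # dQ/dx, dQ/dy
    K   = u * E
    Kx  = (ct + u * Qx) * E
    Ky  = (st + u * Qy) * E
    Kxx = (2 * ct * Qx + u * (Qx * Qx + 2 * a)) * E
    Kxy = (ct * Qy + st * Qx + u * (Qx * Qy + c)) * E
    Kyy = (2 * st * Qy + u * (Qy * Qy + 2 * b)) * E
    return K, Kx, Ky, Kxx, Kxy, Kyy

# Spot check of the closed forms against a central finite difference.
x0, y0, th, al, be = 0.7, -1.3, 0.4, 0.2, 0.05
h = 1e-5
K0, Kx, Ky, Kxx, Kxy, Kyy = filters(x0, y0, th, al, be)
num_Kx = (filters(x0 + h, y0, th, al, be)[0]
          - filters(x0 - h, y0, th, al, be)[0]) / (2 * h)
```

Since all six filters share the common exponential factor, it is computed once per pixel and reused, which is the essence of the factorization strategy described above.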
A typical experiment with this implementation of the IBL requires ten iterations, each iteration taking a few seconds of processing time. The IBL has also been implemented on a Sun Sparcstation with a Datacube framestore, giving a 100-fold speed improvement.
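The evaluation step described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it approximates the reconstruction sum by sampling an analytic x-derivative-of-Gaussian kernel directly at integer pixel offsets from the current point, and returns only the value and the two first partial derivatives (the full method also evaluates the three second partials). The names `evaluate_image`, `f` and `sigma` are illustrative:

```python
import numpy as np

def evaluate_image(f, px, py, sigma=2.0):
    """Value and gradient of an x-derivative-of-Gaussian response at the
    (generally non-integer) point (px, py) of the image f.

    Sketch only: the continuous filters are sampled at integer pixel
    positions around (px, py) instead of using optimal reconstruction."""
    s2 = sigma * sigma
    r = int(np.ceil(4.0 * sigma))       # window large enough to cover the kernel
    F, G = 0.0, np.zeros(2)
    for i in range(max(int(px) - r, 0), min(int(px) + r + 1, f.shape[0])):
        for j in range(max(int(py) - r, 0), min(int(py) + r + 1, f.shape[1])):
            u, v = i - px, j - py       # offset of this pixel from the point
            e = np.exp(-(u * u + v * v) / (2.0 * s2))
            F += f[i, j] * (-(u / s2) * e)             # kernel K = dG/du
            # d/dpx of K(i - px, j - py) is -dK/du, and similarly for py
            G[0] -= f[i, j] * (u * u / (s2 * s2) - 1.0 / s2) * e
            G[1] -= f[i, j] * (u * v / (s2 * s2)) * e
    return F, G
```

Because the kernels are analytic, the returned gradient is the exact derivative of the returned value; this consistency is what makes a Newton search on the reconstructed surface reliable.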

Computing the reconstruction filters

Two methods were used to compute the six reconstruction filters. The first method is based on factorizations of the polynomial parts of the filters. These factorizations are somewhat ad hoc, but have been found to be reasonably efficient and simple to implement. An algorithm in pseudo-code calculating the values at an arbitrary point x is given in Appendix B, and this code is called by the optimization program when it requires information about the image at x. This type of algorithm has two main disadvantages. Firstly, it is hard to prove the correctness of the code.

maxima of the Laplacian image occur near (just inside) object corners and boundary discontinuities. A slight boundary discontinuity of a few degrees can be amplified by using a suitable non-radial Gaussian filter to give a sharp local maximum of the Laplacian. In general, the methodology presented here allows the use of much more complicated operators than standard convolution techniques. The addition of one or more degrees of freedom does not have a disastrous effect on the search time, provided a reasonable initial estimate is available.

The method has been shown to be capable of finding boundary positions with an accuracy better than one tenth of a pixel. Better results might be achieved if the use of trial and error in setting the parameters α and β could be avoided, using characterizations of the noise and clutter in the image. The effect of noise on the IBL is not yet well understood, and further work is required in this area.

The computational cost of this edge finder increases with the area of the Gaussian filter (but only linearly with the filter length). This increase could be avoided by using a multi-resolution pyramid representation for the image, and by using that level of the pyramid which is most appropriate for the required operator parameters.

A standard type of interpolation algorithm, such as bilinear or bicubic interpolation, could be used rather than the algorithm presented here. In this case, the Newton search algorithm could be expected to perform less well (or even to fail altogether) when the search converges to the pixel scale. This is because the partial derivatives computed by interpolating the various Gaussian operators will not correspond to properties of the actual function surface. It may be possible to use the Newton algorithm for the global part of the search and to switch to a simpler method when the algorithm has converged to the pixel scale.
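The Newton search itself is easy to demonstrate in one dimension. The sketch below (illustrative names, not the authors' implementation) area-samples a step edge at a sub-pixel position, then iterates Newton steps on a derivative-of-Gaussian response T(p), using the analytic first and second derivatives of T with respect to the evaluation position p:

```python
import numpy as np

def dog_response(f, p, sigma):
    """Value and first two derivatives (w.r.t. p) of the 1-D
    derivative-of-Gaussian response T(p) = sum_i f[i] K(i - p)."""
    u = np.arange(len(f)) - p
    s2 = sigma * sigma
    e = np.exp(-u * u / (2.0 * s2))
    k = -(u / s2) * e                                    # K(u)
    k1 = (u * u / (s2 * s2) - 1.0 / s2) * e              # dK/du
    k2 = (3.0 * u / (s2 * s2) - u ** 3 / s2 ** 3) * e    # d2K/du2
    # each d/dp brings a factor of -1 inside K(i - p)
    return np.dot(f, k), -np.dot(f, k1), np.dot(f, k2)

# synthetic step edge at a sub-pixel position, area-sampled onto pixels
true_edge = 20.37
x = np.arange(64)
f = np.clip(x + 0.5 - true_edge, 0.0, 1.0)

p, sigma = 19.0, 3.0            # whole-pixel initial estimate
for _ in range(10):
    _, dT, d2T = dog_response(f, p, sigma)
    p -= dT / d2T               # Newton step towards the stationary point
```

Starting from a whole-pixel estimate, the iteration settles on the stationary point of the response, within a few hundredths of a pixel of the true edge position.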
Finally, although there are good arguments [4] for using a Gaussian filter, this iterative search method could be used with any other type of filter which possesses the required partial derivatives. High order (greater than 4) B-spline filters are an attractive alternative.

ACKNOWLEDGEMENTS

Support for the work came from the UK Science and Engineering Research Council (J.P.O.), the Manchester Central District research grants committee (R.T.S.), and from facilities provided by the University of Manchester. The authors are grateful to UKAEA Harwell and NAG Ltd for making their subroutine libraries available. The authors wish to thank Peter Woods of the Wolfson Image Analysis Unit at Manchester University for his constructive comments on the draft of this paper, and A. Fryett of the Department of Electrical Engineering for taking the measurements reproduced in the tables.

REFERENCES

1 Ballard, D H and Brown, C M Computer Vision, Prentice-Hall, New Jersey (1982)
2 Huertas, A and Medioni, G 'Detection of intensity changes with sub-pixel accuracy using Laplacian-Gaussian masks', IEEE Trans. PAMI Vol 8 No 5 (1986) pp 651-665
3 Nalwa, V S 'Edge detector resolution improvement by image interpolation', IEEE Trans. PAMI Vol 9 No 3 (May 1987) pp 446-451
4 Marr, D and Hildreth, E C 'Theory of edge detection', Proc. Royal Society London Series B Vol 207 (1980) pp 187-217
5 Haralick, R M 'Digital step edges from zero crossings of the second directional derivative', IEEE Trans. PAMI Vol 6 (1984) pp 58-68
6 Canny, J F 'A computational approach to edge detection', IEEE Trans. PAMI Vol 8 No 6 (1986) pp 679-698
7 Oakley, J P 'A new method for image interpolation and its application in image reconstruction', 3rd Int. Conf. Image Processing & its Applications, University of Warwick, UK (18-20 July 1989)
8 Oakley, J P and Cunningham, M J 'A function space model for digital image sampling and its application in image reconstruction', Comput. Vision, Graph. & Image Process. Vol 49 (1990) pp 171-197
9 United Kingdom Atomic Energy Authority (UKAEA), Harwell Subroutine Library Catalog, AERE Harwell, Didcot, UK (1988)
10 Numerical Algorithms Group, NAG Introductory Guide, NAG Central Office, Oxford, UK (1983)
11 Rayna, G Reduce: Software for Algebraic Computation, Springer-Verlag, New York (1987)
12 Peterson, D P and Middleton, D 'Sampling and reconstruction of wave-number-limited functions in n-dimensional Euclidean spaces', Information & Control Vol 5 (1962) pp 279-323
13 Rosenfeld, A and Kak, A Digital Picture Processing: Vol 1, Academic Press, New York (1987)
14 Askari, M, Cameron, A M and Oakley, J P High Temperature Technology Vol 8 No 3 (1990)

APPENDIX A

The optimal reconstruction filter will be defined as that function which, for every x and y, minimizes the expression

$$\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \left| e(x', y', x, y) \right|^{2} \, dx' \, dy'$$

where the function e is defined in (7). For simplicity, the one-dimensional case will be discussed. Oakley [7] shows that the Fourier transform $\hat r_0(s)$ of the optimal reconstruction filter $r_0(x)$ is given by

$$\hat r_0(s) = \frac{\hat K(s)\,\overline{\hat m(s)}}{\sum_{n=-\infty}^{\infty} \left| \hat m(s + 2n\pi) \right|^{2}}$$

where $\hat K(s)$ and $\hat m(s)$ are the Fourier transforms of the kernel function K(x) and the sensor function m(x), respectively, and the bar represents the complex conjugate. If m(x) is the rectangular response function defined by

$$m(x) = \begin{cases} 1, & |x| < 0.5 \\ 0, & \text{otherwise} \end{cases}$$

(which might be appropriate, for example, if a CCD camera were used), then the denominator is constant with respect to s and $\hat r_0(s)$ has the form $\hat K(s)\hat m(s)$. A similar result holds in the two-dimensional case (in fact, the result in Oakley [7] is for N dimensions). It can be shown that this reconstruction filter also gives the lowest expectation of error when the input is an uncorrelated random field. The proof involves the calculus of variations and follows the same lines as that given by Peterson and Middleton [12] for statistically optimum image reconstruction filters. A version of this derivation with more recent notation is given elsewhere [7, 8]. In the present context, the proof must be adapted for reconstruction of a transformed image rather than the original image.
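The claim that the denominator is constant for the rectangular sensor response can be checked numerically. The sketch below truncates the infinite sum; `m_hat` and `denominator` are illustrative names, and for the unit-width rectangle the transform is $\hat m(s) = \sin(s/2)/(s/2)$:

```python
import numpy as np

def m_hat(s):
    """Fourier transform of the unit-width rectangular sensor response."""
    half = np.asarray(s, dtype=float) / 2.0
    return np.where(np.abs(half) < 1e-9, 1.0, np.sin(half) / half)

def denominator(s, nmax=2000):
    """Truncated sum over n of |m_hat(s + 2*n*pi)|^2."""
    n = np.arange(-nmax, nmax + 1)
    return float(np.sum(m_hat(s + 2.0 * np.pi * n) ** 2))

for s in (0.3, 1.7, 3.1, 5.0):
    assert abs(denominator(s) - 1.0) < 1e-3   # constant (equal to 1) in s
```

Analytically the sum equals $\sin^2(s/2)\sum_n (s/2 + n\pi)^{-2} = 1$ by a classical identity, which is why the denominator drops out for this sensor model.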

APPENDIX B

A procedure to evaluate the transformed image and its partial derivatives is given here to assist readers who wish to experiment for themselves. This procedure is written in pseudo-code, and would be called from the Harwell or NAG optimization routines when they require the transformed image or its derivatives at a point X. The predefined constants theta (= θ), a, b and c are defined in equation (1) of the main text:

procedure evaluate_image ( X : VECTOR_TYPE;
      {X[1], X[2] are the floating point coordinates passed to this procedure}
    var F : float;
      {will hold the result T(X[]) to be returned to the calling routine}
    var G : VECTOR_TYPE;
      {will hold ∂/∂x, ∂/∂y of T(X[])}
    var GG : HESSIAN_TYPE );
      {∂²/∂x², etc.}
  var val, p, v1, v2, v3 : float;   {intermediate values}
  begin
    {initialize result stores}
    F := 0.0;
    for i := 1 to 2 do begin
      G[i] := 0.0;
      for j := 1 to 3 do GG[i, j] := 0.0
    end;
    {calculate an ellipse centred on X[] of orientation theta whose size
     (depending on alpha and beta) is large enough to cover the filters}
    {for each pixel in the ellipse}
    begin
      {let dx, dy be the current floating point offsets of the point in the
       ellipse from X[]}
      {let val be the image value at the current pixel in the ellipse}
      p := dx*cost + dy*sint;
        {where sint = sin(theta) and cost = cos(theta)}
      ...
    end
  end;

APPENDIX C

Tables C.1 and C.2 give the output values of the fixed-θ IBL for values of θ corresponding to the left- and right-hand sides and the apex of the steel ball. A typical image is shown in Figure 5. Measurements were taken with α = 0.2 and β = 0.002. Positions 1 to 6 are all near the centre of the screen, while the others test the edges of the imaged area. The movements were achieved by moving the camera. The second set of readings was made with the furnace window between the steel ball and the camera removed; the positions are only nominally the same as in Table C.1. Table C.2 has one rogue reading, enclosed in brackets, which was presumably a clerical error. Tables C.3 and C.4 give the diameters deduced in pixels, as well as the height of the apex above the computed diameter (the latter permits the computation of the aspect ratio of the acquisition system). Tables C.5 and C.6 give the means and standard deviations about the mean, which give an indication of the accuracy of the IBL.

Table C.1. IBL output for the backlit image of a steel ball for θ = 0, π/2, π, in pixels. X is measured across the screen, Z down the screen. [The table lists left-hand side (X, Z), right-hand side (X, Z) and apex (X, Z) coordinates for positions 1 to 6 and for the nine positions Top left to Bottom right; the individual numeric entries are not legible in this copy.]

Table C.2. As Table C.1 but with the furnace window between the steel ball and the camera removed. The positions are only nominally the same as in C.1. The brackets enclose a rogue value, probably a clerical error. [The individual numeric entries are not legible in this copy.]

Table C.3. Diameters and heights computed from Table C.1

Position        Diameter    Height
1               373.992     259.748
2               374.066     260.130
3               374.059     259.847
4               374.025     260.012
5               374.051     259.910
6               374.013     260.327
Top left        374.136     259.992
Top middle      374.043     260.209
Top right       374.116     260.180
Middle left     374.103     260.232
Middle middle   374.075     260.228
Middle right    374.055     260.114
Bottom left     374.027     260.848
Bottom middle   374.014     261.137
Bottom right    374.042     260.216

Table C.4. Diameters and heights computed from Table C.2

Position        Diameter    Height
1               374.314     (262.390)
2               374.347     260.050
3               374.355     260.079
4               374.347     260.087
5               374.343     260.268
6               374.369     259.955
Top left        374.433     259.904
Top middle      374.396     260.325
Top right       374.411     260.315
Middle left     374.400     260.166
Middle middle   374.444     260.198
Middle right    374.388     260.297
Bottom left     374.401     260.209
Bottom middle   374.403     260.569
Bottom right    374.376     260.163

Table C.5. Mean values and standard deviations from the mean of the diameter and height from Table C.3

                      Mean      Standard deviation
Positions 1 to 6
  Diameter            374.03    0.03
  Height              260.00    0.21
Remaining positions
  Diameter            374.07    0.04
  Height              260.35    0.38
Overall
  Diameter            374.05    0.04
  Height              260.21    0.36

Table C.6. Mean values and standard deviations from the mean of the diameter and height from Table C.4, ignoring the rogue point

                      Mean      Standard deviation
Positions 1 to 6
  Diameter            374.35    0.02
  Height              260.10    0.11
Remaining positions
  Diameter            374.41    0.02
  Height              260.24    0.17
Overall
  Diameter            374.38    0.04
  Height              260.18    0.17
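The summary statistics can be reproduced directly from the table values. A quick check for the positions 1 to 6 rows of Table C.3, assuming the tabulated figures are sample (n − 1) standard deviations:

```python
import numpy as np

# diameters and heights for positions 1 to 6, transcribed from Table C.3
diameters = np.array([373.992, 374.066, 374.059, 374.025, 374.051, 374.013])
heights = np.array([259.748, 260.130, 259.847, 260.012, 259.910, 260.327])

# means and sample standard deviations, as reported in Table C.5
print(diameters.mean(), diameters.std(ddof=1))
print(heights.mean(), heights.std(ddof=1))
```

Rounded to two decimal places these agree with the corresponding Table C.5 entries (374.03, 0.03 and 260.00, 0.21).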
