Fractional-power synthetic discriminant functions

Pergamon

Pattern Recognition, Vol. 27, No. 4, pp. 577-585, 1994. Elsevier Science Ltd. Copyright © 1994 Pattern Recognition Society. Printed in Great Britain. All rights reserved. 0031-3203/94 $7.00+.00

0031-3203(93)E0015-Y

FRACTIONAL-POWER SYNTHETIC DISCRIMINANT FUNCTIONS

J. D. BRASHER and J. M. KINSER
Teledyne Brown Engineering, Mail Stop 60, Cummings Research Park, 300 Sparkman Dr. NW, Huntsville, AL 35807-7007, U.S.A.

(Received for publication 2 November 1993)

Abstract. The standard synthetic discriminant function (SDF) generalizes well. That is, it recognizes objects represented by, but not included in, the training set from which it is synthesized. However, it also correlates with objects not represented by the training set. That is, it does not discriminate well. Conversely, the minimum average correlation energy (MACE) SDF discriminates, but does not generalize, well. By using a power spectrum normalization procedure, a parametric SDF which generalizes better than the MACE SDF and discriminates better than the standard SDF is obtained.

Synthetic discriminant function   Fractional power   Generalization   Discrimination   Coefficient rooting   Power spectral density

1. INTRODUCTION

The synthetic discriminant function (SDF) correlation filter and its variants (1) are synthesized from a set of training images representing the class of objects to be recognized or discriminated. A key feature of the SDF filter is the value produced at the correlation-plane origin when cross-correlated with an image. It is given simply by the inner product of the SDF and the given image and is usually constrained to a constant (e.g. unity) for each training image used in the synthesis process. An object is recognized by the filter if its image produces the prescribed correlation value, or close to it, when cross-correlated with the SDF. By construction, the SDF recognizes the training images.

The filter design objective is twofold. On the one hand, the filter should recognize all objects belonging to the class represented by (but not necessarily included in) the training set. This is generalization. On the other hand, the filter should not recognize objects that do not belong to the designated training-set class. This is discrimination.

The standard SDF, (2) henceforth referred to simply as SDF, is known to generalize well but not to discriminate. It is relatively insensitive to deviations from, or distortions in, the training images. Concomitantly, it generally produces broad correlation peaks with prominent sidelobes, making target detection and location more difficult. Alternatively, the minimum average correlation energy (MACE) SDF, (3) which minimizes sidelobe structure in the correlation plane, generally produces sharp, well-defined correlation peaks, but is sensitive to distortions of the training images. Consequently, it discriminates, but does not generalize, well.

By virtue of the mathematical structure of these SDFs, a power spectrum normalization technique taken from signal and image processing can be employed as a means of incorporating the desirable attributes of both SDFs in a single filter. The result is a parametric SDF involving a control variable which can be adjusted to balance the degree of generalization and discrimination, as well as to control correlation peak sharpness and sidelobe structure.

2. SDF FORMULATION

Let the class of objects to be recognized be represented by a set of N training images. The SDF can be synthesized directly from the training set in the image domain, or in the Fourier domain using the Fourier transforms of the training images. The latter formulation is chosen since the MACE SDF is usually synthesized in the Fourier domain. Suppose that the training images each consist of d pixels. It is mathematically convenient to order them lexicographically as (complex) d-dimensional vectors, $\{x_1, x_2, \ldots, x_N\}$. If the filter is represented by the complex d-dimensional vector h, then the usual (Fourier-domain) SDF constraints are expressed as the inner products

$$x_n^{+} h = d \quad \forall n \in \{1, 2, \ldots, N\} \qquad (1)$$

in which the superscript "+" signifies the Hermitian adjoint. Setting the constraint value to d in the Fourier domain corresponds to a value of unity in the image domain. Let X be the d × N complex data matrix whose nth column is the vector $x_n$ and let u be the N-dimensional vector whose components are all d; that is, $u_n = d \;\; \forall n \in \{1, 2, \ldots, N\}$. Then the SDF constraint equation takes the form

$$X^{+} h = u. \qquad (2)$$

The SDF solution (1) of equation (2) is

$$h_{\mathrm{SDF}} = X (X^{+} X)^{-1} u \qquad (3)$$

and the MACE SDF solution (3) is

$$h_{\mathrm{MACE}} = D^{-1} X (X^{+} D^{-1} X)^{-1} u \qquad (4)$$

in which D is a d × d real, diagonal matrix whose non-zero elements are

$$D_{kk} = \frac{1}{N} \sum_{n=1}^{N} |X_{kn}|^{2} \quad \forall k \in \{1, 2, \ldots, d\}. \qquad (5)$$

In equation (5), $X_{kn} = x_n(k)$ is the kth (Fourier) component of the nth training vector.
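Equations (3)-(5) translate almost directly into array code. The following is a minimal sketch only, assuming NumPy, an unnormalized 2-D FFT as the Fourier transform, and random arrays as stand-ins for training images; the function name `sdf_filters` is hypothetical and not taken from the paper.

```python
import numpy as np

def sdf_filters(images):
    """Return (h_sdf, h_mace, X, u) for a stack of equally sized real images."""
    N = len(images)
    # Lexicographically ordered Fourier transforms form the columns of X (d x N).
    X = np.stack([np.fft.fft2(img).ravel() for img in images], axis=1)
    d = X.shape[0]
    u = np.full(N, d, dtype=complex)          # constraint vector, all components d
    # Equation (3): h_SDF = X (X^+ X)^{-1} u
    h_sdf = X @ np.linalg.solve(X.conj().T @ X, u)
    # Equation (5): D_kk is the training-set average of |X_kn|^2
    D = np.mean(np.abs(X) ** 2, axis=1)
    # Equation (4): h_MACE = D^{-1} X (X^+ D^{-1} X)^{-1} u
    DinvX = X / D[:, None]
    h_mace = DinvX @ np.linalg.solve(X.conj().T @ DinvX, u)
    return h_sdf, h_mace, X, u

# Illustrative data only: random arrays, not the images used in the paper.
rng = np.random.default_rng(0)
images = rng.random((3, 8, 8))
h_sdf, h_mace, X, u = sdf_filters(images)
# Both filters should satisfy the constraint X^+ h = u of equation (2).
print(np.allclose(X.conj().T @ h_sdf, u), np.allclose(X.conj().T @ h_mace, u))
```

Solving the small N × N system with `np.linalg.solve` avoids forming an explicit inverse, and the diagonal matrix D is stored as a length-d vector rather than a d × d array.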


The same training set was used to synthesize all the SDF filters discussed in this paper. It was generated from the 128 × 128 pixel image exhibited in Fig. 1, which is a model replica of the Millennium Falcon from the movie Star Wars. The training set consisted of ten such images rotated in the plane at 20° increments from 0° to 180°, inclusive. Intermediate orientations of this image were used to test generalization of the filters to in-class objects not in the training set. Discrimination against objects not in the training-set class was tested with the 128 × 128 pixel image of a tank shown in Fig. 2.

Fig. 1. Training image (0°) of 128 × 128 pixels used to generate the training set which consisted of ten such images rotated in the plane at 20° increments. Intermediate orientations (target-class images) were used to test generalization.

Fig. 2. Non-target-class image of 128 × 128 pixels used to test filter discrimination.

For the subsequent discussion, the following terminology shall be adopted. Images belonging to the training set shall be called training images. Images belonging to the training-set class but not included in the training set shall be designated as target-class images. Finally, images not belonging to the target class represented by the training set shall be referred to as non-target-class images.

As illustrated in Fig. 3, the SDF recognizes the non-training, target-class image as well as it does the training image. However, it also recognizes the non-target-class object (i.e. the tank). The SDF generalizes well but does not discriminate. Figure 4 demonstrates that the MACE SDF does not recognize the non-target-class object, but neither does it recognize the non-training, target-class object. The MACE SDF discriminates well but does not generalize. Moreover, the characteristically broad correlation peaks and sidelobes typically produced by the SDF and the characteristically sharp correlation peaks and suppressed sidelobes associated with the MACE SDF are clearly evident in these figures.

The filter (4) described below balances the discrimination of the MACE SDF with the generalization of the SDF and retains control over correlation-plane sidelobes as well.
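Correlation surfaces such as those of Figs 3 and 4 can be computed with the correlation theorem. The sketch below is an illustration under stated assumptions, not the authors' code: it assumes NumPy, circular (FFT-based) correlation, and a Fourier-domain filter stored as a flattened vector; `correlation_surface` is a hypothetical helper name, and the single-image filter is used only to keep the example short.

```python
import numpy as np

def correlation_surface(image, h):
    """Circularly cross-correlate a real image with a Fourier-domain filter h (length d)."""
    F = np.fft.fft2(image)
    H = h.reshape(image.shape)
    # Correlation theorem: the surface is the inverse FFT of conj(F) * H,
    # so its value at the origin equals x^+ h / d.
    return np.fft.ifft2(np.conj(F) * H)

# Illustrative data only: one random image and the SDF built from it alone,
# i.e. equation (3) with N = 1, which reduces to h = x d / (x^+ x).
rng = np.random.default_rng(1)
image = rng.random((16, 16))
x = np.fft.fft2(image).ravel()
h = x * x.size / np.vdot(x, x)
surface = correlation_surface(image, h)
print(abs(surface[0, 0]))   # peak magnitude at the correlation-plane origin
```

With the constraint value set to d in the Fourier domain, the origin value of the surface for a training image is unity, in line with the peak magnitudes quoted for training images in the figure captions.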

Fig. 3. Correlation surfaces for SDF cross-correlated with: (a) a training image (80°), giving peak magnitude 1.000; (b) a target-class image (90°), giving peak magnitude 0.992; (c) the non-target-class image of Fig. 2, giving peak magnitude 0.639.

Fig. 4. Correlation surfaces for MACE SDF cross-correlated with: (a) a training image (80°), giving peak magnitude 1.000; (b) a target-class image (90°), giving peak magnitude 0.194; (c) the non-target-class image of Fig. 2, giving peak magnitude 0.183.

Fig. 5. Correlation surfaces for FP SDF cross-correlated with a training image (80°) for: (a) p = 0.0, the SDF; (b) p = 0.4; (c) p = 0.8; (d) p = 1.2; (e) p = 1.6; and (f) p = 2.0, the MACE SDF.

3. FRACTIONAL-POWER SDF

We begin by expressing the SDF of equation (3) in the form

$$h_{\mathrm{SDF}} = I^{-1} X (X^{+} I^{-1} X)^{-1} u \qquad (6)$$

where $I^{-1} = I$ is the d × d identity matrix whose non-zero elements (all unity) can be written as

$$I_{kk} = \frac{1}{N} \sum_{n=1}^{N} |X_{kn}|^{0} \quad \forall k \in \{1, 2, \ldots, d\}. \qquad (7)$$

Equation (7) should be compared to equation (5). Now let A be the d × d real, diagonal matrix whose non-zero elements are the functions

$$A_{kk}(p) = \frac{1}{N} \sum_{n=1}^{N} |X_{kn}|^{p} \quad \forall k \in \{1, 2, \ldots, d\} \wedge p \in [0, 2]. \qquad (8)$$

Note that p = 0 gives A = I as in equation (7), while p = 2 gives A = D as in equation (5). As the elements of D are the components of the power spectral densities of the training images, averaged over the training set, the exponent $p \in (0, 2)$ in the elements of A denotes a "fractional power". In signal and image processing, this procedure is known as coefficient rooting. (5) Hence, the fractional-power SDF is defined by

$$h(p) = A^{-1} X (X^{+} A^{-1} X)^{-1} u \qquad (9)$$

as a function of the parameter p, whose value in [0, 2] determines the degree of MACE-like or SDF-like character of the filter and controls the balance between generalization and discrimination. For example, for p = 0, the fractional-power (FP) SDF reduces to the SDF, $h(0) = h_{\mathrm{SDF}}$, while for p = 2, it reduces to the MACE SDF, $h(2) = h_{\mathrm{MACE}}$. The character of the filter varies between the two extremes as p ranges over [0, 2]. In this way, a balance can be attained between the generalization of the SDF and the discrimination of the MACE SDF without sacrificing one for the other. Note that the SDF constraints in equation (2) are still satisfied by the SDF defined in equation (9) for any value of p.

An FP SDF was synthesized from the same training set described in Section 2. Figure 5 displays the progression of correlation surfaces for a training image cross-correlated with the FP SDF for selected values of the exponent p. It clearly demonstrates the improvement in peak definition as p increases from 0 to 2. Along with this improvement, however, comes a concomitant sensitivity to deviations from the training set, but this usually translates into improved discrimination. Generally, though, the price paid for this increase in discrimination is a decrease in generalization.

Figure 6 shows the correlation peak (inner product) magnitude as a function of orientation of the training image in Fig. 1 and shows how generalization degrades as p increases from 0 to 2. This is signified by the decrease in peak magnitude for image orientations intermediate between the 20° increments of the training images. The larger the value of p, the greater the decrease in peak magnitude for the non-training images. Note that interpolation between training images is generally better than extrapolation beyond the last one (at 180°). However, all the filters exhibit some response to the 200° and 220° orientations, even though these were not in the training set. That is, some memory of the 20° intervals still persists beyond the last training image. By construction, peak magnitudes are unity for all training images for any value of exponent p in the FP SDF.

Figure 7 gives the correlation peak magnitude as a function of exponent p; that is, the magnitude of the inner product of the indicated images with the FP SDF, as p varies over its range. The upper curve (horizontal line) corresponds to a training image. The central curve is for a target-class object and again demonstrates the degradation in generalization as p increases. It shows that better generalization performance is achieved for smaller values of the exponent p, corresponding to a more SDF-like filter. The lower curve, for the non-target-class object shown in Fig. 2, shows that better discrimination performance is obtained with larger values of the exponent p, corresponding to a more MACE-like filter. By optimizing h(p) on (0, 2), a balance can be attained between these conflicting goals in a single filter with the generalization of the SDF and the discrimination of the MACE SDF.
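The fractional-power filter of equations (8) and (9) differs from the SDF and MACE SDF only in the exponent applied to the spectral magnitudes. A minimal sketch follows, assuming NumPy and random arrays in place of the training images; the function name `fp_sdf` is hypothetical.

```python
import numpy as np

def fp_sdf(X, p):
    """Fractional-power SDF h(p) of equation (9); X is the d x N Fourier-domain data matrix."""
    d, N = X.shape
    u = np.full(N, d, dtype=complex)
    # Equation (8): A_kk(p) = (1/N) sum_n |X_kn|^p  (p = 0 gives I, p = 2 gives D)
    A = np.mean(np.abs(X) ** p, axis=1)
    AinvX = X / A[:, None]
    return AinvX @ np.linalg.solve(X.conj().T @ AinvX, u)

# Illustrative data only: random arrays stand in for the training images.
rng = np.random.default_rng(0)
train = rng.random((4, 16, 16))
X = np.stack([np.fft.fft2(img).ravel() for img in train], axis=1)
d = X.shape[0]

for p in (0.0, 0.8, 1.6, 2.0):
    h = fp_sdf(X, p)
    peaks = np.abs(X.conj().T @ h) / d   # image-domain peak magnitude per training image
    print(p, np.round(peaks, 6))
```

The loop is the numerical counterpart of the identity $X^{+} h(p) = (X^{+} A^{-1} X)(X^{+} A^{-1} X)^{-1} u = u$, which is why the training-image peaks do not depend on p. Evaluating the same inner product for non-training images over a grid of p values would give curves of the kind described for Fig. 7.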

Fig. 6. Correlation peak magnitude, for various FP SDFs, as a function of orientation (degrees, 0 to 220) of the input image of Fig. 1. Training images correspond to orientations at 20° intervals between 0° and 180°, inclusive.

Fig. 7. Correlation peak magnitude as a function of exponent p (0 to 2) for a training image (80°), a target-class image (90°) and the non-target-class image of Fig. 2.

Fig. 8. PCE ratio as a function of exponent p (0 to 2) for a training image (80°), a target-class image (90°) and the non-target-class image of Fig. 2.

It is evident from Figs 3-5 that using only the magnitude of the correlation peak as a filter performance metric may be misleading. A more precise measure of correlation peak definition and sharpness is provided by the peak-to-correlation energy (PCE) ratio, (6) defined as the ratio of the square of the correlation peak magnitude to the correlation-plane energy. For a given image x, the PCE ratio produced with the FP SDF, defined in equation (9), is

$$\mathrm{PCE}(p) = \frac{|x^{+} h(p)|^{2}}{h^{+}(p)\, A\, h(p)} \quad \forall p \in [0, 2] \qquad (10)$$

where A is a d × d real, diagonal matrix (determined by the image x, and not to be confused with the matrix A(p) of equation (8)) whose non-zero elements are

$$A_{kk} = |x(k)|^{2} \quad \forall k \in \{1, 2, \ldots, d\} \qquad (11)$$

and x(k) is the kth (Fourier) component of the vector x. By construction, the MACE SDF minimizes correlation-plane energy and should therefore produce the maximum PCE ratio for training images. This is borne out in Fig. 8, which gives the PCE ratio as a function of p for the same training image, target-class image and non-target-class image as for Fig. 7. However, the figure also shows that for the non-training, target-class images, the optimum PCE ratio may be provided not by the MACE SDF (i.e. an FP SDF with p = 2) but by an FP SDF with p < 2. For the training data used in this work, this occurs for a value of p ≈ 1.6.
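Because the matrix in equation (11) is diagonal, the PCE ratio of equation (10) reduces to two sums over Fourier components. The sketch below assumes NumPy and Fourier-domain vectors for both the image and the filter; `pce_ratio` is a hypothetical name, and the all-ones vectors are a toy input chosen only so that the value can be checked by hand.

```python
import numpy as np

def pce_ratio(x, h):
    """PCE ratio of equation (10) for a Fourier-domain image vector x and filter h."""
    peak = np.abs(np.vdot(x, h)) ** 2                  # |x^+ h|^2 (vdot conjugates x)
    energy = np.sum(np.abs(x) ** 2 * np.abs(h) ** 2)   # h^+ A h with A_kk = |x(k)|^2
    return peak / energy

# Toy check: for x = h = (1, 1, 1, 1) the peak term is 16 and the energy is 4.
x = np.ones(4, dtype=complex)
h = np.ones(4, dtype=complex)
print(pce_ratio(x, h))
```

The ratio is unchanged when h is multiplied by any non-zero complex constant, so it measures peak sharpness independently of the overall filter gain.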

4. CONCLUSION

In this paper, the fractional-power SDF has been described. It is a parametric SDF possessing an adjustable parameter allowing the filter designer to control the balance between generalization and discrimination. Thus, the character of the filter can be varied continuously between the two extremes of the standard SDF, with its ability to generalize, and the MACE SDF, with its ability to discriminate. Generally, these are conflicting goals where one is obtained at the expense of the other. Varying the power spectral density exponent permits optimization of filter performance for balanced generalization and discrimination and control of correlation-plane sidelobes and peak sharpness. Although the performance of all SDF filters is training-data dependent and may vary for different training sets, our results nevertheless demonstrate that a single SDF filter with the ability to generalize and discriminate can be synthesized using our procedure.

Acknowledgements. This research was supported by Independent Research and Development funds provided by Teledyne Brown Engineering. The authors express their appreciation to Dr Richard Juday of NASA JSC for supplying the training data used in this work.

REFERENCES

1. B. V. K. Vijaya Kumar, Tutorial survey of composite filter designs for optical correlators, Appl. Optics 31, 4773 (1992).
2. C. F. Hester and D. P. Casasent, Multivariant technique for multiclass pattern recognition, Appl. Optics 19, 1758 (1980).
3. A. Mahalanobis, B. V. K. Vijaya Kumar and D. P. Casasent, Minimum average correlation energy filters, Appl. Optics 26, 3633 (1987).
4. D. J. Sullivan, A. V. Forman, Jr. and A. W. Chang, Real-time, distortion-tolerant composite filters for automatic target identification, Proc. SPIE 1701, 178 (1992); J. M. Kinser and J. D. Brasher, Landscaping the correlation surface, Proc. SPIE 1701, 188 (1992).
5. W. K. Pratt, Digital Image Processing, pp. 326-327. Wiley, New York (1978).
6. B. V. K. Vijaya Kumar and L. Hassebrook, Performance measures for correlation filters, Appl. Optics 29, 2997 (1990).
7. Ph. Refregier, Filter design for optical pattern recognition: multicriteria optimization approach, Opt. Lett. 15, 854 (1990).
8. B. V. K. Vijaya Kumar, Minimum-variance synthetic discriminant functions, J. Opt. Soc. Am. A 3, 1579 (1986).

APPENDIX

An alternative approach to combining the SDF and MACE SDF is to form a simple linear combination of the two as

$$h_A(\alpha) = \alpha\, h_{\mathrm{MACE}} + (1 - \alpha)\, h_{\mathrm{SDF}} \quad \forall \alpha \in [0, 1]. \qquad (A1)$$

As a function of the weight parameter α, $h_A(0) = h_{\mathrm{SDF}}$ is the SDF and $h_A(1) = h_{\mathrm{MACE}}$ is the MACE SDF. The linear interpolation $h_A(\alpha)$ is to be optimized on (0, 1).

Yet another compromise between the SDF and the MACE SDF results from a multicriterion optimization, (1,7) maximizing noise tolerance while minimizing correlation-plane energy. The resulting filter can be expressed as

$$h_B(\beta) = B^{-1} X (X^{+} B^{-1} X)^{-1} u \qquad (A2)$$

in which

$$B(\beta) = \beta D + (1 - \beta)\, \sigma^{2} I \quad \forall \beta \in [0, 1]. \qquad (A3)$$

In equation (A3), D and I are defined as before and $\sigma^{2}$ is the (zero-mean, uncorrelated) noise variance. The filter $h_B(\beta)$ is optimized on (0, 1) to balance the noise tolerance of the SDF with the sidelobe suppression and peak sharpness of the MACE SDF. Here, $h_B(0) = h_{\mathrm{MV}}$, the minimum-variance SDF (8) (which for zero-mean, uncorrelated noise is identical to the SDF), and $h_B(1) = h_{\mathrm{MACE}}$ is the MACE SDF. Again, the SDF constraint, equation (2), is still satisfied by the filters in equations (A1) and (A2). These filters are mentioned here for purposes of information and to suggest other approaches to the goal of our procedure.
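Both appendix filters share the constrained form used throughout the paper, so they can be sketched with one helper. This is an illustration under stated assumptions only: NumPy, random arrays in place of training images, and an arbitrary default noise variance `sigma2=1.0`; the names `constrained_filter`, `linear_blend` and `tradeoff_filter` are hypothetical.

```python
import numpy as np

def constrained_filter(X, w):
    """W^{-1} X (X^+ W^{-1} X)^{-1} u for a positive diagonal weight vector w of length d."""
    d, N = X.shape
    u = np.full(N, d, dtype=complex)
    WinvX = X / w[:, None]
    return WinvX @ np.linalg.solve(X.conj().T @ WinvX, u)

def linear_blend(X, alpha):
    """Equation (A1): alpha * h_MACE + (1 - alpha) * h_SDF."""
    D = np.mean(np.abs(X) ** 2, axis=1)
    h_mace = constrained_filter(X, D)
    h_sdf = constrained_filter(X, np.ones(X.shape[0]))
    return alpha * h_mace + (1 - alpha) * h_sdf

def tradeoff_filter(X, beta, sigma2=1.0):
    """Equations (A2)-(A3) with B = beta * D + (1 - beta) * sigma2 * I (sigma2 is assumed)."""
    D = np.mean(np.abs(X) ** 2, axis=1)
    return constrained_filter(X, beta * D + (1 - beta) * sigma2)

# Illustrative data only: random arrays stand in for the training images.
rng = np.random.default_rng(2)
X = np.stack([np.fft.fft2(img).ravel() for img in rng.random((3, 8, 8))], axis=1)
for h in (linear_blend(X, 0.3), tradeoff_filter(X, 0.7)):
    print(np.allclose(X.conj().T @ h, X.shape[0]))   # constraint X^+ h = u
```

The blend of equation (A1) satisfies the constraint because $X^{+} h_A(\alpha) = \alpha u + (1 - \alpha) u = u$, while the filter of equation (A2) satisfies it by the same cancellation as the other constrained filters.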

About the Author: JAMES D. BRASHER holds the degrees of Ph.D. in condensed matter theory and M.S. in experimental solid state physics, both from the University of North Carolina, the M.A. in mathematics from the University of Alabama in Huntsville and the B.S. in physics and mathematics from the University of Alabama. His main research interests include information theory, complexity theory, pattern recognition and image processing. He is currently employed in the Sensor Systems Department of Teledyne Brown Engineering in Huntsville, Alabama where he is manager of the Applied Mathematics Section and is involved in applications of optical pattern recognition and in modeling and simulating optical correlators. Dr Brasher is a member of Sigma Xi and IEEE.

About the Author: J. M. KINSER has earned a B.A. in physics from William Jewell College, an M.S. in physics from The University of Alabama, Huntsville, and is presently working towards a doctorate. He is currently employed in the Sensor Systems Department at Teledyne Brown Engineering in Huntsville, Alabama. His fields of interest include the adaptation of neural networks to the complexity inherent in training data.