Needles in a haystack: Fast spatial search for targets in similar-looking backgrounds

Available online at www.sciencedirect.com

Journal of the Franklin Institute 349 (2012) 2935–2955 www.elsevier.com/locate/jfranklin

Kaveh Heidary^a,*, H. John Caulfield^b,1

^a Department of Electrical Engineering, Alabama A&M University, PO Box 702, Normal, AL 35762, USA
^b Alabama A&M University Research Institute, PO Box 313, Normal, AL 35762, USA

Received 21 April 2011; received in revised form 22 March 2012; accepted 30 May 2012. Available online 19 August 2012.

Abstract

This paper develops an efficient and robust algorithm that simultaneously detects and locates image anomalies and intrusions. Anomalies refer to image regions that do not belong to expected classes. In situations where most of the image is of one or more types of known background classes while a few isolated regions may belong to unknown classes, the algorithm detects and locates potential intrusions by blanking regions it classifies as members of the known classes. We used a combination of Fourier filtering, a fast linear way to scan the content of the whole scene in parallel, with Margin-Setting, a powerful nonlinear discriminant trained to distinguish members of known classes from everything else. That combination retains the power of Margin-Setting and the simplicity, speed, and locating ability of Fourier filtering. Examples show the ability of this method to remove essentially all background material while leaving the similar-looking intrusions intact. The classifier is trained using a few small square patches extracted from images or image regions representing the background classes of interest. Processed images related to four different problems, as well as cumulative numerical results of many tests performed on one of those problems, are presented. Excellent performance is observed for the examples considered here.

© 2012 The Franklin Institute. Published by Elsevier Ltd. All rights reserved.

1. Introduction

This paper presents a way to largely rid an image of unpromising background as a preliminary step in target detection and classification. Effectively, it uses Fourier methods

* Corresponding author. Tel.: +1 256 372 5587; fax: +1 256 372 5855. E-mail addresses: [email protected] (K. Heidary), [email protected] (H. John Caulfield). 1 Tel.: +1 256 372 5844.
0016-0032/$32.00 © 2012 The Franklin Institute. Published by Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.jfranklin.2012.05.013


to locate and remove items that represent probable background. This can be done by spatial and/or spectral filtering, but we have demonstrated only spatial filtering, as the objective here is background removal and detection of targets in gray-scale digital imagery. The intent of this work is to make fast and robust negative target detection possible. In the context of this paper, "negative target detection" is defined as follows: rather than basing the target detection process on stated target attributes, provide a method of recognizing the dominant background, i.e., the things the target is not. That is the problem attacked here, and its detailed analysis dictates the paper organization outlined below.

First, one begins with what is known to dominate the image: call it the background B(x, y). It is assumed that at least a few samples of B are available for instructing an algorithm to recognize occurrences of B at any location, even allowing for specific B's to differ somewhat from those trained on. This, of course, is the standard statistical pattern recognition problem. To distinguish between B and all other things, called potential intrusions A(x, y), the best that can be done in the absence of any information about A is to assume that A or its Fourier spectrum has equal probability at all points in the image and in its power spectrum. The equivalent action here is to remain silent on the spatial and the spectral content of potential intrusions.

Second, the background removal algorithm must be computationally efficient and amenable to real-time implementation using readily available processors. Methods that require serial examination of countless small regions of each image frame are not likely to be widely applicable. A space-invariant discriminant that can be applied in parallel at all points in the image is required. For well-known reasons, this imposes symmetry conditions that effectively dictate the use of Fourier-plane filtering.

Third, it is vital that the discrimination achieved between A and B be very reliable; in discrimination terminology, very robust. The process must be highly reliable even when the intrusions are fairly similar to the background. Unfortunately, there is a conflict between what requirements two and three seem to imply. Requirement two dictates Fourier correlation, which can only implement linear discriminants. Yet requirement three points to the extreme improbability of intrusion instances A being linearly discriminable from background B.

Fourth, it is seldom true that anywhere near enough samples are available to achieve the kind of reliable robustness users seek according to PAC (probably approximately correct) learning theory. So whatever method is used must outperform PAC learning predictions dramatically. This too argues against Fourier filtering, even though Fourier filtering is essentially required for practical systems.

The problem considered here is not new and there have been many attempts to solve it. So far as we know, no prior method provides all of the sought-after advantages, namely:

   

- Distinguishes B from an A with totally unknown and unknowable spatial and spectral characteristics.
- Operates on images in parallel at high speed.
- Discriminates against B with very high reliability.
- Accomplishes those feats with only a small number of samples in the training set.

Distinguishing between A and B is a classical Bayesian problem of distinguishing between two hypotheses, that is, H1: the image is B plus noise, and H2: the image is A


plus noise. This disallows the hypothesis H3: the image is (A+B) plus noise, because the neighborhood is assumed to be too small to contain significant amounts of both A and B. That assumption is made to provide high spatial resolution in distinguishing B from A. An extended A may be much larger than a single neighborhood; in that case, each small part of an extended A ought to be independently distinguishable from B.

In Sections 2 and 3, we briefly explore other background elimination methods and, more-or-less equivalently, anomaly finders, and show that they present problems that our approach eliminates or at least greatly reduces. Section 3 shows how to modify the strictly linear method of Fourier filtering to accomplish highly nonlinear discrimination. This is done by proper use of multiple Fourier filters, thresholding each, and nonlinearly combining their outputs. Any non-Fourier approach will inevitably be slower than Fourier filtering, because of the latter's parallel space-invariant nature. It does not matter to this paper whether the Fourier methods are electronic or optical. The problem analysis and mathematical formulation of the algorithm are covered in Section 4. This includes a detailed description of the algorithm implementation. Similarities and differences between Margin-Setting and two other discrimination approaches are briefly discussed. In order to demonstrate the efficacy of the algorithm, in Section 5 we apply it to four quite different examples. In all cases, finding the anomalies by eye was very difficult, while their locations became obvious after filtering. One of the examples is examined extensively in order to obtain a quantitative assessment of the classifier performance. The other three examples are used to show before-and-after images. Finally, in Section 6 we offer a brief review of the paper and state some conclusions supported by the work reported here.

2. Prior work
Whereas no prior work known to us has successfully achieved the goals outlined above, one set of papers [1–3] has successfully attacked a related problem. That problem is tracking an object by background elimination, as well as positive identification, in the case of a moving camera. They had to use a fairly simple linear discriminant, because frame-by-frame retraining is required. Our work allows for slower training (perhaps updated every minute, not every frame), so it can use much more powerful algorithms. In fact, there has been a large volume of work dealing with fitting models to the background to allow its recognition and subtraction [4–14].

Other related work goes under the name "anomaly detection." The idea is to characterize expected objects and look for the unexpected. For example, [15] treats every neighborhood as having a particular normal distribution and assigns classes accordingly. Spectral aspects of the background were emphasized in [16]. Statistical characterization of neighborhoods is utilized in [17] as a means of anomaly detection. The power spectrum is used in [18] to characterize image neighborhoods.

Somewhat distantly related work goes under many names, but most commonly "novelty filters." Readers who seek further knowledge in that field might start with the review articles by Markou and Singh [19–21]. The general idea put forth in these papers is to look for and isolate whatever has changed between frames.

Face detection in arbitrary composite images has attracted a great deal of attention [22–24]. Detecting and locating faces with different scales, orientations, and poses against complex backgrounds is a challenging problem, and is the necessary first step towards the subsequent face recognition task. Advances in low-latency face detection algorithms have


led to robust detection systems with fast frame-rate capabilities [25–29]. Face detection algorithms are based on training the classifier with many exemplars consisting of face and non-face image templates. The AdaBoost-based cascade classifiers of [27] consist of simple early-stage filters that eliminate large non-face image sections, while complex late-stage filters concentrate on more challenging portions of the image in order to locate potential faces. In contrast to the above face detection systems, the anomaly detection algorithm developed in this paper attempts to learn from samples of only one class, namely background, in order to detect and locate objects that are not members of the trained-on class. Like the face detection systems of [27–29], ours is also a cascade filter. The analogy, however, stops there, as the filter stages comprising our anomaly detection classifier are entirely different from existing face detection filters.

The problem attacked in this paper is also somewhat distantly related to the broad class of problems dealing with detection of the human face in an image, characterization and recognition of faces, and identity verification based on the facial image [30–32]. A large number of face images are used in [30] to develop a feature space, called eigenfaces, that fully spans significant variations of the training set. Face locations in the input image are isolated by projecting the image onto the eigenfaces and subsequently thresholding. Face tracking systems can also be construed as negative filtering operations in the sense that the algorithm attempts to eliminate all non-face content from each image in the input sequence.

This brief survey suggests two things. First, this is an active field of research. Second, the approach taken in this paper is quite distinct from prior work.
The value of this approach will be shown by argument and, more convincingly we hope, by before-and-after pictures that show how cleanly the intruding areas can be segregated from the background even when they are almost impossible for a human observer to find. That and the real-time capability suggest this approach may have practical value.

3. Analytical approach and related work

The approach taken in this work is to use Fourier correlation to recognize and blank out all regions of an image likely to belong to one or more classes of interest. Here, the classes of interest are collectively referred to as background. Background is represented by samples taken from a reference image or set of images. Each sample consists of a square patch, and small samples will be used to give good spatial resolution. The Fourier approach allows the image to be processed in parallel. It is important to use a powerful discriminant: one that classifies image regions with very high reliability as belonging to the class of interest and not to any other random class. A Fourier filter is a linear discriminant applied in the Fourier domain, so Fourier filtering is very unlikely to be able to accomplish that discrimination robustly. Thus it is vital to combine the space-invariant characteristics of Fourier filtering with the most powerful nonlinear discrimination tool available to us. That problem has been solved by our super-generalized matched filter (SGMF), as described in [33,34] and recapitulated by the concept map of Fig. 1.

VanderLugt in his seminal work [35] established the matched filter (MF) as an important tool in the optics community. He showed that such filters are normally complex-valued and hence not suitable for amplitude-only filtering, while holographic matched filters could be made to represent the complex MF. Early on, it was realized that in order to handle realistic problems the theory must be extended and one must go beyond the MF [36,37].
The MF is not even defined for problems of greatest interest—those in which the target signal is


Fig. 1. The evolution of and relationships among the matched filter (MF), the generalized matched filter (GMF), and the super-generalized matched filter (SGMF): each is nested within the next, offering progressively greater robustness.

ill-defined and represented by a set of what are presumed to be fair samples of a potentially infinite set of possible signals. Filter design methodologies for optimal detection of multi-signal targets are still an active area of research. A small subset of the numerous papers dealing with this topic has been collected in an SPIE Milestones volume [38]. Again, no assertion is being made that optics is better or worse as a means to do this task, even though much of the published work is in the context of optical implementation.

For purposes of this discussion, one set of papers is more relevant than the others: the set of papers dealing directly with generalized matched filters (GMFs). These filters have two defining characteristics. First, they can handle a target class represented by a set of examples. Second, when the set of examples contains only one member, the GMF reduces to the MF. Put the other way, the GMF subsumes the MF. This paper discusses an approach that subsumes the GMF, so we call it the super-generalized


matched filter, or SGMF. The SGMF is a nonlinear filter [33,34,38–40] comprised of multiple GMFs whose output planes are nonlinearly combined. This enables the filter designer to retain the especially valuable space-invariant filtering characteristics provided by Fourier correlators while gaining the added discrimination advantages offered by nonlinear filtering.

Given a set of target-class images, henceforth referred to as the training set or trainers, the algorithm developed herein computes an ordered set of classifier filters, i.e., generalized matched filters (GMFs), and a threshold value for each. An unlabeled image is applied to the classifier filter set, hereafter referred to as the super-generalized matched filter (SGMF). If the peak response of any of the constituent filters (GMFs) to the unlabeled test image exceeds the respective threshold level, the decision is made in favor of labeling the image as target-class; otherwise it is labeled non-target-class.

Fig. 1 shows the logical progression of filter design for the two-class problem. If one class is a fixed signal and the other is noise with known distribution, a matched filter (MF) is defined. The MF is the single filter that gives the highest expected ratio of processed signal level to processed noise level [41]. If the first class is represented by a set of signals, it is possible to define a generalized matched filter (GMF) that handles that problem well and reduces to the MF when the set of signals has only one member. The MF is not defined for the case in which one class is represented by a set of more than one signal. We explore here a sequentially derived set of GMFs that handles the case of one object class represented by a set of images. It contains the GMF as a special case when the filter sequence has only one member. Thus, symbolically, SGMF ⊃ GMF ⊃ MF.
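As a rough illustration, the cascade decision rule just described (label a patch target-class as soon as any constituent GMF's peak correlation response exceeds its threshold) might be sketched as follows. This is our own illustrative code, not the authors' implementation; function and variable names are ours, and spectra are assumed precomputed with NumPy's FFT conventions.

```python
import numpy as np

def sgmf_classify(patch_spectrum, gmfs, thresholds):
    """Apply an SGMF-style cascade to one normalized patch spectrum.

    The patch is labeled target-class (background) as soon as the peak
    correlation response of ANY constituent GMF exceeds that filter's
    threshold; otherwise it is labeled non-target (potential anomaly).
    """
    for gmf, t in zip(gmfs, thresholds):
        # Peak of the circular cross-correlation, evaluated in the
        # Fourier domain: corr = IFFT(F . conj(A)).
        corr = np.fft.ifft2(gmf * np.conj(patch_spectrum)).real
        if corr.max() >= t:
            return "target-class"
    return "non-target-class"
```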
There are numerous designs for powerful Fourier filters, but only two meet the definition of a generalized matched filter, namely a filter that works well for multiple objects and reduces to the matched filter when the number of objects is one. Those are linear combinations of matched filters, and the Caulfield–Haimes discriminant [36].

The algorithm consists of two major components. In one part, a cluster, comprised of a judiciously selected subset of the training set, is formed using the current training set. In the second part, the cluster is utilized to train a classifier (a generalized matched filter, GMF) by employing a bio-mimetic algorithm fashioned after the immune system evolution process. Training of GMFs can be based on any available robust classification algorithm; in this paper, however, we have used a powerful pattern classification algorithm called Margin-Setting [42,43] for this purpose. The GMF is then used to eliminate some members from the current training set. These are the trainers that fall within the sphere of influence of the GMF. The remaining members of the training set form the current training set for the next training round. This process continues, generating a GMF in each round, until the current training set is empty. The ordered set of GMFs so generated constitutes the SGMF.

4. Problem formulation

The training set consists of multiple sub-images derived from one or more images comprising the training source image set. For example, training a classifier capable of distinguishing floating objects from the background sea surface involves a set of sub-images (training set) obtained from various sea-surface images (training source image set), acquired at different scales and view angles, under varied environmental and lighting conditions, and devoid of any floating objects. All images here and henceforth are assumed


to be gray-scale and are represented by real-valued matrices. The training set is defined below, where a_i, S and N_s denote, respectively, a trainer, the set of all trainers and the number of trainers.

$S = \{a_i : 1 \le i \le N_s\}$   (1)

A typical trainer is obtained by extracting a rectangular sub-image from a contiguous region of one of the training source images, as shown below. In (2), a_i and B_j denote, respectively, a trainer and the training source image from which it is drawn; k, l are randomly chosen integers which determine the spatial neighborhood of the source image from which the trainer is obtained; M, N and M_0, N_0 denote, respectively, the number of pixels along the vertical and horizontal directions of the trainer and of the respective source image.

$a_i = B_j\big(k : (k+M-1),\; l : (l+N-1)\big)$   (2a)

$1 \le k \le M_0 - M + 1, \quad 1 \le l \le N_0 - N + 1$   (2b)

The trainer set is normalized such that for each image the mean pixel intensity is set to zero and the sum of squares of pixel values is set to one, in accordance with (3), where $\bar a_i$ denotes a typical normalized trainer.

$a_i \leftarrow a_i - \frac{1}{MN}\sum_{m=0}^{M-1}\sum_{n=0}^{N-1} a_i(m,n)$   (3a)

$\bar a_i = a_i \Big/ \Big[\sum_{m=0}^{M-1}\sum_{n=0}^{N-1} a_i^2(m,n)\Big]^{1/2}$   (3b)
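A minimal NumPy sketch of the normalization in (3); the function name is ours, and the patch is assumed non-constant so the energy is nonzero.

```python
import numpy as np

def normalize_trainer(a):
    """Normalize a trainer patch per Eq. (3): zero mean, unit energy.

    (3a): subtract the mean pixel intensity.
    (3b): divide by the square root of the sum of squared pixel values.
    Assumes the patch is not constant (nonzero energy after (3a)).
    """
    a = np.asarray(a, dtype=float)
    a = a - a.mean()                     # (3a)
    return a / np.sqrt((a ** 2).sum())   # (3b)
```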

Next, the trainer spectra and the mutual correlations between all trainer pairs are computed as shown in (4), where A_k and λ_kl denote, respectively, the spectrum of a particular trainer and the peak correlation between a pair of trainers.

$A_k(p,q) = \sum_{m=0}^{M-1}\sum_{n=0}^{N-1} \bar a_k(m,n)\, e^{-j2\pi(mp/M + nq/N)}, \quad 0 \le p \le M-1,\; 0 \le q \le N-1$   (4a)

$\lambda_{kl} = \max_{m,n} \left[ \frac{1}{MN} \sum_{p=0}^{M-1}\sum_{q=0}^{N-1} A_k(p,q)\, A_l^{*}(p,q)\, e^{j2\pi(mp/M + nq/N)} \right]$   (4b)

$\Lambda = [\lambda_{kl}], \quad 1 \le k, l \le N_s$   (4c)
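The computations in (4a)–(4c) amount to FFTs plus peak picking. A sketch under our own naming, noting that NumPy's `ifft2` already supplies the 1/MN factor appearing in (4b):

```python
import numpy as np

def peak_correlation_matrix(trainers):
    """Peak cross-correlations between normalized trainers, Eq. (4).

    Spectra are 2-D DFTs (4a); the peak correlation between a pair is
    the maximum of their circular cross-correlation, evaluated via the
    inverse FFT (4b); Lambda collects all pairs (4c).
    """
    spectra = [np.fft.fft2(a) for a in trainers]
    ns = len(spectra)
    lam = np.zeros((ns, ns))
    for k in range(ns):
        for l in range(ns):
            corr = np.fft.ifft2(spectra[k] * np.conj(spectra[l])).real
            lam[k, l] = corr.max()
    return lam
```

For normalized trainers, the diagonal entries equal one (a patch's peak autocorrelation is its unit energy).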

where $\bar a_k$, A_k represent, respectively, a normalized trainer and its spectrum; j and * denote the unit imaginary number and the complex-conjugate operator, respectively; Λ represents the peak cross-correlation matrix and N_s is the number of trainers.

A cluster is defined as a group of N_C trainers for which the mutual correlation coefficients satisfy certain conditions outlined below. The parameter value N_C is user-specified and denotes the cluster population. The cluster formation process proceeds as follows. First, a pair of trainers whose peak correlation is highest among all current trainer pairs is identified. If there is more than one such pair, one is chosen randomly. The trainer pair so identified constitutes the first two cluster members. Among the remaining trainers, the trainer whose minimum peak correlation with respect to all cluster members is largest is identified and added to the cluster. If there is more than one such trainer, one is chosen randomly. The process


continues, adding one trainer to the cluster in each step, until the number of trainers in the cluster reaches N_C. The procedure for selecting the first two members of the cluster is described by (5a). Following the selection of the first two members, the third member of the cluster is chosen in accordance with (5b).

$\exists\, k,l,\; k \ne l:\ \lambda_{kl} \ge \lambda_{uv} \quad \forall\, 1 \le u,v \le N_s,\; u \ne v \;\Rightarrow\; c_1 = a_k,\; c_2 = a_l$   (5a)

$\exists\, w,\; w \ne k,l:\ \min(\lambda_{wk}, \lambda_{wl}) \ge \min(\lambda_{pk}, \lambda_{pl}) \quad \forall\, p,\; 1 \le p \le N_s,\; p \ne k,l \;\Rightarrow\; c_3 = a_w$   (5b)
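The greedy selection rule of (5a)–(5b) can be sketched as follows; this is illustrative code with our own names, where `lam` is the peak-correlation matrix Λ of (4c):

```python
import numpy as np

def form_cluster(lam, nc):
    """Greedy cluster selection per Eq. (5) (illustrative sketch).

    Start with the trainer pair of highest peak correlation (5a); then
    repeatedly add the trainer whose minimum peak correlation to all
    current cluster members is largest (5b), until nc members are
    chosen or no trainers remain. Returns trainer indices.
    """
    ns = lam.shape[0]
    # (5a): highest off-diagonal entry of the peak-correlation matrix.
    off = lam.copy()
    np.fill_diagonal(off, -np.inf)
    k, l = np.unravel_index(np.argmax(off), off.shape)
    cluster = [int(k), int(l)]
    remaining = [i for i in range(ns) if i not in cluster]
    # (5b): maximin growth step, one new member per cycle.
    while len(cluster) < nc and remaining:
        w = max(remaining, key=lambda p: min(lam[p, c] for c in cluster))
        cluster.append(w)
        remaining.remove(w)
    return cluster
```

(Ties are broken deterministically here for brevity; the paper breaks them randomly.)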

where c_1, c_2, c_3 denote the first three cluster members. The process described in (5b) continues, finding one new cluster member in each cycle and adding it to the cluster set. The process is terminated when the cluster population reaches N_C or there are no additional trainers. The cluster is subsequently used to compute the classifier filter (GMF) and the respective threshold using the procedure outlined below.

The computation of the classifier filter begins with forming a large number of filters (typically one thousand), each obtained as a weighted sum of the spectra of all the cluster members using randomly generated weight coefficients. Each of the randomly generated filters is normalized with respect to the sum of squares of its pixels. Eqs. (6a) and (6b) describe the formation of a typical filter and its subsequent normalization.

$F_k(m,n) = \sum_{l=1}^{N_c} w_{kl}\, \hat C_l(m,n), \quad 1 \le k \le N_F, \quad w_{kl} \in [0,1]$   (6a)

$\bar F_k(m,n) = F_k(m,n) \Big/ \Big[ \sum_{m=0}^{M-1}\sum_{n=0}^{N-1} \big| F_k(m,n) \big|^2 \Big]^{1/2}$   (6b)

where $\hat C_l$ is the spectrum of a typical cluster member, the weight coefficients w_kl are randomly chosen from a uniform probability distribution on [0,1], N_F denotes the number of synthesized filters, and $\bar F_k$ is the spectrum of a typical normalized filter. Following the synthesis of all filters, each filter is assigned a utility value in accordance with the filter's peak response to all the cluster members. The filter peak response to a typical cluster member is defined as the peak correlation between the filter and the respective cluster member and is computed as shown in (7a). The utility value of a typical filter is the minimum peak response of the filter with respect to all cluster members from which it is synthesized and is computed as shown in (7b).

$R_{kl} = \max_{m,n} \left[ \frac{1}{MN} \sum_{p=0}^{M-1}\sum_{q=0}^{N-1} \bar F_k(p,q)\, \hat C_l^{*}(p,q)\, e^{j2\pi(mp/M + nq/N)} \right], \quad 1 \le l \le N_c$   (7a)

$U_k = \min_l (R_{kl}), \quad 1 \le k \le N_F$   (7b)
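A sketch of the random filter synthesis and utility assignment of (6)–(7), with our own names; cluster-member spectra are assumed precomputed:

```python
import numpy as np

def synthesize_filters(cluster_spectra, n_filters, rng):
    """Random GMF candidates and their utilities, Eqs. (6)-(7) (sketch).

    Each candidate filter is a random nonnegative weighting of the
    cluster-member spectra (6a), normalized to unit energy (6b). Its
    utility (7b) is its minimum peak correlation response (7a) over
    the cluster members it was synthesized from.
    """
    specs = np.stack(cluster_spectra)                 # shape (Nc, M, N)
    filters, utilities = [], []
    for _ in range(n_filters):
        w = rng.uniform(0.0, 1.0, size=len(specs))    # weights in [0, 1]
        f = np.tensordot(w, specs, axes=1)            # (6a)
        f = f / np.sqrt((np.abs(f) ** 2).sum())       # (6b)
        peaks = [np.fft.ifft2(f * np.conj(c)).real.max()
                 for c in specs]                      # (7a)
        filters.append(f)
        utilities.append(min(peaks))                  # (7b)
    return filters, utilities
```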

where N_c is the number of trainers in the cluster, N_F is the number of synthesized filters, and U_k denotes the utility value of the synthesized filter $\bar F_k$. Next, the synthesized filters are rank-ordered according to their utility values and a subset consisting of the N_F' highest-ranked filters is chosen for further processing. Typical values for the user-prescribed parameters N_F and N_F' are one thousand and ten, respectively. The set of filters so chosen constitutes the generation-zero filter set, and the highest-ranked among them is denoted as


the generation-zero prototype. Each normalized filter in (6b) is represented by its corresponding weight vector, which is a point in the N_c-dimensional space. Eqs. (8a)–(8d) below describe the process of forming the generation-zero filter set and selecting the generation-zero prototype.

$F_k^{0}(m,n) = \sum_{l=1}^{N_c} w_{kl}^{0}\, \hat C_l(m,n), \quad 1 \le k \le N_F'$   (8a)

$\hat F^{0}(m,n) = F_p^{0}(m,n); \quad U_p \ge U_k \quad \forall k,\; 1 \le p, k \le N_F'$   (8b)

$\hat F^{0}(m,n) = \sum_{l=1}^{N_c} \hat w_l^{0}\, \hat C_l(m,n)$   (8c)

$\hat W^{0} = \big\{\hat w_1^{0}, \hat w_2^{0}, \ldots, \hat w_{N_c}^{0}\big\}$   (8d)

where $F_k^{0}(m,n)$ is a typical member of the generation-zero filter set, and $\hat F^{0}(m,n)$, $\hat W^{0}$ denote, respectively, the generation-zero prototype and the corresponding weight vector.

The generation-zero filter set is mutated by applying a perturbation process to the respective weight vectors. For each weight vector a mutually independent N_c-dimensional Gaussian process is defined whose mean is equal to the respective weight vector and whose variance along all directions is the same and specified by the user. From each Gaussian process, a number of new weight vectors, commensurate with the utility value of the corresponding filter, are chosen randomly. The weight-vector mutation process is shown in (9). This leads to a large number (N_F) of altered weight vectors, from which N_F' vectors are chosen randomly.

$W_k^{1} \sim N\big(W_k^{0}, \sigma^2\big), \quad 1 \le k \le N_F'$   (9a)

$f_{W_k^{1}}\big(W_{k1}^{1}, W_{k2}^{1}, \ldots, W_{kN_c}^{1}\big) = \frac{1}{(2\pi)^{N_c/2}\,\sigma^{N_c}}\; e^{-\frac{1}{2\sigma^2}\big[W_k^{1} - W_k^{0}\big]\big[W_k^{1} - W_k^{0}\big]^{T}}$   (9b)
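The utility-proportional Gaussian mutation of (9) might be sketched as below; the names are ours, and the utilities are assumed positive so they can serve directly as sampling weights:

```python
import numpy as np

def mutate_weights(weight_vectors, utilities, n_offspring, sigma, rng):
    """Gaussian mutation of filter weight vectors, Eq. (9) (sketch).

    Each parent weight vector spawns offspring drawn from an isotropic
    Gaussian centered on it (9a); higher-utility parents spawn
    proportionately more offspring, mimicking preferential selection.
    Utilities are assumed positive.
    """
    u = np.asarray(utilities, dtype=float)
    probs = u / u.sum()                       # utility-proportional shares
    counts = rng.multinomial(n_offspring, probs)
    offspring = []
    for w, c in zip(weight_vectors, counts):
        for _ in range(c):
            offspring.append(rng.normal(loc=w, scale=sigma))  # (9a)
    return offspring
```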

where σ is the user-prescribed standard deviation (typically set at 0.1), superscript T denotes matrix transpose, and $f_{W_k^{1}}$ is the multivariate Gaussian density from which the mutated weight vectors are randomly chosen. The chosen weight vectors $W_k^{1}$ are used to synthesize the generation-one filter set, which is then normalized with respect to energy content as shown in (6b). The utility value of each filter is computed and the filters are rank-ordered in the manner described before. The filter with the highest utility value is designated the generation-one prototype. This process is repeated for a user-specified number of mutation cycles (typically five) or until the overall filter utility values reach a plateau and no further improvement is observed. Note that in each cycle superior filters (those with higher utility values) mutate preferentially and generate a proportionately larger offspring set. In each mutation cycle the filters are chosen randomly from the large number of mutant filters spawned from the set of filters in the immediately preceding cycle. There is an inherent upward trend and overall improvement in the filter utility values as one proceeds from one mutation cycle to the


next. This is because in any cycle a larger proportion of the synthesized filters are progenies of superior parents from the earlier cycle. Since the present-cycle filters are chosen from this set at random, there is an overall improvement in performance vis-à-vis the filter utility value.

At the termination of the mutation process, the filter with the highest utility value is declared the round-one filter and its utility value, defined in (7), is the corresponding zero-margin threshold. The round-one threshold is defined as below.

$T^{(1)} = (1 + 0.01\,\delta)\, T_0^{(1)}$   (10)

where δ is the percent-margin parameter, and $T_0^{(1)}$, $T^{(1)}$ denote, respectively, the zero-margin threshold and the threshold for the round-one filter. The typical value of the user-prescribed percent-margin lies between zero and ten. The response of the round-one filter to all trainers is computed as shown below; those trainers whose response values exceed the threshold are removed from the training set.

$R_k = \max_{m,n} \left[ \frac{1}{MN} \sum_{p=0}^{M-1}\sum_{q=0}^{N-1} I^{(1)}(p,q)\, A_k^{*}(p,q)\, e^{j2\pi(mp/M + nq/N)} \right], \quad 1 \le k \le N_s$   (11)

where $I^{(1)}$ denotes the round-one filter, A_k is the spectrum of a typical trainer, and R_k is the filter peak response to the trainer. The set of trainers for which the peak response exceeds the round-one filter threshold are said to be subsumed by the filter and are removed from the training set, leading to a reduced set of trainers.

$S^{(1)} = \big\{a_i : R_i \ge T^{(1)}\big\}$   (12a)

$\tilde S = S \setminus S^{(1)}$   (12b)
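One training round's trainer elimination, per (11)–(12), can be sketched as follows (our names; trainer spectra are assumed precomputed):

```python
import numpy as np

def apply_round_filter(round_filter, trainer_spectra, threshold):
    """Round-filter response and trainer elimination, Eqs. (11)-(12).

    Computes each trainer's peak response to the round filter (11),
    marks trainers whose response meets the threshold as subsumed
    (12a), and returns the reduced trainer set (12b).
    """
    responses = [
        np.fft.ifft2(round_filter * np.conj(a)).real.max()   # (11)
        for a in trainer_spectra
    ]
    subsumed = [i for i, r in enumerate(responses)
                if r >= threshold]                            # (12a)
    reduced = [a for i, a in enumerate(trainer_spectra)
               if i not in subsumed]                          # (12b)
    return subsumed, reduced
```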

where S, $S^{(1)}$, $\tilde S$ denote, respectively, the original trainer set, the set of trainers subsumed by the round-one filter, and the reduced trainer set. This concludes the round-one training process. The round-two training process repeats all the round-one steps described above using the reduced training set given in (12b). It concludes with computation of the round-two filter $I^{(2)}$, the respective threshold $T^{(2)}$, and a further reduced trainer set. This process continues for a user-specified number of training rounds or until fewer than two trainers are left. Each round of training starts with an input trainer set and concludes with the computation of a filter–threshold pair and a reduced trainer set that is passed to the next training round. At the end of the training process the threshold values are adjusted by multiplying them by the relaxation parameter as shown below.

$\bar T^{(n)} = (1 - 0.01\,\beta)\, T^{(n)}, \quad 1 \le n \le N_R$   (13)

where $\bar T^{(n)}$, $T^{(n)}$ are, respectively, the adjusted (relaxed) and computed threshold values for a typical classifier, N_R denotes the number of classification rounds, and β is the relaxation parameter (0 ≤ β ≤ 100) with typical values in the 10–30 range. The training process in its entirety generates a multi-round cascade classifier. Each classifier round comprises one filter–threshold pair.

The heuristic clustering procedure described here has some resemblance to K-means [44,45]. The two methods, however, are fundamentally different in that K-means uses a predetermined number of clusters, whereas here we compute one cluster using the available exemplars, train a filter (GMF) using the computed cluster, utilize the filter to remove some samples from the training set, and start the process anew. The user specifies the


cluster population, but the number of clusters is not preordained. This is because the number of trainers eliminated by a GMF in a particular training round may be equal to, smaller than, or larger than the population of the cluster it is trained on. Effective clustering is the objective of K-means, whereas the intent here is robust distinction between background and intrusions.

The classifier is applied to the input test image by computing the cross-correlation of the test image with each filter comprising the classifier. The cross-correlation results are converted to binary images by setting all pixels at which the correlation exceeds the corresponding threshold to zero and all other pixels to one. The computed binary images are multiplied pixel-wise and the resultant binary image is complemented to produce the anomaly map, in which all background and non-background (anomaly) pixels have values of one and zero, respectively.

5. Classifier performance tests

In order to demonstrate the efficacy of the algorithm for detection of image anomalies, a number of simulations were conducted using a diverse set of backgrounds and anomalies. The underlying scenario in all the simulations comprises a target-class source image, consisting of one or multiple spatially consistent background regions, from which the classifier is trained. The background image (target-class source image) is then contaminated by replacing one or multiple small square image patches at random locations with square patches of the same size extracted from arbitrary non-target-class images. The classifier is subsequently applied to the contaminated image in order to detect and locate the anomalous (non-target) regions. Fig. 2 shows two gray-scale images: ivy-leaves, representing the target-class source, and deep-grass, representing the non-target contamination source.
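The classifier-application procedure described at the end of Section 4 (per-filter correlation, thresholding to binary maps, pixel-wise product, complement) can be sketched as follows; this is our own illustrative code, not the authors' implementation:

```python
import numpy as np

def anomaly_map(image, filters, thresholds):
    """Build the binary anomaly map from a filter-threshold cascade.

    For each filter-threshold pair: correlate the filter with the test
    image, set pixels where the correlation exceeds the threshold
    (background hits) to zero and all others to one, multiply the
    binary maps pixel-wise, then complement. In the result, background
    pixels are one and anomaly pixels are zero.
    """
    img_spec = np.fft.fft2(image)
    product = np.ones(image.shape, dtype=int)
    for filt, t in zip(filters, thresholds):
        corr = np.fft.ifft2(filt * np.conj(img_spec)).real
        binary = (corr <= t).astype(int)  # 0 where classified background
        product *= binary
    return 1 - product                    # complement
```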
Two small square patches were extracted from the deep-grass image and subsequently inserted into the ivy-leaves image, replacing squares of the same size in the host image. The positions of the extracted deep-grass patches and their destinations in the ivy-leaves image were chosen randomly. The ivy-leaves image transplanted with two deep-grass patches is shown in Fig. 3a. Visually, the images of Fig. 2 are quite similar, and indeed it is impossible to locate the two transplanted deep-grass anomalies in the corrupted image of Fig. 3a by simple inspection. A classifier was trained using forty 20×20-pixel image patches drawn randomly from

Fig. 2. Ivy-leaves image (a) represents the target-class and deep-grass image (b) is the source from which anomalies are extracted for transplantation in the target-class host image.

Fig. 3. The input image (a) is comprised of ivy-leaves and two small non-leaves (deep-grass) patches. The output image (b) shows that the filter correctly detected and located all the foreign (non-leaves) patches.

Fig. 4. Effect of relaxation parameter on error rates. Percent error rates are plotted as functions of relaxation parameter.

the ivy-leaves image of Fig. 2a (training source). The classifier was then applied to the corrupted image of Fig. 3a, and all image sections classified as target-class members were set to white. As seen in the output image of Fig. 3b, the classifier correctly identified all the foreign image segments, i.e., the two deep-grass patches in this case. This experiment was repeated multiple times; each time, square patches were extracted from Fig. 2b and transplanted into Fig. 2a at random locations. In some of the experiments the extracted image patches were randomly scaled and rotated prior to transplantation in the host image. In every case the classifier correctly located the image anomalies. In order to obtain statistically reliable performance metrics for the algorithm, a classifier was trained using forty 20×20-pixel image patches randomly extracted from the target-class image (ivy-leaves) and was subsequently applied to one thousand test images comprised of randomly selected 20×20-pixel image patches equally divided between target
and non-target (deep-grass) source images of Fig. 2. The process of trainer selection, classifier computation, test image set formation, and testing was repeated one-hundred times, and the detection results were averaged. There are two types of errors, namely false-negative, which refers to the classifier's failure to correctly identify target-class test images, and false-positive, representing misclassification of non-target-class test images as target-class.

The plots of Fig. 4 show the false-negative and false-positive error rates as functions of the threshold relaxation parameter. As seen in (13), increasing the relaxation parameter b lowers the filter threshold values and therefore results in a more permissive classifier, which in turn detects a larger percentage of the target-class inputs (lower false-negative) at the expense of higher false-positive error. It is noted that misclassification of non-target-class patches (deep-grass in this case) as target class (ivy-leaves) is virtually zero. For example, setting the relaxation parameter at 30-percent, b = 30 in (13), results in false-negative and false-positive error rates of 0.67-percent and 0.23-percent, respectively.

Fig. 5 illustrates the effect of the number of trainers on the classifier performance. All the parameters are the same as those in Fig. 4 with the exception of the number of trainers. It is seen that using only a few trainers the algorithm generates a classifier that can recognize foreign (non-target) image patches with virtual certainty. For example, a classifier trained with only twenty 20×20-pixel target-class image patches classified 96-percent of the previously unseen target-class patches correctly and misclassified less than 0.23-percent of the non-target patches. It is also seen that the overall performance improves as the number of trainers is increased, as expected.
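The patch-level evaluation above (test patches split evenly between the two classes, a patch accepted as background if any filter's correlation peak exceeds its relaxed threshold) can be sketched as follows. The linear relaxation form T = T(n)·(1 − b/100) is an assumption standing in for the paper's Eq. (13), and the function and variable names are illustrative:

```python
import numpy as np

def error_rates(scores_target, scores_nontarget, thresholds, b=30.0):
    """Percent false-negative / false-positive rates for a filter cascade.

    scores_*: arrays of shape (num_patches, num_filters) holding each test
    patch's peak correlation against each filter.  A patch is accepted as
    background when any filter's score exceeds its relaxed threshold."""
    relaxed = np.asarray(thresholds) * (1.0 - b / 100.0)  # assumed Eq. (13) form
    accepted_t = (scores_target > relaxed).any(axis=1)
    accepted_n = (scores_nontarget > relaxed).any(axis=1)
    fn = 100.0 * np.mean(~accepted_t)  # background patches wrongly rejected
    fp = 100.0 * np.mean(accepted_n)   # anomaly patches wrongly accepted
    return fn, fp
```

Raising b lowers every threshold, so more target patches are accepted (lower false-negative) while more non-target patches slip through (higher false-positive), matching the trend shown in Fig. 4.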
As the number of trainers is increased, the false-negative error decreases substantially at the expense of a slight increase in the false-positive error. Next, the effect of trainer spatial dimensions on the classifier performance is examined. The number of trainers was set at 40 in all cases while the image-patch dimensions were varied. The plots of Fig. 6 show error rates as functions of threshold relaxation with the trainer dimension used as the control parameter. It is seen that increasing the threshold relaxation leads to lower false-negative and higher false-positive error rates. It is also seen that for fixed relaxation the false-negative error increases with larger-dimension trainers. This is to be expected, as larger trainers result in more restrictive classifiers, which in turn explains the extremely low false-positive error rates for the 20×20-pixel and 30×30-pixel

Fig. 5. Effect of the number of trainers on the classifier performance. The false-negative (a) and false-positive (b) error rates are plotted as functions of relaxation (b) with the number of trainers (N) as the control parameter.

Fig. 6. Effect of trainer spatial dimensions on the classifier performance. The false-negative (a) and false-positive (b) error rates are plotted as functions of relaxation (b) with trainer dimensions (M×N) as the control parameter.

Fig. 7. Effect of number of trainers on the classifier performance. The classifier was trained using 25×25-pixel (a) and 15×15-pixel (b) trainers extracted from the ivy-leaves source image. The classifier was tested using image patches extracted from the ivy-leaves and deep-grass images of Fig. 2.

trainer cases. Indeed, the false-positive error rate for the classifier trained with 30×30-pixel patches is zero for all relaxation values of Fig. 6b and therefore is not displayed on the logarithmic scale. The classifier trained with forty 20×20-pixel patches and relaxation of 30-percent has a false-negative rate of 0.675-percent and a false-positive rate of 0.23-percent. Increasing the trainer dimension to 30×30 while keeping all other parameters the same results in an elevated false-negative rate of 1.8-percent, whereas the false-positive rate remains zero. The effect of the number of trainers on the classifier performance is examined next. The classifiers were trained using two distinct sets of training images comprised of square patches extracted from the ivy-leaves image of Fig. 2a. In the first case the trainers consisted of 25×25-pixel image patches and in the second case they consisted of 15×15-pixel patches. The threshold relaxation parameter was set at 30-percent for all test cases. The number of trainers for each scenario was varied from ten to forty, and for each case


the classifier was tested using one thousand test images comprised of randomly chosen image patches extracted from the target (ivy-leaves) and non-target (deep-grass) classes of Fig. 2. As before, the test images were equally divided between the two classes; the training and testing process was repeated one-hundred times for each scenario, and the detection results were averaged. The plots of Fig. 7 show the error rates as functions of the trainer population for each setting of the trainer dimension. It is noted that the classifier based on forty 25×25-pixel trainers achieves false-negative and false-positive error rates of 0.2-percent and zero, respectively. This means that the classifier, on average, correctly classified 99.8-percent of the test image patches extracted from the ivy-leaves image and rejected 100-percent of the ones extracted from the deep-grass image. The classifier based on forty 15×15-pixel trainers, on the other hand, achieved false-negative and false-positive error rates of 0.175-percent and 7.8-percent, respectively. In the case of the smaller trainers, as one increases the number of training elements, the volume in feature hyperspace encompassed by the classifier increases and inevitably includes non-target space. This accounts for the overall upward trend in the false-positive error rate. This phenomenon, however, diminishes for dimensionally larger training elements. The images of Fig. 8 show one of the trainers comprising the training set and all five filters comprising the classifier for a particular training scenario. The classifier is based on thirty 20×20-pixel trainers extracted from the ivy-leaves image of Fig. 2a, one of which is shown in the upper-left corner of Fig. 8. The cluster population and the margin value were set at five and zero, respectively. The training process resulted in a five-filter classifier

Fig. 8. The upper-left image is one of the thirty 20×20-pixel ivy-leaves trainers, and the other five images represent the five-filter classifier (SGMF). The filter orders increase from left to right and from top to bottom. Filters one through five subsumed 9, 7, 5, 5, 4 trainers and have threshold values of 0.5520, 0.4925, 0.6897, 0.4336, 0.3873, respectively.


(SGMF), all of which are shown in Fig. 8. The number of trainers utilized in the synthesis of each particular filter and the respective threshold values are listed in the caption of Fig. 8. Figs. 9–11 show additional examples that demonstrate the effectiveness of this technique for detecting and locating image anomalies. The original images for these examples were obtained from the Texture Database in [46]. Fig. 9 shows two types of skin where, for the purpose of this demonstration, Fig. 9a is the image of type-one skin and is designated as target, and Fig. 9b is the image of type-two skin and is designated as non-target. The target and non-target sample images shown in Fig. 9 have dimensions of 300×300 and 200×200 pixels, respectively. The image of Fig. 9c is the target corrupted by five small patches randomly extracted from the non-target image. The objective here is to obtain a classifier that is capable of detecting and locating all potential non-target intrusions of any type in previously untrained-on input images consisting of type-one skin. The classifier was trained using thirty randomly selected 15×15-pixel patches from the target class. In this example a cluster population of five and a relaxation parameter of 20-percent were used, and

Fig. 9. Images of type-one (a) and type-two (b) skin represent target and non-target classes. The image in (c) shows type-one skin corrupted by patches extracted from the type-two skin. The image in (d) is the result of the application of the filtering process to the image in (c).


Fig. 10. Images of net (a) and carpet (b) represent target and non-target classes. The image in (c) shows net corrupted by patches extracted from carpet. The image in (d) is the result of the application of the filtering process to the image in (c).

the training process resulted in a seven-filter SGMF classifier. The classifier was applied to the image of Fig. 9c and correctly detected all non-target image sections. The filtered image is shown in Fig. 9d, and close visual inspection of the corrupted image shows that the classifier has indeed detected and located all the skin anomalies successfully. The above example was repeated using type-two skin as the target class. A new classifier was trained using thirty 15×15-pixel samples randomly extracted from Fig. 9b. The input image was subsequently synthesized by transplanting into the image of Fig. 9b a few randomly selected patches extracted from the image of Fig. 9a. The experiment was repeated multiple times, and the classifier detected all intrusions in every trial. In the example of Fig. 10, a classifier was trained to distinguish one type of fabric (target class) from all other fabric types. Fig. 10a shows the image of a piece of net, which constitutes the target class, and Fig. 10b shows the carpet image, which represents non-target. The target and non-target sample images shown in Fig. 10 both have dimensions of 300×300 pixels. The classifier was trained using twenty 10×10-pixel patches randomly selected from Fig. 10a, a cluster


Fig. 11. Image of background (target-class) consisting of two skin types corrupted by anomalies comprised of net and carpet patches (a), and the filtered image (b).

population of four, and a relaxation parameter of 25-percent. The training process resulted in a three-filter classifier. In principle, this classifier is capable of distinguishing backgrounds comprised of net from all non-net intrusions. Three randomly selected 10×10-pixel patches were extracted from the carpet image and randomly transplanted into the net image. Fig. 10c and d show the corrupted net image and the filtered image, respectively. Close visual examination of the image in Fig. 10c shows that the classifier correctly located all non-net image sections. The example of Fig. 11 shows an image consisting of a background comprised of the two skin types of Fig. 9. The image also contains intrusions which were extracted from the net and carpet images of Fig. 10 and implanted at random locations. A classifier was trained using fifty 15×15-pixel patches obtained from the images of the two skin types of Fig. 9. Twenty-five training patches were randomly chosen from each skin type and collectively were called the background class. The classifier was trained using a cluster population of five and a relaxation parameter of 20-percent, resulting in a nine-round classifier. It is noted that in the training process no distinction is made between type-one and type-two skin; all training samples are considered members of the same class, namely background or target class. Fig. 11b shows the filtered image. It is noted that the classifier has correctly isolated all image regions that are not members of the background class. Representative data pertaining to the performance of the anomaly-detection classifiers for an assortment of background-anomaly combinations and various settings of the training parameters are listed in Table 1. As before, the false-negative error rate (FN) is the percentage of test images comprised of arbitrarily chosen square patches of the background source image that are misclassified as anomalies.
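The corruption procedure used throughout these experiments — cutting small square patches from the anomaly source at random positions and pasting them into the background image at random positions — can be sketched as below. This is a minimal sketch; the function name, parameters, and RNG handling are illustrative:

```python
import numpy as np

def transplant_patches(background, source, patch=10, count=3, seed=None):
    """Return a copy of `background` (a 2-D gray-scale array) with `count`
    square patches of side `patch`, cut from random locations in `source`,
    pasted at random locations.  The original image is left untouched."""
    rng = np.random.default_rng(seed)
    out = background.copy()
    for _ in range(count):
        sr = rng.integers(0, source.shape[0] - patch + 1)  # source corner
        sc = rng.integers(0, source.shape[1] - patch + 1)
        dr = rng.integers(0, out.shape[0] - patch + 1)     # destination corner
        dc = rng.integers(0, out.shape[1] - patch + 1)
        out[dr:dr + patch, dc:dc + patch] = source[sr:sr + patch, sc:sc + patch]
    return out
```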
The false-positive error rate (FP), on the other hand, is the percentage of arbitrarily chosen square patches of the anomaly source image that are misclassified as background. Although in the examples presented here all anomalies are rectangular, the computed SGMF classifier is capable of detecting arbitrarily shaped anomalies. Corrupting a test image with an arbitrarily shaped anomaly involves superimposing the anomaly on the test image at the desired location. This is tantamount to replacing the affected test image pixel


Table 1
Classifier performance.

Background       Anomaly         P    Q    b    FN     FP
Ivy leaves       Deep grass      20   20   30   4      0.23
                                 40   20   20   5.75   0
                                 50   20   25   3.02   0
                                 40   20   30   0.67   0.23
                                 40   15   30   0.2    7.8
                                 40   25   30   0.2    0
                                 40   30   30   1.8    0
Type-one skin    Type-two skin   30   15   30   0.32   0.26
Type-two skin    Type-one skin   30   15   30   0.35   37
Net              Carpet          20   10   25   0      0
Skin             Fabric          50   15   25   0      0

P represents the number of trainers, Q the spatial dimension of each (Q×Q) trainer, and b the relaxation parameter (percent); FN and FP denote, respectively, the percent false-negative and percent false-positive error rates.

values with the values of the respective pixels of the anomaly. As a result of this transplantation process, a rectangular region of the test image whose area is larger than the oddly shaped anomaly constitutes the anomalous region, even though some of its pixels are the original target-class pixels. Applying the SGMF to the test image will result in classification of the rectangular region circumscribing the foreign section as anomalous, and hence detection of the oddly shaped anomaly is achieved.

6. Summary and conclusions

This paper presents a practical algorithm for negative Fourier filtering of imagery data, where the objective is the removal of expected background rather than direct detection of the anomalies and intrusions which may constitute potential targets. In four challenging and varied scenarios, the classifier introduced in this paper worked extremely well and was able to remove the background and expose potential targets for further processing. In each case the classifier was trained using a few background samples, each consisting of a small square image patch randomly extracted from the background. The classifier was able to remove all the background pixels, most of which were not included in the training set, while leaving all other objects intact. Our approach could work (to varying degrees) with any good classifier that can be implemented in the Fourier-transform domain. The aim of this paper is to explain the concept and show that it works well using different images that all share the feature of making human search for the "anomalies" very difficult. The use of SGMFs for Fourier recognition and subsequent blanking of regions that are likely to belong to the expected class or classes has been shown to be a very powerful means of highlighting the unexpected in a scene. The salient contribution of the work presented here is the fusion of


Fourier filtering and Margin-Setting techniques for development of negative-filter classifiers for anomaly detection. Possible applications are immense, e.g., locating

- Defects in materials on a conveyor belt or produced as sheets.
- Objects floating in the sea.
- Marijuana growing in fields of other grass.
- Small patches of diseased or necrotic tissue in an otherwise healthy region.
- Downed airmen bobbing about in the sea.
- Defects in a largely-acceptable manufactured item.

References

[1] H.T. Nguyen, A. Smeulders, Robust tracking using foreground-background texture discrimination, International Journal of Computer Vision 69 (2006) 277–283.
[2] H.T. Nguyen, A. Smeulders, Fast occluded object tracking by a robust appearance filter, IEEE Transactions on Pattern Analysis and Machine Intelligence 26 (8) (2004) 1099–1104.
[3] Y. Sugaya, K. Kanatani, Extracting moving objects from a moving camera video sequence, in: Proceedings of the 10th Symposium on Sensing via Imaging Information, 2004, pp. 279–284.
[4] J. Cheng, J. Yang, Y. Zhou, H. Cui, Flexible background mixture models for foreground segmentation, Image and Vision Computing 24 (2006) 473–482.
[5] W.E. Grimson, C. Stauffer, R. Romano, L. Lee, Using adaptive tracking to classify and monitor activities in a site, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1998, pp. 22–29.
[6] C. Stauffer, W.E. Grimson, Adaptive background mixture models for real-time tracking, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, 1999, pp. 246–252.
[7] N. Friedman, S. Russell, Image segmentation in video sequences: a probabilistic approach, in: Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence, 1997, pp. 175–181.
[8] J. Kato, T. Watanabe, S. Joga, J. Rittscher, A. Blake, An HMM-based segmentation method for traffic monitoring movies, IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (9) (2002) 1291–1296.
[9] C.R. Wren, A. Azarbayejani, T. Darrell, A.P. Pentland, Pfinder: real-time tracking of the human body, IEEE Transactions on Pattern Analysis and Machine Intelligence 19 (7) (1997) 780–785.
[10] C. Ridder, O. Munkelt, H. Kirchner, Adaptive background estimation and foreground detection using Kalman-filtering, in: Proceedings of the International Conference on Recent Advances in Mechatronics, 1995, pp. 193–199.
[11] J. Rittscher, J. Kato, S. Joga, A. Blake, A probabilistic background model for tracking, in: Proceedings of the Sixth European Conference on Computer Vision, 2000, pp. 336–350.
[12] B. Stenger, V. Ramesh, N. Paragios, F. Coetzee, J.M. Buhmann, Topology free hidden Markov models: application to background modeling, in: Proceedings of the IEEE International Conference on Computer Vision, vol. 1, 2001, pp. 294–301.
[13] P.W. Power, J.A. Schoonees, Understanding background mixture models for foreground segmentation, in: Proceedings of Image and Vision Computing New Zealand, 2002, pp. 267–271.
[14] K. Toyama, J. Krumm, B. Brumitt, B. Meyers, Wallflower: principles and practice of background maintenance, in: Proceedings of the IEEE International Conference on Computer Vision, vol. 1, 1999, pp. 255–261.
[15] D.W.J. Stein, S.G. Beaven, L.E. Hoff, E.M. Winter, A.P. Schaum, A.D. Stocker, Anomaly detection from hyperspectral imagery, IEEE Signal Processing Magazine 19 (2002) 58–69.
[16] H. Kwan, S.Z. Der, N.M. Nasrabadi, Adaptive anomaly detection using subspace separation for hyperspectral imagery, Optical Engineering 42 (2003) 3342–3351.
[17] A. Goldman, I. Cohen, Anomaly detection based on an iterative local statistics approach, in: Proceedings of the Convention of the Electrical and Electronic Engineers in Israel, 6–7 September 2004, pp. 440–443.
[18] Q. Cheng, Y. Xu, E. Grunsky, Integrated spatial and spectrum method for geochemical anomaly separation, Natural Resources Research 9 (2000) 43–52.
[19] M. Markou, S. Singh, Novelty detection: a review, Part I: statistical approaches, Signal Processing 83 (2003) 2481–2497.
[20] M. Markou, S. Singh, Novelty detection: a review, Part II: neural network based approaches, Signal Processing 83 (2003) 2499–2521.
[21] S. Singh, M. Markou, An approach to novelty detection applied to the classification of image regions, IEEE Transactions on Knowledge and Data Engineering 16 (4) (2004) 396–407.
[22] H.A. Rowley, S. Baluja, T. Kanade, Neural network-based face detection, IEEE Transactions on Pattern Analysis and Machine Intelligence 20 (1) (1998) 23–38.
[23] M.H. Yang, D. Kriegman, N. Ahuja, Detecting faces in images: a survey, IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (1) (2002) 34–58.
[24] P. Viola, M.J. Jones, Robust real-time object detection, in: Proceedings of the IEEE Workshop on Statistical and Computational Theories of Vision, 2001.
[25] H. Schneiderman, T. Kanade, A statistical method for 3D object detection applied to faces and cars, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2000.
[26] P. Viola, M.J. Jones, Rapid object detection using a boosted cascade of simple features, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. I, 2001, pp. 511–518.
[27] P. Viola, M.J. Jones, Robust real-time face detection, International Journal of Computer Vision 57 (2) (2004) 137–154.
[28] E. Grossman, Automatic design of cascade classifiers, in: International IAPR Workshop on Statistical Pattern Recognition, 2004.
[29] M.M. Dundar, J. Bi, Joint optimization of cascaded classifiers for computer aided detection, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2007.
[30] M. Turk, A. Pentland, Eigenfaces for recognition, Journal of Cognitive Neuroscience 3 (1) (1991) 71–86.
[31] R. Ebrahimpour, E. Kabir, M.R. Yousef, Teacher-directed learning in view-independent face recognition with mixture of experts using single-view eigenspaces, Journal of the Franklin Institute 345 (2008) 87–101.
[32] A. Khoukhi, S.F. Ahmed, A genetically modified fuzzy linear discriminant analysis for face recognition, Journal of the Franklin Institute 348 (2011) 2701–2717.
[33] K. Heidary, H.J. Caulfield, Application of supergeneralized matched filters to target classification, Applied Optics 44 (1) (2005) 47–54.
[34] R.B. Johnson, K. Heidary, A unified approach for database analysis and application to ATR performance metrics, Proceedings of SPIE 7696 (2010) 1–20.
[35] A. VanderLugt, Signal detection by complex filtering, IEEE Transactions on Information Theory IT-10 (1964) 139–145.
[36] H.J. Caulfield, R. Haimes, Generalized matched filtering, Applied Optics 19 (1980) 181–183.
[37] H.J. Caulfield, M.H. Weinberg, Computer recognition of 2-D patterns using generalized matched filters, Applied Optics 21 (1982) 1699–1704.
[38] M.A.G. Abushagur, H.J. Caulfield (Eds.), Selected Papers on Fourier Optics, SPIE Milestone Series, vol. MS 105, SPIE – The International Society for Optical Engineering, 1995.
[39] E.R. Dougherty, J. Astola, Nonlinear Filters for Image Processing, Wiley-IEEE Press, 1999.
[40] S. Lototsky, R. Mikulevicius, B.L. Rozovski, Nonlinear filtering revisited: a spectral approach, SIAM Journal on Control and Optimization 35 (2) (1997) 435–461.
[41] S. Haykin, Communication Systems, fourth ed., Wiley, New York, 2001.
[42] H.J. Caulfield, K. Heidary, Exploring margin setting for good generalization in multiple class discrimination, Pattern Recognition 38 (8) (2005) 1225–1238.
[43] K. Heidary, H.J. Caulfield, Discrimination among similar looking, noisy color patches using margin setting, Optics Express 15 (1) (2007) 62–75.
[44] J.B. MacQueen, Some methods for classification and analysis of multivariate observations, in: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, University of California Press, 1967, pp. 281–297.
[45] T. Kanungo, D.M. Mount, N. Netanyahu, C. Piatko, R. Silverman, A.Y. Wu, An efficient k-means clustering algorithm: analysis and implementation, IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) 881–892.
[46] University of Western Australia Texture Database, http://local.wasp.uwa.edu.au/pbrouke/texture.