Needles in a haystack: Fast spatial search for targets in similar-looking backgrounds

Available online at www.sciencedirect.com

Journal of the Franklin Institute 349 (2012) 2935–2955 www.elsevier.com/locate/jfranklin

Kaveh Heidary^a,*, H. John Caulfield^b,1

^a Department of Electrical Engineering, Alabama A&M University, PO Box 702, Normal, AL 35762, USA
^b Alabama A&M University Research Institute, PO Box 313, Normal, AL 35762, USA

Received 21 April 2011; received in revised form 22 March 2012; accepted 30 May 2012. Available online 19 August 2012.

Abstract

This paper develops an efficient and robust algorithm that simultaneously detects and locates image anomalies and intrusions. Anomalies refer to image regions that do not belong to expected classes. In situations where most of the image is of one or more types of known background classes while a few isolated regions may belong to unknown classes, the algorithm detects and locates potential intrusions by blanking regions it classifies as members of the known classes. We used a combination of Fourier filtering, a fast linear way to scan the content of the whole scene in parallel, with Margin-Setting, a powerful nonlinear discriminant trained to distinguish members of known classes from everything else. That combination retains the power of Margin-Setting and the simplicity, speed, and locating ability of Fourier filtering. Examples show the ability of this method to remove essentially all background material while leaving the similar-looking intrusions intact. The classifier is trained using a few small square patches extracted from images or image regions representing the background classes of interest. Processed images related to four different problems, as well as cumulative numerical results of many tests performed on one of those problems, are presented. Excellent performance is observed for the examples considered here.

© 2012 The Franklin Institute. Published by Elsevier Ltd. All rights reserved.

1. Introduction

This paper presents a way to largely rid an image of unpromising background as a preliminary step in target detection and classification. Effectively, it uses Fourier methods

* Corresponding author. Tel.: +1 256 372 5587; fax: +1 256 372 5855. E-mail addresses: [email protected] (K. Heidary), [email protected] (H. John Caulfield). 1 Tel.: +1 256 372 5844.
0016-0032/$32.00 © 2012 The Franklin Institute. Published by Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.jfranklin.2012.05.013


to locate and remove items that represent probable background. This can be done by spatial and/or spectral filtering, but we have demonstrated only spatial filtering, as the objective here is background removal and detection of targets in gray-scale digital imagery. The intent of this work is to make fast and robust negative target detection possible. In the context of this paper, "negative target detection" is defined as follows: rather than basing the target detection process on stated target attributes, provide a method of recognizing the dominant background, i.e., the things the target is not. That is the problem attacked here, and its detailed analysis dictates the paper organization outlined below.

First, one begins with what is known to dominate the image: call it the background B(x, y). It is assumed that at least a few samples of B are available for instructing an algorithm to recognize occurrences of B at any location, even allowing for specific B's to differ somewhat from those trained on. This, of course, is the standard statistical pattern recognition problem. To distinguish between B and all other things, called potential intrusions A(x, y), the best that can be done in the absence of any information about A is to assume that A or its Fourier spectrum has equal probability at all points in the image and in its power spectrum. The equivalent action here is to remain silent on the spatial and the spectral content of potential intrusions.

Second, the background removal algorithm must be computationally efficient and amenable to real-time implementation using readily available processors. Methods that require serial examination of countless small regions of each image frame are not likely to be widely applicable. A space-invariant discriminant that can be applied in parallel at all points in the image is required. For well-known reasons, this imposes symmetry conditions that effectively dictate the use of Fourier-plane filtering.

Third, it is vital that the discrimination achieved between A and B be very reliable; in discrimination terminology, very robust. The process must be highly reliable even when the intrusions are fairly similar to the background. Unfortunately, there is a conflict between what requirements two and three seem to imply. Requirement two dictates Fourier correlation, which can only implement linear discriminants. Yet requirement three points to the extreme improbability of intrusion instances A being linearly discriminable from background B.

Fourth, it is seldom true that anywhere near enough samples are available to achieve the kind of reliable robustness users seek according to PAC (probably approximately correct) learning theory. So whatever method is used must outperform PAC learning predictions dramatically. This too argues against Fourier filtering, even though Fourier filtering is essentially required for practical systems.

The problem considered here is not new and there have been many attempts to solve it. So far as we know, no prior method provides all of the sought-after advantages, namely:

   

- Distinguishes B from an A with totally unknown and unknowable spatial and spectral characteristics.
- Operates on images in parallel at high speed.
- Discriminates against B with very high reliability.
- Accomplishes those feats with only a small number of samples in the training set.

Distinguishing between A and B is a classical Bayesian problem of distinguishing between two hypotheses, that is, H1: the image is B plus noise, and H2: the image is A


plus noise. This disallows the hypothesis H3: the image is (A+B) plus noise, because the neighborhood is assumed to be too small to contain significant amounts of both A and B. That assumption is made to provide high spatial resolution in distinguishing B from A. An extended A may be much larger than a single neighborhood; in that case, each small part of an extended A ought to be independently distinguishable from B.

In Sections 2 and 3, we briefly explore other background elimination methods and, more-or-less equivalently, anomaly finders, and show that they present problems that our approach eliminates or at least greatly reduces. Section 3 shows how to modify the strictly linear method of Fourier filtering to accomplish highly nonlinear discrimination. This is done by proper use of multiple Fourier filters, thresholding each, and nonlinearly combining their outputs. Any non-Fourier approach will inevitably be slower than Fourier filtering, because of the latter's parallel space-invariant nature. It does not matter to this paper whether the Fourier methods are electronic or optical. The problem analysis and mathematical formulation of the algorithm are covered in Section 4. This includes a detailed description of the algorithm implementation. Similarities and differences between Margin-Setting and two other discrimination approaches are briefly discussed. In order to demonstrate the efficacy of the algorithm, in Section 5 we apply it to four quite different examples. In all cases, finding the anomalies by eye was very difficult, while their locations became obvious after filtering. One of the examples is examined extensively in order to obtain a quantitative assessment of the classifier performance. The other three examples are used to show before-and-after images. Finally, in Section 6 we offer a brief review of the paper and state some conclusions supported by the work reported here.

2. Prior work
Whereas no prior work known to us has successfully achieved the goals outlined above, one set of papers [1–3] has successfully attacked a related problem. That problem is tracking an object by background elimination, as well as positive identification, in the case of a moving camera. They had to use a fairly simple linear discriminant, because frame-by-frame retraining is required. Our work allows for slower training (perhaps updated every minute, not every frame), so it can use much more powerful algorithms. In fact, there has been a large volume of work dealing with fitting models to the background to allow its recognition and subtraction [4–14].

Other related work goes under the name "anomaly detection." The idea is to characterize expected objects and look for the unexpected. For example, [15] treats every neighborhood as having a particular normal distribution and assigns classes accordingly. Spectral aspects of the background were emphasized in [16]. Statistical characterization of neighborhoods is utilized in [17] as a means of anomaly detection. The power spectrum is used in [18] to characterize image neighborhoods.

Somewhat distantly related work goes under many names, but most commonly "novelty filters." Readers who seek further knowledge in that field might start with the review articles by Markou and Singh [19–21]. The general idea put forth in these papers is to look for and isolate whatever has changed between frames.

Face detection in arbitrary composite images has attracted a great deal of attention [22–24]. Detecting and locating faces with different scales, orientations, and poses against complex backgrounds is a challenging problem, and is the necessary first step towards the subsequent face recognition task. Advances in low-latency face detection algorithms have


led to robust detection systems with fast frame-rate capabilities [25–29]. Face detection algorithms are based on training the classifier with many exemplars consisting of face and non-face image templates. The AdaBoost-based cascade classifiers of [27] consist of simple early-stage filters that eliminate large non-face image sections, while complex late-stage filters concentrate on more challenging portions of the image in order to locate potential faces. In contrast to the above face detection systems, the anomaly detection algorithm developed in this paper attempts to learn from samples of only one class, namely background, in order to detect and locate objects that are not members of the trained-on class. Like the face detection systems of [27–29], ours is also a cascade filter. The analogy, however, stops there, as the filter stages comprising our anomaly detection classifier are entirely different from existing face detection filters.

The problem attacked in this paper is also somewhat distantly related to the broad class of problems dealing with detection of the human face in an image, characterization and recognition of faces, and identity verification based on the facial image [30–32]. A large number of face images are used in [30] to develop a feature space, called eigenfaces, that fully spans significant variations of the training set. Face locations in the input image are isolated by projecting the image onto the eigenfaces and subsequently thresholding. Face tracking systems can also be construed as negative filtering operations in the sense that the algorithm attempts to eliminate all non-face content from each image in the input sequence.

This brief survey suggests two things. First, this is an active field of research. Second, the approach taken in this paper is quite distinct from prior work.
The value of this approach will be shown by argument and, more convincingly we hope, by before-and-after pictures that show how cleanly the intruding areas can be segregated from the background even when they are almost impossible for a human observer to find. That and the real-time capability suggest this approach may have practical value.

3. Analytical approach and related work

The approach taken in this work is to use Fourier correlation to recognize and blank out all regions of an image likely to belong to one or more classes of interest. Here, the classes of interest are collectively referred to as background. Background is represented by samples taken from a reference image or set of images. Each sample consists of a square patch, and small samples will be used to give good spatial resolution. The Fourier approach allows the image to be processed in parallel. It is important to use a powerful discriminant: one that classifies image regions with very high reliability as belonging to the class of interest and not to any other random class. A Fourier filter is a linear discriminant applied in the Fourier domain, so Fourier filtering is very unlikely to be able to accomplish that discrimination robustly. Thus it is vital to combine the space-invariant characteristics of Fourier filtering with the most powerful nonlinear discrimination tool available to us. That problem has been solved by our super-generalized matched filter (SGMF), as described in [33,34] and recapitulated by the concept map of Fig. 1.

VanderLugt in his seminal work [35] established the matched filter (MF) as an important tool in the optics community. He showed that such filters are normally complex-valued and hence not suitable for amplitude-only filtering, while holographic matched filters could be made to represent the complex MF. Early on, it was realized that in order to handle realistic problems the theory must be extended and one must go beyond the MF [36,37].
The MF is not even defined for problems of greatest interest—those in which the target signal is


Fig. 1. The evolution of and relationships among the matched filter (MF), the generalized matched filter (GMF), and the super-generalized matched filter (SGMF): each is nested within the next, offering progressively greater robustness.

ill-defined and represented by a set of what are presumed to be fair samples of a potentially infinite set of possible signals. Filter design methodologies for optimal detection of multi-signal targets are still an active area of research. A small subset of the numerous papers dealing with this topic has been collected in an SPIE Milestones volume [38]. Again, no assertion is being made that optics is better or worse as a means to do this task, even though much of the published work is in the context of optical implementation.

For purposes of this discussion, one set of papers is more relevant than the others: the set of papers dealing directly with generalized matched filters (GMFs). These filters have two defining characteristics. First, they can handle a target class represented by a set of examples. Second, when the set of examples contains only one member, the GMF reduces to the MF. Put the other way, the GMF subsumes the MF. This paper discusses an approach that subsumes the GMF, so we call it the super-generalized


matched filter, or SGMF. The SGMF is a nonlinear filter [33,34,38–40] comprised of multiple GMFs whose output planes are nonlinearly combined. This enables the filter designer to retain the especially valuable space-invariant filtering characteristics provided by Fourier correlators while gaining the added discrimination advantages offered by nonlinear filtering.

Given a set of target-class images, henceforth referred to as the training set or trainers, the algorithm developed herein computes an ordered set of classifier filters, i.e., generalized matched filters (GMFs), and a threshold value for each. An unlabeled image is applied to the classifier filter set, hereafter referred to as the super-generalized matched filter (SGMF). If the peak response of any of the constituent filters (GMFs) to the unlabeled test image exceeds the respective threshold level, the decision is made in favor of labeling the image as target-class; otherwise it is labeled non-target-class.

Fig. 1 shows the logical progression of filter design for the two-class problem. If one class is a fixed signal and the other is noise with known distribution, a matched filter (MF) is defined. The MF is the single filter that gives the highest expected ratio of processed signal level to processed noise level [41]. If the first class is represented by a set of signals, it is possible to define a generalized matched filter (GMF) that handles that problem well and reduces to the MF when the set of signals has only one member. The MF is not defined for the case in which one class is represented by a set of more than one signal. We explore here a sequentially derived set of GMFs that handles the case of one object class represented by a set of images. It contains the GMF as a special case when the filter sequence has only one member. Thus, symbolically, SGMF ⊃ GMF ⊃ MF.
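As a rough illustration, the cascade decision rule just described (label a patch target-class as soon as any constituent GMF's peak correlation response exceeds its threshold) might be sketched as follows. This is our own illustrative code, not the authors' implementation; function and variable names are ours, and spectra are assumed precomputed with NumPy's FFT conventions.

```python
import numpy as np

def sgmf_classify(patch_spectrum, gmfs, thresholds):
    """Apply an SGMF-style cascade to one normalized patch spectrum.

    The patch is labeled target-class (background) as soon as the peak
    correlation response of ANY constituent GMF exceeds that filter's
    threshold; otherwise it is labeled non-target (potential anomaly).
    """
    for gmf, t in zip(gmfs, thresholds):
        # Peak of the circular cross-correlation, evaluated in the
        # Fourier domain: corr = IFFT(F . conj(A)).
        corr = np.fft.ifft2(gmf * np.conj(patch_spectrum)).real
        if corr.max() >= t:
            return "target-class"
    return "non-target-class"
```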
There are numerous designs for powerful Fourier filters, but only two meet the definition of a generalized matched filter, namely a filter that works well for multiple objects and reduces to the matched filter when the number of objects is one. Those are linear combinations of matched filters, and the Caulfield–Haimes discriminant [36].

The algorithm consists of two major components. In one part, a cluster, comprised of a judiciously selected subset of the training set, is formed using the current training set. In the second part, the cluster is utilized to train a classifier (a generalized matched filter, GMF) by employing a bio-mimetic algorithm fashioned after the immune system evolution process. Training of GMFs can be based on any available robust classification algorithm; in this paper, however, we have used a powerful pattern classification algorithm called Margin-Setting [42,43] for this purpose. The GMF is then used to eliminate some members from the current training set. These are the trainers that fall within the sphere of influence of the GMF. The remaining members of the training set form the current training set for the next training round. This process continues, generating a GMF in each round, until the current training set is empty. The ordered set of GMFs so generated constitutes the SGMF.

4. Problem formulation

The training set consists of multiple sub-images derived from one or more images comprising the training source image set. For example, training a classifier capable of distinguishing floating objects from the background sea surface involves a set of sub-images (training set) obtained from various sea-surface images (training source image set), acquired at different scales and view angles, under varied environmental and lighting conditions, and devoid of any floating objects. All images here and henceforth are assumed


to be gray-scale and are represented by real-valued matrices. The training set is defined below, where a_i, S and N_s denote, respectively, a trainer, the set of all trainers and the number of trainers.

$S = \{a_i : 1 \le i \le N_s\}$   (1)

A typical trainer is obtained by extracting a rectangular sub-image from a contiguous region of one of the training source images, as shown below. In (2), a_i and B_j denote, respectively, a trainer and the training source image from which it is drawn; k, l are randomly chosen integers which determine the spatial neighborhood of the source image from which the trainer is obtained; M, N and M_0, N_0 denote, respectively, the number of pixels along the vertical and horizontal directions of the trainer and of the respective source image.

$a_i = B_j\big(k : (k+M-1),\; l : (l+N-1)\big)$   (2a)

$1 \le k \le M_0 - M + 1, \quad 1 \le l \le N_0 - N + 1$   (2b)

The trainer set is normalized such that for each image the mean pixel intensity is set to zero and the sum of squares of pixel values is set to one, in accordance with (3), where $\bar a_i$ denotes a typical normalized trainer.

$a_i \leftarrow a_i - \frac{1}{MN}\sum_{m=0}^{M-1}\sum_{n=0}^{N-1} a_i(m,n)$   (3a)

$\bar a_i = a_i \Big/ \Big[\sum_{m=0}^{M-1}\sum_{n=0}^{N-1} a_i^2(m,n)\Big]^{1/2}$   (3b)
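A minimal NumPy sketch of the normalization in (3); the function name is ours, and the patch is assumed non-constant so the energy is nonzero.

```python
import numpy as np

def normalize_trainer(a):
    """Normalize a trainer patch per Eq. (3): zero mean, unit energy.

    (3a): subtract the mean pixel intensity.
    (3b): divide by the square root of the sum of squared pixel values.
    Assumes the patch is not constant (nonzero energy after (3a)).
    """
    a = np.asarray(a, dtype=float)
    a = a - a.mean()                     # (3a)
    return a / np.sqrt((a ** 2).sum())   # (3b)
```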

Next, the trainer spectra and the mutual correlations between all trainer pairs are computed as shown in (4), where A_k and λ_kl denote, respectively, the spectrum of a particular trainer and the peak correlation between a pair of trainers.

$A_k(p,q) = \sum_{m=0}^{M-1}\sum_{n=0}^{N-1} \bar a_k(m,n)\, e^{-j2\pi(mp/M + nq/N)}, \quad 0 \le p \le M-1,\; 0 \le q \le N-1$   (4a)

$\lambda_{kl} = \max_{m,n} \left[ \frac{1}{MN} \sum_{p=0}^{M-1}\sum_{q=0}^{N-1} A_k(p,q)\, A_l^{*}(p,q)\, e^{j2\pi(mp/M + nq/N)} \right]$   (4b)

$\Lambda = [\lambda_{kl}], \quad 1 \le k, l \le N_s$   (4c)
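The computations in (4a)–(4c) amount to FFTs plus peak picking. A sketch under our own naming, noting that NumPy's `ifft2` already supplies the 1/MN factor appearing in (4b):

```python
import numpy as np

def peak_correlation_matrix(trainers):
    """Peak cross-correlations between normalized trainers, Eq. (4).

    Spectra are 2-D DFTs (4a); the peak correlation between a pair is
    the maximum of their circular cross-correlation, evaluated via the
    inverse FFT (4b); Lambda collects all pairs (4c).
    """
    spectra = [np.fft.fft2(a) for a in trainers]
    ns = len(spectra)
    lam = np.zeros((ns, ns))
    for k in range(ns):
        for l in range(ns):
            corr = np.fft.ifft2(spectra[k] * np.conj(spectra[l])).real
            lam[k, l] = corr.max()
    return lam
```

For normalized trainers, the diagonal entries equal one (a patch's peak autocorrelation is its unit energy).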

where $\bar a_k$, A_k represent, respectively, a normalized trainer and its spectrum; j and * denote the unit imaginary number and the complex-conjugate operator, respectively; Λ represents the peak cross-correlation matrix and N_s is the number of trainers.

A cluster is defined as a group of N_C trainers for which the mutual correlation coefficients satisfy certain conditions outlined below. The parameter value N_C is user-specified and denotes the cluster population. The cluster formation process proceeds as follows. First, a pair of trainers whose peak correlation is highest among all current trainer pairs is identified. If there is more than one such pair, one is chosen randomly. The trainer pair so identified constitutes the first two cluster members. Among the remaining trainers, the trainer whose minimum peak correlation with respect to all cluster members is largest is identified and added to the cluster. If there is more than one such trainer, one is chosen randomly. The process


continues, adding one trainer to the cluster in each step, until the number of trainers in the cluster reaches N_C. The procedure for selecting the first two members of the cluster is described by (5a). Following the selection of the first two members, the third member of the cluster is chosen in accordance with (5b).

$\exists\, k,l,\; k \ne l:\ \lambda_{kl} \ge \lambda_{uv} \quad \forall\, 1 \le u,v \le N_s,\; u \ne v \;\Rightarrow\; c_1 = a_k,\; c_2 = a_l$   (5a)

$\exists\, w,\; w \ne k,l:\ \min(\lambda_{wk}, \lambda_{wl}) \ge \min(\lambda_{pk}, \lambda_{pl}) \quad \forall\, p,\; 1 \le p \le N_s,\; p \ne k,l \;\Rightarrow\; c_3 = a_w$   (5b)
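The greedy selection rule of (5a)–(5b) can be sketched as follows; this is illustrative code with our own names, where `lam` is the peak-correlation matrix Λ of (4c):

```python
import numpy as np

def form_cluster(lam, nc):
    """Greedy cluster selection per Eq. (5) (illustrative sketch).

    Start with the trainer pair of highest peak correlation (5a); then
    repeatedly add the trainer whose minimum peak correlation to all
    current cluster members is largest (5b), until nc members are
    chosen or no trainers remain. Returns trainer indices.
    """
    ns = lam.shape[0]
    # (5a): highest off-diagonal entry of the peak-correlation matrix.
    off = lam.copy()
    np.fill_diagonal(off, -np.inf)
    k, l = np.unravel_index(np.argmax(off), off.shape)
    cluster = [int(k), int(l)]
    remaining = [i for i in range(ns) if i not in cluster]
    # (5b): maximin growth step, one new member per cycle.
    while len(cluster) < nc and remaining:
        w = max(remaining, key=lambda p: min(lam[p, c] for c in cluster))
        cluster.append(w)
        remaining.remove(w)
    return cluster
```

(Ties are broken deterministically here for brevity; the paper breaks them randomly.)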

where c_1, c_2, c_3 denote the first three cluster members. The process described in (5b) continues, finding one new cluster member in each cycle and adding it to the cluster set. The process is terminated when the cluster population reaches N_C or there are no additional trainers. The cluster is subsequently used to compute the classifier filter (GMF) and the respective threshold using the procedure outlined below.

The computation of the classifier filter begins with forming a large number of filters (typically one thousand), each obtained as a weighted sum of the spectra of all the cluster members using randomly generated weight coefficients. Each of the randomly generated filters is normalized with respect to the sum of squares of its pixels. Eqs. (6a) and (6b) describe the formation of a typical filter and its subsequent normalization.

$F_k(m,n) = \sum_{l=1}^{N_c} w_{kl}\, \hat C_l(m,n), \quad 1 \le k \le N_F, \quad w_{kl} \in [0,1]$   (6a)

$\bar F_k(m,n) = F_k(m,n) \Big/ \Big[ \sum_{m=0}^{M-1}\sum_{n=0}^{N-1} \big| F_k(m,n) \big|^2 \Big]^{1/2}$   (6b)

where $\hat C_l$ is the spectrum of a typical cluster member, the weight coefficients w_kl are randomly chosen from a uniform probability distribution on [0,1], N_F denotes the number of synthesized filters, and $\bar F_k$ is the spectrum of a typical normalized filter. Following the synthesis of all filters, each filter is assigned a utility value in accordance with the filter's peak response to all the cluster members. The filter peak response to a typical cluster member is defined as the peak correlation between the filter and the respective cluster member and is computed as shown in (7a). The utility value of a typical filter is the minimum peak response of the filter with respect to all cluster members from which it is synthesized and is computed as shown in (7b).

$R_{kl} = \max_{m,n} \left[ \frac{1}{MN} \sum_{p=0}^{M-1}\sum_{q=0}^{N-1} \bar F_k(p,q)\, \hat C_l^{*}(p,q)\, e^{j2\pi(mp/M + nq/N)} \right], \quad 1 \le l \le N_c$   (7a)

$U_k = \min_l (R_{kl}), \quad 1 \le k \le N_F$   (7b)
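A sketch of the random filter synthesis and utility assignment of (6)–(7), with our own names; cluster-member spectra are assumed precomputed:

```python
import numpy as np

def synthesize_filters(cluster_spectra, n_filters, rng):
    """Random GMF candidates and their utilities, Eqs. (6)-(7) (sketch).

    Each candidate filter is a random nonnegative weighting of the
    cluster-member spectra (6a), normalized to unit energy (6b). Its
    utility (7b) is its minimum peak correlation response (7a) over
    the cluster members it was synthesized from.
    """
    specs = np.stack(cluster_spectra)                 # shape (Nc, M, N)
    filters, utilities = [], []
    for _ in range(n_filters):
        w = rng.uniform(0.0, 1.0, size=len(specs))    # weights in [0, 1]
        f = np.tensordot(w, specs, axes=1)            # (6a)
        f = f / np.sqrt((np.abs(f) ** 2).sum())       # (6b)
        peaks = [np.fft.ifft2(f * np.conj(c)).real.max()
                 for c in specs]                      # (7a)
        filters.append(f)
        utilities.append(min(peaks))                  # (7b)
    return filters, utilities
```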

where N_c is the number of trainers in the cluster, N_F is the number of synthesized filters, and U_k denotes the utility value of the synthesized filter $\bar F_k$. Next, the synthesized filters are rank-ordered according to their utility values and a subset consisting of the N_F' highest-ranked filters is chosen for further processing. Typical values for the user-prescribed parameters N_F and N_F' are one thousand and ten, respectively. The set of filters so chosen constitutes the generation-zero filter set, and the highest-ranked among them is denoted as


the generation-zero prototype. Each normalized filter in (6b) is represented by its corresponding weight vector, which is a point in the N_c-dimensional space. Eqs. (8a)–(8d) below describe the process of forming the generation-zero filter set and selecting the generation-zero prototype.

$F_k^{0}(m,n) = \sum_{l=1}^{N_c} w_{kl}^{0}\, \hat C_l(m,n), \quad 1 \le k \le N_F'$   (8a)

$\hat F^{0}(m,n) = F_p^{0}(m,n); \quad U_p \ge U_k \quad \forall k,\; 1 \le p, k \le N_F'$   (8b)

$\hat F^{0}(m,n) = \sum_{l=1}^{N_c} \hat w_l^{0}\, \hat C_l(m,n)$   (8c)

$\hat W^{0} = \big\{\hat w_1^{0}, \hat w_2^{0}, \ldots, \hat w_{N_c}^{0}\big\}$   (8d)

where $F_k^{0}(m,n)$ is a typical member of the generation-zero filter set, and $\hat F^{0}(m,n)$, $\hat W^{0}$ denote, respectively, the generation-zero prototype and the corresponding weight vector.

The generation-zero filter set is mutated by applying a perturbation process to the respective weight vectors. For each weight vector a mutually independent N_c-dimensional Gaussian process is defined whose mean is equal to the respective weight vector and whose variance along all directions is the same and specified by the user. From each Gaussian process, a number of new weight vectors, commensurate with the utility value of the corresponding filter, are chosen randomly. The weight-vector mutation process is shown in (9). This leads to a large number (N_F) of altered weight vectors, from which N_F' vectors are chosen randomly.

$W_k^{1} \sim N\big(W_k^{0}, \sigma^2\big), \quad 1 \le k \le N_F'$   (9a)

$f_{W_k^{1}}\big(W_{k1}^{1}, W_{k2}^{1}, \ldots, W_{kN_c}^{1}\big) = \frac{1}{(2\pi)^{N_c/2}\,\sigma^{N_c}}\; e^{-\frac{1}{2\sigma^2}\big[W_k^{1} - W_k^{0}\big]\big[W_k^{1} - W_k^{0}\big]^{T}}$   (9b)
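The utility-proportional Gaussian mutation of (9) might be sketched as below; the names are ours, and the utilities are assumed positive so they can serve directly as sampling weights:

```python
import numpy as np

def mutate_weights(weight_vectors, utilities, n_offspring, sigma, rng):
    """Gaussian mutation of filter weight vectors, Eq. (9) (sketch).

    Each parent weight vector spawns offspring drawn from an isotropic
    Gaussian centered on it (9a); higher-utility parents spawn
    proportionately more offspring, mimicking preferential selection.
    Utilities are assumed positive.
    """
    u = np.asarray(utilities, dtype=float)
    probs = u / u.sum()                       # utility-proportional shares
    counts = rng.multinomial(n_offspring, probs)
    offspring = []
    for w, c in zip(weight_vectors, counts):
        for _ in range(c):
            offspring.append(rng.normal(loc=w, scale=sigma))  # (9a)
    return offspring
```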

where σ is the user-prescribed standard deviation (typically set at 0.1), superscript T denotes matrix transpose, and $f_{W_k^{1}}$ is the multivariate Gaussian density from which the mutated weight vectors are randomly chosen. The chosen weight vectors $W_k^{1}$ are used to synthesize the generation-one filter set, which is then normalized with respect to energy content as shown in (6b). The utility value of each filter is computed and the filters are rank-ordered in the manner described before. The filter with the highest utility value is designated the generation-one prototype. This process is repeated for a user-specified number of mutation cycles (typically five) or until the overall filter utility values reach a plateau and no further improvement is observed. Note that in each cycle superior filters (those with higher utility values) mutate preferentially and generate a proportionately larger offspring set. In each mutation cycle the filters are chosen randomly from the large number of mutant filters spawned from the set of filters in the immediately preceding cycle. There is an inherent upward trend and overall improvement in the filter utility values as one proceeds from one mutation cycle to the


next. This is because in any cycle a larger proportion of the synthesized filters are progenies of superior parents from the earlier cycle. Since the present-cycle filters are chosen from this set at random, there is an overall improvement in performance vis-à-vis the filter utility value.

At the termination of the mutation process, the filter with the highest utility value is declared the round-one filter and its utility value, defined in (7), is the corresponding zero-margin threshold. The round-one threshold is defined as below.

$T^{(1)} = (1 + 0.01\,\delta)\, T_0^{(1)}$   (10)

where δ is the percent-margin parameter, and $T_0^{(1)}$, $T^{(1)}$ denote, respectively, the zero-margin threshold and the threshold for the round-one filter. The typical value of the user-prescribed percent-margin lies between zero and ten. The response of the round-one filter to all trainers is computed as shown below; those trainers whose response values exceed the threshold are removed from the training set.

$R_k = \max_{m,n} \left[ \frac{1}{MN} \sum_{p=0}^{M-1}\sum_{q=0}^{N-1} I^{(1)}(p,q)\, A_k^{*}(p,q)\, e^{j2\pi(mp/M + nq/N)} \right], \quad 1 \le k \le N_s$   (11)

where $I^{(1)}$ denotes the round-one filter, A_k is the spectrum of a typical trainer, and R_k is the filter peak response to the trainer. The set of trainers for which the peak response exceeds the round-one filter threshold are said to be subsumed by the filter and are removed from the training set, leading to a reduced set of trainers.

$S^{(1)} = \big\{a_i : R_i \ge T^{(1)}\big\}$   (12a)

$\tilde S = S \setminus S^{(1)}$   (12b)
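One training round's trainer elimination, per (11)–(12), can be sketched as follows (our names; trainer spectra are assumed precomputed):

```python
import numpy as np

def apply_round_filter(round_filter, trainer_spectra, threshold):
    """Round-filter response and trainer elimination, Eqs. (11)-(12).

    Computes each trainer's peak response to the round filter (11),
    marks trainers whose response meets the threshold as subsumed
    (12a), and returns the reduced trainer set (12b).
    """
    responses = [
        np.fft.ifft2(round_filter * np.conj(a)).real.max()   # (11)
        for a in trainer_spectra
    ]
    subsumed = [i for i, r in enumerate(responses)
                if r >= threshold]                            # (12a)
    reduced = [a for i, a in enumerate(trainer_spectra)
               if i not in subsumed]                          # (12b)
    return subsumed, reduced
```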

where S, $S^{(1)}$, $\tilde S$ denote, respectively, the original trainer set, the set of trainers subsumed by the round-one filter, and the reduced trainer set. This concludes the round-one training process. The round-two training process repeats all the round-one steps described above using the reduced training set given in (12b). It concludes with computation of the round-two filter $I^{(2)}$, the respective threshold $T^{(2)}$, and a further reduced trainer set. This process continues for a user-specified number of training rounds or until fewer than two trainers are left. Each round of training starts with an input trainer set and concludes with the computation of a filter–threshold pair and a reduced trainer set that is passed to the next training round. At the end of the training process the threshold values are adjusted by multiplying them by the relaxation parameter as shown below.

$\bar T^{(n)} = (1 - 0.01\,\beta)\, T^{(n)}, \quad 1 \le n \le N_R$   (13)

where $\bar T^{(n)}$, $T^{(n)}$ are, respectively, the adjusted (relaxed) and computed threshold values for a typical classifier, N_R denotes the number of classification rounds, and β is the relaxation parameter (0 ≤ β ≤ 100) with typical values in the 10–30 range. The training process in its entirety generates a multi-round cascade classifier. Each classifier round comprises one filter–threshold pair.

The heuristic clustering procedure described here has some resemblance to K-means [44,45]. The two methods, however, are fundamentally different in that K-means uses a predetermined number of clusters, whereas here we compute one cluster using the available exemplars, train a filter (GMF) using the computed cluster, utilize the filter to remove some samples from the training set, and start the process anew. The user specifies the


cluster population, but the number of clusters is not preordained. This is because the number of trainers eliminated by a GMF in a particular training round may be equal to, smaller than, or larger than the population of the cluster it is trained on. Effective clustering is the objective of K-means, whereas the intent here is robust distinction between background and intrusions.

The classifier is applied to the input test image by computing the cross-correlation of the test image with each filter comprising the classifier. The cross-correlation results are converted to binary images by setting all pixels at which the correlation exceeds the corresponding threshold to zero and all other pixels to one. The computed binary images are multiplied pixel-wise and the resultant binary image is complemented to produce the anomaly map, in which all background and non-background (anomaly) pixels have values of one and zero, respectively.

5. Classifier performance tests

In order to demonstrate the efficacy of the algorithm for detection of image anomalies, a number of simulations were conducted using a diverse set of backgrounds and anomalies. The underlying scenario in all the simulations comprises a target-class source image, consisting of one or multiple spatially consistent background regions, from which the classifier is trained. The background image (target-class source image) is then contaminated by replacing one or multiple small square image patches at random locations with square patches of the same size extracted from arbitrary non-target-class images. The classifier is subsequently applied to the contaminated image in order to detect and locate the anomalous (non-target) regions. Fig. 2 shows two gray-scale images: ivy-leaves, representing the target-class source, and deep-grass, representing the non-target contamination source.
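The classifier-application procedure described at the end of Section 4 (per-filter correlation, thresholding to binary maps, pixel-wise product, complement) can be sketched as follows; this is our own illustrative code, not the authors' implementation:

```python
import numpy as np

def anomaly_map(image, filters, thresholds):
    """Build the binary anomaly map from a filter-threshold cascade.

    For each filter-threshold pair: correlate the filter with the test
    image, set pixels where the correlation exceeds the threshold
    (background hits) to zero and all others to one, multiply the
    binary maps pixel-wise, then complement. In the result, background
    pixels are one and anomaly pixels are zero.
    """
    img_spec = np.fft.fft2(image)
    product = np.ones(image.shape, dtype=int)
    for filt, t in zip(filters, thresholds):
        corr = np.fft.ifft2(filt * np.conj(img_spec)).real
        binary = (corr <= t).astype(int)  # 0 where classified background
        product *= binary
    return 1 - product                    # complement
```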
Two small square patches were extracted from the deep-grass image and subsequently inserted into the ivy-leaves image, replacing squares of the same size in the host image. The positions of the extracted deep-grass patches and their destinations in the ivy-leaves image were chosen randomly. The ivy-leaves image transplanted with two deep-grass patches is shown in Fig. 3a. Visually, the images of Fig. 2 are quite similar, and indeed it is impossible to locate the two transplanted deep-grass anomalies in the corrupted image of Fig. 3a by simple inspection. A classifier was trained using forty 20×20-pixel image patches drawn randomly from

Fig. 2. Ivy-leaves image (a) represents the target-class and deep-grass image (b) is the source from which anomalies are extracted for transplantation in the target-class host image.

Fig. 3. The input image (a) is comprised of ivy-leaves and two small non-leaves (deep-grass) patches. The output image (b) shows that the filter correctly detected and located all the foreign (non-leaves) patches.

Fig. 4. Effect of relaxation parameter on error rates. Percent error rates are plotted as functions of relaxation parameter.

the ivy-leaves image of Fig. 2a (training source). The classifier was then applied to the corrupted image of Fig. 3a, and all image sections classified as target-class members were set to white. As seen in the output image of Fig. 3b, the classifier correctly identified all the foreign image segments, i.e., the two deep-grass patches in this case. This experiment was repeated multiple times; each time, square patches were extracted from Fig. 2b and transplanted into Fig. 2a at random locations. In some of the experiments the extracted image patches were randomly scaled and rotated prior to transplantation in the host image. In every case the classifier correctly located the image anomalies. In order to obtain statistically reliable performance metrics for the algorithm, a classifier was trained using forty 20×20-pixel image patches randomly extracted from the target-class image (ivy-leaves) and was subsequently applied to one thousand test images comprised of randomly selected 20×20-pixel image patches equally divided between target
and non-target (deep-grass) source images of Fig. 2. The process of trainer selection, classifier computation, test image set formation, and testing was repeated one-hundred times, and the detection results were averaged. There are two types of errors, namely false-negative, which refers to the classifier's failure to correctly identify target-class test images, and false-positive, representing misclassification of non-target-class test images as target-class.

The plots of Fig. 4 show the false-negative and false-positive error rates as functions of the threshold relaxation parameter. As seen in (13), increasing the relaxation parameter b lowers the filter threshold values and therefore results in a more permissive classifier, which in turn detects a larger percentage of the target-class inputs (lower false-negative) at the expense of higher false-positive error. It is noted that misclassification of non-target-class patches (deep-grass in this case) as target class (ivy-leaves) is virtually zero. For example, setting the relaxation parameter at 30-percent, b = 30 in (13), results in false-negative and false-positive error rates of 0.67-percent and 0.23-percent, respectively.

Fig. 5 illustrates the effect of the number of trainers on the classifier performance. All the parameters are the same as those in Fig. 4 with the exception of the number of trainers. It is seen that using only a few trainers the algorithm generates a classifier that can recognize foreign (non-target) image patches with virtual certainty. For example, a classifier trained with only twenty 20×20-pixel target-class image patches classified 96-percent of the previously unseen target-class patches correctly and misclassified less than 0.23-percent of the non-target patches. It is also seen that the overall performance improves as the number of trainers is increased, as expected.
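The patch-level evaluation above (test patches split evenly between the two classes, a patch accepted as background if any filter's correlation peak exceeds its relaxed threshold) can be sketched as follows. The linear relaxation form T = T(n)·(1 − b/100) is an assumption standing in for the paper's Eq. (13), and the function and variable names are illustrative:

```python
import numpy as np

def error_rates(scores_target, scores_nontarget, thresholds, b=30.0):
    """Percent false-negative / false-positive rates for a filter cascade.

    scores_*: arrays of shape (num_patches, num_filters) holding each test
    patch's peak correlation against each filter.  A patch is accepted as
    background when any filter's score exceeds its relaxed threshold."""
    relaxed = np.asarray(thresholds) * (1.0 - b / 100.0)  # assumed Eq. (13) form
    accepted_t = (scores_target > relaxed).any(axis=1)
    accepted_n = (scores_nontarget > relaxed).any(axis=1)
    fn = 100.0 * np.mean(~accepted_t)  # background patches wrongly rejected
    fp = 100.0 * np.mean(accepted_n)   # anomaly patches wrongly accepted
    return fn, fp
```

Raising b lowers every threshold, so more target patches are accepted (lower false-negative) while more non-target patches slip through (higher false-positive), matching the trend shown in Fig. 4.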
As the number of trainers is increased, the false-negative error decreases substantially at the expense of a slight increase in the false-positive error. Next, the effect of trainer spatial dimensions on the classifier performance is examined. The number of trainers was set at 40 in all cases while the image-patch dimensions were varied. The plots of Fig. 6 show error rates as functions of threshold relaxation with the trainer dimension used as the control parameter. It is seen that increasing the threshold relaxation leads to lower false-negative and higher false-positive error rates. It is also seen that for fixed relaxation the false-negative error increases with larger-dimension trainers. This is to be expected, as larger trainers result in more restrictive classifiers, which in turn explains the extremely low false-positive error rates for the 20×20-pixel and 30×30-pixel

Fig. 5. Effect of the number of trainers on the classifier performance. The false-negative (a) and false-positive (b) error rates are plotted as functions of relaxation (b) with the number of trainers (N) as the control parameter.

Fig. 6. Effect of trainer spatial dimensions on the classifier performance. The false-negative (a) and false-positive (b) error rates are plotted as functions of relaxation (b) with trainer dimensions (M×N) as the control parameter.

Fig. 7. Effect of number of trainers on the classifier performance. The classifier was trained using 25×25-pixel (a) and 15×15-pixel (b) trainers extracted from the ivy-leaves source image. The classifier was tested using image patches extracted from the ivy-leaves and deep-grass images of Fig. 2.

trainer cases. Indeed, the false-positive error rate for the classifier trained with 30×30-pixel patches is zero for all relaxation values of Fig. 6b and therefore is not displayed on the logarithmic scale. The classifier trained with forty 20×20-pixel patches and relaxation of 30-percent has a false-negative rate of 0.675-percent and a false-positive rate of 0.23-percent. Increasing the trainer dimension to 30×30 while keeping all other parameters the same results in an elevated false-negative rate of 1.8-percent, whereas the false-positive rate remains zero. The effect of the number of trainers on the classifier performance is examined next. The classifiers were trained using two distinct sets of training images comprised of square patches extracted from the ivy-leaves image of Fig. 2a. In the first case the trainers consisted of 25×25-pixel image patches and in the second case they consisted of 15×15-pixel patches. The threshold relaxation parameter was set at 30-percent for all test cases. The number of trainers for each scenario was varied from ten to forty, and for each case


the classifier was tested using one thousand test images comprised of randomly chosen image patches extracted from the target (ivy-leaves) and non-target (deep-grass) classes of Fig. 2. As before, the test images were equally divided between the two classes; the training and testing process was repeated one-hundred times for each scenario, and the detection results were averaged. The plots of Fig. 7 show the error rates as functions of the trainer population for each setting of the trainer dimension. It is noted that the classifier based on forty 25×25-pixel trainers achieves false-negative and false-positive error rates of 0.2-percent and zero, respectively. This means that the classifier, on average, correctly classified 99.8-percent of the test image patches extracted from the ivy-leaves image and rejected 100-percent of the ones extracted from the deep-grass image. The classifier based on forty 15×15-pixel trainers, on the other hand, achieved false-negative and false-positive error rates of 0.175-percent and 7.8-percent, respectively. In the case of the smaller trainers, as one increases the number of training elements, the volume in feature hyperspace encompassed by the classifier increases and inevitably includes non-target space. This accounts for the overall upward trend in the false-positive error rate. This phenomenon, however, diminishes for dimensionally larger training elements. The images of Fig. 8 show one of the trainers comprising the training set and all five filters comprising the classifier for a particular training scenario. The classifier is based on thirty 20×20-pixel trainers extracted from the ivy-leaves image of Fig. 2a, one of which is shown in the upper-left corner of Fig. 8. The cluster population and the margin value were set at five and zero, respectively. The training process resulted in a five-filter classifier

Fig. 8. The upper-left image is one of the thirty 20×20-pixel ivy-leaves trainers, and the other five images represent the five-filter classifier (SGMF). The filter orders increase from left to right and from top to bottom. Filters one through five subsumed 9, 7, 5, 5, 4 trainers and have threshold values of 0.5520, 0.4925, 0.6897, 0.4336, 0.3873, respectively.


(SGMF), all of which are shown in Fig. 8. The number of trainers utilized in the synthesis of each particular filter and the respective threshold values are listed in the caption of Fig. 8. Figs. 9–11 show additional examples that demonstrate the effectiveness of this technique for detecting and locating image anomalies. The original images for these examples were obtained from the Texture Database in [46]. Fig. 9 shows two types of skin where, for the purpose of this demonstration, Fig. 9a is the image of type-one skin and is designated as target, and Fig. 9b is the image of type-two skin and is designated as non-target. The target and non-target sample images shown in Fig. 9 have dimensions of 300×300 and 200×200 pixels, respectively. The image of Fig. 9c is the target corrupted by five small patches randomly extracted from the non-target image. The objective here is to obtain a classifier that is capable of detecting and locating all potential non-target intrusions of any type in previously untrained-on input images consisting of type-one skin. The classifier was trained using thirty randomly selected 15×15-pixel patches from the target class. In this example a cluster population of five and a relaxation parameter of 20-percent were used, and

Fig. 9. Images of type-one (a) and type-two (b) skin represent target and non-target classes. The image in (c) shows type-one skin corrupted by patches extracted from the type-two skin. The image in (d) is the result of the application of the filtering process to the image in (c).


Fig. 10. Images of net (a) and carpet (b) represent target and non-target classes. The image in (c) shows net corrupted by patches extracted from carpet. The image in (d) is the result of the application of the filtering process to the image in (c).

the training process resulted in a seven-filter SGMF classifier. The classifier was applied to the image of Fig. 9c and correctly detected all non-target image sections. The filtered image is shown in Fig. 9d, and close visual inspection of the corrupted image shows that the classifier has indeed detected and located all the skin anomalies successfully. The above example was repeated using type-two skin as the target class. A new classifier was trained using thirty 15×15-pixel samples randomly extracted from Fig. 9b. The input image was subsequently synthesized by transplanting into the image of Fig. 9b a few randomly selected patches extracted from the image of Fig. 9a. The experiment was repeated multiple times, and the classifier detected all intrusions in every trial. In the example of Fig. 10, a classifier was trained to distinguish one type of fabric (target class) from all other fabric types. Fig. 10a shows the image of a piece of net, which constitutes the target class, and Fig. 10b shows the carpet image, which represents non-target. The target and non-target sample images shown in Fig. 10 both have dimensions of 300×300 pixels. The classifier was trained using twenty 10×10-pixel patches randomly selected from Fig. 10a, a cluster


Fig. 11. Image of background (target-class) consisting of two skin types corrupted by anomalies comprised of net and carpet patches (a), and the filtered image (b).

population of four, and a relaxation parameter of 25-percent. The training process resulted in a three-filter classifier. In principle, this classifier is capable of distinguishing backgrounds comprised of net from all non-net intrusions. Three randomly selected 10×10-pixel patches were extracted from the carpet image and randomly transplanted into the net image. Fig. 10c and d show the corrupted net image and the filtered image, respectively. Close visual examination of the image in Fig. 10c shows that the classifier correctly located all non-net image sections. The example of Fig. 11 shows an image consisting of a background comprised of the two skin types of Fig. 9. The image also contains intrusions which were extracted from the net and carpet images of Fig. 10 and implanted at random locations. A classifier was trained using fifty 15×15-pixel patches obtained from the images of the two skin types of Fig. 9. Twenty-five training patches were randomly chosen from each skin type and collectively were called the background class. The classifier was trained using a cluster population of five and a relaxation parameter of 20-percent, resulting in a nine-round classifier. It is noted that in the training process no distinction is made between type-one and type-two skin; all training samples are considered members of the same class, namely background or target class. Fig. 11b shows the filtered image. It is noted that the classifier has correctly isolated all image regions that are not members of the background class. Representative data pertaining to the performance of the anomaly-detection classifiers for an assortment of background-anomaly combinations and various settings of the training parameters are listed in Table 1. As before, the false-negative error rate (FN) is the percentage of test images comprised of arbitrarily chosen square patches of the background source image that are misclassified as anomalies.
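The corruption procedure used throughout these experiments — cutting small square patches from the anomaly source at random positions and pasting them into the background image at random positions — can be sketched as below. This is a minimal sketch; the function name, parameters, and RNG handling are illustrative:

```python
import numpy as np

def transplant_patches(background, source, patch=10, count=3, seed=None):
    """Return a copy of `background` (a 2-D gray-scale array) with `count`
    square patches of side `patch`, cut from random locations in `source`,
    pasted at random locations.  The original image is left untouched."""
    rng = np.random.default_rng(seed)
    out = background.copy()
    for _ in range(count):
        sr = rng.integers(0, source.shape[0] - patch + 1)  # source corner
        sc = rng.integers(0, source.shape[1] - patch + 1)
        dr = rng.integers(0, out.shape[0] - patch + 1)     # destination corner
        dc = rng.integers(0, out.shape[1] - patch + 1)
        out[dr:dr + patch, dc:dc + patch] = source[sr:sr + patch, sc:sc + patch]
    return out
```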
The false-positive error rate (FP), on the other hand, is the percentage of arbitrarily chosen square patches of the anomaly source image that are misclassified as background. Although in the examples presented here all anomalies are rectangular, the computed SGMF classifier is capable of detecting arbitrarily shaped anomalies. Corrupting a test image with an arbitrarily shaped anomaly involves superimposing the anomaly on the test image at the desired location. This is tantamount to replacing the affected test image pixel


Table 1
Classifier performance.

Background       Anomaly         P    Q    b    FN     FP
Ivy leaves       Deep grass      20   20   30   4      0.23
                                 40   20   20   5.75   0
                                 50   20   25   3.02   0
                                 40   20   30   0.67   0.23
                                 40   15   30   0.2    7.8
                                 40   25   30   0.2    0
                                 40   30   30   1.8    0
Type-one skin    Type-two skin   30   15   30   0.32   0.26
Type-two skin    Type-one skin   30   15   30   0.35   37
Net              Carpet          20   10   25   0      0
Skin             Fabric          50   15   25   0      0

P represents the number of trainers, Q the spatial dimension of each (Q×Q) trainer, and b the relaxation parameter (percent); FN and FP denote, respectively, the percent false-negative and percent false-positive error rates.

values with the values of the respective pixels of the anomaly. As a result of this transplantation process, a rectangular region of the test image whose area is larger than the oddly shaped anomaly constitutes the anomalous region, even though some of its pixels are the original target-class pixels. Applying the SGMF to the test image will result in classification of the rectangular region circumscribing the foreign section as anomalous, and hence detection of the oddly shaped anomaly is achieved.

6. Summary and conclusions

This paper presents a practical algorithm for negative Fourier filtering of imagery data, where the objective is the removal of expected background rather than direct detection of the anomalies and intrusions which may constitute potential targets. In four challenging and varied scenarios, the classifier introduced in this paper worked extremely well and was able to remove the background and expose potential targets for further processing. In each case the classifier was trained using a few background samples, each consisting of a small square image patch randomly extracted from the background. The classifier was able to remove all the background pixels, most of which were not included in the training set, while leaving all other objects intact. Our approach could work (to varying degrees) with any good classifier that can be implemented in the Fourier-transform domain. The aim of this paper is to explain the concept and show that it works well using different images that all share the feature of making human search for the "anomalies" very difficult. The use of SGMFs for Fourier recognition and subsequent blanking of regions that are likely to belong to the expected class or classes has been shown to be a very powerful means of highlighting the unexpected in a scene. The salient contribution of the work presented here is the fusion of


Fourier filtering and Margin-Setting techniques for development of negative-filter classifiers for anomaly detection. Possible applications are immense, e.g., locating

- Defects in materials on a conveyor belt or produced as sheets.
- Objects floating in the sea.
- Marijuana growing in fields of other grass.
- Small patches of diseased or necrotic tissue in an otherwise healthy region.
- Downed airmen bobbing about in the sea.
- Defects in a largely-acceptable manufactured item.

References

[1] H.T. Nguyen, A. Smeulders, Robust tracking using foreground-background texture discrimination, International Journal of Computer Vision 69 (2006) 277–283.
[2] H.T. Nguyen, A. Smeulders, Fast occluded object tracking by a robust appearance filter, IEEE Transactions on Pattern Analysis and Machine Intelligence 26 (8) (2004) 1099–1104.
[3] Y. Sugaya, K. Kanatani, Extracting moving objects from a moving camera video sequence, in: Proceedings of the 10th Symposium on Sensing via Imaging Information, 2004, pp. 279–284.
[4] J. Cheng, J. Yang, Y. Zhou, H. Cui, Flexible background mixture models for foreground segmentation, Image and Vision Computing 24 (2006) 473–482.
[5] W.E. Grimson, C. Stauffer, R. Romano, L. Lee, Using adaptive tracking to classify and monitor activities in a site, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1998, pp. 22–29.
[6] C. Stauffer, W.E. Grimson, Adaptive background mixture models for real-time tracking, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, 1999, pp. 246–252.
[7] N. Friedman, S. Russell, Image segmentation in video sequences: a probabilistic approach, in: Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence, 1997, pp. 175–181.
[8] J. Kato, T. Watanabe, S. Joga, J. Rittscher, A. Blake, An HMM-based segmentation method for traffic monitoring movies, IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (9) (2002) 1291–1296.
[9] C.R. Wren, A. Azarbayejani, T. Darrell, A.P. Pentland, Pfinder: real-time tracking of the human body, IEEE Transactions on Pattern Analysis and Machine Intelligence 19 (7) (1997) 780–785.
[10] C. Ridder, O. Munkelt, H. Kirchner, Adaptive background estimation and foreground detection using Kalman-filtering, in: Proceedings of the International Conference on Recent Advances in Mechatronics, 1995, pp. 193–199.
[11] J. Rittscher, J. Kato, S. Joga, A. Blake, A probabilistic background model for tracking, in: Proceedings of the Sixth European Conference on Computer Vision, 2000, pp. 336–350.
[12] B. Stenger, V. Ramesh, N. Paragios, F. Coetzee, J.M. Buhmann, Topology free hidden Markov models: application to background modeling, in: Proceedings of the IEEE International Conference on Computer Vision, vol. 1, 2001, pp. 294–301.
[13] P.W. Power, J.A. Schoonees, Understanding background mixture models for foreground segmentation, in: Proceedings of Image and Vision Computing New Zealand, 2002, pp. 267–271.
[14] K. Toyama, J. Krumm, B. Brumitt, B. Meyers, Wallflower: principles and practice of background maintenance, in: Proceedings of the IEEE International Conference on Computer Vision, vol. 1, 1999, pp. 255–261.
[15] D.W.J. Stein, S.G. Beaven, L.E. Hoff, E.M. Winter, A.P. Schaum, A.D. Stocker, Anomaly detection from hyperspectral imagery, IEEE Signal Processing Magazine 19 (2002) 58–69.
[16] H. Kwan, S.Z. Der, N.M. Nasrabadi, Adaptive anomaly detection using subspace separation for hyperspectral imagery, Optical Engineering 42 (2003) 3342–3351.
[17] A. Goldman, I. Cohen, Anomaly detection based on an iterative local statistics approach, in: Proceedings of the Convention of the Electrical and Electronic Engineers in Israel, 6–7 September 2004, pp. 440–443.
[18] Q. Cheng, Y. Xu, E. Grunsky, Integrated spatial and spectrum method for geochemical anomaly separation, Natural Resources Research 9 (2000) 43–52.
[19] M. Markou, S. Singh, Novelty detection: a review, Part I: statistical approaches, Signal Processing 83 (2003) 2481–2497.
[20] M. Markou, S. Singh, Novelty detection: a review, Part II: neural network based approaches, Signal Processing 83 (2003) 2499–2521.
[21] S. Singh, M. Markou, An approach to novelty detection applied to the classification of image regions, IEEE Transactions on Knowledge and Data Engineering 16 (4) (2004) 396–407.
[22] H.A. Rowley, S. Baluja, T. Kanade, Neural network-based face detection, IEEE Transactions on Pattern Analysis and Machine Intelligence 20 (1) (1998) 23–38.
[23] M.H. Yang, D. Kriegman, N. Ahuja, Detecting faces in images: a survey, IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (1) (2002) 34–58.
[24] P. Viola, M.J. Jones, Robust real-time object detection, in: Proceedings of the IEEE Workshop on Statistical and Computational Theories of Vision, 2001.
[25] H. Schneiderman, T. Kanade, A statistical method for 3D object detection applied to faces and cars, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2000.
[26] P. Viola, M.J. Jones, Rapid object detection using a boosted cascade of simple features, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. I, 2001, pp. 511–518.
[27] P. Viola, M.J. Jones, Robust real-time face detection, International Journal of Computer Vision 57 (2) (2004) 137–154.
[28] E. Grossman, Automatic design of cascade classifiers, in: International IAPR Workshop on Statistical Pattern Recognition, 2004.
[29] M.M. Dundar, J. Bi, Joint optimization of cascaded classifiers for computer aided detection, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2007.
[30] M. Turk, A. Pentland, Eigenfaces for recognition, Journal of Cognitive Neuroscience 3 (1) (1991) 71–86.
[31] R. Ebrahimpour, E. Kabir, M.R. Yousef, Teacher-directed learning in view-independent face recognition with mixture of experts using single-view eigenspaces, Journal of the Franklin Institute 345 (2008) 87–101.
[32] A. Khoukhi, S.F. Ahmed, A genetically modified fuzzy linear discriminant analysis for face recognition, Journal of the Franklin Institute 348 (2011) 2701–2717.
[33] K. Heidary, H.J. Caulfield, Application of supergeneralized matched filters to target classification, Applied Optics 44 (1) (2005) 47–54.
[34] R.B. Johnson, K. Heidary, A unified approach for database analysis and application to ATR performance metrics, Proceedings of SPIE 7696 (2010) 1–20.
[35] A. VanderLugt, Signal detection by complex filtering, IEEE Transactions on Information Theory IT-10 (1964) 139–145.
[36] H.J. Caulfield, R. Haimes, Generalized matched filtering, Applied Optics 19 (1980) 181–183.
[37] H.J. Caulfield, M.H. Weinberg, Computer recognition of 2-D patterns using generalized matched filters, Applied Optics 21 (1982) 1699–1704.
[38] M.A.G. Abushagur, H.J. Caulfield (Eds.), Selected Papers on Fourier Optics, SPIE Milestone Series, vol. MS 105, SPIE – The International Society for Optical Engineering, 1995.
[39] E.R. Dougherty, J. Astola, Nonlinear Filters for Image Processing, Wiley-IEEE Press, 1999.
[40] S. Lototsky, R. Mikulevicius, B.L. Rozovski, Nonlinear filtering revisited: a spectral approach, SIAM Journal on Control and Optimization 35 (2) (1997) 435–461.
[41] S. Haykin, Communication Systems, fourth ed., Wiley, New York, 2001.
[42] H.J. Caulfield, K. Heidary, Exploring margin setting for good generalization in multiple class discrimination, Pattern Recognition 38 (8) (2005) 1225–1238.
[43] K. Heidary, H.J. Caulfield, Discrimination among similar looking, noisy color patches using margin setting, Optics Express 15 (1) (2007) 62–75.
[44] J.B. MacQueen, Some methods for classification and analysis of multivariate observations, in: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, University of California Press, 1967, pp. 281–297.
[45] T. Kanungo, D.M. Mount, N. Netanyahu, C. Piatko, R. Silverman, A.Y. Wu, An efficient k-means clustering algorithm: analysis and implementation, IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (2002) 881–892.
[46] University of Western Australia Texture Database, http://local.wasp.uwa.edu.au/pbrouke/texture.