Hence we can now give a formal definition of the global attention map for a given image, following the rational approach to the measurement of attention score:

Definition 1 (Multi-bitrate attention map). The multi-bitrate attention map that measures the attention score, following Postulates 1–7, at any spatial location P_i and any bitrate q of image reconstruction fidelity, is

\{A(P_i, q)\}_{P_i, q},   (1)

where the expected increase in utility I[P_i, t] provided by the allocation of attention to P_i at time t is, in the risk-averse setting r > 0 (cf. Eq. (5) in Appendix A),

I[P_i, t] = \sum_l p_{l,i} \left[ (p_{l,i})^r - (q_{l,i})^r \right],   (2)

and

A(P_i, q) = \sum_{t_{P_i}} I[P_i, t_{P_i}] / Q,   (3)

with the sum over times t_{P_i}, up to the given bitrate q, such that attention is directed to P_i at time t_{P_i}.
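As a concrete illustration, the accumulation in Eq. (3) can be sketched in a few lines of Python. The trace format, array shapes, and function name below are our own assumptions, not the authors' implementation: each attention allocation contributes its expected utility increase, normalized by Q, to the map from its bitrate onward.

```python
import numpy as np

def multi_bitrate_attention_map(shape, allocations, Q):
    """Sketch of Eq. (3). `allocations` is a list of (q_t, (row, col), I_t)
    tuples, ordered by increasing bitrate q_t, recording that attention was
    directed to (row, col) at the time reached at bitrate q_t, with expected
    utility increase I_t. Returns a dict mapping each bitrate q to the
    attention map A(., q) accumulated up to that bitrate."""
    A = np.zeros(shape)
    maps = {}
    for q_t, (r, c), I_t in allocations:
        A[r, c] += I_t / Q          # A(P_i, q) = sum_{t <= q} I[P_i, t] / Q
        maps[q_t] = A.copy()
    return maps

# Toy trace: attention visits two locations as the bitrate grows.
alloc = [(0.1, (0, 0), 2.0), (0.2, (1, 1), 1.0), (0.4, (0, 0), 1.0)]
maps = multi_bitrate_attention_map((2, 2), alloc, Q=2.0)
```

Note that the same location can hold different scores at different bitrates, which is the defining property of the multi-bitrate map.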
Fig. 1. Advertisement images and publicist areas of interest.
Fig. 2. Advertisement images and publicist areas of interest.
J.A. García et al. / Pattern Recognition Letters 31 (2010) 609–618
2.1. Model evaluation

The multi-bitrate attention map \{A(P_i, q)\}_{P_i,q} provides a computational attention score for each spatial location P_i at high- and low-quality versions of the image reconstruction. Next we show that a rational model of computational attention can be used to predict visual distinctness in a database of target scenes presented in (Toet et al., 1998, 2001; Garcia et al., 2001). The images used in the experiment are slides made during the DISSTAF (Distributed Interactive Simulation, Search and Target Acquisition Fidelity) field test, which was designed and organized by NVESD (Night Vision & Electro-optic Sensors Directorate, Ft. Belvoir, VA, USA) and held in Fort Hunter Liggett, California, USA (Toet et al., 1998). These slides depict 44 different scenes. Each scene represents a military vehicle in a complex rural background. The visibility of the targets varies throughout the stimulus set, mainly because of variations in the structure of the local background, the viewing distance, the luminance distribution over the target support (shadows), the orientation of the targets, and the degree of occlusion of the targets by vegetation.

Here we first compute the multi-bitrate attention map \{A(P_i, q)\}_{P_i,q} for each target scene in the database from the DISSTAF field test (see Figs. 2–5 in (Garcia et al., 2001)). Then we calculate the lowest bitrate, q*, of reconstruction fidelity for which the attention score of some point in the target area is in the upper quartile of the attention map \{A(P_i, q*)\}_{P_i} at bitrate q*. A small value of q* means that the computational model brings the attention onto the target using a low bitrate of picture quality, which corresponds to a high saliency of the target area.
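The q* criterion just described can be sketched as follows. The function and variable names are ours; `attention_maps` stands for \{A(P_i, q)\} computed at a discrete set of bitrates:

```python
import numpy as np

def saliency_bitrate(attention_maps, target_mask):
    """Return the lowest bitrate q at which some point of the target area
    reaches the upper quartile of the attention map A(., q), or None.
    attention_maps: dict bitrate -> 2-D array; target_mask: boolean array."""
    for q in sorted(attention_maps):
        A = attention_maps[q]
        threshold = np.percentile(A, 75)       # upper-quartile cut-off
        if A[target_mask].max() >= threshold:
            return q
    return None

# Toy example: the target pixel only becomes salient at bitrate 0.2.
A1 = np.array([[5.0, 1.0], [1.0, 0.0]])   # target not yet prominent
A2 = np.array([[5.0, 9.0], [1.0, 0.0]])   # target now in upper quartile
mask = np.array([[False, True], [False, False]])
q_star = saliency_bitrate({0.1: A1, 0.2: A2}, mask)
```

A small returned bitrate corresponds to a highly salient target, which is the quantity used below to rank visual distinctness.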
Hence a lower value of q* predicts a faster detection (due to the higher saliency) of the target in the cluttered scene; therefore the value q* calculated for each image can be used to rank order the visual distinctness of the targets.

Secondly, a psychophysical experiment is performed in which human observers estimate the visual distinctness of targets using the slides made during the DISSTAF field test. In this experiment, search times and cumulative detection probabilities were measured for nine military targets in complex natural backgrounds. A total of 64 civilian observers, aged between 18 and 45 years, participated in the visual search experiment; the procedure is described in (Toet et al., 1998, 2001). Search performance is usually expressed as the cumulative detection probability as a function of time, and it can be approximated by (Krendel and Wodinsky, 1960; Rotman et al., 1989; Waldman et al., 1991):
P_d(t) = \begin{cases} 0, & t < t_0 \\ 1 - \exp\left( -\dfrac{t - t_0}{q} \right), & t \ge t_0 \end{cases}   (4)
where P_d(t) is the fraction of correct detections at time t, t_0 is the minimum time required to respond, and q is a time constant. The curves are rank-ordered according to the area beneath their graphs (Toet et al., 1998). This subjective ranking induced by the psychophysical target distinctness is adopted as the reference rank order in the comparative study. Targets in a particular dataset have similar visual distinctness if they give rise to closely spaced cumulative detection curves that are similar according to a Kolmogorov–Smirnov test (Toet et al., 2001). In fact, the target images in the dataset are clustered into four groupings of targets with comparable visual distinctness (Garcia et al., 2001).

For the target dataset given by Figs. 2–5 in (Garcia et al., 2001), we have calculated the probability of correct classification P_CC for the rational attention model with risk aversion (r = 0.6) with respect to gambles on location-dependent attention. The evaluation function P_CC is defined by the fraction of correctly classified targets (using computational attention) with respect to the reference rank order (target distinctness measured by human observers):
Fig. 3. Advertisement images and publicist areas of interest.
P_CC = Number of correctly classified targets / Number of targets.
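The cumulative detection model of Eq. (4), and the area-under-curve ranking that defines the reference order, can be sketched as follows. The values of t_0 and the time constant (written `tau` here to avoid a clash with the bitrate q) are illustrative, not fitted DISSTAF data:

```python
import math

def p_detect(t, t0, tau):
    """Cumulative detection probability of Eq. (4)."""
    return 0.0 if t < t0 else 1.0 - math.exp(-(t - t0) / tau)

def detection_auc(t0, tau, t_max=60.0, dt=0.01):
    """Area beneath the cumulative detection curve on [0, t_max];
    a larger area means faster detection (Toet et al., 1998)."""
    n = int(t_max / dt)
    return sum(p_detect(i * dt, t0, tau) for i in range(n)) * dt

conspicuous = detection_auc(t0=1.0, tau=2.0)    # hypothetical easy target
hidden      = detection_auc(t0=5.0, tau=20.0)   # hypothetical hard target
```

Rank-ordering targets by these areas reproduces the kind of subjective reference order used in the comparative study.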
The rational model of computational attention (which follows Postulates 1–7) yields a high probability of correct classification (P_CC = 0.7272). This implies a correlation between human and model predictions of visual attention. Recall that the rational model of attention does not extract any visual feature such as color, intensity, or orientation; it is based only on the multi-bitrate attention map \{A(P_i, q)\}_{P_i,q} for each target scene.

Itti et al. (1999) applied Itti's model of human visual search, based on the concept of a "salience map" (Itti et al., 1998), to a wide range of target detection tasks using the DISSTAF images. The saliency of objects in the visual environment is encoded through a 2D map. In (Itti et al., 1999), low-level visual features (color, intensity, and orientation) are extracted in parallel from nine spatial scales, and the resulting feature maps are combined to yield three saliency maps. These, in turn, feed into a single saliency map (a 2D layer of integrate-and-fire neurons). Competition among neurons in this map yields a single winning location corresponding to the next attended target. Transiently inhibiting this location suppresses the currently attended location, causing the focus of attention to shift to the next most salient location. With respect to the predicted search times of Itti's attention model on the DISSTAF images, Itti et al. (1999) found a poor correlation between human and model search times (see Fig. 8 in (Itti et al., 1999)). This may be a consequence of the fact that Itti's attention model was originally designed not to find small, hidden targets, but
[Fig. 4 panels #1–#4: ATTENTION (left column) and SNRlog (right column) versus compression ratio, from 16:1 to 512:1, for the SPIHT, KAKADU, and JASPER coders.]
Fig. 4. (Left column) Normalized mean attention score within the areas of interest; (Right column) SNRlog measured on the important areas.
rather to find the few most obviously conspicuous objects in an image.
3. Evaluation of important information visibility

Figs. 1–3 show, in the left column, a dataset of advertisement images originally presented in (Mancas et al., 2007). The same figures also illustrate (in the right column) the respective important areas selected by the publicist, since they convey the main message to the potential consumer. Next we use this dataset to compare the visual efficiency of the advertisement when it is transmitted using
three different coders: the SPIHT coder (http://www.cipr.rpi.edu/research/SPIHT/spiht0.html), the KAKADU coder (http://www.kakadusoftware.com/), and JASPER (http://www.ece.uvic.ca/~mdadams/jasper/). The three transmission methods were applied without region-dependent quality of encoding.

Thus, for each of the advertisement images in Figs. 1–3, we first compute the multi-bitrate attention map \{A(P_i, q)\}_{P_i,q}, using each of the coding methods under analysis to obtain the advertisement reconstruction at bitrate q of reconstruction fidelity. Secondly, for each coding method we calculate the average attention score within the areas of interest provided by the publicist, for each bitrate q, based on the attention map \{A(P_i, q)\}_{P_i}. The mean attention score within the areas of interest is normalized by dividing by the average attention achieved in the advertisement reconstruction at bitrate q. Appendix B provides a specification of the algorithm to compute the mean attention score.

[Fig. 5 panels #5–#8: ATTENTION (left column) and SNRlog (right column) versus compression ratio, from 16:1 to 512:1, for the SPIHT, KAKADU, and JASPER coders.]

Fig. 5. (Left column) Normalized mean attention score within the areas of interest; (Right column) SNRlog measured on the important areas.
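The normalization step just described (it corresponds to steps 7–11 of the Appendix B algorithm) amounts to the following short sketch; the array names are ours:

```python
import numpy as np

def normalized_roi_attention(At, roi_mask):
    """Mean attention inside the publicist's areas of interest, divided by
    the mean attention over the whole reconstruction at the same bitrate."""
    return At[roi_mask].mean() / At.mean()

# Toy map: the single ROI pixel holds twice the global mean attention.
At = np.array([[4.0, 2.0],
               [1.0, 1.0]])
roi = np.array([[True, False],
                [False, False]])
score = normalized_roi_attention(At, roi)
```

A score above 1 means the areas of interest attract more attention than the image on average, which is the quantity plotted in the rate–attention curves.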
[Fig. 6 panels #9–#13: ATTENTION (left column) and SNRlog (right column) versus compression ratio, from 16:1 to 512:1, for the SPIHT, KAKADU, and JASPER coders.]
Fig. 6. (Left column) Normalized mean attention score within the areas of interest; (Right column) SNRlog measured on the important areas.
A high value of the mean attention within the areas of interest at reconstruction fidelity q means that the coding method brings the attention onto the regions important to the advertisement using a bitrate q of picture quality, which corresponds to a high saliency of the areas of interest provided by the publicist. Hence it predicts a faster detection (due to the higher saliency) of the important areas in the advertisement scene; therefore the mean attention score across bitrates can be used to rank order the important information visibility using SPIHT, KAKADU, and JASPER.

For each advertisement image in the dataset, Figs. 4–6 (left column) show the respective rate–attention curve for each of the three transmission methods, as given by the normalized mean attention score within the areas of interest across bitrates of reconstruction fidelity. Following these results, rate–attention curves predict a higher saliency of the important areas using SPIHT on advertisement images #1, #3, #4, #6, #9, #10, and #12. On the contrary, areas of interest would be detected faster using JASPER on images #2, #5, #7, #11, and #13, and using KAKADU on image #8.

We also perform an objective coder evaluation of the reconstruction fidelity by using the rate–SNRlog curves. Figs. 4–6 (right column) show the rate–SNRlog curves on the advertisement images given in Figs. 1–3. The bit rate ranges from 0.015625 bpp to 0.5 bpp. Although the SNRlog has a good physical and theoretical basis, this measure is often found to correlate poorly with subjective ratings because the human visual system does not process the image in a point-by-point manner but rather in a selective way, according to decisions made at a cognitive level, choosing specific data on which to make judgments and weighting those data more heavily than the rest of the image (Garcia et al., 2001). To overcome this problem, different weightings have been proposed, for example logarithmic and cube-root ones.
Furthermore, both preprocessing the pair of images and emphasizing their edge content have been suggested as well. Here we use a different approach to improve the correlation between subjective ratings and the SNRlog: the differences between the original advertisement image and its decoded outputs are evaluated only on the important areas provided by the publicist. Figs. 4–6 (right column) show rate–SNRlog curves on the thirteen test images using this approach to evaluate the SNRlog. These plots show that, at most bit rates, SPIHT decoded outputs achieve a better objective quality than KAKADU and JASPER decoded outputs for all the advertisement images.

From both the rate–attention and rate–SNRlog curves, we can conclude that the SPIHT coder is the overall winner, according to the new paradigm of seeing important information first, on images #1, #3, #4, #6, #9, #10, and #12. Regarding the other images (#2, #5, #7, #8, #11, and #13), SPIHT (without region-dependent quality of encoding) should improve its capabilities in terms of important information visibility across bitrates.
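Restricting the fidelity measure to the important areas can be sketched as follows. We assume the common definition SNRlog = 10 log10(signal power / MSE); the paper's exact normalization may differ in constants, so treat this as an illustration rather than the authors' measure:

```python
import numpy as np

def snr_log_roi(original, decoded, roi_mask):
    """SNRlog computed only on the pixels of the important areas."""
    x = original[roi_mask].astype(float)
    y = decoded[roi_mask].astype(float)
    mse = np.mean((x - y) ** 2)                 # distortion inside the ROI
    return 10.0 * np.log10(np.mean(x ** 2) / mse)

orig = np.array([[10.0, 20.0], [30.0, 40.0]])
good = np.array([[10.0, 21.0], [30.0, 40.0]])   # small error inside the ROI
bad  = np.array([[10.0, 28.0], [30.0, 40.0]])   # larger error inside the ROI
roi  = np.array([[True, True], [False, False]])
```

Errors outside the publicist's areas of interest are simply ignored, which is what improves the correlation with subjective judgments.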
4. Conclusions

The multi-bitrate attention map provides a computational attention score for each spatial location at high- and low-quality versions of the image reconstruction. The novelty of this map is that it allows distinct attention scores for the same spatial location at different picture qualities, avoids certain forms of behavioral inconsistency in the absence of a priori knowledge about the locations of interest, and is not tuned for only certain images.

We have evaluated the visual efficiency of each advertisement image in a dataset when it is reconstructed at high and low fidelity using three transmission methods. A high value of the normalized mean attention within the areas of interest at a given reconstruction fidelity q predicts that the coding method brings the attention onto the regions important to the advertisement using a bitrate q of picture quality. An objective coder evaluation of the reconstruction fidelity is achieved by using the rate–SNRlog curve on important areas provided by the publicist. From both rate–attention and rate–SNRlog curves we have shown (using a dataset of advertisement images) that a potential consumer may see important areas faster using SPIHT (without region-dependent quality of encoding) on a significant number of advertisement images, even though this transmission method should improve its capabilities in terms of important information visibility across bitrates.

But what are the limitations of the proposed approach? Here we have dealt with a computational approach to the rational characteristics of visual attention. The overall objective of developing a rational approach to attention is not to describe the ways in which the Human Visual System (HVS) actually behaves in making choices among possible locations of interest for allocating attention. Instead, we are interested in the aspects of rationality that seem to be present in the decision making of the HVS: at any time, a rational system should choose among candidate spatial locations so as to avoid certain forms of inconsistency.

Regarding future work, we are going to develop a new method to rank sets of fused and input images in order of important information visibility. The objective of image fusion is to represent relevant information from multiple individual images in a single image. Some fusion methods may represent important visual information more distinctively than others, thereby conveying it more efficiently to the human observer. We propose to rank order images fused by different methods according to the computational attention value of their visually important details.
First we need to compute, for each of the fused images, a multi-bitrate attention map, following a rational model of computational attention. From this attention map, we then calculate the average attention score within areas of interest (e.g., living creatures, man-made objects, and terrain features), for each bitrate. A high computed mean attention value within the areas of interest at any reconstruction fidelity corresponds to a high computational saliency of the areas of interest. The main advantages of this approach for comparative visual efficiency analysis are its simplicity and speed. We have to study whether the computational results agree with human observer performance, which would make the approach valuable for practical applications.

Also, we are going to develop a publicly available suite of Web-based comparative visual efficiency tools designed to facilitate comparison of fused images. In addition, we will provide an interface for comparative visual efficiency analysis, which, like all of the tools reported here, will be freely available to the scientific community.

Acknowledgments

The authors thank Dr. Alexander Toet (TNO Human Factors, Soesterberg, The Netherlands) for providing us with image data, search times, and cumulative detection probabilities from search experiments made during the DISSTAF field test. Thanks are due to the reviewers for their constructive suggestions.

Appendix A. Mathematical results

The following mathematical results were first presented in (Garcia et al., 2009). They are given here for the sake of completeness; the proofs are omitted but provided in (Garcia et al., 2009).

Proposition 1. A computational attention model that aspires to analyze the decision problem \{P, G, C, …\} at any time t in accordance with Postulates 1–5 should verify that:
(a) Degrees of belief about gray-level occurrence sets \{G_{l,i} : l \in L\} are represented in the form of finite probability distributions R_i \equiv \{p(G_{l,i} \mid P_i, t) : l \in L\}, with p(G_{l,i} \mid P_i, t) denoting the probability of gray level l in the neighborhood of location P_i that would result from the improvement in reconstruction fidelity achieved by allocating attention to P_i at time t;

(b) Numerical values attached to the consequences \{c_{l,i} : l \in L\}, foreseen if there exists a particular degree of reconstruction fidelity given by the gray-level occurrence set \{G_{l,i} : l \in L\}, are represented in the form of a utility function.

Proof. See (Garcia et al., 2009). □
Proposition 2. If Postulates 6 and 7 both hold, then a well-behaved utility function u for consequences must have one of the following three functional forms ("well-behaved" means local, twice differentiable, and such that \lim_{p \to 0} p\, u''(p)/u'(p) exists):

u(R_i, G_{l,i}) = \begin{cases} -(p_{l,i})^r & \text{if } r < 0, \\ \log p_{l,i} & \text{if } r = 0, \\ (p_{l,i})^r & \text{if } r > 0, \end{cases}

with R_i \equiv \{p_{l,i} : l \in L\} and p_{l,i} = p(G_{l,i} \mid P_i, t) being the probability of gray level l in the neighborhood of location P_i that would result from the improvement in visual quality achieved by allocating attention to P_i at time t. If r > 1, the resultant attentional model exhibits a risk-seeking posture with respect to "gambles" on location-dependent attention, whereas r < 1 implies risk-averse behavior regarding gambles on location-dependent attention.

Proof. See (Garcia et al., 2009). □
Proposition 3. Let q_{l,i} = p(G_{l,i} \mid t-1) be the probability of gray level l in the neighborhood of location P_i using the level of reconstruction fidelity given at time t-1 (i.e., before time t). Let p_{l,i} = p(G_{l,i} \mid P_i, t) be the probability of gray level l in the neighborhood of location P_i that would result from the improvement in visual quality achieved by allocating attention to P_i at time t. In a rational attention model for which Postulates 6 and 7 hold, the possible functional forms of the expected increase in utility provided by the allocation of attention to spatial location P_i at time t, when the initial probability distribution Q \equiv \{q_{l,i} : l \in L\} is strictly positive, are as follows:

\begin{cases} \sum_l p_{l,i} \left[ (q_{l,i})^r - (p_{l,i})^r \right] & \text{if } r < 0, \\ \sum_l p_{l,i} \log (p_{l,i}/q_{l,i}) & \text{if } r = 0, \\ \sum_l p_{l,i} \left[ (p_{l,i})^r - (q_{l,i})^r \right] & \text{if } r > 0, \end{cases}   (5)

where, if r > 1, the system exhibits a risk-seeking posture with respect to gambles on location-dependent attention, while r < 1 implies risk aversion. Risk neutrality is given by r = 1.

Proof. See (Garcia et al., 2009). □
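Eq. (5) can be checked numerically with a short sketch. The r = 0 branch is the Kullback–Leibler form implied by the log utility of Proposition 2; distribution values below are illustrative:

```python
import math

def utility_increase(p, q, r):
    """Expected increase in utility of Eq. (5) for distributions p (after
    attending P_i) and q (before time t), with risk parameter r."""
    if r == 0:
        return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    sign = 1.0 if r > 0 else -1.0
    return sign * sum(pi * (pi ** r - qi ** r) for pi, qi in zip(p, q))

p = [0.7, 0.2, 0.1]          # sharpened distribution after attending P_i
q = [1/3, 1/3, 1/3]          # uniform distribution before time t
gain = utility_increase(p, q, r=0.6)   # risk-averse setting used in Section 2
```

For this example the gain is positive, and it vanishes when p = q, in every branch.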
Appendix B. Attention algorithm

Definitions
- I_C^{R_i}: decoded output using the C coder at compression ratio R_i
- TW(I_C^{R_i}): wavelet transformed image for I_C^{R_i}
- Q_1, ..., Q_s: quantizers for TW(I_C^{R_i})
- ROI: regions of interest for the original image
- N: number of rows in the original image
- M: number of columns in the original image

PROCEDURE BITHIGH
- INPUT: I1 and I2 images
- OUTPUT: for I1 and I2, compute the number of different bits at each location (i, j); return the overall sum for all the spatial locations.
1. cnt ← 0
2. For each (i, j)
   (a) aux ← |I1(i, j)| xor |I2(i, j)|
   (b) For each bit b in aux
       i. if aux(b) is 1 then cnt ← cnt + 1 end if
   (c) end for
3. end for
4. return cnt
END PROCEDURE

PROCEDURE UTILITY
- INPUT: I1 and I2 images
- OUTPUT: the expected increase in utility between the I1 and I2 images.
1. factor1 ← 0
2. factor2 ← 0
3. For each (i, j)
   (a) factor1 ← factor1 + |I1(i, j)|
   (b) factor2 ← factor2 + |I2(i, j)|
4. end for
5. f1 ← (factor1)^0.6
6. f2 ← (factor2)^0.6
7. sum ← 0
8. For each (i, j)
   (a) sum ← sum + (|I1(i, j)| / factor1) · [ |I1(i, j)|^0.6 / f1 − |I2(i, j)|^0.6 / f2 ]
9. end for
10. return sum
END PROCEDURE

BEGIN
1. For each (i, j)
   (a) At(i, j) ← 0
2. end for
3. R_next ← maximum compression ratio − 1
4. R_previous ← maximum compression ratio
5. while (R_next ≥ R_objective)
   (a) For each (i, j)
       i. For each Q_k
          A. nbits ← BITHIGH[Q_k(TW(I_C^{R_next})), Q_k(TW(I_C^{R_previous}))]
          B. For each coefficient c_{l,m} in Q_k(TW(I_C^{R_next}))
             if wavelet coefficient c_{l,m} comes from spatial location (i, j)
                At(i, j) ← At(i, j) + UTILITY[Q_k(TW(I_C^{R_next})), Q_k(TW(I_C^{R_previous}))] / nbits
             end if
          C. end for
       ii. end for
   (b) end for
   (c) R_previous ← R_next
   (d) R_next ← R_next − 1
6. end while
7. mean ← 0; meanroi ← 0; npointsroi ← 0
8. For each (i, j)
   (a) if (i, j) ∈ ROI
       i. meanroi ← meanroi + At(i, j)
       ii. npointsroi ← npointsroi + 1
   (b) end if
   (c) mean ← mean + At(i, j)
9. end for
10. meanroi ← meanroi / npointsroi; mean ← mean / (N · M)
11. return meanroi / mean
END
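The two helper procedures translate directly to Python. The sketch below operates on flattened (1-D) integer images, with r = 0.6 as in the text; the per-pixel loops of the main procedure and the wavelet machinery are omitted:

```python
def bithigh(I1, I2):
    """BITHIGH: total number of differing bits between the magnitudes of
    corresponding samples of two images."""
    cnt = 0
    for a, b in zip(I1, I2):
        aux = abs(int(a)) ^ abs(int(b))     # xor of magnitudes
        cnt += bin(aux).count("1")          # count the set bits
    return cnt

def utility(I1, I2, r=0.6):
    """UTILITY: expected increase in utility between two images, i.e. the
    r > 0 branch of Eq. (5) with empirically estimated weights."""
    factor1 = sum(abs(int(a)) for a in I1)
    factor2 = sum(abs(int(b)) for b in I2)
    f1, f2 = factor1 ** r, factor2 ** r
    return sum(
        (abs(int(a)) / factor1) * (abs(int(a)) ** r / f1 - abs(int(b)) ** r / f2)
        for a, b in zip(I1, I2)
    )

bits = bithigh([3, 5], [1, 5])   # 3 xor 1 = 0b10 -> one differing bit
```

In the main procedure, UTILITY is divided by BITHIGH so that the attention gain is expressed per transmitted bit of refinement.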
References

Garcia, J.A., Fdez-Valdivia, J., Fdez-Vidal, X.R., Rodriguez-Sanchez, R., 2001. Computing Models for Predicting Visual Target Distinctness. SPIE Press, Bellingham, Washington, USA. PM-95.
Garcia, J.A., Fdez-Valdivia, J., Fdez-Vidal, X.R., Rodriguez-Sanchez, R., 2001. Information theoretic measure for visual target distinctness. IEEE Trans. Pattern Anal. Machine Intell. 23 (4), 362–383.
Garcia, J.A., Rodriguez-Sanchez, R., Fdez-Valdivia, J., 2004. Progressive Image Transmission: The Role of Rationality, Cooperation and Justice. SPIE Press, Bellingham, Washington, USA. PM-140.
Garcia, J.A., Rodriguez-Sanchez, R., Fdez-Valdivia, J., 2009. Axiomatic approach to computational attention. Pattern Recognition. doi:10.1016/j.patcog.2009.09.027.
Itti, L., Koch, C., Niebur, E., 1998. A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. Pattern Anal. Machine Intell. 20, 1254–1259.
Itti, L., Koch, C., 1999. Target detection using saliency-based attention. In: NATO SCI12 Workshop on Search and Target Acquisition, The Netherlands, vol. 3 (1), pp. 3–10.
Krendel, E.S., Wodinsky, J., 1960. Visual search in an unstructured visual field. J. Opt. Soc. Amer. 50, 562–568.
Mancas, M., 2007. Computational Attention: Towards Attentive Computers. Presses universitaires de Louvain, Belgium. ISBN: 978-2-87463-099-6.
Rotman, S.R., Gordon, E.S., Kowalczyk, M.L., 1989. Modeling human search and target acquisition performance: I. First detection probability in a realistic multitarget scenario. Opt. Eng. 28, 1216–1222.
Toet, A., Kooi, F.L., Bijl, P., Valeton, J.M., 1998. Visual conspicuity determines human target acquisition performance. Opt. Eng. 37 (7), 1969–1975.
Toet, A., Bijl, P., Valeton, J.M., 2001. Image dataset for testing search and detection models. Opt. Eng. 40 (9), 1756–1759.
Waldman, G., Wootton, J., Hobson, G., 1991. Visual detection with search: An empirical model. IEEE Trans. Systems Man Cybernet. 21, 596–606.