Information Fusion 11 (2010) 69–77
Fast natural color mapping for night-time imagery

Maarten A. Hogervorst, Alexander Toet

TNO Human Factors, P.O. Box 23, 3769ZG Soesterberg, The Netherlands
Article info

Article history: Received 31 October 2007; Received in revised form 5 August 2008; Accepted 17 June 2009; Available online 21 June 2009.

Keywords: Image fusion; Color mapping; False color; Nightvision
Abstract

We present a new method to render multi-band night-time imagery (images from sensors whose sensitive range does not necessarily coincide with the visual part of the electromagnetic spectrum, e.g. image intensifiers, thermal cameras) in natural daytime colors. The color mapping is derived from the combination of a multi-band image and a corresponding natural color daytime reference image. The mapping optimizes the match between the multi-band image and the reference image, and yields a nightvision image with a natural daytime color appearance. The lookup-table based mapping procedure is extremely simple and fast and provides object color constancy. Once it has been derived, the color mapping can be deployed in real-time to different multi-band image sequences of similar scenes. Displaying night-time imagery in natural colors may help human observers to process this type of imagery faster and better, thereby improving situational awareness and reducing detection and recognition times. © 2009 Elsevier B.V. All rights reserved.
1. Introduction

Night vision cameras are a vital source of information for a wide range of critical military and law enforcement applications related to surveillance, reconnaissance, intelligence gathering, and security. The two most common night-time imaging systems are low-light-level (e.g. image-intensified) cameras, which amplify the reflected visible to near-infrared (VNIR) light, and thermal infrared (IR) cameras, which convert thermal energy from the midwave (3–5 µm) or the long wave (8–12 µm) part of the spectrum into a visible image. Until recently a gray- or green-scale representation of nightvision imagery has been the standard. However, the increasing availability of multispectral imagers and sensor systems (e.g. [1–5]) has led to a growing interest in the (false) color display of night vision imagery [6–26]. In principle, color imagery has several benefits over monochrome imagery for surveillance, reconnaissance, and security applications. The human eye can only distinguish about 100 shades of gray at any instant. As a result, grayscale nightvision images are usually hard to interpret and may give rise to visual illusions and loss of situational awareness. Since people can discriminate several thousand colors defined by varying hue, saturation, and brightness, a false-color representation may facilitate nightvision image recognition and interpretation. For instance, color may improve feature contrast, which allows for better scene segmentation and object detection [27]. This may enable an observer to construct a
more complete mental representation of the perceived scene, resulting in better situational awareness. It has indeed been found that scene understanding and recognition, object identification, and reaction times all improve with color imagery compared to monochrome imagery [12,28–34]. Also, observers are able to selectively attend to task-relevant color targets and to ignore non-targets with a task-irrelevant color [35–37]. As a result, simply producing a false-color nightvision image by mapping multiple spectral bands into a three-dimensional color space already generates an immediate benefit, and provides a method to increase the dynamic range of a sensor system [38]. However, the quality of a color rendering is determined by the task at hand. Although general design rules can be used to assure that the information available in the sensor image is optimally conveyed to the observer [39], it is not trivial to derive a mapping from the various sensor bands to the three independent color channels, especially when the number of bands exceeds three (e.g. with hyperspectral imagers [40]). In practice, many tasks may benefit from a representation that renders the scene in daytime colors. Jacobson and Gupta [39,40] therefore advise using a consistent color mapping according to a natural palette. The use of natural colors facilitates object recognition by allowing access to stored color knowledge [41]. Experimental evidence indicates that object recognition depends on stored knowledge of the object's chromatic characteristics [41]. In natural scene recognition paradigms, optimal reaction times and accuracy are obtained for naturally (or diagnostically) colored images, followed by their grayscale versions, and lastly by their non-diagnostically false-colored versions [28,29,31,34,42]. When sensors operate outside the visible waveband, artificial color mappings generally produce false-color images whose chromatic characteristics do not correspond in any
intuitive or obvious way to those of a scene viewed under natural photopic illumination. As a result, this type of false-color imagery may disrupt the recognition process by denying access to stored knowledge. In that case observers need to rely on color contrast to segment a scene and recognize the objects therein. This may even lead to worse performance than with single-band imagery alone [43]. Experiments have indeed convincingly demonstrated that a false color rendering of night-time imagery which resembles natural color imagery significantly improves observer performance and reaction times in tasks that involve scene segmentation and classification [10,44–48], whereas color mappings that produce counterintuitive (unnatural-looking) results are detrimental to human performance [44,45,49]. One of the reasons often cited for inconsistent color mapping is a lack of physical color constancy [45]. Thus, the challenge is to give nightvision imagery an intuitively meaningful ("naturalistic") and stable color appearance, to improve the viewer's scene comprehension and enhance object recognition and discrimination [22]. Several techniques have been proposed to render night-time imagery in color (e.g. [15,26,50–52]). Simply mapping the signals from different night-time sensors (sensitive in different spectral wavebands) to the individual channels of a standard color display or to the individual components of perceptually de-correlated color spaces, sometimes preceded by principal component transforms or followed by a linear transformation of the color pixels to enhance color contrast, usually results in imagery with an unnatural color appearance (e.g. [13,23,25,49,53]). More intuitive color schemes may be obtained by using more elaborate false-color mappings [54], or by opponent processing through feedforward center-surround shunting neural networks similar to those found in vertebrate color vision [6,7,11,16–18,21,55,56]. Although this approach produces fused night-time images with optimal color contrast, the resulting color schemes remain arbitrary and are usually not related to the actual daytime color scheme of the scene that is registered. Toet [15] presented a color mapping that gives night-time imagery a natural color appearance. This method matches the first order statistical properties (mean and standard deviation) of nightvision imagery to those of a target daylight color image. As a result, the color appearance of the colorized nightvision image resembles that of the natural target image. The composition of the target image should therefore be similar to that of the nightvision scene (i.e. both images should contain similar details in similar proportions). When the composition (and thus the overall color statistics) of the target image differs significantly from that of the nightvision image, the resulting colorized nightvision image may look unnatural (the color scheme will be biased to that of the target image). For instance, the colorized nightvision image may appear too greenish if the target image contains significantly more vegetation than the nightvision image. Thus, when panning, object colors may change over time. Hence, this method has two properties which make it less suitable for practical implementation: it is computationally expensive and it does not provide consistent object colors (object colors depend on scene context). Zheng and Essock [57] recently introduced an improvement to Toet's [15] method.
This new ‘‘local-coloring” method renders a nightvision image segment-by-segment by using image segmentation, pattern recognition, histogram matching and image fusion. Specifically, a false-color image (source image) is formed by assigning the multi-band nightvision image to three RGB (red, green and blue) channels. A nonlinear diffusion filter is then applied to the false-colored image to reduce the number of colors. The final grayscale segments are obtained by using clustering and merging techniques. With a supervised nearest-neighbor paradigm, a segment can be automatically associated with a known ‘‘color scheme”. The statistic-matching procedure is merged with the histogram-
matching procedure to enhance the color mapping effect. Instead of extracting the color set from a single target image, the mean, standard deviation and histogram distribution of the color planes from a set of natural scene images are used as the target color properties for each color scheme. The target color schemes are grouped by their scene contents and colors, such as plants, mountains, roads, sky and water. Regarding computational complexity, the local-coloring method is even more expensive than Toet's [15] original color transform, since it involves time-consuming procedures such as nonlinear diffusion, color space transforms, histogram analysis and wavelet-based fusion. Most of the aforementioned color mapping techniques (1) are not focused on using colors that closely match the daytime colors, (2) do not achieve color constancy, and (3) are computationally expensive. Although Toet's [15] method yields natural color rendering, it is computationally expensive and achieves no color constancy. A recently presented extension of this method does achieve color constancy, at the expense of an increased computational complexity [57]. Here we will introduce a new, simple and fast method to consistently apply natural daytime colors to multi-band nightvision imagery. First, we will present some improvements to Toet's method [15] that yield consistent object colors and enable real-time implementation. Next, we present a new lookup-table based method that gives multi-band night-time imagery a consistent natural daytime color appearance. The method (for which a patent application is currently pending [58]) is simple and fast, and can easily be deployed in real-time. After explaining how the color transformation can be derived from a given multi-band sensor image and a corresponding daytime reference image, we will show how this color transformation can be deployed at night and implemented in real-time.
2. Natural color mapping

2.1. Statistical similarity

Toet [15] presented a method for applying natural colors to a multi-band night-time image. In this method certain statistical properties of a reference daytime image are transferred to the multi-band night-time image. First, two or three bands of a multi-band night-time image are mapped onto the RGB channels of a false-color image. The resulting false-color RGB nightvision image is then transformed into a perceptually de-correlated color space. In this color space the first order statistics (the mean and standard deviation of each channel) of a natural color image (the target scene) are transferred to the multi-band nightvision image (the source scene). The inverse transformation back to RGB space yields a night-time image with a daytime color appearance. Fig. 1 shows an example of this coloring scheme in which the two input night-time images (Fig. 1a and b) were taken with an NVG (night vision goggle) device in combination with respectively a low-pass and a high-pass filter, splitting the NVG sensitivity range into a short (wavelengths lower than 700 nm; Fig. 1a) and a long wavelength part (wavelengths higher than 700 nm; Fig. 1b). Fig. 1d shows the reference color image and Fig. 1e shows the colorized night-time image resulting from the application of Toet's [15] method. The main characteristic of this coloring scheme is the use of global statistical image properties. Note that the target image need not represent the same scene as the source image; it should merely be similar in composition and content. Toet's [15] color transformation is computationally expensive. More importantly, it does not provide color constancy for dynamic imagery [57].
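For reference, the statistics transfer at the core of this scheme can be sketched as follows. This is a minimal Python/NumPy illustration under our own naming (not the authors' implementation); it assumes the false-color RGB image has already been formed and uses the RGB-to-lαβ conversion of Ruderman et al. [59]:

import numpy as np

# RGB -> LMS -> log LMS -> l-alpha-beta, following Ruderman et al. [59]
RGB2LMS = np.array([[0.3811, 0.5783, 0.0402],
                    [0.1967, 0.7244, 0.0782],
                    [0.0241, 0.1288, 0.8444]])
LMS2RGB = np.linalg.inv(RGB2LMS)
B = np.array([[1/np.sqrt(3), 0, 0],
              [0, 1/np.sqrt(6), 0],
              [0, 0, 1/np.sqrt(2)]]) @ np.array([[1,  1,  1],
                                                 [1,  1, -2],
                                                 [1, -1,  0]])

def rgb_to_lab(img):                       # img: H x W x 3, values in (0, 1]
    lms = np.clip(img @ RGB2LMS.T, 1e-6, None)
    return np.log10(lms) @ B.T

def lab_to_rgb(lab):
    lms = 10.0 ** (lab @ np.linalg.inv(B).T)
    return np.clip(lms @ LMS2RGB.T, 0.0, 1.0)

def transfer_statistics(source_rgb, target_rgb):
    """Give the false-color source the first-order lab statistics of the target."""
    src, tgt = rgb_to_lab(source_rgb), rgb_to_lab(target_rgb)
    out = np.empty_like(src)
    for c in range(3):                     # match mean and standard deviation per channel
        s_mu, s_sd = src[..., c].mean(), src[..., c].std()
        t_mu, t_sd = tgt[..., c].mean(), tgt[..., c].std()
        out[..., c] = (src[..., c] - s_mu) * (t_sd / (s_sd + 1e-9)) + t_mu
    return lab_to_rgb(out)

Because the transferred means and standard deviations are global image statistics, the color assigned to an object changes whenever the overall scene content changes, which is the constancy problem discussed next.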
Fig. 1. Visible (a) and near-infrared (b) band of an NVG image. (c) False color image obtained by assigning (a) to the green and (b) to the red channel of an RGB color image (the blue channel is set to zero). The inset in this figure represents all possible combinations of signal values that can occur in the two sensor bands. Upper left: both sensors give zero output; lower right: both sensors give maximal output. (d) Reference daylight color image. (e) Result of applying Toet’s [15] method to transfer the first order statistics of the reference image (d) to the false color night-time image (c). The inset in this figure represents the result of applying the same transform to the inset of (c), and shows all possible colors that can occur in the recolored sensor image (i.e. after applying the color mapping) (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.).
When the algorithm is applied to a sequence of images (e.g. from a panning camera), the image statistics vary over time. As a result objects change their color over time, since the color of an object depends on its context. Large objects will determine the color scheme to a large extent. We propose to improve this method by using a fixed color transformation that yields consistent object colors. One way to implement this efficiently is by using color lookup tables [39,40,58]. In this approach the color mapping is derived for a subset (or all possible) multi-band sensor outputs. This (sub)set of multi-band sensor outputs forms the reference set. The color attributed to any multi-band sensor value is equal to the color corresponding to the reference sensor output that is most similar to the actual sensor output. This idea is easily implemented using standard color lookup table techniques (which also allows standard dithering techniques to be used). The method is illustrated in Fig. 1 for a multi-band image consisting of two channels. First, the two sensor images are fed into two of the three channels of an RGB multi-band image. In this example (Fig. 1c) the short wavelength part is (arbitrarily) mapped to R (red) and the long wavelength part is mapped to G (green). The result is a false-color representation of the multi-band image (Fig. 1c). The inset in Fig. 1c represents all possible multi-band signal combinations as different shades of red, green and yellow. The inset in Fig. 1e shows the result of the application of Toet's [15] color transformation (derived from respectively Fig. 1a, b and d) to the inset of Fig. 1c. For any color in the false-color representation of Fig. 1c the corresponding color after the application of Toet's [15] statistical color transformation can now easily be found by (1) locating the color in the inset in Fig. 1c and (2) finding the corresponding color in the transformed inset in Fig. 1e. For instance, a pixel representing a high response in the short wavelength channel and a low response in the long wavelength channel corresponds to a pixel with a red color (high value of red, low value of green) in the inset in Fig. 1c. In the inset of Fig. 1e the same pixel appears in a brownish color. The color transformation can thus be implemented by turning the pair of insets in Fig. 1c and e into color lookup tables.
Then, the false-color image Fig. 1c can be transformed into an indexed image using the red–green color lookup table (the inset of Fig. 1c). The color mapping that transforms Fig. 1c into Fig. 1e can then simply be applied by replacing the color lookup table of the indexed image Fig. 1c by the transformed color lookup table (the inset of Fig. 1e). The result of this action is again Fig. 1e. Note that the color mapping scheme is fully defined by the two color lookup tables (represented by the two insets in Fig. 1c and e). When all possible combinations of an 8-bit multi-band system are represented, these color lookup tables contain 256 × 256 entries. When a color lookup table with fewer entries is used (e.g. only 256), the color mapping can be achieved by determining the closest match of the table entries to the observed multi-band sensor values. Once the color mapping has been derived from a multi-band night-time image and its corresponding reference image, and once it has been defined as a lookup table transform, it can be applied to different multi-band images. The advantage of this method is that the color of objects depends only on the multi-band sensor values and is independent of the rest of the image content. Therefore, objects keep the same color over time when registered with a moving camera. Another advantage of this implementation is that it requires minimal computing power. Once the color transformation has been derived and the pair of color lookup tables that defines the mapping has been created, the new color lookup table transform can be used in a (real-time) application.

2.2. Method based on samples

Next, we will describe a new method for applying natural colors to multi-band images. Instead of using the statistical properties of natural images, the color transformation is derived from a set of samples for which both the multi-band sensor values and the corresponding natural color (RGB-value) are known [58]. We will
show that this method results in rendered multi-band images with colors that match the daytime colors more closely than the result of the statistical approach. In contrast to the statistical method, the derivation of the color mapping requires a multi-band image and a daytime reference image that are in exact correspondence (the pixels serve as samples in this case). Once the color mapping has been derived it can be applied to different multi-band night-time images. Again, we will implement the color transformation using a color lookup table transformation, thus enabling real-time implementation. The problem can be formalized as follows: given a set of samples for which both the multi-band sensor output and the corresponding daytime colors are known, derive a transformation that optimally maps the N-dimensional (e.g. in our examples N = 2) multi-band sensor output vectors (one for each sample) to the 3-D vectors corresponding to the daytime colors (RGB). The mapping should minimize the difference between the modeled colors and the measured colors. Moreover, the transformation should also predict the mapping of untrained samples (i.e. it should generalize). Several methods exist to derive a suitable mapping, such as neural networks and support vector machines. What constitutes a suitable mapping is determined by the function that is minimized; the requirement that the difference between the modeled and measured colors is minimized therefore has to be made precise. We will minimize the average perceptual color difference between the modeled color and the measured color. More precisely, we will minimize the average squared distance between the corresponding perceptual color vectors in lαβ space (see [59]).
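Formally, writing $s_i$ for the $i$-th multi-band sensor output vector, $c_i$ for the corresponding daytime reference color, and $T_{l\alpha\beta}$ for the conversion from RGB to the perceptually de-correlated $l\alpha\beta$ space of [59], the mapping $f$ is chosen to (approximately) minimize

$\frac{1}{M}\sum_{i=1}^{M}\left\| T_{l\alpha\beta}\big(f(s_i)\big) - T_{l\alpha\beta}(c_i) \right\|^{2}$

over the $M$ available samples (our notation; the paper states this criterion only in words).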
We will describe a (relatively) simple implementation that is not aimed at finding the theoretically optimal mapping, but that leads to robust, good results and can be understood intuitively. We will now describe our new method for deriving a natural color transformation using the example shown in Fig. 2. Fig. 2a depicts the daytime reference, in this case a color photograph taken with a standard digital camera. We simulate a two-band night-time sensor containing a visible and a near-infrared band (similar to the sensor system depicted in Fig. 1). The first sensor band is obtained by taking a picture with a digital image intensifier, with a filter in front of the lens that transmits only wavelengths lower than 700 nm (Fig. 2b). The second band is obtained by using a filter that cuts off all wavelengths lower than 700 nm (Fig. 2c). Fig. 2f shows the result of applying daytime colors to the two-band night-time sensor image using our new color mapping technique. The method works as follows. First, the multi-band sensor image is transformed to a false-color image by taking the individual bands (Fig. 2b and c) as input to the R and G channels (and B when the sensor contains three bands), referred to as the RG-image (Fig. 2e). In practice any other assignment of the bands to two of the three channels can also be used. Mapping the two bands to a false-color RGB-image allows us to use standard image conversion techniques, such as indexing.
Fig. 2. (a) Natural daylight color reference image. Visible (b) and near-infrared (c) band NVG images of the same scene. (d) The color mapping derived from corresponding pixel pairs in (a–c). (e) Combined RG false color representation of (b) and (c), obtained by assigning (b) to the green and (c) to the red channel of an RGB color image (the blue channel is set to zero). (f) Result of the application of the mapping scheme in (d) to the two-band false color image in (e). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
In the next step the resulting false-color image (RG-image; Fig. 2e) is converted to an indexed image. Each pixel in such an image contains a single index. The index refers to an RGB-value in a color lookup table (the number of entries can be chosen by the user). In the present example of a sensor image consisting of two bands (R and G; Fig. 2e) the color lookup table contains various combinations of R and G values (the B-values are zero when the sensor or sensor pair provides only two bands). For each index representing a given R,G combination (a given false color) the corresponding natural color equivalent is obtained by locating the pixels in the indexed false-color image with this index and finding the corresponding pixels in the (natural color) reference image (Fig. 2a). First, the RGB-values of these reference pixels are converted to perceptually de-correlated lαβ values (see [59]). Next, we calculate the average lαβ-vector over this ensemble of pixels. This assures that the computed average color reflects the perceptual average color. Averaging automatically takes the distribution of the pixels into account: colors that appear more frequently are attributed a greater weight. Let us for instance assume that we would like to derive the natural color associated with index 1. In that case we locate all pixels in the (indexed) false color multi-band image with index 1. We then take all corresponding pixels in the reference daytime color image, convert them to lαβ, and calculate the average lαβ-value. Next, we transform the resulting average lαβ-value back to RGB. Finally, we assign this RGB-value to index 1 of the new color lookup table. These steps are carried out for all indices. This process yields a new color lookup table containing the natural colors associated with the various multi-band combinations in the false color (RG) color lookup table.
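A compact sketch of this derivation is given below (our own Python/NumPy illustration, not the authors' implementation; it reuses the hypothetical rgb_to_lab and lab_to_rgb helpers from the earlier sketch, and replaces the indexed-image conversion by a simple uniform quantization of the two 8-bit bands):

import numpy as np

def derive_color_lut(band_lo, band_hi, reference_rgb, levels=32):
    """Build the natural-color lookup table from registered sample images.

    band_lo, band_hi : uint8 arrays (H x W), the two sensor bands
    reference_rgb    : float array (H x W x 3), daytime colors in [0, 1]
    levels           : quantization steps per band (the LUT has levels**2 entries)
    """
    # Quantize each band and combine the two codes into a single index per pixel.
    q_lo = (band_lo.astype(np.int32) * levels) // 256
    q_hi = (band_hi.astype(np.int32) * levels) // 256
    index = q_lo * levels + q_hi                      # the indexed "RG" image

    ref_lab = rgb_to_lab(reference_rgb).reshape(-1, 3)
    flat_index = index.ravel()

    lut = np.zeros((levels * levels, 3))
    for i in range(levels * levels):
        members = ref_lab[flat_index == i]
        if len(members):
            # Average in lab space, so frequently occurring colors dominate perceptually.
            lut[i] = lab_to_rgb(members.mean(axis=0).reshape(1, 1, 3)).ravel()
    return lut, index

def apply_color_lut(band_lo, band_hi, lut, levels=32):
    """Colorize a new two-band image with a previously derived table."""
    q_lo = (band_lo.astype(np.int32) * levels) // 256
    q_hi = (band_hi.astype(np.int32) * levels) // 256
    return lut[q_lo * levels + q_hi]                  # H x W x 3 natural-color image

The per-index averaging in lαβ space is the essential step; the paper's implementation obtains the indices through standard indexed-color conversion (optionally with dithering), but the idea is the same.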
Replacing the RG-color lookup table (left side of Fig. 2d) by the color lookup table with natural colors (right side of Fig. 2d) yields a two-band image with a natural color appearance, in which the colors are optimized for this particular sample set (Fig. 2f). Note that the red parts in the scene in Fig. 2a do not turn out red again in the rendered night-time image Fig. 2f. This is due to the fact that other materials, which occupy a larger area of the scene (and which therefore dominate the color mapping), give the same sensor output in the two bands. Also, the red flags are not apparent in the visible band (Fig. 2b). This has only a minor effect on the overall appearance of the scene as long as the parts that change between the different band recordings occupy only a relatively small area. The colors in the resulting rendered two-band image Fig. 2f closely match the colors in the reference image Fig. 2a. This is due to the fact that the color mapping has been optimized for (derived from) this particular reference image (i.e. for this environment). Fig. 3 shows that the new color mapping method can also produce a natural color representation for multi-band night-time images of scenes that differ from the one for which the transform has been derived. Images in the left column of Fig. 3 represent the same scene as shown in Fig. 2. Images in the middle column of Fig. 3 represent a different part of the same environment and result from a panning operation. The right column in Fig. 3 shows images taken in a different environment. The top row of Fig. 3 shows the result of applying the color transformation derived in Fig. 2 (for the first scene) to the two-band night-time images of these different scenes (i.e. using exactly the same color lookup tables to transform these different scenes). The second row of Fig. 3 shows the result of the color mapping that was derived for the third scene. The bottom row of Fig. 3 shows the actual daytime color photographs of the three scenes. Evidently, applying the color mapping to the scene for which it has been derived yields the best match between the colorized night-time images and the corresponding daytime color photographs, as can be seen by comparing Fig. 3a and g, and Fig. 3c and i.
Fig. 3. Demonstration of color consistency of the new mapping method. Left column (a, d, g): the same scene as shown in Fig. 2. Middle column (b, e, h): a different part of the environment shown in the left column, resulting from a panning operation. Right column (c, f, i): images taken in a different environment. Top row (a–c): results of applying the color transformation derived for the first scene (see Fig. 2) to the two-band night-time images of all three scenes. Middle row (d–f): results of applying the color mapping derived for the third scene to the two-band night-time images of all three scenes. Bottom row (g–i): the actual daytime color photographs of the three scenes. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
The colors in Fig. 3b are very similar to the colors in Fig. 3a. This shows that the object colors remain invariant, even under panning (and even more so than in the daytime color photographs in this case: compare Fig. 3g and h). Fig. 3c represents a scene that is different from the one shown in Fig. 3a. Still, Fig. 3c shows that the color mapping attributes similar colors to similar objects. However, the colors differ from those observed in the daytime color reference image for this figure (Fig. 3i). For instance, concrete is less reddish in the colorized night-time image than in the daytime equivalent. This is a result of the fact that the nightvision sensor may represent different objects and materials by the same sensor values, even when their colors differ in daylight conditions (so-called metamers). Such details will be assigned the same color when applying the color mapping scheme. This will particularly be the case for man-made objects (e.g. cars) and artificial materials (e.g. concrete). In the case of Fig. 3c the difference may be attributed to differences in the lighting conditions or differences in camera settings. Fortunately, most practical applications do not require an exact color representation. Most important is that the color is characteristic of the material, enabling fast scene recognition and interpretation (diagnostic colors; i.e. grass should be green, but the exact shade of green is not important [28,29,31]). The second row of Fig. 3 shows the result of the color mapping derived for the third scene. As a result the colors of Fig. 3f most closely resemble those of its daytime counterpart (Fig. 3i). Fig. 3d and e contain colors that differ slightly from the colors in their corresponding daytime references (Fig. 3g and h). This is because the reference images depict the scenes under slightly different lighting conditions. However, most importantly, the different materials in the scenes can easily be identified in all cases. In the previous examples a two-band NVG system was simulated using images recorded during daytime. In a more critical test we used real NVG images recorded at night. Reference images showing the daytime colors were taken during daytime from the same viewpoint. Fig. 4 shows two NVG images obtained with a filter transmitting wavelengths lower than 700 nm (Fig. 4a) and higher than 700 nm (Fig. 4b). Fig. 4c shows the red–green false color representation of the two-band NVG image.
Fig. 4d shows the daytime reference image corresponding to the multi-band sensor image. Straightforward application of our method results in Fig. 4e. The colors closely match the daytime colors (e.g. the sky is blue). However, the image looks noisy and certain objects appear in the wrong color (e.g. the bench and parts of the roof). This is due to the fact that the luminance in the colorized image does not increase monotonically with increasing sensor output (the luminance value in Fig. 4c). This gives an undesirable "solarizing" effect. We therefore derived from this color map (inset in Fig. 4e) another color map (inset in Fig. 4f) in which the luminance increases linearly with the distance to the top-left corner. Fig. 4f shows the result of this new color mapping. The colors again match the daytime colors, except that the sky is now dark instead of light blue. This corresponds to the intuition that the sky should look dark at night, and does not affect the situational awareness. Also important is that the second color transformation (Fig. 4f) is smooth, in contrast to the first one (Fig. 4e). Intuitively, a small variation in sensor output should lead to a small color change, i.e. a smooth color transformation is expected to lead to better matching colors when applied to other multi-band sensor images. Also, with smooth color transformations noise leads to less clutter. Fig. 5 shows the result of applying the same coloring scheme to different multi-band sensor images. Again, the colors do not always match the daytime colors, but are still characteristic of the different materials displayed in the scene. Thus, the colorized fused image facilitates interpretation of the scene (situational awareness). This is clear when comparing a normal NVG image (Fig. 4b) with a two-band color fused image (Fig. 4f).
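A possible implementation of the luminance adjustment used for Fig. 4f is sketched below (our own construction; the paper does not specify which luminance definition or smoothing kernel was used, so the BT.601 luma weights and the box filter are assumptions). Each entry of the derived lookup table is rescaled so that its luminance grows linearly with the distance of the corresponding sensor-value pair to the table origin (where both sensors give zero output), after which the table is optionally smoothed:

import numpy as np
from scipy.ndimage import uniform_filter

def linearize_lut_luminance(lut, levels=32, smooth=3):
    """Rescale LUT colors so luminance increases with overall sensor output.

    lut    : (levels*levels, 3) RGB table in [0, 1], index = q_lo * levels + q_hi
    smooth : size of the optional box filter applied over the 2-D table (0 = none)
    """
    grid = lut.reshape(levels, levels, 3).copy()

    # Target luminance: proportional to the distance of (q_lo, q_hi) to the origin
    # of the table, i.e. the corner where both sensors give zero output.
    q = np.arange(levels) / (levels - 1)
    lo, hi = np.meshgrid(q, q, indexing="ij")
    target_luma = np.sqrt(lo**2 + hi**2) / np.sqrt(2.0)

    # Current luma (BT.601 weights) and per-entry rescaling of the RGB values.
    luma = 0.299 * grid[..., 0] + 0.587 * grid[..., 1] + 0.114 * grid[..., 2]
    gain = target_luma / np.maximum(luma, 1e-6)
    grid = np.clip(grid * gain[..., None], 0.0, 1.0)

    if smooth:
        for c in range(3):                      # smooth each color plane of the table
            grid[..., c] = uniform_filter(grid[..., c], size=smooth)
    return grid.reshape(-1, 3)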
Fig. 4. Two NVG images obtained with a filter transmitting wavelengths lower than 700 nm (a) and higher than 700 nm (b). (c) Combined red–green false color representation of (a) and (b). (d) Daytime reference color image. (e) Result of straightforward application of the new lookup table transform method. (f) Result after smoothing and linearising the color lookup table. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Fig. 5. Results of applying the coloring scheme from Fig. 4f to different multi-band sensor images. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
2.3. Practical use

In practice the method will be used at night, when no daytime color reference images are available. The idea is to derive different color mapping schemes in advance for different environments. At night an appropriate color mapping scheme should then be selected to colorize the night-time images. Once the color transformation has been derived, applying it consists of (i) creating a false-color image (RG-image, see Fig. 2e), (ii) converting this image to an indexed image using the RG-color table, and (iii) replacing the color lookup table by its daytime equivalent (RGB-color table, see Fig. 2d, right). The whole transformation is defined by the two color lookup tables (the RG-color table and the RGB-color table). The software implementation can be very fast. Different environments require different color mapping schemes to obtain a correct color representation. The sample images used to derive the color mapping scheme should reflect the statistics of the environment in which the mapping will be deployed. In practice the ratio between the sensor outputs is characteristic of different materials. This becomes apparent when one inspects the color map images (e.g. Fig. 2d, right side) corresponding to the optimal color mapping of different reference and test images. In those images the hue varies little along straight lines through the top-left corner (lines for which the ratio between the two sensor outputs has a constant value). This feature can be used when deriving a color mapping from a limited number of samples. Also, the color mapping (represented by Fig. 2d, right side) can be expected to be smooth, i.e. from point to point the color variations will be smooth. When a smooth color mapping scheme is used, more subtle differences between sensor outputs will become visible.
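At run time the colorization thus reduces to a table lookup per pixel. When the table has fewer entries than there are possible sensor-value combinations, each pixel is assigned the color of the nearest table entry, as noted in Section 2.1. A minimal sketch of this lookup (our own illustration; variable names are ours):

import numpy as np

def colorize_frame(band_lo, band_hi, ref_outputs, ref_colors):
    """Colorize one two-band frame by nearest-entry lookup.

    ref_outputs : (K, 2) reference multi-band sensor values (the RG table entries)
    ref_colors  : (K, 3) natural colors assigned to those entries (the RGB table)
    """
    pixels = np.stack([band_lo, band_hi], axis=-1).reshape(-1, 2).astype(np.float32)
    # Index of the closest reference entry for every pixel (brute force for clarity;
    # a k-d tree or a precomputed 256 x 256 table would be used in a real-time system).
    d2 = ((pixels[:, None, :] - ref_outputs[None, :, :]) ** 2).sum(axis=-1)
    nearest = d2.argmin(axis=1)
    return ref_colors[nearest].reshape(band_lo.shape + (3,))

When the full 256 × 256 table is precomputed, the per-frame work reduces to a single integer index per pixel, which is what makes a real-time software implementation feasible.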
2.4. Accentuating potential targets

Camouflaged targets (e.g. persons or vehicles in military colors) may become indistinguishable from their local background when natural colors are applied to a multi-band sensor image. An example of this effect is shown in Fig. 6, which presents an intensified image (Fig. 6a) and a thermal infrared (8–12 µm) image (Fig. 6b) of a person wearing a military battle dress with a camouflage pattern against a rural background. Fig. 6c shows the two-band false color image that is constructed by mapping the images from Fig. 6a and b to respectively the R and G channels of an RGB color image (the B channel was set to zero). Fig. 6d shows the full color daytime reference image (a standard digital color photograph). Using our coloring method (presented in Section 2.2) we derived the color transformation that results in an optimal match between the colors in the reference image (Fig. 6d) and the multi-band sensor values (Fig. 6c). The color transformation for all sensor combinations is represented by the insets of Fig. 6c and e, while Fig. 6e shows the result of the color mapping. Note that the colors in the colorized multi-band image (Fig. 6e) closely match the colors in the reference image (Fig. 6d).
Fig. 6. Result of the application of a natural color mapping to the combination of two monochrome night-time images of the same scene. (a) Intensified visual image and (b) thermal (8–12 µm) image of a rural scene with a person standing in the foreground. (c) Two-band false color image obtained by mapping the images from (a) and (b) to respectively the G and R channels of an RGB color image (the B channel was set to zero). (d) A daylight color picture of the same scene. (e) Result of the application of the color mapping defined by its inset to the two-band night-time image in (c). (f) Result of the application of the alternative color mapping defined by its inset to the two-band night-time image in (c). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
However, since the person wears clothing in camouflage colors, he is also camouflaged in the colorized night-time image. Although the person is clearly visible in the thermal image (Fig. 6b), he can hardly be detected in the colorized night-time image. To make the person more salient in the colorized night-time image, a color transformation can be used that depicts hot items (which usually are potential targets like vehicles or living beings) in red. For this purpose we created an alternative color mapping by manipulating the inset of Fig. 6e. The resulting lookup table is depicted in the inset of Fig. 6f. Fig. 6f shows the result of the application of this color transformation to the false color two-band night-time image in Fig. 6c. Hot items are now shown in red while the (relatively cold) background is still depicted in natural colors. In this image representation, the naturally colored background provides the context and potential targets are highlighted. This color mapping may be useful for applications like surveillance and navigation, since these tasks require a correct perception of the background (terrain) for situational awareness in combination with optimal detection of targets and obstacles.
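One way to construct such an alternative mapping is sketched below (our own illustration, assuming as before a lookup table indexed by the quantized visual and thermal bands; the blending threshold and the pure-red target color are arbitrary choices, not values from the paper):

import numpy as np

def highlight_hot_targets(lut, levels=32, thermal_axis=1, threshold=0.75):
    """Push LUT entries with a high thermal response towards red.

    lut          : (levels*levels, 3) natural-color table, index = q_vis * levels + q_ir
    thermal_axis : which of the two table dimensions corresponds to the thermal band
    threshold    : normalized thermal level above which entries are recolored
    """
    grid = lut.reshape(levels, levels, 3).copy()
    q = np.arange(levels) / (levels - 1)
    thermal = q[None, :] if thermal_axis == 1 else q[:, None]
    thermal = np.broadcast_to(thermal, (levels, levels))

    # Blend smoothly from the natural color to pure red above the threshold,
    # leaving the (relatively cold) background colors untouched.
    w = np.clip((thermal - threshold) / (1.0 - threshold), 0.0, 1.0)[..., None]
    red = np.array([1.0, 0.0, 0.0])
    grid = (1.0 - w) * grid + w * red
    return grid.reshape(-1, 3)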
3. Discussion and conclusion

We have presented a new method for applying natural daylight colors to multi-band night-time images (a patent application for this method is currently pending: [58]). The method derives an optimal color transformation from a set of corresponding samples taken from a daytime color reference image and a multi-band night-time image. The colors in the resulting colorized multi-band night-time image closely resemble the colors in the daytime color reference image. Moreover, when the same color scheme is applied to different multi-band images there will be a close match between the colorized night-time images and their daylight color counterparts. Also, object colors remain largely invariant under panning operations and are largely independent of the scene content. The optimal color scheme for a certain type of environment should be derived in advance. During the night this color scheme can then be used to display night-time images in natural daytime colors. Another option is to obtain a color reference image of a static scene at night by integration over a long time period, using standard RGB-filters. Thus, the color scheme can be derived on the spot for the actual sensor settings and for a matching environment. The derivation of the color scheme may require some time. However, once the color scheme has been derived, the color lookup table technique allows for a (real-time) software implementation that requires a minimal amount of computer processing. The similarity between a colorized night-time image and its natural daylight counterpart depends on the correlation between the sensor values and the daylight colors. A higher correlation results in a more natural coloring. The natural colors are not well predicted when this correlation is low. It can therefore be expected that no natural coloring scheme can be obtained for a night-time sensor that is sensitive only to wavelengths in the far infrared region of the spectrum. In such a case, the sensor values simply do not contain any information about the daylight colors of objects. Sensors that are sensitive to wavelengths in or near the visible part of the spectrum (e.g. nightvision goggles) are therefore expected to give the best results. We created a prototype multi-band night-time fusion system consisting of two nightvision goggles fitted with filters that transmit only the visible part or the near-infrared part of the sensitive range (see e.g. Fig. 1; [60]). This prototype is highly suitable for presenting night-time imagery in daytime colors. This is especially the case for a natural environment. It can be expected that the correlation between the daytime colors and the sensor output is lower for man-made objects. We therefore expect that it will be more difficult to obtain a naturalistic color transformation in an urban environment, in which case there will
be different objects that display different daytime colors but yield the same multi-band sensor output (so-called metamers). Such objects cannot be discriminated on the basis of the sensor output alone. A possibility to discriminate these cases would be to incorporate texture as an additional cue to the underlying material and daytime color (see [33]). As mentioned above (see Section 2.3), it can be expected that the ratio between the multi-band sensor outputs reflects material properties. However, this does not take the gain control of the sensor into account, which determines how the input is converted to the sensor output. It may therefore be better to assign a color to a combination of sensor input values (instead of to a combination of sensor output values). This is of course only possible when the conversion is known. This is not a problem when the conversion is the same for the two (or more) bands. If the conversion differs for the two bands it will affect their ratio. Hence, whenever possible, the conversion from sensor input to output should be taken into account. In this paper we use examples from a sensor system containing two bands. Note however that the method also applies to multi-band sensor systems containing more than two bands. For sensor systems containing three bands the implementation is straightforward. In that case, the false-color image used for input simply represents three bands (R, G and B). Implementing this scheme for a sensor system containing more than three bands can be done in a similar way. The only difference is that the conversion from and to an indexed image is not readily available for images containing more than three bands. Summarising, we have presented a simple lookup-table based algorithm for applying natural daytime colors to multi-band night-time images with bands largely outside the visual part of the spectrum. We have derived a suitable color transformation that is simple to implement and intuitive. It may be possible to derive more naturalistic color mappings using techniques like neural networks or support vector machines. Also, methods developed for enhancing visual imagery may be applied to improve the color transformation [39,40].

References

[1] N. Cohen, G. Mizrahni, G. Sarusi, A. Sa'ar, Integrated HBT/QWIP structure for dual color imaging, Infrared Physics and Technology 47 (1–2) (2005) 43–52. [2] R. Breiter, W.A. Cabanski, K.-H. Mauk, W. Rode, J. Ziegler, H. Schneider, M. Walther, Multicolor and dual-band IR camera for missile warning and automatic target recognition, in: W.R. Watkins, D. Clement, W.R. Reynolds (Eds.), Targets and Backgrounds: Characterization and Representation VIII, The International Society for Optical Engineering, Bellingham, WA, 2002, pp. 280–288. [3] A.C. Goldberg, P. Uppal, M. Winn, Detection of buried land mines using a dual-band LWIR/LWIR QWIP focal plane array, Infrared Physics and Technology 44 (5–6) (2003) 427–437. [4] E. Cho, B.K. McQuiston, W. Lim, S.B. Rafol, C. Hanson, R. Nguyen, A. Hutchinson, Development of a visible–NIR/LWIR QWIP sensor, in: B.F. Andresen, G.F. Fulop (Eds.), Infrared Technology and Applications XXIX, The International Society for Optical Engineering, Bellingham, WA, 2003, pp. 735–744. [5] S.V. Bandara et al., Infrared Physics and Technology 44 (5–6) (2003) 369–375. [6] M. Aguilar, D.A. Fay, D.B. Ireland, J.P. Racamoto, W.D. Ross, A.M. Waxman, Field evaluations of dual-band fusion for color night vision, in: J.G.
Verly (Ed.), Enhanced and Synthetic Vision, The International Society for Optical Engineering, Bellingham, WA, 1999, pp. 168–175. [7] M. Aguilar, D.A. Fay, W.D. Ross, A.M. Waxman, D.B. Ireland, J.P. Racamoto, Real-time fusion of low-light CCD and uncooled IR imagery for color night vision, in: J.G. Verly (Ed.), Enhanced and Synthetic Vision, The International Society for Optical Engineering, Bellingham, WA, 1998, pp. 124–135. [8] L. Bai, W. Qian, Y. Zhang, B. Zhang, Theory analysis and experiment study on the amount of information in a color night vision system, in: H. Lu, T. Zhang (Eds.), Third International Symposium on Multispectral Image Processing and Pattern Recognition, The International Society for Optical Engineering, Bellingham, WA, 2003, pp. 5–9. [9] S. Das, Y.-L. Zhang, W.K. Krebs, Color night vision for navigation and surveillance, in: J. Sutton, S.C. Kak (Eds.), Proceedings of the Fifth Joint Conference on Information Sciences, 2000. [10] E.A. Essock, M.J. Sinai, J.S. McCarley, W.K. Krebs, J.K. DeFord, Perceptual ability with real-world night-time scenes: image-intensified, infrared, and fused-color imagery, Human Factors 41 (3) (1999) 438–452.
[11] D.A. Fay, A.M. Waxman, M. Aguilar, D.B. Ireland, J.P. Racamato, W.D. Ross, W. Streilein, M.I. Braun, Fusion of multi-sensor imagery for night vision: color visualization, target learning and search, in: Proceedings of the Third International Conference on Information Fusion, ONERA, Paris, France, 2000, pp. TuD3-3–TuD3-10. [12] M.T. Sampson, An assessment of the impact of fused monochrome and fused color night vision displays on reaction time and accuracy in target detection (Report AD-A321226), Naval Postgraduate School, Monterey, CA, 1996. [13] J. Schuler, J.G. Howard, P. Warren, D.A. Scribner, R. Klien, M. Satyshur, M.R. Kruer, Multiband E/O color fusion with consideration of noise and registration, in: W.R. Watkins, D. Clement, W.R. Reynolds (Eds.), Targets and Backgrounds VI: Characterization, Visualization, and the Detection Process, The International Society for Optical Engineering, Bellingham, WA, USA, 2000, pp. 32–40. [14] A. Toet, Colorizing single band intensified nightvision images, Displays 26 (1) (2005) 15–21. [15] A. Toet, Natural colour mapping for multiband nightvision imagery, Information Fusion 4 (3) (2003) 155–166. [16] A.M. Waxman, D.A. Fay, A.N. Gove, M.C. Seibert, J.P. Racamato, J.E. Carrick, E.D. Savoye, Color night vision: fusion of intensified visible and thermal IR imagery, in: J.G. Verly (Ed.), Synthetic Vision for Vehicle Guidance and Control, The International Society for Optical Engineering, Bellingham, WA, 1995, pp. 58–68. [17] A.M. Waxman, A.N. Gove, D.A. Fay, J.P. Racamoto, J.E. Carrick, M.C. Seibert, E.D. Savoye, Color night vision: opponent processing in the fusion of visible and IR imagery, Neural Networks 10 (1) (1997) 1–6. [18] A.M. Waxman, M. Aguilar, R.A. Baxter, D.A. Fay, D.B. Ireland, J.P. Racamoto, W.D. Ross, Opponent-color fusion of multi-sensor imagery: visible, IR and SAR, in: Proceedings of the 1998 Conference of the IRIS Specialty Group on Passive Sensors, 1998, pp. 43–61. [19] A.M. Waxman, A.N. Gove, M.C. Seibert, D.A. Fay, J.E. Carrick, J.P. Racamato, E.D. Savoye, B.E. Burke, R.K. Reich, et al., Progress on color night vision: visible/IR fusion, perception and search, and low-light CCD imaging, in: J.G. Verly (Ed.), Enhanced and Synthetic Vision, The International Society for Optical Engineering, Bellingham, WA, 1996, pp. 96–107. [20] A.M. Waxman et al., Solid-state color night vision: fusion of low-light visible and thermal infrared imagery, MIT Lincoln Laboratory Journal 11 (1999) 41–60. [21] G. Huang, G. Ni, B. Zhang, Visual and infrared dual-band false color image fusion method motivated by Land's experiment, Optical Engineering 46 (2) (2007) 027001-1–027001-10. [22] D. Scribner, P. Warren, J. Schuler, Extending color vision methods to bands beyond the visible, in: Proceedings of the IEEE Workshop on Computer Vision Beyond the Visible Spectrum: Methods and Applications, Institute of Electrical and Electronics Engineers, 1999, pp. 33–40. [23] J.G. Howard, P. Warren, R. Klien, J. Schuler, M. Satyshur, D. Scribner, M.R. Kruer, Real-time color fusion of E/O sensors with PC-based COTS hardware, in: W.R. Watkins, D. Clement, W.R. Reynolds (Eds.), Targets and Backgrounds VI: Characterization, Visualization, and the Detection Process, The International Society for Optical Engineering, Bellingham, WA, 2000, pp. 41–48. [24] D. Scribner, P. Warren, J. Schuler, M. Satyshur, M. Kruer, Infrared color vision: an approach to sensor fusion, Optics and Photonics News 9 (8) (1998) 27–32.
[25] D. Scribner, J.M. Schuler, P. Warren, R. Klein, J.G. Howard, Sensor and image fusion, in: R.G. Driggers (Ed.), Encyclopedia of Optical Engineering, Marcel Dekker Inc., New York, USA, 2003, pp. 2577–2582. [26] Y. Zheng, B.C. Hansen, A.M. Haun, E.A. Essock, Coloring night-vision imagery with statistical properties of natural colors by using image segmentation and histogram matching, in: R. Eschbach, G.G. Marcu (Eds.), Color Imaging X: Processing, Hardcopy and Applications, The International Society for Optical Engineering, Bellingham, WA, 2005, pp. 107–117. [27] G.L. Walls, The Vertebrate Eye and its Adaptive Radiation, Cranbrook Institute of Science, Bloomfield Hills, Michigan, 2006. [28] V. Goffaux, C. Jacques, A. Mouraux, A. Oliva, P. Schyns, B. Rossion, Diagnostic colours contribute to the early stages of scene categorization: behavioural and neurophysiological evidence, Visual Cognition 12 (6) (2005) 878–892. [29] G.A. Rousselet, O.R. Joubert, M. Fabre-Thorpe, How long to get the ‘‘gist” of real-world natural scenes?, Visual Cognition 12 (6) (2005) 852–877 [30] J.A. Cavanillas, The Role of Color and False Color in Object Recognition with Degraded and Non-Degraded Images, Naval Postgraduate School, Monterey, CA, 1999. [31] A. Oliva, P.G. Schyns, Diagnostic colors mediate scene recognition, Cognitive Psychology 41 (2000) 176–210. [32] I. Spence, P. Wong, M. Rusan, N. Rastegar, How color enhances visual memory for natural scenes, Psychological Science 17 (1) (2006) 1–6. [33] K.R. Gegenfurtner, J. Rieger, Sensory and cognitive contributions of color to the recognition of natural scenes, Current Biology 10 (13) (2000) 805–808. [34] F.A. Wichmann, L.T. Sharpe, K.R. Gegenfurtner, The contributions of color to recognition memory for natural scenes, Journal of Experimental Psychology: Learning, Memory, and Cognition 28 (3) (2002) 509–520. [35] U. Ansorge, G. Horstmann, E. Carbone, Top–down contingent capture by color: evidence from RT distribution analyses in a manual choice reaction task, Acta Psychologica 120 (3) (2005) 243–266.
[36] B.F. Green, L.K. Anderson, Colour coding in a visual search task, Journal of Experimental Psychology 51 (1956) 19–24. [37] C.L. Folk, R. Remington, Selectivity in distraction by irrelevant featural singletons: evidence for two forms of attentional capture, Journal of Experimental Psychology: Human Perception and Performance 24 (3) (1998) 847–858. [38] R.G. Driggers, K.A. Krapels, R.H. Vollmerhausen, P.R. Warren, D.A. Scribner, J.G. Howard, B.H. Tsou, W.K. Krebs, Target detection threshold in noisy color imagery, in: G.C. Holst (Ed.), Infrared Imaging Systems: Design, Analysis, Modeling, and Testing XII, The International Society for Optical Engineering, Bellingham, WA, 2001, pp. 162–169. [39] N.P. Jacobson, M.R. Gupta, Design goals and solutions for display of hyperspectral images, IEEE Transactions on Geoscience and Remote Sensing 43 (11) (2005) 2684–2692. [40] N.P. Jacobson, M.R. Gupta, J.B. Cole, Linear fusion of image sets for display, IEEE Transactions on Geoscience and Remote Sensing 45 (10) (2007) 3277–3288. [41] J.E. Joseph, D.R. Proffitt, Semantic versus perceptual influences of color in object recognition, Journal of Experimental Psychology: Learning, Memory, and Cognition 22 (2) (1996) 407–429. [42] A. Oliva, Gist of a scene, in: L. Itti, G. Rees, J.K. Tsotsos (Eds.), Neurobiology of Attention, Academic Press, 2005, pp. 251–256. [43] M.J. Sinai, J.S. McCarley, W.K. Krebs, Scene recognition with infra-red, lowlight, and sensor fused imagery, in: Proceedings of the IRIS Specialty Groups on Passive Sensors, IRIS, Monterey, CA, 1999, pp. 1–9. [44] A. Toet, J.K. IJspeert, Perceptual evaluation of different image fusion schemes, in: I. Kadar (Ed.), Signal Processing, Sensor Fusion, and Target Recognition X, The International Society for Optical Engineering, Bellingham, WA, 2001, pp. 436–441. [45] J.T. Vargo, Evaluation of operator performance using true color and artificial color in natural scene perception (Report AD-A363036), Monterey, Naval Postgraduate School, CA, 1999. [46] A. Toet, J.K. IJspeert, A.M. Waxman, M. Aguilar, Fusion of visible and thermal imagery improves situational awareness, in: J.G. Verly (Ed.), Enhanced and Synthetic Vision, International Society for Optical Engineering, Bellingham, WA, USA, 1997, pp. 177–188. [47] M.J. Sinai, J.S. McCarley, W.K. Krebs, E.A. Essock, Psychophysical comparisons of single- and dual-band fused imagery, in: J.G. Verly (Ed.), Enhanced and Synthetic Vision, The International Society for Optical Engineering, Bellingham, WA, 1999, pp. 176–183. [48] B.L. White, Evaluation of the Impact of Multispectral Image Fusion on Human Performance in Global Scene Processing, Naval Postgraduate School, Monterey, CA, 1998. [49] W.K. Krebs, D.A. Scribner, G.M. Miller, J.S. Ogawa, J. Schuler, Beyond third generation: a sensor-fusion targeting FLIR pod for the F/A-18, in: B.V. Dasarathy (Ed.), Sensor Fusion: Architectures Algorithms and Applications, vol. II, International Society for Optical Engineering, Bellingham, WA, USA, 1998, pp. 129–140. [50] S. Sun, Z. Jing, Z. Li, G. Liu, Color fusion of SAR and FLIR images using a natural color transfer technique, Chinese Optics Letters 3 (4) (2005) 202–204. [51] V. Tsagiris, V. Anastassopoulos, Fusion of visible and infrared imagery for night color vision, Displays 26 (4–5) (2005) 191–196. [52] L. Wang, W. Jin, Z. Gao, G. Liu, Color fusion schemes for low-light CCD and infrared images of different properties, in: L. Zhou, C.-S. Li, Y. Suzuki (Eds.), Electronic Imaging and Multimedia Technology, vol. 
III, The International Society for Optical Engineering, Bellingham, WA, 2002, pp. 459–466. [53] J. Li, Q. Pan, T. Yang, Y. Cheng, Color based grayscale-fused image enhancement algorithm for video surveillance, in: Proceedings of the Third International Conference on Image and Graphics (ICIG'04), IEEE Press, Washington, USA, 2004, pp. 47–50. [54] A. Toet, J. Walraven, New false colour mapping for image fusion, Optical Engineering 35 (3) (1996) 650–658. [55] P. Warren, J.G. Howard, J. Waterman, D.A. Scribner, J. Schuler, Real-time, PC-based color fusion displays (Report A073093), Naval Research Lab., Washington, DC, 1999. [56] D.A. Fay, A.M. Waxman, M. Aguilar, D.B. Ireland, J.P. Racamato, W.D. Ross, W. Streilein, M.I. Braun, Fusion of 2-/3-/4-sensor imagery for visualization, target learning, and search, in: J.G. Verly (Ed.), Enhanced and Synthetic Vision, SPIE – The International Society for Optical Engineering, Bellingham, WA, USA, 2000, pp. 106–115. [57] Y. Zheng, E.A. Essock, A local-coloring method for night-vision colorization utilizing image analysis and fusion, Information Fusion 9 (2) (2008) 186–199. [58] M.A. Hogervorst, A. Toet, F.L. Kooi, TNO Defense Security and Safety, Method and system for converting at least one first-spectrum image into a second-spectrum image 06076532 8-2202, 2006. [59] D.L. Ruderman, T.W. Cronin, C.-C. Chiao, Statistics of cone responses to natural images: implications for visual coding, Journal of the Optical Society of America A 15 (8) (1998) 2036–2045. [60] A. Toet, M.A. Hogervorst, Portable real-time color night vision, in: B.V. Dasarathy (Ed.), Multisensor, Multisource Information Fusion: Architectures, Algorithms, and Applications, The International Society for Optical Engineering, Bellingham, WA, USA, 2008, pp. 697402-1–697402-12.