Multi-focus image fusion using PCNN


Zhaobin Wang a,b, Yide Ma a,*, Jason Gu b

a School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, China
b Department of Electrical and Computer Engineering, Dalhousie University, Halifax, Nova Scotia, Canada B3J 2X4

Article history: Received 13 May 2009; Received in revised form 24 November 2009; Accepted 17 January 2010

Abstract

This paper proposes a new method for multi-focus image fusion based on dual-channel pulse coupled neural networks (dual-channel PCNN). Compared with previous methods, our method does not decompose the input source images and does not need to employ multiple PCNNs or other algorithms such as the DWT. The method employs the dual-channel PCNN to implement multi-focus image fusion: two parallel source images are directly input into the PCNN, while a focus measure is carried out on the source images. According to the results of the focus measure, the weighting coefficients are automatically adjusted; the rule of auto-adjusting depends on a specific transformation. The input images are then combined in the dual-channel PCNN. Four groups of experiments are designed to verify the performance of the proposed method, and several existing methods are compared with ours. Experimental results show that our method outperforms the existing methods in both visual effect and objective evaluation criteria. Finally, some practical applications are given.

Keywords: PCNN; Image fusion; Focus measure

1. Introduction

Optics of lenses with a high degree of magnification suffer from the problem of a limited depth of field. The larger the focal length and magnification of the lens, the smaller the depth of field becomes; as a result, fewer objects in the image are in focus. However, people often want an image in which more objects are in focus, that is, to see more and clearer objects in one image; the ideal state is that the whole image is clear or in focus. This directly leads to the appearance of multi-focus image fusion technology. Multi-focus image fusion is the process in which different images with different focus settings are fused to produce a new image with extended depth of field. Its purpose is to increase the apparent depth of field through the fusion of objects within several different fields of focus. Hence it plays important roles in many different fields such as biomedical imaging and computer vision.

Multi-focus image fusion is one of the main research fields of image fusion. Nowadays, two common schemes are used in the field of multi-focus image fusion. The first uses multiresolution approaches, which usually employ the discrete wavelet transform (DWT) [15,22] or various pyramid algorithms such as the contrast pyramid [25], FSD pyramid [1], gradient pyramid [3], morphological pyramid [24], etc. However, this scheme is complicated and time-consuming to implement. The other is


based on the selection of image blocks from the source images [14]. The idea of this scheme is to select the better image blocks from the source images to construct the fused image. Clearly, the latter is simpler than the former, but it has some disadvantages: for instance, its effect depends to a great extent on the focus measurement, and it can introduce errors into the fused image. Therefore, a third scheme is employed in this paper.

PCNN is a biologically inspired neural network based on the work by Eckhorn et al. [4]. Pioneering work in the implementation of these algorithms was done by Johnson and his colleagues [11,10,9,21]. It has been proven that PCNN is widely applicable in the field of image processing [18,19], such as image segmentation, image enhancement, pattern recognition, etc. PCNN also plays an important role in image fusion. In fact, Broussard et al. [2] applied PCNN to image fusion for object detection as early as 1999, and in the same year Johnson and Padgett [9] pointed out the great potential of PCNN in the field of image and data fusion. Many multi-focus image fusion algorithms based on PCNN [7,8,13,20,2,16] have been published in journals and proceedings so far. Although different authors adopt different schemes, most of them exploit the same characteristic of PCNN, namely the mechanism of synchronous pulse bursts. For multi-focus image fusion, PCNN has its own advantages over other methods. First, the PCNN model derives from research on the cat visual cortex, so its mode of information processing is much closer to that of human visual processing. Second, PCNN has a flexible structure that can be adapted to different tasks. Additionally, the existing PCNN methods show that PCNN achieves high performance. However, the problem is that the existing PCNN methods are still


complicated and time-consuming. After careful analysis, we find: (1) almost all the methods with PCNN employ multiple PCNNs [7,8,13,20] or combine PCNN with other algorithms such as the DWT [2,16]; (2) almost all authors adopt the simplified PCNN model, and all models have one stimulus. Generally, an intricate algorithm costs much time computing and operating on intermediate variables, which makes the whole system inefficient. From (1) we know that their schemes are very complicated, which is why these existing methods are not efficient. From (2) we believe that having only one stimulus is the root cause of the complication and inefficiency of PCNN methods. In other words, to some extent, the standard PCNN structurally limits its application in image fusion. In order to make PCNN more suitable for image fusion, we improve the standard PCNN and propose the dual-channel PCNN, which solves the problem of complication and inefficiency of PCNN methods very well. Compared with previous methods, the method with the dual-channel PCNN does not decompose the input source images and does not need to employ multiple PCNNs or combine with other algorithms such as the DWT. Its scheme of image fusion is therefore very simple, and experimental results show that the proposed method is feasible and efficient.

The rest of the paper is organized as follows. In Section 2, the standard PCNN is briefly reviewed, and then the dual-channel PCNN is introduced in detail. Section 3 concretely describes our proposed multi-focus image fusion algorithm based on the dual-channel PCNN; it mainly includes the focus measurement, the principle and implementation of the algorithm, and other related content. Section 4 gives experimental results and performance evaluation, and then introduces some practical applications. Conclusions are summarized in the end.

2. PCNN model

Because the dual-channel PCNN is built on the standard PCNN model, the PCNN model is briefly introduced first. After analyzing the standard model, we improve it according to the practical demands of multi-focus image fusion.

2.1. Standard PCNN model

In the standard PCNN model, the neuron consists of three parts: the dendritic tree, the linking modulation, and the pulse generator, as shown in Fig. 1. The role of the dendritic tree is to receive the inputs from two kinds of receptive fields. Depending on the type of receptive field, it is subdivided into two channels (the linking and the feeding). The linking receives local stimulus from the output of surrounding neurons, while the feeding, besides local stimulus, also receives an external stimulus. In the following expressions, the indexes i and j refer to the pixel location in the image, k and l refer to the dislocation in a symmetric neighborhood around one pixel, and n denotes the current iteration (discrete time step). Here n varies from 1 to N (N is the total number of iterations):

F_{ij}[n] = e^{-\alpha_F} F_{ij}[n-1] + V_F \sum_{k,l} w_{ijkl} Y_{kl}[n-1] + S_{ij}    (1)

L_{ij}[n] = e^{-\alpha_L} L_{ij}[n-1] + V_L \sum_{k,l} m_{ijkl} Y_{kl}[n-1]    (2)

U_{ij}[n] = F_{ij}[n] (1 + \beta L_{ij}[n])    (3)

Y_{ij}[n] = \begin{cases} 1, & U_{ij}[n] > T_{ij}[n] \\ 0, & \text{otherwise} \end{cases}    (4)

T_{ij}[n] = e^{-\alpha_T} T_{ij}[n-1] + V_T Y_{ij}[n]    (5)

The dendritic tree is given by Eqs. (1)-(2). The two main components F and L are called feeding and linking, respectively. w_{ijkl} and m_{ijkl} are the synaptic weight coefficients and S is the external stimulus. V_F and V_L are normalizing constants. \alpha_F and \alpha_L are the time constants; generally, \alpha_F < \alpha_L. The linking modulation is given in Eq. (3), where U_{ij}[n] is the internal state of the neuron and \beta is the linking parameter. The pulse generator determines the firing events in the model via Eq. (4): Y_{ij}[n] depends on the internal state and the threshold. The dynamic threshold of the neuron is given by Eq. (5), where V_T and \alpha_T are the normalizing constant and the time constant, respectively. This is a brief review of the standard PCNN. A detailed description of the implementation of the standard PCNN model on digital computers can be found in [9]; more details about PCNN can be found in [18,19].

Fig. 1. The structure of PCNN (dendritic tree, linking modulation, and pulse generator).
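To make the iteration defined by Eqs. (1)-(5) concrete, the following is a minimal NumPy sketch of one possible implementation. The kernel and the parameter values are illustrative assumptions (the paper does not fix them at this point), and w = m is taken for simplicity.

```python
import numpy as np
from scipy.ndimage import convolve

def standard_pcnn(S, n_iter=20, alpha_F=0.1, alpha_L=1.0, alpha_T=1.0,
                  V_F=0.5, V_L=0.2, V_T=20.0, beta=0.1):
    """Sketch of Eqs. (1)-(5); parameter values are illustrative, not the paper's."""
    K = np.array([[0.5, 1.0, 0.5],
                  [1.0, 0.0, 1.0],
                  [0.5, 1.0, 0.5]])              # assumed linking kernel, w = m = K
    F = np.zeros_like(S, dtype=float)
    L = np.zeros_like(S, dtype=float)
    T = np.ones_like(S, dtype=float)
    Y = np.zeros_like(S, dtype=float)
    for _ in range(n_iter):
        link = convolve(Y, K, mode='constant')   # sum_{k,l} K * Y[n-1]
        F = np.exp(-alpha_F) * F + V_F * link + S    # Eq. (1)
        L = np.exp(-alpha_L) * L + V_L * link        # Eq. (2)
        U = F * (1.0 + beta * L)                     # Eq. (3)
        Y = (U > T).astype(float)                    # Eq. (4)
        T = np.exp(-alpha_T) * T + V_T * Y           # Eq. (5)
    return Y
```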

2.2. Dual-channel PCNN model

Analysis of the PCNN exposes a defect that prevents a single PCNN from fusing multi-focus images. We therefore modify the standard model and propose a new improved model which overcomes some limits of the standard model in multi-focus image fusion. The dual-channel neuron model (see Fig. 2) consists of three parts: the dendritic tree, the information fusion pool and the pulse generator. The function of the dendritic tree is to receive the stimuli, including external inputs and surrounding neuron stimuli; the information fusion pool is the place where all data are fused; the pulse generator generates the output pulse.

Fig. 2. The structure of dual-channel PCNN (dendritic tree, information fusion pool, and pulse generator).

In the improved model there are two input channels, so both stimuli can be input into the model at the same time. The following expressions describe its mathematical model:

H^A_{ij}[n] = S^A_{ij} + \sum_{k,l} w_{ijkl} Y_{kl}[n-1]    (6)

H^B_{ij}[n] = S^B_{ij} + \sum_{k,l} m_{ijkl} Y_{kl}[n-1]    (7)

U_{ij}[n] = (1 + \beta^A_{ij} H^A_{ij}[n]) (1 + \beta^B_{ij} H^B_{ij}[n]) + \sigma    (8)

Y_{ij}[n] = \begin{cases} U_{ij}[n] - Sur_{ij}[n], & U_{ij}[n] > T_{ij}[n-1] \\ 0, & \text{otherwise} \end{cases}    (9)

T_{ij}[n] = \begin{cases} e^{-\alpha_T} T_{ij}[n-1], & Y_{ij}[n] = 0 \\ V_T, & \text{otherwise} \end{cases}    (10)

Compared with the standard PCNN, the dual-channel PCNN has fewer parameters. In the dual-channel model, instead of the feeding channel (F) and linking channel (L), H^A and H^B stand for the two symmetrical input channels. \beta^A and \beta^B are the weighting coefficients of the two channels, and \sigma is the level factor used to adjust the average level of the internal activity. When \sigma = -1, U >= 0. The parameters U, T, Y, V_T, w_{ijkl}, m_{ijkl}, and \alpha_T have the same meanings as in the standard model. Sur denotes the input from surrounding neurons. Generally, k_{ijkl} = w_{ijkl} = m_{ijkl} and Sur_{ij} = \sum_{k,l} k_{ijkl} Y_{kl}[n-1].

Now we describe the data fusion process of the dual-channel PCNN. First, the two channels of a neuron receive the external stimuli and the output of surrounding neurons. The data from these channels are then weighted and mixed in the information fusion pool according to the weighting coefficients. Finally, the mixed data are released by the neuron as its output, governed by the attenuation of the threshold. The implementation of the dual-channel PCNN in our experiments is as follows:

(1) Initialize parameters and matrices: U = O = Y = 0, T = 1. The initialization of W and M differs from that of the other matrices; here K = W = M and its values are determined manually.
(2) If S^A = S^B, then O = S^A (or S^B) and go to step (6).
(3) Normalize the external stimuli to lie within [0, 1].
(4) Compute Sur = Y \otimes K (convolution of Y with the kernel K); H^A = S^A + Sur; H^B = S^B + Sur; U = (1 + \beta^A H^A)(1 + \beta^B H^B) + \sigma. If U_{ij} > T_{ij} then Y_{ij} = U_{ij} - Sur_{ij}, else Y_{ij} = 0. If S^A_{ij} = S^B_{ij} or \beta^A_{ij} = \beta^B_{ij}, then O_{ij} = S^A_{ij} (or S^B_{ij}); else O_{ij} = Y_{ij}. If Y_{ij} = 0 then T_{ij} = e^{-\alpha_T} T_{ij}, else T_{ij} = V_T.

(5) If all neurons have fired, go to the next step; otherwise go back to step (4).
(6) O is the output of the dual-channel PCNN.

Note that "a neuron fires" means that a PCNN neuron generates a pulse. Here we would like to explain the principle of data fusion using the dual-channel PCNN. Eqs. (6)-(8) mathematically describe the way data are fused. For one neuron, the final output value is not completely determined by the weighting coefficients (\beta^A and \beta^B). The output from surrounding neurons also plays a role via the mechanism of synchronous pulse bursts, and this output is unexpected and even random for a single neuron. The mechanism of synchronous pulse bursts, implemented by time-domain iterative processing, makes neurons with similar states generate synchronous pulses; hence, the way PCNN fuses data is not linear but nonlinear. The dual-channel PCNN model inherits good features from the standard PCNN: for example, it retains the mechanism of synchronous pulse bursts, and the exponential attenuation characteristic of the threshold is also kept. It is believed that this characteristic is consistent with human visual characteristics. These retained features are favorable for image processing.
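Steps (1)-(6) translate almost directly into code. The sketch below is one possible NumPy rendering, not the authors' original MATLAB implementation; betaA and betaB are the weighting maps introduced in Section 3.2, K is the synaptic weight matrix (K = W = M), and keeping a neuron's previously released value in O when it does not fire in later iterations is our assumption.

```python
import numpy as np
from scipy.ndimage import convolve

def dual_channel_pcnn(SA, SB, betaA, betaB, K, sigma=-1.0,
                      alpha_T=0.12, V_T=1000.0, max_iter=200):
    """Sketch of the dual-channel PCNN fusion, Eqs. (6)-(10) and steps (1)-(6).
    SA, SB: source images already normalized to [0, 1] (step (3))."""
    if np.array_equal(SA, SB):                         # step (2): identical inputs
        return SA.copy()
    O = np.zeros_like(SA, dtype=float)                 # step (1): initialization
    Y = np.zeros_like(SA, dtype=float)
    T = np.ones_like(SA, dtype=float)
    fired = np.zeros(SA.shape, dtype=bool)
    for _ in range(max_iter):                          # step (4)
        Sur = convolve(Y, K, mode='constant')          # input from surrounding neurons
        HA = SA + Sur                                  # Eq. (6)
        HB = SB + Sur                                  # Eq. (7)
        U = (1 + betaA * HA) * (1 + betaB * HB) + sigma    # Eq. (8)
        Y = np.where(U > T, U - Sur, 0.0)              # Eq. (9)
        same = (SA == SB) | (betaA == betaB)
        O = np.where(Y > 0, np.where(same, SA, Y), O)  # output rule of step (4), assumption:
                                                       # only update O where the neuron fires
        T = np.where(Y == 0, np.exp(-alpha_T) * T, V_T)    # Eq. (10)
        fired |= Y > 0
        if fired.all():                                # step (5): stop once all neurons fired
            break
    return O                                           # step (6)
```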

3. Image fusion algorithm

In this section, the multi-focus image fusion scheme used in the paper is introduced. Because the image sharpness measure is so important for multi-focus image fusion, some common evaluation methods of image sharpness are given first to keep the paper self-contained.

3.1. Image focus measure

For multi-focus image fusion, we usually select one or more focus measure methods to estimate image sharpness, so it is crucial to choose a good focus measure. The focus measure defined in [12,17] is a maximum for the best focused image and generally decreases as the defocus increases. That is, the focused image should produce maximum focus measures, while the defocused image should produce minimum focus measures. Many focus measure techniques exist; some common ones are: variance, energy of image gradient (EOG), energy of Laplacian of the image (EOL), sum-modified-Laplacian (SML), and spatial frequency (SF). Huang and Jing [7,8]


assessed these methods according to several objective standards. Experimental results show that SML and EOL provide better performance than the other focus measures; however, SML requires more implementation time under the same conditions. Therefore, EOL is used in this paper to measure image focus. Because the evaluation of different focus measure methods is not the topic of this paper, more details can be found in Refs. [7,8]. For EOL, the corresponding mathematical expression is given as follows. Let f(i, j) denote the gray level intensity of pixel (i, j):

EOL = \sum_i \sum_j [-f(i-1, j-1) - 4 f(i-1, j) - f(i-1, j+1) - 4 f(i, j-1) + 20 f(i, j) - 4 f(i, j+1) - f(i+1, j-1) - 4 f(i+1, j) - f(i+1, j+1)]^2    (11)

This method carries out focus measurement by analyzing the high spatial frequencies associated with image border sharpness, and it is implemented through the Laplacian operator.
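As a concrete illustration, Eq. (11) amounts to squaring the response of a 3x3 modified-Laplacian kernel and summing it over the image. The sketch below is one possible NumPy implementation (an assumption of ours, not code from the paper); the pixel-wise map is kept separate because the window summing of Section 3.2 operates on it.

```python
import numpy as np
from scipy.ndimage import convolve

# Modified Laplacian kernel corresponding to Eq. (11)
EOL_KERNEL = np.array([[-1, -4, -1],
                       [-4, 20, -4],
                       [-1, -4, -1]], dtype=float)

def eol_map(img):
    """Pixel-wise squared Laplacian response; summing it gives the EOL value."""
    lap = convolve(img.astype(float), EOL_KERNEL, mode='reflect')
    return lap ** 2

def eol(img):
    """Energy of Laplacian of the image, Eq. (11)."""
    return eol_map(img).sum()
```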

3.2. Principle of multi-focus image fusion

Now we explain how to fuse multi-focus images using the dual-channel PCNN. Suppose I_A and I_B are two multi-focus images of the same size, and I_A(i, j) and I_B(i, j) denote the pixels of I_A and I_B at the same position. The goal of multi-focus image fusion is to select the clear one of I_A(i, j) and I_B(i, j). From Section 2.2, we know the dual-channel PCNN can carry out data fusion. Because the data from the two channels are weighted according to the weighting coefficients, as long as a clear pixel gets a large weighting coefficient while a blurred pixel gets a small one, the dual-channel PCNN can achieve the purpose of multi-focus image fusion. Therefore, the core of this method is to make the change of the weighting coefficients depend on the clarity of the input stimuli. Here a new method is employed to implement the transformation from the importance of the input stimuli to the weighting coefficients [5]. This method is explained in detail in the following paragraphs.

Suppose source images I_A(i, j) and I_B(i, j) turn into measured images M_A(i, j) and M_B(i, j) after focus measurement. The difference between M_A(i, j) and M_B(i, j) is defined by D(i, j) = M_A(i, j) - M_B(i, j). Generally speaking, if D(i, j) > 0, the pixel value at location (i, j) in image A should be chosen; otherwise, its counterpart from B is selected. However, this measure alone is in practice not sufficient to pick out the better focused image on a pixel-by-pixel basis, since using single-pixel information to make the decision is vulnerable to wide fluctuations caused by the outer environment such as various kinds of noise. So it is necessary to maintain the robustness of the algorithm through more information from neighboring pixels. In order to make full use of the surrounding information, the D(i, j)'s are summed over a (r+1) x (r+1) region surrounding each decision point:

\bar{D}(i, j) = \sum_{m=-r/2}^{r/2} \sum_{n=-r/2}^{r/2} D(i+m, j+n)    (12)

Hence their weighting coefficients are

\beta^A_{ij} = \frac{1}{1 + e^{-\eta \bar{D}(i, j)}}    (13)

and

\beta^B_{ij} = \frac{1}{1 + e^{\eta \bar{D}(i, j)}}    (14)

where \eta is a constant. The constant \eta has an important influence on the weighting coefficients \beta^A and \beta^B (shown in Fig. 3). A larger \eta markedly increases the difference between \beta^A and \beta^B, while reducing \eta shrinks this difference. Typically, when \eta = 0, \beta^A = \beta^B. Hence, adjusting \eta changes the trend of \beta^A and \beta^B and their difference. Because the weighting coefficients play a crucial role in multi-focus image fusion, \eta is usually set via several experiments to meet practical demands.

Fig. 3. The relation between the constant \eta and the weighting coefficients (curves of \beta^A and \beta^B versus \bar{D}(i, j) for \eta = 0.01 and \eta = 0.05).

Table 1. Parameter setting in the experiment.
Parameters:    \sigma    \alpha_T    V_T     r     \eta
Their values:  -1        0.12        1000    14    0.01

3.3. Implementation of fusion algorithm

According to the above statements, we introduce in this section the implementation of our proposed multi-focus image fusion algorithm based on the dual-channel PCNN. The dual-channel PCNN used in our experiments is a single-layer two-dimensional array of laterally linked neurons, and all neurons are identical. The number of neurons in the network is equal to the number of pixels in the input image.

Fig. 4. Procedure of multi-focus image fusion in the paper: the source images A and B pass through the focus measure and coefficient transformation to produce \beta^A and \beta^B, and are also fed directly as stimuli S^A and S^B into the dual-channel PCNN, which outputs the fused image.

Table 2. Parameters of Huang's method.
Parameters:    \alpha_L    \alpha_T    V_L    V_T     r
Their values:  1.0         5.0         0.2    20.0    13.0

In terms of position, there exists a one-to-one correspondence between the image pixels (I_A(i, j) and I_B(i, j)) and the neuron N(i, j). In other words, the external stimuli of N(i, j) are I_A(i, j) and I_B(i, j). Now we describe the implementation of the algorithm. The procedure of our proposed multi-focus image fusion is shown in Fig. 4.

Fig. 5. Test images, which consist of four groups: LETTER, BADGE, TEAPOT, and PHOTO. Each group includes two source images (labeled A and B). (a) LETTER A, (b) LETTER B, (c) BADGE A, (d) BADGE B, (e) TEAPOT A, (f) TEAPOT B, (g) PHOTO A, and (h) PHOTO B.


Note that our algorithm is introduced on the assumption that the input multi-focus images have been registered. The proposed multi-focus image fusion method consists of the following steps:

(1) Carry out focus measurement with Eq. (11) for the two multi-focus images (I_A, I_B). Denote the measured images by M_A and M_B, respectively.
(2) Compute the weighting coefficients (\beta^A and \beta^B) from M_A and M_B according to Eqs. (12)-(14).
(3) Input I_A and I_B, taken as two stimuli, into the dual-channel PCNN, and then start the PCNN.
(4) Fuse the multi-focus images via the PCNN.
(5) Obtain the fused image after the PCNN process finishes.
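As a rough end-to-end illustration of steps (1)-(5), the sketch below strings together the hypothetical eol_map and dual_channel_pcnn helpers sketched earlier; r = 14 and eta = 0.01 follow Table 1 and the default K follows Section 4.1, while everything else (including the uniform_filter trick for the windowed sum of Eq. (12)) is an assumption of ours.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def fuse_multifocus(IA, IB, r=14, eta=0.01, K=None):
    """Sketch of the proposed pipeline: focus measure -> weighting coefficients ->
    dual-channel PCNN. Assumes registered inputs already scaled to [0, 1]."""
    # Step (1): focus measure with Eq. (11), using the eol_map sketch above.
    MA, MB = eol_map(IA), eol_map(IB)
    # Step (2): weighting coefficients via Eqs. (12)-(14).
    D = MA - MB
    D_bar = uniform_filter(D, size=r + 1) * (r + 1) ** 2   # windowed sum of D, Eq. (12)
    betaA = 1.0 / (1.0 + np.exp(-eta * D_bar))             # Eq. (13)
    betaB = 1.0 / (1.0 + np.exp(eta * D_bar))              # Eq. (14)
    # Steps (3)-(5): fuse inside the dual-channel PCNN sketched in Section 2.2.
    if K is None:
        K = np.array([[1, 0.5, 1], [0.5, 0, 0.5], [1, 0.5, 1]], dtype=float)
    return dual_channel_pcnn(IA, IB, betaA, betaB, K)
```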

4. Simulations and results

This section consists of three parts: parameter setting, performance evaluation, and practical applications. Different algorithm parameters can produce different effects, so the section begins with parameter setting. In order to verify the capability of the proposed method, performance evaluation is then carried out using both visual effect and objective evaluation criteria. Finally, we illustrate some applications in practical work.

4.1. Parameter setting

In the dual-channel PCNN, the parameter settings are as follows: a large synaptic weight matrix K would make the computer simulation time-consuming, so K = [1, 0.5, 1; 0.5, 0, 0.5; 1, 0.5, 1]; the level factor, time constant and normalizing constant are listed in Table 1. In addition, the parameters in Eqs. (12)-(14) are also given in Table 1. Note that these parameters were set manually through experiments.

For comparison, several existing methods are used: contrast pyramid (CP), FSD pyramid (FSD), gradient pyramid (GP), morphological pyramid (MP), SIDWT with Haar (SIDWT), and an existing PCNN method (Huang's method). In order to make the comparison reliable and repeatable, we use the image fusion toolbox for MATLAB developed by Rockinger [23]. The toolbox includes all of the above methods except Huang's method and can be downloaded on the web (http://www.metapix.de/toolbox.htm). The toolbox parameters are set as: pyramid level = 4; selection rules: highpass = select max, lowpass = average. With this toolbox, our experiments can be reproduced exactly from the parameters provided above; more details can be found in the related references [6]. For the scheme and PCNN model of Huang's method, Huang and Jing [7,8] give a detailed explanation in their papers. Its iteration count is 300 and the other parameters are given in Table 2.

Fig. 6. Test results (LETTER). (a) Our method, (b) CP method, (c) Huang’s method, (d) FSD method, (e) GP method, (f) MP method, (g) SIDWT method, and (h) reference image.


4.2. Performance evaluation

To evaluate the performance of our proposed fusion method, extensive experiments with multi-focus image fusion and different-sensor image fusion have been performed. Here, we present four groups of experiments (LETTER, BADGE, TEAPOT, and PHOTO). All test images used in the experiments are shown in Fig. 5. Note that the images in LETTER, BADGE and TEAPOT are synthetic, while the images in PHOTO are real with synthetic blur. The content of LETTER is simple; there is more information in BADGE and TEAPOT; PHOTO is a real picture acquired by a camera. Hence, the image content becomes more and more complicated from LETTER to PHOTO. The images in Figs. 6, 9, 12 and 15 are the experimental results obtained by the various methods; each figure contains eight images, one reference image and seven images fused by the seven corresponding algorithms. Figs. 7, 8, 10, 11, 13, 14, 16 and 17 show the performance of the different algorithms.


4.2.1. Objective evaluation

To objectively evaluate the methods mentioned above, we choose two measures: the root mean squared error (RMSE) and the structural similarity (SSIM) index. Here R and F denote the reference image and the fused image, respectively. RMSE is used to evaluate the performance of the fusion methods and is defined as

RMSE = \sqrt{ \frac{1}{M \times N} \sum_{i=1}^{M} \sum_{j=1}^{N} [F(i, j) - R(i, j)]^2 }    (15)

Usually, a smaller RMSE signifies better performance of the fusion algorithm. The results of the RMSE assessment are shown in Figs. 7, 10, 13 and 16. The structural similarity (SSIM) index was proposed by Wang et al. [26]. Because it is a better approach to image quality measurement, we also use it to assess the performance of the different methods. The mathematical expression of the SSIM index is

SSIM = \frac{(2 \mu_R \mu_F + C_1)(2 \sigma_{RF} + C_2)}{(\mu_R^2 + \mu_F^2 + C_1)(\sigma_R^2 + \sigma_F^2 + C_2)}    (16)

Fig. 7. Objective evaluation of LETTER (RMSE): Our method 0.0216; CP 0.5914; Huang's method 1.6782; FSD 2.8283; GP 2.8288; MP 0.6319; SIDWT 1.2887.

Fig. 8. Objective evaluation of LETTER (SSIM index): Our method 0.9999; CP 0.9999; Huang's method 0.9308; FSD 0.8291; GP 0.829; MP 0.9372; SIDWT 0.9863.


where C_1 = (K_1 L)^2 and C_2 = (K_2 L)^2, L is the dynamic range of the pixel values (255 for 8-bit grayscale images), and K_1, K_2 << 1 are small constants. More details can be found in reference [26]. Note that the SSIM index MATLAB program can be downloaded from the website (http://www.ece.uwaterloo.ca/~z70wang/research/ssim/). The values of the SSIM index in this paper are computed with the default parameters. The SSIM index describes the similarity of two inputs.

A larger value indicates that the two inputs are more similar. The experimental results of the SSIM index are shown in Figs. 8, 11, 14 and 17. As for the objective evaluation, Figs. 7, 8, 10, 11, 13 and 14 show that our proposed method is not inferior to most methods. According to the evaluation rules of RMSE and the SSIM index, a better multi-focus image fusion method should have a smaller RMSE and a larger SSIM index. In fact, the experimental results demonstrate that our method has the

Fig. 9. Test results (BADGE). (a) Our method, (b) CP method, (c) Huang’s method, (d) FSD method, (e) GP method, (f) MP method, (g) SIDWT method, and (h) reference image.

Fig. 10. Objective evaluation of BADGE (RMSE): Our method 0.5792; CP 1.5996; Huang's method 2.3938; FSD 4.6158; GP 4.5491; MP 6.4707; SIDWT 2.7038.


smallest RMSE (in Figs. 7, 10, 13 and 16) and the largest SSIM index (in Figs. 8, 11 and 17) among the compared algorithms. However, Fig. 14 shows that SIDWT has the largest SSIM index there. At the same time, the suggested method, CP, SIDWT, and Huang's method have similar performances in Figs. 13, 14, 16 and 17. Hence it is essential to carry out subjective evaluation in this case.
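For reference, the two objective criteria above can be computed as follows. This is a small sketch using NumPy and scikit-image (an assumption of ours; the paper used the authors' own code and the cited SSIM MATLAB program), assuming 8-bit grayscale arrays.

```python
import numpy as np
from skimage.metrics import structural_similarity

def rmse(reference, fused):
    """Root mean squared error, Eq. (15)."""
    diff = fused.astype(float) - reference.astype(float)
    return np.sqrt(np.mean(diff ** 2))

def ssim_index(reference, fused):
    """Structural similarity, Eq. (16), with the default K1, K2 constants."""
    return structural_similarity(reference, fused, data_range=255)
```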


4.2.2. Subjective evaluation

As we know, performance evaluation usually includes two aspects: subjective (or qualitative) evaluation and objective (or quantitative) evaluation. For multi-focus image fusion, subjective evaluation means the evaluation of visual effect. Here we use visual evaluation to assess every method.

Fig. 11. Objective evaluation of BADGE (SSIM index): Our method 0.9996; CP 0.9981; Huang's method 0.9934; FSD 0.9818; GP 0.9821; MP 0.9625; SIDWT 0.9953.

Fig. 12. Test results (TEAPOT). (a) Our method, (b) CP method, (c) Huang’s method, (d) FSD method, (e) GP method, (f) MP method, (g) SIDWT method, and (h) reference image.


From Figs. 6, 9, 12, and 15, it is clear that the images fused by the FSD method and the GP method have lower brightness than the others; that is, the FSD and GP methods give the fused images low contrast, which is not what we want. The MP method has good contrast, but its processing of edges is poor, typically as in Figs. 6f, 9f, and 12f. Although the remaining four methods (the suggested method, CP, SIDWT, and Huang's method) have a similar visual effect, differences among them still exist, and in fact our method performs better. Here we use a simple procedure to make this difference easier to see. We take LETTER (in Fig. 6) and TEAPOT as examples; our schemes for magnifying the differences among these four methods are as follows. For LETTER:

(1) Let R0, R1, R2, and R3 denote the fused images of the suggested method, CP, SIDWT, and Huang's method, respectively. Rs denotes the reference image.
(2) Get the binary images: B0 = R0 > 0; B1 = R1 > 0; B2 = R2 > 0; B3 = R3 > 0; Bs = Rs > 0.

(3) Obtain the error images: E0 = XOR(B0, Bs); E1 = XOR(B1, Bs); E2 = XOR(B2, Bs); E3 = XOR(B3, Bs), where XOR indicates the exclusive OR operation. The results are shown in Fig. 18.

For TEAPOT:

(1) Let R0, R1, R2, and R3 denote the fused images of the suggested method, CP, SIDWT, and Huang's method, respectively.
(2) Compute the difference between Rs and each of R0, R1, R2, R3: D0 = |Rs - R0|; D1 = |Rs - R1|; D2 = |Rs - R2|; D3 = |Rs - R3|. Here Rs denotes the reference image and |.| indicates the absolute value of the difference.
(3) Obtain the error images: E0 = D0 > 0; E1 = D1 > 0; E2 = D2 > 0; E3 = D3 > 0. The results are shown in Fig. 19.

In Figs. 18 and 19, all the images are binary images and a light point means an error. The error image of our method in Fig. 18 is completely dark, which shows that the fused image obtained by our method is more similar to the reference image than the others. Obviously, the CP method and Huang's method give very poor results in Fig. 19.
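For illustration, the two error-image constructions can be written compactly as follows; this is a hedged NumPy sketch of the procedures above, not the authors' code, and it assumes the fused and reference images are NumPy arrays.

```python
import numpy as np

def error_image_letter(fused, reference):
    """LETTER scheme: binarize both images at zero and mark disagreements (XOR)."""
    B = fused > 0
    Bs = reference > 0
    return np.logical_xor(B, Bs)

def error_image_teapot(fused, reference):
    """TEAPOT scheme: mark every pixel where the fused image differs from the reference."""
    D = np.abs(reference.astype(float) - fused.astype(float))
    return D > 0
```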

Fig. 13. Objective evaluation of TEAPOT (RMSE): Our method 3.5905; CP 4.5103; Huang's method 3.6549; FSD 5.1081; GP 5.12; MP 6.0499; SIDWT 3.7865.

Fig. 14. Objective evaluation of TEAPOT (SSIM index): Our method 0.9839; CP 0.9884; Huang's method 0.9886; FSD 0.9608; GP 0.9619; MP 0.9456; SIDWT 0.9903.


Comparing Fig. 19(a) with Fig. 19(d), one finds that the big teapot has the same brightness in both, while the two little teapots in Fig. 19(a) are clearly darker than the ones in Fig. 19(d). This indicates that the proposed method is better than the SIDWT method. Hence, our method is better than the others.


In addition, we would like to compare our method with Huang's method from Refs. [7,8]; after all, they have similar performance under both objective and subjective standards. Compared with Huang's method, our method has at least two advantages. One is that our method has lower complexity

Fig. 15. Test results (PHOTO). (a) Our method, (b) CP method, (c) Huang’s method, (d) FSD method, (e) GP method, (f) MP method, (g) SIDWT method, and (h) reference image.

Fig. 16. Objective evaluation of PHOTO (RMSE): Our method 2.3865; CP 2.3951; Huang's method 2.402; FSD 6.0212; GP 6.0169; MP 4.0758; SIDWT 2.4105.


Fig. 17. Objective evaluation of PHOTO (SSIM index): Our method 0.9954; CP 0.9934; Huang's method 0.9905; FSD 0.9876; GP 0.9877; MP 0.9583; SIDWT 0.9936.

Fig. 18. Error test of LETTER. (a) Our method, (b) CP method, (c) Huang’s method, and (d) SIDWT.

Fig. 19. Error test of TEAPOT. (a) Our method, (b) CP method, (c) Huang’s method, and (d) SIDWT.

because it does not need to divide the input images into many image blocks, nor does it need to compare and choose the best image block. The other is that our method has a lower time cost (shown in Table 3), for it involves fewer operations than the latter. Additionally, a large amount of time spent determining the adjustable parameter \beta greatly decreases the latter's efficiency. The data in Table 3 were obtained on an IBM graphics workstation (IntelliStation Z Pro), with all code run in MATLAB 7.0.

4.3. Practical application

Here we take four groups of images as examples to demonstrate our method's applications in practice. The first two groups

Table 3. Comparison of time cost between two methods (time unit: second).
Methods            LETTER    BADGE      TEAPOT     PHOTO
Our method         0.3340    0.4131     0.6866     0.6469
Huang's method     5.5749    10.1060    18.5210    20.6628

are microscope images acquired by a Motic digital microscope under focus changes (see Figs. 20 and 21). The latter two groups are sets of photographs (see Figs. 22 and 23) obtained from the internet (http://www.imgfsr.com/ifsr_ifs.html; http://www2.mta.ac.il/~tal/Fusion/). Image A and image B are source images with different focuses; image C is the fused image obtained by our proposed method.


Fig. 20. Microscope images (1). (a) image A, (b) image B, and (c) image C.

Fig. 21. Microscope images (2). (a) image A, (b) image B, and (c) image C.

Fig. 22. Example of photography (1). (a) image A, (b) image B, and (c) image C.

Fig. 23. Example of photography (2). (a) image A, (b) image B, and (c) image C.

5. Conclusion

This paper presents a novel multi-focus image fusion algorithm based on the dual-channel PCNN. The method improves the standard PCNN model and simplifies the process of image fusion using PCNN in comparison with previous methods. Previous methods usually employ multiple PCNNs or combinations with other algorithms such as the DWT, while the method proposed in this paper uses just one dual-channel PCNN to implement multi-focus image fusion. Experimental results show that our presented method

outperforms the existing methods in both visual effect and objective evaluation criteria. The practical applications show that our method is feasible. Because the method is simple and easy to implement, it is also suitable for real-time system platforms.

Acknowledgements

We thank the associate editor and the reviewers for their helpful and constructive suggestions. The authors also thank Ying


Zhu for her support and help. This paper is jointly supported by the National Natural Science Foundation of China (No. 60872109), the Program for New Century Excellent Talents in University (NCET-06-0900), the China Scholarship, and the Fundamental Research Funds for the Central Universities of Lanzhou University in China (lzujbky-2009-129).

References

[1] C.H. Anderson, A filter-subtract-decimate hierarchical pyramid signal analyzing and synthesizing technique, US Patent 718104, 1987.
[2] R.P. Broussard, S.K. Rogers, M.E. Oxley, G.L. Tarr, Physiologically motivated image fusion for object detection using a pulse coupled neural network, IEEE Transactions on Neural Networks 10 (1999) 554-563.
[3] P.J. Burt, A gradient pyramid basis for pattern-selective image fusion, Society for Information Display (SID) International Symposium Digest of Technical Papers 23 (1992) 467-470.
[4] R. Eckhorn, H.J. Reitboeck, M. Arndt, P.W. Dicke, Feature linking via synchronization among distributed assemblies: simulation of results from cat cortex, Neural Computation 2 (1990) 293-307.
[5] H.A. Eltoukhy, S. Kavusi, A computationally efficient algorithm for multi-focus image reconstruction, in: Proceedings of SPIE - The International Society for Optical Engineering, Santa Clara, 2003, pp. 332-341.
[6] R.C. Gonzalez, P. Wintz, Digital Image Processing, Pearson Education, NJ, 1978.
[7] W. Huang, Z.L. Jing, Multi-focus image fusion using pulse coupled neural network, Pattern Recognition Letters 28 (2007) 1123-1132.
[8] W. Huang, Z.L. Jing, Evaluation of focus measures in multi-focus image fusion, Pattern Recognition Letters 28 (2007) 493-500.
[9] J.L. Johnson, M.L. Padgett, PCNN models and applications, IEEE Transactions on Neural Networks 10 (1999) 480-498.
[10] J.L. Johnson, H.S. Ranganath, G. Kuntimad, H.J. Caulfield, Pulse coupled neural networks, Neural Networks and Pattern Recognition (1998) 1-56.
[11] J.L. Johnson, D. Ritter, Observation of periodic waves in a pulse-coupled neural network, Optics Letters 18 (1993) 1253-1255.
[12] E. Krotkov, Focusing, International Journal of Computer Vision 1 (1987) 223.
[13] M. Li, W. Cai, Z. Tan, A region-based multi-sensor image fusion scheme using pulse-coupled neural network, Pattern Recognition Letters 27 (2006) 1948-1956.
[14] S. Li, J.T. Kwok, Y. Wang, Multifocus image fusion using artificial neural networks, Pattern Recognition Letters 23 (2002) 985-997.
[15] H. Li, B.S. Manjunath, S.K. Mitra, Multisensor image fusion using the wavelet transform, Graphical Models and Image Processing 57 (1995) 235-245.
[16] W. Li, X.F. Zhu, A new image fusion algorithm based on wavelet packet analysis and PCNN, in: Proceedings of the Fourth International Conference on Machine Learning and Cybernetics, Guangzhou, 2005, pp. 5297-5301.
[17] G. Ligthart, F. Groen, A comparison of different autofocus algorithms, in: Proceedings of the International Conference on Pattern Recognition, Munich, 1982, pp. 597-600.
[18] T. Lindblad, J.M. Kinser, Image Processing Using Pulse-Coupled Neural Networks, second ed., Springer, New York, 2005.
[19] Y.D. Ma, L. Li, Y.F. Wang, R.L. Dai, Principle of Pulse Coupled Neural Network and Its Applications, Science Press, Beijing, 2006.
[20] Q. Miao, B. Wang, A novel adaptive multi-focus image fusion algorithm based on PCNN and sharpness, in: Proceedings of SPIE - The International Society for Optical Engineering, Orlando, 2005, pp. 704-712.
[21] H.S. Ranganath, G. Kuntimad, J.L. Johnson, Pulse coupled neural networks for image processing, in: Proceedings of the Southeast Conference on 'Visualize the Future', Raleigh, 1995, pp. 37-43.
[22] O. Rockinger, Image sequence fusion using a shift-invariant wavelet transform, in: Proceedings of the IEEE International Conference on Image Processing, Santa Barbara, 1997, pp. 288-291.
[23] O. Rockinger, Image fusion toolbox for Matlab, Technical report, Metapix, 1999. http://www.metapix.de/toolbox.htm.
[24] A. Toet, A morphological pyramidal image decomposition, Pattern Recognition Letters 9 (1989) 255-261.
[25] A. Toet, L.J. van Ruyven, J.M. Valeton, Merging thermal and visual images by a contrast pyramid, Optical Engineering 28 (1989) 789-792.
[26] Z. Wang, A.C. Bovik, H.R. Sheikh, E.P. Simoncelli, Image quality assessment: from error visibility to structural similarity, IEEE Transactions on Image Processing 13 (2004) 600-612.

About the Author—ZHAOBIN WANG received the B.A. degree in electronic information science and technology from Yantai University, Yantai, China, in 2004. He is currently pursuing the Ph.D. degree in radio physics at Lanzhou University, Lanzhou, China. Since 2004, he has been working on image processing techniques, mostly on biological images. In particular, he has developed a special interest in biomedical image segmentation and measurement, pattern recognition and artificial neural networks, especially PCNN.

About the Author—YIDE MA received his B.S. and M.S. degrees in Electronic Engineering from the University of Electronic Science and Technology of China, Chengdu, China, and his Ph.D. degree from Lanzhou University. He is now a professor in the School of Information Science and Engineering, Lanzhou University. His research focuses on image processing and embedded-system applications. He has authored and coauthored more than 100 publications in journals, books and international conference proceedings.

About the Author—JASON GU received a B. Sc. degree in 1992 from the Department of Electrical Engineering and Information Science (Special Class for the Young Gifted 1987–1990), University of Science and Technology of China, M. Sc. degree in 1995 from Biomedical & Instrumentation Engineering, Jiaotong University (Shanghai), China and Ph.D. degree from the Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta, Canada. Since September 2000, he has been with the Department of Electrical and Computer Engineering at Dalhousie University and is presently an associate professor of Electrical Engineering. His research focuses on Robotics, Biomedical Engineering, Intelligent Systems, etc.