A sparse representation based pansharpening method

Future Generation Computer Systems, https://doi.org/10.1016/j.future.2018.04.096
Received 28 October 2017; revised 4 March 2018; accepted 29 April 2018

Xiaomin Yang (a), Lihua Jian (a), Binyu Yan (a), Kai Liu (b), Lei Zhang (c,d), Yiguang Liu (c,*)

(a) Sichuan University, College of Electronics and Information Engineering, No. 24 South Section 1, Yihuan Road, Chengdu, China, 610065
(b) Sichuan University, School of Electrical Engineering and Information, No. 24 South Section 1, Yihuan Road, Chengdu, China, 610065
(c) Sichuan University, College of Computer Science, No. 24 South Section 1, Yihuan Road, Chengdu, China, 610065
(d) Qinghai University, Department of Computer Technology and Application, Qinghai, China, 810016

Abstract

Insufficient information captured by a single satellite sensor can hardly fit real applications. Pansharpening, a hot topic in the remote sensing field, combines the spectral information of a multispectral image with the spatial details of a panchromatic image to obtain a high spatial resolution multispectral image. In this paper, we present a novel sparse representation-based pansharpening method, which consists of three stages: dictionary construction, panchromatic image decomposition, and high spatial resolution multispectral image reconstruction. First, we use multispectral images as the training set and calculate the intensity channels of the multispectral images; we then obtain the high-frequency and low-frequency components of the intensity channels. Second, we sparsely decompose the panchromatic image using a pair of dictionaries to obtain its high-frequency and low-frequency components. Third, the optimized high-frequency components of the panchromatic image are integrated into the multispectral image to generate the final high resolution multispectral image. Quantitative and subjective evaluations show that the proposed method achieves better effectiveness and practicality than existing sparse representation-based methods.

* Corresponding author. Email address: [email protected] (Yiguang Liu)


Keywords: Pansharpening, Sparse representation, Multispectral images, Panchromatic image

1. Introduction

Satellites are becoming increasingly important in everyday life, for applications such as weather forecasting, environment monitoring, and Earth-surface observation. Moreover, computers can collect, analyse, and process much more remote sensing data from various satellite sensors through the Internet of Things (IoT) [1], e.g., for drunk-driving detection and water-pollution tracing, by using optical remote sensing technology to obtain sufficiently precise information. However, a satellite cannot capture images with both high spatial and high spectral resolution simultaneously [2, 3]. A multispectral (MS) image has a high spectral resolution while suffering from a low spatial resolution, and the panchromatic (PAN) image possesses the opposite characteristics. Therefore, pansharpening aims at providing more precise, detailed, and abundant information for improving the visual perception of satellite images by merging the complementary information from the panchromatic band and the color bands.

In recent years, various pansharpening methods have been proposed. These methods sharpen the MS image by transferring the detail information estimated from the corresponding PAN image [4, 5, 6, 7, 8]. Pansharpening methods can be divided into three types: component-substitution (CS)-based methods [9, 10, 11, 12, 13, 6], multi-resolution-analysis (MRA)-based methods [14, 15, 16, 17, 18, 19, 20], and sparse representation (SR)-based methods [21, 22, 23, 24, 25, 26, 27, 28, 29].

CS-based methods include principal component analysis (PCA) [11], Gram-Schmidt (GS) [12], the IHS transform [9], and various versions of IHS [6, 10, 13]. In the FIHS-based method [13], the intensity component is computed by a fixed linear combination of the MS bands. Subsequently, Rahmani et al. [6] propose an adaptive method that adjusts the linear coefficients of the MS bands. However, the weights, computed only from the edges of the PAN image to extract the detail information, often lose the color information of the MS image. Therefore, the IAIHS-based method [10] is proposed by Leung et al., which uses both a PAN-induced and an MS-induced way to balance the weight coefficients. Although these methods are simple and efficient, they suffer from spectral distortion or degradation.

MRA-based methods typically include the Laplacian pyramid (LP) [18], the wavelet transform (WT) [14, 16], and the contourlet transform (CT) [15].

The most representative MRA-based method, called AWLP, is proposed by Otazu et al. [19] in 2005. This method uses the à trous wavelet to extract the high frequency of the PAN image. In 2008, Zheng et al. [20] propose an SVT-based pansharpening method, which uses a series of support value filters to extract detail features from the PAN image. MRA-based methods achieve a better effect in preserving the spectral information of the MS image; however, they often introduce spatial blurring.

Sparse representation has been applied to pansharpening and has achieved satisfactory performance. It was first applied to pansharpening by Li et al. [21] in the CSIF-based method.

However, the dictionary of the CSIF-based method is constructed from high resolution multispectral (HMS) images, which cannot be obtained in practice. Furthermore, using the low resolution MS image and the single-channel PAN image to reconstruct the multi-channel HMS image is an ill-posed problem, which leads to serious spectral and spatial distortions. As shown in Fig. 1, compared with the MS image, the CSIF-based method indeed improves the spatial resolution of the resulting image; however, the result exhibits serious spectral distortion and obvious blocking artifacts when compared with the reference HMS image. Zhu et al. [22] then propose a sparse FI-based method; however, this method constructs the dictionary without using the spectral information of the MS image. Guo et al. [24] present a dual-dictionary-based pansharpening method (OCDL). This method performs better in reconstructing the spectral information of the MS image, but it has poor real-time performance. Yin [28] proposes the SRDIP-based method, which does not well preserve the spectral information of the MS image.

Generally, the limitations of existing SR-based methods can be summarised as follows: 1) practicality is poor because real HMS images are lacking; 2) the spectral information of the MS image is not used in constructing the dictionaries; 3) some results suffer from certain degrees of distortion; 4) the computation speed cannot satisfy real-time requirements. Therefore, there is room for improving the sharpened results.


Figure 1: The pansharpening result by the CSIF-based method

In this paper, a new SR-based pansharpening method is proposed. 1) We construct the high-frequency and low-frequency dictionaries with the spectral information of MS images, so that the reconstructed HMS image achieves high fidelity and the spectral distortion of the sharpened result is decreased. 2) The PAN image is decomposed into a high-frequency component (HFC) and a low-frequency component (LFC) by the high-frequency (HF) and low-frequency (LF) dictionaries. Hence, the HFC of the PAN image can fit the spatial detail and spectral features of the MS image. 3) We merge the MS image and the HFC of the PAN image to reconstruct the HMS image.

Our method provides the following two contributions. First, to build dictionaries preserving the spectral information of the MS image, the HF and LF dictionaries are constructed from the information of the MS images. Second, to fit the spatial detail information and the spectral characteristics of the MS image, the HFC of the PAN image is extracted patch by patch with the HF dictionary, which can precisely extract the spatial detail information of the PAN image.

The structure of this paper is as follows. Section 2 briefly introduces the SR-based method; Section 3 illustrates the proposed method in detail; Section 4 discusses the experimental results; finally, Section 5 summarizes the proposed method.

2. Related work

2.1. Sparse representation

The fundamental theory of sparse representation [30, 31] is as follows: given a signal $x \in \mathbb{R}^n$, we can use an over-complete dictionary $D \in \mathbb{R}^{n \times m}$ ($n < m$) to represent the signal as a linear combination, written as $x = D\alpha$, where $\alpha \in \mathbb{R}^m$ is called the sparse coefficient vector. Fig. 2 illustrates the theory of sparse representation. The optimized sparse coefficient vector can be calculated under the following constraint:

$$\min_{\alpha} \|\alpha\|_0 \quad \text{s.t.} \quad \|x - D\alpha\|_2^2 \leq \varepsilon^2, \qquad (1)$$

where $\|\cdot\|_0$ denotes the $l_0$-norm, which counts the number of nonzero elements in a vector, and $\varepsilon$ is a positive number representing the iterative stopping error.








Figure 2: Sparse representation of a random signal

In real applications, Eq. (1) can be solved by the basis pursuit (BP) algorithm [32], non-convex algorithms [33], or the orthogonal matching pursuit (OMP) algorithm [34].
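As an illustration of how such a sparse coding problem can be solved in practice, the following Python sketch codes a toy signal with greedy OMP; the random dictionary, signal, and sparsity level are stand-in assumptions, not the dictionaries used later in this paper.

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

# Toy over-complete dictionary D (n < m) and a k-sparse signal x = D @ alpha_true.
rng = np.random.default_rng(0)
n, m, k = 64, 256, 5                      # signal length, number of atoms, sparsity
D = rng.standard_normal((n, m))
D /= np.linalg.norm(D, axis=0)            # unit-norm atoms, as OMP expects
alpha_true = np.zeros(m)
alpha_true[rng.choice(m, k, replace=False)] = rng.standard_normal(k)
x = D @ alpha_true

# Greedy solver for Eq. (1): select at most k atoms that best explain x.
omp = OrthogonalMatchingPursuit(n_nonzero_coefs=k, fit_intercept=False)
omp.fit(D, x)
alpha = omp.coef_
print("reconstruction error:", np.linalg.norm(x - D @ alpha))
```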

3. Proposed method

The proposed method assumes that the PAN image can be decomposed into an HFC and an LFC based on the HF and LF dictionaries. The HFC of the PAN image and the MS image are then merged to reconstruct the HMS image. The proposed method consists of three stages: (1) dictionary construction; (2) PAN image decomposition; (3) high resolution MS image reconstruction. Fig. 3 illustrates the implementation of the proposed method. First, we use the MS training set to obtain the intensity channels; these intensity channels are then degraded into low-resolution images, which serve as the LF of the intensity channels. Subsequently, the HF and LF of the intensity channels are randomly sampled to obtain the HF and LF dictionaries, respectively. Second, we decompose the PAN image patch by patch and use the merged dictionary (the HF and LF dictionaries concatenated into one big dictionary) to sparsely represent these patches. The resultant sparse coefficients are divided into two parts: HF sparse coefficients and LF sparse coefficients. The HF dictionary is then multiplied by the HF sparse coefficients to obtain the HF of each PAN image patch. Third, we reshape these HF PAN patches and integrate them into the MS image to obtain the HMS image.

3.1. Dictionary construction

To build dictionaries preserving the spectral information of the MS image, the HF and LF dictionaries are constructed from the information of the MS images. During dictionary construction, we obtain the intensity channel from the training set, which consists of original MS images:

$$I = \frac{1}{N} \sum_{i=1}^{N} MS_i, \qquad (2)$$


Figure 3: Scheme of the proposed method

where $N$ denotes the number of spectral channels, $MS_i$ is the $i$-th channel of the MS image, and $I$ represents the intensity channel. The LFC of $I$ is obtained by low-pass downsampling the intensity channel and then upsampling the result:

$$I^L = \uparrow(\downarrow I), \qquad (3)$$

where $\uparrow$ and $\downarrow$ represent the upsampling and downsampling operations, respectively. The HFC of the intensity channel is then calculated as

$$I^H = I - I^L, \qquad (4)$$

where $I^H$ and $I^L$ denote the HFC and LFC of $I$, respectively.
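A minimal sketch of this high/low-frequency split, assuming bicubic resampling and the factor-4 resolution ratio used later in the experiments (both the resampling kernel and the helper name are our assumptions):

```python
import numpy as np
from scipy.ndimage import zoom

def split_intensity(ms: np.ndarray, factor: int = 4):
    """ms: (H, W, N) multispectral image. Returns (I, I_hf, I_lf) per Eqs. (2)-(4)."""
    I = ms.mean(axis=2)                      # Eq. (2): intensity channel
    low = zoom(I, 1.0 / factor, order=3)     # downsample (bicubic spline, order=3)
    I_lf = zoom(low, factor, order=3)        # upsample back to the original grid: Eq. (3)
    I_lf = I_lf[: I.shape[0], : I.shape[1]]  # guard against rounding in the zoom sizes
    I_hf = I - I_lf                          # Eq. (4)
    return I, I_hf, I_lf
```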

According to [21], an unknown compressive image can be linearly represented by globally sampled raw patches taken from training images. Additionally, raw patches sampled from the MS images can reflect the true spectral information of the scene when reconstructing the HMS image. Therefore, the high/low-frequency dictionaries can be constructed by randomly sampling the $I^H$ and $I^L$ images instead of using a dictionary-learning-based method, which is hard and time consuming.

In this paper, the patch size is set to one of two values according to the satellite sensor, and the number of sampled patches is set to $1 \times 10^4$, which ensures that the dictionary is overcomplete. For each training image, we randomly sample 200 patches. We can then obtain the high- and low-frequency dictionaries $D_{ic}^H$ and $D_{ic}^L$.
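A hedged sketch of this sampling step (the default patch size and the reuse of split_intensity from the sketch above are assumptions; the paper fixes only the total patch count):

```python
import numpy as np

def build_dictionaries(training_ms, patch=16, per_image=200, rng=None):
    """Randomly sample co-located patches from I^H and I^L of each training MS
    image; the columns of D_h / D_l are the vectorized patches."""
    if rng is None:
        rng = np.random.default_rng(0)
    cols_h, cols_l = [], []
    for ms in training_ms:
        _, I_hf, I_lf = split_intensity(ms)        # helper from the previous sketch
        H, W = I_hf.shape
        for _ in range(per_image):
            y = rng.integers(0, H - patch + 1)
            x = rng.integers(0, W - patch + 1)
            cols_h.append(I_hf[y:y + patch, x:x + patch].ravel())
            cols_l.append(I_lf[y:y + patch, x:x + patch].ravel())
    # Shapes: (patch*patch, n_sampled_patches), e.g. 256 x 10^4 for 16x16 patches.
    return np.stack(cols_h, axis=1), np.stack(cols_l, axis=1)
```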

3.2. PAN image decomposition

In this work, we decompose the PAN image patch by patch instead of decomposing the whole image directly. As illustrated in Fig. 3, the size of each patch is 8 × 8. First, we rearrange the selected image patch into a column vector of pixels. Then, we represent the column vector over the high/low-frequency dictionaries and obtain the whole sparse coefficient vector. Finally, we divide the whole sparse coefficient vector into two parts, one of which is used to reconstruct the high-frequency patch in terms of the high-frequency dictionary.

The proposed method divides the PAN image into overlapping patches. The $k$-th patch can be decomposed into an HFC and an LFC as follows:

$$P_k = P_k^H + P_k^L, \qquad (5)$$

where $P_k^H$ and $P_k^L$ represent the HFC and LFC of the $k$-th patch, respectively. According to sparse representation theory [30, 31], we can compute the sparse coefficients of the HFC and LFC by solving the following optimization problem:

$$\min_{\alpha_k^H, \alpha_k^L} \left( \|\alpha_k^H\|_1 + \|\alpha_k^L\|_1 \right) \quad \text{s.t.} \quad \left\| P_k - \left( D_{ic}^H \alpha_k^H + D_{ic}^L \alpha_k^L \right) \right\|_2 \leq \varepsilon, \qquad (6)$$

where $\alpha_k^H$ and $\alpha_k^L$ are the sparse coefficients of $P_k^H$ and $P_k^L$, respectively. Thus, $P_k^H$ and $P_k^L$ can be sparsely represented in terms of the corresponding dictionaries by:

$$P_k^H = D_{ic}^H \alpha_k^H, \qquad (7)$$

$$P_k^L = D_{ic}^L \alpha_k^L. \qquad (8)$$

Letting $D_{ic} = \left( D_{ic}^H, D_{ic}^L \right)$ and $\alpha_k = \left( (\alpha_k^H)^T, (\alpha_k^L)^T \right)^T$, Eq. (6) can be rewritten as follows:

$$\min_{\alpha_k} \|\alpha_k\|_1 \quad \text{s.t.} \quad \|P_k - D_{ic} \alpha_k\|_2 \leq \varepsilon, \qquad (9)$$

where the sparse coefficient vector $\alpha_k$ in Eq. (9) can be solved by using the OMP algorithm. Subsequently, to obtain the HF component of the PAN image, the HF dictionary is multiplied by the HF sparse coefficients to generate the $k$-th HF patch through Eq. (7), and these HF patches are then reshaped into the whole HF of the PAN image.
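The following sketch ties this step together: each vectorized PAN patch is coded over the concatenated dictionary of Eq. (9) with OMP (the solver the paper itself uses, which greedily approximates the sparse solution), and only the HF half of the coefficients is kept, per Eq. (7). The sparsity level is an assumed free parameter.

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

def hf_of_patch(p_k, D_h, D_l, n_nonzero=8):
    """p_k: vectorized PAN patch. Returns its high-frequency part via Eqs. (7)/(9)."""
    D = np.concatenate([D_h, D_l], axis=1)   # D_ic = (D_ic^H, D_ic^L)
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=n_nonzero, fit_intercept=False)
    omp.fit(D, p_k)
    alpha = omp.coef_                        # whole sparse coefficient vector alpha_k
    alpha_h = alpha[: D_h.shape[1]]          # HF half of the coefficients
    return D_h @ alpha_h                     # Eq. (7): P_k^H = D_ic^H alpha_k^H
```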

3.3. High resolution MS image reconstruction

The PAN image is thus decomposed patch by patch to obtain the HF patches, which are rearranged to form the $P^H$ image; overlapping patches are combined by weighted averaging. Owing to the high spatial resolution of the PAN image, the decomposed $P^H$ image also possesses high spatial resolution. According to imaging theory, the $P^H$ image constitutes the HF detail of the MS image. Therefore, the obtained HFC of the PAN image is integrated into the MS image to create the HMS image as follows:

$$HMS_i = MS_i + w(i) P^H, \qquad (10)$$

where $HMS_i$ denotes the $i$-th channel of the HMS image and $w(i)$ is the weighting coefficient of the $i$-th channel, calculated as follows:

$$w(i) = \frac{MS_i}{\frac{1}{N} \sum_{j=1}^{N} MS_j}. \qquad (11)$$
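A short sketch of this injection step, assuming the MS image has already been interpolated to the PAN grid (an assumption implied by the pipeline in Fig. 3 rather than stated here):

```python
import numpy as np

def inject_details(ms_up, p_h):
    """ms_up: (H, W, N) MS image on the PAN grid; p_h: (H, W) HF of the PAN image.
    Returns the HMS image per Eqs. (10)-(11)."""
    intensity = ms_up.mean(axis=2)                 # (1/N) * sum_j MS_j
    eps = 1e-8                                     # guard against division by zero
    w = ms_up / (intensity[..., None] + eps)       # Eq. (11), per pixel and channel
    return ms_up + w * p_h[..., None]              # Eq. (10)
```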

4. Experimental results and analysis

To demonstrate the effectiveness of the proposed method, we select test images from four satellite datasets: Spot-6, Pléiades, WorldView-2, and IKONOS. The differences among these satellite datasets are shown in Tab. 1. The MS image of WorldView-2 contains eight channels, i.e., red (R), green (G), blue (B), red-edge (RE), yellow (Y), coastal (C), near infrared 1 (NIR1), and near infrared 2 (NIR2); only four channels (R, G, B, NIR1) are utilized in our experiments. The MS images of the Spot-6, Pléiades, and IKONOS satellites contain four spectral channels (R, G, B, NIR1). Furthermore, the spatial resolution of each image pair (the PAN image and the MS image) differs between satellites; however, the resolution ratio is 4 for every pair. These satellite images, covering different scenes of vegetation, architecture, and soil areas, are divided into a training image set and a test image set. Fig. 4 shows part of the training image sets of the four satellites.


Figure 4: A portion of the training image sets used in this paper. (a) Spot-6 training image set. (b) WorldView-2 training image set. (c) Pléiades training image set. (d) IKONOS training image set.

Table 1: The differences among the satellite datasets

Feature                      Spot-6           Pléiades         WorldView-2          IKONOS
Dynamic Range (bits/pixel)   11               12 or 16         11                   11
Spatial Resolution (m)       PAN 1.5, MS 6    PAN 0.5, MS 2    PAN 0.46, MS 1.84    PAN 1, MS 4
Spectral Range (nm), PAN     455-745          470-830          450-800              450-900
Spectral Range (nm), MS      B: 455-525       B: 430-550       C: 400-450           B: 450-520
                             G: 530-590       G: 500-620       B: 450-510           G: 510-600
                             R: 625-695       R: 590-710       G: 510-580           R: 630-700
                             NIR1: 760-890    NIR1: 740-940    Y: 585-625           NIR1: 760-850
                                                               R: 630-690
                                                               RE: 705-745
                                                               NIR1: 770-895
                                                               NIR2: 860-1040

4.1. Parameters setup

Both the MS and PAN images captured by the satellites are partitioned into tiles of 256 × 256 pixels and 1024 × 1024 pixels, respectively. Since no real HMS image is available for reference, the original MS and PAN images are downsampled to a lower resolution by bicubic interpolation with a downsampling factor of 1/4. The input MS image is therefore of size 64 × 64 pixels, and the input PAN image is of size 256 × 256 pixels. The downsampled MS image is sharpened back to the original size (256 × 256 pixels) as the output HMS image. In this way, the original MS image can serve as the reference HMS image for visual comparison and objective measurement.
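A sketch of this reduced-resolution evaluation protocol, assuming scipy's bicubic-spline zoom as the interpolator (the paper states bicubic interpolation; the exact implementation is our assumption):

```python
import numpy as np
from scipy.ndimage import zoom

def degrade_pair(ms, pan, factor=4):
    """Simulate low-resolution inputs so the original MS tile serves as reference.
    ms: (256, 256, N) tile; pan: (1024, 1024) tile."""
    ms_low = np.stack([zoom(ms[..., i], 1.0 / factor, order=3)
                       for i in range(ms.shape[2])], axis=2)   # -> 64 x 64 x N
    pan_low = zoom(pan, 1.0 / factor, order=3)                  # -> 256 x 256
    return ms_low, pan_low, ms                                  # ms acts as reference HMS
```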

The dictionary patch size is set to 16 × 16 for the Spot-6, WorldView-2, and Pléiades satellites, and to 8 × 8 for the IKONOS satellite. Following sparse representation theory, the number of randomly sampled blocks is set to 1 × 10^4 to ensure the over-completeness of the dictionary. Therefore, both the HF and LF dictionaries are of size 256 × 10^4 for the Spot-6, WorldView-2, and Pléiades satellites, and of size 64 × 10^4 for the IKONOS satellite.

Moreover, to validate the advancements of the proposed method, ten traditional pansharpening methods are selected for comparison. These methods are divided into four classes: CS-based, MRA-based, SR-based, and recent methods


(see Tab. 2). The parameters of these methods are set as given by their authors, since the implementations are available online. For a fair comparison, the same training set is adopted for all SR-based methods; however, the CSIF-based method uses the reference HMS images to construct its dictionaries, while the SRDIP-based method takes the MS images directly. Two groups of experiments are implemented in this paper: 1) demonstrating the effect of using MS images in dictionary construction; 2) comparing ten pansharpening methods with the proposed method.

Table 2: Comparative methods

Class of methods    Methods
CS-based            FIHS [13], LS [16]
MRA-based           AWLP [19], SVT [20]
Recent methods      FEP [35], AIHS [6], IAIHS [10], MMP [36]
SR-based            CSIF [21], SRDIP [28]

4.2. Quality evaluation

To quantitatively evaluate the various methods, we employ six global metrics computed against the reference image:

1) CC (correlation coefficient) [7] measures the degree of correlation between the fused result and the reference; the fused result is closer to the reference when CC is larger.

2) SAM (spectral angle mapper) [37] denotes the absolute value of the spectral angle between the vectors of the fused result and the reference; less spectral-angle distortion yields a smaller SAM value.

3) RMSE (root mean squared error) indicates the average squared difference between the fused result and the reference; a smaller RMSE indicates better performance.

4) UIQI (universal image quality index) [38] reflects the similarity of the fused image to the reference; the larger the UIQI value, the better the quality, with a maximum of 1.

5) ERGAS (relative dimensionless global error in synthesis) [39] measures the overall quality of the fused result; a better fused result has a smaller value.

6) RASE (relative average spectral error) [40] reflects the average performance over the spectral bands; a lower RASE value means higher spectral quality.
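For concreteness, here is a sketch of three of these metrics (CC, RMSE, and a mean SAM) under their usual definitions; the exact normalizations of the paper's evaluation toolchain are not stated, so treat these as indicative:

```python
import numpy as np

def cc(fused, ref):
    """Correlation coefficient between two (H, W, N) images."""
    f = fused.ravel() - fused.mean()
    r = ref.ravel() - ref.mean()
    return float(f @ r / (np.linalg.norm(f) * np.linalg.norm(r)))

def rmse(fused, ref):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((fused - ref) ** 2)))

def sam(fused, ref, eps=1e-12):
    """Mean spectral angle (radians) over all pixels of (H, W, N) images."""
    dot = np.sum(fused * ref, axis=2)
    denom = np.linalg.norm(fused, axis=2) * np.linalg.norm(ref, axis=2) + eps
    return float(np.mean(np.arccos(np.clip(dot / denom, -1.0, 1.0))))
```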

4.3. Results and Comparison

(1) Validating the effect of dictionary selection

Fig. 5 shows the correlation coefficients (CC) between the HFCs of thirty PAN images decomposed by using the $D_{ic}^{MS}$ and $D_{ic}^{HMS}$ dictionaries, respectively. From the results, we can see that the CCs are all higher than 0.980 and the average CC is 0.989. Therefore, the HFCs of PAN images decomposed by using the $D_{ic}^{MS}$ dictionary are highly similar to those obtained by using the $D_{ic}^{HMS}$ dictionary.


Figure 5: The correlation coefficients (CC) of high-frequency components (HFCs) decomposed by using different dictionaries.

Fig. 6 shows the CC between the original PAN images and the PAN images reconstructed by using the $D_{ic}^{MS}$ and $D_{ic}^{HMS}$ dictionaries, where CC-MS denotes the CC obtained with the $D_{ic}^{MS}$ dictionary and CC-HMS the CC obtained with the $D_{ic}^{HMS}$ dictionary. The reconstructed results of the two dictionaries are almost exactly the same. To further analyse the difference between the two methods, Fig. 7 plots the difference between the two sets of CCs; the difference is always below 0.01. Thus, the reconstruction results obtained with the $D_{ic}^{MS}$ and $D_{ic}^{HMS}$ dictionaries are consistent with each other.


Figure 6: The correlation coefficients (CC) of the original PAN images and the PAN images reconstructed by using the $D_{ic}^{MS}$ and $D_{ic}^{HMS}$ dictionaries, respectively


Figure 7: The difference values between the two correlation coefficients (CC) in Fig. 6

Through these two groups of experimental results, we can conclude that decomposing and reconstructing the PAN image with HF and LF dictionaries constructed from the information of MS images is effective.

(2) Spot-6 satellite images

Spot-6 satellite images of an architecture area are employed to further validate the proposed method. The results are shown in Fig. 8. To fit the human visual system (HVS), the sharpened results display only the RGB channels.

Figure 8: (a) MS image. (b) PAN image. (c) Reference HMS image. (d)-(o) various sharpened results. (d) FIHS. (e) AWLP. (f) SVT. (g) MMP. (h) FEP. (i) AIHS. (j) IAIHS. (k) BFLP. (l) LS. (m) CSIF. (n) SRDIP. (o) Proposed.

In the red close-up area at the upper right, the FIHS-, AWLP-, SVT-, MMP-, FEP-, AIHS-, IAIHS-, BFLP-, and LS-based methods suffer from different degrees of spectral distortion (see Figs. 8(d)-(h) and (l)).

Table 3: Objective quality metrics for Fig. 8

Methods    CC      ERGAS   UIQI    SAM     RASE     RMSE
FIHS       0.904   3.815   0.881   0.068   15.307   44.211
AWLP       0.930   3.584   0.912   0.071   14.740   42.576
SVT        0.903   3.958   0.892   0.071   15.977   46.149
MMP        0.921   3.838   0.914   0.067   15.231   43.994
FEP        0.926   3.732   0.923   0.071   15.175   43.831
AIHS       0.948   3.340   0.914   0.067   13.234   38.226
IAIHS      0.943   3.409   0.909   0.071   13.519   39.047
BFLP       0.945   2.990   0.939   0.071   12.197   35.229
LS         0.920   3.608   0.920   0.065   14.222   41.077
CSIF       0.900   4.046   0.864   0.074   16.038   46.323
SRDIP      0.949   2.907   0.943   0.072   11.814   34.123
Proposed   0.947   2.815   0.944   0.064   11.415   32.971

The IAIHS- and CSIF-based methods generate results that are brighter than the reference HMS image (see Figs. 8(j) and (m)). Moreover, the sharpened result of the CSIF-based method is blurred compared with the other methods, since its serious blocking effect leads to a fuzzy result. As a whole, the AIHS-, BFLP-, and SRDIP-based methods and the proposed method perform better in terms of visual quality. The objective quality metric values for this example are given in Tab. 3 and are consistent with the visual impression. The performance of the CSIF-based method is the worst in terms of the objective metrics, while the SRDIP-based method and the proposed method achieve better objective performance. Although the proposed method does not achieve the best performance on the CC index, it gives the best results on the remaining indexes, i.e., ERGAS, UIQI, SAM, RASE, and RMSE.

Another test example, covering vegetation, is shown in Fig. 9. The FIHS-based method shows obvious spectral distortion (see Fig. 9(d)).


The result created by the AWLP-based method has more colorful information than the reference HMS image in the green area (see Fig. 9(e)). In Fig. 9(h), the FEP-based method is dimmer than the reference HMS image in terms of overall hue. Compared with the SRDIP-based method and the proposed method, the CSIF-based method exhibits a serious blocking effect (see Fig. 9(m)). The remaining methods not only enhance the spatial resolution of the MS image but also perform similarly to each other.

Figure 9: (a) MS image. (b) PAN image. (c) Reference HMS image. (d)-(o) various sharpened results. (d) FIHS. (e) AWLP. (f) SVT. (g) MMP. (h) FEP. (i) AIHS. (j) IAIHS. (k) BFLP. (l) LS. (m) CSIF. (n) SRDIP. (o) Proposed.

Tab. 4 gives the corresponding objective results for Fig. 9. We can see that


Table 4: Objective quality metrics for Fig. 9

Methods    CC      ERGAS   UIQI    SAM     RASE     RMSE
FIHS       0.879   2.833   0.862   0.056   12.229   35.685
AWLP       0.923   2.553   0.903   0.055   11.762   34.325
SVT        0.914   2.492   0.904   0.060   11.262   32.864
MMP        0.931   2.442   0.923   0.060   10.707   31.245
FEP        0.919   2.741   0.913   0.063   13.168   38.428
AIHS       0.932   2.221   0.917   0.055   10.096   29.461
IAIHS      0.942   2.193   0.926   0.055   10.045   29.315
BFLP       0.945   2.079   0.939   0.055    9.807   28.618
LS         0.926   2.376   0.925   0.053   10.177   29.697
CSIF       0.921   2.396   0.910   0.058   10.877   31.740
SRDIP      0.946   2.041   0.941   0.056    9.768   28.506
Proposed   0.944   1.972   0.942   0.050    9.236   26.951

the FIHS-based method gives the worst results in terms of the CC, ERGAS, and UIQI indexes, while the SAM, RASE, and RMSE of the FEP-based method are the worst. Although the CSIF-based method performs better than some classical methods, i.e., the FIHS- and AWLP-based methods, it is still worse than stronger methods such as the IAIHS- and BFLP-based methods. As a whole, the proposed method and the SRDIP-based method are clearly better than the comparative methods on five of the indexes, the exception being the CC metric. This sufficiently demonstrates that the proposed method performs better than the others.

(3) WorldView-2 satellite images

Fig. 10 shows the test images captured by the WorldView-2 satellite, depicting an architecture area. Figs. 10(a)-(c) are the MS, PAN, and reference HMS images, respectively.

From the results, we can see that all methods suffer different degrees of spectral distortion in the blue-roof and road areas. Obvious spectral distortions appear in the architecture area for the AWLP- and CSIF-based methods (see Figs. 10(e) and (m)). In the results of the AIHS- and IAIHS-based methods, the white road markings are more blurred than in the other methods (see Figs. 10(i) and (j)). Tab. 5 lists the objective values of these methods. The AWLP- and CSIF-based methods still perform unsatisfactorily, whereas the BFLP-based method, the SRDIP-based method, and the proposed method do much better than the other methods. Although the CC and UIQI indexes of the proposed method are lower than the best ones, it achieves the best values on all remaining indexes.

Figure 10: (a) MS image. (b) PAN image. (c) Reference HMS image. (d)-(o) various sharpened results. (d) FIHS. (e) AWLP. (f) SVT. (g) MMP. (h) FEP. (i) AIHS. (j) IAIHS. (k) BFLP. (l) LS. (m) CSIF. (n) SRDIP. (o) Proposed.


Table 5: Objective quality metrics for Fig. 10

Methods    CC      ERGAS   UIQI    SAM     RASE     RMSE
FIHS       0.783   9.007   0.764   0.097   35.492   109.127
AWLP       0.818   9.484   0.804   0.098   37.988   116.799
SVT        0.796   9.094   0.792   0.107   35.318   108.591
MMP        0.818   8.559   0.813   0.101   33.583   103.255
FEP        0.828   9.034   0.824   0.100   35.452   109.002
AIHS       0.804   8.676   0.784   0.094   34.033   104.640
IAIHS      0.806   8.550   0.786   0.091   33.531   103.097
BFLP       0.854   7.607   0.852   0.090   30.562    93.967
LS         0.841   8.000   0.837   0.093   31.429    96.633
CSIF       0.791   9.596   0.789   0.121   38.236   117.563
SRDIP      0.848   7.746   0.844   0.091   30.728    94.477
Proposed   0.853   7.580   0.850   0.086   30.000    92.238

Table 6: Objective quality metrics for Fig. 11

Methods    CC      ERGAS   UIQI    SAM     RASE     RMSE
FIHS       0.898   3.381   0.888   0.064   14.133   80.063
AWLP       0.921   3.406   0.906   0.060   14.153   80.174
SVT        0.898   3.534   0.895   0.078   14.711   83.338
MMP        0.921   3.240   0.920   0.072   14.394   81.541
FEP        0.920   3.641   0.916   0.071   17.263   97.792
AIHS       0.927   2.989   0.919   0.062   13.090   74.156
IAIHS      0.929   2.907   0.921   0.060   12.747   72.214
BFLP       0.935   2.848   0.932   0.060   12.275   69.536
LS         0.935   2.928   0.935   0.060   13.160   74.550
CSIF       0.914   3.189   0.912   0.077   13.911   78.805
SRDIP      0.932   2.854   0.930   0.061   12.332   69.862
Proposed   0.934   2.735   0.933   0.053   11.684   66.192


(4) Pléiades satellite images

Figure 11: (a) MS image. (b) PAN image. (c) Reference HMS image. (d)-(o) various sharpened results. (d) FIHS. (e) AWLP. (f) SVT. (g) MMP. (h) FEP. (i) AIHS. (j) IAIHS. (k) BFLP. (l) LS. (m) CSIF. (n) SRDIP. (o) Proposed.

Fig. 11 shows a vegetation area captured by the Pléiades satellite. We can clearly see that the FIHS-based method suffers from serious spectral distortion (see Fig. 11(d)). The AWLP- and SVT-based methods show dim colors in the road area, and in the same area the AIHS-, IAIHS-, LS-, and CSIF-based methods exhibit spectral distortion (Figs. 11(i)-(j), (l), and (m)). Additionally, the AIHS- and IAIHS-based methods lose some significant detail information (see Figs. 11(i)-(j)), e.g., the marked line on the green grass in the close-up area. Relatively speaking, the BFLP-based method, the SRDIP-based method, and the proposed method achieve the desired effect. Tab. 6 gives the objective values, which are consistent with the visual quality.

Fig. 12 and Fig. 13 show scenes of soil areas and high-contrast spectral areas, respectively. Both the subjective and the objective evaluations (Tabs. 7-8) show that the proposed method outperforms the other methods.

Figure 12: (a) MS image. (b) PAN image. (c) Reference HMS image. (d)-(o) various sharpened results. (d) FIHS. (e) AWLP. (f) SVT. (g) MMP. (h) FEP. (i) AIHS. (j) IAIHS. (k) BFLP. (l) LS. (m) CSIF. (n) SRDIP. (o) Proposed.


Table 7: Objective quality metrics for Fig. 12

Methods    CC      ERGAS   UIQI    SAM     RASE     RMSE
FIHS       0.911   3.061   0.904   0.063   13.351   73.044
AWLP       0.922   3.264   0.909   0.058   14.217   77.780
SVT        0.911   3.237   0.906   0.074   13.947   76.304
MMP        0.929   3.038   0.929   0.064   13.832   75.675
FEP        0.927   3.280   0.926   0.064   15.278   83.590
AIHS       0.934   2.774   0.929   0.060   12.442   68.072
IAIHS      0.938   2.702   0.933   0.058   12.207   66.785
BFLP       0.941   2.688   0.938   0.058   12.079   66.085
LS         0.941   2.758   0.941   0.057   12.621   69.052
CSIF       0.932   2.875   0.930   0.069   13.209   72.265
SRDIP      0.940   2.659   0.938   0.059   12.017   65.748
Proposed   0.940   2.595   0.939   0.053   11.603   63.482

Table 8: Objective quality metrics for Fig. 13

Methods    CC      ERGAS   UIQI    SAM     RASE     RMSE
FIHS       0.890   4.846   0.869   0.076   19.385   118.548
AWLP       0.910   4.961   0.900   0.074   20.054   122.642
SVT        0.892   4.837   0.890   0.083   19.399   118.639
MMP        0.898   4.794   0.895   0.086   19.097   116.788
FEP        0.905   4.817   0.900   0.085   19.283   117.926
AIHS       0.918   4.390   0.891   0.075   17.622   107.772
IAIHS      0.920   4.335   0.895   0.074   17.390   106.353
BFLP       0.925   4.013   0.924   0.074   16.320    99.808
LS         0.903   4.584   0.902   0.073   18.274   111.756
CSIF       0.899   4.566   0.895   0.099   18.511   113.206
SRDIP      0.926   3.961   0.924   0.075   16.021    97.979
Proposed   0.928   3.908   0.928   0.068   15.754    96.347


Figure 13: (a) MS image. (b) PAN image. (c) Reference HMS image. (d)-(o) various sharpened results. (d) FIHS. (e) AWLP. (f) SVT. (g) MMP. (h) FEP. (i) AIHS. (j) IAIHS. (k) BFLP. (l) LS. (m) CSIF. (n) SRDIP. (o) Proposed.

(5) IKONOS satellite images

Moreover, comparative experiments are conducted on test images showing scenes of the earthquake disaster region of Wenchuan in China, captured by the IKONOS satellite. From the two groups of test images, it can be seen that the FIHS-, AWLP-, SVT-, and FEP-based methods are subject to serious spectral distortion (see Figs. 14(d)-(f) and (h), and Figs. 15(d)-(f) and (h)). In Fig. 14(g), the MMP-based method is blurred in terms of spatial quality. Fig. 14(m), the result of the CSIF-based method, shows an apparent blocking effect. Fig. 15(m) also suffers from serious spectral distortion, whereas the MMP-based method in Fig. 15(g) produces a better sharpened result. The remaining methods perform similarly to each other in both the spatial and the spectral aspects. Tab. 9 and Tab. 10 give the objective evaluations for Fig. 14 and Fig. 15, respectively; both demonstrate that the proposed method achieves a better effect than the other methods.

Figure 14: (a) MS image. (b) PAN image. (c) Reference HMS image. (d)-(o) various sharpened results. (d) FIHS. (e) AWLP. (f) SVT. (g) MMP. (h) FEP. (i) AIHS. (j) IAIHS. (k) BFLP. (l) LS. (m) CSIF. (n) SRDIP. (o) Proposed.


Table 9: Objective quality metrics for Fig. 14

Methods    CC      ERGAS   UIQI    SAM     RASE     RMSE
FIHS       0.753   3.892   0.720   0.070   17.164   53.430
AWLP       0.737   5.182   0.704   0.066   21.786   67.816
SVT        0.815   3.738   0.810   0.078   16.040   49.932
MMP        0.882   3.196   0.873   0.070   15.631   48.656
FEP        0.844   3.644   0.844   0.068   16.707   52.008
AIHS       0.837   3.436   0.831   0.067   15.591   48.532
IAIHS      0.856   3.264   0.850   0.065   15.164   47.202
BFLP       0.872   3.110   0.870   0.066   14.467   45.034
LS         0.895   3.043   0.891   0.066   15.068   46.906
CSIF       0.879   3.228   0.875   0.071   15.731   48.969
SRDIP      0.875   3.069   0.872   0.066   14.333   44.616
Proposed   0.884   2.997   0.882   0.063   14.195   44.189

Table 10: Objective quality metrics for Fig. 15

Methods    CC      ERGAS   UIQI    SAM     RASE     RMSE
FIHS       0.748   3.658   0.675   0.060   16.124   55.451
AWLP       0.816   3.526   0.807   0.050   15.059   51.789
SVT        0.836   3.126   0.834   0.060   13.150   45.224
MMP        0.894   2.685   0.884   0.057   12.955   44.552
FEP        0.858   3.136   0.858   0.057   14.788   50.857
AIHS       0.873   2.791   0.869   0.052   12.735   43.797
IAIHS      0.878   2.721   0.874   0.050   12.436   42.767
BFLP       0.896   2.540   0.894   0.050   11.696   40.223
LS         0.909   2.503   0.905   0.052   12.249   42.126
CSIF       0.887   2.882   0.882   0.065   14.900   51.242
SRDIP      0.892   2.569   0.889   0.051   11.730   40.339
Proposed   0.907   2.437   0.905   0.048   11.457   39.403


Figure 15: (a) MS image. (b) PAN image. (c) Reference HMS image. (d)-(o) various sharpened results. (d) FIHS. (e) AWLP. (f) SVT. (g) MMP. (h) FEP. (i) AIHS. (j) IAIHS. (k) BFLP. (l) LS. (m) CSIF. (n) SRDIP. (o) Proposed.

Generally speaking, we can conclude that the CSIF-based method outperforms the classical methods (FIHS, AWLP, SVT), but it has some drawbacks compared with the more recent methods (IAIHS, MMP, BFLP, LS). Although both the AWLP- and BFLP-based methods use the ARSIS concept [29], they still perform worse than the SRDIP-based method, which fully demonstrates the advantage of SR. Comparing the proposed method with the SRDIP-based method, the proposed method always achieves the best performance in terms of spectrum. Therefore, the spatial detail extracted by decomposing the PAN image with a dictionary constructed from the MS training set better fits the properties of the MS image.

5. Conclusion

In this paper, we propose a new SR-based pansharpening method. Different from existing remote sensing image fusion methods, the proposed method adopts the MS images to obtain the high- and low-frequency dictionaries, which gives it a practical basis. In addition, the proposed method extracts the detail image by decomposing the PAN image with these high- and low-frequency dictionaries, which fit the spectral properties. Compared with the other methods used in this paper, the proposed method not only performs better on both subjective and objective indications, but also outperforms the existing SR-based methods, i.e., the CSIF- and SRDIP-based methods.

Acknowledgements

The research in this paper is sponsored by the National Natural Science Foundation of China (No. 61701327, No. 61711540303, and No. 61473198) and the National Research Foundation of Korea (No. NFR-2017K2A9A2A06013711), and is also supported by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD) Fund and the Jiangsu Collaborative Innovation Center on Atmospheric Environment and Equipment Technology (CICAEET) Fund.

References

[1] J. Gubbi, R. Buyya, S. Marusic, M. Palaniswami, Internet of Things (IoT): A vision, architectural elements, and future directions, Future Generation Computer Systems 29 (7) (2013) 1645-1660.
[2] Y. Zhang, Understanding image fusion, Photogrammetric Engineering & Remote Sensing 70 (6) (2004) 657-661.
[3] G. Simone, A. Farina, F. C. Morabito, S. B. Serpico, L. Bruzzone, Image fusion techniques for remote sensing applications, Information Fusion 3 (1) (2002) 3-15.
[4] J. Liu, S. Liang, Pan-sharpening using a guided filter, Taylor & Francis, Inc., 2016.
[5] H. Ghassemian, A review of remote sensing image fusion methods, Information Fusion 32 (PA) (2016) 75-89.
[6] S. Rahmani, M. Strait, D. Merkurjev, M. Moeller, T. Wittman, An adaptive IHS pan-sharpening method, IEEE Geoscience & Remote Sensing Letters 7 (4) (2010) 746-750.
[7] L. Alparone, L. Wald, J. Chanussot, C. Thomas, P. Gamba, L. M. Bruce, Comparison of pansharpening algorithms: Outcome of the 2006 GRS-S data-fusion contest, IEEE Transactions on Geoscience & Remote Sensing 45 (10) (2007) 3012-3021.
[8] A. Jameel, M. M. Riaz, A. Ghafoor, Guided filter and IHS-based pan-sharpening, IEEE Sensors Journal 16 (1) (2015) 192-194.
[9] T. M. Tu, S. C. Su, H. C. Shyu, P. S. Huang, A new look at IHS-like image fusion methods, Information Fusion 2 (3) (2001) 177-186.
[10] Y. Leung, J. Liu, J. Zhang, An improved adaptive intensity-hue-saturation method for the fusion of remote sensing images, IEEE Geoscience & Remote Sensing Letters 11 (5) (2013) 985-989.
[11] J. Sun, Y. Jiang, S. Zeng, A study of PCA image fusion techniques on remote sensing, Proceedings of SPIE - The International Society for Optical Engineering 5985 (2005) 739-744.
[12] B. Aiazzi, S. Baronti, M. Selva, L. Alparone, MS + Pan image fusion by an enhanced Gram-Schmidt spectral sharpening.
[13] T. M. Tu, P. S. Huang, C. L. Hung, C. P. Chang, A fast intensity-hue-saturation fusion technique with spectral adjustment for IKONOS imagery, IEEE Geoscience & Remote Sensing Letters 1 (4) (2004) 309-312.
[14] J. Nunez, X. Otazu, O. Fors, A. Prades, V. Pal, R. Arbiol, Multiresolution-based image fusion with additive wavelet decomposition, IEEE Transactions on Geoscience & Remote Sensing 37 (3) (1999) 1204-1211.
[15] K. P. Upla, P. P. Gajjar, M. V. Joshi, Pan-sharpening based on Nonsubsampled Contourlet Transform detail extraction, 2013.
[16] N. I. Cho, Wavelet-domain satellite image fusion based on a generalized fusion equation, Journal of Applied Remote Sensing 8 (1) (2014) 080599.
[17] M. Choi, R. Y. Kim, M. R. Nam, O. K. Hong, Fusion of multispectral and panchromatic satellite images using the curvelet transform, IEEE Geoscience & Remote Sensing Letters 2 (2) (2005) 136-140.
[18] B. Aiazzi, L. Alparone, A. Barducci, S. Baronti, Multispectral fusion of multisensor image data by the generalized Laplacian pyramid, in: Geoscience and Remote Sensing Symposium, IGARSS '99 Proceedings, IEEE, 1999, pp. 1183-1185, vol. 2.
[19] X. Otazu, M. Gonzalez-Audicana, O. Fors, J. Nunez, Introduction of sensor spectral response into image fusion methods: Application to wavelet-based methods, IEEE Transactions on Geoscience & Remote Sensing 43 (10) (2005) 2376-2385.
[20] S. Zheng, W. Z. Shi, J. Liu, J. Tian, Remote sensing image fusion using multiscale mapped LS-SVM, IEEE Transactions on Geoscience & Remote Sensing 46 (5) (2008) 1313-1322.
[21] S. Li, B. Yang, A new pan-sharpening method using a compressed sensing technique, IEEE Transactions on Geoscience & Remote Sensing 49 (2) (2011) 738-746.
[22] X. X. Zhu, R. Bamler, A sparse image fusion algorithm with application to pan-sharpening, IEEE Transactions on Geoscience & Remote Sensing 51 (5) (2013) 2827-2836.
[23] S. Li, H. Yin, L. Fang, Remote sensing image fusion via sparse representations over learned dictionaries, IEEE Transactions on Geoscience & Remote Sensing 51 (9) (2013) 4779-4789.
[24] M. Guo, H. Zhang, J. Li, L. Zhang, H. Shen, An online coupled dictionary learning approach for remote sensing image fusion, IEEE Journal of Selected Topics in Applied Earth Observations & Remote Sensing 7 (4) (2014) 1284-1294.
[25] M. Cheng, C. Wang, J. Li, Sparse representation based pansharpening using trained dictionary, IEEE Geoscience & Remote Sensing Letters 11 (1) (2014) 293-297.
[26] C. Jiang, H. Zhang, H. Shen, L. Zhang, Two-step sparse coding for the pan-sharpening of remote sensing images, IEEE Journal of Selected Topics in Applied Earth Observations & Remote Sensing 7 (5) (2014) 1792-1805.
[27] W. Wang, L. Jiao, S. Yang, K. Rong, Distributed compressed sensing-based pan-sharpening with hybrid dictionary, Neurocomputing 155 (C) (2015) 320-333.
[28] H. Yin, Sparse representation based pansharpening with details injection model, Signal Processing 113 (C) (2015) 218-227.
[29] T. Ranchin, B. Aiazzi, L. Alparone, S. Baronti, L. Wald, Image fusion: the ARSIS concept and some successful implementation schemes, ISPRS Journal of Photogrammetry & Remote Sensing 58 (1-2) (2003) 4-18.
[30] E. J. Candes, J. Romberg, T. Tao, Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information, IEEE Press, 2006.
[31] D. L. Donoho, Compressed sensing, IEEE Transactions on Information Theory 52 (4) (2006) 1289-1306.
[32] S. S. Chen, D. L. Donoho, M. A. Saunders, Atomic decomposition by basis pursuit, SIAM Review 43 (1) (2001) 129-159.
[33] I. F. Gorodnitsky, B. D. Rao, Sparse signal reconstruction from limited data using FOCUSS: A re-weighted minimum norm algorithm, IEEE Transactions on Signal Processing 45 (3) (1997) 600-616.
[34] J. A. Tropp, Greed is good: Algorithmic results for sparse approximation, IEEE Transactions on Information Theory 50 (10) (2004) 2231-2242.
[35] J. Lee, C. Lee, Fast and efficient panchromatic sharpening, IEEE Transactions on Geoscience & Remote Sensing 48 (1) (2009) 155-163.
[36] X. Kang, S. Li, J. A. Benediktsson, Pansharpening with matting model, IEEE Transactions on Geoscience & Remote Sensing 52 (8) (2014) 5088-5099.
[37] R. Yuhas, A. F. H. Goetz, J. W. Boardman, Discrimination among semi-arid landscape endmembers using the spectral angle mapper (SAM) algorithm, 1992.
[38] Z. Wang, A. C. Bovik, A universal image quality index, IEEE Signal Processing Letters 9 (3) (2002) 81-84.
[39] L. Wald, Quality of high resolution synthesised images: Is there a simple criterion?, 2000, pp. 99-103.
[40] Y. Yang, W. Wan, S. Huang, F. Yuan, S. Yang, Y. Que, Remote sensing image fusion based on adaptive IHS and multiscale guided filter, IEEE Access 4 (2016) 4573-4582.

Authors Biography:

Xiaomin Yang is currently an Associate Professor in the College of Electronics and Information Engineering, Sichuan University. She received her BS degree from Sichuan University and her PhD degree in communication and information systems from Sichuan University. She worked at the University of Adelaide as a postdoctoral researcher for one year. Her research interests are image processing and pattern recognition.

Lihua Jian received his MS degree in Instrument Science and Technology from the University of Electronic Science and Technology of China. He is currently pursuing a doctoral degree in the College of Electronics and Information Engineering, Sichuan University. His research interests are image processing and computer vision.

Binyu Yan is currently an Associate Professor in the College of Electronics and Information Engineering, Sichuan University. He received his BS degree from Sichuan University and his MS and PhD degrees in communication and information systems from Sichuan University. His research interests are image processing and pattern recognition.

Kai Liu is currently a Professor in the School of Electrical Engineering and Information at Sichuan University, China. He received his BS and MS degrees in computer science from Sichuan University and his PhD in electrical engineering from the University of Kentucky. His main research interests include computer/machine vision, active/passive stereo vision, and image processing. He is a senior member of the IEEE.

Lei Zhang is currently an Associate Professor in the College of Computer Science, Sichuan University. He received his MS and PhD degrees from Sichuan University. His research interests are machine learning.

Yiguang Liu received the MS degree from Peking University in 1998 and the PhD degree from Sichuan University in 2004. He was a Research Fellow, Visiting Professor, and Senior Research Scholar with the National University of Singapore, Imperial College London, and Michigan State University, respectively. He was chosen for the Program for New Century Excellent Talents of MOE in 2008 and was chosen as a Scientific and Technical Leader in Sichuan Province in 2010.


Highlights:
HF and LF dictionaries are constructed from the information of MS images.
The HFC of the PAN image is precisely extracted with the HF dictionary.