A sparse representation based pansharpening method

Xiaomin Yang(a), Lihua Jian(a), Binyu Yan(a), Kai Liu(b), Lei Zhang(c,d), Yiguang Liu(c,*)

(a) Sichuan University, College of Electronics and Information Engineering, No.24 South Section 1, Yihuan Road, Chengdu, China, 610065
(b) Sichuan University, School of Electrical Engineering and Information, No.24 South Section 1, Yihuan Road, Chengdu, China, 610065
(c) Sichuan University, College of Computer Science, No.24 South Section 1, Yihuan Road, Chengdu, China, 610065
(d) Qinghai University, Department of Computer Technology and Application, Qinghai, China, 810016

Abstract

The information captured by a single satellite sensor can hardly satisfy real applications. Pansharpening, a hot topic in remote sensing, combines the spectral information of a multispectral image with the spatial details of a panchromatic image to obtain a high spatial resolution multispectral image. In this paper, we present a novel sparse representation-based pansharpening method that consists of three stages: dictionary construction, panchromatic image decomposition, and high spatial resolution multispectral image reconstruction. First, we use multispectral images as the training set and compute the intensity channels of the multispectral images, from which we obtain the high-frequency and low-frequency components of the intensity channels. Second, we sparsely decompose the panchromatic image over a pair of dictionaries to obtain its high-frequency and low-frequency components. Third, the high-frequency component of the panchromatic image is injected into the multispectral image to generate the final high resolution multispectral image. Quantitative and subjective evaluations show that the proposed method is more effective and practical than existing sparse representation-based methods.

* Corresponding author. Email address: [email protected] (Yiguang Liu)
Keywords: Pansharpening, Sparse representation, Multispectral images, Panchromatic image
1. Introduction

Satellites are becoming increasingly important in everyday life, for applications such as weather forecasting, environment monitoring, and Earth surface observation. Moreover, computers can collect ever more remote sensing data through various satellite sensors and analyze and process it within the Internet of Things (IoT) [1], for example for drunk-driving detection and water pollution tracing, using optical remote sensing technology to obtain sufficiently precise information. However, a satellite cannot capture images with both high spatial and high spectral resolution simultaneously [2, 3]. A multispectral (MS) image has a high spectral resolution but suffers from a low spatial resolution, while the panchromatic (PAN) image possesses the opposite characteristics. Pansharpening therefore aims to provide more precise, detailed, and abundant information for improving the visual perception of satellite images, by merging the complementary information of the panchromatic and color bands.
In recent years, various pansharpening methods have been proposed. These methods sharpen the MS image by transferring detail information estimated from the corresponding PAN image [4, 5, 6, 7, 8]. Pansharpening methods can be divided into three types: component-substitution (CS)-based methods [9, 10, 11, 12, 13, 6], multi-resolution-analysis (MRA)-based methods [14, 15, 16, 17, 18, 19, 20], and sparse representation (SR)-based methods [21, 22, 23, 24, 25, 26, 27, 28, 29].

CS-based methods include principal component analysis (PCA) [11], Gram-Schmidt (GS) [12], the IHS transform [9], and various versions of IHS [6, 10, 13]. In the FIHS-based method [13], the intensity component is computed as a fixed linear combination of the MS bands. Subsequently, Rahmani et al. [6] propose an adaptive method that adjusts the linear coefficients of the MS bands. However, weights computed only from the edges of the PAN image often lose the color information of the MS image. Therefore, Leung et al. propose the IAIHS-based method [10], which uses both PAN-induced and MS-induced terms to balance the weight coefficients. Although these methods are simple and efficient, they suffer from spectral distortion or degradation.

MRA-based methods typically include the Laplacian pyramid (LP) [18], the wavelet transform (WT) [14, 16], and the contourlet transform (CT) [15]. The most representative MRA-based method, AWLP, was proposed by Otazu et al. [19] in 2005; it uses the à trous wavelet to extract the high frequencies of the PAN image. In 2008, Zheng et al. [20] propose an SVT-based pansharpening method, which uses a series of support value filters to extract detail features from the PAN image. MRA-based methods preserve the spectral content of the MS image well, but they often produce spatial blurring.

Sparse representation has also been applied to pansharpening, with satisfactory performance. It was first applied to pansharpening by Li et al. [21] in the CSIF-based method. However, the dictionary of the CSIF-based method is constructed from high resolution multispectral (HMS) images, which are unavailable in practice. Furthermore, using the low resolution MS image and the single-channel PAN image to reconstruct the multi-channel HMS image is an ill-posed problem, which leads to serious spectral and spatial distortions. As shown in Fig. 1, compared with the MS image, the CSIF-based method indeed improves the spatial resolution of the resultant image; however, the result exhibits serious spectral distortion and obvious blocking artifacts when compared with the reference HMS image. Zhu et al. [22] then propose a sparse FI-based method; however, this method constructs the dictionary without using the spectral information of the MS image. Guo et al. [24] present a dual-dictionary-based pansharpening method (OCDL), which reconstructs the spectral information of the MS image well but has poor real-time performance. Yin [28] proposes an SRDIP-based method, which cannot fully preserve the spectral information of the MS image.

Generally, the limitations of the existing SR-based methods can be summarised as follows: 1) practicality is poor because real HMS images are unavailable; 2) the spectral information of the MS image is not used in constructing the dictionaries; 3) some results suffer from certain degrees of distortion; 4) the computation speed cannot satisfy real-time requirements. There is therefore room for improvement in the sharpened results.
Figure 1: The pansharpening result by the CSIF-based method
In this paper, a new SR-based pansharpening method is proposed. 1) We construct the high frequency (HF) and low frequency (LF) dictionaries from the spectral information of MS images, so that the reconstructed HMS image achieves high fidelity and the spectral distortion of the sharpened result is reduced. 2) The PAN image is decomposed into a high-frequency component (HFC) and a low-frequency component (LFC) by the HF and LF dictionaries; hence the HFC of the PAN image fits both the spatial details and the spectral features of the MS image. 3) We merge the MS image and the HFC of the PAN image to reconstruct the HMS image.

Our method provides the following two contributions. First, to build dictionaries that preserve the spectral information of the MS image, the HF and LF dictionaries are constructed from the information of MS images. Second, to fit the spatial detail information and spectral characteristics of the MS image, the HFC of the PAN image is extracted patch by patch with the HF dictionary, which precisely extracts the spatial detail information of the PAN image.
The structure of this paper is as follows. Section 2 briefly reviews sparse representation; Section 3 presents the proposed method in detail; Section 4 discusses the experimental results; finally, Section 5 summarizes the proposed method.
2. Related work

2.1. Sparse representation

The fundamental idea of sparse representation [30, 31] is as follows: given a signal x ∈ ℜ^n, we can use an over-complete dictionary D ∈ ℜ^{n×m} (n < m) to represent the signal by a linear combination, written as x = Dα, where α ∈ ℜ^m is called the sparse coefficient vector. Fig. 2 illustrates the theory of sparse representation. The optimal sparse coefficient vector can be calculated from the following constrained problem:

\min_{\alpha} \|\alpha\|_0 \quad \mathrm{s.t.} \quad \|x - D\alpha\|_2^2 \le \varepsilon^2, \qquad (1)

where \|\cdot\|_0 denotes the l_0-norm, i.e., the number of nonzero elements in a vector, and ε is a positive number representing the stopping error of the iteration.
Figure 2: Sparse representation of a random signal
In real applications, Eq. (1) can be solved by the basis pursuit (BP) algorithm [32], non-convex algorithms [33], or the orthogonal matching pursuit (OMP) algorithm [34].
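To make the sparse coding step concrete, the following minimal Python sketch solves Eq. (1) with OMP. The random dictionary and the 5-sparse test signal are illustrative stand-ins rather than part of the paper's pipeline, and scikit-learn's orthogonal_mp is used as the solver.

import numpy as np
from sklearn.linear_model import orthogonal_mp

rng = np.random.default_rng(0)
n, m = 64, 256                        # signal length n < number of atoms m
D = rng.standard_normal((n, m))
D /= np.linalg.norm(D, axis=0)        # l2-normalize each atom (column)

alpha_true = np.zeros(m)
support = rng.choice(m, size=5, replace=False)
alpha_true[support] = rng.standard_normal(5)
x = D @ alpha_true                    # a 5-sparse signal x = D * alpha

# Greedy approximation of Eq. (1): stop once ||x - D*alpha||_2^2 <= tol
alpha = orthogonal_mp(D, x, tol=1e-10)
print(np.count_nonzero(alpha), np.linalg.norm(x - D @ alpha))

In this noiseless setting OMP typically recovers the exact 5-atom support; setting tol to ε² enforces the residual constraint of Eq. (1) directly.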
3. Proposed method

The proposed method assumes that the PAN image can be decomposed into an HFC and an LFC based on the HF and LF dictionaries. The HFC of the PAN image and the MS image are then merged to reconstruct the HMS image. The proposed method consists of three stages: (1) dictionary construction; (2) PAN image decomposition; (3) high resolution MS image reconstruction. Fig. 3 illustrates the implementation of the proposed method. First, we use the MS training set to obtain the intensity channels, and these intensity channels are then degraded into low-resolution images, which serve as the LFCs of the intensity channels. Subsequently, the HFCs and LFCs of the intensity channels are randomly sampled to obtain the HF and LF dictionaries, respectively. Second, we decompose the PAN image patch by patch, and use the merged dictionary (the HF and LF dictionaries concatenated into one large dictionary) to sparsely represent these patches. The resulting sparse coefficients are divided into two parts: HF sparse coefficients and LF sparse coefficients. The HF dictionary is then multiplied by the HF sparse coefficients to obtain the HF patches of the PAN image. Third, we reshape these HF patches and integrate them into the MS image to obtain the HMS image.

Figure 3: Scheme of the proposed method

3.1. Dictionary construction

To build dictionaries that preserve the spectral information of the MS image, the HF and LF dictionaries are constructed from the information of MS images. During dictionary construction, we obtain the intensity channel from the training set, which consists of the original MS images. This is computed as follows:
I = \frac{1}{N} \sum_{i=1}^{N} MS_i, \qquad (2)
where N denotes the number of spectral channels, MS_i is the i-th channel of the MS image, and I represents the intensity channel. The LFC of I is obtained by low-pass downsampling the intensity channel and then upsampling the result:

I^L = \uparrow(\downarrow I), \qquad (3)

where ↑ and ↓ represent the upsampling and downsampling operations, respectively. The HFC of the intensity channel is then calculated as

I^H = I - I^L, \qquad (4)

where I^H and I^L denote the HFC and LFC of I, respectively.
According to [21], an unknown compressive image can be linearly represented by globally sampled raw patches from training images. Additionally, raw patches sampled from the MS images reflect the true spectral information of the scene when reconstructing the HMS image. Therefore, the high- and low-frequency dictionaries can be constructed by randomly sampling the I^H and I^L images instead of using a dictionary learning method, which is difficult and time consuming.

In this paper, the patch size is set to one of two values according to the satellite sensor, and the number of patches is set to 1 × 10^4, which ensures that the dictionary is over-complete. For each training image, we randomly sample 200 patches. We thereby obtain the high- and low-frequency dictionaries D_ic^H and D_ic^L.
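A minimal sketch of this sampling procedure is given below, assuming each MS training image is a float array of shape (H, W, N) with H and W divisible by 4 (the resolution ratio); the function name and the use of third-order spline zooming for the down/upsampling of Eq. (3) are illustrative assumptions.

import numpy as np
from scipy.ndimage import zoom

def build_dictionaries(ms_images, patch=16, per_image=200, seed=0):
    # Sample paired HF/LF patches from MS training images (Eqs. (2)-(4)).
    rng = np.random.default_rng(seed)
    atoms_h, atoms_l = [], []
    for ms in ms_images:                   # ms: (H, W, N) float array
        i_chan = ms.mean(axis=2)           # Eq. (2): intensity channel
        i_low = zoom(zoom(i_chan, 0.25, order=3), 4.0, order=3)   # Eq. (3)
        i_high = i_chan - i_low            # Eq. (4)
        h, w = i_chan.shape
        ys = rng.integers(0, h - patch + 1, per_image)
        xs = rng.integers(0, w - patch + 1, per_image)
        for y, x in zip(ys, xs):           # same locations in HF and LF images
            atoms_h.append(i_high[y:y + patch, x:x + patch].ravel())
            atoms_l.append(i_low[y:y + patch, x:x + patch].ravel())
    # Columns are atoms: both dictionaries have shape (patch*patch, n_atoms)
    return np.array(atoms_h).T, np.array(atoms_l).T

Sampling 200 patches from each of 50 training images would yield the 1 × 10^4 atoms mentioned above; sampling the HF and LF patches at the same locations keeps the two dictionaries aligned column by column.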
3.2. PAN image decomposition

In this work, we decompose the PAN image patch by patch instead of decomposing the whole image directly. As shown in the decomposition stage of Fig. 3, the size of each patch is 8 × 8. First, we rearrange the selected image patch into a column vector of pixels. Then, we represent this column vector over the combined high- and low-frequency dictionaries and obtain the full sparse coefficient vector. Finally, we divide the sparse coefficient vector into two parts, one of which is used to reconstruct the high-frequency patch with the high-frequency dictionary.

The proposed method divides the PAN image into overlapping patches. The k-th patch can be decomposed into an HFC and an LFC as follows:

P_k = P_k^H + P_k^L, \qquad (5)
where P_k^H and P_k^L represent the HFC and LFC of the k-th patch, respectively. According to sparse representation theory [30, 31], we can compute the sparse coefficients of the HFC and LFC by solving the following optimization problem:

\min_{\alpha_k^H, \alpha_k^L} \left( \|\alpha_k^H\|_1 + \|\alpha_k^L\|_1 \right) \quad \mathrm{s.t.} \quad \left\| P_k - \left( D_{ic}^H \alpha_k^H + D_{ic}^L \alpha_k^L \right) \right\|_2 \le \varepsilon, \qquad (6)

where α_k^H and α_k^L are the sparse coefficients of P_k^H and P_k^L, respectively. Thus, P_k^H and P_k^L can be sparsely represented in terms of the corresponding dictionary
by:

P_k^H = D_{ic}^H \alpha_k^H, \qquad (7)

P_k^L = D_{ic}^L \alpha_k^L. \qquad (8)

Let D_{ic} = (D_{ic}^H, D_{ic}^L) and \alpha_k = \left( (\alpha_k^H)^T, (\alpha_k^L)^T \right)^T; then Eq. (6) can be rewritten as follows:

\min_{\alpha_k} \|\alpha_k\|_1 \quad \mathrm{s.t.} \quad \|P_k - D_{ic} \alpha_k\|_2^2 \le \varepsilon, \qquad (9)
where the sparse coefficient vector α_k in Eq. (9) can be solved by the OMP algorithm. Subsequently, to obtain the HF component of the PAN image, the HF dictionary is multiplied by the HF sparse coefficients to generate the k-th HF patch via Eq. (7), and these HF patches are then reassembled into the whole HF image of the PAN image.
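The decomposition stage can be sketched as follows, again using OMP as the solver for Eq. (9). The patch size must match the dictionary atom size, and the overlap step, tolerance, and column normalization below are illustrative choices of this sketch (OMP selects atoms by correlation, so unit-norm columns are assumed).

import numpy as np
from sklearn.linear_model import orthogonal_mp

def extract_pan_hf(pan, d_high, d_low, patch=8, step=4, tol=1e-3):
    # Sparse-decompose PAN patches over D_ic = (D_H, D_L) and keep only
    # the HF part (Eqs. (5)-(9)); overlapping pixels are blended by averaging.
    d_joint = np.hstack([d_high, d_low]).astype(float)
    d_joint /= np.linalg.norm(d_joint, axis=0) + 1e-12   # unit-norm atoms
    n_h = d_high.shape[1]
    h, w = pan.shape
    p_high = np.zeros((h, w))
    weight = np.zeros((h, w))
    for y in range(0, h - patch + 1, step):
        for x in range(0, w - patch + 1, step):
            p_k = pan[y:y + patch, x:x + patch].ravel()    # Eq. (5) patch
            alpha = orthogonal_mp(d_joint, p_k, tol=tol)   # Eq. (9) via OMP
            hf = d_joint[:, :n_h] @ alpha[:n_h]            # Eq. (7)
            p_high[y:y + patch, x:x + patch] += hf.reshape(patch, patch)
            weight[y:y + patch, x:x + patch] += 1.0
    return p_high / np.maximum(weight, 1.0)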
3.3. High resolution MS image reconstruction

The PAN image is decomposed patch by patch to obtain the HF patches, which are rearranged into the P^H image. Overlapping patches are combined by weighted averaging. Owing to the high spatial resolution of the PAN image, the decomposed P^H image also possesses high spatial resolution. According to imaging theory, the P^H image corresponds to the HF details of the MS image. Therefore, the obtained HFC of the PAN image is injected into the MS image to create the HMS image as follows:

HMS_i = MS_i + w(i)\, P^H, \qquad (10)

where HMS_i denotes the i-th channel of the HMS image and w(i) is the weighting coefficient of the i-th channel, calculated as follows:

w(i) = \frac{MS_i}{\frac{1}{N} \sum_{j=1}^{N} MS_j}. \qquad (11)
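Reading Eq. (11) as a per-pixel band ratio, the injection step reduces to a few array operations. In the sketch below, ms_up denotes the MS image interpolated to the PAN grid, an assumption of this illustration.

import numpy as np

def inject_details(ms_up, p_high, eps=1e-12):
    # Eqs. (10)-(11): band-wise weighted injection of the PAN HF image.
    # ms_up: (H, W, N) MS image on the PAN grid; p_high: (H, W) HF image.
    mean_ms = ms_up.mean(axis=2, keepdims=True)    # (1/N) * sum_j MS_j
    w = ms_up / np.maximum(mean_ms, eps)           # Eq. (11), per band and pixel
    return ms_up + w * p_high[:, :, None]          # Eq. (10)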
4. Experimental results and analysis

To demonstrate the effectiveness of the proposed method, we select test images from four satellite datasets: Spot-6, Pléiades, WorldView-2, and IKONOS. The differences among these datasets are shown in Tab. 1. The MS image of WorldView-2 contains eight channels, i.e., red (R), green (G), blue (B), red-edge (RE), yellow (Y), coastal (C), near infrared 1 (NIR1), and near infrared 2 (NIR2); only four channels (R, G, B, NIR1) are utilized in our experiments. The MS images of the Spot-6, Pléiades, and IKONOS satellites contain four spectral channels (R, G, B, NIR1). Furthermore, the spatial resolution of each image pair (PAN image and MS image) differs between satellites; however, the resolution ratio is 4 for each pair. These satellite images, covering different scenes of vegetation, architecture, and soil areas, are divided into a training image set and a test image set. Fig. 4 shows part of the training image set for the four satellites.
Figure 4: A portion of the training image set used in this paper. (a) Spot-6 training image set. (b) WorldView-2 training image set. (c) Pléiades training image set. (d) IKONOS training image set
Table 1: The differences among the satellite datasets

Features                      Spot-6           Pléiades         WorldView-2       IKONOS
Dynamic range (bits/pixel)    11               12 or 16         11                11
Spatial resolution (m)        PAN 1.5, MS 6    PAN 0.5, MS 2    PAN 0.46, MS 1.84 PAN 1, MS 4
Spectral range (nm), PAN      455-745          470-830          450-800           450-900
Spectral range (nm), MS       B: 455-525       B: 430-550       C: 400-450        B: 450-520
                              G: 530-590       G: 500-620       B: 450-510        G: 510-600
                              R: 625-895       R: 590-710       G: 510-580        R: 630-700
                              NIR1: 760-890    NIR1: 740-940    Y: 585-625        NIR1: 760-850
                                                                R: 630-690
                                                                RE: 705-745
                                                                NIR1: 770-895
                                                                NIR2: 860-1040

4.1. Parameters setup

Both the MS and PAN images captured by the satellites are partitioned into tiles of 256 × 256 pixels and 1024 × 1024 pixels, respectively. Because no real HMS image is available for reference, the original MS and PAN images are downsampled by bicubic interpolation with a factor of 1/4. The input MS image is therefore of size 64 × 64 pixels, and the input PAN image of size 256 × 256 pixels. The downsampled MS image is sharpened back to the original size (256 × 256 pixels) as the output HMS image. In this way, the original MS image can serve as the reference HMS image for visual comparison and objective measurement.

The dictionary patch size is set to 16 × 16 for the Spot-6, WorldView-2, and Pléiades satellites, and to 8 × 8 for the IKONOS satellite. Following sparse representation theory, the number of randomly sampled patches is set to 1 × 10^4 to ensure the over-completeness of the dictionary. Therefore, both the HF and LF dictionaries have size 256 × 10^4 for the Spot-6, WorldView-2, and Pléiades satellites, and size 64 × 10^4 for the IKONOS satellite.
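A sketch of this reduced-resolution assessment protocol is given below; approximating the bicubic interpolation with third-order spline zooming is an implementation assumption of this illustration.

import numpy as np
from scipy.ndimage import zoom

def degrade_for_assessment(ms, pan, ratio=4):
    # Downsample a 256x256xN MS tile and a 1024x1024 PAN tile by a factor
    # of ratio; the original MS tile then serves as the reference HMS image.
    ms_low = np.stack([zoom(ms[:, :, b], 1.0 / ratio, order=3)
                       for b in range(ms.shape[2])], axis=2)   # 256 -> 64
    pan_low = zoom(pan, 1.0 / ratio, order=3)                  # 1024 -> 256
    return ms_low, pan_low, ms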
Moreover, to validate the advancements of the proposed method, ten traditional pansharpening methods are selected for comparison. These methods are divided into four classes: CS-based, MRA-based, SR-based, and recent methods (see Tab. 2). The parameters of these methods are set exactly as given by their authors, whose implementations are available online. To make the comparison fair, the same training set is adopted for the SR-based methods; note, however, that the CSIF-based method uses the reference HMS images to construct its dictionaries, while the SRDIP-based method uses the MS images directly. Two kinds of experiments are carried out in this paper: 1) demonstrating the effect of using MS images in dictionary construction; 2) comparing the ten pansharpening methods with the proposed method.

Table 2: Comparative methods
Class of methods    Method
CS-based            FIHS [13], LS [16]
MRA-based           AWLP [19], SVT [20]
Recent methods      FEP [35], AIHS [6], IAIHS [10], MMP [36]
SR-based            CSIF [21], SRDIP [28]
4.2. Quality evaluation

To quantitatively evaluate the various methods, we employ six global metrics computed against the reference image:

1) CC (correlation coefficient) [7] measures the degree of correlation between the fused result and the reference; the larger the CC, the closer the fused result is to the reference.
2) SAM (spectral angle mapper) [37] denotes the absolute value of the spectral angle between the pixel vectors of the fused result and the reference; a smaller SAM value indicates less spectral distortion.
3) RMSE (root mean squared error) indicates the average squared difference between the fused result and the reference; a smaller RMSE indicates better performance.
4) UIQI (universal image quality index) [38] reflects the similarity of the fused image to the reference; the higher the UIQI value, the better the quality, with a maximum of 1.
5) ERGAS (relative dimensionless global error in synthesis) [39] measures the overall quality of the fused result; a better fused result has a smaller value.
6) RASE (relative average spectral error) [40] reflects the average performance across the spectral bands; a lower RASE value means higher spectral quality.
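For reference, two of these metrics can be sketched as follows. The SAM here is the mean per-pixel spectral angle in radians, and the ERGAS uses the standard 100/ratio scaling; both are common conventions assumed for this illustration rather than details stated above.

import numpy as np

def sam(fused, ref, eps=1e-12):
    # Mean spectral angle (radians) between per-pixel spectra of (H, W, N) images.
    dot = (fused * ref).sum(axis=2)
    denom = np.linalg.norm(fused, axis=2) * np.linalg.norm(ref, axis=2) + eps
    return float(np.mean(np.arccos(np.clip(dot / denom, -1.0, 1.0))))

def ergas(fused, ref, ratio=4):
    # ERGAS = 100/ratio * sqrt(mean over bands of (RMSE_b / mean_b)^2).
    rmse = np.sqrt(((fused - ref) ** 2).mean(axis=(0, 1)))
    mean_b = ref.mean(axis=(0, 1))
    return float(100.0 / ratio * np.sqrt(np.mean((rmse / mean_b) ** 2)))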
4.3. Results and Comparison

(1) Validating the effect of dictionary selection

Fig. 5 shows the correlation coefficients (CC) between the HFCs of thirty PAN images decomposed using the D_ic^MS and D_ic^HMS dictionaries, respectively. From the results, we can see that the CC values are all higher than 0.980, with an average of 0.989. Therefore, the HFCs of the PAN images decomposed using the D_ic^MS dictionary are highly similar to those obtained using the D_ic^HMS dictionary.
Figure 5: The correlation coefficients (CC) of high-frequency components (HFCs) decomposed by using different dictionaries.
Fig. 6 shows the CC between the original PAN images and the PAN images reconstructed using the D_ic^MS and D_ic^HMS dictionaries, where CC-MS denotes the CC obtained with the D_ic^MS dictionary and CC-HMS the CC obtained with the D_ic^HMS dictionary. The reconstruction results of the two dictionaries are almost exactly the same. To analyze the difference further, Fig. 7 plots the difference between the two CC curves; it can be seen that the difference is always below 0.01. Thus, the reconstruction results obtained with the D_ic^MS and D_ic^HMS dictionaries are consistent with each other.
Figure 6: The correlation coefficients (CC) of the original PAN images and the PAN images reconstructed by using the D_ic^MS and D_ic^HMS dictionaries, respectively
Figure 7: The difference values between the two correlation coefficients (CC) in Fig. 6
From these two groups of experimental results, we can conclude that decomposing and reconstructing the PAN image with HF and LF dictionaries constructed from the information of MS images is effective.
(2) Spot-6 satellite images

Spot-6 satellite images of an architecture area are employed to further validate the proposed method. The results are shown in Fig. 8. To suit the human visual system (HVS), the sharpened results display only the RGB channels.
Figure 8: (a) MS image. (b) PAN image. (c) Reference HMS image. (d)-(o) various sharpened results. (d) FIHS. (e) AWLP. (f) SVT. (g) MMP. (h) FEP. (i) AIHS. (j) IAIHS. (k) BFLP. (l) LS. (m) CSIF. (n) SRDIP. (o) Proposed.
In the upper-right red region, the FIHS-, AWLP-, SVT-, MMP-, FEP-, AIHS-, IAIHS-, BFLP-, and LS-based methods suffer from different degrees of spectral distortion (see Figs. 8(d)-(h) and (l)).
Table 3: Objective quality metrics for Fig. 8
Methods     CC      ERGAS   UIQI    SAM     RASE     RMSE
FIHS        0.904   3.815   0.881   0.068   15.307   44.211
AWLP        0.930   3.584   0.912   0.071   14.740   42.576
SVT         0.903   3.958   0.892   0.071   15.977   46.149
MMP         0.921   3.838   0.914   0.067   15.231   43.994
FEP         0.926   3.732   0.923   0.071   15.175   43.831
AIHS        0.948   3.340   0.914   0.067   13.234   38.226
IAIHS       0.943   3.409   0.909   0.071   13.519   39.047
BFLP        0.945   2.990   0.939   0.071   12.197   35.229
LS          0.920   3.608   0.920   0.065   14.222   41.077
CSIF        0.900   4.046   0.864   0.074   16.038   46.323
SRDIP       0.949   2.907   0.943   0.072   11.814   34.123
Proposed    0.947   2.815   0.944   0.064   11.415   32.971
The IAIHS- and CSIF-based methods generate results that are brighter than the reference HMS image (see Figs. 8(j) and (m)). The sharpened result of the CSIF-based method is also blurred compared with the other methods, since its serious blocking effect leads to a fuzzy result. On the whole, the AIHS-, BFLP-, and SRDIP-based methods and the proposed method perform best in terms of visual quality. The objective quality metric values for this example are given in Tab. 3 and are consistent with the visual impressions. The CSIF-based method performs worst in the objective evaluation, while the SRDIP-based method and the proposed method obtain the best objective performance. Although the proposed method does not achieve the best CC value, it gives the best values for the remaining indexes, i.e., ERGAS, UIQI, SAM, RASE, and RMSE.

Another test example, covered by vegetation, is shown in Fig. 9. The FIHS-based method exhibits obvious spectral distortion (see Fig. 9(d)).
The result of the AWLP-based method shows more colorful information than the reference HMS image in the green area (see Fig. 9(e)). In Fig. 9(h), the result of the FEP-based method is dimmer than the reference HMS image in overall hue. Compared with the SRDIP-based method and the proposed method, the CSIF-based method exhibits a serious blocking effect (see Fig. 9(m)). The remaining methods not only enhance the spatial resolution of the MS image but also perform similarly to one another.
Figure 9: (a) MS image. (b) PAN image. (c) Reference HMS image. (d)-(o) various sharpened results. (d) FIHS. (e) AWLP. (f) SVT. (g) MMP. (h) FEP. (i) AIHS. (j) IAIHS. (k) BFLP. (l) LS. (m) CSIF. (n) SRDIP. (o) Proposed.
Tab. 4 gives the corresponding objective results for Fig. 9.
Table 4: Objective quality metrics for Fig. 9
Methods     CC      ERGAS   UIQI    SAM     RASE     RMSE
FIHS        0.879   2.833   0.862   0.056   12.229   35.685
AWLP        0.923   2.553   0.903   0.055   11.762   34.325
SVT         0.914   2.492   0.904   0.060   11.262   32.864
MMP         0.931   2.442   0.923   0.060   10.707   31.245
FEP         0.919   2.741   0.913   0.063   13.168   38.428
AIHS        0.932   2.221   0.917   0.055   10.096   29.461
IAIHS       0.942   2.193   0.926   0.055   10.045   29.315
BFLP        0.945   2.079   0.939   0.055   9.807    28.618
LS          0.926   2.376   0.925   0.053   10.177   29.697
CSIF        0.921   2.396   0.910   0.058   10.877   31.740
SRDIP       0.946   2.041   0.941   0.056   9.768    28.506
Proposed    0.944   1.972   0.942   0.050   9.236    26.951
The FIHS-based method gives the worst values for the CC, ERGAS, and UIQI indexes, while the FEP-based method gives the worst SAM, RASE, and RMSE values. Although the CSIF-based method performs better than some classical methods, i.e., the FIHS- and AWLP-based methods, it is still worse than stronger methods such as the IAIHS- and BFLP-based methods. On the whole, the proposed method and the SRDIP-based method are clearly better than the comparative methods on five of the six indexes, the exception being the CC metric. This sufficiently demonstrates that the proposed method performs better than the others.

(3) WorldView-2 satellite images

Fig. 10 shows the test images captured by the WorldView-2 satellite, which relate to an architecture area. Figs. 10(a)-(c) are the MS, PAN, and reference
HMS images, respectively. From the results, we can see that all methods are confronted with different degrees of spectral distortion in the blue roof and road areas. Obvious spectral distortions appear in the architecture area for the AWLP-based and CSIF-based methods (see Figs. 10(e) and (m)). In the results of the AIHS-based and IAIHS-based methods, the white road markings are more blurred than in the other methods (see Figs. 10(i) and (j)). Tab. 5 lists the objective values of these methods. The AWLP-based and CSIF-based methods still perform unsatisfactorily, while the BFLP-based method, the SRDIP-based method, and the proposed method do much better than the other methods. Apart from the CC and UIQI indexes, which are slightly below the best values, the remaining indexes of the proposed method are all the best.
Figure 10: (a) MS image. (b) PAN image. (c) Reference HMS image. (d)-(o) various sharpened results. (d) FIHS. (e) AWLP. (f) SVT. (g) MMP. (h) FEP. (i) AIHS. (j) IAIHS. (k) BFLP. (l) LS. (m) CSIF. (n) SRDIP. (o) Proposed.
Table 5: Objective quality metrics for Fig. 10
Methods     CC      ERGAS   UIQI    SAM     RASE     RMSE
FIHS        0.783   9.007   0.764   0.097   35.492   109.127
AWLP        0.818   9.484   0.804   0.098   37.988   116.799
SVT         0.796   9.094   0.792   0.107   35.318   108.591
MMP         0.818   8.559   0.813   0.101   33.583   103.255
FEP         0.828   9.034   0.824   0.100   35.452   109.002
AIHS        0.804   8.676   0.784   0.094   34.033   104.640
IAIHS       0.806   8.550   0.786   0.091   33.531   103.097
BFLP        0.854   7.607   0.852   0.090   30.562   93.967
LS          0.841   8.000   0.837   0.093   31.429   96.633
CSIF        0.791   9.596   0.789   0.121   38.236   117.563
SRDIP       0.848   7.746   0.844   0.091   30.728   94.477
Proposed    0.853   7.580   0.850   0.086   30.000   92.238
Table 6: Objective quality metrics for Fig. 11
Methods     CC      ERGAS   UIQI    SAM     RASE     RMSE
FIHS        0.898   3.381   0.888   0.064   14.133   80.063
AWLP        0.921   3.406   0.906   0.060   14.153   80.174
SVT         0.898   3.534   0.895   0.078   14.711   83.338
MMP         0.921   3.240   0.920   0.072   14.394   81.541
FEP         0.920   3.641   0.916   0.071   17.263   97.792
AIHS        0.927   2.989   0.919   0.062   13.090   74.156
IAIHS       0.929   2.907   0.921   0.060   12.747   72.214
BFLP        0.935   2.848   0.932   0.060   12.275   69.536
LS          0.935   2.928   0.935   0.060   13.160   74.550
CSIF        0.914   3.189   0.912   0.077   13.911   78.805
SRDIP       0.932   2.854   0.930   0.061   12.332   69.862
Proposed    0.934   2.735   0.933   0.053   11.684   66.192
(4) Pléiades satellite images
Figure 11: (a) MS image. (b) PAN image. (c) Reference HMS image. (d)-(o) various sharpened results. (d) FIHS. (e) AWLP. (f) SVT. (g) MMP. (h) FEP. (i) AIHS. (j) IAIHS. (k) BFLP. (l) LS. (m) CSIF. (n) SRDIP. (o) Proposed.
Fig. 11 shows a vegetation area captured by the Pléiades satellite. We can clearly see that the FIHS-based method suffers from serious spectral distortion (see Fig. 11(d)). The AWLP-based and SVT-based methods show dim colors in the road area, and in the same area the AIHS-, IAIHS-, LS-, and CSIF-based methods exhibit spectral distortion (Figs. 11(i)-(j), (l), and (m)). Additionally, the AIHS-based and IAIHS-based methods lose some significant detail, i.e., the marked line on the green grass in the close-up area (see Figs. 11(i)-(j)). Relatively speaking, the BFLP-based method, the SRDIP-based method, and the proposed method achieve the desired effect. Tab. 6 gives the objective values, which are consistent with the visual quality.

Fig. 12 and Fig. 13 show scenes of soil areas and of high-contrast spectral areas, respectively. Both the subjective and the objective evaluations (Tabs. 7-8) show that the proposed method outperforms the other methods.
Figure 12: (a) MS image. (b) PAN image. (c) Reference HMS image. (d)-(o) various sharpened results. (d) FIHS. (e) AWLP. (f) SVT. (g) MMP. (h) FEP. (i) AIHS. (j) IAIHS. (k) BFLP. (l) LS. (m) CSIF. (n) SRDIP. (o) Proposed.
Table 7: Objective quality metrics for Fig. 12
Methods     CC      ERGAS   UIQI    SAM     RASE     RMSE
FIHS        0.911   3.061   0.904   0.063   13.351   73.044
AWLP        0.922   3.264   0.909   0.058   14.217   77.780
SVT         0.911   3.237   0.906   0.074   13.947   76.304
MMP         0.929   3.038   0.929   0.064   13.832   75.675
FEP         0.927   3.280   0.926   0.064   15.278   83.590
AIHS        0.934   2.774   0.929   0.060   12.442   68.072
IAIHS       0.938   2.702   0.933   0.058   12.207   66.785
BFLP        0.941   2.688   0.938   0.058   12.079   66.085
LS          0.941   2.758   0.941   0.057   12.621   69.052
CSIF        0.932   2.875   0.930   0.069   13.209   72.265
SRDIP       0.940   2.659   0.938   0.059   12.017   65.748
Proposed    0.940   2.595   0.939   0.053   11.603   63.482
Table 8: Objective quality metrics for Fig. 13
Methods     CC      ERGAS   UIQI    SAM     RASE     RMSE
FIHS        0.890   4.846   0.869   0.076   19.385   118.548
AWLP        0.910   4.961   0.900   0.074   20.054   122.642
SVT         0.892   4.837   0.890   0.083   19.399   118.639
MMP         0.898   4.794   0.895   0.086   19.097   116.788
FEP         0.905   4.817   0.900   0.085   19.283   117.926
AIHS        0.918   4.390   0.891   0.075   17.622   107.772
IAIHS       0.920   4.335   0.895   0.074   17.390   106.353
BFLP        0.925   4.013   0.924   0.074   16.320   99.808
LS          0.903   4.584   0.902   0.073   18.274   111.756
CSIF        0.899   4.566   0.895   0.099   18.511   113.206
SRDIP       0.926   3.961   0.924   0.075   16.021   97.979
Proposed    0.928   3.908   0.928   0.068   15.754   96.347
Figure 13: (a) MS image. (b) PAN image. (c) Reference HMS image. (d)-(o) various sharpened results. (d) FIHS. (e) AWLP. (f) SVT. (g) MMP. (h) FEP. (i) AIHS. (j) IAIHS. (k) BFLP. (l) LS. (m) CSIF. (n) SRDIP. (o) Proposed.
(5) IKONOS satellite images

Our comparative experiments are also conducted on test images showing scenes of the Wenchuan earthquake disaster region in China, captured by the IKONOS satellite. From the two groups of test images, it can be seen that the FIHS-, AWLP-, SVT-, and FEP-based methods are subject to serious spectral distortion (see Figs. 14(d)-(f) and (h), and Figs. 15(d)-(f) and (h)). In Fig. 14(g), the result of the MMP-based method is blurred in terms of spatial quality, and Fig. 14(m), the result of the CSIF-based method, shows an apparent blocking effect. By contrast, Fig. 15(m) suffers from serious spectral distortion, while the MMP-based method in Fig. 15(g) produces a better sharpened result. The remaining methods perform similarly to one another in both the spatial and spectral aspects. Tab. 9 and Tab. 10 give the objective evaluations for Fig. 14 and Fig. 15, respectively; both demonstrate that the proposed method achieves a better effect than the other methods.
Figure 14: (a) MS image. (b) PAN image. (c) Reference HMS image. (d)-(o) various sharpened results. (d) FIHS. (e) AWLP. (f) SVT. (g) MMP. (h) FEP. (i) AIHS. (j) IAIHS. (k) BFLP. (l) LS. (m) CSIF. (n) SRDIP. (o) Proposed.
Table 9: Objective quality metrics for Fig. 14
Methods     CC      ERGAS   UIQI    SAM     RASE     RMSE
FIHS        0.753   3.892   0.720   0.070   17.164   53.430
AWLP        0.737   5.182   0.704   0.066   21.786   67.816
SVT         0.815   3.738   0.810   0.078   16.040   49.932
MMP         0.882   3.196   0.873   0.070   15.631   48.656
FEP         0.844   3.644   0.844   0.068   16.707   52.008
AIHS        0.837   3.436   0.831   0.067   15.591   48.532
IAIHS       0.856   3.264   0.850   0.065   15.164   47.202
BFLP        0.872   3.110   0.870   0.066   14.467   45.034
LS          0.895   3.043   0.891   0.066   15.068   46.906
CSIF        0.879   3.228   0.875   0.071   15.731   48.969
SRDIP       0.875   3.069   0.872   0.066   14.333   44.616
Proposed    0.884   2.997   0.882   0.063   14.195   44.189
Table 10: Objective quality metrics for Fig. 15
Methods     CC      ERGAS   UIQI    SAM     RASE     RMSE
FIHS        0.748   3.658   0.675   0.060   16.124   55.451
AWLP        0.816   3.526   0.807   0.050   15.059   51.789
SVT         0.836   3.126   0.834   0.060   13.150   45.224
MMP         0.894   2.685   0.884   0.057   12.955   44.552
FEP         0.858   3.136   0.858   0.057   14.788   50.857
AIHS        0.873   2.791   0.869   0.052   12.735   43.797
IAIHS       0.878   2.721   0.874   0.050   12.436   42.767
BFLP        0.896   2.540   0.894   0.050   11.696   40.223
LS          0.909   2.503   0.905   0.052   12.249   42.126
CSIF        0.887   2.882   0.882   0.065   14.900   51.242
SRDIP       0.892   2.569   0.889   0.051   11.730   40.339
Proposed    0.907   2.437   0.905   0.048   11.457   39.403
Figure 15: (a) MS image. (b) PAN image. (c) Reference HMS image. (d)-(o) various sharpened results. (d) FIHS. (e) AWLP. (f) SVT. (g) MMP. (h) FEP. (i) AIHS. (j) IAIHS. (k) BFLP. (l) LS. (m) CSIF. (n) SRDIP. (o) Proposed.
Generally speaking, we can conclude that the CSIF-based method outperforms the classical methods (FIHS, AWLP, SVT) but has some drawbacks compared with the more recent methods (IAIHS, MMP, BFLP, LS). Although both the AWLP- and BFLP-based methods use the ARSIS concept [29], they still perform worse than the SRDIP-based method, which fully demonstrates the advantages of SR. Comparing the proposed method with the SRDIP-based method, the proposed method always achieves the best performance in terms of spectrum. Therefore, the spatial detail extracted by decomposing the PAN image with a dictionary constructed from the MS training set better fits the properties of the MS image.
5. Conclusion

In this paper, we propose a new SR-based pansharpening method. Unlike existing remote sensing image fusion methods, the proposed method uses MS images, which are available in practice, to build the high- and low-frequency dictionaries. In addition, the proposed method extracts the detail image by decomposing the PAN image with these dictionaries, which fit the spectral properties of the MS image. Compared with the other methods considered in this paper, the proposed method not only performs well on both subjective and objective indications, but also outperforms the existing SR-based methods, i.e., the CSIF-based and SRDIP-based methods.
Acknowledgements

The research in this paper is sponsored by the National Natural Science Foundation of China (No. 61701327, No. 61711540303, and No. 61473198) and the National Research Foundation of Korea (No. NFR-2017K2A9A2A06013711), and is also supported by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD) Fund and the Jiangsu Collaborative Innovation Center on Atmospheric Environment and Equipment Technology (CICAEET) Fund.
References

[1] J. Gubbi, R. Buyya, S. Marusic, M. Palaniswami, Internet of Things (IoT): A vision, architectural elements, and future directions, Future Generation Computer Systems 29 (7) (2013) 1645-1660.
[2] Y. Zhang, Understanding image fusion, Photogrammetric Engineering & Remote Sensing 70 (6) (2004) 657-661.
[3] G. Simone, A. Farina, F. C. Morabito, S. B. Serpico, L. Bruzzone, Image fusion techniques for remote sensing applications, Information Fusion 3 (1) (2002) 3-15.
[4] J. Liu, S. Liang, Pan-sharpening using a guided filter, Taylor & Francis, Inc., 2016.
[5] H. Ghassemian, A review of remote sensing image fusion methods, Information Fusion 32 (PA) (2016) 75-89.
[6] S. Rahmani, M. Strait, D. Merkurjev, M. Moeller, T. Wittman, An adaptive IHS pan-sharpening method, IEEE Geoscience & Remote Sensing Letters 7 (4) (2010) 746-750.
[7] L. Alparone, L. Wald, J. Chanussot, C. Thomas, P. Gamba, L. M. Bruce, Comparison of pansharpening algorithms: Outcome of the 2006 GRS-S data-fusion contest, IEEE Transactions on Geoscience & Remote Sensing 45 (10) (2007) 3012-3021.
[8] A. Jameel, M. M. Riaz, A. Ghafoor, Guided filter and IHS-based pan-sharpening, IEEE Sensors Journal 16 (1) (2015) 192-194.
[9] T. M. Tu, S. C. Su, H. C. Shyu, P. S. Huang, A new look at IHS-like image fusion methods, Information Fusion 2 (3) (2001) 177-186.
[10] Y. Leung, J. Liu, J. Zhang, An improved adaptive intensity-hue-saturation method for the fusion of remote sensing images, IEEE Geoscience & Remote Sensing Letters 11 (5) (2013) 985-989.
[11] J. Sun, Y. Jiang, S. Zeng, A study of PCA image fusion techniques on remote sensing, Proceedings of SPIE - The International Society for Optical Engineering 5985 (2005) 739-744.
[12] B. Aiazzi, S. Baronti, M. Selva, L. Alparone, MS + Pan image fusion by an enhanced Gram-Schmidt spectral sharpening.
[13] T. M. Tu, P. S. Huang, C. L. Hung, C. P. Chang, A fast intensity-hue-saturation fusion technique with spectral adjustment for IKONOS imagery, IEEE Geoscience & Remote Sensing Letters 1 (4) (2004) 309-312.
[14] J. Nunez, X. Otazu, O. Fors, A. Prades, V. Pala, R. Arbiol, Multiresolution-based image fusion with additive wavelet decomposition, IEEE Transactions on Geoscience & Remote Sensing 37 (3) (1999) 1204-1211.
[15] K. P. Upla, P. P. Gajjar, M. V. Joshi, Pan-sharpening based on nonsubsampled contourlet transform detail extraction, 2013.
[16] N. I. Cho, Wavelet-domain satellite image fusion based on a generalized fusion equation, Journal of Applied Remote Sensing 8 (1) (2014) 080599.
[17] M. Choi, R. Y. Kim, M. R. Nam, O. K. Hong, Fusion of multispectral and panchromatic satellite images using the curvelet transform, IEEE Geoscience & Remote Sensing Letters 2 (2) (2005) 136-140.
[18] B. Aiazzi, L. Alparone, A. Barducci, S. Baronti, Multispectral fusion of multisensor image data by the generalized Laplacian pyramid, in: IGARSS '99 Proceedings, IEEE 1999 International Geoscience and Remote Sensing Symposium, 1999, pp. 1183-1185.
[19] X. Otazu, M. Gonzalez-Audicana, O. Fors, J. Nunez, Introduction of sensor spectral response into image fusion methods. Application to wavelet-based methods, IEEE Transactions on Geoscience & Remote Sensing 43 (10) (2005) 2376-2385.
[20] S. Zheng, W. Z. Shi, J. Liu, J. Tian, Remote sensing image fusion using multiscale mapped LS-SVM, IEEE Transactions on Geoscience & Remote Sensing 46 (5) (2008) 1313-1322.
[21] S. Li, B. Yang, A new pan-sharpening method using a compressed sensing technique, IEEE Transactions on Geoscience & Remote Sensing 49 (2) (2011) 738-746.
[22] X. X. Zhu, R. Bamler, A sparse image fusion algorithm with application to pan-sharpening, IEEE Transactions on Geoscience & Remote Sensing 51 (5) (2013) 2827-2836.
[23] S. Li, H. Yin, L. Fang, Remote sensing image fusion via sparse representations over learned dictionaries, IEEE Transactions on Geoscience & Remote Sensing 51 (9) (2013) 4779-4789.
[24] M. Guo, H. Zhang, J. Li, L. Zhang, H. Shen, An online coupled dictionary learning approach for remote sensing image fusion, IEEE Journal of Selected Topics in Applied Earth Observations & Remote Sensing 7 (4) (2014) 1284-1294.
[25] M. Cheng, C. Wang, J. Li, Sparse representation based pansharpening using trained dictionary, IEEE Geoscience & Remote Sensing Letters 11 (1) (2014) 293-297.
[26] C. Jiang, H. Zhang, H. Shen, L. Zhang, Two-step sparse coding for the pan-sharpening of remote sensing images, IEEE Journal of Selected Topics in Applied Earth Observations & Remote Sensing 7 (5) (2014) 1792-1805.
[27] W. Wang, L. Jiao, S. Yang, K. Rong, Distributed compressed sensing-based pan-sharpening with hybrid dictionary, Neurocomputing 155 (C) (2015) 320-333.
[28] H. Yin, Sparse representation based pansharpening with details injection model, Signal Processing 113 (C) (2015) 218-227.
[29] T. Ranchin, B. Aiazzi, L. Alparone, S. Baronti, L. Wald, Image fusion: the ARSIS concept and some successful implementation schemes, ISPRS Journal of Photogrammetry & Remote Sensing 58 (1-2) (2003) 4-18.
[30] E. J. Candes, J. Romberg, T. Tao, Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information, IEEE Transactions on Information Theory 52 (2) (2006) 489-509.
[31] D. L. Donoho, Compressed sensing, IEEE Transactions on Information Theory 52 (4) (2006) 1289-1306.
[32] S. S. Chen, D. L. Donoho, M. A. Saunders, Atomic decomposition by basis pursuit, SIAM Review 43 (1) (2001) 129-159.
[33] I. F. Gorodnitsky, B. D. Rao, Sparse signal reconstruction from limited data using FOCUSS: A re-weighted minimum norm algorithm, IEEE Transactions on Signal Processing 45 (3) (1997) 600-616.
[34] J. A. Tropp, Greed is good: Algorithmic results for sparse approximation, IEEE Transactions on Information Theory 50 (10) (2004) 2231-2242.
[35] J. Lee, C. Lee, Fast and efficient panchromatic sharpening, IEEE Transactions on Geoscience & Remote Sensing 48 (1) (2009) 155-163.
[36] X. Kang, S. Li, J. A. Benediktsson, Pansharpening with matting model, IEEE Transactions on Geoscience & Remote Sensing 52 (8) (2014) 5088-5099.
[37] R. Yuhas, A. F. H. Goetz, J. W. Boardman, Discrimination among semi-arid landscape endmembers using the spectral angle mapper (SAM) algorithm, 1992.
[38] Z. Wang, A. C. Bovik, A universal image quality index, IEEE Signal Processing Letters 9 (3) (2002) 81-84.
[39] L. Wald, Quality of high resolution synthesised images: Is there a simple criterion?, 2000, pp. 99-103.
[40] Y. Yang, W. Wan, S. Huang, F. Yuan, S. Yang, Y. Que, Remote sensing image fusion based on adaptive IHS and multiscale guided filter, IEEE Access 4 (2016) 4573-4582.
Authors Biography:

Xiaomin Yang is currently an Associate Professor in the College of Electronics and Information Engineering, Sichuan University. She received her BS degree from Sichuan University, and her PhD degree in communication and information systems from Sichuan University. She worked at the University of Adelaide as a postdoctoral researcher for one year. Her research interests are image processing and pattern recognition.

Lihua Jian received his MS degree in Instrument Science and Technology from the University of Electronic Science and Technology of China. He is currently pursuing a doctoral degree in the College of Electronics and Information Engineering, Sichuan University. His research interests are image processing and computer vision.

Binyu Yan is currently an Associate Professor in the College of Electronics and Information Engineering, Sichuan University. He received his BS degree from Sichuan University, and his MS and PhD degrees in communication and information systems from Sichuan University. His research interests are image processing and pattern recognition.

Kai Liu is currently a Professor in the School of Electrical Engineering and Information at Sichuan University, China. He received his BS and MS degrees in computer science from Sichuan University, and his PhD in electrical engineering from the University of Kentucky. His main research interests include computer/machine vision, active/passive stereo vision, and image processing. He is a senior member of the IEEE.

Lei Zhang is currently an Associate Professor in the College of Computer Science, Sichuan University. He received his MS and PhD degrees from Sichuan University. His research interests are machine learning.

Yiguang Liu received the MS degree from Peking University in 1998, and the PhD degree from Sichuan University in 2004. He was a Research Fellow, Visiting Professor, and Senior Research Scholar with the National University of Singapore, Imperial College London, and Michigan State University, respectively. He was chosen for the Program for New Century Excellent Talents of the MOE in 2008, and as a Scientific and Technical Leader in Sichuan Province in 2010.
Highlights:
HF and LF dictionaries are constructed from the information of MS images.
The HFC of the PAN image is precisely extracted with the HF dictionary.