Journal of Integrative Agriculture 2019, 18(12): 2883–2897 Available online at www.sciencedirect.com
ScienceDirect
RESEARCH ARTICLE
High resolution crop intensity mapping using harmonized Landsat-8 and Sentinel-2 data HAO Peng-yu1, 2, TANG Hua-jun1, CHEN Zhong-xin1, YU Le3, WU Ming-quan4 1
Key Laboratory of Agricultural Remote Sensing, Institute of Agricultural Resources and Regional Planning, Chinese Academy of Agricultural Sciences, Beijing 100081, P.R.China
2
Key Laboratory for Geo-Environmental Monitoring of Coastal Zone of the National Administration of Surveying, Mapping and GeoInformation/Shenzhen Key Laboratory of Spatial Smart Sensing and Services, Shenzhen University, Shenzhen 518060, P.R.China 3 Ministry of Education Key Laboratory for Earth System Modeling, Department of Earth System Science, Tsinghua University, Beijing 100084, P.R.China 4 State Key Laboratory of Remote Sensing Science, Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences, Beijing 100101, P.R.China
Abstract An increase in crop intensity could improve crop yield but may also lead to a series of environmental problems, such as depletion of ground water and increased soil salinity. The generation of high resolution (30 m) crop intensity maps is an important method used to monitor these changes, but this is challenging because the temporal resolution of the 30-m image time series is low due to the long satellite revisit period and high cloud coverage. The recently launched Sentinel-2 satellite could provide optical images at 10–60 m resolution and thus improve the temporal resolution of the 30-m image time series. This study used harmonized Landsat Sentinel-2 (HLS) data to identify crop intensity. The sixth polynomial function was used to fit the normalized difference vegetation index (NDVI) and enhanced vegetation index (EVI) curves. Then, 15-day NDVI and EVI time series were then generated from the fitted curves and used to generate the extent of croplands. Lastly, the first derivative of the fitted VI curves were used to calculate the VI peaks; spurious peaks were removed using artificially defined thresholds and crop intensity was generated by counting the number of remaining VI peaks. The proposed methods were tested in four study regions, with results showing that 15-day time series generated from the fitted curves could accurately identify cropland extent. Overall accuracy of cropland identification was higher than 95%. In addition, both the harmonized NDVI and EVI time series identified crop intensity accurately as the overall accuracies, producer’s accuracies and user’s accuracies of non-cropland, single crop cycle and double crop cycle were higher than 85%. NDVI outperformed EVI as identifying double crop cycle fields more accurately.
Received 15 August, 2018 Accepted 1 December, 2018 HAO Peng-yu, Mobile: +86-13718668296, E-mail: haopy8296 @163.com; Correspondence TANG Hua-jun, E-mail:
[email protected] © 2019 CAAS. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http:// creativecommons.org/licenses/by-nc-nd/4.0/). doi: 10.1016/S2095-3119(19)62599-2
2884
HAO Peng-yu et al. Journal of Integrative Agriculture 2019, 18(12): 2883–2897
Keywords: crop intensity, time series, sixth polynomial function, harmonized Landsat-8 and Sentinel-2
1. Introduction A growing population and climate change can lead to greater pressure on global food security (Tilman et al. 2011; Bajzelj et al. 2014). Currently, where acreage of arable land is limited, a common strategy for increasing crop production and improving global food security is the intensification of agricultural land use (Siebert et al. 2010). Global warming has caused crop intensity to increase in some areas (Zhang et al. 2013; Challinor et al. 2015); however, the increase in crop intensity reacts to both natural resources and climate. A common example of this effect is the depletion of ground water and increased soil salinity due to significantly increased irrigation (Rodell et al. 2009; Wang et al. 2018), reduced fallow period, greater rates of evapotranspiration, and cooler temperatures (Mueller et al. 2017). Accurate crop intensity distribution maps could provide valuable information for agricultural and environmental studies (Huang et al. 2018). Satellite observations have been widely used to map crop intensity because cropland with multiple crop cycles can be described by multi-temporal remote-sensing data (Boryan et al. 2017; Song et al. 2017). Generally, most crop intensity maps are generated using coarse resolution data, such as data from NOAA-AVHRR, SPOT VEGETATION data and MODIS data, as these data have high temporal frequency (Jain et al. 2013; Gray et al. 2014; Li et al. 2014; Conrad et al. 2016; Estel et al. 2016; Tatsumi 2016; Zhao et al. 2016; Qiu et al. 2018). At finer resolution (such as 30 m), crop intensity extraction is limited since the revisit period for Landsat data is low and the possibility of acquiring cloud-free images in each season is limited (Kovalskyy and Roy 2013; Hao et al. 2014; Yu et al. 2015). The fusion of MODIS and Landsat data improved the temporal frequency of 30-m image time series. The fused data could support crop intensity mapping (Li et al. 2017), but the image fusion algorithms are based on the assumption that MODIS pixels are a linear combination of each land cover classes within the corresponding pixel, which may induce error into the fused image time series (Maselli et al. 1998). Recently, more satellites with high spatial resolution have become available, such as Sentinel-2 (van der Meer et al. 2014; ESA 2016), the temporal resolution of 30-m image time series is significantly improved by merging data from multiple sources (Wang et al. 2017; Xiong et al. 2017; Stumpf et al. 2018). The increased temporal frequency of the 30-m image time series makes it possible to directly
generate high resolution crop intensity maps (Li and Roy 2017). Although the temporal density of image time series is improved by merging data from multiple data sources, the harmonized Landsat-8 and Sentinel-2 data remain irregular because the data contain a number of missing values due to cloud cover. Therefore, most existing crop intensity extraction methods (such as moving window and secondorder difference algorithms) cannot process harmonized data as these methods need regular time series (Li et al. 2014; Zhao et al. 2016). Although an image composition method was used to reduce the missing values within the image time series at 30-m resolution (Xiong et al. 2017), the composited images had low temporal frequency (commonly monthly or two-monthly composition) which cannot support crop intensity mapping. Therefore, a new process was required to deal with irregular time series data for crop intensity extraction. This study aims to use harmonized Landsat-8 and Sentinal-2 data to identify crop intensity at 30-m resolution. The objectives are: (1) to estimate the data availability of harmonized Landat-8 and Sentinel-2 data for crop intensity mapping, and (2) to propose a method which can deal with irregular time series and calculate crop intensity. Four study regions with different crop planting systems were used to test the performance of the harmonized 30-m image time series and the proposed method for crop intensity mapping.
2. Study areas and datasets 2.1. Study regions There are four regions investigated in this study (Fig. 1). The first study region (96°0´–97°0´W, 38°0´–39°0´N) is located in Kansas (KS), USA. This study region is dominated by agriculture where the major crop types are corn, soybean sorghum and winter wheat, with growing season of between April and October. After the harvest, some fields were used to plant summer crops; therefore, the study region has both single and double crop cycles and the fields size range from 65 to 245 ha (Wardlow and Egbert 2008; Masialeti et al. 2010; Hao et al. 2015). The second study region (28°00´–29°00´E, 25°20´–26°20´S) is located in the Gaoteng Province of South Africa (GT); it is a representative region of the farming-pastoral transitional zone. Agricultural activities here are mixed with cattle ranching and sheep farming. The dominant crop types are maize, wheat, sugar cane and sunflowers. In addition, cropland fields in the GT site are mainly medium (2.56–16 ha) and large (larger than 16 ha)
2885
HAO Peng-yu et al. Journal of Integrative Agriculture 2019, 18(12): 2883–2897
116°40´E
117°0´E
B
0
15
97°20´W
97°40´W
120°W
30
60
97°0´W
36°40´N
38°0´N
38°0´N
36°20´N
38°20´N
38°20´N
38°40´N
38°40´N
A
116°20´E
37°0´N
116°0´E
96°40´W
36°20´N
97°0´W
37°0´N
97°20´W
36°40´N
97°40´W
0 116°0´E
96°40´W
60°W
15
30
60
km
0°
60°E
116°20´E
120°E
116°40´E
km
117°0´E
N
30°N 30°S
30°S
30°N
C
29°0´E
D
76°0´E
25°40´S
76°20´E
76°40´E
77°0´E
E
28°0´E
28°20´E
30
28°40´E
60
km
29°0´E
0 76°0´E
76°20´E
12.5
25
76°40´E
50
29°0´N
15
26°20´S
0
29°0´N
26°0´S
26°0´S 26°20´S
120°E
29°20´N
28°40´E
60°E
25°20´S
28°20´E
Cropland
29°40´N
0°
60°W
25°40´S
25°20´S
28°0´E
10 000 km
29°40´N
120°W
2 500 5 000
29°20´N
0
km
77°0´E
Fig. 1 Location of study regions and their zoom in images. A, Kansas, USA. B, Gaoteng, South Africa. C, location of the four study regions. D, Punjab, India. E, Shandong, China.
farm plots and the crop fields in the region generally have a single crop cycle. The crop growing seasons are during November to April of the next year. The third study region (76°00´–77°10´E, 28°50´–29°00´N) is located in Punjab (PUN), India; adjacent to the Himalayas, it receives heavy rainfall. This study region has high food production and the major crops are wheat and cotton. Most crop fields in this study region have two crop cycles (Gupta 1993;
Anindita et al. 2009). The last study region (84°0´–85°0´E, 41°0´–42°0´S) is located in Shandong Province (SHA), China. This region is a representative region of temperate and monsoonal climate with four clearly distinct seasons. The crop cycles in this study region contain both single crop cycles (cotton and peanut) and double crop cycles (winterwheat and summer-maize). Crop seasons occur during March to November and the average field size in this study
2886
HAO Peng-yu et al. Journal of Integrative Agriculture 2019, 18(12): 2883–2897
region is small (smaller than 2.56–16 ha).
2.2. Harmonized Landsat Sentinel-2 (HLS) data The remote sensing data used in this study were acquired from the Harmonized Landsat Sentinel-2 (HLS) project (NASA 2017), which aims to produce a consistent, harmonized surface reflectance product from Landsat-8 and Sentinel-2 data. The start point of HLS is the Top of Atmosphere Reflectance data from Landsat-8 and Sentinel-2. After several processing steps (atmospheric correction, cloud masking, geometric resampling and geographic registration, bidirectional reflectance distribution function (BRDF) normalization and Band pass adjustment), stackable surface reflectance product is seen with the Sentinel Tiling System. The HLS data used in this study are the M30 surface reflectance product with 30-m spatial resolution and a 5-day temporal resolution. Table 1 showed the HLS data used for each of the four study regions. The beginning date of each time series differs as these study regions have different growing seasons. For the HLS images, three spectral bands (Blue band, Red band and NIR band) are acquired and then the normalized difference vegetation index (NDVI) and enhanced vegetation index (EVI) were calculated accordingly, using eqs. (1–2) (Rouse et al. 1974; Huete et al. 2002). We then used the NDVI and EVI time series to calculate crop intensity. ρNIR–ρRed (1) NDVI= ρ +ρ NIR Red ρNIR–ρRed
(2) ρNIR+6×ρRed–7.5×ρBlue+1 where ρNIR, ρRed, and ρBlue are the surface of the NIR band, Red band and Blue band, respectively. EVI=2.5×
2.3. Training and validation samples Training samples used in this study contain two parts: (1) open-access samples collected from global datasets - FROM-GLC and GFSAD30 - as cropland/non-cropland (Thenkabail et al. 2012; Gong et al. 2013); and (2) visually interpreted polygons, based on samples from (1) and very high resolution Google Earth images. The polygons were
converted to 30-m pixels corresponding to the images, which were then used as training samples for cropland extent identification. The number of training polygons and samples are shown in Table 2. To verify the cropland extent and crop intensity map generated from this study, 1 000 points were randomly generated for both study regions and these points were visually interpreted as cropland/non-cropland based on the very high-resolution images. The cropland samples were then visually identified as single crop/double crop cycles based on the NDVI time series of each pixel. These samples were used to validate cropland extent and crop intensity maps; the single and double crop cycles points were merged as cropland when verifying the cropland extent identification. In addition, we also included some groundsurveyed samples for validation. In GT, the surveyed fields were surveyed monthly and land cover type recorded, as well as the crop cycles of the surveyed fields. In SHA, we observed the fields in July, the winter wheat-summer maize fields could be identified from the single crop cycle fields as there is wheat stubble visible; these fields are different from single crop cycle fields as there is wheat stubble visible. Although the number of field-surveyed samples is low, these samples allowed the generation of validation data. Our field surveys greatly improved our ability to classify the samples interpreted solely from high resolution satellite imagery. The number of validation samples used for each crop cycle type are shown in Table 3.
3. Methods 3.1. Flowchart Fig. 2 shows a flowchart of the process of generating crop intensity maps: the NDVI and EVI time series at 30-m resolution were calculated for both study regions; then, the polynomial method was used to fit the VI time series for both study regions and generate 15-day NDVI/EVI images time series; next, the 15-day NDVI and EVI time series were used to derive a cropland extent map. Afterwards, we used the fitted sixth polynomial function of the NDVI time series and EVI time series to generate the VI peaks and calculate crop
Table 1 Information of the Harmonized Landsat Sentinel-2 (HLS) data used in this study1) Study region2) KS GT PUN SHA 1) 2)
Tile 14SPH 35JPM 43RFN 50SMF
Beginning date (DOY/Year) 288/2015 183/2016 136/2016 41/2016
Ending date (DOY/Year) 282/2016 152/2017 103/2017 304/2016
No. of Landsat-8 images No. of Sentinel-2 images 12 46 8 34 11 43 13 47
DOY, days of a year. KS, Kansas, USA; GT, Gaoteng, South Africa; PUN, Punjab, India; SHA, Shandong, China.
2887
HAO Peng-yu et al. Journal of Integrative Agriculture 2019, 18(12): 2883–2897
the VI time series. As most fields in the four study regions have single and double crop cycles, and the sixth order polynomial function has been proven to describe the NDVI/ EVI time series of fields with a double crop cycle (Piao et al. 2010; Jeong et al. 2011; Liu et al. 2017), the sixth order polynomial function (eq. (3)) was used in this study, defined as: VI(t)=a0+a1t+a2t2+...+a6t6(3) where t is time series length refers to the beginning of time series (for example, the beginning date of the HLS data in
intensity, respectively. Finally, the performances of NDVI and EVI time series when generating cropland extent and intensity were compared in both study regions.
3.2. Reconstruction of time series In this study, polynomial functions were used to reconstruct Table 2 Number of training polygons and samples Study region1) Classes Polygon number Sample number KS Non-cropland 45 253 Cropland 41 314 GT Non-cropland 64 200 Cropland 57 252 PUN Non-cropland 51 361 Cropland 59 402 SHA Non-cropland 39 423 Cropland 37 433
Non-cropland Cropland Single crop cycle Double crop cycle
1)
1)
Table 3 Number of validation samples1)
KS, Kansas, USA; GT, Gaoteng, South Africa; PUN, Punjab, India; SHA, Shandong, China.
KS 430
GT 789
PUN 278
SHA 323
456 114
187 31
19 703
146 540
KS, Kansas, USA; GT, Gaoteng, South Africa; PUN, Punjab, India; SHA, Shandong, China.
Harmonized Landsat-8 and Sentinel-2 NDVI time series
Harmonized Landsat-8 and Sentinel-2 EVI time series
Sixth polynomial function fitting
Training samples
15-day NDVI time series
Dates of NDVI peaks
15-day EVI time series
Dates of EVI peaks
Random forest
Crop intensity mapping
Cropland extend map
Mask
NDVI-based crop intensity map
Validation samples
Cropland extent validation
EVI-based crop intensity map
Crop intensity validation
Fig. 2 Flowchart of the study. NDVI, normalized difference vegetation index; EVI, enhanced vegetation index.
2888
HAO Peng-yu et al. Journal of Integrative Agriculture 2019, 18(12): 2883–2897
when VDVI(t) fell below 0.45 and spurious peaks of the EVI time series were defined as when VDVI(t) fell below 0.35, because the NDVI and EVI peaks of croplands are generally higher than 0.5 and 0.40 (Li et al. 2017). Finally, crop intensity was generated by counting peak numbers for the calendar year: VI´(t)=a1+2×a2t+3×a3t2+...+6×a6t5 (4)
study region SHA is days of a year (DOY) 41 of 2016, the t of DOY 41 is 1, and t of DOY 45 is 5); VI(t) is the fitted NDVI or EVI; and a0, a1, ...an are the fitted parameters.
3.3. Cropland Extent identification After generating 15-day NDVI and EVI time series from the fitted curves, we could acquire a regular NDVI/EVI time series from the irregular HLS series. The Random Forest (RF) classifier was used to identify crop extent (Breiman 2001). RF is an ensemble machine learning technique that combines multiple trees and has shown good potential in classifying images (Hao et al. 2015; Immitzer et al. 2016). In this study, we used the Randomforest package of R to interpret the classifier (Breiman et al. 2013). There are two free parameters of RF: the number of trees (ntree) and the number of features to split the nodes (mtry), we defined ntree as 1 000 and mtry as the square root of the total number of input features; all NDVI and EVI time series were used as the input for crop extent identification.
4. Results 4.1. Time series reconstruction Fig. 3 shows the distribution of the adjusted R-square (adj-R2) of NDVI and EVI time series for the four study regions. Generally, adj-R2 values of EVI time series were higher than NDVI: average adj-R2 were 0.76 and 0.81 for NDVI and EVI in KS; 0.72 and 0.79 in GT; 0.71 and 0.79 in PUN; and 0.73 and 0.78 in SHA. The blue patterns in the Fig. 3 are failure-fitted pixels where less than six cloudfree observations were acquired. Fig. 4 shows some fitted NDVI and EVI time series examples. The sixth polynomial functions showed good potential to describe the annual trend for both single crop cycle and double crop cycle fields, although the VI peaks are slightly underestimated and the VI valleys are overestimated. In addition, the results showed that the VI time series of single crop cycle fields could be better fitted as their R2 and adj-R2 were higher than those of double crop cycle fields.
3.4. Crop intensity extraction The steps for mapping crop intensity are as follows: firstly, the first derivatives of function VI(t) (eq. (4)) were calculated for each pixel, finding the real number solution of VI´(t)=0. Secondly, we rounded each solution to the nearest whole number and generated a list of possible VI peak dates. The VI´(t) of each possible date was calculated and dates that VI´(t)<0 were removed, leaving only VI peaks. Thirdly, the date of spurious peaks were removed from the peak dates. Spurious peaks of the NDVI time series were defined as
4.2. Cropland extent Validation samples were used to verify the crop extent
A
B
C
D
E
F
G
H
adj-R2
0
1
Fig. 3 Adjusted R-square (adj-R2) images of the fitted normalized difference vegetation index (NDVI) and enhanced vegetation index (EVI) time series. A–D, the adj-R2 of fitted NDVI time series in Kansas (USA), Gaoteng (South Africa), Punjab (India) and Shandong (China), respectively. E–H, the adj-R2 of fitted EVI time series in Kansas, Gaoteng, Punjab and Shandong, respectively.
2889
HAO Peng-yu et al. Journal of Integrative Agriculture 2019, 18(12): 2883–2897
B
1.0
0.8
0.6
0.6
0.4
R2=0.9549 adj-R2=0.9336
0.2 0 50
100
150 200 250 Time series length (d)
0.6
0.6
0.4
R2=0.9529 adj-R2=0.9816 150 200 250 Time series length (d)
F
0.8
0.6
0.6
R2=0.9592 adj-R2=0.9577
0 50
100
150
200
250
Time series length (d)
0.6
EVI
0.8
0.6 0.4 R2=0.9631 adj-R2=0.9556 100
150 200 250 Time series length (d)
150 200 250 Time series length (d)
R2=0.7429 adj-R2=0.6524 100
150 200 250 Time series length (d)
300
0.4 R2=0.5815 adj-R2=0.5234
0.2 300
300
1.0
0.8
0 50
100
0.4 0 50
H
1.0
0.2
R2=0.8107 adj-R2=0.7965
0.2 300
300
1.0
0.8
0.2
150 200 250 Time series length (d)
0.4 0 50
300
1.0
0.4
100
0.2
EVI
EVI
100
R2=0.6767 adj-R2=0.6485
D 1.0 0.8
0 50
EVI
0 50
300
0.8
0.2
G
0.4 0.2
NDVI
NDVI
C 1.0
E
1.0
0.8 NDVI
NDVI
A
0 50
100
150 200 250 Time series length (d)
300
Fig. 4 Examples of fitted normalized difference vegetation index (NDVI) and enhanced vegetation index (EVI) time series. A and C, examples of NDVI time series with single NDVI peak. B and D, examples of NDVI time series with double NDVI peak. E and G, examples of EVI time series with single EVI peak. F and H, examples of EVI time series with double EVI peaks. adj-R2, adjusted R-square. Table 4 Cropland extent identification accuracies1) Accuracy metrics2) OA (%) Kappa coefficient PA of cropland (%) PA of non-cropland (%) UA of cropland (%) UA of non-cropland (%) 1) 2)
KS 96.2 0.9219 98.7 92.7 94.8 98.3
Fitted time series GT PUN 97.2 98.3 0.9204 0.9581 97.7 98.2 90.2 99.4 97.1 98.5 99.3 95.5
SHA 94.6 0.8722 97.8 87.6 94.4 94.7
KS 94.4 0.8846 98.3 89.3 92.4 97.6
GFSAD30 GT PUN 88.6 72.2 0.7041 0 92.7 100 67.1 72.2 87.5 0 97.7 0
SHA 86.5 0.6522 99.8 58.2 83.5 99.5
KS, Kansas, USA; GT, Gaoteng, South Africa; PUN, Punjab, India; SHA, Shandong, China. OA, overall accuracy, PA, producer’s accuracy, UA, user’s accuracy.
maps generated in this study. Overall accuracy (OA), producer’s accuracy (PA), user’s accuracy (UA) and the
Kappa coefficient derived from the confusion matrix were used to estimate cropland extent identification accuracy
2890
HAO Peng-yu et al. Journal of Integrative Agriculture 2019, 18(12): 2883–2897
(Congalton 1991). Table 4 shows that 15-day NDVI and EVI time series generated from the fitted NDVI and EVI curves have good potential for identifying cropland extent; OA of the four study regions were generally higher than 95%. In addition, the PA and UA of cropland and noncropland were higher than 90%. We further compared the cropland extent generated in this study with the Global Food
Security Analysis-Support Data (GFSAD30), and verified the classification accuracy of GFSAD30 using the validation samples of this study (Thenkabail et al. 2012; Gumma 2017; Teluguntla 2017, 2018; Xiong et al. 2017a, b; USGS 2018). Results (Table 3) show that the cropland extent acquired in this study outperformed GFSAD30 in the study regions, as the OAs, PAs and UAs of cropland extent maps generated
GT
KS
PUN
SHA
A1
A2
A3
A4
B1
B2
B3
B4
C1
C2
C3
C4
D1
D2
D3
D4
E1
E2
E3
E4
F1
F2
F3
F4
Cropland
Non-cropland
Fig. 5 Wall-to-wall cropland extent maps comparison. A1–A4, false color maps for each study regions. B1–B4, cropland extent maps generated in this study for each study regions. C1–C4, cropland extent maps of GFSAD30 for each study regions. D1–D4, the zoom-in false color maps for each study regions, the green rectangles in A1–A4 are the extent of the zoom-in images. E1–E4, zoom-in cropland extent maps generated in this study. F1–F4, zoom-in cropland extent maps of GFSAD30. KS, Kansas, USA; GT, Gaoteng, South Africa; PUN, Punjab, India; SHA, Shandong, China.
2891
HAO Peng-yu et al. Journal of Integrative Agriculture 2019, 18(12): 2883–2897
in this study were higher than GFSAD30. This result was obtained because we used image time series with 15-day temporal resolution to identify cropland, but GFSAD30 is a cropland extent map at a global scale where two-month composition data were used for cropland mapping, so the higher temporal resolution data used in this study generated more accurate cropland maps. Fig. 5 gives the cropland extent maps for the four study regions. In GT, croplands were mainly distributed in the southeast of the region; in KS, PUN and SHA, croplands are the major landcover type. In KS and GT, the crop fields were medium and large farm plots, with boundaries of the crop fields clearly described (Fig. 5). In SHA and PUN, farmland was dotted amongst villages, which is clearly depicted in the cropland extent map. The accurate cropland maps generated in this study can form a baseline for crop intensity mapping (Li et al. 2017).
4.3. Crop intensity Table 5 shows the classification accuracies of crop intensity mapping using NDVI and EVI time series. Fitted NDVI and EVI time series showed potential to map crop cycles as OAs, PAs and UAs were higher than 85% in most cases. The NDVI time series slight outperformed the EVI Table 5
PUN
SHA
1)
5. Discussion This study used HLS data to identify crop intensity at 30-m resolution. To overcome the drawbacks of currently existing 30-m resolution crop intensity mapping studies which fuse MODIS and Landsat data to generate 30-m image time series (Wu et al. 2015; Li et al. 2017), this study used Landsat-8 and Sentinel-2 data directly to generate 30-m image time series with high temporal frequency. The challenge of cropland extent/intensity mapping is the “irregular time
Crop intensity mapping accuraccy1)
Study region2) KS
GT
time series, as EVI misclassified more double crop cycle pixels as single crop cycle. For example, PA of single crop cycle generated from NDVI time series was 95.4%, which was 4% higher than EVI result; UA of double crop cycle generated from NDVI time series was 80.3 and 93.9% in KS and GT, which were around 6 and 10% higher than EVI result, respectively. Fig. 6 showed the spatial distribution of single and double crop cycle fields. Crop fields in GT primarily have single growing season, and cropland in PUN generally have double growing seasons. KS and SHA contain both single and double crop cycle fields. In SHA, double crop cycle fields are mainly located in the plain area of the study region while single crop cycle fields are mainly in the mountain regions.
VI used3) NDVI
OA (%) 94.4
Kappa 0.9066
EVI
92.7
0.8788
NDVI
98.1
0.9439
EVI
97.1
0.9202
NDVI
98.2
0.9584
EVI
98.1
0.9439
NDVI
93.1
0.8817
EVI
92.6
0.8732
PA (%) 93.7 95.4 92.9 93.9 91.9 91.2 98.5 95.7 100 97.3 95.7 100 98.6 100 98.0 98.6 94.7 98.0 87.9 86.9 97.8 88.2 84.3 97.4
UA (%) 98.3 94.9 80.3 97.6 93.7 74.8 99.1 94.2 93.9 99.3 90.9 83.8 95.4 95.0 99.4 95.5 94.7 99.2 94.9 84.7 94.3 95.0 82.6 93.9
OA, verall accuracy; PA, producer’s accuracy; UA, user’s accuracy. KS, Kansas, USA; GT, Gaoteng, South Africa; PUN, Punjab, India; SHA, Shandong, China. 3) VI, vegetation index; NDVI, normalized difference vegetation index; EVI, enhanced vegetation index. 2)
Crop intensity classes Non Single Double Non Single Double Non Single Double Non Single Double Non Single Double Non Single Double Non Single Double Non Single Double
2892
HAO Peng-yu et al. Journal of Integrative Agriculture 2019, 18(12): 2883–2897
A1
KS
A2
GT
A3
PUN
A4
B1
B2
B3
B4
C1
C2
C3
C4
D1
D2
D3
D4
Non-cropland
Single crop cycle
SHA
Double crop cycle
Fig. 6 Wall-to-wall crop intensity maps comparison. A1–A4, normalized difference vegetation index (NDVI) generated crop intensity maps of the four study regions. B1–B4, enhanced vegetation index (EVI) generated crop intensity maps of the four study regions. C1–C4, zoom-in NDVI crop intensity maps, the red rectangles in row (A) are the extent of the zoom-in images. D1–D4, zoom-in EVI crop intensity maps. KS, Kansas, USA; GT, Gaoteng, South Africa; PUN, Punjab, India; SHA, Shandong, China.
series” problems caused by cloud cover and image noise. This study used a sixth polynomial function to fit the NDVI and EVI curves and thus fill the “missing value” gaps to reduce the influence of image noise. The date of NDVI and EVI peaks were also calculated using the first derivative of the polynomial function. The workflow proposed in this work allows the use of image time series with high temporal resolution to generate crop intensity maps at 30-m resolution. Comparing with existing 30-m crop intensity mapping works, the proposed workflow does not need to fuse MODIS and Landsat data with reduced uncertainty during the data fusion procedure. In addition, existing crop intensity methods rely on the cropland extent map products; however, most 30-m global level cropland extents maps have high uncertainty because, when producing these cropland maps, long period image composition methods (such as two-month or longer composition periods) fail to describe the different cropland and natural vegetation seen (Gong et al. 2013). In this study, we generated 15-day NDVI and EVI time series from the fitted NDVI and EVI curves, meaning the image time series generated with high temporal resolution have better potential to generate accurate cropland extent maps.
A
B
C
D
Available observation 0–5 5–10 10–15 15–20 20–25 25–30 30–35
Fig. 7 Number of cloud-free observations acquired. A, Kansas, USA. B, Gaoteng, South Africa. C, Punjab, India. D, Shandong, China.
Although curve fitting methods and HLS data showed good potential to map crop intensity, there remain some issues with the application of the method. Firstly, cloud-free image availability of HLS is an issue as the sixth polynomial
HAO Peng-yu et al. Journal of Integrative Agriculture 2019, 18(12): 2883–2897
NDVI
1.0
2893
EVI
A
B
adj-R2
0.8 0.6 Kansas 0.4 0.2 0 1.0
0
10
20
30
40
0
C
10
20
30
40
D
adj-R2
0.8 0.6 Gaoteng
0.4 0.2 0 1.0
0
10
20
30
0
E
10
20
30
F
adj-R2
0.8 0.6
Punjab
0.4 0.2 0 1.0
0
10
20
30
40
0
G
10
20
30
40
H
adj-R2
0.8 0.6
Shandong
0.4 0.2 0
0
10
20 30 40 Available observation
0
10
20 30 40 Available observation
Fig. 8 Scatter plots of cloud-free observation and adjusted R-square (adj-R2) values. A and B, normalized difference vegetation index (NDVI) and enhanced vegetation index (EVI) results in Kansas, USA. C and D, NDVI and EVI results in Gaoteng, South Africa. E and F, NDVI and EVI results in Punjab, India. G and H, NDVI and EVI results in Shandong, China. Blue dots means the number of pixels is low and red dots means the number of pixels is high.
2894
HAO Peng-yu et al. Journal of Integrative Agriculture 2019, 18(12): 2883–2897
study. These thresholds are artificially defined. Although the thresholds could effectively remove spurious NDVI and EVI peaks in the two study regions, they should be further estimated and may be changed in other study regions. In addition, the growing season in different regions are different and the beginning point of the growing seasons are artificially defined here, so further work should focus on automatically detecting the beginning point the growing season for each pixel. Thirdly, this study used both NDVI and EVI and found that EVI mislabeled some double crop cycle pixels as single cycles. Basically, a drawback of the polynomial function is that VI peaks are underestimated and VI valleys are overestimated. Some EVI peaks are very low and so similar to the spurious peaks (Fig. 9); therefore, although the fitted EVI time series curves had higher R2 and adj-R2 values, classification accuracy of the EVI time series is low due to misclassification of double crop cycles. As a result, NDVI time series is recommended when identifying crop intensity using polynomial fitting methods.
function was used to fit NDVI and EVI curves; six cloudfree observation for each pixel were essential at the least. Although the harmonization of Landsat-8 and Sentinel-2 data has improved the temporal resolution of 30-m image time series, the number of cloud-free observations were still less than six in some regions (Fig. 7). In these regions, the polynomial functions failed to fit the curve. In addition, image acquisition influences curve-fitting performance and pixels with more acquisition generally have a higher R2 value. Fig. 8 shows that, in GT, most pixels acquired 15–25 cloudfree observations, with R2 values generally higher than 0.6; in PUN, most pixels acquired 15–40 cloud-free observations, with R2 values higher than 0.65. Besides the availability of cloud-free observations, R2 values of curve fitting are also affected by the image noise and temporal phases when acquiring images affect the curve-fitting performance; for example, if the cloud-free observations were mostly acquired in the beginning of the growing season, the fitted curve may poorly describe the VIs in other growing stages. We used HLS data from 2015, 2016 and early 2017 in this study, which is generated using Landsat-8 and Sentinel2A data. As Sentinel-2B data have been available since March 2017, the temporal resolution of HLS 30-m surface reflectance could further improve. Sentinel-2 images were primarily collected over Europe and Africa, so the skipped image acquisition may reduce data availability in other regions. Secondly, the expendability of the workflow should be further improved. For the threshold of VI peaks, NDVI and EVI peaks were defined as 0.5 and 0.35, respectively, in this
A
6. Conclusion This study used HLS data to identify crop intensity and reached the following conclusions: (1) Among the four study regions used in this study, HLS data have high image availability as most pixels have 15–40 cloud-free observations during the growing season. (2) Generally, HLS data have a high temporal resolution, which could support the sixth polynomial function curve fitting, as the average
B
1.0
0.2 0 50
EVI
C
NDVI
0.4
1.0 0.8
0.6 R2=0.8507 adj-R2=0.77 100 150 200 250 Time series length (d)
0.6 0.4 0.2 0 50
300 D
1.0
0.8
0.6
0.6
0.4 0.2 0 50
R2=0.8467 adj-R2=0.7759 100 150 200 250 Time series length (d)
0.4 0.2
300
R2=0.8513 adj-R2=0.7858 100 150 200 250 Time series length (d)
300
1.0
0.8 EVI
NDVI
0.8
0 50
R2=0.8572 adj-R2=0.7769 100 150 200 250 Time series length (d)
300
Fig. 9 Time series and fitted curves of double crop cycle pixels that normalized difference vegetation index (NDVI) identified correctly but enhanced vegetation index (EVI) misclassified. A and B, NDVI time series of the two example pixels. C and D, EVI time series of the corresponding pixels.
HAO Peng-yu et al. Journal of Integrative Agriculture 2019, 18(12): 2883–2897
adj-R2 values were higher than 0.6. (3) Cropland extent is accurately identified using the fitted 15-day NDVI and EVI time series as OA of cropland identification is higher than 95% in the four study regions, which is also the baseline of crop intensity mapping. (4) Both the NDVI and EVI time series showed good potential to map crop intensity as the OAs, PAs and UAs of non-cropland, single crop cycle and double crop cycle lands are higher than 85%. In addition, NDVI outperformed EVI for crop intensity mapping in this study as the fitted VI curves of double crop cycle fields underestimated the VI peak, while some fitted EVI peaks are similar to spurious peaks leading to misclassification. Future work will focus on merging data from more sources to improve the temporal resolution of the 30-m image time series, as well as evaluating the application of the proposed method in more study regions and conducting further testing on the threshold for removing spurious VI peaks.
Acknowledgements This research was supported by the China Postdoctoral Science Foundation (2017M620075 and BX201700286) and the National Natural Science Foundation of China (NSFC-61661136006).
References Anindita S, Sucharita S, Animesh K. 2009. Rice-wheat cropping cycle in Punjab: A comparative analysis of sustainability status in different irrigation systems. Environment, Development and Sustainability, 11, 751–763. Bajzelj B, Richards K S, Allwood J M, Smith P, Dennis J S, Curmi E, Gilligan C A. 2014. Importance of food-demand management for climate mitigation. Nature Climate Change, 4, 924–929. Boryan C G, Yang Z, Willis P, Di L. 2017. Developing crop specific area frame stratifications based on geospatial crop frequency and cultivation data layers. Journal of Integrative Agriculture, 16, 312–323. Breiman L. 2001. Random forests. Machine Learning, 45, 5–32. Breiman L, Cutler A, Liaw A, Wiener M. 2013. Breiman and cutler’s random forests for classification and regression. [2018-12-09]. http://math.furman.edu/~dcs/courses/ math47/R/library/randomForest/html/00Index.html Challinor A J, Parkes B, Ramirez-Villegas J. 2015. Crop yield response to climate change varies with cropping intensity. Global Change Biology, 21, 1679–1688. Congalton R G. 1991. A review of assessing the accuracy of classifications of remotely sensed data. Remote Sensing of Environment, 37, 35–46. Conrad C, Schonbrodt-Stitt S, Low F, Sorokin D, Paeth H. 2016. Cropping intensity in the aral sea basin and its dependency from the runoff formation 2000–2012. Remote
2895
Sensing, 8, 630. ESA (European Space Agency). 2016. Sentinel-2 for agriculture. [2018-10-28]. http://www.esa-sen2agri.org/ SitePages/sentinel2.aspx Estel S, Kuemmerle T, Levers C, Baumann M, Hostert P. 2016. Mapping cropland-use intensity across Europe using MODIS NDVI time series. Environmental Research Letters, 11, 024015. Gong P, Wang J, Yu L, Zhao Y C, Zhao Y Y, Liang L, Niu Z G, Huang X M, Fu H H, Liu S, Li C C, Li X Y, Fu W, Liu C X, Xu Y, Wang X Y, Cheng Q, Hu LY, Yao W B, Zhang H, et al. 2013. Finer resolution observation and monitoring of global land cover: First mapping results with Landsat TM and ETM+ data. International Journal of Remote Sensing 34, 2607–2654. Gray J, Friedl M, Frolking S, Ramankutty N, Nelson A, Gumma M K. 2014. Mapping Asian cropping intensity with MODIS. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 7, 3373–3379. Gumma M K, Thenkabail P S, Teluguntla P, Oliphant A J, Xiong J, Congalton R G, Yadav K, Phalke A, Smith C. 2017. NASA making earth system data records for use in research environments (MEaSUREs) global food security-support analysis data (GFSAD) @ 30-m for South Asia, Afghanistan and Iran: cropland extent product (GFSAD30SAAFGIRCE). [2018-10-28]. https://lpdaac.usgs.gov/dataset_discovery/ measures/measures_products_table/gfsad30saafgirce_ v001 Gupta R K. 1993. Scatterograms behaviour for AVHRR vegetation images of the crop growth cycle. International Journal of Remote Sensing, 14, 75–93. Hao P, Wang L, Niu Z, Aablikim A, Huang N, Xu S, Chen F. 2014. The potential of time series merged from Landsat-5 TM and HJ-1 CCD for crop classification: A case study for Bole and Manas counties in Xinjiang, China. Remote Sensing, 6, 7610–7631. Hao P Y, Zhan Y L, Wang L, Niu Z, Shakir M. 2015. Feature selection of time series MODIS data for early crop classification using random forest: A case study in Kansas, USA. Remote Sensing, 7, 5347–5369. Huang Y, Chen Z X, Yu T, Huang X Z, Gu X F. 2018. Agricultural remote sensing big data: Management and applications. Journal of Integrative Agriculture, 17, 1915–1931. Huete A, Didan K, Miura T, Rodriguez E P, Gao X, Ferreira L G. 2002. Overview of the radiometric and biophysical performance of the MODIS vegetation indices. Remote Sensing of Environment, 83, 195–213. Immitzer M, Vuolo F, Atzberger C. 2016. First experience with Sentinel-2 data for crop and tree species classifications in central Europe. Remote Sensing, 8, 166. Jain M, Mondal P, DeFries R S, Small C, Galford G L. 2013. Mapping cropping intensity of smallholder farms: A comparison of methods using multiple sensors. Remote Sensing of Environment, 134, 210–223. Jeong S J, Ho C H, Gim H J, Brown M E. 2011. Phenology shifts
2896
HAO Peng-yu et al. Journal of Integrative Agriculture 2019, 18(12): 2883–2897
at start vs. end of growing season in temperate vegetation over the Northern Hemisphere for the period 1982–2008. Global Change Biology, 17, 2385–2399. Kovalskyy V, Roy D P. 2013. The global availability of Landsat 5 TM and Landsat 7 ETM+ land surface observations and implications for global 30 m Landsat data product generation. Remote Sensing of Environment, 130, 280–293. Li J, Roy D. 2017. A global analysis of Sentinel-2A, Sentinel2B and Landsat-8 data revisit intervals and implications for terrestrial monitoring. Remote Sensing, 9, 902. Li L, Friedl M A, Xin Q C, Gray J, Pan Y Z, Frolking S. 2014. Mapping crop cycles in China using MODIS-EVI time series. Remote Sensing, 6, 2473–2493. Li L, Zhao Y L, Fu Y C, Pan Y Z, Yu L, Xin Q C. 2017. High resolution mapping of cropping cycles by fusion of landsat and MODIS data. Remote Sensing, 9, 1232. Liu Z, Wu C, Liu Y, Wang X, Fang B, Yuan W, Ge Q. 2017. Spring green-up date derived from GIMMS3G and SPOTVGT NDVI of winter wheat cropland in the North China Plain. ISPRS Journal of Photogrammetry & Remote Sensing, 130, 81–91. Maselli F, Gilabert M A, Conese C. 1998. Integration of high and low resolution NDVI data for monitoring vegetation in Mediterranean environments. Remote Sensing of Environment, 63, 208–218. Masialeti I, Egbert S, Wardlow B D. 2010. A comparative analysis of phenological curves for major crops in Kansas. Giscience & Remote Sensing, 47, 241–259. Mueller N D, Rhines A, Butler E E, Ray D K, Siebert S, Holbrook N M, Huybers P. 2017. Global relationships between cropland intensification and summer temperature extremes over the last 50 years. Journal of Climate, 30, 7505–7528. NASA (National Aeronautics and Space Administration). 2017. Harmonized Landsat-8 and Sentinel-2. [2018-11-12]. https:// hls.gsfc.nasa.gov/ Piao S, Fang J, Zhou L, Philippe C, Zhu B. 2010. Variations in satellite-derived phenology in China’s temperate vegetation. Global Change Biology, 12, 672–685. Qiu B W, Zou F L, Chen C C, Tang Z H, Zhong J P, Yan X F. 2018. Automatic mapping afforestation, cropland reclamation and variations in cropping intensity in central east China during 2001–2016. Ecological Indicators, 91, 490–502. Rodell M, Velicogna I, Famiglietti J S. 2009. Satellite-based estimates of groundwater depletion in India. Nature, 460, 999. Rouse J W, Haas R H, Schell J A, Deering D W, Harlan J C. 1974. Monitoring the Vernal Advancements and Retrogradation of Natural Vegetation. NASA’s Goddard Space Flight Center, USA. pp. 1–137. Siebert S, Portmann F T, Döll P. 2010. Global patterns of cropland use intensity. Remote Sensing, 2, 1625. Song Q, Zhou Q B, Wu W B, Hu Q, Lu M, Liu B. 2017. Mapping regional cropping patterns by using GF-1 WFV sensor data. Journal of Integrative Agriculture, 16, 337–347. Stumpf A, Michea D, Malet J P. 2018. Improved co-registration of Sentinel-2 and Landsat-8 imagery for earth surface
motion measurements. Remote Sensing, 10, 160. Tatsumi K. 2016. Cropping intensity and seasonality parameters across Asia extracted by multi-temporal SPOT vegetation data. Journal of Agricultural Meteorology, 72, 142–150. Teluguntla P, Thenkabail P S, Oliphant A, Xiong J, Gumma M K, Congalton R G, Yadav K, Huete A. 2018. A 30-m Landsatderived cropland extent product of Australia and China using random forest machine learning algorithm on Google Earth Engine cloud computing platform. ISPRS Journal of Photogrammetry and Remote Sensing, 144, 325–340. Teluguntla P, Thenkabail P S, Xiong J, Gumma M K, Congalton R G, Oliphant A J, Sankey T, Poehnelt J, Yadav K, Massey R, Phalke A, Smith C. 2017. NASA Making Earth System Data Records for Use in Research Environments (MEaSUREs) Global Food Security-support Analysis Data (GFSAD) Cropland Extent 2015 Australia, New Zealand, China, Mongolia (GFSAD30AUNZCNMOCE). [2018-10-28]. https://e4ftl01.cr.usgs.gov/MEASURES/ GFSAD30AUNZCNMOCE.001/ Thenkabail K, Ozdogan M, Congalton M K, Wu Z, You S, Milesi C, Giri C. 2012. Assessing future risks to agricultural productivity, water resources and food security: How can remote sensing help? Photogrammetric Engineering & Remote Sensing, 78, 773–782. Tilman D, Balzer C, Hill J, Befort B L. 2011. Global food demand and the sustainable intensification of agriculture. Proceedings of the National Academy of Sciences of the United States of America, 108, 20260–20264. USGS (U.S. Geological Survey). 2018. Global food securitysupport analysis data at 30 m. [2018-10-28]. https:// geography.wr.usgs.gov/science/croplands/index.html Van der Meer F D, van der Werff H M A, van Ruitenbeek F J A. 2014. Potential of ESA’s Sentinel-2 for geological applications. Remote Sensing of Environment, 148, 124–133. Wang Q, Blackburn G A, Onojeghuo A O, Dash J, Zhou L, Zhang Y, Atkinson P M. 2017. Fusion of landsat 8 OLI and Sentinel-2 MSI data. IEEE Transactions on Geoscience and Remote Sensing, 55, 3885–3899. Wang Z H, Liao R K, Lin H, Jiang G J, He X L, Wu W Y, Zhang L L Z. 2018. Effects of drip irrigation levels on soil water, salinity and wheat growth in North China. International Journal of Agricultural & Biological Engineering, 11, 146–156. Wardlow B D, Egbert S L. 2008. Large-area crop mapping using time-series MODIS 250 m NDVI data: An assessment for the U.S. Central Great Plains. Remote Sensing of Environment, 112, 1096–1116. Wu M, Li H, Huang W, Niu Z, Wang C. 2015. Generating daily high spatial land surface temperatures by combining ASTER and MODIS land surface temperature products for environmental process monitoring. Environmental Science Processes & Impacts, 17, 1396–1404. Xiong J, Thenkabail P, Tilton J, Gumma M, Teluguntla P, Oliphant A, Congalton R, Yadav K, Gorelick N. 2017a. Nominal 30-m cropland extent map of continental Africa by
HAO Peng-yu et al. Journal of Integrative Agriculture 2019, 18(12): 2883–2897
integrating pixel-based and object-based algorithms using Sentinel-2 and Landsat-8 data on google earth engine. Remote Sensing, 9, 1065. Xiong J, Thenkabail P S, Tilton J C, Gumma M K, Teluguntla P, Congalton R G, Yadav K, Dungan J, Oliphant A J, Poehnelt J, Smith C, Massey R. 2017b. NASA Making Earth System Data Records for Use in Research Environments (MEaSUREs) Global Food Security-support Analysis Data (GFSAD) Cropland Extent 2015 Africa (GFSAD30AFCE). [2018-10-28]. https://e4ftl01.cr.usgs.gov/MEASURES/ GFSAD30AFCE.001/ Yu L, Shi Y, Gong P. 2015. Land cover mapping and data
2897
availability in critical terrestrial ecoregions: A global perspective with Landsat thematic mapper and enhanced thematic mapper plus data. Biological Conservation, 190, 34–42. Zhang G L, Dong J W, Zhou C P, Xu X L, Wang M, Ouyang H, Xiao X M. 2013. Increasing cropping intensity in response to climate warming in Tibetan Plateau, China. Field Crops Research, 142, 36–46. Zhao Y, Bai L Y, Feng J Z, Lin X S, Wang L, Xu L J, Ran Q Y, Wang K. 2016. Spatial and temporal distribution of multiple cropping indices in the north china plain using a long remote sensing data time series. Sensors, 16, 557.
Executive Editor-in-Chief ZHANG Wei-li Managing editor SUN Lu-juan