Comprehensive accuracy assessment of MODIS daily snow cover products and gap filling methods



ISPRS Journal of Photogrammetry and Remote Sensing 144 (2018) 435–452




James Coll, Xingong Li

Department of Geography and Atmospheric Science, University of Kansas, Lawrence KS 66045, United States

Keywords: MODIS, Snow, Validation, SNOTEL

ABSTRACT

The accuracy of the standard Moderate Resolution Imaging Spectroradiometer (MODIS) daily snow cover products (Collection 5) and of several of the most common and computationally frugal gap-filling methods was validated using 12 years of daily observations from over 800 Snow Telemetry (SNOTEL) stations. While several factors affect snow cover accuracy to some extent, the largest controls are associated with land cover type; accuracy fell from the maximum in croplands (94.9%) to the minimum for observations with a land cover type of water (79.5%). The second largest impact on accuracy is attributed to changes in solar zenith angle, where accuracy decreased from a maximum of ∼96% at 42° to a minimum of ∼84% at 58°. Based on the results of this work, the most accurate binary daily snow cover dataset can be produced by reclassifying the fractional snow band from the MODIS Terra sensor using a no-snow/snow threshold of 10. Gap-filling is best accomplished by applying a temporal filter to the same dataset; the window length depends on the desired completeness of the final dataset and the amount of missing values in the area of interest. Decreases in accuracy caused by this gap-filling method taper off after 7 days. Therefore, if a larger temporal window is needed to adequately fill in a time series, it is best to extend the window to the largest gap present.

1. Introduction

Snow cover plays a pivotal role in global water and energy budgets and can be a significant economic benefit or detriment depending on when and where it falls. Its presence or absence has significant bearing on the nature and magnitude of fluxes within the water cycle, and the snow cover state can vary quite drastically in both time and space. Remote sensing offers a practical means to quantify the current and past state of the cryosphere in a global context. Of the available instruments, the Moderate Resolution Imaging Spectroradiometer (MODIS) instruments on the Aqua and Terra satellites are uniquely suited to this task due in part to their duality, comparably high spatial resolution of 500 m, global coverage, and high temporal resolution of approximately one day in most snow-covered regions. To discriminate between snow and no-snow land cover states, the MODIS daily snow product uses bands 4 (0.545–0.565 µm) and 6 (1.628–1.652 µm) to calculate a Normalized Difference Snow Index (NDSI). The failure of band 6 on the Aqua MODIS instrument results in the use of band 7 (2.105–2.155 µm) in the MYD products in Collection 5. In addition to NDSI, a normalized difference vegetation index is also used, and a variety of other thresholds and masks are applied, as described in the product user’s guide (Riggs et al., 2006). For simplicity,

we can reduce the factors which affect the accuracy of these snow cover products to the illumination angle and the reflectance of the surface. These include the fractional snow cover of a pixel, the land cover state, the temporal variation of these properties at yearly and monthly scales, and sun-sensor geometry. Previous studies have examined these factors in some detail. As of July 2016, the MOD10 products have been validated to stage 2, meaning their accuracy has been validated across a representative number of spatial and temporal domains using other reference data (“EOS Val Status for Snow Cover/Sea Ice: MOD10/29,” n.d.). Efforts from the community place the traditional accuracy of MODIS snow cover products between 85 and 95% depending on the study area, season, and method of validation (Ault et al., 2006; Bitner et al., 2002; Déry et al., 2005; Gao et al., 2011; Hall and Riggs, 2007; Klein and Barnett, 2003; Maurer et al., 2003; Parajka and Blöschl, 2006). However, in order to push the collection into stage 3 validation, the uncertainties in these accuracies need to be quantified. As is the case with many remote sensing products, one of the primary drawbacks of the MODIS sensor arises from its inability to identify the land cover state through cloud cover, which can at any given time cover more than 70% of the globe (Wylie et al., 2005). Combined with other conditions that interfere with the detection of snow cover (e.g.

Corresponding author: X. Li.

https://doi.org/10.1016/j.isprsjprs.2018.08.004 Received 31 March 2018; Received in revised form 31 July 2018; Accepted 1 August 2018 0924-2716/ © 2018 International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS). Published by Elsevier B.V. All rights reserved.


depth is almost identical to a 1-meter snowpack. Additionally, at the scale of the MODIS pixel (500 m), the snow pillow is effectively a point. These sites are also located in open, flat clearings in hydrologically relevant catchments, which may not necessarily be representative of the pixel MODIS detects. Nevertheless, this is the most widely used means of validating MODIS snow cover products, and the SWE measurement is more reliable than that of the snow depth sensor (Bitner et al., 2002; Klein and Barnett, 2003; Parajka and Blöschl, 2006). The data were retrieved from the United States Department of Agriculture’s Natural Resources Conservation Service website (“NRCS National Water and Climate Center | SNOTEL Data & Products,” n.d.) for all available SNOTEL stations between 2002/10/01 and 2014/09/30. In Fig. 1, several latitude bands are defined for use later in the validation. The mean length of polar night is also included to aid users in determining how a location is affected by this persistent, gap-inducing condition.

night and detector malfunction), a snow cover time series for any given location may have many missed observations, or “gaps”, which hinder the overall utility and accuracy of a time series of the product. For example, in this study more than half (52.6% for MODIS Terra) of the potential observations were missing. To rectify these shortcomings, previous studies have employed a variety of cloud removal techniques to fill in those gaps. The most commonly used methodologies usually involve a combination of different MODIS sensors and the use of a temporal autocorrelation filter. The desire for a gap-filled dataset is so strong that the M*D10A1 Collection 6 data products provide a gap-filled dataset that relies at least partially on temporal autocorrelation (Hall et al., 2010; Riggs et al., 2016). While these gap-filled datasets are often employed in studies, their accuracy is underexplored and, when reported, relies on just a handful of ground truth observations. This makes direct comparisons between these different methods difficult, as it is unclear how the various gap-filled datasets compare to each other, to the standard MODIS snow cover dataset, or to ground truth. See Appendix B for a summary table of previous validation efforts based on Table 9 from chapter 9 of Multiscale Hydrologic Remote Sensing. Underlying this gap-filling process is the tradeoff between complexity and accuracy. Although more complex gap-filling methods have been developed, on rapid data access platforms (such as Google Earth Engine or Amazon, which have several M*D10 datasets preloaded), the cost of generating such datasets outweighs the benefits if the resources needed create computational or methodological barriers for minimal gains (Andreadis and Lettenmaier, 2006; Liang et al., 2008; Parajka et al., 2010; Thompson and Lees, 2014).
To address these deficiencies, this research undertakes a rigorous validation of the MODIS daily snow cover products and several common gap-filling methods using 12 years of daily snow observations from 819 Snow Telemetry (SNOTEL) stations distributed across the western United States and Alaska, with more than 3.2 million potential ground truth observations. With such a large and varied validation dataset, this work presents one of the largest and most comprehensive validations of MODIS Terra and Aqua daily snow cover products (M*D10A1 Collection 5) and the most commonly employed gap-filling methods.

2.3. Land cover data

To examine the impact of land cover type on snow cover accuracy, observations were given a land cover classification based on the International Geosphere‑Biosphere Programme (IGBP) classification as provided by the MODIS MCD12Q1 dataset, produced yearly for 2001–2012. Land cover types were retrieved for the SNOTEL stations and joined to the snow cover observations based on the calendar year that covered the bulk of a hydrologic year (October 1 to September 30 of the next year), i.e. snow cover observations from hydrologic year 2003 (2002/10/01–2003/09/30) were assumed to be associated with the land cover types of calendar year 2003 (2003/01/01–2003/12/31). The land cover classes, as well as their counts and percentages within the validation observations, are shown in Table 1. More information on MCD12Q1 can be found in the user’s guide (“MCD12C1 | LP DAAC :: NASA Land Data Products and Services,” n.d.) and in Friedl et al. (2010).

2.4. Sensor data

To understand the impact of sensor-sun geometry on snow cover accuracy, the MODIS daily surface reflectance datasets (M*D09GA), which contain sensor properties, were extracted and joined with the daily snow cover data (M*D10A1). More on the sensor property data can be found in the product user’s guide (Vermote et al., 2011).
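The hydrologic-year rule used in Section 2.3 to join land cover to snow observations can be sketched as below. This is a minimal illustration, not the authors' code: the function name `hydrologic_year` is hypothetical, and the sketch does not address how observations after the last MCD12Q1 year (2012) were handled, since the text does not say.

```python
from datetime import date


def hydrologic_year(day: date) -> int:
    """Return the hydrologic year containing `day`.

    A hydrologic year runs from October 1 of the previous calendar year
    to September 30, and is labelled by the calendar year in which it ends.
    Per Section 2.3, that label is also the calendar year of the land
    cover map joined to the observation.
    """
    return day.year + 1 if day.month >= 10 else day.year


# Example: the first and last days of hydrologic year 2003.
print(hydrologic_year(date(2002, 10, 1)), hydrologic_year(date(2003, 9, 30)))
```
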

2. Data

2.1. Snow cover data

The snow cover data used for this validation come from the MODIS daily snow cover products M*D10A1 (Collection 5) between 2002/10/01 and 2014/09/30. Daily MODIS values at the pixels which contain SNOTEL stations are extracted and, when applicable, reclassified based on a threshold determined experimentally (see Section 4.1). For the purposes of this study, water and ocean observations are counted as valid land cover observations, and ice is considered snow. More details about the MODIS snow cover datasets can be found in the product user’s guide to Collection 5 (Riggs et al., 2006).

3. Methods

3.1. Cloud removal techniques

To assess the accuracy of the standard daily snow cover products, their combinations, and various gap-filling methods, three levels of datasets were designated, ordered roughly by the amount of processing required to create them. Level 1 datasets comprise the products from a single MODIS sensor, either Terra or Aqua, and include the standard Snow-Cover-Daily-Tile band (hereafter referred to as MOD and MYD respectively), or a product that combines the Fractional-Snow-Cover band with the Snow-Cover-Daily-Tile band (hereafter referred to as F-MOD and F-MYD respectively). To create F-M*D products, the Fractional-Snow-Cover band is reclassified into a no-snow/snow/missing state based on an experimentally determined threshold (see Section 4.1). Preference is given to the Fractional-Snow-Cover band, and the value in the Snow-Cover-Daily-Tile band is carried through only when the Fractional-Snow-Cover band has a missing value but the Snow-Cover-Daily-Tile band has a valid value. This is a rare occurrence, happening just 855 times within the timeframe, and the accuracy results reported for F-M*D products would be virtually identical to those of a dataset created from the fractional band alone. Therefore, this step may be skipped if desired. Level 2 datasets include the combinations of level 1 datasets from

2.2. Ground snow observations

The ground truth dataset used in this research comes from the SNOTEL network, which consists of over 800 sites distributed across 12 states in the western United States including Alaska, shown in Fig. 1. These stations are generally located in high-elevation and remote watersheds and collect data which are used for a variety of purposes including flood prediction and climatic research. Among other hydrologic variables, these sites measure snow water equivalent (SWE), reported daily with 0.1 in. precision (“NRCS National Water and Climate Center | Home”, n.d.). The use of SNOTEL SWE as a surrogate for snow depth is less than ideal. This is because MODIS only measures the presence or absence of snow; radiometrically a snowpack of 10 meters


Fig. 1. Distribution of the SNOTEL stations used in this study. Point color and size show the elevation and relative number of observations for that station respectively. Latitude bands are also shown and used later in this work. Code to recreate this image in Google Earth Engine can be found at https://code.earthengine.google.com/c8141da30952d22426e988b1cababada.

Level 3 datasets are created by taking the level 1 and 2 datasets and applying a temporal filter to further fill data gaps. This method has been implemented in several studies (Ault et al., 2006; Dietz et al., 2012; Gafurov and Bárdossy, 2009; Gao et al., 2011, 2010; Parajka and Blöschl, 2008; Tekeli et al., 2005; Wang and Xie, 2009; Xie et al., 2009), though the validation of such datasets is still underexplored. This filter works under the assumption of temporal autocorrelation, i.e., the state of ground cover is most likely related to its most recent valid observation (MRVO) in time. Based on the length of the temporal window, missing values in a time series are filled in with the MRVO in the filter window, looking first backwards and then forwards in time. In this work, the half window sizes tested range from 1 to 10 days, and a variable-day filter is also tested, which fills every missing value within the time series with the MRVO regardless of the length of the window. When a temporal filter is applied, the resulting dataset name is appended with a number indicating the half window length of the temporal filter applied. For example, MO/YD-SNOW-4 is a combination of MOD10A1 and MYD10A1 with preference given to snow observations and with a 4-day temporal filter applied. For a variable-day window filter, the dataset name is appended with “-V” instead. As an example, the F-MOD-V dataset is derived from the fractional band of MOD10A1 and has every missing observation filled in regardless of the number of days needed to find a valid observation.
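The temporal filter described above can be sketched as follows. This is one possible reading rather than the authors' implementation: it assumes that missing values are encoded as `None`, that the whole backward half window is searched before the forward half window, and that only original (unfilled) observations are used as fill sources. The function name `mrvo_fill` is hypothetical.

```python
from typing import List, Optional


def mrvo_fill(series: List[Optional[int]],
              half_window: Optional[int]) -> List[Optional[int]]:
    """Fill missing days with the most recent valid observation (MRVO).

    series      -- daily states: 1 (snow), 0 (no snow), None (missing)
    half_window -- number of days searched on each side of a gap;
                   None emulates the variable-day ("-V") filter
    """
    n = len(series)
    reach = n if half_window is None else half_window
    filled = list(series)
    for i, value in enumerate(series):
        if value is not None:
            continue
        # Assumed search order: nearest-first backwards, then nearest-first forwards.
        backward = range(i - 1, max(i - reach, 0) - 1, -1)
        forward = range(i + 1, min(i + reach, n - 1) + 1)
        for j in list(backward) + list(forward):
            if series[j] is not None:  # only original observations are used
                filled[i] = series[j]
                break
    return filled


daily = [1, None, None, None, 0, None]
print(mrvo_fill(daily, 1))     # 1-day half window leaves the middle gap open
print(mrvo_fill(daily, None))  # variable-day filter fills every gap
```

Searching the original series rather than the partially filled one keeps a filled value from being propagated further than the chosen half window.
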

Table 1. Number of validation observations (as a count and as a percentage) by land cover type.

IGBP Classification                    Count        Percentage
Water                                  10,590       0.455
Evergreen Needleleaf forest            1,032,310    44.326
Deciduous Needleleaf forest            17,897       0.768
Deciduous Broadleaf forest             1096         0.047
Mixed forest                           75,715       3.251
Closed shrublands                      9862         0.423
Open shrublands                        53,398       2.293
Woody savannas                         141,276      6.066
Savannas                               31,573       1.356
Grasslands                             823,638      35.366
Permanent wetlands                     7669         0.329
Croplands                              33,624       1.444
Snow and ice                           1095         0.047
Urban and built-up                     3287         0.141
Cropland/Natural vegetation mosaic     85,881       3.688

different satellites to fill in cloud gaps, as the satellites have different local overpass times and clouds may have moved in the interim. Additionally, two combination rules used commonly in the literature are tested. In the first method, dataset preference is given to valid observations from Terra, and the resulting datasets have “-MOD” appended to their names. The second combination gives preference to snow observations, regardless of the origin satellite, and the resulting dataset names are appended with “-SNOW”. For example, the F-MO/YD-SNOW dataset is created by first taking the fractional bands from each of the satellites and combining them using the snow preference rule.
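The level 1 reclassification and the two level 2 combination rules can be sketched as below. This is an illustrative sketch only: the state encoding (1, 0, `None`) and all function names are hypothetical and do not correspond to the integer codes stored in the M*D10A1 bands, and the caller is assumed to have already mapped cloud, night, and other non-observation flags to `None`.

```python
from typing import Optional

SNOW, NO_SNOW = 1, 0
State = Optional[int]  # None marks a missing observation (hypothetical encoding)


def reclassify_fraction(fraction: Optional[float], threshold: float = 10) -> State:
    """Level 1: fractional snow cover < threshold is no snow, >= threshold is snow."""
    if fraction is None:
        return None
    return SNOW if fraction >= threshold else NO_SNOW


def fractional_product(fraction: Optional[float], tile_state: State,
                       threshold: float = 10) -> State:
    """F-M*D rule: prefer the fractional band, fall back to the daily tile band."""
    state = reclassify_fraction(fraction, threshold)
    return state if state is not None else tile_state


def combine_prefer_terra(terra: State, aqua: State) -> State:
    """Level 2 "-MOD" rule: use Terra when valid, otherwise Aqua."""
    return terra if terra is not None else aqua


def combine_prefer_snow(terra: State, aqua: State) -> State:
    """Level 2 "-SNOW" rule: snow from either satellite wins."""
    if SNOW in (terra, aqua):
        return SNOW
    return terra if terra is not None else aqua


# Terra sees bare ground in the morning, Aqua sees snow in the afternoon.
print(combine_prefer_terra(NO_SNOW, SNOW), combine_prefer_snow(NO_SNOW, SNOW))
```

The two rules differ only when both satellites return valid but conflicting states, which is the case shown in the usage line.
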

3.2. Accuracy metrics

After joining the MODIS snow cover data to the SNOTEL SWE measurements, a modified confusion matrix is calculated, as defined in Table 2, following the same process used in previous validation studies (Gao et al., 2011, 2010; Hall and Riggs, 2007). As is the nature of this


Table 2. The confusion matrix used for this accuracy assessment.

                   MODIS: Snow    MODIS: No-snow    MODIS: Missing
SNOTEL: Snow       A              B                 E
SNOTEL: No-snow    C              D                 F

venture, no one metric exists that can fully describe a confusion matrix. Therefore, three primary metrics, a clear sky accuracy, an all sky accuracy, and the Matthews Correlation Coefficient, are chosen to compare the differences in accuracy among the datasets.

Clear Sky Accuracy (AC), known simply as accuracy when using a traditional confusion matrix, is the sum of all correct observations divided by the sum of all valid observations. Although often reported, this metric is neither robust nor resistant, and can occasionally imply a better classification performance than is reasonable, particularly when one class greatly outnumbers the other. It is defined as:

$$AC = \frac{A + D}{A + B + C + D} \tag{1}$$

All Sky Accuracy (AA) is a measure of accuracy which considers the number of days missed by a dataset and is defined as:

$$AA = \frac{A + D}{A + B + C + D + E + F} \tag{2}$$

As a metric, its value is most appropriately compared among datasets of the same level. It can be misleading when comparing products across different levels, because the products in a higher level will likely have a higher AA. This is because missing observations, those in the E and F categories of the modified confusion matrix, are penalized as being “wrong” when using this metric. Therefore, if one wished to improve the AA metric, this could be accomplished quite simply by giving those missing observations a random assignment, some of which would result in a correct classification.

The Matthews Correlation Coefficient (MCC) is a metric most often employed in machine learning as a balanced, robust, and resistant measure of the quality of a binary classification, even when one of the classes is much larger than the other. It returns a value between −1 and 1, with 1 meaning a perfect prediction, 0 interpreted as random, and −1 meaning total disagreement. It can be interpreted in the same manner as Pearson's correlation coefficient and is defined as:

$$MCC = \frac{A \cdot D - B \cdot C}{\sqrt{(A + B)(A + C)(D + B)(D + C)}} \tag{3}$$

Because this metric considers prevalence and the magnitude of errors of commission and omission within a single value, it was the deciding metric ultimately used to rank the tested datasets. This metric is invalid if one of the classes has no observations, as that sets the denominator to 0. More information on this metric can be found in Powers (2011). It is also worth noting that if the task at hand does not require the most accurate dataset overall, for instance if errors of commission are extremely undesirable, a different metric is more appropriate, and these findings may not be applicable. A short example of the interplay of the full confusion matrix, the various metrics tested, and the sensitivity/specificity is shown in Table 3. As seen, MCC and AC often contradict each other, particularly at the extremes of class prevalence (e.g. August): while AC is ∼0.993, MCC is ∼0.017. AC is unable to capture this extreme class imbalance, as is reflected in the sensitivity metric, A/(A + B).

Table 4. The accuracies of level 1 datasets with respect to land cover. The number of observations which comprise each type are reported in brackets.

AC
Land cover type                            F-MOD     F-MYD     MOD       MYD
Water (3749)                               0.7887    0.7883    0.7903    0.7664
Evergreen needleleaf forest (473749)       0.9069    0.8853    0.8990    0.8689
Deciduous needleleaf forest (7841)         0.9257    0.9017    0.9213    0.8981
Mixed forest (31366)                       0.9099    0.8753    0.9070    0.8731
Closed shrublands (6481)                   0.8852    0.8693    0.8776    0.8682
Open shrublands (24616)                    0.9402    0.9242    0.9368    0.9239
Woody savannas (66169)                     0.9238    0.9007    0.9200    0.8973
Savannas (14085)                           0.9380    0.9228    0.9344    0.9208
Grasslands (398370)                        0.9366    0.9197    0.9317    0.9180
Permanent wetlands (3861)                  0.9362    0.9089    0.9338    0.9078
Croplands (17226)                          0.9404    0.9359    0.9331    0.9352
Crop/Natural vegetation mosaic (43270)     0.9379    0.9253    0.9328    0.9246

MCC
Land cover type                            F-MOD     F-MYD     MOD       MYD
Water (3749)                               0.5749    0.5890    0.5867    0.5264
Evergreen needleleaf forest (473749)       0.8091    0.7547    0.7953    0.7197
Deciduous needleleaf forest (7841)         0.8560    0.8050    0.8494    0.7990
Mixed forest (31366)                       0.8005    0.7166    0.7956    0.7074
Closed shrublands (6481)                   0.6620    0.6124    0.6397    0.6094
Open shrublands (24616)                    0.8788    0.8414    0.8734    0.8414
Woody savannas (66169)                     0.8447    0.7873    0.8388    0.7800
Savannas (14085)                           0.8742    0.8357    0.8684    0.8318
Grasslands (398370)                        0.8731    0.8341    0.8647    0.8312
Permanent wetlands (3861)                  0.8677    0.8094    0.8639    0.8063
Croplands (17226)                          0.8648    0.8478    0.8491    0.8465
Crop/Natural vegetation mosaic (43270)     0.8688    0.8307    0.8590    0.8291

4. Results

4.1. Fractional band and SWE threshold

In creating a binary dataset from a fractional band (i.e., the F- datasets), the fractional band must first be reclassified as either no snow

Table 3. Comparison of AC, AA, and MCC metrics using monthly data from the F-MOD product.

Month  A       B       C     D        E        F       AC      AA      MCC     A/(A+B)  D/(D+C)
1      99,134  5473    397   1355     163,031  1839    0.9448  0.3705  0.3745  0.9477   0.7734
2      80,492  3779    330   1655     159,109  1827    0.9524  0.3323  0.4869  0.9552   0.8338
3      83,064  6284    502   4407     172,667  4366    0.9280  0.3224  0.5797  0.9297   0.8977
4      63,886  13,287  1004  16,302   152,530  15,548  0.8487  0.3054  0.6420  0.8278   0.9420
5      30,313  21,423  1937  59,356   98,668   59,702  0.7933  0.3304  0.6116  0.5859   0.9684
6      5538    11,733  1441  130,360  25,731   88,136  0.9116  0.5168  0.4693  0.3207   0.9891
7      171     2268    790   179,941  1586     87,754  0.9833  0.6609  0.1043  0.0701   0.9956
8      8       140     1151  174,369  172      97,594  0.9927  0.6377  0.0170  0.0541   0.9934
9      1237    1033    2136  166,161  3415     91,506  0.9814  0.6305  0.4381  0.5449   0.9873
10     18,951  11,432  5737  108,457  44,574   81,879  0.8812  0.4701  0.6209  0.6237   0.9498
11     64,085  16,762  3620  25,111   125,803  27,115  0.8140  0.3398  0.6035  0.7927   0.8740
12     79,714  6627    985   3292     175,798  4859    0.9160  0.3060  0.4707  0.9232   0.7697
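The three metrics can be recomputed from the confusion-matrix counts in Table 3. The sketch below is a minimal illustration with hypothetical function names; it uses the standard form of the Matthews Correlation Coefficient, with a square root in the denominator, and returns NaN when a class is empty, which is the invalid case noted in Section 3.2.

```python
import math


def clear_sky_accuracy(a: int, b: int, c: int, d: int) -> float:
    """AC: correct observations over all valid (non-missing) observations."""
    return (a + d) / (a + b + c + d)


def all_sky_accuracy(a: int, b: int, c: int, d: int, e: int, f: int) -> float:
    """AA: correct observations over all observations, missing ones included."""
    return (a + d) / (a + b + c + d + e + f)


def matthews_cc(a: int, b: int, c: int, d: int) -> float:
    """MCC; undefined (NaN) when any marginal total is zero."""
    denominator = math.sqrt((a + b) * (a + c) * (d + b) * (d + c))
    if denominator == 0:
        return float("nan")
    return (a * d - b * c) / denominator


# Counts for January and August of the F-MOD product, taken from Table 3.
january = dict(a=99134, b=5473, c=397, d=1355, e=163031, f=1839)
august = dict(a=8, b=140, c=1151, d=174369, e=172, f=97594)

for name, m in (("January", january), ("August", august)):
    ac = clear_sky_accuracy(m["a"], m["b"], m["c"], m["d"])
    aa = all_sky_accuracy(**m)
    mcc = matthews_cc(m["a"], m["b"], m["c"], m["d"])
    print(f"{name}: AC={ac:.4f} AA={aa:.4f} MCC={mcc:.4f}")
```

August illustrates the contrast drawn in the text: nearly every valid observation is snow-free, so AC is high while MCC is close to zero.
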


Fig. 2. Accuracies of level 1 datasets (MOD, F-MOD, MYD, and F-MYD) by year: Top: AC; Middle: AA; Bottom: MCC.


Fig. 3. Accuracies of level 1 datasets (MOD, F-MOD, MYD, and F-MYD) by month: Top: AC; Middle: AA; Bottom: MCC.


Fig. 4. Accuracies of the F-MOD dataset by latitude bands (see Fig. 1) and by month. Left: AC; Right: MCC. Additional figures for other level 1 datasets are available on the HydroShare platform through the link provided in Section 6.

or snow, and to do so a threshold must be determined. Because there is also some uncertainty as to what SWE value should be used to reclassify the SNOTEL observations as snow or no snow, the accuracy of a range of SWE thresholds was also examined. Threshold in this case means classifying pixels to a value of 0 (no snow) for all fractional snow values < t, and a value of 1 (snow) for all values ≥ t. A full accuracy table can be found in Appendix A. Although a strict interpretation of these results indicates that reclassifying SWE to no-snow/snow with a threshold of 10–12 mm produces the highest MCC scores, this threshold does not represent a particularly useful ground state. Furthermore, the difference in MCC scores between 10 mm and 2 mm (which is a realistic/useful ground state) is only 0.0052. Therefore, for the remainder of this study SNOTEL stations were classified using 2 mm as the no-snow/snow threshold.

4.2. Geographic factors (station distribution and land cover type)

To examine microclimate effects and identify outlying stations, the AC, AA, and MCC accuracies were calculated for all level 1 datasets at stations which had more than 350 observations. Generally, AC was found to be > 0.70 ± 0.05 for all stations. MCC also shows a similar pattern, with less than 20% of the stations having an MCC value less than 0.75 for MOD, and only 15% of the stations having an MCC value below 0.75 for F-MOD. Outliers were found to be stations that either had extreme imbalances in class distribution, with fewer than 50 occurrences of snow, or had underperforming land cover types (identified later in this validation). Accuracies across different land cover types with more than 5000 observations are shown in Table 4. For all land cover classes other than water, the F-MOD dataset outperformed the other level 1 datasets. Additionally, because the MCD12Q1 land cover data is classified based partially on the amount of canopy cover, we can say that snow cover accuracies are generally lower for land cover types with large amounts of canopy, which include closed shrublands, evergreen and deciduous needleleaf forests, and mixed forest. Accuracies are highest for croplands and open shrublands.

4.3. Annual and monthly variations

Because snow cover and sensor quality can vary through time as well as in space, yearly and monthly accuracies for all level 1 datasets are shown in Figs. 2 and 3 respectively. Despite variations in accuracy


Fig. 5. Accuracies for each land cover type by month for the F-MOD dataset: Top: AC; Bottom: MCC. Additional figures for other level 1 datasets are available on the HydroShare platform through the link provided in Section 6.

Fig. 6. Cumulative accuracies of level 1 datasets (MOD, F-MOD, MYD, and F-MYD) across the range of sensor zenith angles: Top: AC; Bottom: MCC.

Fig. 7. Cumulative accuracies for the F-MOD dataset with respect to sensor zenith under different land cover types: Left: AC; Right: MCC. Additional figures for other level 1 datasets are available on the HydroShare platform through the link provided in Section 6.

by year, the relative accuracies between datasets were largely consistent, and only in a few cases did one product preferentially improve over another for a given year. F-MOD has the highest AC and MCC across all years and in all but 3 months. Accuracies by month show clear seasonal differences, particularly when using the MCC metric. While AC is > 0.80 for nearly every month, the MCC metric falls to nearly 0 during summer months. July and August in particular show markedly less skill in classifying the snow state. However, the objective of looking at monthly accuracy is to capture the seasonal effects on snow classification. By using all stations in the same month to measure accuracy, we mask out any potential seasonal signal (i.e. May in Colorado is very different from May in Alaska). To capture this variation, the dataset is broken up into 5 different bands of latitude, from ∼32 to 35°, 35 to 40°, 40 to 45°, 45 to 50°, and 58 to 67°, as shown in Fig. 1. These coarser latitude bands ensure that no single station dominates the validation dataset, and we are able to capture any potential seasonal signal in the banded accuracies presented in Fig. 4. While the general shape observed is reminiscent of the accuracies by month presented in Fig. 3, the period of low accuracy in the summer months is shortened or lengthened in the extreme latitude bands. The accuracy of the 58–67° band is subject to slightly greater uncertainty due to sample size restrictions, but exhibits the same general pattern as the other bands. To further explore the effects of land cover on snow retrieval, the accuracies of land cover by month for the F-MOD dataset are presented in Fig. 5. As above, some land covers were omitted for some of the summer months due to a lack of observations, which invalidates the metrics used. It is worth noting that water is a land cover type because a few of the SNOTEL stations are close enough to water bodies to be classified as water within the MCD12Q1 dataset.

4.4. Sun-sensor factors (sensor and solar zenith angles)

Because the reflectance is largely dictated by the illumination angle, the cumulative accuracies for sensor zenith using an increment of 0.5° for level 1 datasets are shown in Fig. 6. Accuracy metrics tend to decrease at increasing sensor zenith angles. Furthermore, datasets that use the fractional band (F- products) have higher accuracies than their standard counterparts. Additionally, MOD and MYD show very different behaviors, particularly at low sensor zenith angles. The effects of sensor zenith on different land cover types using a sensor zenith angle increment of 5 degrees, again using F-MOD as a template, are shown in Fig. 7. While sensor zenith had some impact on accuracy, difference in land cover type has a far more pronounced

Fig. 8. Cumulative accuracies for the F-MOD dataset with respect to sensor zenith and month: Left: AC; Right: MCC. Additional figures for other level 1 datasets are available on the HydroShare platform through the link provided in Section 6.

effect. For example, the difference in accuracy between evergreen needleleaf forests and grasslands is much larger than any difference across the sensor zenith range in either land cover type. Looking at the effects of sensor zenith for different months using the F-MOD dataset, presented in Fig. 8, we can visually confirm that the month of the observation has much more control over the accuracy than the sensor zenith angle, though sensor zenith angle exerts more influence on the resulting accuracy for some months than was the case for land cover type; e.g., the MCC of MOD in July has a range of 0.18, which is as large as the range across land cover classes in Fig. 7. The effects of solar zenith on accuracy are binned in increments of 0.5° for those ranges with more than 5000 observations and are presented in Fig. 9. Finally, to examine the interplay between sensor and solar zenith, accuracies of the F-MOD dataset are presented as a surface in Fig. 10. Accuracies are presented only for those combinations which had more than 350 observations and are otherwise set to zero. Accuracies are calculated in 1° increments for sensor zenith, and in 2° increments for solar zenith. As evident from the figure, the effects of these two variables on accuracy are additive, a behavior consistent across the level 1 datasets.
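The binning behind the accuracy surface in Fig. 10 can be sketched as follows. This is a hedged illustration, not the authors' code: the function name, the tuple layout of the input, and the use of the lower bin edge as the key are all assumptions, and the sketch computes AC only (an MCC surface would need the four confusion counts per bin instead of a correct/total tally).

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def accuracy_surface(observations: Iterable[Tuple[float, float, bool]],
                     min_count: int = 350,
                     sensor_step: float = 1.0,
                     solar_step: float = 2.0) -> Dict[Tuple[float, float], float]:
    """Bin observations by (sensor zenith, solar zenith) and return AC per bin.

    observations -- (sensor_zenith_deg, solar_zenith_deg, modis_matches_ground)
    Bins holding min_count observations or fewer are set to zero, as in the text.
    """
    tallies = defaultdict(lambda: [0, 0])  # bin -> [correct, total]
    for sensor, solar, correct in observations:
        key = (math.floor(sensor / sensor_step) * sensor_step,
               math.floor(solar / solar_step) * solar_step)
        tallies[key][0] += int(correct)
        tallies[key][1] += 1
    return {key: (hits / total if total > min_count else 0.0)
            for key, (hits, total) in tallies.items()}


sample = [(10.2, 40.5, True), (10.9, 41.9, True), (10.5, 40.0, False),
          (10.0, 41.0, True), (25.3, 60.1, True)]
print(accuracy_surface(sample, min_count=3))
```
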

4.5. Gap filling by combining different sensor data

To assess the suitability of satellite combination as a gap-filling method, the accuracies of all level 1 and 2 datasets are compared. As shown in Fig. 11, by combining satellites, AA was improved over all level 1 datasets. However, both AC and MCC were higher for the level 1 datasets than for the level 2 datasets. As a rule, using MOD as the preferential observation produces higher AC and MCC values than using snow as the preferential observation, and incorporating the fractional band provided more accurate datasets across all accuracy metrics.

Fig. 9. Accuracies for level 1 datasets (MOD, F-MOD, MYD, and F-MYD) by solar zenith angle in hundredths of a degree. The total number of validation observation within each solar zenith interval is plotted on the right axis: Top: AC; Bottom: MCC.

4.6. Gap filling using the most recent valid observation (MRVO)

The second means of gap-filling involves the use of temporal autocorrelation, where missing observations are filled in with the MRVO. As seen in Fig. 12, for all window sizes F-MOD is more accurate across all metrics apart from AA, where it underperforms relative to the level 2 datasets. The percentage of dataset completeness is presented in Table 5, which is useful when determining which dataset meets the end user’s needs, as demonstrated in the discussion.

5. Discussion

Overall accuracies were found to be 90%, similar to values found in past studies (see Appendix B). The discussion of these results is presented in the same order as the results to preserve continuity.

5.1. Fractional snow cover band and SWE thresholds

Based on this analysis and the table in Appendix A, a fractional threshold of 10 creates datasets which have the highest MCC for both satellites; AC also reaches a maximum for MOD, and nearly so for MYD, using that threshold value. The results also indicated that 10 mm of SWE creates a slightly more accurate threshold to use when reclassifying stations, but a value of 2 mm was used for the remainder of the analysis as that represents a more useful ground truth state. This sensitivity at low snow levels has been noted by several others, but was seemingly more prevalent in collection 4 (Hall and Riggs, 2007; Tong et al., 2009; Wang et al., 2008). This discrepancy is best explained by snow cover likely being more spatially contiguous at higher SWE levels. These results reinforce and extend previous works which examine the relationship between the fractional band, snow depth, and snow detection (Déry et al., 2005; Parajka and Blöschl, 2006; Pu et al., 2007; Salomonson and Appel, 2006).
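The reclassification and temporal filtering discussed above can be sketched together in a few lines. The sketch below is illustrative only and is not the Earth Engine code used in this work: the function names, the use of `None` for missing days, and the choice of `>=` at both threshold boundaries are assumptions.

```python
from typing import List, Optional


def reclassify_fractional(fsc: Optional[float], threshold: float = 10.0) -> Optional[int]:
    """Map a fractional snow value (percent) to snow (1) or no snow (0); None stays missing."""
    if fsc is None:
        return None
    return 1 if fsc >= threshold else 0  # boundary handling is an assumption


def station_has_snow(swe_mm: float, threshold_mm: float = 2.0) -> int:
    """Ground truth state from station SWE, using the 2 mm threshold discussed in the text."""
    return 1 if swe_mm >= threshold_mm else 0


def fill_mrvo(series: List[Optional[int]], window: Optional[int]) -> List[Optional[int]]:
    """Fill gaps with the most recent valid observation that is at most `window` days old.

    window=0 leaves the series unchanged; window=None means a variable (unbounded) window.
    """
    filled: List[Optional[int]] = []
    last_value: Optional[int] = None
    last_day: Optional[int] = None
    for day, value in enumerate(series):
        if value is not None:
            last_value, last_day = value, day
            filled.append(value)
        elif last_day is not None and (window is None or day - last_day <= window):
            filled.append(last_value)
        else:
            filled.append(None)
    return filled


def completeness(series: List[Optional[int]]) -> float:
    """Fraction of days with a valid (non-missing) value."""
    return sum(v is not None for v in series) / len(series)


# Toy week of fractional snow values with cloud gaps (None).
fractional = [40.0, None, None, None, 5.0, None, 80.0]
binary = [reclassify_fractional(v) for v in fractional]
filled = fill_mrvo(binary, window=2)

print(binary)  # [1, None, None, None, 0, None, 1]
print(filled)  # [1, 1, 1, None, 0, 0, 1]
print(completeness(binary), completeness(filled))
```

With a two-day window, the third consecutive missing day stays unfilled, so completeness rises without reaching 1. This is the trade-off between window length and completeness summarized in Table 5.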


Fig. 10. The accuracy surfaces for the F-MOD dataset by sensor and solar zenith angle: Top: AC; Bottom: MCC. Additional figures for other level 1 datasets are available on the HydroShare platform through the link provided in Section 6.
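One way to picture how a surface like the one in Fig. 10 could be assembled is to bin validation observations by both angles and mask sparsely populated cells. The sketch below uses simple agreement (the fraction of correct classifications) per cell. The record layout and the function name `accuracy_surface` are assumptions for illustration, while the bin widths and the minimum count follow the description of the figure in the text.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Observation = Tuple[float, float, bool]  # (sensor zenith, solar zenith, classified correctly)


def accuracy_surface(
    observations: Iterable[Observation],
    sensor_step: float = 1.0,
    solar_step: float = 2.0,
    min_obs: int = 350,
) -> Dict[Tuple[int, int], float]:
    """Return accuracy per (sensor bin, solar bin); cells without more than min_obs records are 0."""
    counts = defaultdict(lambda: [0, 0])  # bin -> [number correct, number total]
    for sensor_zenith, solar_zenith, is_correct in observations:
        key = (int(sensor_zenith // sensor_step), int(solar_zenith // solar_step))
        counts[key][0] += int(is_correct)
        counts[key][1] += 1
    return {
        key: (correct / total if total > min_obs else 0.0)
        for key, (correct, total) in counts.items()
    }


# Toy data: one well-populated cell (400 records, 360 correct) and one sparse cell.
obs = [(10.5, 45.0, i < 360) for i in range(400)] + [(30.2, 51.0, True)] * 100
surface = accuracy_surface(obs)

print(surface[(10, 22)])  # well-populated cell keeps its accuracy
print(surface[(30, 25)])  # sparse cell is set to zero
```

The same binning could carry full confusion-matrix counts per cell if a metric such as MCC were wanted instead of simple agreement.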


Fig. 11. Accuracies of level 1 (MOD and F-MOD) and level 2 (MO/YD-MOD, MO/YD-SNOW, F-MO/YD-MOD, and F-MO/YD-SNOW) datasets.
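For readers comparing the metrics in Fig. 11, the sketch below computes them from confusion-matrix counts. It assumes that AC is overall accuracy over clear-sky (valid) observations, that AA additionally counts missing observations in the denominator, and that MCC is the standard Matthews correlation coefficient (Powers, 2011). These readings, the function names, and the toy counts are assumptions for illustration and are not values from this study.

```python
import math


def clear_sky_accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Assumed AC: correct classifications over all valid (clear-sky) observations."""
    return (tp + tn) / (tp + tn + fp + fn)


def all_sky_accuracy(tp: int, tn: int, fp: int, fn: int, n_missing: int) -> float:
    """Assumed AA: correct classifications over all observations, missing ones included."""
    return (tp + tn) / (tp + tn + fp + fn + n_missing)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Toy counts for a month with very few snow days: 10 snow cases, 990 snow-free cases.
tp, fn, fp, tn = 2, 8, 5, 985

print(clear_sky_accuracy(tp, tn, fp, fn))      # stays close to 1 because land dominates
print(mcc(tp, tn, fp, fn))                     # much lower, since most snow days are missed
print(all_sky_accuracy(tp, tn, fp, fn, 1000))  # drops further once gaps are counted
```

The toy counts show the kind of class imbalance under which AC and MCC diverge: missing a handful of rare snow observations barely moves AC but sharply lowers MCC. Filling gaps shrinks the missing count, which is why any gap-filling tends to raise AA under this reading.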

5.2. Impacts of geographic factors

As noted in the results, only 15% of the stations have an MCC value below 0.75 when using F-MOD. Outliers in accuracies at individual stations are primarily attributed to underperforming land cover types or to stations with very few valid observations for a land cover class, which negatively impacts the utility of the MCC metric. Accuracies across different land cover types varied by as much as 10% excluding the land cover type of water, which has the lowest accuracy. One possible cause for this behavior is the lag time caused by the thermal inertia of water and the timing between snow fall and ice cover during accumulation and ablation, a theory further reinforced when examining land cover accuracies by month. It is also worth noting that evergreen needleleaf forest, which comprised 44% of the observed pixels in this validation, has the worst accuracy among the land cover types excluding water. Therefore, overall accuracies in other portions of the world where evergreen forests are not the dominant land cover type are expected to be higher. This finding further justifies efforts in the literature to improve snow cover retrievals in forests (Klein et al., 1998; Maurer et al., 2003; Wang et al., 2016).

5.3. Annual and monthly variations

Examining accuracies at the yearly scale shows that MCC accuracy varies less than 4%, and differences between datasets were generally consistent from year to year. Monthly accuracies followed a similar pattern where differences between datasets were consistent, but the range of variation was quite extreme depending on the accuracy metric and month. While AC indicates that the MODIS system has a greater than 90% accuracy throughout the year, MCC is able to catch the fact that MODIS is unable to accurately capture snow during certain periods, oscillations also found by Brubaker et al. (2005), Simic et al. (2004) and Wang et al. (2015). This dichotomy is a result of the metrics chosen; because so few potential snow observations exist in summer, the sensitivity falls rapidly if even a handful of snow observations are missed. Additionally, the accuracy of the accumulation season was slightly higher than that of the ablation season, a behavior also noted by Andreadis and Lettenmaier (2006). Variations among the datasets across the years were primarily caused by increases in misclassified snow, not land. Therefore, despite referenced trends showing a decreasing albedo within collection 5, no obvious degradation in sensor quality is noted using this process, further reinforcing findings showing calibrations for Terra have remained valid (Casey et al., 2017). Additionally, based on the striking similarities between the patterns of accuracy by month and those by land cover and by latitude band, we can infer that changes in detection accuracy are due to the decline in sensitivity of snow cover detection in summer, and not immediately tied to seasonal changes in land cover reflectance.

5.4. Impacts of sensor and solar zenith angles

Large sensor zenith angles, while commonly cited as a source of error in past studies, did not appear to greatly affect the accuracy of those observations despite the increase in pixel size (Dozier et al., 2008; Li et al., 2016; Painter et al., 2012). At those high sensor zenith angles AC seems to suffer very little for MODIS Terra, as it has a range of just 0.005. MODIS Aqua, on the other hand, was affected to a greater degree, with a range of 0.011. This indicates that snow cover tends to vary at scales larger than that of the nadir pixel size of 500 m, and that issues which plague MODIS Aqua (collection 5) are exacerbated at those extremes. Furthermore, when examining the accuracies of sensor zenith by month, the slight decreases in accuracy caused by large sensor zenith angles are smaller than the changes in accuracy by month, as was the case when comparing the effects of land cover and sensor zenith. Based on these findings, it appears inadvisable to filter observations by sensor zenith angle, as increases in accuracy are minor, and the loss of potential observations greatly increases the number of gaps in a time series data product. Accuracies by solar zenith (Fig. 9) exhibit a signal that corresponds to that of the winter-summer seasonal changes, an obvious cause of variation in snow cover. Because sensor/solar illumination geometry is non-unique (i.e., it repeats at least every year), the accuracies between these two components are examined (Fig. 10). Because these accuracies appear additive (the overall accuracies look like the components of their underlying drivers), it implies that this seasonal accuracy cycle is dictated by the onset of snow. The importance of the illumination angle (the y axis in Fig. 10) for overall accuracies is also visually apparent, as there is more variation across the solar zenith axis than across the sensor zenith axis.

5.5. Gap filling methods

It was found that the inclusion of the MYD sensor in any form reduced both AC and MCC, although AA increased. Additionally, as expected, AC fell and AA rose as the temporal window length increased. This makes sense as the metrics dictate that any gap-filling should result in a higher AA and, because these observations are inferred, they are inherently less accurate. The F-MOD dataset has the highest accuracies across the entire range of window lengths. This indicates that the use of F-MOD with a window size from 1 to 6 days or a variable day window creates the most accurate datasets. These shorter temporal filters were also observed by Xie et al. (2009). Longer windows tend to have very little impact on the accuracy of the product. Additionally, the transition from a 10-day window to the variable day window is notable because AC slightly increases. This is due in large part to areas affected by polar night which are, broadly speaking, the only areas which consistently have more than 20 consecutive missing days, a behavior also noted in Dietz et al. (2012) and Parajka et al. (2010). Therefore, it is recommended that if significantly large gaps are present in the study area, the filter be extended out to the length of those largest gaps.

Fig. 12. The effects of temporal filter window size on accuracies for datasets MOD, F-MOD, MO/YD-MOD, MO/YD-SNOW, F-MO/YD-MOD, and F-MO/YD-SNOW: Top: AC; Middle: AA; Bottom: MCC.

Table 5 Percentage of days with a valid observation within filter windows of different lengths (expressed as a fraction; V denotes the variable-length window). Although one may expect the MOD and F-MOD datasets to match, on a few occasions the fractional band was present, but the binary classification was missing.

Window Length | MOD | F-MOD | MO/YD-MOD | MO/YD-SNOW | F-MO/YD-MOD | F-MO/YD-SNOW
0 | 0.474 | 0.474 | 0.544 | 0.544 | 0.544 | 0.544
1 | 0.741 | 0.746 | 0.802 | 0.802 | 0.803 | 0.803
2 | 0.852 | 0.859 | 0.900 | 0.900 | 0.900 | 0.900
3 | 0.908 | 0.915 | 0.944 | 0.944 | 0.944 | 0.944
4 | 0.938 | 0.946 | 0.967 | 0.967 | 0.967 | 0.967
5 | 0.956 | 0.964 | 0.979 | 0.979 | 0.979 | 0.980
6 | 0.967 | 0.975 | 0.987 | 0.987 | 0.987 | 0.987
7 | 0.974 | 0.982 | 0.991 | 0.991 | 0.991 | 0.991
8 | 0.979 | 0.987 | 0.994 | 0.994 | 0.994 | 0.994
9 | 0.982 | 0.990 | 0.995 | 0.995 | 0.995 | 0.995
10 | 0.984 | 0.992 | 0.996 | 0.996 | 0.996 | 0.996
V | 0.990 | 0.998 | 1.000 | 1.000 | 1.000 | 1.000

6. Conclusions

From the perspective of a polar orbiting satellite, there are several factors which have a large impact on the accuracy of snow cover classification. These can be reduced to relationships between the observer (sensor zenith angle), the target (land cover), and the illumination source (solar zenith angle). These accuracies and relationships are quantitatively examined in order to more thoroughly understand the uncertainty of various MODIS daily snow cover datasets. Using more than 1.5 million observations, this work shows the accuracies of the M*D10A1 (C5) datasets, and of an empirically derived binary classification of the fractional snow cover band. Then, using more than 3.2 million observations, it examines the accuracies of the most commonly used cloud removal methodologies. These results reaffirm previous MODIS accuracy assessments using similar methodologies despite the comparably small size of their ground truth validation observations. See Appendix B for a modified version of Table 9 from chapter 9 of Multiscale Hydrologic Remote Sensing (Chang and Hong, 2017) which places this work within the context of many of its peers. Additionally, by using a large and geographically diverse set of ground truth validation stations, this work is able to further quantify the effects of land cover and sun-sensor geometries on the accuracy of these datasets. It was shown that reclassifying the MOD10A1 fractional band using 10% as the threshold creates the most accurate daily snow cover dataset. Gap-filling is best accomplished by not combining sensors, but instead applying a temporal filter. The resulting dataset (F-MOD-V) has a higher AC, AA and MCC than datasets which are created by combining sensors with various window lengths. This improvement gained by using F-MOD products holds true even when some gaps are tolerable. As an example, if the use of daily snow cover data requires at least 80% of the time series to have data, Table 5 indicates that this may be achieved with either the F-MO/YD-Snow-1 or F-MOD-2 datasets. While F-MO/YD-Snow-1 requires a shorter temporal filter length than F-MOD-2, F-MOD-2 has a higher AC, AA and MCC. Additionally, the use of a single satellite dataset requires half the storage space of gap-filling methods that use datasets from both satellites. This relatively simple gap-filling method stands in contrast to other, more computationally intensive gap filling methods which trade computational effort for accuracy (Thompson and Lees, 2014).

This validation dataset and additional figures of the various datasets are available on the HydroShare platform (http://www.hydroshare.org/resource/c6f2c3e4155848bd944467de87833661) (Coll and Li, 2018). The code used to create these tested datasets is available here (https://code.earthengine.google.com/47c7e4c101b5ba4cab155d3f439e05e5) on Google Earth Engine, which is currently freely available with an account (https://signup.earthengine.google.com/#!/).

Long term datasets are difficult to work with, particularly at the daily, global scale. However, their importance in monitoring, modeling, and operational efforts is undeniable. This work presents a comprehensive assessment of the collection 5 MODIS snow cover products. Future work will build on this assessment to compare against collection 6 and include other remotely sensed snow cover datasets, including the European Sentinel platform derived snow cover products, Landsat snow products, and other end user generated datasets. Additional work is also underway to examine the global trends in snow cover frequency using the most accurate gap-filled dataset identified here.

Acknowledgements

This research was supported by a Google Earth Engine Faculty Research Award. We would especially like to thank Nicholas Clinton and Tyler Erickson from the Google Earth Engine team for their help. The authors would also like to acknowledge the helpful comments from two anonymous reviewers in improving this manuscript.


Appendix A. Accuracy tables of different combinations of thresholds for MOD and MYD datasets for the identification of the best snow fraction (10%) and SWE (2 mm) thresholds. Red values indicate the highest accuracy within each snow fraction threshold, and the highest accuracy is bolded for readability

Appendix B. A modified summary table from Multiscale Hydrologic Remote Sensing

Summary of studies on MODIS snow cover validation

Study | Region | MODIS Data Set/Time Period | Validation Data Set | AC in Clear Sky Conditions
Bitner et al. (2002) | Pacific Northwest and the Great Plains | MOD10A1/March-June 2001 | NOHRSC satellite products | 94.2–95.1%
Klein and Barnett (2003) | Upper Rio Grande River Basin, U.S. | MOD10A1/2000–2001 | NOHRSC satellite product and SNOTEL daily SD (15 stations) | 94.2% (SNOTEL); 86% (NOHRSC)
Maurer et al. (2003) | Missouri and Columbia River Basins, U.S. | MOD10A1/Winter and spring of 2000–2001 | NOHRSC satellite product and co-op and SNOTEL daily SD (1330 + 762 stations) | 87.5–95.8% (with NOHRSC); 74–81% (with ground SD)
Simic et al. (2004) | Canada | MOD10A1 (V003)/2000–2001 | Daily SD (2000 stations) | 93%
Brubaker et al. (2005) | Continental U.S. | MOD10C1 (V003)/2000 | Daily SD | 70–95% depending on season
Déry et al. (2005) | Kuparuk River Basin, Alaska | MOD10A1/May 23 and 30, 2002 | Landsat snow cover | R2 = 0.91 and 0.19
Tekeli et al. (2005) | Karasu Basin, Turkey | MOD10A1 (V004)/December 2002 - April 2003 | Snow courses and snow pillows | 100%
Zhou et al. (2005) | Upper Rio Grande River Basin, U.S. | MOD10A1 and MOD10A2 (V003)/February 2000 - June 2004 | Daily SWE (four SNOTEL stations) | MOD10A1: 92–95.9%; MOD10A2: 89.5–93.2%
Ault et al. (2006) | Lower Great Lakes Region, U.S. | MOD10_L2 (V004)/2000–2003 | NWS and GLOBE daily SD | 85.6–92.3%
Parajka and Blöschl (2006) | Austria | MOD10A1 (V004)/2000–2005 | Daily SD (754 stations) | 94.7% B
Pu et al. (2007) | Tibet Plateau, China | MOD10A2 (V004)/2000–2003 | Daily SD (115 stations) | 84–91%
Şorman et al. (2007) | Karasu Basin, Turkey | MOD10A1/March 22–25, 2004 | Daily SD (24 stations) | 100%
Liang et al. (2008) | Northern Xinjiang, China | MOD10A1 (V004)/November - March 2002–2005 | AMSR-E and daily SD (20 stations) | 93.4% (AMSR-E); 86.7% (stations)
Parajka and Blöschl (2008) | Austria | MOD10A1 and MYD10A1 (V004)/2003–2005 | Daily SD (754 stations) | MYD10A1: 95.5%; MOD10A1: 95.1%
Wang et al. (2008) | Northern Xinjiang, China | MOD10A2 (V004)/2001–2005 | Daily SD (20 stations) | 93–97%
Tong et al. (2009) | Quesnel River Basin, Canada | MOD10A1, MOD10A2 (V005)/2000–2007 | Daily SD (4 stations) | MOD10A1: 67.3–90.1%; MOD10A1: 75.9–90.1%
Wang and Xie (2009) | Northern Xinjiang, China | MOD10A1, MOD10A2, MYD10A1, MYD10A2 (V004)/September 2003 - August 2004 | Daily SD (20 stations) | MOD10A1 and MYD10A1: 98.8%; MOD10A2: 98.1%; MYD10A2: 96.3%
Gao et al. (2010) | Fairbanks and Upper and Susitna Valley, Alaska | MOD10A1 and MYD10A1 (V005)/October 2006 - September 2007 | Daily SD and SWE (14 stations) | MOD10A1: 90.4%; MYD10A1: 88.3%
Coll and Li (2018) | Western United States and Alaska | MOD10A1 and MYD10A1 (V005)/October 2002 - September 2014 | Daily SWE (819 stations) | MOD10A1: 91.34%; MYD10A1: 89.19%

Summary of studies focused on cloud impact reduction in MODIS snow cover products

Study | Region, Period | Method | Cloud Coverage before/after Cloud Impact Reduction | AC before/after Cloud Impact Reduction
Parajka and Blöschl (2008) | Austria, 2003–2005 | 1. Aqua and Terra. 2. Spatial filter. 3. Fixed temporal filters | 59.2–63%/51.7%(1)–4%(3) | 754 stations: 95.1–95.5%/94.9–92.1% b
Liang et al. (2008) | Northern Xinjiang, China, November - March 2002–2005 | Multisensor combination (with AMSR-E) | 61.6%/0% | 20 stations: 86.7%/76.1%
Tong et al. (2009) | Quesnel River Basin, Canada, 2000–2007 | Spatial filter (MOD10A2) | 64.2–71.9%/12.1–3.6% c | Four stations: 67.3–90.1%/76.4–93.1%
Wang and Xie (2009) | Northern Xinjiang, China, September 2003 - August 2004 | 1. Terra + Aqua. 2. Flexible temporal filter | 42–47%/35–3% | 20 stations: 98.8–99.1%/98.3–98.8%
Xie et al. (2009) | Colorado Plateau, U.S. and Northern Xinjiang, China, September 2003 - August 2004 | Flexible temporal filter of Terra + Aqua | 39.5–45.1%/30.6–5.3% (U.S.); 43.7–46.5%/35.1–5.8% (China) | 15 stations (U.S.): ∼40–50%/∼56–82%; 20 stations (China): ∼52–59%/∼65–94%
Parajka et al. (2010) | Austria, 2002–2005 | Snow line elevation | 60.1%/45.1%−10.4% | 754 stations: 95.1%/95.2–91.5%
Dietz et al. (2012) | Europe, 2000–2011 | 1. Terra and Aqua Temporal | 46.86%/ |
Coll and Li (2018) | Western United States and Alaska, 2002–2014 | MOD10A1 Fractional band reclassification and flexible temporal filter | 52.6%/25.4 to ∼0% | 819 stations: 92.07%/91.6–91.01%

References

Andreadis, K.M., Lettenmaier, D.P., 2006. Assimilating remotely sensed snow observations into a macroscale hydrology model. Adv. Water Resour. 29, 872–886. https://doi.org/10.1016/j.advwatres.2005.08.004.
Ault, T.W., Czajkowski, K.P., Benko, T., Coss, J., Struble, J., Spongberg, A., Templin, M., Gross, C., 2006. Validation of the MODIS snow product and cloud mask using student and NWS cooperative station observations in the Lower Great Lakes Region. Remote Sens. Environ. 105, 341–353. https://doi.org/10.1016/j.rse.2006.07.004.
Bitner, D., Carroll, T., Cline, D., Romanov, P., 2002. An assessment of the differences between three satellite snow cover mapping techniques. Hydrol. Process. 16, 3723–3733. https://doi.org/10.1002/hyp.1231.
Brubaker, K.L., Pinker, R.T., Deviatova, E., 2005. Evaluation and comparison of MODIS and IMS snow-cover estimates for the continental United States using station data. J. Hydrometeorol. 6, 1002–1017. https://doi.org/10.1175/JHM447.1.
Casey, K.A., Polashenski, C.M., Chen, J., Tedesco, M., 2017. Impact of MODIS sensor calibration updates on Greenland Ice Sheet surface reflectance and albedo trends. The Cryosphere 11, 1781–1795. https://doi.org/10.5194/tc-11-1781-2017.
Chang, N.-B., Hong, Y., 2017. Multiscale Hydrologic Remote Sensing: Perspectives and Applications, Chapter 9. CRC Press, S.l.
Coll, J., Li, X., 2018. CAAoMOD_ValidationDataset, HydroShare. http://www.hydroshare.org/resource/c6f2c3e4155848bd944467de87833661.
Déry, S.J., Salomonson, V.V., Stieglitz, M., Hall, D.K., Appel, I., 2005. An approach to using snow areal depletion curves inferred from MODIS and its application to land surface modelling in Alaska. Hydrol. Process. 19, 2755–2774. https://doi.org/10.1002/hyp.5784.
Dietz, A.J., Wohner, C., Kuenzer, C., 2012. European snow cover characteristics between 2000 and 2011 derived from improved MODIS daily snow cover products. Remote Sens. 4, 2432–2454. https://doi.org/10.3390/rs4082432.
Dozier, J., Painter, T.H., Rittger, K., Frew, J.E., 2008. Time–space continuity of daily maps of fractional snow cover and albedo from MODIS. Adv. Water Resour. 31, 1515–1526. https://doi.org/10.1016/j.advwatres.2008.08.011.
EOS Val Status for Snow Cover/Sea Ice: MOD10/29 [WWW Document], n.d. URL https://landval.gsfc.nasa.gov/ProductStatus.php?ProductID=MOD10/29 (accessed 6.19.18).
Friedl, M.A., Sulla-Menashe, D., Tan, B., Schneider, A., Ramankutty, N., Sibley, A., Huang, X., 2010. MODIS Collection 5 global land cover: algorithm refinements and characterization of new datasets. Remote Sens. Environ. 114, 168–182. https://doi.org/10.1016/j.rse.2009.08.016.
Gafurov, A., Bárdossy, A., 2009. Cloud removal methodology from MODIS snow cover product. Hydrol. Earth Syst. Sci. 13, 1361–1373. https://doi.org/10.5194/hess-13-1361-2009.
Gao, Y., Lu, N., Yao, T., 2011. Evaluation of a cloud-gap-filled MODIS daily snow cover product over the Pacific Northwest USA. J. Hydrol. 404, 157–165. https://doi.org/10.1016/j.jhydrol.2011.04.026.
Gao, Y., Xie, H., Lu, N., Yao, T., Liang, T., 2010. Toward advanced daily cloud-free snow cover and snow water equivalent products from Terra-Aqua MODIS and Aqua AMSR-E measurements. J. Hydrol. 385, 23–35. https://doi.org/10.1016/j.jhydrol.2010.01.022.
Hall, D.K., Riggs, G.A., 2007. Accuracy assessment of the MODIS snow products. Hydrol. Process. 21, 1534–1547. https://doi.org/10.1002/hyp.6715.
Hall, D.K., Riggs, G.A., Foster, J.L., Kumar, S.V., 2010. Development and evaluation of a cloud-gap-filled MODIS daily snow-cover product. Remote Sens. Environ. 114, 496–503. https://doi.org/10.1016/j.rse.2009.10.007.
Klein, A.G., Barnett, A.C., 2003. Validation of daily MODIS snow cover maps of the Upper Rio Grande River Basin for the 2000–2001 snow year. Remote Sens. Environ. 86, 162–176. https://doi.org/10.1016/S0034-4257(03)00097-X.
Klein, A.G., Hall, D.K., Riggs, G.A., 1998. Improving snow cover mapping in forests through the use of a canopy reflectance model. Hydrol. Process. 12, 1723–1744. https://doi.org/10.1002/(SICI)1099-1085(199808/09)12:10/11<1723::AID-HYP691>3.0.CO;2-2.
Li, H., Li, X., Xiao, P., 2016. Impact of sensor zenith angle on MOD10A1 data reliability and modification of snow cover data for the Tarim River basin. Remote Sens. 8, 750. https://doi.org/10.3390/rs8090750.
Liang, T., Zhang, X., Xie, H., Wu, C., Feng, Q., Huang, X., Chen, Q., 2008. Toward improved daily snow cover mapping with advanced combination of MODIS and AMSR-E measurements. Remote Sens. Environ. 112, 3750–3761. https://doi.org/10.1016/j.rse.2008.05.010.
Maurer, E.P., Rhoads, J.D., Dubayah, R.O., Lettenmaier, D.P., 2003. Evaluation of the snow-covered area data product from MODIS. Hydrol. Process. 17, 59–71. https://doi.org/10.1002/hyp.1193.
MCD12C1 | LP DAAC:: NASA Land Data Products and Services [WWW Document], n.d. URL https://lpdaac.usgs.gov/dataset_discovery/modis/modis_products_table/mcd12c1 (accessed 6.19.18).
NRCS National Water and Climate Center | Home [WWW Document], n.d. URL https://www.wcc.nrcs.usda.gov/snotel/snotel_sensors.html (accessed 6.24.18).
NRCS National Water and Climate Center | SNOTEL Data & Products [WWW Document], n.d. URL https://www.wcc.nrcs.usda.gov/snow/ (accessed 6.15.18).
Painter, T.H., Brodzik, M.J., Racoviteanu, A., Armstrong, R., 2012. Automated mapping of Earth’s annual minimum exposed snow and ice with MODIS. Geophys. Res. Lett. 39. https://doi.org/10.1029/2012GL053340.
Parajka, J., Blöschl, G., 2008. Spatio-temporal combination of MODIS images – potential for snow cover mapping. Water Resour. Res. 44. https://doi.org/10.1029/2007WR006204.
Parajka, J., Blöschl, G., 2006. Validation of MODIS snow cover images over Austria. Hydrol. Earth Syst. Sci. Discuss. 10, 679–689.
Parajka, J., Pepe, M., Rampini, A., Rossi, S., Blöschl, G., 2010. A regional snow-line method for estimating snow cover from MODIS during cloud cover. J. Hydrol. 381, 203–212. https://doi.org/10.1016/j.jhydrol.2009.11.042.
Powers, D.M.W., 2011. Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness & Correlation. J. Mach. Learn. Technol. 2.
Pu, Z., Xu, L., Salomonson, V.V., 2007. MODIS/Terra observed seasonal variations of snow cover over the Tibetan Plateau. Geophys. Res. Lett. 34. https://doi.org/10.1029/2007GL029262.
Riggs, G.A., Hall, D.K., Román, M.O., 2016. MODIS Snow Products Collection 6 User Guide.
Riggs, G.A., Hall, D.K., Salomonson, V.V., 2006. MODIS Snow Products User Guide to Collection 5.
Salomonson, V.V., Appel, I., 2006. Development of the Aqua MODIS NDSI fractional snow cover algorithm and validation results. IEEE Trans. Geosci. Remote Sens. 44, 1747–1756. https://doi.org/10.1109/TGRS.2006.876029.
Simic, A., Fernandes, R., Brown, R., Romanov, P., Park, W., 2004. Validation of VEGETATION, MODIS, and GOES+ SSM/I snow-cover products over Canada based on surface snow depth observations. Hydrol. Process. 18, 1089–1104. https://doi.org/10.1002/hyp.5509.
Şorman, A.Ü., Akyürek, Z., Şensoy, A., Şorman, A.A., Tekeli, A.E., 2007. Commentary on comparison of MODIS snow cover and albedo products with ground observations over the mountainous terrain of Turkey. Hydrol. Earth Syst. Sci. 8. https://doi.org/10.5194/hess-11-1353-2007.
Tekeli, A.E., Akyürek, Z., Arda Şorman, A., Şensoy, A., Ünal Şorman, A., 2005. Using MODIS snow cover maps in modeling snowmelt runoff process in the eastern part of Turkey. Remote Sens. Environ. 97, 216–230. https://doi.org/10.1016/j.rse.2005.03.013.
Thompson, J.A., Lees, B.G., 2014. Applying object-based segmentation in the temporal domain to characterise snow seasonality. ISPRS J. Photogramm. Remote Sens. 97, 98–110. https://doi.org/10.1016/j.isprsjprs.2014.08.010.
Tong, J., Dery, S.J., Jackson, P.L., 2009. Interrelationships between MODIS/Terra remotely sensed snow cover and the hydrometeorology of the Quesnel River Basin, British Columbia, Canada. Hydrol. Earth Syst. Sci. 14.
Vermote, E.F., Kotchenova, S.Y., Ray, J.P., 2011. MODIS Surface Reflectance User’s Guide.
Wang, J., Zhao, Y., Li, C., Yu, L., Liu, D., Gong, P., 2015. Mapping global land cover in 2001 and 2010 with spatial-temporal consistency at 250m resolution. ISPRS J. Photogramm. Remote Sens. 103, 38–47. https://doi.org/10.1016/j.isprsjprs.2014.03.007.
Wang, R., Chen, J.M., Pavlic, G., Arain, A., 2016. Improving winter leaf area index estimation in coniferous forests and its significance in estimating the land surface albedo. ISPRS J. Photogramm. Remote Sens. 119, 32–48. https://doi.org/10.1016/j.isprsjprs.2016.05.003.
Wang, X., Xie, H., 2009. New methods for studying the spatiotemporal variation of snow cover based on combination products of MODIS Terra and Aqua. J. Hydrol. 371, 192–200. https://doi.org/10.1016/j.jhydrol.2009.03.028.
Wang, X., Xie, H., Liang, T., 2008. Evaluation of MODIS snow cover and cloud mask and its application in Northern Xinjiang, China. Remote Sens. Environ. 112, 1497–1513. https://doi.org/10.1016/j.rse.2007.05.016.
Wylie, D., Jackson, D.L., Menzel, W.P., Bates, J.J., 2005. Trends in global cloud cover in two decades of HIRS observations. J. Clim. 18, 11.
Xie, H., Wang, X., Liang, T., 2009. Development and assessment of combined Terra and Aqua snow cover products in Colorado Plateau, USA and northern Xinjiang, China. J. Appl. Remote Sens. 3, 033559. https://doi.org/10.1117/1.3265996.
Zhou, X., Xie, H., Hendrickx, J.M.H., 2005. Statistical evaluation of remotely sensed snow-cover products with constraints from streamflow and SNOTEL measurements. Remote Sens. Environ. 94, 214–231. https://doi.org/10.1016/j.rse.2004.10.007.
