Methodology for estimating availability of cloud-free image composites: A case study for southern Canada

Methodology for estimating availability of cloud-free image composites: A case study for southern Canada

International Journal of Applied Earth Observation and Geoinformation 21 (2013) 17–31 Contents lists available at SciVerse ScienceDirect Internation...

1MB Sizes 1 Downloads 30 Views

International Journal of Applied Earth Observation and Geoinformation 21 (2013) 17–31

Contents lists available at SciVerse ScienceDirect

International Journal of Applied Earth Observation and Geoinformation journal homepage: www.elsevier.com/locate/jag

Methodology for estimating availability of cloud-free image composites: A case study for southern Canada Fuqun Zhou ∗ , Aining Zhang Canada Centre for Remote Sensing, Natural Resources Canada, Canada

a r t i c l e

i n f o

Article history: Received 16 January 2012 Accepted 27 July 2012 Keywords: Cloud cover Cloud fraction product Cloud-free images Cloud-free composite availability

a b s t r a c t Image composites are often used for earth surface phenomena studies at regional or national level. The compromise between residual clouds and the length of compositing period is a necessary corollary to the choice of satellite optical data for monitoring earth surface phenomena dynamics. This paper introduced a methodology for estimating availability of cloud-free image composites for optical sensors with various revisiting intervals, using MODIS MOD06 L2 cloud fraction product in the period of 2000–2008. The methodology starts with downscaling of the cloud fraction product to 1 km × 1 km cloud cover binary images. The binary images are then used for the exploration of spatial and temporal characteristics of cloud dynamics, and subsequently for the simulation of cloud-free composite availability with various revisiting intervals of optical sensors. Using Canada’s southern provinces as an application case, the study explored several factors important for the design of environmental monitoring system using optical sensors of earth observation, in particular, cloud dynamics and its inter-annual variability, sensors’ revisiting intervals, and cloud-free threshold for targeting composites. While the cloud images used in the analysis are at 1 km × 1 km resolution, our analysis suggests that the simulated availabilities of cloud-free image composites may also provide reasonable estimates for optical sensors with higher than 1 km × 1 km resolution, though the closer to 1 km × 1 km resolution the optical sensor, the more pertinent the application. Also, the methodology can be parameterised to different temporal period and different spatial region, depending on applications. Crown Copyright © 2012 Published by Elsevier B.V. All rights reserved.

1. Introduction Satellite-based earth observation systems are widely used for mapping and monitoring large-scale environmental processes. For optical remote sensing, cloud cover is a major obstacle to such remote sensing applications. An essential task in these applications is thus the generation of cloud-free image composites1 from multitemporal image scenes covering the same area (Gutman et al., 1994; Cihlar, 1996; Cihlar et al., 1996; Arvidson et al., 2001; Ju and Roy, 2008; Luo et al., 2008; Bindschadler et al., 2008). The potential number of cloud-free composites available will then serve as a key parameter in the design and/or choice of earth observation systems for a particular application. Examples of such applications

∗ Corresponding author at: Canada Centre for Remote Sensing, Natural Resources Canada, 588 Booth Street, 3th Floor, Rm 311, Ottawa, ON, Canada K1A 0Y7. Tel.: +1 613 947 5282; fax: +1 613 947 1383. E-mail addresses: [email protected], [email protected] (F. Zhou). 1 In the context of optical remote sensing, a cloud-cover pixel means that the field of view of the pixel is blocked by cloud between the optical sensor and the ground surface of the pixel, and a cloud-free image composite means that all pixels of the composite are free from cloud cover.

include monitoring of snow cover (Zhao and Fernanders, 2009; Dietz et al., 2012), forest fire (Li et al., 1999; Sunar and Özkan, 2001; Leblon, 2005; Schroeder et al., 2008), vegetation (Zhang et al., 2004; Fontana et al., 2012; Chen, 1996; Abuelgasim and Leblanc, 2011; le Maire et al., 2011), and surface solar irradiance and the radiation budget (Buriez et al., 1988; Deneke et al., 2008). Countries with large landmass could benefit enormously from satellite earth observation in their understanding and monitoring large scale land surface dynamics. This is particularly true for Canada, in which the short growing seasons make it necessary to employ the “right” observing systems in order for monitoring and adequate assessment of the spatiotemporal dynamics of its ecosystem. Optimally, high spatial resolution and high temporal frequency of earth observations are both desired, but these two demands are usually contradictory in case of satellite monitoring. Thus in practice, a compromise must be struck in the selection of suitable platform(s) and the sensors onboard from available satellite systems. This requires a sufficient understanding of the quantity and the spatial and temporal properties of cloud cover over the landmass under study and their impact on cloud-free image acquisitions with different earth observation systems. The same understanding is also desirable in determining the level of cloud contamination that often has to be permitted in the “cloud-free”

0303-2434/$ – see front matter. Crown Copyright © 2012 Published by Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.jag.2012.07.017

18

F. Zhou, A. Zhang / International Journal of Applied Earth Observation and Geoinformation 21 (2013) 17–31

image composite in order to acquire sufficient number of composites for the monitoring need. A design of effective system for a specific monitoring application, therefore, requires knowledge of the spatio-temporal properties of cloud coverage over a study region, and the frequencies or opportunities and variations with which cloud-free image composites can be acquired. This knowledge, however, is largely limited, impeding the development of operational solutions to mapping and monitoring land surface dynamics using earth observation data. Several studies have been conducted for this need, with cloud information drawn from two main sources: cloud observation stations and remote sensing imagery. An early and preliminary study by Greaves et al. (1970) used cloud cover data from approximately 100 observation stations distributed throughout the world to generate the cloud cover statistics for 29 cloud-climatic regions, and then to simulate the cloud cover distribution of these regions for evaluation of the influence of cloud on the performance of a sensor system. More recently, Wagner and Reuder (2005) use observational data from ground stations for a regional study of the cloud frequency with respect to remote sensing applications at southern Germany. Based on the synoptic cloud observations and statistics at six meteorological stations, this study concludes that in general, during the vegetation growing season, remote sensing data can be used to determine the ground reflectance for one third of a month with reasonable accuracy and for one sixth of a month with very high accuracy. The study points out that, as cloud cover has large spatial variations from one place to another, cloud data from meteorological stations, which are normally sparse relative to cloud dynamics, may have poor representation for a regional study. Marshall et al. (1994) generate cloud cover distributions over the European Arctic Sector from historical (1983–1992) Landsat 4/5 MSS and TM imagery, and then assess the spatial and temporal effect of cloud cover on the acquisition of high quality of Landsat imagery in their studied region for glaciological studies. Sano et al. (2007) examine the impacts of cloudiness on monitoring of the Brazilian Cerrado from Landsat-like sensors. They derive the percent cloud cover from more than 35,500 Landsat quick-looks to determine the spatial and temporal probabilities of obtaining cloud-free Landsat images over the Brazilian tropical savanna for land use/land cover mapping and ecological monitoring. More recently, Ju and Roy (2008) extract cloud fraction data from Landsat ETM+ metadata from December 1999 to November 2002 to evaluate the availability of cloud-free Landsat ETM+ data over conterminous United States and the globe. They find that cloud becomes a constraint when at least two cloud-free observations are required from the same season over the conterminous U.S. Global applications requiring at least one cloud-free observation in a season, in two different seasons, and applications requiring at least two cloud-free observations in a year, are all severely affected by cloud and data availability constraints. Although these studies have served various purposes, due to the data sources used, these studies share one or more major constraint limiting their wider applications: (1) cloud cover data lacked spatial coverage when meteorological station data is used as the source of cloud cover information (Greaves et al., 1970; Wagner and Reuder, 2005); (2) cloud cover data lacked temporal coverage when remote sensing data from satellite sensors with a low frequency of revisit (such as Landsat) (Marshall et al., 1994; Sano et al., 2007; Ju and Roy, 2008). Furthermore, all these studies targeted applications of specific sensors (for example, Landsat ETM+), therefore the results lack generality and cannot be applied to optical sensors with different revisiting frequencies. The Moderate Resolution Imaging Spectroradiometer (MODIS) onboard the Terra satellite has provided observations permitting the monitoring of cloud cover around the globe on a daily basis. From these observations, the MODIS atmospheric science team has produced, and made accessible through NASA website

(http://ladsweb.nascom.nasa.gov/data/), a set of cloud products, including MOD35 L2 cloud mask product at 1 km × 1 km resolution; and the MOD06 L2 cloud fraction product at 5 km × 5 km resolution, derived from MOD35 cloud mask and other atmospheric properties (King et al., 1997; Platnick et al., 2003). Because of the high temporal coverage (daily) and large spatial extent (global), these products provide an opportunity to explore the spatial and temporal behaviours of cloud cover at the regional, national and global levels. Based on this, it is then possible to estimate the likelihood of obtaining cloud-free image composites from sensors with different revisiting intervals, in support of the design of specific monitoring systems for environmental assessment applications. This study was thus the first attempt to exploit this potential. The MOD06 L2 daytime Terra atmospheric product was chosen for the study, with the consideration to capitalize on the sophisticated techniques used in deriving MOD35 cloud mask by the NASA atmospheric science team. A methodology with a set of algorithms was then developed for a comprehensive analysis of spatiotemporal characteristics of cloud dynamics in relation to cloud-free image compositing, and for subsequent estimation of the availability of cloud-free image composites. Simulations and sensitivity assessment were further conducted to estimate the availability of cloud-free image compositing with various sensors and revisiting frequencies, and with different cloud-contamination thresholds. While the study focuses on southern Canada, the methodology is generic, applicable for different regions and at different scales – local, regional, national or global. 2. Methodology With the MODIS MOD06 L2 cloud fraction data downloaded from the NASA website, we first downscaled the 5 km × 5 km pixels of the MODIS cloud fraction products to 1 km × 1 km binary cloud cover images. We then analysed the spatial, temporal and statistical characteristics of the cloud-free conditions over southern Canada during the vegetation growing seasons (May to September) between 2000 and 2008. Next, we developed an algorithm to estimate the availability of cloud-free image composites that may be acquired by satellite sensors with various revisit intervals. We then applied the algorithm to the daily 1 km × 1 km binary cloud cover images, by province and by month, to estimate the availability of cloud-free image composites with three cloud-free thresholds (or cloud contamination). Finally, we analysed the spatiotemporal and statistical characteristics of the composites, with a brief analysis of the likely causes for the observed patterns and trends. 2.1. MODIS cloud products The MODIS MOD35 cloud mask product indicates whether a pixel is unobstructed between ground surface and satellite. The product provides 48 bits of output per 1-km pixel that includes information on sets of multispectral test results (from 19 MODIS spectral bands), the decision tree used to arrive at the product, and limited ancillary information such as a land/ocean and snow/no snow flags (Ackerman et al., 1997; Platnick et al., 2003). Although MODIS MOD35 product is called cloud mask, it is not binary value in term of either cloudy or clear. Instead, the first two bits provide information in four categories: cloudy, probably cloudy, probably clear, and confident clear. To determine the cloudy or clear status of a pixel of the product, it is straightforward for pixels with confident clear (cloud-free) or cloudy bit. For pixels with “probably clear” or “probably cloudy”, however, determining the clear or cloudy state would depend on the cloud information demand of applications and other parameters such as land surface features (MODIS Cloud Mask User Guide).

F. Zhou, A. Zhang / International Journal of Applied Earth Observation and Geoinformation 21 (2013) 17–31

The MOD06 L2 cloud fraction product is derived mainly from MOD35 cloud mask and calculated on a 5 km × 5 km basis. Each MOD06 L2 cloud fraction pixel, containing 25 1 km × 1 km pixels from MOD35 cloud mask, was quantified as the percentage of the number of 1 km × 1 km cloudy pixels divided by 25. The pixel value of the cloud fraction thus ranges from 0 (cloud free) to 100 (fully cloudy) by an increment of 4, and the spatial resolution of the MOD06 L2 cloud fraction is 5 km × 5 km. For the purpose of assessment of cloud-free composite availability in the study, the MOD06 L2 cloud faction product is preferable over MOD35 cloud mask product for the simplicity to use and storage consumption and computation efficiency. 2.2. Data pre-processing The MOD06 L2 cloud fraction product is granule and swath based. To generate cloud fraction image mosaics on a daily basis for the entire study region, we downloaded all the time-series of daily (daytime) cloud fraction swath images over southern Canada (40–70◦ N, 50–145◦ W) during the growing seasons (May 1 to September 30 of 2000–2008). The swath images were reprojected to the geographic (latitude/longitude) coordinate system. The swath images were then joined to generate a full daily cloud fraction coverage image of the study area. All the pixel values of the relevant swath images were considered in the mosaicing process. If a pixel of the study area was covered only by one swath image, the cloud fraction value of the swath image was used; for a pixel covered by two or more swath images, the lowest cloud fraction values at the same location was retained in the daily product (this represents the optimistic situation for cloud-free image compositing). The processed time series of daily cloud fraction mosaics encompass the entire study area, composed of a total of 1377 images covering the growing seasons (May 1 to September 30) from 2000 to 2008. The Canadian provincial boundaries were downloaded from the GeoGratis website (http://www.geogratis.gc.ca). The 10 provincial boundaries were used as polygons within which the temporal and spatial distributions of cloud cover were analysed. Provincial boundaries were used because they delineate areas within which resource management decisions are often made by individual jurisdiction (alternatively one could delineate the boundaries to geographical regions or municipalities, or ecoregions, depending on the application needs). Fig. 1 shows the study region superimposed by a daily cloud fraction mosaic image and provincial administrative boundaries. 2.3. Downscaling and validation of cloud fraction of 5 km × 5 km resolution image to 1 km × 1 km binary cloud cover image In theory, the higher resolution the cloud product, the higher accuracy the estimates of the availability of cloud-free composites. In an attempt to improve the accuracy using MOD06 L2 cloud fraction for finer resolution sensors, a downscale method is developed. The downscaling algorithm has two premises. First, our interest is in an unbiased estimation of the availability of cloud-free image composites at the spatial scale of 1 km × 1 km and up. Second, the cloud fraction estimates of the MOD06 L2 cloud fraction product are unbiased on a 5 × 5 pixel basis. The first premise poses the constraint that the output using the original MOD06 L2 cloud fraction product will be applicable only to remote sensing images with a spatial scale of 5 km × 5 km resolution. The validity of the second is accredited to the significant effort of the NASA atmospheric science team to generate MOD35 cloud mask and then MOD06 cloud fraction product (Platnick et al., 2003). With these understandings, we have downscaled, in an attempt to gain the spatial resolution, the

19

cloud faction product of 5 km × 5 km resolution to a binary cloud cover image of 1 km × 1 km resolution. Downscaling involves the data process that translates data across scales, either in temporal or spatial dimension or in both, to meet the finer resolution data requirements of applications. An example of downscaling of remote sensing imagery is the generation of bands 3–7 MODIS data at 250 m resolution from the original resolution of 500 m (Luo et al., 2008). Recently, downscaling techniques are widely used in climate change studies as outputs of a General Circulation Model usually have a much coarser grid for climate change impact assessment at local or even regional levels (Flowler et al., 2007). Various downscaling techniques have been developed for translation of a large spatial or/and temporal scale of data to a finer scale to meet application needs. Broadly, downscaling algorithms can be classified as two groups: dynamic (or process-based) approach and empirical (statistical) approach. For the former, physical dynamics are solved explicitly with the coarse data as geographical or spectral boundary conditions, such as limited-area model, and the latter is so-called statistical downscaling with a variety of techniques, such as multivariate regression. The general theory, advantages and limitations as well as practice of downscaling are well described in the literature (e.g., Hewitson and Crane, 1996; Wilby and Wigley, 1997; Xu, 1999; Boé et al., 2007; Salathé et al., 2007). To downscale the cloud fraction (5 km × 5 km) data to the cloud binary image (1 km × 1 km) in this study, a simple, spatial downscaling technique is developed. This technique, on a pixel basis, converts 5 km × 5 km pixels of cloud fraction into the number of cloudy pixels in the scale of 1 km × 1 km, and then distributes all the cloudy pixels to the same 5 km × 5 km pixel space within a binary image of 1 km × 1 km spatial resolution. The cloudy pixel distribution starts with a random selection of a 1 km × 1 km pixel within the original 5 km × 5 km pixel space, followed by assigning the rest of the cloudy pixels to the grids adjacent to the seed cloudy pixel, with the consideration of the nature of cloud continuity. The underlying assumption is that the potential bias from the random identification of the cloud location in the resulting estimates of the availability of cloud-free image composites would be mutually cancelled given the massive number of pixels involved in an area under the study. More specifically, the downscaling algorithm starts with the creation of a grid system whose one block of 5 × 5 cells corresponds to one 5 km × 5 km pixel of the cloud fraction image, with each cell corresponding to a 1 km × 1 km pixel of the downscaled cloud cover binary image. From the cloud fraction value of one 5 km × 5 km pixel in the cloud fraction image, it is straightforward to derive the number of 1 km × 1 km pixels that are cloudy, but the location(s) of the cloudy pixel(s) are unknown if cloud fraction is greater than 0 and smaller than 100. To put these cloudy pixels on the cells, we first randomly pick up a cell within the 5 × 5 cells to hold the first (seed) cloudy pixel. If the number of cloudy pixels is larger than 1, we then assign the second cloudy pixel adjacent to the seed and the third adjacent to the first or second or both cloudy pixels, and so on, with the cloud continuity assumption. This process is applied to all the daily cloud fraction image mosaics of the entire study region, during the growing season over the period of 2000–2008. The results are a daily time series of 1 km × 1 km binary cloud cover mosaics (cloud = 1 and cloud-free = 0) of the study region for the growing seasons of 2000–2008. To validate the downscaling algorithm, all cloud mask files (MOD35) of the entire study region during the growing season (May to September) for year 2008, corresponding to the cloud fraction data (MOD06), are downloaded (a total of 1825 swaths). To make cloud mask data comparable with the corresponding downscaled binary cloud cover images under the validation, the MOD35 cloud mask pixels with probably clear value are simply categorized as clear and those with probably cloudy are identified as cloudy.

20

F. Zhou, A. Zhang / International Journal of Applied Earth Observation and Geoinformation 21 (2013) 17–31

Fig. 1. A cloud fraction coverage in Lambert Conformal Conic projection with superimposed Canadian southern provincial administrative boundaries. Lighter colours represent higher cloud fraction values. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

The binary version of the MOD35 cloud mask datasets was then used as the benchmark datasets for comparison with the downscaled binary cloud cover image on a pixel and swath basis. A ratio of agreement (cloudy–cloudy and clear–clear pairs) between the downscaled binary cloud cover image and the simplified binary MOD35 cloud mask data is computed for each and every swath. The result shows that the average accuracy of all the swaths is 88.4%, with the lowest being 75.1%, the highest 97.1% and a standard deviation of 3.4. On a monthly basis, the average accuracy for May to September is 89.1%, 87.5%, 87.4%, 88.6% and 89.3%, with a standard deviation of 2.96, 3.28, 3.98, 3.30, and 3.06, respectively. A number of factors may contribute to the small discrepancy between the two datasets. One of them is the way for generation of the binary MOD35 cloud mask dataset: probably clear and probably cloudy pixels are hard classified as clear or cloudy. Another contributor can be the downscaling method itself as the first cloudy pixel is randomly selected, and then other cloudy pixels are clustered around it, which would produce a larger discrepancy for those pixels with a small cloud fraction number (except 0). Despite these, the validation suggests that the downscaling technique produces reasonable results in both spatial and temporal dimensions. 2.4. Exploratory analysis of spatio-temporal distributions of cloud-free conditions To reveal the spatiotemporal characteristics of cloud-free conditions over a region, we performed various exploratory analyses of the 1377 derived 1 km × 1 km binary cloud cover mosaics, downscaled from the 5 km × 5 km MOD06 data, covering the growing season months from 2000 to 2008. We finally chose two complementary charts of statistical distributions for the examination of the potential of obtaining cloud-free image composites.

The principal chart, as exemplified in Fig. 2, shows the proportions of each provincial territory that have various numbers of cloudy days during a growing season month. There are two salient features in the curves that are of particular interest to the understanding of cloud dynamics of the region. The first is the location in the x-axis (i.e. days with cloud cover) where a curve rises (i.e. the percent of area becomes greater than 0). It indicates the minimum number of cloudy days any location in the study region has during a month period. The peak and the distribution of the curve with respect to the x-axis are the second feature, which reveals the number of cloudy days in a month that most areas of the study region have. These features together portray the temporal characteristics of cloud dynamics of the region. Looking from the daily binary cloud cover image point of view, the second chart, as Fig. 3 shows, visualizes the spatial characteristics of monthly cloud dynamics averaged over the growing seasons from 2000 to 2008. With the x-axis representing cloudfree conditions, as measured by cloud-free percent (CFP) values, and the y-axis representing the frequency of the CFP distributions for the study region, the chart provides an insight about the effort required to compose cloud-free images, or the possibility of obtaining cloud-free composite without conducting image compositing. The detailed discussion of impacts of these spatiotemporal characteristics of the cloud dynamics of these regions on the availability of cloud-free image composites can be found in Section 3 below. 2.5. Simulation of cloud-free composite availability by sensors with various revisiting intervals The exploratory analysis suggested that cloud coverage over southern Canada is extensive, making it hardly possible to obtain a clear sky image mosaic on any day for any province. Thus,

F. Zhou, A. Zhang / International Journal of Applied Earth Observation and Geoinformation 21 (2013) 17–31

a)

May June July Aug. Sept.

16 12 8 4

20

percentage of area

percentage of area

20

0

May June July Aug. Sept.

16 12 8 4

3

5

7

1

9 11 13 15 17 19 21 23 25 27 29 31

3

5

7

Days with Cloud Cover

May June July Aug. Sept.

16 12

c)

20

8 4

May June July Aug. Sept.

16 12 8

d)

4 0

0 1

3

5

7

1

9 11 13 15 17 19 21 23 25 27 29 31

3

5

7

May June July Aug. Step.

16 12

e)

8 4

20

Percentage of area

20

9 11 13 15 17 19 21 23 25 27 29 31 Days with Cloud Cover

Days with Cloud Cover

Percentage of area

9 11 13 15 17 19 21 23 25 27 29 31 Days with Cloud Cover

Percentage of area

Percentage of area

20

May June July Aug. Sept.

16 12 8

f)

4 0

0 1

3

5

7

1

9 11 13 15 17 19 21 23 25 27 29 31

3

5

7

May June July Aug. Sept.

16 12

g)

8 4

20

Percentage of area

20

9 11 13 15 17 19 21 23 25 27 29 31 Days with Cloud Cover

Days with Cloud Cover

Percentage of area

b)

0 1

0

May June July Aug. Sept.

16 12

h)

8 4 0

1

3

5

7

9 11 13 15 17 19 21 23 25 27 29 31

1

3

5

7

9 11 13 15 17 19 21 23 25 27 29 31

Days with Cloud Cover

i)

May June July Aug. Sept.

16 12

Days with Cloud Cover

8 4 0

20

Percentage of area

20

Percentage of area

21

May June July Aug. Sept.

16 12 8

j)

4 0

1

3

5

7

9 11 13 15 17 19 21 23 25 27 29 31 Days with Cloud Cover

1

3

5

7

9 11 13 15 17 19 21 23 25 27 29 31 Days with Cloud Cover

Fig. 2. Monthly averaged percentage of area vs. cloud cover days during 2000–2008. (a) British Columbia, (b) Alberta, (c) Saskatchewan, (d) Manitoba, (e) Ontario, (f) Quebec, (g) New Brunswick, (h) Nova Scotia, (i) Prince Edward Island, and (j) Newfoundland and Labrador.

cloud-free images have to be composited from multiple acquisitions. Despite various exploratory tools available, there is no suitable tool available to estimate the availability of cloud-free image composites for a study region over a given period. We thus developed an algorithm for estimate of cloud-free composites, based on the time series of the 1 km × 1 km binary cloud cover images described in Section 2.3. With the understanding that the availability of cloud-free image composites for a study region depends on both cloud dynamics and the revisiting frequency of

satellite sensors, the algorithm starts with the simplified case of one-day revisiting sensors, and expands to sensors with various revisiting frequencies. For one-day revisiting sensors, the algorithm is as follows: (1) create a grid system with the same spatial resolution of the cloud binary images, i.e. 1 km × 1 km, with delineation of the administrative boundaries (in this case, provincial); (2) superimpose the grid on the binary cloud cover image of day 1 of a month, and examine each pixel of the image for its cloudy status; (3) whenever a

16 14 12 10 8 6 4 2 0

May June July Aug. Sept.

a)

0

10

20

30

40

50

60

70

80

90

Frequency (days)

F. Zhou, A. Zhang / International Journal of Applied Earth Observation and Geoinformation 21 (2013) 17–31

Frequency (days)

22

16 14 12 10 8 6 4 2 0 0

100

May June July Aug. Sept.

b)

10

20

30

May June July Aug. Sept.

c)

0

10

20

30

40

50

60

70

80

90

Frequency (days)

Frequency (days)

16 14 12 10 8 6 4 2 0

16 14 12 10 8 6 4 2 0

10

100

10

0

30

40

50

60

70

80

90

Frequency (days)

Frequency (days)

May June July Aug. Sept.

20

10

16 14 12 10 8 6 4 2 0

100

20

30

40

50

60

70

80

10

90

16 14 12 10 8 6 4 2 0

10

30

40

50

60

70

80

90

100

Cloud-free percent

Frequency (days)

Frequency (days)

40

50

60

70

80

20

30

40

50

60

70

80

90

100

90

100

May June July Aug. Sept.

h)

10

0

100

May June July Aug. Sept.

20

100

20

30

40

50

60

70

80

90

100

Cloud-free percent

i)

0

90

May June July Aug. Sept.

Cloud-free percent 16 14 12 10 8 6 4 2 0

30

f)

0

Frequency (days)

Frequency (days)

May June July Aug. Sept.

20

80

Cloud-free percent

g)

0

70

May June July Aug. Sept.

Cloud-free percent

16 14 12 10 8 6 4 2 0

60

Cloud-free percent

e)

0

50

d)

Cloud-free percent 16 14 12 10 8 6 4 2 0

40

Cloud-free percent

Cloud-free percent

16 14 12 10 8 6 4 2 0

May June July Aug. Sept.

j)

0

10

20

30

40

50

60

70

80

90

100

Cloud-free percent

Fig. 3. Statistical distribution of cloud-free percentages by month and by region: (a) British Columbia, (b) Alberta, (c) Saskatchewan, (d) Manitoba, (e) Ontario, (f) Quebec, (g) New Brunswick, (h) Nova Scotia, (i) Prince Edward Island, and (j) Newfoundland and Labrador.

cloud-free pixel is found in the cloud cover image, flag the corresponding pixel in the grid as cloud-free; (4) continue the process for subsequent pixels of the image, and if needed, the subsequent image (the following day) of the month, until all the pixels of the grid are flagged as cloud-free, representing the availability of one cloud-free image composite; (5) increase the number of cloudfree composite by one and reset the grid system; and (6) return to step (2) but using the binary cloud cover image of the following day unless the end of the month is reached. By repeating the

steps for all growing season months and all the years 2000–2008, we obtained the total number of cloud-free image composites for individual province on a monthly basis, and based on these, we further calculated the means and standard deviations of the monthly availability of cloud-free composites for the period of 2000–2008. As the opportunity to capture cloud-free images declines with an increasing revisit interval of the satellite sensor, we extended the above algorithm from daily sensor revisiting frequency to 2day and multiple-day revisiting frequency. Briefly, the 1-day revisit

F. Zhou, A. Zhang / International Journal of Applied Earth Observation and Geoinformation 21 (2013) 17–31

algorithm was modified to admit only every nth day cloud cover binary image during a month, with n being the revisiting interval for the compositing process. Theoretically n can be the longest revisit frequency among all optical sensors; in practice, we have investigated n value of up to 16, according to the current reality of optical earth observation systems. For every province of the region under study, we superimposed the grid system on the identified daily cloud-cover binary images to determine the number of cloud-free image composites described above. There are n possible combinations when a sensor with nday revisit interval visits a place. For example, if a sensor’s revisit frequency is 5 days and assuming that the first visit of the sensor is on the first day of the month, the sensor could only visit an area in the days of 1, 6, 11, 16, etc. As the first visit of the sensor in a month can be any one of the first 5 days (for the sensor with revisit cycle of 5 days) there are 5 possible cases. As all the 5 cases are possible, the availability of cloud-free composites is estimated as the average of the 5 cases of a month from 2000 to 2008. In each case only the cloud cover binary images of these days are used to estimate the number of cloud free image composites. In the case of revisiting interval being n, the possible cases are: Case 1: day 1 and followed by days with every n day interval to the end of the month; Case 2: day 2 and followed by days with every n day interval to the end of the month; ... Case n: day n and followed by days with every n day interval to the end of the month. Our simulations showed that the chance of obtaining 100% clear sky image composites for any region under the study (the majority of the provinces in this case) is very small. This is consistent with the general practice in EO applications for monitoring (Cihlar et al., 1996). We thus tested two relaxed thresholds for clear-sky requirement in a “cloud-free” composite, 95% and 90%. It should be noted that, while a composite with fewer cloudy pixels is of higher quality and generally preferable for applications, for areas with heavy cloud contaminations, the relaxation of the thresholds may be preferable for the gains of obtaining more composites, with higher phonological consistency in each composite. 3. Results and discussion 3.1. Spatiotemporal characteristics of cloud-free conditions Fig. 2 shows, for each province in the southern Canada, the temporal characteristics of cloud dynamics for each of the growing season months, averaged over the period of 2000–2008. A general pattern can be observed: on average of the period 2000–2008, for every location of southern Canada (as recorded by pixels of 1 km × 1 km resolution), there are around at least 10 or more cloudy days in a month, except for Alberta (Fig. 2b) and Saskatchewan (Fig. 2c) where the lower bound of cloud days is about 6 for July and August. This means that for some parts of these two provinces, cloud appears for as less as 1/5 of a month only, leaving 4/5 of a month cloud-free in these summer months. This is consistent with the understanding of the Prairie climate, most typically the dry region of the Pallisier Triangle. On contrary, the low bound of cloudy days is as high as 12 and 13 days for Quebec and Newfoundland, respectively, suggesting that these provinces have more cloudy days in general. Fig. 2 also reveals some patterns for the statistical distributions of cloudy days over a region: Prairie Provinces and Ontario have near normal distributions with similar means and variation; Atlantic provinces share near normal distributions but

23

with much larger monthly variations and much narrower a normal curve; British Columbia (BC), Quebec and Newfoundland all have distributions highly skewed towards more cloudy days, suggesting the areas having much less cloud-free days. Fig. 3 reveals the spatial characteristics of cloud dynamics for southern Canada provinces over period of 2000–2008. Also on a monthly basis of the growing season, the charts show the frequencies of the various percentages of cloud-free conditions of the daily cloud binary images. Two distinctive patterns of these distributions can be observed. The first, shared by all the noncoastal provinces (Fig. 3b–f), is a heavily right-skewed pattern, which suggests that, overall, these regions have peak occurrences at 25% cloud-free condition, with very low frequencies in the highend range of cloud-free percentage within a month. A look into the monthly patterns within each graph, however, reveals some anomalies in distribution for July and August – for all the noncoastal provinces except for Quebec, the peak frequencies occur at 35–45% CFP, indicating improved clear sky conditions in these two months. Quebec appears to have the lowest number of days with regard to the high-end of CFP values, suggesting that it was the cloudiest province during the period of 2000–2008. The second pattern can be observed from the graphs of the Atlantic Region (Fig. 3g–i). The largest numbers of days appear to be those with only 5% CFP; however it also has the second largest numbers on the other extreme of CFP value – 95%. The extreme distributions imply a better chance of acquiring almost 100% cloud-free daily mosaics for these provinces in some days within a month; hence reduced necessity for cloud-free image compositing. 3.2. Cloud-free image composites available for sensors with one-day revisiting frequency Fig. 4 shows, for each southern Canada province, the simulated averaged numbers of cloud-free image composites, as well as their standard deviations, that could be obtained from a 1-day revisit satellite sensor during each of the growing season months for the period of 2000–2008. They were produced based on the 1 km × 1 km daily cloud binary images downscaled from NASA’s daily MOD06 L2 cloud fraction product, using the method described in Section 2.3, and the algorithm described in Section 2.5. Overall, the results indicate that the opportunity to obtain completely cloud-free (100%) monthly composites is very limited for majority of the provinces. For British Columbia, Quebec, and Newfoundland barely one cloud-free image composite per month can be acquired during the growing season with summer months showing a slight advantage over May and September. The regions with the best opportunity for cloud-free composite acquisition are the Maritime provinces (2–4 composites), in which August and September generally have higher availability, followed by June, July and May. Ontario and the Prairie provinces have a similar level of cloud-free composite opportunities (1–2 composites), with monthly variation larger for Alberta and smaller for Manitoba and Saskatchewan. In practice, some level of cloud contamination is often tolerated in order to obtain sufficient number of composites to meet the needs of applications. When we relaxed cloud-free threshold (i.e. the percentage of cloud-free pixels in a cloud-free image composite) down from 100% to 90%, the numbers of the averaged monthly ‘cloud-free’ composites are jumped by 2–4 times. The spatial patterns, however, remain intact: British Columbia, Quebec, and Newfoundland still have the smallest number of monthly available composites (3–5); the Maritime provinces have the largest numbers (6–10); and Ontario and the Prairie provinces are in between. The same spatial pattern appears when the threshold is set to 95%. These adjustments, however, reveal a non-linear relationship between the cloud-free threshold and the availability of cloud-free image composites. In particular, the composite availability leaps

F. Zhou, A. Zhang / International Journal of Applied Earth Observation and Geoinformation 21 (2013) 17–31

Cloud-free Comp. (count)

14

90%

12

95%

10

100%

14

a) Cloud-free Comp. (count)

24

8 6 4

b)

90%

12

95%

10

100%

8 6 4 2

2

0

0 June

May

July

Aug.

May

Sept.

June

One-day revisit sensor 14

90%

c)

95%

Cloud-free Comp. (count)

Cloud-free Comp. (count)

12

100%

10 8 6 4

14

90%

12

95%

10

100%

d)

4 2 0

May

June

July

Aug.

May

Sept.

June

One-day revisit sensor 14 90% 10

14

e)

12

95%

10

100%

8 6 4 2

Aug.

Sept.

90%

f)

95% 100%

8 6 4 2

0

0 May

June

July

Aug.

May

Sept.

June

One-day revisit sensor 90%

12

95%

10

100%

July

Aug.

Sept.

One-day revisit sensor 14

g)

12 Cloud-free Comp. (count)

14

July

One-day revisit sensor

Cloud-free Comp. (count)

12 Cloud-free Comp. (count)

Sept.

6

0

Cloud-free Comp. (count)

Aug.

8

2

10

8 6 4 2

90%

h)

95% 100%

8 6 4 2

0

0 June

May

July

Aug.

Sept.

May

June

One-day revisit sensor 14

90%

12

95%

10

100%

July

Aug.

Sept.

One-day revisit sensor 14

i)

12 Cloud-free Comp. (count)

Cloud-free Comp. (count)

July

One-day revisit sensor

10

8 6 4 2

90%

j)

95% 100%

8 6 4 2

0

0 May

June

July

Aug.

One-day revisit sensor

Sept.

May

June

July

Aug.

Sept.

One-day revisit sensor

Fig. 4. Availability of cloud-free image composites for sensors with 1-day revisit interval and with various cloud-free percentage thresholds (a) British Columbia, (b) Alberta, (c) Saskatchewan, (d) Manitoba, (e) Ontario, (f) Quebec, (g) New Brunswick, (h) Nova Scotia, (i) Prince Edward Island, and (j) Newfoundland and Labrador.

F. Zhou, A. Zhang / International Journal of Applied Earth Observation and Geoinformation 21 (2013) 17–31

when the cloud-free threshold is reduced by 5% from 100% to 95%; however, the same level of reduction from 95% to 90% yields much less leaps for all the provinces and all the months. Fig. 2 provides some evidence for the explanation. Conceivably, this is due to the fact that some areas are covered with cloud almost all month around, making it impossible to obtain a 100% cloud-free image composite even with a one-day revisiting sensor. A relaxation of the cloud-free thresholds by 5% may have the effect of ignoring these cloudy pixels, pushing the composite availability to a much higher level. Fig. 2 provides supportive evidence for this explanation. As shown in Fig. 2a, d, f and g, curves for May and July in BC, September in Manitoba, May and September in Quebec, and May in Newfoundland are extremely right skewed, indicating that some areas (as captured by some pixels) barely have a single cloud-free day, that is, they are month-around cloud covered. Beside the spatial variations, Fig. 4 also shows the temporal variations. For British Columbia, Quebec and Newfoundland, summer months (June, July and August) are more advantageous for cloud-free image compositing; for Maritime provinces, August and September are the best months; and for Ontario and the Prairie provinces, the best timing to carry out the composting task is June and July. On average, May is the least favourable for compositing for British Columbia, Alberta, Quebec, New Brunswick, Nova Scotia, and Prince Edward Island; and for the rest of the provinces, the least favourable month is September. 3.3. Availability of cloud-free composites at longer revisit intervals Using the algorithm also described in Section 2.5, we further simulated the numbers of cloud-free image composites that can be captured by sensors with revisiting intervals ranging from 2 to 16 days – the range representing the revisit capability of the most of popular optical sensors available. Figs. 5 and 6 exhibit the simulated numbers of cloud-free composites with a revisit interval ranging from 1 to 16 days, for May (the least availability) and August (the most availability), respectively. Overall, the simulation reveals some general patterns of the availability of cloud-free composites with changing sensor revisiting intervals for the study areas. As expected, the number of cloud-free composites available decreases with the increase of the revisit interval. In general, the number of cloud-free composites obtainable by sensors with various revisiting intervals is a simple function of the revisit interval and the availability of cloud-free composites for a 1-day revisiting sensor as the basis. This is evident for a case when clouds are the same every day and when there is no spatial overlap of clouds from one day to another before a composite is produced. In particular, if the availability of cloud-free composites for the area is m with 1-day revisit sensors, the number of cloud-free composites (cfc) that may be acquired with a revisit interval n would be: cfc =

m . n

(1)

However, the results (Figs. 5 and 6) show some deviations from Eq. (1). These deviations are apparent for all the provinces throughout the season. The relationship of the number of cloudfree composites with both 90% and 95% cloud-free thresholds and the revisit frequency shows a (fitted) power function as cfc = a × nb

(2)

where a, b: power function constant; and n: sensor revisit interval in days

25

As evident from Figs. 5 and 6, the values of the constants a and b are subject to the spatial and temporal characteristics of cloud dynamics for the study regions; it also depends on the threshold of cloud tolerance in the composites. Eq. (2) is empirical and based on the outputs of the regions under the study. Likely, the power function like Eq. (2) is not universal and other types of fitting functions may be more suitable for the data of other regions. Since Eq. (2) is based on the averaged value, it subjects to uncertainty. The uncertainty of cloud-free composite availability is discussed in the section below. In general, as suggested by Eq. (2), the difference of cloud-free composite availability between sensors with different revisiting frequencies is smaller than that described by Eq. (1). This can be explained by the fact that although in theory a 1-day revisit sensor has twice the chance to capture a cloud-free image than that of a 2-day revisiting sensor, cloud-free areas in consecutive days could overlap, making the 1-day revisit sensor unable to realize its full potential. The same situation occurs with the relationships between larger than 2-day revisiting sensors. Eq. (2), applicable to compositing with cloud-free percentage thresholds of 90% and 95%, explains 98–99% of the relationship of cloud-free composites and revisiting frequency of a sensor. Fig. 7 shows the fitted power function to the simulated monthly distribution of the available number of cloud-free composites with various sensor revisit intervals in August with 95% cloud-free threshold. However, the equation does not fit to all the trends of composite availability for 100% cloud-free requirement in compositing. The deviation from Eq. (2) is largely due to the constraint of no single cloudy pixel allowed in a composite. Conceivably, even if there is only one single pixel covered by cloud all month-around, no cloud-free composite can be generated even with a 1-day revisit sensor. In this situation, there is no much difference between the numbers of composites obtainable from a 1-day revisit sensor and a sensor with a few day revisit frequency, which is the case for the provinces of British Columbia, Quebec, and Newfoundland in May. Looking into the results with 95% cloud-free threshold in Figs. 5 and 6, we can find that continental interior areas (Prairie provinces, Ontario) share a similar pattern: with a 1-day revisiting sensor, August has the largest opportunity to acquire cloud-free composites (5–6 composites); and the least opportunity (∼4) occurs in May for Alberta and Saskatchewan and in September for Manitoba and Ontario (Fig. 5). As a sensor’s revisit interval increases, the number of cloud-free composites available drops: only about 3 composites are obtainable in most months with a 2day revisit sensor, and about 2 composites with a 3-day revisit sensor. For the most cloudy regions (British Columbia, Quebec, Newfoundland), the best case (1-day revisit) is about 2–3 cloudfree composites per month except in August for British Columbia (4 cloud-free composites), and the number gradually approaches 1 as the revisit interval increases to 4 days or more. The Maritime provinces have the highest opportunity for obtaining cloud-free composites (6–8) for a growing season month with a 1-day revisiting sensor; and the number drops to about 3 when the revisit interval increases to 4 days. Comparing Figs. 5 and 6, it is obvious that for most of the provinces more cloud-free composites are available in August (Fig. 6) than in May (Fig. 5); the difference is in the range of 1–3, depending on the location and the sensor’s revisit interval. This is consistent with the climatology in general: the relative humidity is higher in spring than in late summer. The spatial resolution of images with different revisit interval is not factored in the simulation due to having only 1 km spatial resolution of cloud cover product. A sensor with a longer revisiting interval generally means a higher spatial resolution. The effect of spatial resolution on cloud-free composition will be discussed in Section 3.6.

F. Zhou, A. Zhang / International Journal of Applied Earth Observation and Geoinformation 21 (2013) 17–31

10 9 8 7 6 5 4 3 2 1 0

a)

90% 95% 100%

1

2

3

4

8

12

Cloud-free comp. (count)

Cloud-free comp. (count)

26

10 9 8 7 6 5 4 3 2 1 0

16

b)

90% 95% 100%

1

10 9 8 7 6 5 4 3 2 1 0

c)

90% 95% 100%

1

2

3

4

8

12

10 9 8 7 6 5 4 3 2 1 0

16

100%

4

8

12

Cloud-free comp. (count)

Cloud-free comp. (count)

95%

3

10 9 8 7 6 5 4 3 2 1 0

16

100%

4

8

12

10 9 8 7 6 5 4 3 2 1 0

100%

3

4

8

12

16

Cloud-free comp. (count)

Cloud-free comp. (count)

95%

Sensor revisit interval (days)

12

16

95% 100%

2

3

4

8

12

16

h)

90% 95% 100%

2

3

4

8

12

16

Sensor revisit interval (days)

90%

2

8

90%

1

16

i)

1

4

f)

Sensor revisit interval (days) 10 9 8 7 6 5 4 3 2 1 0

3

Cloud-free comp. (count)

Cloud-free comp. (count)

95%

3

2

Sensor revisit interval (days)

90%

2

16

100%

1

g)

1

12

95%

Sensor revisit interval (days) 10 9 8 7 6 5 4 3 2 1 0

8

Sensor revisit interval (days)

90%

2

4

90%

1

e)

1

3

d)

Sensor revisit interval (days) 10 9 8 7 6 5 4 3 2 1 0

2

Sensor revisit interval (days)

Cloud-free comp. (count)

Cloud-free comp. (count)

Sensor revisit interval (days)

10 9 8 7 6 5 4 3 2 1 0

j)

90% 95% 100%

1

2

3

4

8

12

16

Sensor revisit interval (days)

Fig. 5. The number and uncertainty of ‘cloud-free’ composites simulated for May with various cloud-free percentage thresholds and sensor revisit intervals (a) British Columbia, (b) Alberta, (c) Saskatchewan, (d) Manitoba, (e) Ontario, (f) Quebec, (g) New Brunswick, (h) Nova Scotia, (i) Prince Edward Island, and (j) Newfoundland and Labrador.

F. Zhou, A. Zhang / International Journal of Applied Earth Observation and Geoinformation 21 (2013) 17–31

14

a)

14

b)

12

10 8 90%

6

95% 4

100%

Cloud-free comp. (count)

Cloud-free comp. (count)

12

10 8 90%

6

95% 4

2

100%

2

0

0 Sensor revisit interval (days)

Sensor revisit interval (days)

c)

14

14

10 8 90%

6

95% 4

100%

Cloud-free comp. (count)

Cloud-free comp. (count)

d)

12

12

10 8 90%

6

95%

4

2

100%

2

0

0 Sensor revisit interval (days)

14

Sensor revisit interval (days) 14

e)

f)

12

10 8 90%

6

95% 4

100%

Cloud-free comp. (count)

Cloud-free comp. (count)

12

10

2

8 90%

6

95% 4

100%

2

0

0 Sensor revisit interval (days)

Cloud-free comp. (count)

12

Sensor revisit interval (days)

h)

14

g)

12

10

Cloud-free comp. (count)

14

10

8 90%

6

95%

4

100%

8 90%

6

95% 4

100%

2

2

0

0 Sensor revisit interval (days)

Sensor revisit interval (days)

14

i)

12 10 8 90%

6

95% 4

100%

2 0

Cloud-free comp. (count)

14 Cloud-free comp. (count)

27

j)

12 10 8 90%

6

95% 4

100%

2 0

Sensor revisit interval (days)

Sensor revisit interval (days)

Fig. 6. The number and uncertainty of ‘cloud-free’ composites simulated for August with various cloud-free percentage thresholds and sensor revisit intervals (a) British Columbia, (b) Alberta, (c) Saskatchewan, (d) Manitoba, (e) Ontario, (f) Quebec, (g) New Brunswick, (h) Nova Scotia, (i) Prince Edward Island, and (j) Newfoundland and Labrador.

F. Zhou, A. Zhang / International Journal of Applied Earth Observation and Geoinformation 21 (2013) 17–31

10 9 8 7 6 5 4 3 2 1 0

10 9 8 7 6 5 4 3 2 1 0

a) 95% cloud-free Power (95% cloud-free) y = 3.716x-0.666 R2 = 0.988

2

0

4

6

8

10

12

14

b)

Cloud-free comp. (count)

Cloud-free comp. (count)

28

16

95% cloud-free Power (95% cloud-free) y = 5.293x-0.757 R2 = 0.987

2

0

Sensor revisit interval (days)

Cloud-free comp. (count)

c) 95% cloud-free Power (95% cloud-free) y = 4.976x-0.739 R2 = 0.993

2

0

4

6

8

10

12

14

10 9 8 7 6 5 4 3 2 1 0

16

95% cloud-free Power (95% cloud-free) y = 4.751x-0.739 R2 = 0.990

0

2

e) 95% cloud-free Power (95% cloud-free) y = 5.117x-0.753 R2 = 0.984

0

2

4

6

8

10

12

14

16

10 9 8 7 6 5 4 3 2 1 0

Power (95% cloud-free) y = 2.725x-0.614 R2 = 0.989

0

2

Power (95% cloud-free) y = 6.432x-0.809 R2 = 0.994

2

0 10 9 8 7 6 5 4 3 2 1 0

4 6 8 10 12 14 Sensor revisit interval (days)

Cloud-free comp. (count)

Power (95% cloud-free) -0.859

y = 7.958x R2 = 0.997

4 6 8 10 12 14 Sensor revisit interval (days)

95% cloud-free Power (95% cloud-free) y = 6.537x-0.815 R2 = 0.995

2

4

6

8

10

12

14

16

Sensor revisit interval (days)

95% cloud-free

2

16

h)

0

16

i)

0

10 9 8 7 6 5 4 3 2 1 0

4 6 8 10 12 14 Sensor revisit interval (days)

Cloud-free comp. (count)

95% cloud-free

16

95% cloud-free

16

10 9 8 7 6 5 4 3 2 1 0

j) 95% cloud-free

Cloud-free comp. (count)

Cloud-free comp. (count)

g)

14

f)

Sensor revisit interval (days) 10 9 8 7 6 5 4 3 2 1 0

4 6 8 10 12 Sensor revisit interval (days)

Cloud-free comp. (count)

Cloud-free comp. (count)

Sensor revisit interval (days) 10 9 8 7 6 5 4 3 2 1 0

16

d)

Cloud-free comp. (count)

10 9 8 7 6 5 4 3 2 1 0

4 6 8 10 12 14 Sensor revisit interval (days)

Power (95% cloud-free) y = 2.819x-0.648 R2 = 0.990

0

2

4

6

8

10

12

14

16

Sensor revisit interval (days)

Fig. 7. The fitted power function of the averaged monthly number of cloud-free (95%) composites (August 2000–2008) and various sensor revisit intervals. (a) British Columbia, (b) Alberta, (c) Saskatchewan, (d) Manitoba, (e) Ontario, (f) Quebec, (g) New Brunswick, (h) Nova Scotia, (i) Prince Edward Island, and (j) Newfoundland and Labrador.

F. Zhou, A. Zhang / International Journal of Applied Earth Observation and Geoinformation 21 (2013) 17–31

3.4. Uncertainty of cloud-free composite availability In designing or choosing earth observation systems using optical satellite sensors, it is important to consider both average, and inter-annual variability or uncertainty. Based on the 2000–2008 MODIS cloud fraction daily data within the growing seasons, we computed this variability, as represented by standard deviation, for each region under study and each month in the growing seasons during 2000–2008. The results are also shown in Figs. 4–6, represented by the error bars. As evident from these figures, the uncertainty due to interannual variability associated with cloud-free composite availability is considerable despite some variations among months and regions. In general, the results suggest that, in absolute terms, the larger the average availability of the cloud-free composites, the greater its variability, and vice versa. Prince Edward Island and New Brunswick have the highest monthly average of ∼8 and ∼7 (95% threshold in August), respectively; meanwhile, they have the largest standard deviation of 1.18 and 1.27, respectively. In comparison, the availability for Quebec and Newfoundland is ∼3 and their standard deviation is much lower as 0.46 and 0.24, respectively. In relative terms (i.e. standard deviation as percentage of the mean), however, a lower composite availability generally appears to correspond to a higher variability; this is especially true for the province of British Columbia, as shown in Fig. 4. An exception to the relative higher uncertainty with a low mean availability is the province of Newfoundland, which has the least temporal variation with a small number of composites available for all the growing season months. The general uncertainty patterns, in both the absolute and relative terms, are found applicable to all the situations with cloud-free thresholds of 90%, 95% or 100%. The magnitude of the uncertainties suggests that it is important to factor the uncertainty in designing a monitoring system based on optical satellite sensors, especially for the regions with low availability of cloud-free composites, or the sensors with large revisiting intervals. 3.5. Cloud-free threshold effect A cloud-free threshold determines the maximum percent (100% minus the threshold) of the cloudy pixels that is allowed in a composite. It has been touched upon in Section 3.2, based on a 1-day revisiting sector, that a higher percent of the cloud-free threshold would result in a composite with more useable pixels but with a lower availability of such composites; and vice versa. Figs. 5 and 6 show the comparison of the simulated numbers of cloud-free composites with various sensor revisiting intervals, for the three cloud-free thresholds – 90%, 95%, and 100%. Only the results for May and August are presented due to space constraints. These figures demonstrate that cloud-free threshold has a big impact on the availability of cloud-free composites for all regions throughout the growing season. This impact is particularly significant for sensors with shorter revisit intervals. For the sensors with 1–3 day revisiting intervals, the availability of cloud-free composites with 100% cloud-free threshold is significantly lower than that with either 95% or 90% cloud-free threshold, regardless of months and regions. This difference, however, diminishes quickly as the revisiting interval of sensors increases. As evident from these figures, there is barely a difference when the interval reaches 8 days or above. As discussed in Section 3.2, for 1-day revisiting sensors, a change in cloud-free threshold from 100% to 95% yields a much larger increase in cloud-free composite availability than a change from 95% to 90%. This non-linear relationship between the threshold and the availability, as shown in Figs. 5 and 6, holds well for sensors with 1- or 2-day revisiting intervals; still holds for some regions as the sensors’ revisiting interval increases to 3- or 4-days; and largely disappears as the interval further increases. This is

29

likely due to the same reason as for the non-linear relationship, as explained in Section 3.2. Specifically, some areas may have cloud coverage as persistent as, e.g., 25 days in a month, making any sensor with a revisiting interval of 6 days (for the month of June and September) or 7 days and up (for other months) extremely difficult to assemble a cloud-free composite; this difficulty may not be worsen by further extension of sensor revisiting intervals. The increase in composite availability by relaxing cloud-free threshold is at the price of compromising the “cloud-free” state of the resultant image composites: the lower the threshold, the higher the cloud contamination in a composite. The choice of the cloud-free threshold is therefore another factor for consideration in designing a monitoring system with optical satellite sensors. While a more thorough sensitivity analysis yields an optimal threshold, our experiment with the three thresholds (90%, 95%, and 100%) for sensors with various revisiting intervals suggests that 95% is a preferable threshold in general for the study regions. For Canada’s southern provinces in August, this threshold would, on average, generate 3–8 composites for sensors with daily revisiting sensors, about 2–5 composites for two-day revisiting sensors, and 1–2 composites for 4-day revisiting sensors. As the revisiting interval increases, the composite availability decreases, but noticeably, difference in availability due to different cloud-free thresholds diminishes. 3.6. Spatial resolution effect The simulation of cloud-free composite availability was based on the binary cloud cover images derived from the MODIS MOD06 L2 cloud-fraction products. The binary images represent cloud cover conditions at one point in the diurnal period and over an area of 1 km × 1 km in size. The mid-morning Terra overpass is situated in a period with favourable combination of illumination (local solar angle sufficiently high) and generally low percentage of cloud cover (especially cumulus clouds from mesoscale circulation). For the same reason, many optical satellite missions at various spatial resolutions choose the same overpass window. However, most present and future optical sensors of interest for studies of earth surface dynamics, such as Landsat, SPOT, AWiFS, offer spatial resolutions considerably higher than 1 km × 1 km. It is thus a question whether one can apply the results of this study to the higher resolution cases. An analysis at the pixel level below reveals that the simulated availabilities of cloud-free image composites as a result of this study provide reasonable estimates for optical sensors with higher than 1 km × 1 km resolution, though the closer to 1 km × 1 km resolution the optical sensor, the more pertinent the application. The MODIS MOD06 L2 cloud fraction product, from which the 1 km × 1 km cloud cover binary images were derived, was generated, by a MODIS science team, based on the MODIS MOD35 cloud mask product (Platnick et al., 2003). With a spatial resolution of 1 km × 1 km, the cloud mask has 4 possible values for each pixel: cloudy, probably cloudy, probably clear, and confident clear. It is reasonable to apply the values of confident clear and cloudy directly to higher resolutions pixels for the same area captured by higher resolution optical sensors. The question lies in the pixels of the cloud mask with values of probably clear and probably cloudy. An overview of the cloud mask product suggests that a big portion of southern Canada provinces are cloud-covered daily; and that the clouds are mostly highly concentrated (i.e. clustered). The probably clear and probably cloudy pixels are most likely corresponding to the edging areas of the clustered clouds, which, in quantity, would be much smaller than the number of confident clear and cloudy pixels. The MODIS science team used complex algorithms to classify these pixels in accounting cloud fractions. While the details of these algorithms are beyond the scope of this analysis, it is

30

F. Zhou, A. Zhang / International Journal of Applied Earth Observation and Geoinformation 21 (2013) 17–31

reasonable to ascertain that their approach has resulted in unbiased estimates of cloud fractions. It is reasonable to assume that the edging pixels have a normal distribution of cloud cover percentage; the generalization effect would therefore result in unbiased cloud fraction accounting. The error effect, as a result of a rigorous scientific study, should be random in nature, hence self-cancelled out when looking at an aggregated level, such as a large region. Retaining this unbiased property of the cloud fraction product was the primary goal of the algorithm developed during this study to downscale the product to cloud cover binary images at the 1 km × 1 km resolution (Section 2.3). The resultant 1 × 1 binary images hence carry on the generalization and random error effects pertaining to the cloud fraction product. When used in simulating the availability of cloud-free image composites, the 1 km × 1 km cloud cover binary pixels have the following possibilities in relation to higher resolution pixels: (a) the pixels represent the clear/cloudy states of the MODIS MOD35 cloud mask pixels, hence applicable to higher resolution pixels directly; (b) the pixels with cloudy state cover partial clear conditions, which would contribute to an underestimation of cloud-free composite availability; and (c) the pixels with clear state cover partial cloudy conditions, which would contribute to an overestimation of cloud-free composite availability. The unbiased nature of the cloud cover binary images implies that the overestimations and underestimations would be self-cancelled when aggregated to a higher level, such as the provinces in this case. It is hence our opinion that the methodology presented in this paper is not highly sensitive to spatial resolutions. While it is ideal to use cloud binary images of the same resolution (when they are available) as representing the compositing images for estimating cloud-free composite availability, the methodology presented in this paper provides a simplified alternative when the situations are not ideal, which is often the case in practice.

4. Conclusions This paper introduced a methodology for estimating availability of cloud-free image composites for optical sensors with various revisiting intervals, based on the daily MODIS MOD06 L2 cloud fraction products from 2000 to 2008. The methodology starts with downscaling of the cloud fraction product to 1 km × 1 km cloud cover binary images. These images are then used for the exploration of spatial and temporal characteristics of cloud dynamics, and subsequently for the simulation of cloud-free image composite availability with various revisiting intervals of optical earth observation systems. Using Canada’s southern provinces as an application case, the study explored several factors important for the design of earth observation monitoring systems using optical sensors, in particular, cloud dynamics, sensors’ revisiting interval, cloud-free threshold for targeting composites, inter-annual variability of cloud dynamics. While the cloud cover images used in the analysis are at 1 km × 1 km resolution, our analysis suggests that the simulated numbers of cloud-free image composites available may also provide reasonable estimates for optical sensors with higher than 1 km × 1 km resolution, though the closer to 1 km × 1 km resolution the optical sensor, the more pertinent the application. The methodology may be parameterised to different temporal period and different spatial region, depending on applications. The study revealed that, overall, southern Canada is heavily clouded during growing season months, with British Columbia, Quebec and Newfoundland being the most cloudy (Group 1), Ontario and the Prairie Provinces in the middle (Group 2), and Maritime Provinces the least (Group 3). With 1-day revisiting optical sensors and the 100% cloud-free criterion, the average chance of obtaining cloud-free image composites during a growing season month is less than 1 for Group 1, less than 2 for Group 2, and less

than 4 for Group 3. This chance would further diminish as the sensor’ revisiting interval increases – 2-day revisiting interval would reduce Group 3’s average chance to 1, and 4-day revisiting interval would bring down Group 2’s chance also to 1. When accounting for inter-annual variability, even 1 composite cannot be guaranteed. To strike a balance between composite quantity and quality, 95% cloud-free threshold for the composites is found a preferred choice. The permission of 5% cloudy pixels in the composites could double or even tripled the chance for sensors with less than 4-day revisiting intervals; once the revisiting interval increases above 4 days, however, this level of cloud tolerance would not make a significant difference except for the Maritime Provinces, in which this difference becomes insignificant only when the revisiting interval increases to 8 days or longer. The significant spatial and temporal variability of the availability of cloud-free image composites of the studied regions attested our initial proposition that it will be greatly beneficial if assessment of cloud-free image composite availability is conducted before an operational monitoring system of remote sensing is designed. The method introduced in this paper can be used for this purpose, and can be applied to any regions and any time-frame. The spatial and temporal information of cloud-free image composite availability at the regional level would find a number of applications such as forest fire monitoring and modelling, albedo mapping, leaf area index generation, vegetation phenology study, and others. With the help of the information, researchers are able to choose the optimal combination of earth observation system, time frame, and cloud-free threshold for image compositing to meet their monitoring needs. Acknowledgements The authors would like to thank Drs. Josef Cihlar, Richard Fernandes, and Junhua Li for their valuable comments that have improved this manuscript considerably. The authors wish to thank the anonymous reviewers for their insightful critiques, which led to significant improvements of the manuscript. This research is partially financed by Canadian Space Agency through the Government Related Initiatives Program. NASA and GeoGratis are acknowledged for the provision of input data. References Abuelgasim, A.A., Leblanc, S.G., 2011. Leaf area index mapping in northern Canada. International Journal of Remote Sensing 32 (18), 5059–5076. Ackerman, S.A., Strabala, K.I., Menzel, W.P., Frey, R.A., Moeller, C.C., Gumley, L.E., Baum, B.A., Schaaf, C., Riggs, G., 1997. Discriminating clear-sky from cloud with MODIS algorithm theoretical basis document (MOD35). EOS ATBD web site, 125 pp. ftp://eospso.gsfc.nasa.gov/ATBD/REVIEW/MODIS/ATBD-MOD-06/atbdmod-06.pdf. Arvidson, T., Gasch, J., Goward, S.N., 2001. Landsat 7’s long-term acquisition plan—an innovative approach to building a global imagery archive. Remote Sensing of Environment 78 (1–2), 13–26. Bindschadler, R., Vornberger, P., Fleming, A., Fox, A., Mullins, J., Binnie, D., Paulsen, S.J., Granneman, B., Gorodetzky, D., 2008. The Landsat image mosaic of Antarctica. Remote Sensing of Environment 112 (12), 4214–4226. Boé, J., Terray, L., Habets, F., Martin, E., 2007. Statistical and dynamical downscaling of the Seine basin climate for hydro-meteorological studies. International Journal of Climatology 27, 1643–1655. Buriez, J.C., Bonnel, B., Fouquart, Y., Geleyn, J.F., Morcrette, J.J., 1988. Comparison of model-generated and satellite-derived cloud cover and radiation budget. Journal of Geophysical Research 93 (D4), 3705–3719. Chen, J.M., 1996. Optically-based methods for measuring seasonal variation of leaf area index in boreal conifer stands. Agricultural and Forest Meteorology 80 (2–4), 135–163. Cihlar, J., 1996. Identification of contaminated pixels in AVHRR composite images for studies of land biosphere. Remote Sensing of Environment 56 (3), 149–163. Cihlar, J., Ly, H., Xiao, Q., 1996. Land cover classification with AVHRR multichannel composites in northern environments. Remote Sensing of Environment 58 (1), 36–51. Deneke, H.M., Feijt, A.J., Roebeling, R.A., 2008. Estimating surface solar irradiance from METEOSAT SEVIRI-derived cloud properties. Remote Sensing of Environment 112 (6), 3131–3141.

F. Zhou, A. Zhang / International Journal of Applied Earth Observation and Geoinformation 21 (2013) 17–31 Dietz, A.J., Kuenzer, C., Gessner, U., Dech, S., 2012. Remote sensing of snow—a review of available methods. International Journal of Remote Sensing 33 (July (13)), 4094–4134. Flowler, H.J., Blenkinsop, S., Tebaldi, C., 2007. Linking climate change modelling to impacts studies: recent advances in downscaling techniques for hydrological modelling. International Journal of Climatology 27, 1547–1578. Fontana, F.M.A., Coops, N.C., Khlopenkov, K.V., Trishchenko, A.P., Riffler, M., Wulder, M.A., 2012. Generation of a novel 1 km NDVI data set over Canada, the northern United States, and Greenland based on historical AVHRR data. Remote Sensing of Environment 121 (June), 171–185. Greaves, J.R., Sherr, P.E., Glaser, A.H., 1970. Cloud cover statistics and their use in the planning of remote sensing missions. Remote Sensing of Environment 1 (2), 95–101. Gutman, G.G., Ignatov, A.M., Olson, S., 1994. Towards better quality of AVHRR composite images over land: reduction of cloud contamination. Remote Sensing of Environment 50 (2), 134–148. Hewitson, B.C., Crane, R.G., 1996. Climate downscaling: techniques and application. Climate Research 7 (November), 85–95. Ju, J., Roy, D.P., 2008. The availability of cloud-free Landsat ETM+ data over the conterminous United States and globally. Remote Sensing of Environment 112 (3), 1196–1211. King, M.D., Tsay, S.-C., Platnick, S.E., Wang, M., Liou, K.-N., 1997. Cloud retrieval algorithms for MODIS: optical thickness, effective particle radius, and thermodynamic phase. MODIS Algorithm Theoretical Basis Document No. ATBD-MOD-05, MOD06 – Cloud product (23 December 1997, version 5), 79 pp. Li, Z., Cihlar, J., Fraser, R.H., Khananian, A., 1999. Remote sensing of Canadian boreal forest fires: hotspots, burned area, and smoke plumes. In: Proceedings of the 1999 IEEE International Geoscience and Remote Sensing Symposium (IGARSS’99) ‘Remote Sensing of the Systems Earth—A Challenge for the 21st Century’, vol. 4, Hamburg, Germany, 28 June 1999 to 2 July 1999, pp. 2241–2243. le Maire, G., Marsden, C., Verhoef, W., Ponzoni, F.J., Lo Seen, D., Bégué, A., Stape, J.-L., Nouvellon, Y., 2011. Leaf area index estimation with MODIS reflectance time series and model inversion during full rotations of Eucalyptus plantations. Remote Sensing of Environment 115 (February (2)), 586–599. Leblon, B., 2005. Monitoring forest fire danger with remote sensing. Natural Hazards 35 (July (3)), 343–359. Luo, Y., Trishchenko, A.P., Khlopenkov, K.V., 2008. Developing clear-sky, cloud and cloud shadow mask for producing clear-sky composites at 250-meter spatial

31

resolution for the seven MODIS land bands over Canada and North America. Remote Sensing of Environment 112 (12), 4167–4185. Marshall, G.J., Dowdeswell, J.A., Rees, W.G., 1994. The spatial and temporal effect of cloud cover on the acquisition of high quality Landsat imagery in the European Arctic sector. Remote Sensing of Environment 50 (2), 149–160. Platnick, S., King, M.D., Ackerman, S.A., Menzel, W.P., Baum, B.A., Riedi, J.C., Frey, R.A., 2003. The MODIS cloud products: algorithms and examples from Terra. IEEE Transactions on Geoscience and Remote Sensing 41 (2), 459–472. Salathé Jr., E.P., Mote, P.W., Wiley, M.W., 2007. Review of scenario selection and downscaling methods for the assessment of climate change impacts on hydrology in the United States pacific northwest. International Journal of Climatology 27, 1611–1621. Sano, E.E., Ferreira, L.G., Asner, G.P., Steinke, E.T., 2007. Spatial and temporal probabilities of obtaining cloud-free Landsat images over the Brazilian tropical savanna. International Journal of Remote Sensing 28 (12), 2739–2752. Schroeder, W., Csiszar, I., Morisette, J., 2008. Quantifying the impact of cloud obscuration on remote sensing of active fires in the Brazilian Amazon. Remote Sensing of Environment 112 (2), 456–470. Sunar, F., Özkan, C., 2001. Forest fire analysis with remote sensing data. International Journal of Remote Sensing 22 (August (12)), 2265–2277. Wagner, F., Reuder, J., 2005. Cloud frequency with respect to remote sensing applications: example of Bavaria, southern Germany. International Journal of Remote Sensing 26 (21), 4733–4745. Wilby, R.L., Wigley, T.M.L., 1997. Downscaling general circulation model output: a review of methods and limitations. Progress in Physical Geography 21 (4), 530–548. Xu, C.Y., 1999. From GCMs to river flow: a review of downscaling methods and hydrologic modelling application. Progress in Physical Geography 23 (2), 229–249. Zhao, H., Fernanders, R., 2009. Daily snow cover estimation from advanced very high resolution radiometer polar pathfinder data over Northern Hemisphere land surfaces during 1982–2004. Journal of Geophysical Research D: Atmospheres 114 (March (5)), http://dx.doi.org/10.1029/2008JD011272. Zhang, Q., Pavlic, G., Chen, W., Latifovic, R., Fraser, R., Cihlar, J., 2004. Deriving stand age distribution in boreal forests using SPOT VEGETATION and NOAA AVHRR imagery. Remote Sensing of Environment 91 (June (3–4)), 405–418.