REMOTE SENSING OF ENVIRONMENT
10:175-184(1980)
Monitoring Land-Cover Change by Principal Component Analysis of Multitemporal Landsat Data
G.F. BYRNE, P.F. CRAPPER, and K.K. MAYO Division of Land Use Research, CSIRO, P.O. Box 1666, Canberra City, A.C.T. 2601, Australia
Two four-channel Landsat scenes of the same area, which were recorded on different dates, were superimposed and treated as a single eight-dimensional (channel) data array. Principal component analysis (PCA) of this array resulted in the gross differences associated with overall radiation and atmospheric changes appearing in the major component images and statistically minor changes associated with local changes in land cover appearing in the minor component images.
Introduction
It has been shown that all the spectral information in the four data channels of a Landsat multispectral scanner (MSS) image can be very well represented by two-dimensional or, at the most, three-dimensional data because the channels are highly correlated (Kauth and Thomas, 1976; Wheeler and Misra, 1976). In the more common applications, such as land-cover mapping, all such information has equal potential significance and color composite picture production systems are frequently designed to display all the available information (Kaneko, 1978). Principal component analysis, a powerful method of analyzing correlated multidimensional data, can be used in such systems to facilitate the visual interpretation of a mass of data having uniform a priori significance, by reducing redundancy, i.e., correlation between channels. Information which appears in more than one channel is treated as being of no less potential value than any other. The procedure incidentally renders it susceptible to presentation with a three-channel optical system.
©Elsevier North Holland Inc., 1980, 52 Vanderbilt Ave., New York, NY 10017
When two four-channel images of the same area are obtained at different dates and the eight channels are compared with a view to monitoring temporal or seasonal change, there will be high correlation between the two images. It is those parts of the scene which show an absence of correlation which are of interest because they represent areas of change. Here, there is an additional purpose in identifying redundancy and that is to separate it out as "noise", obscuring change. The more usual ways of identifying changed and unchanged areas are ratioing and/or differencing of images, sometimes with pre- or post-classification (e.g., Weismiller et al., 1977). However, to use these approaches, restrictive assumptions must be made as to uniformity of interchannel and interpixel effects of the atmospheric and radiation changes that have occurred between the taking of the two images. The proposed use of principal component analysis (PCA) in the solution of this problem involves decomposing the four-plus-four channels of correlated MSS data into eight orthogonal axes. As indicated below, the first- and second-order component images will
tend to represent unchanged land cover, with the result, atypical of most applications of the PCA technique (including that mentioned above), that it is the third and later component axes which are of interest. The same considerations as regards orthogonality of the axes apply in this study as have applied in previous applications of the PCA technique to MSS data. The correlation between the input images makes it possible to classify and hence to discard most of the variance present in the data. A prerequisite of the approach is that the bulk of the variance is associated with substantial correlated sources other than those of interest and that the change, i.e., variance, of interest is confined to a fraction of the area.
Sources of Variance
Each image data set may be regarded as a set of points in four-dimensional space. The effect of change on the position of these points in four-dimensional space may be anticipated to be as follows:
(i) Changes in atmospheric conditions might be expected to affect all points similarly. Hence, one would expect rotation about the origin and expansion or contraction to differing degrees in all four dimensions.
(ii) Changes in soil water would have a similar effect but confined to those points which represented vegetated surfaces or bare soil.
(iii) Differences between satellites and calibration procedures, and differences in sea state, should all rank high as sources of variance, extending over a significant percentage of pixels.
(iv) Changes in land cover would presumably produce a significant change in the position of the relevant points, but these would be few in number.
Thus, differences, i.e., variance between the two images, will typically be of two kinds. First, there are those that extend over a substantial part of the scene such as those arising from changes in atmospheric transmission or soil water status. These would presumably be a substantial source of variance, since they involve all or most of the pixels. Second, there are those that are restricted to parts of the scene such as the clearing of forests, construction of roads, or the erection of buildings. Variance in the latter category would be orthogonal to that of the first category. Unfortunately, differences falling into the first category are not in general readily quantifiable or predictable.
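The two kinds of variance described above can be illustrated with a deliberately small synthetic case. The sketch below is not the processing used in this study: it assumes a single channel observed on two dates, a scene-wide gain and offset standing in for atmospheric change, and a shift confined to 2% of the pixels standing in for land-cover change. All numerical values and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels = 10_000

# One channel observed on two dates (synthetic values, not Landsat data).
date1 = rng.uniform(10.0, 60.0, n_pixels)
# Scene-wide change: a gain and offset applied to every pixel, plus noise.
date2 = 0.8 * date1 + 3.0 + rng.normal(0.0, 0.5, n_pixels)
# Local change: a large shift confined to 2% of the pixels.
changed = np.zeros(n_pixels, dtype=bool)
changed[:200] = True
date2[changed] += 15.0

# Two-dimensional PCA on the variance-covariance matrix of the date pair.
pairs = np.column_stack([date1, date2])
centered = pairs - pairs.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
scores = centered @ eigvecs                # column 0: major axis, column 1: minor axis

major_share = eigvals[0] / eigvals.sum()
minor_abs = np.abs(scores[:, 1])
print(f"variance on major axis: {major_share:.1%}")
print(f"mean |minor score|, changed pixels:   {minor_abs[changed].mean():.2f}")
print(f"mean |minor score|, unchanged pixels: {minor_abs[~changed].mean():.2f}")
```

Under these assumptions nearly all of the variance falls on the major axis, which follows the scene-wide change, while the locally changed pixels stand out on the minor axis.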
Method and Results of PCA Analysis
The area chosen for the study centered on the township of Batemans Bay (35°44'S, 150°15'E) and is approximately 24,200 ha comprising 31% water. Batemans Bay is the largest town in the rapidly developing South Coast area of New South Wales. Two Landsat MSS images (see Table 1) were overlaid and registered by eye. The intricate nature of the coastline made possible overall registration accuracy of half a pixel or better, registration of two small parts of Landsat frames being somewhat easier than registration of a single Landsat image with a map based on a different projection. However, it remains a fact that no firm conclusion can be drawn as to the nature of changes indicated for isolated pixels.
TABLE 1. Data on Landsat Passes

Date of pass       Time of pass (AEST)   Sun elevation   Height of tide   Estimated soil water content*
6 November 1972    0914                  51°             1.6 m            40 mm
13 October 1975    0901                  43°             0.6 m            70 mm

Area selected for registration and study was 242 lines by 222 pixels.
*Estimated using the "WATBAL" technique (McAlpine, 1970) from recorded rainfall and tank evaporation, for the top 10 cm of soil.
TABLE 2. Results of Principal Component Analysis

MEANS AND STANDARD DEVIATIONS FOR LANDSAT DATA
Year   Channel   Mean   S.D.
1972   1         24.8    4.4
1972   2         15.4    5.8
1972   3         24.8   14.7
1972   4         25.7   18.2
1975   5         11.9    3.4
1975   6         10.8    5.8
1975   7         21.5   15.1
1975   8         22.8   17.4

EIGENVALUES WITH THEIR ASSOCIATED PERCENTAGES
Component   Eigenvalue   Percentage variance
1           1067.3       90.9
2             60.0        5.1
3             29.9        2.6
4             10.7        0.9
5              2.2        0.2
6              1.6        0.1
7              1.4        0.1
8              0.8        0.1

EIGENVECTORS (rows: components; columns: channels)
Component      1      2      3      4      5      6      7      8
1           0.06   0.12   0.44   0.54   0.04   0.11   0.45   0.52
2          -0.42  -0.46   0.11   0.41  -0.35  -0.50  -0.26  -0.01
3           0.33   0.35   0.43   0.38   0.02  -0.05  -0.37  -0.54
4           0.37   0.41   0.00  -0.24  -0.37  -0.62  -0.05   0.34
5           0.68  -0.59  -0.17   0.12   0.34  -0.17  -0.07   0.90
6          -0.04  -0.28   0.73  -0.54   0.03  -0.12   0.23  -0.16
7          -0.12   0.01   0.24  -0.17   0.20   0.23  -0.72   0.54
8           0.30  -0.25   0.03  -0.04  -0.76   0.51  -0.05   0.03

Determinant = 0.775E+08

CORRELATIONS OF CHANNELS WITH AXES (rows: components; columns: channels)
Component      1      2      3      4      5      6      7      8
1           0.41   0.66   0.98   0.98   0.38   0.05   0.98   0.98
2          -0.73  -0.01   0.06   0.17  -0.81  -0.67  -0.13  -0.01
3           0.41   0.33   0.16   0.11   0.04   0.00  -0.14  -0.17
4           0.27   0.23   0.00  -0.04  -0.36  -0.35  -0.01   0.06
5           0.23  -0.15  -0.01   0.01   0.15  -0.04  -0.01   0.01
6          -0.01  -0.06   0.06  -0.04   0.01  -0.03   0.02  -0.01
7          -0.03   0.05   0.02  -0.01   0.07   0.05  -0.06   0.04
8           0.06  -0.04   0.05   0.00  -0.20   0.08   0.05   0.05
FIGURE 1. Component-1 image.
The fourth-channel data were first multiplied by two in each case to correct for the calibration difference, because of the scale dependence of the PCA technique. The two sets of four-channel data were then combined and treated as a single eight-dimensional data set, the first four dimensions being the four channels of the first image. Principal component analysis was performed on the variance-covariance matrix, the results being given in Table 2.
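The procedure just described can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the software used in the study: the function name `multitemporal_pca`, the parameter `ch4_scale`, and the array layout (lines, pixels, channels) are hypothetical, and the two images are assumed to be already registered. Scores are formed here by projecting the raw channel values onto the eigenvectors; subtracting the channel means first would only shift each score image by a constant.

```python
import numpy as np

def multitemporal_pca(date1, date2, ch4_scale=2.0):
    """PCA of two registered four-channel images treated as eight channels.

    date1, date2: arrays of shape (lines, pixels, 4).
    Returns eigenvalues (descending), eigenvectors (rows = components,
    columns = channels), percentage variance, and component-score images
    of shape (lines, pixels, 8).
    """
    a = np.asarray(date1, dtype=float).copy()
    b = np.asarray(date2, dtype=float).copy()
    # Rescale the fourth channel of each date before analysis, since PCA on
    # the variance-covariance matrix depends on the scale of each channel.
    a[..., 3] *= ch4_scale
    b[..., 3] *= ch4_scale
    stacked = np.concatenate([a, b], axis=-1)   # first four channels = first image
    flat = stacked.reshape(-1, 8)               # one row per pixel
    cov = np.cov(flat, rowvar=False)            # 8 x 8 variance-covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order].T               # rows = components
    percent = 100.0 * eigvals / eigvals.sum()
    scores = (flat @ eigvecs.T).reshape(stacked.shape)
    return eigvals, eigvecs, percent, scores
```

With real data, `scores[..., k]` would be the image for component k + 1, and the rows of the returned eigenvector array would be read in the same way as the channel weights in Table 2.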
The first five components are displayed in Figs. 1-5, white representing a high value of the component as indicated in Table 3. The first component is bimodal, the rest being unimodal. The direction of the first component axis was very much influenced by the two groups of data points, one group being that of the land, the other of the sea areas. For the purposes of the present study, the sea data could have been removed before analysis with perhaps some improvement in clarity, but were retained in the interests of generality. An earlier processing of the data, not presented here, in which the fourth-channel data were left unchanged (i.e., not multiplied by two) demonstrated the effect.
FIGURE 2. Component-2 image.
The following characteristics of these first five components (i.e., the statistically significant ones) are evident:
(i) The first component is very highly correlated with brightness in the infrared channels 3, 4, 7, 8 and is responsible for 91% of the variance. It is apparent that this component is a measure of brightness in the infrared channels, a result which parallels those of previous studies of the four-dimensional Landsat MSS data sets (Kauth and Thomas, 1976; Kaneko, 1978).
(ii) The second component appears to be a measure of brightness in the two visible bands. Both the first and second components appear to be smoothly varying over the study area.
(iii) The third and fourth components, as is evident from the algebraic signs of their weights ("eigenvectors" in Table 2), express, as might be hoped, differences between the two images, component 3 being a measure of difference between the two images, component 4 having differences in the infrared channels reversed. These differences are spatially discontinuous, major changes often involving only a comparatively small number of pixels. The effect of the difference in tide heights is evident along the beaches and in the shoal areas. It is possible that rotation of the axes could lead to a better separation of visible and infrared changes in some cases. However, in the interests of demonstrating the simplicity of the technique, this possibility will not be explored here.
(iv) The channel weights in the component-5 image make interpretation difficult, but there is an obvious association with regrowth or forest clearing.
FIGURE 3. Component-3 image.
FIGURE 4. Component-4 image.
Calculated component scores for uniform areas of forest and ocean are also given in Table 3. These calculated scores
provide reference values on the gray images, change being indicated by departures from a mid-gray in either direction to white or black, i.e., areas which are unchanged appear mid-gray in the third- and fourth-component images. An interesting feature of the Table 3 values is that it is mainly components 1 and 2 which change in going from land to ocean.
FIGURE 5. Component-5 image.
TABLE 3. Image Data

            Component score extremes   Picture           Typical component score
Component   Minimum   Maximum          Black    White    Land     Ocean
1              2        151             48.3     87.9     66.9      7.0
2           -109         -1            -40.5     -8.5    -15.8    -18.5
3            -31         69             -2.7     23.3     14.0     13.0
4            -38         33             -3.4     10.4      3.8      5.1
5             -5         24              3.1     12.1     10.7     10.8

The component-1 image gives an excellent impression of ruggedness, suggesting that this view is typically obscured by spatially and temporally transient changes associated with single images or single channels, although there may be a pseudostereo effect associated with the superposition of images with slightly differing sun angles. Donker and Meijerink (1977) suggested that a first-component transform of a single four-dimensional scene can be used to give a good impression of "relief and dissection of terrain." This is consistent with the fact that component 1 is heavily weighted by the infrared channels which are not significantly affected by atmospheric effects, i.e., shadows appear sharp. Linear features such as roads would appear to have been emphasized in this way, although slight misregistration also emphasizes such features.
The lack of continuity occasionally evident in the gray scale, especially for the later components, arises from the limited number of integers used in the Landsat data transmission and storage and hence in the limited number of values taken on by component scores. No attempt has been made to smooth out these discontinuities, a procedure which would be of little more than cosmetic value. The eigenvalues and variance for components 6, 7, and 8 are similar (see Table 2), suggesting that they characterize a random background noise. There is, however, a spatial pattern evident in these images (not shown) in the form of a clearly visible coast outline. This fact suggests that an exhaustive analysis of the original data would necessitate a measure of the relative positions of the pixels in addition to the measures of spectral properties used in this study.
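The display convention used for the component images (white for high component values, unchanged areas near mid-gray) can be sketched as a simple stretch of component scores to gray levels. The text does not state how the stretch was performed, so the linear mapping, the 8-bit output, the function name `scores_to_gray`, and the limits in the example are all assumptions made for illustration.

```python
import numpy as np

def scores_to_gray(scores, black, white):
    """Linearly map component scores to 8-bit gray levels.

    Scores at or below `black` become 0 and scores at or above `white`
    become 255, so that high component values appear white.
    """
    scores = np.asarray(scores, dtype=float)
    fraction = (scores - black) / (white - black)
    return np.rint(255.0 * np.clip(fraction, 0.0, 1.0)).astype(np.uint8)

# Hypothetical limits, chosen so that a score of 0 falls at mid-gray.
example = scores_to_gray([-25.0, -10.0, 0.0, 10.0, 25.0], black=-10.0, white=10.0)
print(example)
```

Because the input scores take only a limited number of distinct values, the output of such a stretch would show the same stepped gray scale noted above for the later components.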
Land-Cover Changes
Land-cover changes identified by means of the black and white areas on the component-3 image (index letters on Fig. 3) have been readily correlated with known changes in land cover indicated below. It can be fairly stated that information about these changes existed before the numerical treatment described in the present paper, this area having been intensively studied by this Division of CSIRO using more traditional methods (Austin and Cocks, 1978). Features noticeable on the image include:
(a) Acacia regrowth
(b) Clearing associated with the "new" Catalina housing estate
(c) Tidal changes on mudflats
(d) Forest clearing activity
(e) Regeneration following a forest fire early in 1972
(f) Intensive logging of native forest
(g) Tidal changes on oyster farm areas
(h) Urban development
(i) Clearing and windrowing
(j) Gravel extraction activity
(k) Intensively managed forest area
Comparison of Figs. 3 and 4 shows that the changes in sign on the component-4 axis (white areas on Fig. 3) are associated with regrowth of vegetation.
Conclusion
Principal component analysis has provided an effective way of identifying areas in which change has occurred between two four-channel multispectral scanner images. In the highly variegated area of the present study, the third-, fourth-, and fifth-component axes images contained useful information on change. The first-component image provides a good impression of topography. The theoretical basis of this analysis is susceptible to further exposition in terms of the matrix manipulation involved. This will be outlined in a later paper.
Considering the substantial fraction (96%) of variance in the data which is not concerned with change and which is associated with the first- and second-component axes, this study also has implications for the bandwidth of data transmission systems for satellites designed to monitor change.
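The 96% figure can be checked directly from the eigenvalues listed in Table 2. The short calculation below simply sums those published values; the variable names are illustrative.

```python
# Eigenvalues for components 1 to 8 as listed in Table 2.
eigenvalues = [1067.3, 60.0, 29.9, 10.7, 2.2, 1.6, 1.4, 0.8]

total = sum(eigenvalues)
share_first_two = 100.0 * sum(eigenvalues[:2]) / total

print(f"components 1-2: {share_first_two:.1f}% of total variance")
print(f"components 3-8: {100.0 - share_first_two:.1f}% of total variance")
```

On these values the first two components carry about 96% of the variance, leaving roughly 4% for the components in which the change information appears.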
Prepublication access to the new CSIRO-ORSER-DISIMP image processing system and to the CSIRO Division of Computing Research's image analysis facility greatly expedited this work and the assistance of Dr. J. O'Callaghan, Dr. D. Fraser and Dr. M. Wilson and of Mr. E. O'Brien of that Division is gratefully acknowledged. Dr. M. P. Austin, Dr. D. L. Jupp and Dr. M. C. Anderson of the Division of Land Use Research contributed a substantial amount of information on land-cover change and also some useful discussion. J. R. McAlpine of this Division kindly calculated soil water balances.
References
Austin, M. P., and Cocks, K. D., eds. (1978), Land Use on the South Coast of New South Wales. CSIRO, Melbourne.
Donker, N. H. W., and Meijerink, A. M. J. (1977), Digital processing of LANDSAT imagery for maximum impression of ruggedness. ITC Journal, 1977 (4): 691-697.
Kaneko, T. (1978), Color composite pictures from principal axis components of multispectral scanner data. IBM J. Res. Dev. 22: 386-392.
Kauth, R. J., and Thomas, G. S. (1976), The tasselled cap--A graphic description of the spectral-temporal development of agricultural crops as seen by LANDSAT. Proceedings of 1976 Symposium on Machine Processing of Remotely Sensed Data, LARS, Purdue, IN.
McAlpine, J. R. (1970), Estimating pasture growth periods and droughts from simple water balance models. Proceedings of the Eleventh International Grassland Congress, Univ. Queensland Press, Brisbane.
Weismiller, R. A., Kristof, S. J., Scholz, D. K., Anuta, P. E., and Momin, S. M. (1977), Evaluation of change detection techniques for monitoring coastal zone environments. Proceedings of the Eleventh International Symposium on Remote Sensing of Environment, ERIM, Ann Arbor, MI, pp. 1229-1238.
Wheeler, S. G., and Misra, P. N. (1976), Linear dimensionality of LANDSAT agricultural data with implications for classification. Proceedings of 1976 Symposium on Machine Processing of Remotely Sensed Data, LARS, Purdue, IN.
Received 10 May 1979; revised 24 April 1980.