Extraction of Endmembers from Spectral Mixtures

Extraction of Endmembers from Spectral Mixtures

Extraction of Endmembers from Spectral Mixtures F. J. Garcı´a-Haro,* M. A. Gilabert,* and J. Melia´* L inear spectral mixture modeling (LSMM) divide...

534KB Sizes 1 Downloads 53 Views

Extraction of Endmembers from Spectral Mixtures F. J. Garcı´a-Haro,* M. A. Gilabert,* and J. Melia´*

L

inear spectral mixture modeling (LSMM) divides each ground resolution element into its constituent materials using endmembers which represent the spectral characteristics of the cover types. However, it is difficult to identify and estimate the spectral signature of pure components or endmembers which form the scene, since they vary with the scale and purpose of the study. We propose three different methods to estimate the spectra of pure components from a set of unknown mixture spectra. Two of the methods consist in different optimization procedures based on objective functions defined from the coordinate axes of the dominant factors. The third one consists in the design of a neural network whose architecture implements the LSMM principles. The different procedures have been tested for the case of three endmembers. First, were used simulated and real data corresponding to mixtures of vegetation and soil. Factors that limit the accuracy of the results, such as the number of channels and the level of data noise have been analyzed. Results have indicated that the three methods provide accurate estimations of the spectral endmembers, especially the third one. Moreover, the second method, that is based on the exploration of the mixture positions in the factor space, has demonstrated to be the most appropriate when the dimensionality of the data is reduced. Finally, this procedure was applied on a Landsat-5 TM scene. Elsevier Science Inc., 1999

INTRODUCTION The growing trend toward quantification in the Earth sciences generally involves the analysis of compositional * Departament de Termodina`mica, Facultat de Fı´sica, Universitat de Vale`ncia, Burjassot, Vale`ncia, Spain Address correspondence to F. J. Garcı´a-Haro, Dept. de Termodina`mica, Facultat de Fı´sica, Univ. de Vale`ncia, Dr. Moliner, 50, 46100 Burjassot, Vale`ncia, Spain. E-mail: [email protected] Received 6 August 1997; revised 24 November 1998. REMOTE SENS. ENVIRON. 68:237–253 (1999) Elsevier Science Inc., 1999 655 Avenue of the Americas, New York, NY 10010

measurements on a large number of materials (Full et al., 1981). One way of viewing such a data set is to consider individual samples as mixtures of a relatively small number of endmembers. Linear spectral mixture modeling (LSMM) divides each ground resolution element into its constituent materials or components using endmembers which represent the spectral characteristics of the cover types. Endmembers are the features recognizable in a scene as being meaningful for an observer, and constitute abstractions of real objects that can be regarded as having uniform properties (Strahler et al., 1986). In many problems, the number of endmembers and their composition may be unknown and the problem becomes to unmix the given compositions into a number of endmembers supported by the dimensionality of the data. The estimation of these endmembers is not straightforward since they vary depending upon the scale and purpose of the study (Milton and Emery, 1995). The endmembers are also influenced by processes such as the scattering by surface materials or other factors such as the illumination geometry, which could be considered as additional sources of noise in the modeling approach. Methods developed in literature to deduce the endmembers have limitations. Some methods extract them simply from the image using the known spectral signatures of surface dominant cover types (Bierwirth, 1990). Other techniques such as principal component (PC) analysis enclose the cloud of pixel spectra within a solid geometric figure whose number of vertex is equal to the dimensionality of the basis space (White and Drake, 1993; Bryant, 1996). However, despite that these methods guarantee the feasibility of the outcoming fractions, they assume that the true endmembers are contained within the data set, which is usually not realistic. In practice, it is difficult to find pure targets (i.e., fully covered by an unique material) at the spatial resolution of satellite sensors. For example, the “matorral” spectral class presents a pixel cover of typically less than 40% in semi-arid areas. In this case its spectral signature cannot be captured 0034-4257/99/$–see front matter PII S0034-4257(98)00115-1

238

Garcı´a-Haro et al.

within a sample set of pixels nor predicted theoretically. Other methods have been developed to solve the problem when the endmembers are not represented by the samples (external endmembers). One of them consists in the alignment of image spectra to reference endmembers by means of matching techniques that compare pixel spectra to each library spectra. It enables us to accept or reject them according to parameters introduced in an expert system (Johnson et al., 1985). Other methods that are based on the QMODEL (Full et al., 1981), fuzzy algorithms (Bezdek, 1981; Full et al., 1982), or the parallel coordinate representation (Bateson and Curtiss, 1996) provide feasible endmembers from the innate relationships between samples located as vectors in the PCs space. It is common that compositional variables are inherently correlated (e.g., vegetation and soil fractions are often negatively correlated). A large degree of dependence between variables implies that the same amount of variability can be carried by a smaller number of new variables, each of which would be mutually independent (Full et al., 1982). These abstract variables (representing a new coordinate reference axes) can be defined and related to the original ones via linear equations, operations which are termed factor analysis (FA). In this article, three new methods have been developed to identify the number and the spectral signature of the major components from a set of mixture spectra. The first method consists in deducing the best possible rotation of the factor abstract axes to “align” them upon the directions of the real component spectra. The second method is based on the position of all the objects in the reduced “factor space.” Both methods employ a minimization procedure of an objective function appropriately defined. The third method consists in the design of a neural network whose architecture implements the LSMM principles. The three methods have been validated for the case of three endmembers. First, we have used simulated and real data corresponding to mixtures of vegetation and soil. Finally, the procedures have been applied on data from the sensor Thematic Mapper (TM) of satellite Landsat-5, since this sensor is well suited for vegetation monitoring (Franklin, 1986).

from the mixtures matrix D (when the fractions matrix F is also unknown). The first step of methods 1 and 2 consists in applying the FA procedure to decompose the original data into a unknown number, c, of abstract spectra (factors) which coincides with the number of pure components [Eq. (2)]: D5R·F

where R and F constitute the abstract representation of the “real” matrices R and F: the column vectors of the matrix R;{R1, R2,..., Rc} are the independent factor loadings which contain the dominant reflecting features, and ideally, each one represents one spectral property (Huete, 1986). The procedure to compute c, R, and F is based on the diagonalization of the covariance matrix of the observations D. The procedure is different depending if the number of spectral channels, n, is greater or smaller than the number of samples, s (Garcı´a-Haro, 1997). The problem of finding the number of factors, c, is one of the least known in FA (Malinkowski and Howery, 1977). Method 1: Rotation of the Factor Spectra {Ri} The original matrix R can be expressed from its abstract matrix by means of a linear transformation T: R5R·T.

(3)

In this way, the problem (i.e., the matrices R and F) is parametrized from uniquely the matrix T. The coefficients, tuv, of matrix T cannot be computed using standard methods, since matrix R is unknown. The procedure consists in computing the coefficients of the matrix T by means of an optimization procedure which minimizes the objective function, P1, defined from the addition of the chi-square residuals of Eq. (1) plus a penalty function which is introduced to constraint the possible solutions of matrices R and F to physically allowable values (i.e., fractions and reflectances must be nonnegative and fractions must add up to 1 within each sample or pixel): rij>0 (i51,2, . . . , n, j51,2, . . . , c), fik>0 (i51,2, . . . , c, k51,2, . . . , s),

THEORY AND METHODS

c

To set up the basis of the problem, we will express the LSMM formulation. Let n be the dimensionality of the spectral observations. Assuming s mixtures of c composing endmembers, the LSMM is expressed as c

dik5 o rij fjk , j51

o fik51

(k51,2, . . . , s).

i51

c

P1(tuv)5 o

o1 s

i51 k51

(1)

where D is the mixtures spectra matrix (n3s), R is the endmembers matrix (n3c), and F is the fractions matrix (c3s). The main aim is the estimation of the matrix R

(4)

The physical sense of introducing these constraints is to guarantee endmember combinations that adequately describe the observed mixture within each pixel. Thus the objective function is expressed as

or matricially: D5R·F

(2)

2

n fik(tuv) 2 d(fik)1 o kfl i51 n

o1 1 j 1

3d(rij)1 o i5

where:

s

5

2

eij(tuv) 2 , reij

o1 c

j51

2

rij(tuv) 2 krl (5)

Extraction of Endmembers from Spectral Mixtures

• e is the residual matrix: e5D2R·F.

51,0,

d(x)5

x>0, x,0.

(6) (7)

• reij represents the uncertainty of the element eij (provided that these uncertainties were unknown, they could be considered gaussian variables, reij5√eij). We took a unique value from averaging the matrix e, which does not alter significantly the results. • k fl and krl are the average values of matrices F and R. These parameters were introduced to weight adequately the two inequality constraints of Eq. (5) (since reflectance and fraction values are different in absolute terms). These parameters provide a formal measure of the uncertainty in each element of the unknown matrices. In order to simplify the computations, they were considered constant (values 1/c and 0.15 were arbitrarily assigned to k f l and krl, respectively), since the solution is quite insensitive to small variations of these parameters. The matrices R, F, and e must be evaluated within every iteration. The matrix R was computed using Eq. (3), and the matrix F was computed by applying the LSMM using the constrained least-squares estimator (Garcı´a-Haro et al., 1996). Finally, e was computed using Eq. (6). The two first terms on the right side of Eq. (5) are distinct from zero for the first iterations, but they vanish when the unknown parameters converge to the solution; hence the objective function becomes a chisquare (v2) one. The minimization procedure was based on the MIGRAD algorithm (Minuit, 1992), which demonstrated to be more accurate than other numerical procedures like SIMPLEX (Minuit, 1992). MIGRAD is a variable metric method with inexact line search, a stable metric updating scheme, and checks for the positive-definiteness of the covariance matrix, which is guaranteed by adding an appropriate constant along the diagonal (as determined by its eigenvalues). The main weakness of the algorithm is that it fails provided that the first derivatives are very inaccurate. Method 2: Exploration of the Positions of the Objects in the Factor Space The columns of the matrix F, which are usually termed the principal components (PCs), constitute the scores of the data points in the abstract axes space. The proposed method is based on the idea that all the objects (data vectors) are enclosed within a straight-sided geometric figure called polytope (Full et al., 1981) whose number of sides (or edges) is equal to the dimensionality of the basis space plus one. In this case all sample vectors can

239

be described as linear combinations of compositions represented by each vertex on such a figure. The procedure to find the vertex of the polytope, E, consists in an optimization procedure of an objective function, P2, defined by c

o1 1 k 1

P25 o i5

s

5

n fpik 2 p7 d(fpik)1 o i51 k fp l

o 1 21a·DIS, j 1 re s

5

eij

(8)

ij

where • E is a (c3c) matrix of unknown parameters. • FP is a (c3s) matrix that is computed (every iteration) from matrices E and F by means of a minimization procedure based on the leastsquares constrained estimator [Eq. (9)]: F5Fp·E.

(9)

• e is the residual of the minimization [Eq. (10)]: e5F2Fp·E.

(10)

• reij represents the uncertainty of the element eij, and k fpl the average of the values of the matrix FP. • The function DIS expresses the “proximity” between the positions of the computed endmembers, E, and the cloud of data, F [Eq. (11)]: c

DIS5 o

s

c

5

5

o o u fij2eiku 1 j 1 k 1

i5

b

(11)

• a and b are constants, which we define according to data characteristics. Once computed, the matrix E, the matrix of pure components spectra, R, was obtained as in Eq. (12): R5R·ET.

(12)

Figure 1 shows a clarifying diagram corresponding to the case of three components, showing how the hyperspace of objects falls within a plane and the polytope is a triangle. The first term on the right in Eq. (8) assures that the extremes {E1, E2, E3} situate on the vertices of the triangle, the second term guarantees that the extremes situate inside the plane and the third term controls the proximity of the extremes to the center of the cloud. As will be seen subsequently, appropriate values of parameter a are about 1 when there are not external endmembers and lay in the range 0.01–0.2 when the pure components are contained within the data. Otherwise, parameter b let us filter the presence of isolated points (i.e., it increases the stability of the mean compared to an individual data value) in order to deduce the endmembers from the entire data structure rather than from a few outlying points (e.g., R and Q in Fig. 1). We have employed as default values a51.0 and b51.0. Method 3: Design of a Neural Network Recently, either artificial intelligence concepts or neural computing techniques have received increasing attention for most remote sensing applications, especially for clas-

240

Garcı´a-Haro et al.

Figure 1. Diagram of the principles of method 2, showing the data vectors in the reduced abstract space. rp represents the residual of point P, and P9 its projection onto the plane.

sification. It is frequent the use of a supervised, feedforward structure employing a backpropagation algorithm that adjusts the network weights to produce convergence between the network outputs and a set of training data (Moody et al., 1996). The primary advantage of the neural networks model is its generality, inasmuch as it does not require a priori knowledge of the data statistical distribution and hence it does not require nonsingular covariance matrices. Neural network algorithms have also potential for treating complex data sets that include multispectral (Baraldi and Parmiggiani, 1995), multitemporal (Gopal and Woodcock, 1996), multisensor (Serpico

and Roli, 1995) or multidirectional (Abuelgasim et al., 1996) characteristics. In this study, we have analyzed the possibilities of neural networks to estimate the spectra of pure components. The neural network consisted in three layers of interconnected neurons. The structure of the neural architecture is shown in Figure 2A, where nli stands for the ith neuron of the lth layer. The input layer presented c processing units or nodes representing the fractions of the components within each mixture (hence c coincides with the number of endmembers). Thus the kth input vector presented to the network input, {xk1, xk2, . . . , xkc}, constitutes the unknown vector of fractions for the kth mixture (1