

Estuarine, Coastal and Shelf Science 59 (2004) 219-230

Mapping land use/cover in a tropical coastal area using satellite sensor data, GIS and artificial neural networks

J.F. Mas

Instituto de Geografía - UNAM, Unidad académica Morelia, Aquiles Serdán 382, Colonia Centro, C.P. 58000, Morelia, Michoacán, Mexico

Received 15 January 2003; accepted 25 August 2003

Abstract

A common problem when classifying remotely sensed images to map land use/cover is spectral confusion: different land use/cover classes present similar spectral signatures and are misclassified. This paper presents a procedure for mapping land use/cover that combines the spectral information from a recent image with data about the spatial distribution of land use/cover types obtained from outdated cartography and ancillary data. Two fuzzy maps, which indicate the membership of each land use/cover class, were generated from the ancillary and spectral data, respectively, using an artificial neural network approach. The two maps were combined using fuzzy rules. In comparison with spectral classification alone, this procedure gave a statistically significant increase in the accuracy of land use/cover classification (from 67% to 79%). The advantages of this procedure for combining spectral and ancillary data over others previously published in the literature are that it takes previous mapping efforts into account and establishes relationships between land use/cover and environmental variables specific to the mapped area.

© 2003 Elsevier Ltd. All rights reserved.

Keywords: coastal land covers; mapping; remote sensing; Landsat; geographical information systems; artificial neural networks; Mexico

Tel.: +52-443-317-94-23; fax: +52-443-317-94-25.
0272-7714/04/$ - see front matter © 2003 Elsevier Ltd. All rights reserved. doi:10.1016/j.ecss.2003.08.011

1. Introduction

In many regions, such as Mexico, coastal vegetation is being destroyed at an alarming rate by urban and agricultural development (Dugan, 1988; Loa Loza, 1994). To monitor changes efficiently over large areas, accurate and inexpensive mapping techniques are required. Digital processing of remotely sensed imagery, such as Landsat ETM+, has many advantages over traditional photo-interpretation mapping. However, the accuracy of the map depends upon the spectral signature of the various land cover features and the ability of the classification procedures to discriminate between them. Land cover classes vary spectrally, especially where land covers present high diversity and spatial complexity. These classes often lack a unique signature, and different land covers can present very similar spectral features. This interclass confusion introduces errors into the resulting spectral classification.

In visual classification, an interpreter evaluates several characteristics such as tone, texture, size, pattern, location and association, together with the interpreter's own knowledge of the land cover distribution, in order to identify the components of the image. The majority of these characteristics are not used in conventional digital image classification. Attempts based upon different approaches, such as the use of texture (Gong and Howarth, 1990; Palubinskas et al., 1995; Franklin et al., 2000), object-oriented approaches (Blaschke et al., 2000; Mansor et al., 2002) and the use of ancillary information (Hutchinson, 1982; Kontoes et al., 1993; Long and Skewes, 1996; Mas and Ramírez, 1996; Srinivasan and Richards, 1990), have been made in order to increase the accuracy of spectral classifications. However, because of the statistical assumptions of the more common algorithms such as maximum likelihood, ancillary information cannot be used directly during the classification process (Hutchinson, 1982). The use of Boolean rules in a geographic information system (GIS)


context is the most common way to use ancillary information. However, complex relationships between ecological factors and land cover distribution can hardly be expressed by deterministic decision rules (Hutchinson, 1982; Mas and Ramírez, 1996), and this approach requires knowledge of these relationships for the study area. The rising use of artificial neural networks in image classification can provide new ways to combine spectral and ancillary information in the classification process (Civco, 1993; Foody, 1995; Atkinson and Tatnall, 1997).

An artificial neural network (ANN) is an information-processing paradigm inspired by the way the densely interconnected, parallel structure of the brain processes information. ANNs are mathematical models that emulate some of the observed properties of biological nervous systems and draw on the analogies of adaptive biological learning. The key element of the ANN paradigm is the structure of the information-processing system, which is composed of a large number of highly interconnected processing elements that are analogous to neurons and are tied together with weighted connections that are analogous to synapses. Advantages of the ANN approach include the ability to handle non-linear functions, to perform model-free function estimation, to learn from data relationships that are not otherwise known, and to generalize to unseen situations. ANNs have been shown to be highly flexible function approximators for any type of data. Therefore, ANNs make powerful tools for modelling, especially when the underlying data relationships are unknown (Lek and Guégan, 1999; Lek et al., 1996). In the past decade, ANNs have seen an explosion of interest and have been successfully applied across a large range of domains such as medicine, molecular biology, ecological and environmental sciences and image classification (Atkinson and Tatnall, 1997; Lek and Guégan, 1999).
Research into ANNs has led to the development of various types of neural networks, suited to different kinds of problems. One of the most popular ANNs today is the multi-layer feed-forward neural network, or multilayer perceptron (MLP) (Atkinson and Tatnall, 1997; Bishop, 1995). The MLP is based on a supervised procedure, i.e. the network constructs a model based on examples of data with known outputs. The training is done solely from the examples presented, which are together assumed to implicitly contain the information necessary to establish the relation. An MLP is a powerful system, capable of modelling complex relationships between variables. It allows prediction of an output object for a given input object or set of input objects.

The architecture of the MLP is a layered feed-forward neural network, in which the non-linear elements (neurons) are arranged in successive layers, and the information flows unidirectionally, from the input layer to the output layer, through the hidden layer(s) (Fig. 1). The neurons receive and send signals through the connections between layers. Each connection is given a weight, which modulates the intensity of the signal it transmits. When the network is executed, the input variable values are placed in the neurons of the input layer, which activate, successively, the neurons of the hidden and the output layers. Each neuron calculates its activation value by taking the weighted sum of the outputs of the units in the preceding layer. The activation value is passed through the activation function to produce the output of the neuron. When the entire network has been executed, the outputs of the output layer act as the output of the entire network and the case is allocated to the most highly activated output.

The learning procedure is based on a relatively simple concept: if the network gives the wrong answer, the weights are corrected to reduce the error, so that future responses of the network are more likely to be correct. Training data are presented iteratively in order to adjust the connection weights and obtain the best fit between expected and observed values.
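The forward pass described above can be sketched in a few lines of Python. This is a minimal illustration, not the software used in the study: the hidden layer size (four neurons), the random weights, the bias terms and the use of a logistic activation on every layer are assumptions made for the example; only the numbers of inputs and outputs follow Fig. 1.

```python
import numpy as np

def logistic(x):
    """Logistic sigmoid activation, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def mlp_forward(x, layers):
    """Propagate one input vector through a list of (weights, biases) layers.

    Each neuron takes the weighted sum of the previous layer's outputs and
    passes it through the activation function.
    """
    activation = np.asarray(x, dtype=float)
    for weights, biases in layers:
        activation = logistic(weights @ activation + biases)
    return activation

# Hypothetical network shaped like Fig. 1: six inputs, one hidden layer
# (four neurons, an arbitrary choice) and two outputs. Weights are random.
rng = np.random.default_rng(0)
layers = [
    (rng.normal(size=(4, 6)), np.zeros(4)),  # input layer -> hidden layer
    (rng.normal(size=(2, 4)), np.zeros(2)),  # hidden layer -> output layer
]
x = np.array([0.2, 0.4, 0.1, 0.8, 0.5, 0.3])  # made-up input values
outputs = mlp_forward(x, layers)
winner = int(np.argmax(outputs))  # index of the most highly activated output
```

With untrained random weights the outputs carry no meaning; the point is only the flow of information from layer to layer and the allocation of the case to the strongest output.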
Fig. 1. Schematic illustration of a three-layered perceptron, with one input layer, one hidden layer and one output layer. In this example X1, X2, ..., X6 are six input variables (e.g. spectral bands); Y1 and Y2 are two output variables (e.g. land use/cover classes).

The best-known example of a neural network training algorithm is backpropagation. In this algorithm, a training pattern is presented to the network and the signals are fed forward as described above. Then, the network output is compared with the desired output and the error is computed. The error is then back-propagated through the network and the weights of the connections are altered according to what is known as the generalized delta rule (Rumelhart et al., 1986; Bishop, 1995):

$$\Delta\omega_{ij}(t+1) = \eta\,(\delta_j o_i) + \alpha\,\Delta\omega_{ij}(t) \qquad (1)$$

where $\omega_{ij}(t)$ is the connection weight from input $i$ to neuron $j$ at time $t$, $\eta$ the learning rate, $\delta_j$ the error at processing unit $j$, $o_i$ the output of unit $i$, and $\alpha$ the momentum parameter. The learning rate controls the size of weight changes made by the algorithm. The addition of momentum causes the backpropagation algorithm to "pick up speed" if a number of consecutive steps change the weights in the same direction. This process of feeding forward signals and back-propagating the error is repeated iteratively until the error of the network as a whole is minimized or reaches an acceptable magnitude.

As the network is trained to minimize the error on the training set, a major issue is over-learning, or over-fitting to the training data. A network with more weights models a more complex function, and is therefore prone to over-fitting (Bishop, 1995; Foody and Arora, 1997). In order to avoid over-learning, cross-verification is used: some of the training cases (the verification set) are not actually used for training but to keep an independent check on the progress of training. As training progresses, the training error naturally drops. If the verification error stops dropping, or starts to rise, this indicates that the network is starting to over-fit the data, and training should stop.

In a classification problem, an output unit's task is to output a strong signal if a case belongs to the class, and a weak signal if it does not. Therefore, the activation value may also be considered as a fuzzy membership value (Civco, 1993; Foody, 1995), which can be perceived as a measure of certainty with regard to belonging to the class. When the ANNs were used to map land use/cover as a function of spectral or environmental variables, they produced, for each class used in the training process, a membership value ranging from zero to one that depended on the degree of closeness to that class.
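As a hedged illustration of Eq. (1), the sketch below applies the momentum update to one layer of connections. The function name and the input values are hypothetical; the default learning rate (0.1) and momentum (0.3) are the values reported in the Results section.

```python
import numpy as np

def momentum_update(delta_j, o_i, previous_change, eta=0.1, alpha=0.3):
    """Weight change of Eq. (1): eta * (delta_j * o_i) + alpha * previous change.

    delta_j: error terms of the receiving units, shape (n_out,)
    o_i: outputs of the sending units, shape (n_in,)
    previous_change: weight change of the previous step, shape (n_out, n_in)
    """
    return eta * np.outer(delta_j, o_i) + alpha * previous_change

# Made-up values: two sending units, one receiving unit, and an error term
# that stays the same for five consecutive steps.
o_i = np.array([1.0, 0.5])
delta_j = np.array([0.2])
change = np.zeros((1, 2))
history = []
for _ in range(5):
    change = momentum_update(delta_j, o_i, change)
    history.append(change[0, 0])
```

Because the error term points in the same direction at every step, the recorded weight changes grow from one step to the next and approach eta * delta_j * o_i / (1 - alpha), which is the "picking up speed" effect of the momentum term.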
This study aims to develop a simple procedure for mapping land use/cover from spectral and ancillary information using an artificial neural network approach.

2. Study area

The study area is situated in the region of the Lagoon of Términos, in the State of Campeche, in the south-eastern part of Mexico, between 18°02′ and 19°10′ North and 91°01′ and 92°29′ West (Fig. 2), and covers about 19,200 km². The study area consists of a mosaic of natural grasslands, pasture lands, croplands, mangroves, wetlands (dominated by Cyperus sp. and Typha latifolia) and remnants of tropical forests. Soils are largely dominated by gleysol types (80% of the area); there are also solonchak, rendzina, regosol and vertisol soils, which represent 7, 6, 3 and 2% of the study area, respectively. Relief is flat: 80% of the study area lies less than 5 m above sea level. The south-east section of the study area presents some moderate elevations which reach 130 m above sea level. The spatial distribution of the vegetation is determined by the topography, the soils, and the distance from the coast line. The conversion of natural into man-made cover depends upon factors such as the distance from roads, the elevation and the types of soils (Mas and Puig, 2001). Rates of land use/cover change in the region are high; annual rates of deforestation reached 2.2 and 5.3% during 1974-1986 and 1986-1991, respectively. Much of the land surrounding the lagoon has been deforested for cattle ranching and rice farming (R. Isaac-Márquez, pers. commun., 1993; Mas, 1999; Mas and Puig, 2001).

3. Materials and methods

A Landsat ETM+ image (path 21, row 47) dated April 3, 2000 was registered and resampled to a UTM projected output image composed of 30 m × 30 m pixels with an RMS error of less than 1.0 pixel. Resampling was done by the nearest neighbour method, which sets the radiometric value of the output pixel equal to that of the nearest input pixel in the original geometry, in order to preserve the original values of the image. A digital elevation model, along with soil, land use/cover and road network digital maps at a scale of 1:250,000, was obtained from the National Institute of Geography, Statistics and Informatics (INEGI). The land use/cover map (INEGI, 1984) was derived from the visual interpretation of aerial photographs dated 1972 and 1980, in addition to intensive field work. No statistical accuracy assessment of this map is available, but users generally consider it accurate, although largely outdated owing to the rapid land use/cover changes in the region. Additional spatial variables, such as the shortest distance to the nearest road and to the coast line, were generated because they were considered a priori as factors which can control the pattern of distribution of land use/cover in the region. Binary maps indicating the presence/absence of a given soil were derived for each soil type from the map of soils. As the digitizing of the map can generate errors in the position of the boundary between soil types (in addition to error or fuzziness in the delimitation of soil units during the elaboration of the map), the limits between the different


Fig. 2. Location of the study area.

types of soils were ‘‘fuzzyﬁed’’ applying a low pass ﬁlter which, for each cell, calculated the mean of the value within a neighbourhood (a circle of 200 m radius in this case) and sent it to the corresponding cell location on the output grid. The application of the ﬁlter transformed the hard boundary into a narrow band of pixels which show increasing membership value when going inside

the soil unit (Fig. 3). The Landsat ETM+ image and the digital maps were integrated into a GIS database in raster format using a common UTM projection. A critical issue when carrying out a supervised classiﬁcation with both spectral and ancillary information is that the training data represent the entire variation of each class with regard to the spectral and


Fig. 3. Boolean and fuzzy representation of the boundary between two patches. The black and white tones indicate total membership to one class. In the fuzzy representation, the grey tones represent partial membership to both classes.
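The low pass filter that turns a Boolean soil boundary into the fuzzy band of Fig. 3 can be sketched as a mean over a circular window. The code is an illustration only: the 30 m cell size is assumed to match the resampled image, the function name is hypothetical, and border cells are handled by repeating the edge values, a choice the text does not specify.

```python
import numpy as np

def circular_mean_filter(binary_map, radius_m=200.0, pixel_m=30.0):
    """Mean of a 0/1 map within a circular neighbourhood of each cell."""
    r = int(np.floor(radius_m / pixel_m))
    # Offsets of all cells whose centre lies within the radius.
    offsets = [
        (dy, dx)
        for dy in range(-r, r + 1)
        for dx in range(-r, r + 1)
        if (dy * pixel_m) ** 2 + (dx * pixel_m) ** 2 <= radius_m ** 2
    ]
    padded = np.pad(binary_map.astype(float), r, mode="edge")
    rows, cols = binary_map.shape
    total = np.zeros((rows, cols))
    for dy, dx in offsets:
        total += padded[r + dy : r + dy + rows, r + dx : r + dx + cols]
    return total / len(offsets)

# Hypothetical binary map: soil absent in the left half, present in the right.
soil = np.zeros((20, 40))
soil[:, 20:] = 1.0
fuzzy = circular_mean_filter(soil)
profile = fuzzy[10]  # membership along one row crossing the boundary
```

Far from the boundary the membership stays at 0 or 1; within about 200 m of it the profile rises steadily, which reproduces the narrow band of increasing membership described in the text.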

ancillary variables. To ensure this, a new and simple way of incorporating the ancillary information was used. The general framework of the classification procedure included two parallel classification procedures. The first classifier "learned" the distribution of land use/cover types by establishing spatial relationships between the (outdated) land use/cover map and the ancillary data (e.g. soils, elevation). This approach avoided partial or biased training data because the map used for training covered the entire study area. It allowed the production of a digital fuzzy map which portrayed, for each pixel, the possibility of the presence of each land use/cover type. The second classifier produced another fuzzy map using a spectral classification of the recent remotely sensed image based upon standard training areas. Thus, for each pixel, the two fuzzy maps indicated a membership value which expressed the possibility of the presence of a cover type from its environmental conditions and its spectral features, respectively.

Fuzzy operators such as AND and OR can be used to derive a new class membership from two memberships derived from different fuzzy classifications (Zadeh, 1978; Palubinskas et al., 1995). In order to combine the two fuzzy maps, the AND operator was calculated as the minimum of the two membership values. The use of the AND operator ensures that the most stringent requirement for the class selection is met. For example, a pixel located upland and presenting a spectral signature analogous to tropical forest and mangrove forest (both have a similar spectral response) has a high membership value for these two covers in the spectral classification. However, in the classification based on the ancillary information, the membership value for the mangrove class was low because the classifier learned

from the land use/cover map that no mangroves were found at high elevations. Thus, after combining both spectral and ancillary classiﬁcations, the ﬁnal membership of this pixel to the class mangrove was low and it was classiﬁed as tropical forest (Fig. 4). In order to carry out the classiﬁcation based upon the ancillary data, each spatial variable was overlaid on the land use/cover map from INEGI to establish the relationship between land use/cover and the variables. The overlay operation allowed for the construction of a tabular database which indicated, for each pixel, the value of the spatial variables and the type of land use/ cover. These data were used to train the ﬁrst MLP aimed at classifying land use/cover from the environmental variables. The accuracy of a supervised classiﬁcation depends upon the representativeness of the estimates of the

Fig. 4. Combination of the fuzzy memberships by the AND operator. The columns represent the membership of a given pixel to three cover classes (T.F.: tropical forest, M: mangrove and P: pasture) derived from spectral, ancillary data and the combination of both. The combined membership values are the minimum of spectral and ancillary values for each class.
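The combination illustrated in Fig. 4 amounts to an element-wise minimum followed by the selection of the strongest class. The sketch below uses made-up membership values for an upland pixel (the values plotted in Fig. 4 are not given in the text), and the function name is hypothetical.

```python
import numpy as np

def combine_and_harden(spectral, ancillary):
    """Fuzzy AND (minimum) of two membership arrays, plus the hard class.

    Both arrays have the class dimension first, e.g. shape (n_classes,)
    for one pixel or (n_classes, rows, cols) for whole maps.
    """
    combined = np.minimum(spectral, ancillary)
    return combined, np.argmax(combined, axis=0)

classes = ["tropical forest", "mangrove", "pasture"]
spectral = np.array([0.8, 0.9, 0.1])   # hypothetical: forest and mangrove look alike
ancillary = np.array([0.7, 0.1, 0.3])  # hypothetical: mangrove unlikely at this elevation

combined, index = combine_and_harden(spectral, ancillary)
label = classes[int(index)]
spectral_only_label = classes[int(np.argmax(spectral))]
```

With these values the spectral memberships alone would favour mangrove, while the combined memberships favour tropical forest, which mirrors the example given in the text.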


number and the nature of the spectral classes present in the image data. When insuﬃcient observational or documentary evidence of the nature of the land cover types is available, an exploratory unsupervised classiﬁcation can be carried out. However, the identiﬁcation of the spectral classes picked out by the classiﬁer in terms of information classes is achieved whatever information is available to the analyst. The use of unsupervised classiﬁcation techniques is a method of ensuring that the training area has been well chosen to represent a spectral class (Mather, 1999). In the present study, the spectral classiﬁcation was carried out following two steps in order to obtain representative training areas. First, an unsupervised spectral classiﬁcation was carried out using the isodata algorithm, which iteratively clusters pixels using minimum distance techniques and groups the pixels with similar radiometric values into the same cluster (Tou and Gonzalez, 1974). This unsupervised classiﬁcation allowed the selection of training sites which represented the whole spectral variety a class could represent. For example, in case of a land use/cover class corresponding to various clusters, the training sites of this class were chosen in order to include these various clusters (i.e. to represent the entire spectral variability of the class). A tabular database which indicated, for each pixel of the training sites, its land use/cover class and the digital number value indicating the reﬂectance in the diﬀerent bands, was used to train the second MLP. A backpropagation training algorithm was used in the MLP training processes. Data were divided into three sections: the training set, the veriﬁcation set, and the test set following the proportion 1/2, 1/4 and 1/4, respectively. The veriﬁcation set was used to track the network’s error performance, to identify the more eﬃcient networks, and to stop training if over-learning occurred. 
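The division of the cases and the stopping rule can be sketched as follows. This is an assumption-laden illustration: the text gives the proportions (1/2, 1/4, 1/4) but not how cases were assigned or how long a rise in verification error was tolerated, so the random permutation, the `patience` parameter and both function names are hypothetical.

```python
import numpy as np

def split_cases(n_cases, seed=0):
    """Randomly split case indices into training (1/2), verification (1/4)
    and test (remaining 1/4) subsets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_cases)
    n_train = n_cases // 2
    n_verify = n_cases // 4
    return (
        order[:n_train],
        order[n_train : n_train + n_verify],
        order[n_train + n_verify :],
    )

def stopping_epoch(verification_errors, patience=3):
    """Epoch with the lowest verification error, scanning until the error
    has failed to improve for `patience` consecutive epochs."""
    best_epoch, best_error = 0, float("inf")
    for epoch, error in enumerate(verification_errors):
        if error < best_error:
            best_epoch, best_error = epoch, error
        elif epoch - best_epoch >= patience:
            break
    return best_epoch

# 28,700 is the size of the spectral training data reported in the Results.
train_idx, verify_idx, test_idx = split_cases(28700)

# Made-up verification errors: they fall, then rise as over-learning sets in.
errors = [0.50, 0.40, 0.35, 0.33, 0.34, 0.36, 0.39, 0.45]
best = stopping_epoch(errors)
```

The weights kept would be those of the epoch returned by `stopping_epoch`; the test subset is never consulted during this process.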
The test set was not used in training at all, and gave an independent assessment of the network's performance once the entire network design procedure was completed. A key design decision was how many input variables and hidden units to include in the network. The network configuration was determined empirically by testing various possibilities and evaluating the accuracy of the classification of the test set. In order to select the input variables, a sensitivity analysis was carried out. This analysis rates variables according to the deterioration in performance that occurs if that variable is no longer available to the model. It indicates which input variables are considered most important by that particular network and allows input variables with low sensitivity to be pruned. Among the MLP architectures which presented good performance (test set classification accuracy over 70%), the simplest were chosen: MLPs based upon fewer input variables and fewer nodes in the hidden layer(s) were preferred because of their better ability to generalize and

classify unseen pixels accurately (Bishop, 1995; Rosin and Fierens, 1995; Kavzoglu and Mather, 2000). The output from each MLP was an activation value which expressed the membership to each land use/cover class. The result was two fuzzy land use/cover maps that portrayed gradations of the possibility of each class. The combination of the two maps allowed the generation of a fuzzy land use/cover map which took into account both ancillary and spectral data. A final hard (not fuzzy) map was obtained by labelling each pixel with the class having the highest fuzzy membership value, in order to obtain a "standard" map and assess accuracy.

In order to assess the accuracy of the classified images, a random reference sample of 488 verification points was selected. The land use/cover classes of the area surrounding these points were checked by visual interpretation using high resolution digital aerial photographs dated October 2000 and March 2002 (pixel size about 1.5 m). An error matrix, which showed the number of points correctly and incorrectly identified, was constructed. Overall accuracy (proportion of the points correctly identified) was computed, and commission errors (erroneously including a point in a class), omission errors (erroneously excluding a point from a class), producer's accuracy (proportion of verification points of a category correctly classified) and user's accuracy (proportion of points classified into the class which are correctly identified) were calculated for each class (Stehman, 1997; Stehman and Czaplewski, 1998). The confidence interval of the accuracy estimate was determined using the following equation (Dicks and Lo, 1990):

$$d = t\sqrt{\frac{p(1-p)}{n}} \qquad (2)$$

where $d$ is the half width of the confidence interval, $t = 1.96$ the standard normal deviate for the 95% two-sided confidence level, $p$ the accuracy and $n$ the number of verification points.

In order to quantify the improvement obtained by the incorporation of the ancillary data, a first classification was carried out using only the spectral information and another one using the spectral and the ancillary information ("ancillary-improved" classification). The accuracy of both classifications was assessed using the same reference sample. Even though the two maps were constructed independently, when the verification sample used for the comparison is the same, the test statistic employed for the hypothesis test should take this lack of independence into account. Therefore, when only a single reference sample is used, some type of paired comparison is appropriate, and a test statistic based on an assumption of two independent samples represents at best an approximation to the correct statistical test (Stehman, 1997). In this study, the statistical significance


of the difference in accuracy between the spectral classification and the ancillary-improved classification was tested using a t-test approach. A paired-samples t-test was used for global and producer's accuracies, which were based upon exactly the same reference sample. In the case of user's accuracies, because the reference sample is split according to the classified image, different samples were used to assess the same class in the two classified images. Therefore, an independent-samples t-test was used.
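The two tests can be sketched with NumPy alone. How the tests were applied to the verification points is not detailed in the text, so the coding used here (one 0/1 value per point, 1 meaning correctly classified) is an assumption, and the function names are hypothetical. The independent-samples version does not assume equal variances, as stated in the caption of Table 3.

```python
import numpy as np

def paired_t(correct_a, correct_b):
    """Paired-samples t statistic for two 0/1 vectors scored on the same points."""
    diff = np.asarray(correct_b, dtype=float) - np.asarray(correct_a, dtype=float)
    t = diff.mean() / (diff.std(ddof=1) / np.sqrt(diff.size))
    return t, diff.size - 1

def welch_t(sample_a, sample_b):
    """Independent-samples t statistic and degrees of freedom,
    without assuming equal variances."""
    a = np.asarray(sample_a, dtype=float)
    b = np.asarray(sample_b, dtype=float)
    va = a.var(ddof=1) / a.size
    vb = b.var(ddof=1) / b.size
    t = (b.mean() - a.mean()) / np.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (a.size - 1) + vb ** 2 / (b.size - 1))
    return t, df

# User's accuracy of wetlands: 71 of 149 mapped points correct (Table 1)
# versus 71 of 92 (Table 2), coded as 0/1 vectors.
spectral = np.r_[np.ones(71), np.zeros(149 - 71)]
improved = np.r_[np.ones(71), np.zeros(92 - 71)]
t_user, df_user = welch_t(spectral, improved)

# Hypothetical paired example: eight points scored under both classifications.
t_pair, df_pair = paired_t([0, 1, 1, 0, 1, 0, 1, 1], [1, 1, 1, 0, 1, 1, 1, 1])
```

Under this coding the wetlands comparison gives a t statistic close to 4.906 with about 217.21 degrees of freedom, the values listed for that class in Table 3, which suggests the assumed coding is at least compatible with the published figures.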

4. Results

The image was classified into six land use/cover classes: (1) tropical forest; (2) mangroves; (3) wetlands; (4) agriculture (grasslands, pasture lands and croplands); (5) water; and (6) urban areas. Because of the large amount of data (for example, the training sites on the multispectral image contained about 480,000 pixels), between 4000 and 8000 cases for each class were randomly selected to generate the input data for the spectral classification MLP, except for the class "urban area", which had only 1000 pixels. In total, the spectral training set comprised 28,700 cases. The ancillary input data presented similar proportions for each class (5000 for the first five classes and 300 for the urban area) and a total of about 25,000 cases.

Among the MLP architectures which presented good performance, the simplest were chosen. The MLP for classifying ancillary data had nine inputs (six soil types, elevation, distance to roads and distance to the coast) and one hidden layer with six nodes. The MLP for spectral classification had five inputs (bands 2, 3, 4, 5 and 7) and two hidden layers with three and four nodes, respectively (Fig. 5). For both MLPs, the output units had linear activation functions while the hidden units had logistic sigmoid activation functions. The MLPs for classifying spectral and ancillary data were trained by backpropagation (learning rate 0.1, momentum 0.3) for 50 and 40 epochs (iterations over the entire training data) and correctly classified 82 and 73% of the test set, respectively. Fig. 6 shows the spectral and environmental variables used in the classification procedure.

A preliminary classification was obtained using only the spectral information, assigning each pixel to the class with the highest membership value, in order to assess the improvement obtained by using ancillary data. Table 1 shows the error matrix of this classification. Overall accuracy was 74%. However, without taking water into account, which represented 22% of the verification points, the overall accuracy was only 67%. Some classes presented large errors, such as crop/pasture lands, which presented an error of omission of 52%, and wetlands, which had an error of commission of 52%. The main confusions were between crop/pasture lands, tropical forest, mangroves and wetlands, which presented similar spectral signatures. These errors affect the spatial representation and also the area statistics of land use/cover types derived from the classified image. For example, less than 50% of the verification points identified as crop/pasture lands on the photographs were correctly mapped (error of omission of 52%). On the other hand, 29% of the verification points mapped as crop/pasture lands did not belong to this class (error of commission). Therefore, the total area of this class was underestimated in the map.

As a follow-up step, the spectral fuzzy maps were combined with the ancillary maps. Fig. 7 shows the fuzzy maps of the wetlands class. It can be observed that large areas of tropical forest and mangroves present high membership to the wetlands class. The combination of the spectral fuzzy map with the ancillary fuzzy map made it possible to avoid confusion in a large part of these areas. The accuracy of the resulting map was assessed with the same verification data.
As shown in Table 2, overall accuracy increased to 82% (79% without taking into account the water class). Table 3 shows the global and class accuracies obtained by the

Fig. 5. MLPs used for ancillary and spectral classiﬁcation.
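To give a sense of how small the two retained networks are, the sketch below counts their adjustable parameters. It assumes six output units (one per land use/cover class) and one bias per neuron; neither detail is stated explicitly in the text, and the function name is hypothetical.

```python
def count_parameters(layer_sizes, with_bias=True):
    """Number of connection weights (plus optional biases) in a fully
    connected feed-forward network with the given layer sizes."""
    extra = 1 if with_bias else 0
    return sum(
        (n_in + extra) * n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )

# Layer sizes from the text, with six outputs assumed for the six classes.
ancillary_mlp = [9, 6, 6]      # nine inputs, one hidden layer of six nodes
spectral_mlp = [5, 3, 4, 6]    # five inputs, hidden layers of three and four nodes

n_ancillary = count_parameters(ancillary_mlp)  # (9+1)*6 + (6+1)*6
n_spectral = count_parameters(spectral_mlp)    # (5+1)*3 + (3+1)*4 + (4+1)*6
```

Under these assumptions both networks have on the order of a hundred parameters, far fewer than the tens of thousands of training cases, which is in line with the stated preference for simple architectures that generalize well.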


Fig. 6. Input components in the data set. The top six maps show the spectral inputs and the map of soils, the bottom three maps show the elevation, the distance to roads and the distance to the coast. In each map, except the soil map, a dark pixel corresponds to a low value of the input variable.

spectral classiﬁcation and the ancillary-improved classiﬁcation along with the statistical signiﬁcance of the diﬀerence in accuracy. The use of ancillary data allowed a signiﬁcant increase in the accuracy of almost all the classes: both commission and omission errors decreased

below 25% except for the error of commission of the crop/pasture land and both commission and omission errors of urban area which remained the same. All the classes, except urban area, showed a signiﬁcant increase of user’s accuracy (wetlands) or producer’s accuracy

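Table 1. Error matrix of the classification based upon spectral information only (overall accuracy = 72.54% ± 3.96). The matrix shows the number of verification points; rows are map classes and columns are verification classes. Errors, accuracy and confidence interval are expressed in %.

| Map | Crop/pasture | Water | Mangrove | Tropical forest | Wetlands | Urban area | Total | Error of commission | User's accuracy | d |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Crop/pasture | 47 | 1 | 2 | 4 | 10 | 2 | 66 | 28.79 | 71.21 | 10.92 |
| Water | 0 | 107 | 1 | 0 | 1 | 0 | 109 | 1.83 | 98.17 | 2.52 |
| Mangrove | 2 | 0 | 57 | 5 | 2 | 0 | 66 | 13.64 | 86.36 | 8.28 |
| Tropical forest | 7 | 0 | 7 | 68 | 8 | 0 | 90 | 24.44 | 75.56 | 8.88 |
| Wetlands | 43 | 0 | 12 | 22 | 71 | 1 | 149 | 52.35 | 47.65 | 8.02 |
| Urban area | 0 | 1 | 3 | 0 | 0 | 4 | 8 | 50.00 | 50.00 | 34.65 |
| Total | 99 | 109 | 82 | 99 | 92 | 7 | 488 | | | |
| Error of omission | 52.53 | 1.83 | 30.49 | 31.31 | 22.83 | 42.86 | | | | |
| Producer's accuracy | 47.47 | 98.17 | 69.51 | 68.69 | 77.17 | 57.14 | | | | |
| d (1/2 interval of confidence) | 9.84 | 2.52 | 9.96 | 9.14 | 8.58 | 36.66 | | | | |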
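The accuracy indices of Table 1 can be recomputed from the cell counts, which also gives a check on Eq. (2). In this sketch the matrix holds the counts of Table 1 with empty cells read as zero (an assumption that is consistent with the row and column totals); variable and function names are illustrative.

```python
import numpy as np

classes = ["crop/pasture", "water", "mangrove",
           "tropical forest", "wetlands", "urban area"]

# Rows: class on the map. Columns: class of the verification point.
matrix = np.array([
    [47,   1,  2,  4, 10, 2],
    [ 0, 107,  1,  0,  1, 0],
    [ 2,   0, 57,  5,  2, 0],
    [ 7,   0,  7, 68,  8, 0],
    [43,   0, 12, 22, 71, 1],
    [ 0,   1,  3,  0,  0, 4],
])

def half_width(p, n, t=1.96):
    """Half width of the confidence interval, Eq. (2)."""
    return t * np.sqrt(p * (1.0 - p) / n)

n_points = int(matrix.sum())
correct = np.diag(matrix)

overall = correct.sum() / n_points
users = correct / matrix.sum(axis=1)       # 1 - error of commission
producers = correct / matrix.sum(axis=0)   # 1 - error of omission

d_overall = half_width(overall, n_points)
d_users = half_width(users, matrix.sum(axis=1))
d_producers = half_width(producers, matrix.sum(axis=0))
```

Expressed in percent, `overall` and `d_overall` round to the 72.54 and 3.96 given in the caption, and the per-class values match the accuracy and d entries of the table.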


Fig. 7. Fuzzy maps for the wetlands class derived from ancillary (a) and spectral (b) information.

(crop/pasture, water, mangrove and tropical forest). The class urban area was not improved by the ancillary data because there was only one polygon labelled as urban area (Ciudad del Carmen City) in the land use/cover map used in the ‘‘learning’’ of the relationship between land use/cover and ancillary data. Thus, the MLP learned that the characteristics of this town, located on an island (near the coast line, at very low elevation) were the general characteristics of all urban areas, thereby causing a misclassiﬁcation. This point will be discussed further when analysing the limitations of the approach used in this study.

5. Discussion

In order to pinpoint the limitations of the procedure, the incorrectly classified verification points were displayed upon the spatial variables. For each point,

ancillary and spectral fuzzy membership values were examined. About 30% of the points considered as incorrectly classiﬁed are ambiguous cases, for which the land use/cover class determined by photo-interpretation may be questioned because the point was located in the transition between diﬀerent covers, in a fragmented area where pixels were composed of diﬀerent covers (mixed pixels) or corresponded to transition covers between two covers considered in the classiﬁcation scheme such as fallow and secondary vegetation which are intermediate between agriculture covers and tropical forest. These pixels may be better represented by the fuzzy map derived from the combination of the ancillary and spectral fuzzy maps rather than the hard classiﬁcation map (Foody, 1992; Wang, 1990). In order to avoid the diﬃculty in choosing a single correct class from the reference photographs, the ambiguous reference points were interpreted again assigning a primary and an alternate interpreted class (Khorram et al., 2000;

Table 2. Error matrix of the classification based upon the combination of spectral and ancillary information (overall accuracy = 81.55 ± 3.44). The matrix shows the number of verification points; rows are map classes and columns are verification classes. Errors, accuracy and confidence interval are expressed in %; d is half the confidence interval.

| Map \ Verification points | Crop/pasture | Water | Mangrove | Tropical forest | Wetlands | Urban area | Total | Error of commission | User's accuracy | d |
|---|---|---|---|---|---|---|---|---|---|---|
| Crop/pasture | 79 | 3 | 2 | 13 | 15 | 3 | 115 | 31.30 | 68.70 | 8.48 |
| Water | | 103 | | | | | 103 | 0.00 | 100.00 | 0.00 |
| Mangrove | 1 | 2 | 64 | 3 | 2 | | 72 | 11.11 | 88.89 | 7.26 |
| Tropical forest | 8 | | 10 | 77 | 4 | | 99 | 22.22 | 77.78 | 8.19 |
| Wetlands | 11 | | 4 | 6 | 71 | | 92 | 22.83 | 77.17 | 8.58 |
| Urban area | | 1 | 2 | | | 4 | 7 | 42.86 | 57.14 | 36.66 |
| Total | 99 | 109 | 82 | 99 | 92 | 7 | 488 | | | |
| Error of omission | 20.20 | 5.50 | 21.95 | 22.22 | 22.83 | 42.86 | | | | |
| Producer's accuracy | 79.80 | 94.50 | 78.05 | 77.78 | 77.17 | 57.14 | | | | |
| d | 7.91 | 4.28 | 8.96 | 8.19 | 8.58 | 36.66 | | | | |


Table 3. Comparison between accuracy indices obtained by the spectral classification and the ancillary-improved classification. For user's accuracies, the comparison was made with an independent-samples t-test (equal variances not assumed); for producer's and global accuracies, with a paired-samples t-test. The columns t, df, Sig. and the 95% confidence interval of the difference report the t-test for equality of accuracies.

| Index | Class | Spectral only | Spectral + ancillary | Difference | t | df | Sig. | 95% CI lower | 95% CI upper |
|---|---|---|---|---|---|---|---|---|---|
| User's accuracy | Crop/pasture | 71.21 | 68.70 | 2.51 | 0.354 | 137.88 | 0.724 | 16.55 | 11.52 |
| User's accuracy | Water | 98.17 | 100.00 | 1.83 | 1.421 | 108.00 | 0.158 | 0.72 | 4.39 |
| User's accuracy | Mangrove | 86.36 | 88.89 | 2.53 | 0.446 | 131.93 | 0.656 | 8.87 | 13.72 |
| User's accuracy | Tropical forest | 75.56 | 77.78 | 2.22 | 0.359 | 183.92 | 0.720 | 10.00 | 14.45 |
| User's accuracy | Wetlands | 47.65 | 77.17 | 29.52 | 4.906 | 217.21 | 0.000 | 6.01 | 17.66 |
| User's accuracy | Urban area | 50.00 | 57.14 | 7.14 | 0.258 | 12.74 | 0.800 | 52.75 | 67.03 |
| Producer's accuracy | Crop/pasture | 47.47 | 79.80 | 32.33 | 5.846 | 98 | 0.000 | 21.35 | 43.29 |
| Producer's accuracy | Water | 98.17 | 94.50 | 3.67 | 2.028 | 108 | 0.045 | 7.26 | 0.83 |
| Producer's accuracy | Mangrove | 69.51 | 78.05 | 8.54 | 2.400 | 81 | 0.019 | 1.46 | 15.61 |
| Producer's accuracy | Tropical forest | 68.69 | 77.78 | 9.09 | 2.566 | 98 | 0.012 | 2.06 | 16.12 |
| Producer's accuracy | Wetlands | 77.17 | 77.17 | 0.00 | | | | | |
| Producer's accuracy | Urban area | 57.14 | 57.14 | 0.00 | | | | | |
| Global accuracy | | 72.54 | 81.55 | 9.01 | 4.79 | 487 | 0.000 | 5.32 | 12.71 |

Woodcock and Gopal, 2000). For about 84% of the ambiguous points, the primary or the alternate class matched the classified class. In order to explore further the representation of fuzziness given by the fuzzy classification, the primary and the alternate reference classes were compared with the two classes that presented the two highest fuzzy membership values in the classified image. In 61% of the cases, both pairs of classes coincided, which indicates that the fuzzy membership gave a good representation of the transition or the mixture between two classes. With the method used to combine spectral and ancillary information, a pixel that presents a high membership value to two or more classes must present high membership values derived from both spectral and ancillary data, i.e. this pixel must present a spectral signature similar to those of the different classes and also be located in the overlap between the spatial distributions of these classes.

About 9% of the incorrectly classified points seem to be associated with the limits of the polygons of the soil map. Thus, these errors can be attributed to the imprecise delimitation of the soil units, or to the vagueness of the real limits and the difficulty of representing transition zones of soils and cover in a Boolean representation (one pixel, one land use/cover class). The remaining confusion occurred between cover classes with similar spectral responses and spatial distributions. In order to improve the discrimination of these classes, additional variables should be taken into account during the classification procedure.

Thirty-four percent of the erroneously classified points were correctly classified when taking into account ancillary membership alone. Therefore, many misclassifications occurred because the candidate pixel presented a high membership value to the correct class in the ancillary fuzzy map but was discarded because it presented a very low membership value in the spectral fuzzy map. Thus, an increase in accuracy could be obtained by increasing the overlap between class signatures, which raises the membership values to several classes, for example by using fuzzy training sites (Eastman and Laney, 2002).

Means of the final membership values to the winning class were 0.64 and 0.27 for correctly and incorrectly classified pixels, respectively. Therefore, these values are related to the probability of a pixel being correctly classified and can be used to examine classification uncertainty and to produce maps that indicate areas where classification is doubtful.

The incorporation of ancillary data in an ANN approach allowed a significant improvement of the accuracy of the majority of the land use/cover classes. The advantages of this approach are varied. First, it takes into account previous mapping efforts, which were based upon aerial photographic interpretation and intensive field work. So far, this had been possible only through the visual interpretation of the satellite sensor imagery in order to update the previous maps (Mas et al., 2002). The use of this method, or a more sophisticated one based upon the same approach, allowed the classifier to learn from the outdated map and to classify using this knowledge and recent spectral data. Second, this approach allowed the establishment of relationships between land use/cover and environmental variables that are site dependent, i.e. specific to the mapped area. These site-specific relationships are more likely to allow an accurate classification than regional relationships. For example, in the mapped region, the mangroves are located relatively near the coastline while, in the Petenes region, which is located 100 km to the north, mangroves can be found 15 km from the coast (Rico-Gray, 1982).

However, this approach presents some limitations. First, it depends on the accuracy of the previous land use/cover map. If this map were greatly inaccurate, the MLP could learn wrong relationships between cover and environmental variables. The same problem can arise when the pattern of distribution of the covers has changed significantly between the elaboration of the land use/cover map and the updating procedure. Thus, this problem can be more important in highly dynamic areas such as coastal regions, or when using very old maps. It is worth noting, however, that even though the study area presented important land use/cover changes between the elaboration of the land use/cover map and the date of acquisition of the Landsat image, the relationships between land use/cover distribution and environmental variables did not change substantially. The problem of erroneous relationships between land use/cover and environmental variables affected only the urban area class because of the reduced area of this class. A way to avoid these problems is to visualize the fuzzy maps, which indicate the membership value of each location in a given class, in order to detect abnormal membership values in some areas.

Second, the approach depends on the availability and accuracy of ancillary data. The increasing availability of digital spatial information will reduce part of this problem. However, the quality of this information is also a critical issue because attribute errors or position errors (such as the limits between two soil types) lead to errors in the classification. In some cases, the low precision of the data does not allow for the discrimination of some classes.
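The way the two fuzzy maps interact, and the use of the winning membership value as an indicator of doubtful classification, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text says that the maps were combined with fuzzy rules but does not give the rule here, so the minimum operator (a fuzzy AND) is an assumption of this sketch, as are the function names, the membership values of the two example pixels and the doubt threshold of 0.45, chosen simply because it lies between the reported means of 0.64 and 0.27.

```python
def combine_memberships(spectral, ancillary):
    """Combine two membership dicts (class -> value in [0, 1]) per class.

    Assumption of this sketch: fuzzy AND implemented as the minimum.
    Both dicts are expected to hold the same classes.
    """
    return {cls: min(spectral[cls], ancillary[cls]) for cls in spectral}


def classify_pixel(spectral, ancillary, doubt_threshold=0.45):
    """Return (winning class, winning membership, doubtful flag) for one pixel.

    doubt_threshold is a hypothetical value used to flag uncertain pixels.
    """
    combined = combine_memberships(spectral, ancillary)
    winner = max(combined, key=combined.get)
    value = combined[winner]
    return winner, value, value < doubt_threshold


# Hypothetical membership values (spectral, ancillary), for illustration only.
pixel_a = (
    {"mangrove": 0.7, "wetlands": 0.6, "tropical forest": 0.2},
    {"mangrove": 0.8, "wetlands": 0.1, "tropical forest": 0.3},
)
# Crop/pasture is strongly supported by the ancillary data but almost
# excluded by the spectral data, so it is discarded by the combination.
pixel_b = (
    {"crop/pasture": 0.05, "wetlands": 0.2},
    {"crop/pasture": 0.9, "wetlands": 0.3},
)

if __name__ == "__main__":
    for name, (spectral, ancillary) in (("a", pixel_a), ("b", pixel_b)):
        print(name, classify_pixel(spectral, ancillary))
```

With an AND-type rule, a class needs support from both sources to win, which matches the behaviour described in this section: a pixel keeps a high membership to several classes only if both maps agree, and a class favoured by the ancillary map alone is dropped. Mapping the doubtful flag over the whole image would give the kind of uncertainty map mentioned above.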
In the study area, for instance, lowland flooded forest is associated with topographic depressions that are not represented in the digital elevation model derived from 10 m contour lines.

ANNs are a promising means of improving the classification of remotely sensed images. Many authors have reported higher accuracy when classifying spectral images with an ANN approach than with a statistical method such as maximum likelihood (Paola and Schowengerdt, 1995; Atkinson and Tatnall, 1997). However, a more important contribution of ANNs is their ability to incorporate additional data into the classification process. In the present study, only ancillary data were added, using a per-pixel classification. Additional improvement may be expected from information such as texture or, in the case of an object-oriented classification procedure, the shape and size of the objects.

Acknowledgements

The author wishes to thank the two reviewers who provided helpful suggestions for improving this manuscript. Aerial photographs were obtained from project N011 Actualización del mapa de uso del suelo y vegetación del Área Protegida “Laguna de Términos” y elaboración de una base cartográfica digital (Conabio, PEMEX, epomex-University of Campeche) and the National Forest Inventory 2000 (UNAM, SEMARNAT, INEGI). The spatial database was elaborated in project N011. The study was carried out under the project CONACyT-SEMARNAT reference number 2002-C01-0075.

References

Atkinson, P.M., Tatnall, A.R.L., 1997. Neural networks in remote sensing. International Journal of Remote Sensing 18 (4), 699–709.
Bishop, C.M., 1995. Neural Networks for Pattern Recognition. Oxford University Press, Oxford, 482 pp.
Blaschke, T., Lang, S., Lorup, E., Strobl, J., Zeil, P., 2000. Object-oriented image processing in an integrated GIS/remote sensing environment and perspectives for environmental applications. In: Cremers, A., Greve, K. (Eds.), Environmental Information for Planning, Politics and the Public, vol. II, Metropolis-Verlag, Marburg, pp. 555–570, http://www.geo.sbg.ac.at/staff/tblaschk/publications/UI2000_Blaschke_et_al.pdf.
Civco, D.L., 1993. Artificial neural network for land cover classification and mapping. International Journal of Geographical Information Systems 7 (2), 173–186.
Dicks, S.E., Lo, T.H.C., 1990. Evaluation of thematic map accuracy in a land-use and land-cover mapping program. Photogrammetric Engineering and Remote Sensing 56 (9), 1247–1252.
Dugan, P., 1988. Wetlands conservation and sustainable development in Latin America and the Caribbean. In: Ecología y Conservación del Delta de los ríos Usumacinta y Grijalva (memorias), Instituto Nacional de Investigación sobre Recursos Bióticos (INIREB), pp. 1–4.
Eastman, J.R., Laney, R.M., 2002. Bayesian soft classification for subpixel analysis: a critical evaluation. Photogrammetric Engineering and Remote Sensing 68 (11), 1149–1154.
Foody, G.M., 1992. A fuzzy sets approach to the representation of vegetation continua from remotely sensed data: an example from lowland heath. Photogrammetric Engineering and Remote Sensing 58 (2), 221–225.
Foody, G.M., 1995. Land cover classification by an artificial neural network with ancillary information. International Journal of Geographical Information Systems 9 (5), 527–542.
Foody, G.M., Arora, M.K., 1997. An evaluation of some factors affecting the accuracy of classification by an artificial neural network. International Journal of Remote Sensing 18 (4), 799–810.
Franklin, S.E., Hall, R.J., Moskal, L.M., Maudie, A.J., Lavigne, M.B., 2000. Incorporating texture into classification of forest species composition from airborne multispectral images. International Journal of Remote Sensing 21 (1), 61–79.
Gong, P., Howarth, P.J., 1990. The use of structural information for improving land cover classification accuracies at the rural–urban fringe. Photogrammetric Engineering and Remote Sensing 56 (1), 67–73.
Hutchinson, C.F., 1982. Techniques for combining Landsat and ancillary data for digital classification improvement. Photogrammetric Engineering and Remote Sensing 8 (1), 123–130.
INEGI, 1984. Carta de uso del suelo y vegetación, escala 1:250,000, hoja Ciudad del Carmen.
Kavzoglu, T., Mather, P.M., 2000. Using feature selection techniques to produce smaller neural networks with better generalisation capabilities. Proceedings of IGARSS 2000, vol. 7, pp. 3069–3071.
Khorram, S., Knight, J., Cakir, H., Yan, H., Mao, Z., Dai, X., 2000. Improving estimates of the accuracy of thematic maps when using aerial photos as the ground reference source. Proceedings of the ASPRS Symposium, Washington, USA.
Kontoes, C., Wilkinson, G., Burril, A., Goffredo, S., Mégier, J., 1993. An experimental system for the integration of GIS data in knowledge-based analysis for remote sensing of agriculture. International Journal of Geographical Information Systems 7 (3), 247–262.
Lek, S., Guégan, J.F., 1999. Artificial neural networks as a tool in ecological modelling, an introduction. Ecological Modelling 120, 65–73.
Lek, S., Delacoste, M., Baran, P., Dimopoulos, I., Lauga, J., Aulanier, S., 1996. Application of neural networks to modelling non-linear relationships in ecology. Ecological Modelling 90, 39–52.
Loa Loza, E., 1994. Los manglares de México: Sinopsis general para su manejo. In: Suman, D.O. (Ed.), El ecosistema de manglar en América Latina y la cuenca del Caribe: Su manejo y conservación. Rosenstiel School of Marine and Atmospheric Science, University of Miami, Florida, Tinker Foundation New York, USA, pp. 144–151.
Long, B.G., Skewes, T.D., 1996. A technique for mapping mangroves with Landsat TM satellite data and geographic information system. Estuarine, Coastal and Shelf Science 43, 373–381.
Mansor, S., Tai Hong, W., Rashid Mohamed Shariff, A., 2002. Object oriented classification for land cover mapping, http://www.gisdevelopment.com/application/environment/overview/envo0010.htm.
Mather, P.M., 1999. Computer Processing of Remotely-Sensed Images, an Introduction. John Wiley & Sons Ltd, U.K., 292 pp.
Mas, J.F., 1999. Monitoring land-cover changes: a comparison of change detection techniques. International Journal of Remote Sensing 20 (1), 139–152.
Mas, J.F., Puig, H., 2001. Modalités de la déforestation dans le Sud-ouest de l'Etat du Campeche, Mexique. Canadian Journal of Forest Research 31 (7), 1280–1288.
Mas, J.F., Ramírez, I., 1996. Comparison of land use classifications obtained by visual interpretation and digital processing. ITC Journal 1996-3/4, 278–283.
Mas, J.F., Velázquez, A., Palacio-Prieto, J.L., Bocco, G., Peralta, A., Prado, J., 2002. Assessing forest resources in Mexico: wall-to-wall land use/cover mapping. Photogrammetric Engineering and Remote Sensing 68 (10), 966–968.
Palubinskas, G., Lucas, R.M., Foody, G.M., Curran, P.J., 1995. An evaluation of fuzzy and texture-based classification approaches for mapping regenerating tropical forest classes from Landsat-TM data. International Journal of Remote Sensing 16 (4), 747–759.
Paola, J.D., Schowengerdt, R.A., 1995. A detailed comparison of backpropagation neural network and maximum-likelihood classifiers for urban land use classification. IEEE Transactions on Geoscience and Remote Sensing 33 (4), 981–996.
Rico-Gray, V., 1982. Estudio de la vegetación de la zona costera inundable del noroeste del Estado de Campeche, México: Los Petenes. Biotica 7 (2), 171–190.
Rosin, P.L., Fierens, F., 1995. Improving neural network generalisation. Proceedings of IGARSS 1995, vol. 2, pp. 1255–1257.
Rumelhart, D.E., Hinton, G.E., Williams, R.J., 1986. Learning internal representations by error propagation. In: Rumelhart, D.E., McClelland, J.L. (Eds.), Parallel Distributed Processing, vol. 1, MIT Press, Cambridge, MA.
Srinivasan, A., Richards, J.A., 1990. Knowledge-based techniques for multi-source classification. International Journal of Remote Sensing 3 (3), 505–525.
Stehman, S.V., Czaplewski, R.L., 1998. Design and analysis for thematic map accuracy assessment: fundamental principles. Remote Sensing of Environment 64, 331–344.
Stehman, S.V., 1997. Selecting and interpreting measures of thematic classification accuracy. Remote Sensing of Environment 62, 77–89.
Tou, J.T., Gonzalez, R.C., 1974. Pattern Recognition Principles. Addison-Wesley Publishing Company, Reading, Massachusetts.
Wang, F., 1990. Improving remote sensing image analysis through fuzzy information representation. Photogrammetric Engineering and Remote Sensing 56 (8), 1163–1168.
Woodcock, C., Gopal, S., 2000. Accuracy assessment and area estimates using fuzzy sets. International Journal of Geographical Information Science 14 (2), 153–172.
Zadeh, L.A., 1978. Fuzzy sets as a basis for theory of possibility. Fuzzy Sets and Systems 1, 3–28.
