Classifiers vs. input variables—The drivers in image classification for land cover mapping



International Journal of Applied Earth Observation and Geoinformation 11 (2009) 423–430


M. Heinl a,b,*, J. Walde c, G. Tappeiner d, U. Tappeiner a,b

a Institute of Ecology, University of Innsbruck, Sternwartestr. 15, 6020 Innsbruck, Austria
b Institute for Alpine Environment, EURAC, Viale Druso 1, 39100 Bolzano, Italy
c Department of Statistics, University of Innsbruck, Universitätsstr. 15, 6020 Innsbruck, Austria
d Department of Economics, University of Innsbruck, Universitätsstr. 15, 6020 Innsbruck, Austria

Article info

Article history: Received 11 November 2008; Accepted 20 August 2009

Keywords: Classification; Landsat; Artificial neural network; Discriminant analysis; Maximum likelihood; Land use; Land cover; Thematic map; Ancillary data

This paper is dedicated to Professor Walter Larcher on the occasion of his 80th birthday.

Abstract

The study investigates the performance of image classifiers for landscape-scale land cover mapping and the relevance of ancillary data for the classification success in order to assess and to quantify the importance of these components in image classification. Specifically tested is the performance of maximum likelihood classification (MLC), artificial neural networks (ANN) and discriminant analysis (DA) based on Landsat7 ETM+ spectral data in combination with topographic measures and NDVI. ANN produced high accuracies of more than 75% even with limited input information, while MLC and DA produced comparable results only by incorporating ancillary data into the classification process. The superiority of ANN classification was less pronounced on the level of the single land cover classes. The use of ancillary data generally increased classification accuracy and showed a potential for increasing classification accuracy similar to that of the selection of the classifier. Therefore, a stronger focus on the development of appropriate and optimised sets of input variables is suggested. The definition and selection of land cover classes has also proven to be crucial and cannot simply be adapted from existing land cover class schemes. A stronger research focus towards discriminating land cover classes by their typical spectral, topographic or seasonal properties is therefore suggested to advance image classification. © 2009 Elsevier B.V. All rights reserved.

1. Introduction

Detailed and accurate land cover data are among the most crucial information required for large-scale environmental research. The knowledge of the spatial configuration of the Earth’s surface is the key for assessing habitat distribution, landscape composition or land use changes and is an essential requirement for landscape modelling and scenario building, particularly in times of global change. The suitability of remote sensing for acquiring land cover data has long been recognised, but the process of generating land cover information from remotely sensed data is still far from being standardised or optimised (Foody, 2002; Lu and Weng, 2007). An extensive variety of multi-spectral image classification methods has been developed, which were recently reviewed by Lu and Weng (2007), though none of the developed classifiers is described as inherently superior to any other, as their performance largely depends on the kind and quality of the input

* Corresponding author at: Institute of Ecology, University of Innsbruck, Sternwartestr. 15, 6020 Innsbruck, Austria. Tel.: +43 512 507 5980. E-mail address: [email protected] (M. Heinl). 0303-2434/$ – see front matter © 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.jag.2009.08.002

data for the classiﬁcation and the desired output. Even unsupervised ISODATA classiﬁcation has been used successfully, for example to extract speciﬁc, spectrally distinct features such as forests, ﬁre scars, coastlines or urban areas (Ekercin, 2007; Heinl et al., 2006; Kaya and Curran, 2006; Souza et al., 2003). However, for obtaining thematic land cover data, supervised classiﬁcation is to be preferred in most cases (Foody, 2001; Jensen, 2005; Kavzoglu, 2009), as desired output classes are already pre-deﬁned and postclassiﬁcation analyses and class aggregations are not necessarily required. Especially the use of advanced approaches such as artiﬁcial neural networks, fuzzy-sets or support vector machines produced levels of accuracy higher than, e.g. the popular maximum likelihood classiﬁer or discriminant analysis (Berberoglu et al., 2007; Dixon and Candade, 2008; Jensen, 2005; Kavzoglu and Mather, 2003; Kavzoglu and Reis, 2008; Pal and Mather, 2005). But only few speciﬁc comparisons have been published (Berberoglu et al., 2007; Hardin, 2000; Kavzoglu and Reis, 2008; Paola and Schowengerdt, 1995; Zhang et al., 2007), usually documenting a superiority of the advanced approaches, but also suggesting maximum likelihood classiﬁcation as better alternative (Carvalho et al., 2004). The use of different numbers and types of land cover classes and sample sizes complicates a quantitative comparison of


the results. And despite the often documented inferiority in classiﬁcation success, maximum likelihood classiﬁcation is still one of the most widely used classiﬁcation algorithms (Jensen, 2005), most likely also due to advantages in data handling and processing times (Paola and Schowengerdt, 1995). Therefore, many applied landscape-scale studies and land use/land cover research rely on these standard classiﬁcation approaches (Brandt and Townsend, 2006; Cushman and Wallin, 2000; Jianchu et al., 2005; Joy et al., 2003; Ruiz-Luna and Berlanga-Robles, 2003). In contrast, advanced approaches are primarily limited to methodological studies for optimising the classiﬁcation process, often using only very limited sample sizes (Fassnacht et al., 2006; Foody, 2001; Kavzoglu and Mather, 2003; Kavzoglu and Reis, 2008; Ouyang and Ma, 2006; Paola and Schowengerdt, 1997; Yemefack et al., 2006). Besides the type of image classiﬁer, the use of ancillary data is recognised as being crucial for the performance of image classiﬁcation. Ancillary data have been used successfully to improve image classiﬁcation, especially by including topographic measures, NDVI or texture measures in the classiﬁcation process additionally to the spectral information for separating features with similar spectral properties (Berberoglu et al., 2007; Carpenter et al., 1999; Giannetti et al., 2001; Islam et al., 2008; Joy et al., 2003; Kozak et al., 2008; Lu and Weng, 2007; Saadat et al., 2008; Watanachaturaporn et al., 2008). Despite extensive research on classiﬁers and ancillary data since decades, comparisons and applications of image classiﬁers using standardised samples on landscape-scale are largely missing (Lu and Weng, 2007). To overcome this discrepancy, the present study was conducted mutually both on the performance of different classiﬁers and on the importance of ancillary data for landscape-scale land cover assessments using pre-deﬁned land cover classes. 
The present study investigates therefore the effect of a variety of selected and widely accessible input variables and classiﬁers on classiﬁcation accuracy overall and on the level of speciﬁc land cover classes, and assesses and especially quantiﬁes the importance of these components in image classiﬁcation. We hypothesize that advanced classiﬁcation approaches achieve higher overall accuracies compared to standard classiﬁers with little or no ancillary data, while incorporating ancillary data reduces the importance of the type of classiﬁer. Speciﬁcally compared are the performance of maximum likelihood classiﬁcation, discriminant analysis and artiﬁcial neural networks, covering presumably the most widely used hard classiﬁers and representing parametric and non-parametric classiﬁers. Ancillary data in the form of topographic measures and NDVI were incorporated stepwise into the classiﬁcation to document the relevance of these input data. Classiﬁcation results on the level of land cover classes are discussed in the context of reference data selection and land cover class deﬁnition.

2. Methods

2.1. Spectral data properties and study region

The spectral information for the image classification was acquired by the Landsat7 ETM+ sensor (path 193/row 027) on 13 September 1999. The imagery was provided by the Global Land Cover Facility (GLCF) (www.landcover.org) as orthorectified GeoCover data set in GeoTIFF format with UTM projection (UTM 32N), WGS-84 datum, and 28.5 m pixel size. The six bands representing the visible and infrared spectrum (ETM+ bands 1–5, 7) were used in the study. The scene was cut to 1650 × 3300 pixels to fit the extent of available reference data and covers 3541 km², including the city of Innsbruck (Austria) in the north-east (Fig. 1). The landscape is mountainous with elevation ranging from 390 m to 3739 m a.s.l. and consists primarily of grasslands and urban areas in the valley bottoms and of forests, alpine grasslands and bare rocks and glaciers on slopes and at high altitude.

Fig. 1. Outline of the districts providing the reference data (white areas), presented on Landsat7 ETM+ imagery (false colour composite of bands 5, 4, 3 with UTM coordinates, Zone 32N).

2.2. Ancillary data

Elevation, slope, aspect, sun elevation angle (terrain illumination) and the Normalised Difference Vegetation Index (NDVI) were calculated and used as ancillary data in the classification process. A digital elevation model with 90 m resolution (pixel size) (Jarvis et al., 2006) was used to derive elevation, slope and aspect for the study region in ArcGIS 9.2. The resulting data were resampled to the spatial resolution of the spectral data (28.5 m) by cubic convolution. The three terrain measures were used in combination as one classification input package (DEM).

The sun elevation angle cos(i) was used as a measure of terrain illumination and accounts for topographic effects (Teillet et al., 1982), calculated as $\cos(i) = \cos\theta_n \cos\theta_{sz} + \sin\theta_n \sin\theta_{sz} \cos(\phi_s - \phi_n)$, where $\theta_n$ is the terrain slope angle, $\theta_{sz}$ is the solar zenith angle, $\phi_s$ is the solar azimuth angle and $\phi_n$ is the aspect angle (Twele and Erasmi, 2005). Solar zenith and azimuth angle were derived from the Landsat metadata file. Slope and aspect angle were calculated from the digital elevation model.

The Normalised Difference Vegetation Index (NDVI) was calculated as $\mathrm{NDVI} = (\mathrm{NIR}_{TM4} - \mathrm{RED}_{TM3}) / (\mathrm{NIR}_{TM4} + \mathrm{RED}_{TM3})$.

2.3. Reference data

Reference data for training and validating the classifications were provided by an extensive land cover assessment campaign for selected districts in North Tyrol, Austria (Neustift, Fulpmes, Mutters, Innsbruck, Längenfeld), South Tyrol, Italy (St. Leonhard in Passeier, St. Martin in Passeier) and Upper Bavaria, Germany (Garmisch-Partenkirchen, Farchant) (Tasser et al., 2009). The land cover data were derived from visual interpretation of aerial photography from 2000 in combination with field sampling in 2000/2001. The data were mapped consistently at a scale of 1:10 000 with a minimum mapping unit (MMU) of 4 ha. Only pixels within the core areas of the reference data polygons, i.e. pixels at least 50 m away from the polygon boundary (10 m for water courses), were included to reduce delineation errors. The data were transferred to raster format using the spatial resolution of the spectral data (28.5 m), which resulted in 716 524 pixels as reference data.

The land cover classes used for the mapping were reclassified for the present study to meet the European CORINE Level 2 data criteria (Bossard et al., 2000; Nunes de Lima, 2005). The hierarchical CORINE land cover classification scheme includes in Level 1 ‘artificial surfaces’, ‘agricultural areas’, ‘forest and semi-natural areas’, ‘wetlands’ and ‘water bodies’, and 15 classes in Level 2, of which 11 were recorded in the study area. Only class 34 (‘glaciers and perpetual snow’) was introduced from CORINE Level 3 to account for the wide areas covered by snow and glaciers in the study area, so that 12 land cover classes were used in the present study (Table 1).

2.4. Classification process

For assessing the relevance of classifiers and input variables for the classification success, 15 classifications were calculated, using three classifiers with five different input combinations. The input variables included (1) the spectral Landsat information from bands 1–5 and 7 (ETM), (2) ETM in combination with the topographic measures elevation, slope, aspect (DEM), (3) ETM, DEM and NDVI, (4) ETM, DEM and cos(i), and (5) ETM, DEM, NDVI, cos(i). Supervised classification was performed using discriminant analysis (DA) in SPSS, maximum likelihood classification (MLC) in Geomatica and the artificial neural network (ANN) in MATLAB.

The MLC was calculated so that every pixel was assigned to a training class and no Null-class was created. DA was calculated using prior probabilities equal for all groups. (Although prior probabilities proportional to the group sample size usually lead to higher classification accuracy, very small classes are rarely preserved in the classification outcome.) MLC and DA were calculated with 30% (214 931) randomly selected pixels of the reference data (training data), and were validated by the remaining 70% (501 593) of the reference data. Accuracy assessment is based on the agreement between validation data and classification outcomes.

The artificial neural network, in particular a fully connected three-layer perceptron (MLP), was used as non-linear analyzing tool. The MLP consisted of an input layer containing 6, 9, 10 or 11 processing units according to the number of input variables, of a hidden layer including as many processing units as necessary to approximate the relationship, and of an output layer. The output layer had 12 output units corresponding to the land cover classes (cf. Table 1). The logistic function was employed as activation function for the output units to ensure the interpretation of the output as probability. The hyperbolic tangent function was employed as activation function for the hidden layer. The Levenberg–Marquardt algorithm (Bishop, 1996) was used as training algorithm. The optimal hit ratio, which was defined as the proportion of correctly classified pixels to all pixels in the validation set, was achieved with 40 hidden units.

A sample of 100 000 pixels (based on the 30% (214 931) initial training data sets) was selected randomly for the ANN classification, of which 70% were used as training data to estimate the parameters and 30% were used to control for the generalization ability. In order to have equal prior probabilities, each class was represented by 8333 data sets. For classes with less than this number of pixels the units in the random sample were duplicated. Final validation of the ANN classification was performed with the same 70% (501 593 pixels) of the reference data set as for MLC and DA.

2.5. Post-classification assessments

Classification accuracy was assessed as overall accuracy, representing the proportion of ‘correctly’ classified pixels (i.e. pixels with corresponding reference and classification class) relative to the total amount of investigated pixels. For detailed analyses, user’s and producer’s accuracy as well as the Kappa coefficient of agreement were also calculated (cf. Foody, 2002; Jensen, 2005). A significance level of 1% was chosen. Upper and lower limits of the confidence interval were computed using the formula for confidence intervals for proportions (e.g. in Thomas and Allcock, 1984) for each class separately in order to consider the underlying heterogeneity. Afterwards the weighted average was calculated to obtain the limits for the overall accuracy. Additionally, a few training samples were drawn randomly and the methods optimised on the new samples. Subsequently the distribution of the obtained overall accuracies was analysed. As the width of the confidence intervals obtained in this way was quite similar to the width of the confidence intervals calculated before, these results are omitted for the sake of brevity.

Due to a spatial resolution of 28.5 m of the input data (MMU of 0.08 ha), the classification results are provided in a different level of detail than the reference data (MMU of 4 ha). Therefore, classification accuracy is underestimated, as a discrepancy between reference and classification data is not necessarily a result of misclassification but can also be caused by the missing level of detail in the reference data. These differences in MMU were addressed in an additional assessment by limiting the validation data to pixels from pixel clusters larger than 4 ha in the classification outcomes.

3. Results

3.1. Overall classification accuracy related to classifiers and input variables

The classifications by DA and MLC produced very similar overall accuracies for all input combinations.
Accuracies were in the range of 55–60% for using only spectral data (ETM) as input variables and reached about 75% when ancillary data were included (Fig. 2). The classiﬁcations using ANN produced higher overall accuracies for all input combinations compared to MLC and DA, reaching about 75% for using only spectral data (ETM) and 85% with ancillary data. Maximum overall classiﬁcation accuracy of 86.3% was achieved by using ANN and all input information, lowest accuracy of 56.3% resulted from DA using spectral information (ETM) only. The stepwise incorporation of ancillary data into the classiﬁcation process showed the most pronounced increase in classiﬁcation accuracy for DEM data (i.e. elevation, slope and aspect), independent of the classiﬁer. The increase of classiﬁcation accuracy by incorporating DEM data was highest for MLC and DA with 18.8% and 21.1%, respectively, and lowest for ANN with 8.4%. The further incorporation of NDVI values increased classiﬁcation accuracy


Table 1. Number and percentage of pixels of the land cover classes in the reference data.

LCC    Class description                             Abbr.   Number of pixels
11     Urban fabric                                  urb      21,666  (3.0%)
12     Industrial, commercial and transport units    ind       4,412  (0.6%)
14     Artificial, non-agric. vegetated areas        art          96  (<0.1%)
21     Arable land                                   arab      1,453  (0.2%)
22     Permanent crops                               crop        301  (<0.1%)
23     Grassland                                     grass    33,509  (4.7%)
31     Forests                                       for     356,986  (49.8%)
32     Scrub and/or herbaceous associations          scrub   108,629  (15.2%)
33     Open spaces with little or no vegetation      open    156,446  (21.8%)
34     Glaciers and perpetual snow                   glac     30,794  (4.3%)
41     Inland wetlands                               wet         596  (0.1%)
51     Inland waters                                 wat       1,636  (0.2%)
Total                                                        716,524  (100%)

Land cover class coding (LCC) and description follow the CORINE Land Cover Level 2 nomenclature (Bossard et al., 2000; Nunes de Lima, 2005).
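The strongly uneven class sizes in Table 1 are the reason why the ANN sample was balanced to 8333 data sets per class, with duplication for small classes. A minimal Python sketch of such a balanced draw follows. It is an illustration only, not the authors' MATLAB procedure: the function name `balanced_sample` is hypothetical, and drawing small classes with replacement is an assumed way of realising the duplication mentioned in the text.

```python
import random
from collections import defaultdict


def balanced_sample(labels, per_class, seed=0):
    """Return indices that give `per_class` samples for every class.

    Classes with enough members are sampled without replacement.
    Smaller classes are sampled with replacement, so their members are
    duplicated (an assumption about how duplication was done).
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    chosen = []
    for lab in sorted(by_class):
        members = by_class[lab]
        if len(members) >= per_class:
            chosen.extend(rng.sample(members, per_class))
        else:
            chosen.extend(rng.choices(members, k=per_class))
    return chosen


# Toy labels: class 31 (forest) is common, class 14 is rare.
toy_labels = [31] * 50 + [23] * 10 + [14] * 2
sample = balanced_sample(toy_labels, per_class=5)
counts = {lab: sum(1 for i in sample if toy_labels[i] == lab)
          for lab in (14, 23, 31)}
print(counts)

# 100 000 pixels spread evenly over 12 classes: the per-class size in the text.
per_class_in_study = 100_000 // 12
print(per_class_in_study)
```

With equal class sizes in the training sample, every class enters the training with the same prior weight, which is the stated purpose of the balancing.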

by roughly 2–3%; including cos(i) increased the classification accuracy by roughly 1–2%, and their combined inclusion increased classification accuracy by about 3–4%. Only pixels from pixel clusters larger than 4 ha were then used to correct for the differences in the MMU of reference and image data. The effect of this correction is an increase in overall accuracy to a final maximum in this study of 89.5% (Kappa: 0.84) for MLC, 89.2% (Kappa: 0.83) for DA and 94.3% (Kappa: 0.91) for ANN classification, respectively. Overall, classifications with the same input information were always most accurately classified by ANN. Regarding the input variables, especially DEM data significantly increased the classification accuracy and the increase was most pronounced for DA and MLC classification. Considering the MMU in the reference data further increased overall accuracy by about 9%. The maximum overall accuracy reached by all three classifiers was in the magnitude of about 90%.

3.2. Accuracy of single land cover classes related to classifiers and input variables

The relation just described, i.e. increasing overall accuracy with increasing input variables and highest accuracies by ANN classification, could only partially be supported by the results for single land cover classes, i.e. by ‘industrial units’ (12), ‘forest’ (31), ‘scrub’ (32) and ‘open’ (33) (Fig. 3). In contrast, classification success was only little affected by the input variables for ‘art. veg. areas’ (14), ‘arable’ (21), ‘crop’ (22), ‘wetland’ (41) or ‘water’ (51). These classes were also in general not very well classified by ANN and even empty output classes were produced. As a third group, the classes ‘urban fabric’ (11), ‘grassland’ (23) and ‘glaciers’ (34) were least accurately classified by ANN for ETM as single input information, though incorporating further input variables resulted again in highest accuracies by ANN.
In the case of the ‘grassland’ (23) class, also the incorporation of NDVI data clearly increased classification accuracy, besides the already mentioned general positive effect of DEM data on overall accuracy. This indicates the importance of ancillary data, especially DEM, but also NDVI data, additionally to the spectral information (ETM), for the discrimination of specific land cover classes.

3.3. “Confusion” of land cover classes

Misclassifications between single land cover classes are best assessed by an error or confusion matrix. Exemplary for the classification outcomes, the results from discriminant analysis,

Fig. 2. Overall accuracies of maximum likelihood classiﬁcation (MLC), discriminant analysis (DA) and artiﬁcial neural network classiﬁcation (ANN) with different input data combinations (ETM: Landsat7 ETM+ bands 1–5, 7; DEM: aspect, slope, elevation; NDVI; cos(i): terrain illumination). Presented are also overall accuracies after considering the minimum mapping unit (MMU) in the reference data, i.e. the assessment was limited to pixel clusters larger than the MMU of 4 ha (cf. Section 2).
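The MMU restriction mentioned in the caption can be made concrete with a short Python sketch. It first converts the 28.5 m pixel size into hectares and then flags pixels that belong to clusters larger than a given number of pixels. This is only an illustration under stated assumptions: the paper does not say how clusters were delineated, so 4-connectivity is assumed here, and the function names `cluster_sizes` and `mmu_mask` are hypothetical.

```python
from collections import deque

PIXEL_SIZE_M = 28.5   # pixel edge length of the spectral data (from the text)
MMU_HA = 4.0          # minimum mapping unit of the reference data (from the text)

pixel_area_ha = PIXEL_SIZE_M ** 2 / 10_000   # 1 ha = 10 000 m^2
mmu_pixels = MMU_HA / pixel_area_ha          # pixels needed to cover one MMU


def cluster_sizes(grid):
    """Give every cell the size of its 4-connected cluster of equal class."""
    rows, cols = len(grid), len(grid[0])
    sizes = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if sizes[r][c]:
                continue  # already visited
            members, queue = [(r, c)], deque([(r, c)])
            sizes[r][c] = -1  # temporary marker: visited
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and sizes[ny][nx] == 0 and grid[ny][nx] == grid[r][c]:
                        sizes[ny][nx] = -1
                        members.append((ny, nx))
                        queue.append((ny, nx))
            for y, x in members:
                sizes[y][x] = len(members)
    return sizes


def mmu_mask(grid, min_pixels):
    """True where a pixel belongs to a cluster larger than `min_pixels`."""
    return [[s > min_pixels for s in row] for row in cluster_sizes(grid)]


# Toy classification outcome with classes 31 (forest), 23 and 33.
toy = [[31, 31, 31],
       [31, 23, 31],
       [33, 33, 31]]
print(round(pixel_area_ha, 3), round(mmu_pixels, 1))
print(mmu_mask(toy, min_pixels=2))
```

One pixel covers roughly 0.08 ha, so a 4 ha unit corresponds to about 49 pixels; in the toy grid only the forest cluster survives a threshold of two pixels.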

using all input variables (ETM, DEM, NDVI, cos(i)) and considering the MMU are presented (Table 2). Producer’s accuracies higher than 80% were recorded for 6 out of the 12 land cover classes, and all classes showed producer’s accuracies higher than 50%. User’s accuracies higher than 80% were also produced for 6 out of the 12 land cover classes. The other classes showed user’s accuracies below 50%, ‘artiﬁcial, non-agric. vegetated areas’ (14) and ‘wetland’ (41) even below 10%. The land cover classes can be grouped into three main categories according to the classiﬁcation success. One group consists of classes with both high user’s and producer’s accuracy, including ‘forest’ (31), ‘scrub’ (32), ‘open’ (33) and ‘glaciers’ (34). A second group consists of classes with high user’s accuracy and low producer’s accuracy, which includes ‘urban fabric’ (11) and ‘grassland’ (23). ‘Urban fabric’ (11) was primarily misclassiﬁed as ‘industrial unit’ (12) inside the group of urban classes. ‘Grassland’ (23) was primarily misclassiﬁed as ‘forest’ (31) and ‘wetland’ (41), but also to a lesser extent as ‘arable land’ (21), ‘scrub’ (32), ‘crop’ (22) and ‘artiﬁcial, non-agric. vegetated areas’ (14). The third group consists of classes with high producer’s accuracy and low user’s accuracy, including ‘industrial units’ (12), ‘artiﬁcial, non-agric. vegetated areas’ (14), ‘arable land’ (21), ‘crop’ (22), ‘wetland’ (41) and ‘water’ (51). Areas classiﬁed as ‘industrial units’ (12) included primarily pixels from ‘urban fabric’ (11), indicating again a confusion inside the ‘urban’ classes. The ‘artiﬁcial, non-agric. vegetated areas’ (14) classiﬁcation was mainly confused by falsely included pixels from ‘forest’ (31), ‘urban fabric’ (11) and ‘scrub’ (32). Both ‘arable land’ (21), ‘crop’ (22) and also ‘wetland’ (41) were primarily misclassiﬁed due to the inclusion of pixels originating from ‘grassland’ (23). 
The areas classified as ‘water’ (51) included primarily pixels from ‘forest’ (31), but also from ‘open’ habitats (33). Adequate classification accuracy of about 80% or higher could therefore only be achieved for ‘forest’ (31), ‘scrub’ (32), ‘open’ (33) and ‘glaciers’ (34). Considering the misclassification between the two classes ‘urban fabric’ (11) and ‘industrial units’ (12) would also qualify a combined urban class as well represented by the classification.

4. Discussion

4.1. The relevance of input variables and classifiers for image classification accuracy

Spectral data, topographic measures and NDVI data were used to test their performance in image classifications by maximum


Fig. 3. Number of correctly classiﬁed pixels by MLC (*), DA (&) and ANN (!) classiﬁcation for each land cover class, using different input data combinations (ETM: Landsat7 ETM+ bands 1–5, 7; DEM: aspect, slope, elevation; NDVI; cos(i): terrain illumination). Zero values for classes not present in the classiﬁcation outcomes were omitted.
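The two derived inputs named in the caption, terrain illumination cos(i) and NDVI, can be sketched in Python as follows. This is a minimal per-pixel illustration, not the authors' GIS workflow. It assumes the standard illumination formula cos(i) = cos(slope) cos(zenith) + sin(slope) sin(zenith) cos(azimuth − aspect) with all angles in degrees; the function names and the zero-denominator handling in `ndvi` are assumptions made for the example.

```python
import math


def terrain_illumination(slope_deg, aspect_deg, solar_zenith_deg, solar_azimuth_deg):
    """cos(i) for one pixel from slope, aspect and the solar position."""
    slope = math.radians(slope_deg)
    aspect = math.radians(aspect_deg)
    zenith = math.radians(solar_zenith_deg)
    azimuth = math.radians(solar_azimuth_deg)
    return (math.cos(slope) * math.cos(zenith)
            + math.sin(slope) * math.sin(zenith) * math.cos(azimuth - aspect))


def ndvi(nir, red):
    """(NIR - RED) / (NIR + RED); returns 0.0 if both bands are zero (assumption)."""
    total = nir + red
    return (nir - red) / total if total else 0.0


# Flat terrain: illumination reduces to the cosine of the solar zenith angle.
flat = terrain_illumination(0.0, 0.0, 50.0, 150.0)
# A slope tilted directly towards the sun receives perpendicular light.
facing = terrain_illumination(50.0, 150.0, 50.0, 150.0)
# Illustrative band values for a vegetated pixel (TM4 = NIR, TM3 = red).
vegetated = ndvi(0.5, 0.1)
print(flat, facing, vegetated)
```

The two limiting cases show why cos(i) separates shaded from sunlit slopes: it equals cos(zenith) on flat ground and reaches 1 where the surface faces the sun directly.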

likelihood classification (MLC), discriminant analysis (DA) and artificial neural networks (ANN). The use of ancillary data significantly improved the classification accuracy for the present data set compared to using spectral data (ETM) only. These increases in overall accuracy were observed independent of the classifier. Especially incorporating topographic information (elevation, slope, aspect; DEM), but also the NDVI, showed positive effects on the overall accuracy, indicating strong interdependences between these factors and land cover. The NDVI is derived from surface reflectance and, therefore, inevitably responds to land cover; but due to its nonlinearity, the index adds new information to the spectral bands, enhancing class separability (Cihlar et al., 1996; Defries and Townshend, 1994; Hansen et al., 2000; Jensen, 2005; Muchoney and Strahler, 2002). Topography, in contrast, is only an indirect measure, reflecting environmental gradients (e.g. temperature) that originally affect the land cover. Hence, land cover is not a response to topography, but to environmental conditions or land use often associated with topography. The incorporation of topographic data into the classification process will therefore not increase classification accuracy in all cases, but becomes most relevant where it reflects environmental gradients (e.g. as in the case of mountainous regions in the present study) (Islam et al., 2008; Saadat et al., 2008). Therefore, also hydrological, soil or geological data, etc. can be used

successfully for improving image classiﬁcation accuracy. However, environmental gradients or landscape heterogeneity need to be reﬂected, so that their relevance becomes obviously dependent on the study area (Giannetti et al., 2001; Mas, 2004; Shrestha and Zinck, 2001; Watanachaturaporn et al., 2008). Regarding the classiﬁers, the study revealed large differences in classiﬁcation accuracy between ANN and MLC or DA for the minimum input setting (ETM). The differences between the classiﬁers are less pronounced when comparing the classiﬁcation results using the maximum number of input variables (cf. Fig. 2). The level of classiﬁcation accuracy was similar for DA and MLC including ancillary data and ANN without ancillary data. This suggests that investing in the incorporation of ancillary data can be as productive for increasing classiﬁcation accuracy as the preparation of classiﬁcation algorithms: incorporating ancillary data increased overall classiﬁcation accuracy by about 10% for ANN and by about 20% for MLC and DA, while the different classiﬁers varied in the range of 20% for ETM as sole input information and differed otherwise less than 10%. Overall, this indicates a superiority of ANN classiﬁcation in case of limited input information, the necessity of incorporating ancillary data when using DA or MLC for achieving adequate classiﬁcation accuracies comparable to ANN, and a minor importance of the type of classiﬁer with increasing


Table 2. Error matrix of the supervised classification by discriminant analysis (DA). Rows: classification outcome; columns: reference data.

LCC            11      12      14      21      22      23      31      32      33      34      41      51      Sum  UA [%]
              urb     ind     art    arab    crop   grass     for   scrub    open    glac     wet     wat
11 urb      3,834       0       1       0       0       7       0       0      81       0       0       1    3,924    97.7
12 ind      1,766   1,564       3       4       0      49       0       4      50       0       0      22    3,462    45.2
14 art        220       6      29      17       0     133   2,564     220       3       0       6      25    3,223     0.9
21 arab         1       0       0     162       0     591       0       0       0       0       0       0      754    21.5
22 crop         0       0       0       9     109     246      25       0       0       0       0       0      389    28.0
23 grass        3       3       5       0       0   5,195     373     631       0       0       0      12    6,222    83.5
31 for         23      17       0       5       6   1,863 208,659   3,677   1,158       0      20      75  215,503    96.8
32 scrub        0       0       0       0       0     572   5,358  49,708   6,250       0      98       0   61,986    80.2
33 open         0       0       0       0       0       0   1,086  10,931  87,417   2,698       0      12  102,144    85.6
34 glac         0       0       0       0       0       0       0      38   1,580  17,968       0       0   19,586    91.7
41 wet          4      10       0       0       0   1,372     166     320      26       0     140       4    2,042     6.9
51 wat         12       5       0       1       0       8     605       1     200       0       0     494    1,326    37.3
Sum         5,863   1,605      38     198     115  10,036 218,836  65,530  96,765  20,666     264     645  420,561
PA [%]       65.4    97.4    76.3    81.8    94.8    51.8    95.3    75.9    90.3    86.9    53.0    76.6

Input data are the Landsat bands 1–5 and 7 (ETM), elevation, slope and aspect (DEM), NDVI and cos(i). Only pixels from pixel clusters larger than 4 ha are considered in order to correct for the MMU in the reference data. See Table 1 for coding and abbreviation of the land cover classes (LCC). UA: user’s accuracy; PA: producer’s accuracy. Overall accuracy: 89.2. Kappa coefficient: 83.5. All underlined values indicate hit ratios that are significantly higher than the corresponding hit ratios computed by employing a-priori probabilities. A-priori probabilities for class membership may either be obtained by the equal distribution of the classes or by the empirically given class distribution.
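The accuracy measures reported with Table 2 follow directly from the error matrix. The Python sketch below recomputes overall accuracy, user's and producer's accuracy and the Kappa coefficient from the tabulated counts, with rows taken as classification outcome and columns as reference data. It is an arithmetic illustration only; the function name `accuracy_measures` and the variable names are hypothetical, and the standard definition of Cohen's Kappa is assumed.

```python
CLASSES = ["urb", "ind", "art", "arab", "crop", "grass",
           "for", "scrub", "open", "glac", "wet", "wat"]

# Error matrix from Table 2: rows = classified class, columns = reference class.
MATRIX = [
    [3834, 0, 1, 0, 0, 7, 0, 0, 81, 0, 0, 1],
    [1766, 1564, 3, 4, 0, 49, 0, 4, 50, 0, 0, 22],
    [220, 6, 29, 17, 0, 133, 2564, 220, 3, 0, 6, 25],
    [1, 0, 0, 162, 0, 591, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 9, 109, 246, 25, 0, 0, 0, 0, 0],
    [3, 3, 5, 0, 0, 5195, 373, 631, 0, 0, 0, 12],
    [23, 17, 0, 5, 6, 1863, 208659, 3677, 1158, 0, 20, 75],
    [0, 0, 0, 0, 0, 572, 5358, 49708, 6250, 0, 98, 0],
    [0, 0, 0, 0, 0, 0, 1086, 10931, 87417, 2698, 0, 12],
    [0, 0, 0, 0, 0, 0, 0, 38, 1580, 17968, 0, 0],
    [4, 10, 0, 0, 0, 1372, 166, 320, 26, 0, 140, 4],
    [12, 5, 0, 1, 0, 8, 605, 1, 200, 0, 0, 494],
]


def accuracy_measures(matrix):
    """Return (n, overall, user's, producer's, kappa) for a square error matrix."""
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]                       # classified
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]  # reference
    diagonal = [matrix[i][i] for i in range(k)]
    overall = sum(diagonal) / n
    users = [d / t if t else 0.0 for d, t in zip(diagonal, row_totals)]
    producers = [d / t if t else 0.0 for d, t in zip(diagonal, col_totals)]
    # Chance agreement expected from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    kappa = (overall - expected) / (1 - expected)
    return n, overall, users, producers, kappa


n, overall, users, producers, kappa = accuracy_measures(MATRIX)
print(f"n = {n}, overall = {overall:.3f}, kappa = {kappa:.3f}")
for name, ua, pa in zip(CLASSES, users, producers):
    print(f"{name:6s} UA = {100 * ua:5.1f}%  PA = {100 * pa:5.1f}%")
high_ua = sum(ua > 0.8 for ua in users)
high_pa = sum(pa > 0.8 for pa in producers)
print(high_ua, high_pa)
```

The counts of classes exceeding 80% user's and producer's accuracy computed at the end correspond to the "6 out of the 12 land cover classes" statements in Section 3.3.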

number of inputs. The latter becomes apparent when assuming that all classifiers ideally work towards the same level of accuracy, i.e. 100%, so that high classification accuracies of different classifiers will inevitably converge. But this convergence is basically driven by the input variables. Choosing a superior classifier such as ANN (Hardin, 2000; Jensen, 2005; Kavzoglu and Mather, 2003; Zhang et al., 2007) is therefore expected to become less crucial the more input information is included in the classification. However, ancillary data need to be carefully selected, as increasing the input information does not necessarily enhance classification accuracy (Kavzoglu and Mather, 2002). The selection of appropriate ancillary data becomes therefore highly relevant for trying to achieve high classification accuracies, especially with standard classifiers like MLC or DA. Also the accuracy assessment on the level of the single land cover classes often revealed more pronounced advantages of incorporating ancillary data compared to the selection of the classifier (cf. Fig. 3). Only ‘industrial units’ (12), ‘forest’ (31), ‘scrub’ (32) and ‘open’ (33) produced the same superiority of ANN classification for all input combinations that was documented for the overall accuracy. But these classes represent more than 85% of the reference data set, so that they obviously dominate the trends in overall accuracy. In contrast, ‘urban fabric’ (11), ‘grassland’ (23) and ‘glaciers’ (34) produced lowest accuracies for ANN classification without ancillary data and only the incorporation of DEM and NDVI led to accuracies comparable with MLC and DA. As these classes represent only 12% of the reference data, their accuracies do not affect overall accuracy, but they are of course highly relevant for land cover mapping.
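The two shares quoted here can be checked against the pixel counts of Table 1. The following Python lines are only an arithmetic sketch; the dictionary `PIXELS` and the helper `share` are illustrative names, not part of the study.

```python
# Reference pixels per land cover class (LCC code -> count), from Table 1.
PIXELS = {11: 21666, 12: 4412, 14: 96, 21: 1453, 22: 301, 23: 33509,
          31: 356986, 32: 108629, 33: 156446, 34: 30794, 41: 596, 51: 1636}

total = sum(PIXELS.values())


def share(classes):
    """Percentage of all reference pixels that fall into the given classes."""
    return 100.0 * sum(PIXELS[c] for c in classes) / total


# Classes for which ANN was superior for all input combinations.
dominant = share([12, 31, 32, 33])
# Classes that ANN classified worst without ancillary data.
minor = share([11, 23, 34])
print(total, round(dominant, 1), round(minor, 1))
```

The first group accounts for roughly 87% of the reference pixels and the second for about 12%, which is why the first group dominates the overall accuracy.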
Therefore, despite the relatively small effect of ancillary data on overall accuracy in the ANN classifications, the assessment of the single land cover classes clearly revealed that ancillary data are important for specific classes in ANN classification as well, and not only for MLC and DA as the overall accuracy results suggest. Including ancillary data in the classification process is therefore considered crucial for achieving high classification accuracies, independent of the classifier, both at the level of overall accuracy and at the level of the single land cover classes. This picture is, however, not reflected in the scientific literature (Lu and Weng, 2007), which discusses the techniques and performance of image classifiers rather extensively (Dixon and Candade, 2008; Foody, 2001; Kavzoglu and Mather, 2003; Kavzoglu and Reis, 2008; Ouyang and Ma, 2006; Pal and Mather, 2005; Paola and Schowengerdt, 1997) but pays only little attention to developing and optimising input variables (Berberoglu et al., 2007; Carrão et al., 2008; Zhu and Tateishi, 2006). A stronger focus on the development of appropriate and optimised sets of input variables for discriminating specific land cover classes, rather than on image classifiers, would therefore be an important step for further advances in image classification.

4.2. Aspects of reference data selection and class definition

Besides input variables and classifiers, reference data selection and land cover class definition are also of crucial importance for any kind of supervised image classification. Selecting reference data from polygons with a minimum mapping unit (MMU) has proven problematic, as the discrepancy in mapping detail between reference and image data produces inaccurate land cover information for areas smaller than the MMU; this is, however, largely unavoidable in any non-pixel-based approach (Cihlar et al., 1996). In the present study, the effect of the MMU was illustrated by the increase in overall accuracy of about 10% after excluding pixels that belong to pixel clusters smaller than the MMU; considering the MMU in the accuracy assessment is therefore clearly important. The confusion matrix derived for the classification by discriminant analysis (DA) revealed not only the classification accuracy of the single land cover classes but also the effects of inappropriate land cover class definition, which became especially evident for ‘grassland’ (23) and the classes representing artificial surfaces.
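The exclusion of pixel clusters smaller than the MMU, as discussed above, can be sketched as a connected-component filter on a class map. This is an illustrative reconstruction, not the procedure used in the study: 4-neighbour connectivity, a 30 m pixel size, the 4 ha threshold taken from the note to Table 2, and the function names `mmu_mask` and `min_pixels_for_mmu` are all assumptions.

```python
import math
from collections import deque


def min_pixels_for_mmu(mmu_ha, pixel_size_m):
    """Smallest number of pixels whose total area reaches the MMU."""
    pixel_area_ha = pixel_size_m ** 2 / 10_000  # 1 ha = 10,000 square metres
    return math.ceil(mmu_ha / pixel_area_ha)


def mmu_mask(class_map, min_pixels):
    """Flag pixels that belong to sufficiently large same-class clusters.

    class_map: 2D list of class codes. Returns a 2D list of booleans that
    is True where the pixel lies in a 4-connected cluster of at least
    min_pixels pixels of the same class.
    """
    rows, cols = len(class_map), len(class_map[0])
    keep = [[False] * cols for _ in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            label = class_map[r][c]
            cluster = []
            queue = deque([(r, c)])
            seen[r][c] = True
            # Breadth-first search over neighbours with the same class code.
            while queue:
                y, x = queue.popleft()
                cluster.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and class_map[ny][nx] == label):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(cluster) >= min_pixels:
                for y, x in cluster:
                    keep[y][x] = True
    return keep


# Hypothetical 3 x 3 class map; a threshold of 4 pixels is used here
# only so that the tiny example shows both kept and dropped clusters.
demo = [
    [1, 1, 2],
    [1, 1, 2],
    [3, 1, 2],
]
demo_mask = mmu_mask(demo, 4)
threshold = min_pixels_for_mmu(4, 30)  # pixels needed to reach 4 ha at 30 m
print(threshold, demo_mask)
```

Only pixels where the mask is True would then enter the accuracy assessment; under the stated assumptions a 4 ha MMU corresponds to clusters of at least 45 pixels.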
The ‘grassland’ (23) class showed low producer’s accuracy, and its pixels were mostly misclassified as ‘forest’ (31), ‘scrub’ (32) and ‘wetland’ (41) (cf. Table 2), which indicates that this class is defined too heterogeneously. Indeed, ‘grassland’ (23) includes intensive meadows, extensively managed pastures and even abandoned grasslands, and therefore covers many different ‘subclasses’ whose spectral properties differ because of woody components, senescent vegetation or management; this would call for a refinement of the class. However, comparing the classification success of the ‘grassland’ (23) class across the different classifiers indicates a better separability by ANN. Artificial neural networks have indeed been found to be more robust to training site heterogeneity than other classifiers (Kavzoglu and Reis, 2008; Muchoney and Strahler, 2002; Paola and Schowengerdt, 1995), although this could only be supported in this study when ancillary data were incorporated. ANN classification therefore seems less sensitive to class definition than MLC or especially DA, provided that sufficient non-spectral information is available for discriminating the classes. Inappropriate class definition most likely also affects the classification accuracy of other classes, e.g. the land cover classes ‘arable’ (21), ‘crop’ (22) and ‘wetland’ (41), which show low user’s accuracies and falsely include many ‘grassland’ (23) pixels. These land cover types are obviously rather similar both spectrally and topographically, and separating them in image classification would require further information, e.g. multi-temporal data merging to account for seasonal variations related to management practices (Brown de Colstoun et al., 2003; Capao et al., 2007; Carrão et al., 2008; Langley et al., 2001; Liu et al., 2002; Zhu and Tateishi, 2006).

In contrast to the suggested refinement or subdivision of the ‘grassland’ (23) class, the classes representing artificial surfaces appear to be split inappropriately. ‘Urban fabric’ (11) and ‘industrial units’ (12) are largely confused with each other (cf. Table 2) and appear to be difficult to distinguish spectrally or even in the process of reference data mapping. A combination of these classes would clearly improve the classification outcome, though obviously any reduction of thematic resolution would increase classification accuracy (Bach et al., 2006; Latifovic and Olthof, 2004). Inappropriate class definition also became evident for ‘artificial, non-agric.
vegetated areas’ (14), which represents parks and leisure facilities largely composed of grassy and woody surfaces, and is therefore primarily confused with ‘grassland’ (23) and ‘forest’ (31) due to spectral similarities. Hence, classes that are primarily defined by their spatial appearance (‘land use’) and not by their surface characteristics (‘land cover’) seem inappropriate for image classification based purely on spectral and topographic information. In cases where land use classes are not sufficiently separable by remotely sensed data, classification accuracy may be increased through the use of texture measures or contextual information (Berberoglu et al., 2007; Carvalho et al., 2004). Nevertheless, the results clearly stress the importance of well-defined land cover classes for image classification. Class definition needs to be based on typical surface characteristics rather than on land use or spatial connectivity with other classes. Using global or continental land cover classification schemes (e.g. FAO’s LCCS (Di Gregorio and Jansen, 2000), CORINE (Bossard et al., 2000; Nunes de Lima, 2005)) therefore requires careful consideration of whether the classes used appropriately represent the land cover in the regional or local study area. The land cover class definition also needs to consider the spectral properties of the classes and the separability of the classes by the input variables. Inappropriate class definition or missing input variables for discriminating classes will inevitably lead to inaccurate and blurred classification results, as, e.g., for the ‘grassland’ (23) class in the present study. The understanding of which land cover classes can be discriminated by which kind of spectral, topographic or temporal information is, however, still rudimentary, and further research on the separability of specific land cover classes is required to advance image classification.

5. Conclusion

The comparison of the performance of MLC, DA and ANN in image classification revealed advantages of ANN classification in classification accuracy, both overall and for single land cover classes. The incorporation of ancillary data into the classification process clearly increased classification accuracy overall and at the level of single land cover classes, independent of the classifier used. ANN, however, produced high accuracies even with limited input information, whereas MLC and DA produced comparable results only when ancillary data were incorporated into the classification process. The superiority of ANN classification was nevertheless less pronounced at the level of the single land cover classes, especially when no ancillary data were incorporated in the classification. Overall, the magnitude of the difference in overall accuracy between the input data combinations indicates a large potential of ancillary data for increasing classification accuracy, comparable to that of the selection of the classifier. Therefore, further approaches that work towards an optimised set of input variables for discriminating specific land cover classes are required. The definition and selection of land cover classes has also been shown to be crucial and cannot simply be adopted from existing land cover class schemes, especially for regional or local studies. Land cover classes need to be defined based on their separability by the input variables used and on local habitat characteristics. A stronger focus on discriminating land cover types by their spectral, topographic or seasonal properties is therefore required to further advance the process of image classification.

Acknowledgements

The research was kindly supported by the University of Innsbruck Vice Rectorate for Research and the European Academy Bolzano (EURAC). The authors thank two anonymous reviewers for their valuable comments and suggestions.

References

Bach, M., Breuer, L., Frede, H.G., Huisman, J.A., Otte, A., Waldhardt, R., 2006. Accuracy and congruency of three different digital land-use maps. Landscape and Urban Planning 78 (4), 289–299.
Berberoglu, S., Curran, P.J., Lloyd, C.D., Atkinson, P.M., 2007. Texture classification of Mediterranean land cover. International Journal of Applied Earth Observation and Geoinformation 9 (3), 322–334.
Bishop, C., 1996. Neural Networks for Pattern Recognition. Oxford University Press, New York.
Bossard, M., Feranec, J., Otahel, J., 2000. CORINE Land Cover Technical Guide – Addendum 2000. Technical Report No 40 (Copenhagen, EEA).
Brandt, J.S., Townsend, P.A., 2006. Land use–land cover conversion, regeneration and degradation in the high elevation Bolivian Andes. Landscape Ecology 21 (4), 607–623.
Brown de Colstoun, E.C., Story, M.H., Thompson, C., Commisso, K., Smith, T.G., Irons, J.R., 2003. National Park vegetation mapping using multitemporal Landsat 7 data and a decision tree classifier. Remote Sensing of Environment 85 (3), 316–327.
Capao, L., Carrao, H., Araujo, A., Caetano, M., 2007. An approach for land cover mapping with multi-temporal MERIS imagery. In: IEEE Geoscience and Remote Sensing Symposium (IGARSS), Proceedings. pp. 3836–3839.
Carpenter, G.A., Gopal, S., Macomber, S., Martens, S., Woodcock, C.E., Franklin, J., 1999. A neural network method for efficient vegetation mapping. Remote Sensing of Environment 70, 326–338.
Carrão, H., Gonçalves, P., Caetano, M., 2008. Contribution of multispectral and multitemporal information from MODIS images to land cover classification. Remote Sensing of Environment 112 (3), 986–997.
Carvalho, L.M.T.D., Clevers, J.G.P.W., Skidmore, A.K., Jong, S.M.D., 2004. Selection of imagery data and classifiers for mapping Brazilian semideciduous Atlantic forests. International Journal of Applied Earth Observation and Geoinformation 5 (3), 173–186.
Cihlar, J., Ly, H., Xiao, Q.H., 1996. Land cover classification with AVHRR multichannel composites in northern environments. Remote Sensing of Environment 58 (1), 36–51.
Cushman, S.A., Wallin, D.O., 2000. Rates and patterns of landscape change in the Central Sikhote-alin Mountains, Russian Far East. Landscape Ecology 15 (7), 643–659.
Defries, R.S., Townshend, J.R.G., 1994. NDVI-derived land-cover classifications at a global-scale. International Journal of Remote Sensing 15 (17), 3567–3586.
Di Gregorio, A., Jansen, L., 2000. Land Cover Classification System (LCCS): Classification Concepts and User Manual. FAO, Rome, Italy.
Dixon, B., Candade, N., 2008. Multispectral landuse classification using neural networks and support vector machines: one or the other, or both? International Journal of Remote Sensing 29 (4), 1185–1206.
Ekercin, S., 2007. Coastline change assessment at the Aegean Sea Coasts in Turkey using multitemporal Landsat imagery. Journal of Coastal Research 23 (3), 691–698.

Fassnacht, K.S., Cohen, W.B., Spies, T.A., 2006. Key issues in making and using satellite-based maps in ecology: a primer. Forest Ecology and Management 222 (1–3), 167–181.
Foody, G.M., 2001. Thematic mapping from remotely sensed data with neural networks: MLP, RBF and PNN based approaches. Journal of Geographical Systems 3, 217–232.
Foody, G.M., 2002. Status of land cover classification accuracy assessment. Remote Sensing of Environment 80, 185–201.
Giannetti, F., Montanarella, L., Salandin, R., 2001. Integrated use of satellite images, DEMs, soil and substrate data in studying mountainous lands. International Journal of Applied Earth Observation and Geoinformation 3 (1), 25–29.
Hansen, M.C., Defries, R.S., Townshend, J.R.G., Sohlberg, R., 2000. Global land cover classification at 1 km spatial resolution using a classification tree approach. International Journal of Remote Sensing 21 (6–7), 1331–1364.
Hardin, P.J., 2000. Neural networks versus nonparametric neighbor-based classifiers for semisupervised classification of Landsat Thematic Mapper imagery. Optical Engineering 39 (7), 1898–1908.
Heinl, M., Neuenschwander, A., Sliva, J., Tacheba, B., 2006. Interactions between fire and flooding in a southern African floodplain system (Okavango Delta, Botswana). Landscape Ecology 21, 699–709.
Islam, M.A., Thenkabail, P.S., Kulawardhana, R.W., Alankara, R., Gunasinghe, S., Edussriya, C., Gunawardana, A., 2008. Semi-automated methods for mapping wetlands using Landsat ETM plus and SRTM data. International Journal of Remote Sensing 29 (24), 7077–7106.
Jarvis, A., Reuter, H.I., Nelson, A., Guevara, E., 2006. Hole-filled Seamless SRTM Data V3. International Centre for Tropical Agriculture (CIAT), available from http://srtm.csi.cgiar.org.
Jensen, J.R., 2005. Introductory Digital Image Processing: A Remote Sensing Perspective. Pearson, Prentice Hall, USA.
Jianchu, X., Xihui, A., Xiqing, D., 2005. Exploring the spatial and temporal dynamics of land use in Xizhuang watershed of Yunnan, southwest China. International Journal of Applied Earth Observation and Geoinformation 7 (4), 299–309.
Joy, S.M., Reich, R.M., Reynolds, R.T., 2003. A non-parametric, supervised classification of vegetation types on the Kaibab National forest using decision trees. International Journal of Remote Sensing 24 (9), 1835–1852.
Kavzoglu, T., 2009. Increasing the accuracy of neural network classification using refined training data. Environmental Modelling & Software 24 (7), 850–858.
Kavzoglu, T., Mather, P.M., 2002. The role of feature selection in artificial neural network applications. International Journal of Remote Sensing 23 (15), 2919–2937.
Kavzoglu, T., Mather, P.M., 2003. The use of backpropagating artificial neural networks in land cover classification. International Journal of Remote Sensing 24 (23), 4907–4938.
Kavzoglu, T., Reis, S., 2008. Performance analysis of maximum likelihood and artificial neural network classifiers for training sets with mixed pixels. GIScience & Remote Sensing 45 (3), 330–342.
Kaya, S., Curran, P.J., 2006. Monitoring urban growth on the European side of the Istanbul metropolitan area: A case study. International Journal of Applied Earth Observation and Geoinformation 8 (1), 18–25.
Kozak, J., Estreguil, C., Ostapowicz, K., 2008. European forest cover mapping with high resolution satellite data: the Carpathians case study. International Journal of Applied Earth Observation and Geoinformation 10 (1), 44–55.
Langley, S.K., Cheshire, H.M., Humes, K.S., 2001. A comparison of single date and multitemporal satellite image classifications in a semi-arid grassland. Journal of Arid Environments 2001 (49), 401–411.
Latifovic, R., Olthof, I., 2004. Accuracy assessment using sub-pixel fractional error matrices of global land cover products derived from satellite data. Remote Sensing of Environment 90 (2), 153–165.
Liu, Q.J., Takamura, T., Takeuchi, N., Shao, G., 2002. Mapping of boreal vegetation of a temperate mountain in China by multitemporal Landsat TM imagery. International Journal of Remote Sensing 23 (17), 3385–3405.

Lu, D., Weng, Q., 2007. A survey of image classification methods and techniques for improving classification performance. International Journal of Remote Sensing 28 (5), 823–870.
Mas, J.F., 2004. Mapping land use/cover in a tropical coastal area using satellite sensor data, GIS and artificial neural networks. Estuarine Coastal and Shelf Science 59 (2), 219–230.
Muchoney, D.M., Strahler, A.H., 2002. Pixel- and site-based calibration and validation methods for evaluating supervised classification of remotely sensed data. Remote Sensing of Environment 81 (2–3), 290–299.
Nunes de Lima, M.V.E., 2005. CORINE land cover updating for the year 2000. Image 2000 and CLC2000, products and methods. EUR 21757 EN (Ispra, JRC-IES).
Ouyang, Y., Ma, J., 2006. Classification of multi-spectral remote sensing data using a local transfer function classifier. International Journal of Remote Sensing 27 (24), 5401–5408.
Pal, M., Mather, P.M., 2005. Support vector machines for classification in remote sensing. International Journal of Remote Sensing 26 (5), 1007–1011.
Paola, J.D., Schowengerdt, R.A., 1995. A detailed comparison of backpropagation neural network and maximum-likelihood classifiers for urban land use classification. Geoscience and Remote Sensing 33 (4), 981–996.
Paola, J.D., Schowengerdt, R.A., 1997. The effect of neural-network structure on a multispectral land-use/land-cover classification. Photogrammetric Engineering & Remote Sensing 63 (5), 535–544.
Ruiz-Luna, A., Berlanga-Robles, C., 2003. Land use, land cover changes and coastal lagoon surface reduction associated with urban growth in northwest Mexico. Landscape Ecology 18 (2), 159–171.
Saadat, H., Bonnell, R., Sharifi, F., Mehuys, G., Namdar, M., Ale-Ebrahim, S., 2008. Landform classification from a digital elevation model and satellite imagery. Geomorphology 100 (3–4), 453–464.
Shrestha, D.P., Zinck, J.A., 2001. Land use classification in mountainous areas: integration of image processing, digital elevation data and field knowledge (application to Nepal). International Journal of Applied Earth Observation and Geoinformation 3 (1), 78–85.
Souza, C., Firestone, L., Silva, L., Roberts, D., 2003. Mapping forest degradation in Eastern Amazon from SPOT4 through spectral mixture models. Remote Sensing of Environment 87 (4), 494–506.
Tasser, E., Ruffini, F., Tappeiner, U., 2009. An integrative approach for analysing landscape dynamics in diverse cultivated and natural mountain areas. Landscape Ecology 24 (5), 611–628.
Teillet, P.M., Guindon, B., Goodenough, D.G., 1982. On the slope-aspect correction of multispectral scanner data. Canadian Journal of Remote Sensing 8 (2), 84–106.
Thomas, I.L., Allcock, G.M., 1984. Determining the confidence level for a classification. Photogrammetric Engineering and Remote Sensing 50 (10), 1491–1496.
Twele, A., Erasmi, S., 2005. Evaluating topographic correction algorithms for improved land cover discrimination in mountainous areas of central Sulawesi. In: Erasmi, S., Cyffka, B., Kappas, M. (Eds.), Remote Sensing and GIS for Environmental Studies. Göttinger Geographische Abhandlungen 113, Göttingen, pp. 287–295.
Watanachaturaporn, P., Arora, M.K., Varshney, P.K., 2008. Multisource classification using support vector machines: an empirical comparison with decision tree and neural network classifiers. Photogrammetric Engineering and Remote Sensing 74 (2), 239–246.
Yemefack, M., Bijker, W., De Jong, S.M., 2006. Investigating relationships between Landsat-7 ETM+ data and spatial segregation of LULC types under shifting agriculture in southern Cameroon. International Journal of Applied Earth Observation and Geoinformation 8 (2), 96–112.
Zhang, Y., Gao, J., Wang, J., 2007. Detailed mapping of a salt farm from Landsat TM imagery using neural network and maximum likelihood classifiers: a comparison. International Journal of Remote Sensing 28 (10), 2077–2089.
Zhu, L., Tateishi, R., 2006. Fusion of multisensor multitemporal satellite data for land cover mapping. International Journal of Remote Sensing 27 (5–6), 903–918.
