ISPRS Journal of Photogrammetry and Remote Sensing 121 (2016) 167–176
Detecting bugweed (Solanum mauritianum) abundance in plantation forestry using multisource remote sensing
Kabir Peerbhay ⇑, Onisimo Mutanga, Romano Lottering, Victor Bangamwabo, Riyad Ismail
University of KwaZulu-Natal, School of Agricultural, Earth and Environmental Sciences, Discipline of Geography, P/Bag X01, Scottsville 3209, Pietermaritzburg, South Africa
Article info
Article history: Received 5 October 2015; received in revised form 28 July 2016; accepted 13 September 2016; available online 15 October 2016.
Keywords: Remote sensing; LiDAR; Weed detection; Sparse partial least squares discriminant analysis (SPLS-DA)
Abstract
The invasive weed Solanum mauritianum (bugweed) has infested large areas of plantation forests in KwaZulu-Natal, South Africa. Bugweed often forms dense infestations and rapidly capitalises on available natural resources, hindering the production of forest resources. Precise assessment of bugweed canopy cover, especially at low abundance, is essential to an effective weed management strategy. In this study, the utility of AISA Eagle airborne hyperspectral data (393–994 nm) was compared with that of the new generation WorldView-2 multispectral sensor (427–908 nm) for detecting the abundance of bugweed cover within the Hodgsons Sappi forest plantation. Using sparse partial least squares discriminant analysis (SPLS-DA), the best detection results were obtained when performing discrimination using the remotely sensed images combined with LiDAR. Overall classification accuracies subsequently improved by 10% and 11.67% for AISA and WorldView-2 respectively, with improved detection accuracies for bugweed cover densities as low as 5%. The incorporation of LiDAR worked well within the SPLS-DA framework for detecting the abundance of bugweed cover using remotely sensed data. In addition, the algorithm performed simultaneous dimension reduction and variable selection successfully, whereby wavelengths in the visible (393–670 nm) and red-edge regions (725–734 nm) of the spectrum were the most effective.
© 2016 International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS). Published by Elsevier B.V. All rights reserved.
1. Introduction
In South Africa, commercial forests occupy 1.5 million ha of land and are primarily located along the east coast of the country. The industry employs approximately 750 000 people, including those in indirectly related sectors, and contributes around 2% to the gross domestic product (GDP) (Le Maitre et al., 2004; DAFF, 2009). Commercial forestry supports 95% of the country’s demand for wood-derived material and largely consists of exotic tree species including Acacia, Pinus and Eucalyptus (Fairbanks et al., 2000). Recently, numerous pressures have been exerted on the industry due to the growing demand for timber and the limited extent of commercial forests. The decline in newly afforested areas has also contributed in this regard and is primarily attributed to the lack of suitable areas for forest production, the difficulties associated with obtaining water permits and the conversion of forestry land to other agricultural uses (Pallett, 2005; DAFF, 2009).
⇑ Corresponding author. Fax: +27 333862314.
E-mail address: [email protected] (K. Peerbhay). http://dx.doi.org/10.1016/j.isprsjprs.2016.09.014 0924-2716/© 2016 International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS). Published by Elsevier B.V. All rights reserved.
It is therefore essential to protect these valuable plantation forests by successfully eliminating any form of agent that may present a significant threat. One such threat, of great economic and ecological concern in South African forest plantations, is invasive alien plants (IAPs) (Peerbhay et al., 2016). Solanum mauritianum in particular (commonly known as bugweed) is a category one plant invader that has invaded all but two of the driest provinces of the country (Olckers, 2011; Jordaan and Downs, 2012). The evergreen, noxious, branched shrub or small tree is currently one of the most aggressive, opportunistic and extremely resilient IAP species and is widespread in the eastern higher rainfall regions (Olckers, 2011). With grey-green oval shaped leaves and a lifespan of approximately 30 years, the plant is a major invader of agricultural land, forestry plantations, water courses and disturbed environments (Copeland and Wharton, 2006; Olckers and Borea, 2009). In forest plantations, bugweed invasions have the ability to excessively capture available natural resources, suppressing growth and displacing forest vegetation (Campbell and van Staden, 1990). With a tree height of up to 10 m, extensive bugweed stands are able to dominate and replace canopy or sub-canopy layers of forest plantations (Olckers, 2011), which may further exclude existing forest species through overcrowding and shading (Jordaan and Downs, 2012). Bugweed has the capability to impede the regeneration process of other vegetation through allelopathic suppression (Levine et al., 2003). Allelochemicals promote ecological change that is responsible for altering environments and typically inhibiting vegetation succession (Huang and Asner, 2009). These changes compromise organic matter content and ecosystem services such as water purification and soil regeneration (Le Maitre et al., 2004).
In order to respond to the proliferation of bugweed, forest managers require timely and accurate information relating not only to the location of infestations but also to the spatial extent and abundance of weed cover (Carson et al., 1995; Goel et al., 2002; Lass et al., 2002; Ustin et al., 2002). This information is important for supporting decisions related to the allocation of resources for weed removal, estimating these costs per hectare, tracking changes in weed populations and evaluating the effectiveness of site-specific weed mitigation interventions (Lass et al., 1996, 2002). However, since weed infestations are not distributed evenly across the forest, this kind of information cannot be obtained using conventional field-based surveys owing to time, cost and logistical implications (Lass et al., 1996). Remote sensing technologies thus offer a viable option for obtaining information related to the location, distribution and spatial extent of IAPs (Lass et al., 2005; Miao et al., 2006). Many applications in remote sensing have investigated the detection and mapping of IAPs (Fuller, 2005; Glenn et al., 2005; Lawrence et al., 2006; Mundt et al., 2005; Laba et al., 2008; Peerbhay et al., 2015).
For example, Fuller (2005) employed a back-propagation neural network (NN) classifier to successfully map the invasive paperbark tree using IKONOS imagery (4 m) and obtained an accuracy of greater than 85% when combined with a landscape fragmentation analysis. Laba et al. (2008) detected the presence of three alien plants, purple loosestrife, common reed and water chestnut, using Quickbird imagery (2.4 m). Using a maximum likelihood classifier, the fine spatial resolution data produced an overall accuracy greater than 65%. Using hyperspectral imaging (Hymap data, 3 m), Mundt et al. (2005) discriminated hoary cress using different parameters and classification methods. MNF reflectance and spectral angle mapper (SAM) analysis performed the best and resulted in producer’s, user’s and overall accuracies of 82%, 79% and 86%, respectively. Spotted knapweed and leafy spurge were successfully mapped by the Breiman Cutler’s random forest algorithm to produce classification accuracies of 84% and 86% respectively using the hyperspectral Probe-1 sensor (3 m and 5 m) (Lawrence et al., 2006). Recently, Peerbhay et al. (2014) demonstrated the capability of detecting bugweed (67%) from a variety of forest species using AISA Eagle hyperspectral data (2.4 m) and partial least squares discriminant analysis (PLS-DA). Atkinson et al. (2014), however, produced better detection accuracies (93%) using support vector machines (SVM) and when focusing only within Pinus patula compartments. Nevertheless, Peerbhay et al. (2015) developed a novel unsupervised framework using random forests to produce a semiautomated detection accuracy of 88.96% when mapping bugweed as an anomaly.
While such studies concentrated on detecting and mapping IAPs, only a limited number of studies focused on detecting the abundance of weed infestations (Carson et al., 1995; Lass et al., 1996, 2000, 2002; Lass and Prather, 2004). For instance, Lass et al. (2002) successfully detected the abundance of spotted knapweed cover (>70% cover) using an image spectrometer (440–2543 nm) and a 5 m spatial resolution. Subsequently, Lass et al. (2005) detected the cover of knapweed and baby’s breath infestations using a charge-coupled device (CCD) (415–953 nm) with a spatial resolution of 2 m. Using a multispectral CCD (400–1000 nm) with 4 m spatial resolution, yellow starthistle cover (31–70% cover, 71–100% cover) was also detected (Lass et al., 2000). In this regard, the utility of hyperspectral sensors has been found to be more effective than broad band multispectral sensors (Lass et al., 2002, 2005; Lass and Prather, 2004; Miao et al., 2006). Hyperspectral data improves on multispectral data by capturing information from hundreds of narrow bands, providing a full spectral curve for each pixel (Dye et al., 2011; Peerbhay et al., 2013). These detailed spectral profiles are capable of distinguishing vegetation at species level, which often proves challenging for multispectral analysis (Peerbhay et al., 2014b). In addition, the subtle spectral variations between surrounding vegetation and IAPs challenge the utility of multispectral remote sensing data, as does interference from the overstory canopy (Müllerová et al., 2013). Therefore, integrating canopy structural information from reliable LiDAR systems may prove promising (Mundt et al., 2006; Asner et al., 2008; Huang and Asner, 2009). For example, LiDAR was combined with hyperspectral image data (450–2500 nm) to successfully detect sagebrush (89%) in a semi-arid rangeland (Mundt et al., 2006). LiDAR provided valuable information on the stand structure of the weed and exceeded the mapping capability of using either of the individual datasets alone. Asner et al. (2008) successfully detected (<7% error) individual crowns of five invasive tree species in Hawaii using hyperspectral imaging (367–2510 nm) and LiDAR. LiDAR proved effective when penetrating the dense rain forest to measure the effects of invasion on the structure of the three-dimensional canopy. However, LiDAR data alone is not efficient for detecting IAPs due to the similar structural traits weeds may display compared to other vegetation (Huang and Asner, 2009).
It is within this context that this study evaluates the applicability of hyperspectral and multispectral remote sensing combined with LiDAR for detecting the abundance of bugweed cover using sparse partial least squares discriminant analysis (SPLS-DA).
2. Methods and materials
2.1. Study region
The study region falls within the mist-belt grassland bioregion of the KwaZulu-Natal midlands and is located along the east coast of South Africa (Fig. 1). The province of KwaZulu-Natal is the country’s leading timber producer, with Sappi being one of the largest pulp and paper manufacturers in this region. The study area occupies 6391 ha of the Sappi Hodgsons plantation forest, which consists mainly of Acacia mearnsii, Eucalyptus grandis, Eucalyptus nitens, Eucalyptus smithii, Pinus patula, Pinus elliottii and Pinus taeda trees. Annual rainfall ranges between 730 mm and 1280 mm, with most rainfall occurring during the summer. The plantation experiences average temperatures in the region of 16 °C and lies at an altitude between 1030 m and 1590 m above sea level (Mucina and Rutherford, 2006). Bugweed is the most widespread alien invader across the plantation and occurs in dense patches along riparian areas, open areas and forest corridors. In some instances, bugweed forms extensive and impenetrable stands that dominate portions of the plantation.
Fig. 1. The location of the study area and the LiDAR DSM over the study plantation.
2.2. Input remotely sensed datasets
This study utilised different passive and active remotely sensed data types (Table 1). Chronologically, the hyperspectral and LiDAR data were acquired during February 2009 on board a fixed wing aircraft at an altitude of 2728 m and with an imaging swath width of 3058 m. The Airborne Imaging Spectrometer for Applications (AISA) Eagle imagery consisted of four flight strips and was captured at 2.4 m spatial resolution while measuring reflectance in the 393–994 nm spectral range. The LiDAR dataset was acquired in full waveform format and released in LAS files by the vendor (Dimap Australia). The sensor is a Riegl Q560 laser, which achieved an average point density of about 5 points/m² during the flight, owing to the maximum sampling rate of the instrument (80 kHz). The raw point cloud, once filtered for noise and regularised using LAS Tools (LAS dataset to raster tool), generated a gridded digital surface model (DSM) with a cell size of 2.4 m, consistent with that of the AISA Eagle image. Finally, during February 2010, WorldView-2 (WV-2) multispectral imagery of the study area was obtained under cloudless conditions. The multispectral system comprises eight bands sensing in the 427–908 nm spectral range with a spatial resolution of 2 m and an imaging swath of 16.4 km. A second DSM was then generated using the LiDAR point cloud dataset with the same spatial resolution as the WorldView-2 reflectance data (2 m).

Table 1. Datasets used in this study.
Acquisition date   Data type       Spatial resolution   Spectral resolution   Off-nadir   Solar azimuth
2009               Hyperspectral   2.4 m                272 bands             0°          53.6°
2009               LiDAR DSM       5 points per m²      -                     -           -
2010               Multispectral   2 m                  8 bands               20°         58.5°

The AISA Eagle image was atmospherically corrected using the empirical line method, which is based on the linear relationship between in situ measured ground reflectance and the sensor’s spectral signal (Roberts et al., 1986). Using the Analytical Spectral Devices (ASD) FieldSpec 3 spectrometer (350–2500 nm), the reflectance of two 10 m × 10 m plastic targets (one black and one white) per flight line was measured. The field spectra were then resampled to match the AISA wavelengths and used to calibrate the image to percentage reflectance (RMSE < 2%). For the WorldView-2 image, atmospheric calibration was accomplished by converting the digital numbers (DN) of the image to top-of-atmosphere reflectance (Omar, 2010). The ortho-projection of both images was performed using a digital elevation model (DEM) with 5 m contours, using the Universal Transverse Mercator (UTM zone 36S) projection and the WGS-84 datum, consistent with the LiDAR dataset. Using 15 differentially corrected GPS ground control points, collected around distinctive geographical features in the study area (e.g. rock outcrops, roads and office buildings), the image datasets were co-registered so that pixels corresponded to the same geographic location. These control points were representative of the GPS features and produced an error of less than one WV-2 pixel (<2 m). ENVI 4.7 image processing software (ENVI, 2009) was used for the preprocessing of the AISA Eagle and WorldView-2 images, while the DSM generation and processing of the LiDAR data were executed using ArcGIS 10.2 (ArcMap, 2013).
2.3. Reference data collection
A total of 150 bugweed samples were recorded during a field exercise in February 2009.
The centre of each bugweed sample was recorded using a differentially corrected Trimble GPS receiver with sub-meter accuracy. The field sample points were then used to collect image spectra from the various datasets (Fig. 2) and then overlaid onto a high resolution 10 cm colour aerial photograph to create 20 m × 20 m sample plots. Each 20 m × 20 m sample plot consisted of a homogeneous bugweed stand of consistent height and cover, and the number of bugweed pixels in each sample plot was determined (Fig. 3). The percentage of bugweed cover was then placed in one of 10 cover classes (no cover, 1–5%, 6–10%, 11–20%, 21–30%, 31–40%, 41–50%, 51–60%, 61–75%, 76–100%) (Fig. 4). The height of bugweed within each of the 10 cover classes ranged from 0 to 2 m, 2 to 3 m, 3.2 to 4.2 m, 5 to 6.9 m, 4.5 to 5.6 m, 5 to 7 m, 5.8 to 7.2 m, 6 to 7.6 m, 7 to 9.5 m and 7.8 to 10 m, respectively. As reflected in the above cover class categories, the study focused on the lower percentages of weed cover because of the relative importance of detecting weeds during the early stages of proliferation.
2.4. Bugweed cover detection
In this study, a multivariate statistical technique, the sparse partial least squares discriminant analysis (SPLS-DA) method (Chung and Keles, 2010), was employed to address the complexities associated with utilising remotely sensed data for alien plant detection. SPLS-DA analysis was performed using the AISA Eagle and WorldView-2 datasets, each of which was then also combined with LiDAR. Since both remotely sensed images and the LiDAR DSM were well georeferenced, four classification procedures were performed (AISA, WV-2, AISA + LiDAR and WV-2 + LiDAR).
2.5. SPLS-DA model development
Partial least squares (PLS) (Wold et al., 2001) is a multivariate dimension reduction method that is referred to as partial least squares discriminant analysis (PLS-DA) when used for discrimination (Tesfamariam and Liu, 2010). Unlike principal component analysis (PCA), PLS identifies a selected number of eigenvectors whose scores explain the variance of the explanatory variables (X) and also have a high correlation with the response variables (Y) (Li et al., 2008; Wolter et al., 2008). More specifically, the PLS operation can be described in Eqs. (1) and (2), where X represents the spectral bands and Y represents bugweed cover.
The X matrix is derived by decomposing X into the score matrix (T) (a linear combination of the spectral bands) and a P loading matrix (coefficients that link bands to X scores), including an error matrix (E) for non-relevant variables. Similarly, the Y matrix is decomposed into the score matrix U (a linear combination of the responses) and the loading matrix Q (coefficients that link responses to Y scores), with an error matrix (F) for outliers.
$$X = TP' + E \tag{1}$$
$$Y = UQ' + F \tag{2}$$
Fig. 2. Hyperspectral and multispectral reflectance curves of bugweed utilised for detecting weed abundance in the study area. [Panels: AISA Eagle and WorldView-2; x-axis: wavelengths (nm); y-axis: reflectance (%).]
Fig. 3. The percentage of bugweed cover present in individual field sample plots (n = 150) across the forest plantation. [x-axis: bugweed sample plots; y-axis: bugweed pixels (%).]
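The decomposition in Eqs. (1) and (2) can be illustrated with a minimal, pure-Python sketch of a single PLS component. This is not the authors' implementation (the study used R): it assumes one response variable and the common simplification in which the X scores stand in for the Y scores, and the reflectance and cover values are invented for illustration only.

```python
def centre(values):
    """Subtract the mean from a list of numbers."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def centre_columns(rows):
    """Centre every column (band) of a samples-by-bands table."""
    cols = [centre(list(col)) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def pls_one_component(X, y):
    """Extract one PLS component from centred X (n x p) and centred y (n)."""
    n, p = len(X), len(X[0])
    # Weight vector w is proportional to X'y: covariance of each band with y.
    w = [sum(X[i][j] * y[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]
    # Scores t = Xw (one value per sample plot).
    t = [sum(X[i][j] * w[j] for j in range(p)) for i in range(n)]
    tt = sum(v * v for v in t)
    # X loadings P = X't / t't and y loading q = y't / t't.
    P = [sum(X[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
    q = sum(y[i] * t[i] for i in range(n)) / tt
    # Residuals: E = X - tP' and f = y - tq.
    E = [[X[i][j] - t[i] * P[j] for j in range(p)] for i in range(n)]
    f = [y[i] - t[i] * q for i in range(n)]
    return t, P, q, E, f

# Hypothetical data: 4 plots x 3 bands of reflectance, and % bugweed cover.
reflectance = [[0.05, 0.08, 0.30], [0.06, 0.10, 0.35],
               [0.04, 0.07, 0.25], [0.07, 0.12, 0.40]]
cover = [10.0, 30.0, 5.0, 60.0]
X = centre_columns(reflectance)
y = centre(cover)
t, P, q, E, f = pls_one_component(X, y)
```

By construction the residuals E and f are orthogonal to the scores t, which is what allows further components to be extracted from the residuals one at a time.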
Fig. 4. The percentages of bugweed cover occurring across the forest plantation. The representative classes (1–5%, 6–10%, 11–20%, 21–30%, 31–40%, 41–50%, 51–60%, 61–75% and 76–100% cover) are shown using a 10 cm colour image for selected sample plots (20 m × 20 m).
To improve on some of the concerns associated with PLS, sparse PLS (SPLS) was developed (Chun and Keleş, 2010). SPLS is specifically designed for high dimensional datasets and avoids overfitting by implementing dimension reduction and variable selection simultaneously. This is accomplished by imposing sparseness within the latent components, which are built to explain the best discrimination among classes by using only the few informative variables (non-zero variables). Non-relevant and noisy variables are set to zero by imposing an L1 penalty (Chun and Keleş, 2010), thus eliminating any contribution towards the model’s discrimination power. Class membership is then encoded by reference cell coding the response matrix (Y) with dummy variables (Chung and Keles, 2010). Y is assumed to take one of G + 1 class values, indicated by 0, 1, ..., G. The recoded response matrix is then defined as an n × (G + 1) matrix with:
$$y_{i,(g+1)} = I(y_i = g) \tag{3}$$
where i = 1, ..., n; g = 0, 1, ..., G, and I(A) is the indicator function of event A. Once the latent components are developed, the final procedure in SPLS is to fit a classifier, since the number of latent components (K) is generally smaller than n. For this purpose, linear classifiers such as linear discriminant analysis (LDA) are commonly utilised (Chung and Keles, 2010). One of the most important processes in obtaining a PLS-DA model with superior performance is to determine the optimal number of latent components (Wold et al., 2001; Wolter et al., 2009). While the number of components can be equal to the number of explanatory variables (X), generating fewer components is suggested in order to remove lower order components that display properties of collinearity, random measurement errors and information unrelated to the feature of interest (Wolter et al., 2009). Cross-validation (CV) is a practical and reliable method for measuring the significance of each PLS-DA component (Wold et al., 2001). Conditioned on the training dataset, each component is added systematically to the SPLS-DA model until the CV error indicates that an additional component does not improve the model performance. The number of components "K" is only one of the key tuning parameters; SPLS-DA also requires the sparsity thresholding parameter "eta" for ideal model performance (Chun and Keleş, 2010; Chung and Keles, 2010). While "K" largely depends on the number of variables and sample size, it has been recommended to search for components between 1 and 10 with a thresholding parameter (eta) ranging between 0 and 1 (Chun and Keleş, 2010). The optimal model would therefore retain the most effective variables, while all non-relevant variables would make zero contribution to the model. Subsequently, the SPLS-DA models developed with the optimal number of components were used to classify the test datasets.
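The role of the thresholding parameter "eta" can be illustrated with a short sketch of soft thresholding, the operation through which an L1 penalty sets small weights exactly to zero. The sketch assumes the form commonly used in the sparse PLS literature, in which the cutoff is eta times the largest absolute weight; the weight values are hypothetical and do not come from the study.

```python
def soft_threshold(weights, eta):
    """Shrink each weight towards zero by eta * max|w|; smaller weights become 0."""
    if not 0 <= eta < 1:
        raise ValueError("eta is assumed to lie in [0, 1)")
    cutoff = eta * max(abs(w) for w in weights)
    shrunk = []
    for w in weights:
        if abs(w) > cutoff:
            sign = 1.0 if w > 0 else -1.0
            shrunk.append(sign * (abs(w) - cutoff))  # keep the band, reduced in size
        else:
            shrunk.append(0.0)                       # band dropped from the component
    return shrunk

# Hypothetical direction-vector weights for six bands (illustrative values only).
band_weights = [0.90, -0.40, 0.05, 0.60, -0.02, 0.10]
retained = {}
for eta in (0.1, 0.5, 0.9):
    sparse = soft_threshold(band_weights, eta)
    retained[eta] = [j for j, v in enumerate(sparse) if v != 0.0]
    print(eta, retained[eta])
```

Larger values of eta leave fewer non-zero weights, so fewer bands contribute to each latent component.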
PLS model development and model optimisation were carried out using the R statistical software package (R Development Core Team, 2012).
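The tuning procedure described in this section amounts to a grid search over the number of components and eta, keeping the pair with the lowest cross-validated error. The following sketch shows only that search loop. The function `toy_cv_error` is a hypothetical stand-in for a real cross-validation routine, and its error surface is invented, not taken from the study.

```python
from itertools import product

def tune(cv_error, k_values, eta_values):
    """Return (error, K, eta) for the grid point with the lowest CV error."""
    best = None
    for k, eta in product(k_values, eta_values):
        err = cv_error(k, eta)
        if best is None or err < best[0]:
            best = (err, k, eta)
    return best

def toy_cv_error(k, eta):
    # Stand-in error surface with its minimum at K = 7, eta = 0.9 (NOT real results).
    return 0.12 + 0.01 * abs(k - 7) + 0.05 * abs(eta - 0.9)

k_grid = range(1, 11)                                 # components 1..10
eta_grid = [round(0.1 * i, 1) for i in range(1, 10)]  # eta = 0.1, 0.2, ..., 0.9
best_err, best_k, best_eta = tune(toy_cv_error, k_grid, eta_grid)
```

In practice `cv_error` would fit the model on the training folds for each (K, eta) pair and return the mean misclassification rate on the held-out folds.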
2.6. Validation assessment
This study utilised the confusion matrix (Congalton and Green, 1999) to evaluate the performance of the different dataset combinations (AISA, WV-2, AISA + LiDAR, WV-2 + LiDAR). The confusion matrix was calculated by dividing the datasets into training data (60%) and test data (40%). Since PLS-DA is implemented as a supervised approach, the classification capability could vary with different training and validation sample combinations (Fassnacht et al., 2014). Therefore, to achieve a model with maximum classification stability, the split between test and training samples was repeated using 100 iterations. The quantity and allocation disagreement were then used to measure the disagreement within the error matrix, as suggested by Pontius and Millones (2011). The former quantifies the amount of training samples of bugweed cover (e.g. 1–5%) that differs from the quantity of samples of the same class in the test data, while the latter measures the amount of training samples of bugweed cover that were allocated to different locations of the same class in the test dataset. For the purpose of this study, the quantity and allocation disagreement were combined and the total disagreement of the error matrix reported (Pontius and Millones, 2011). Class accuracies for individual bugweed cover categories were also compared by examining the user’s and producer’s accuracies. User’s accuracy was calculated by dividing the number of correctly classified samples in each class by the total number of samples that were classified in that particular class (expressed by the row total in the confusion matrix). Producer’s accuracy was calculated by dividing the number of correctly classified samples in each class by the number of reference samples used for that class (expressed by the column total in the confusion matrix) (Congalton and Green, 1999).
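The accuracy measures described above can be computed directly from a confusion matrix, as in the following sketch. It assumes rows hold the classified (map) classes and columns the reference classes, and it uses the standard sample-count form of quantity and allocation disagreement. The 3 × 3 matrix is a toy example, not data from this study.

```python
def accuracy_report(matrix):
    """matrix[i][j] = samples classified as class i whose reference class is j."""
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_tot = [sum(row) for row in matrix]
    col_tot = [sum(matrix[i][j] for i in range(n_classes)) for j in range(n_classes)]
    diag = [matrix[g][g] for g in range(n_classes)]

    overall = sum(diag) / total
    users = [diag[g] / row_tot[g] for g in range(n_classes)]      # row totals
    producers = [diag[g] / col_tot[g] for g in range(n_classes)]  # column totals

    # Quantity disagreement: mismatch between class totals, halved to avoid double counting.
    quantity = sum(abs(row_tot[g] - col_tot[g]) for g in range(n_classes)) / (2 * total)
    # Allocation disagreement: errors that could be fixed by swapping labels between samples.
    allocation = sum(2 * min(row_tot[g] - diag[g], col_tot[g] - diag[g])
                     for g in range(n_classes)) / (2 * total)
    return overall, users, producers, quantity, allocation

# Toy confusion matrix for three hypothetical cover classes.
toy = [[5, 1, 0],
       [1, 4, 2],
       [0, 1, 6]]
overall, users, producers, quantity, allocation = accuracy_report(toy)
```

Quantity and allocation disagreement sum to the total disagreement, which equals one minus the overall accuracy. The function assumes every class has at least one classified and one reference sample; otherwise the per-class ratios are undefined.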
2.7. Results
2.7.1. SPLS-DA optimisation
Separate SPLS-DA models were developed for the four datasets used to detect bugweed abundance across the 10 cover classes. The contribution of each SPLS-DA component towards the respective models is shown in Fig. 5. Utilising only the AISA Eagle data, 10 components produced the lowest CV error of 15.49% with an "eta" value of 0.9. Similarly, with an "eta" value of 0.9, the WorldView-2 data obtained the lowest CV error of 11.50% using only 7 components. In comparison, the optimisation of the number of components and "eta" when using the remotely sensed images combined with LiDAR was also successful. The hyperspectral dataset achieved the lowest CV error of 13% with 9 latent components and an "eta" value of 0.7, while 5 components produced the lowest CV error of 14.56% with an "eta" value of 0.6 when utilising the multispectral data.
2.7.2. Bugweed detection analysis
Test dataset results indicate that the AISA Eagle bands produced an overall accuracy of 68.33% and a total disagreement of 31. Individual bugweed cover user’s and producer’s accuracies ranged from 50% to 100% (Table 2). When combining LiDAR with the AISA Eagle data, SPLS-DA produced an overall accuracy of 78.33% and a total disagreement of 22, with individual user’s and producer’s accuracies ranging from 50% to 100%. Using only the LiDAR dataset, however, produced an overall accuracy of 64% and a total disagreement of 35, with class accuracies between 30% and 80%. For comparison purposes, a traditional PLS-DA analysis was used to classify the AISA Eagle dataset. The results revealed an overall accuracy of 63.90% with user’s and producer’s accuracies between 45% and 92%. Additionally, the combination of LiDAR with the hyperspectral dataset produced a PLS-DA accuracy of 72% and user’s and producer’s accuracies ranging from 54% to 89%, while LiDAR alone obtained a PLS-DA accuracy of 62.24% with class accuracies between 35% and 71%.
Utilising the WorldView-2 sensor, test dataset results show that the multispectral data produced an overall accuracy of 63.33% and a total disagreement of 37 (Table 3). Accuracies for individual bugweed cover classes ranged from 33% to 83%. When combining LiDAR with WorldView-2 bands, the dataset produced an improved overall accuracy of 75% with a total disagreement of 25. In addition, individual bugweed user’s and producer’s accuracies ranged between 50% and 100%. In comparison, a traditional PLS-DA analysis revealed overall accuracies of 60.53% and 69.11% when using the WV-2 data only and with the inclusion of LiDAR, respectively. User’s and producer’s accuracies ranged from 30% to 75% for the WV-2 dataset and from 50% to 98% with the inclusion of LiDAR.
2.7.3. Classifier stability
Fig. 6 illustrates the classification stability for each of the SPLS-DA models using 100 iterations for splitting the training and validation dataset. Classification means produced using the hyperspectral bands combined with LiDAR were greater than 78% with a standard deviation of 1.7%, compared to the hyperspectral bands alone, which produced classification means greater than 68% and a standard deviation of 2.8%. When using the WorldView-2 bands in conjunction with LiDAR, classification means were found to be greater than 74% with a standard deviation of 1.8%. In comparison, the multispectral image excluding LiDAR produced classification means greater than 63% with a standard deviation of 3.3%. The variation in classification accuracy was lowest when utilising the remote sensing datasets with LiDAR, which also produced higher classification accuracies than the wavelengths alone.
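The stability assessment can be sketched as a repeated 60/40 holdout that records the mean and standard deviation of test accuracy over 100 random splits. The sketch below makes several assumptions not stated in the text: the split is stratified by cover class, the design is balanced (15 plots in each of 10 classes), and a simple nearest-centroid rule on a synthetic one-dimensional feature stands in for SPLS-DA.

```python
import random
import statistics

def stratified_split(labels, train_fraction, rng):
    """Return (train_idx, test_idx), drawing train_fraction of each class for training."""
    by_class = {}
    for idx, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(idx)
    train, test = [], []
    for idxs in by_class.values():
        idxs = idxs[:]
        rng.shuffle(idxs)
        cut = round(train_fraction * len(idxs))
        train += idxs[:cut]
        test += idxs[cut:]
    return train, test

def repeated_holdout(labels, fit_and_score, n_iter=100, train_fraction=0.6, seed=0):
    """Repeat the random split n_iter times; return mean and SD of test accuracy."""
    rng = random.Random(seed)
    scores = [fit_and_score(*stratified_split(labels, train_fraction, rng))
              for _ in range(n_iter)]
    return statistics.mean(scores), statistics.stdev(scores)

# Synthetic stand-in data: 10 classes x 15 plots, one noisy feature per plot.
labels = [c for c in range(10) for _ in range(15)]
noise = random.Random(1)
features = [lab + noise.gauss(0, 0.6) for lab in labels]

def nearest_centroid_score(train, test):
    """Toy classifier (NOT SPLS-DA): assign each test plot to the closest class mean."""
    sums, counts = {}, {}
    for i in train:
        sums[labels[i]] = sums.get(labels[i], 0.0) + features[i]
        counts[labels[i]] = counts.get(labels[i], 0) + 1
    centroids = {c: sums[c] / counts[c] for c in sums}
    hits = sum(1 for i in test
               if min(centroids, key=lambda c: abs(features[i] - centroids[c])) == labels[i])
    return hits / len(test)

mean_acc, sd_acc = repeated_holdout(labels, nearest_centroid_score)
```

Under these assumptions each split places 90 plots in training and 60 in testing; a smaller standard deviation across iterations indicates a classifier that is less sensitive to which plots are held out.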
Fig. 5. Determining the discriminatory power of SPLS-DA components using AISA Eagle hyperspectral data and WorldView-2 multispectral bands with and without LiDAR. Tenfold cross validation was used to determine the lowest error rate conditioned on the training dataset. The optimal component with the lowest error is indicated by the black arrows.
Table 2 Classification accuracies using hyperspectral imagery with LiDAR. Bugweed cover
AISA Eagle
LiDAR
AISA Eagle + LiDAR
Producer
User
Producer
User
Producer
User
1–5% 6–10% 11–20% 21–30% 31–40% 41–50% 51–60% 61–75% 76–100% No cover
50 50 67 67 67 67 83 83 83 67
50 75 57 50 80 67 71 71 100 80
65 50 50 50 50 67 80 67 80 80
60 75 30 50 75 80 71 67 80 80
50 67 67 83 83 67 83 100 100 83
60 67 67 71 83 100 71 86 100 83
Overall accuracy Allocation disagreement Quantity disagreement
68.33 23 8
Fig. 7 shows the most effective variables that were automatically retained by the SPLS-DA algorithm for the classification of the hyperspectral and multispectral datasets. In the hyperspectral dataset, bands within the visible portion (400–700 nm) of the electromagnetic spectrum were important for the classification of bugweed cover while all other contributions from other variables towards the model’s performance were eliminated. More specifically, a total of 50 bands best explained the discrimination among
64.00 27 8
78.33 17 5
the bugweed cover categories and were located in intervals in the blue (393–397 nm; 405–409 nm; 413–423 nm), green (532– 563 nm), red (590–616 nm; 650–670 nm) and red edge regions (725–734 nm) respectively. In total, 11 bands were considered important in the blue, 14 in the green, 20 in the red and 5 in the red-edge portion of the spectrum. When utilising the multispectral image to detect the abundance of bugweed cover, SPLS-DA retained 5 bands. These bands were
174
K. Peerbhay et al. / ISPRS Journal of Photogrammetry and Remote Sensing 121 (2016) 167–176
Table 3 Classification accuracies using multispectral imagery with LiDAR. Bugweed cover
WorldView-2
WorldView-2 + LiDAR
Producer
User
Producer
User
1–5% 6–10% 11–20% 21–30% 31–40% 41–50% 51–60% 61–75% 76–100% No cover
33 33 67 67 67 67 83 67 83 67
40 40 57 57 67 67 63 80 83 80
50 50 67 67 83 83 83 83 100 83
60 60 67 67 71 71 100 83 86 83
Overall accuracy Allocation disagreement Quantity disagreement
63.33 30 7
393 413 434 455 477 498 520 541 563 585 607 630 652 674 696 719 741 764 786 809 832 855 878 900 923 946 969 992
75.00 20 5
Hyperspectral wavelengths (nm)
Fig. 6. Boxplots assessing the stability of SPLS-DA using different input datasets considered in this study.
located in the new coastal blue (427 nm), the green (546 nm), yellow (608 nm), red (659 nm) and red-edge (742 nm) regions.

Fig. 7. Location of the most effective bands selected by the SPLS-DA algorithm using hyperspectral and multispectral remotely sensed data for detecting bugweed canopy cover. (Horizontal axes: hyperspectral and multispectral wavelengths, nm.)

2.8. Discussion

This study has presented valuable evidence for the application and potential of combining LiDAR-derived height with remote sensing imagery to accurately detect the abundance of bugweed cover in a plantation forest environment. The results show the potential of using high resolution WorldView-2 imagery and LiDAR to discriminate between the 10 bugweed cover classes; however, the spectral resolution of the hyperspectral sensor produced the most effective results. The SPLS-DA algorithm provided an ideal framework for analysing the high resolution and high dimensional datasets and for detecting subtle differences between the different bugweed cover classes.

2.8.1. Classifications performed with imagery

SPLS-DA successfully reduced the large number of hyperspectral bands to only a few components, avoiding model overfitting and removing the influence of non-significant information (Wold et al., 2001). The results show that iteratively adding components to the SPLS-DA model decreases the CV error (Wold et al., 2001). Utilising the AISA Eagle hyperspectral dataset, 10 latent components retained the most important variables for detecting bugweed cover and produced the lowest CV error (15.49%). The accuracy produced using the hyperspectral dataset is comparable to that of Lass et al. (2002) and Lass et al. (2005) for discriminating between different cover categories of alien vegetation. For example, using a charge-coupled device (CCD) (415–953 nm) with a spatial resolution of 2 m, Lass et al. (2005) produced an overall site detection average of 67% for knapweed and 83.5% for baby’s breath infestations. In a previous study, the cover densities of spotted knapweed infestations were successfully detected (>70% cover) using an imaging spectrometer (440–2543 nm) with a 5 m spatial resolution (Lass et al., 2002). When utilising the WorldView-2 bands, 7 latent components were retained (lowest CV error = 11.50%) to produce an overall accuracy that is consistent with other studies that have mapped the cover of IAPs using multispectral datasets (Carson et al.,
1995; Lass et al., 1996, 2000). For example, Lass et al. (1996) distinguished and mapped the different cover classes (none, 30–70%, 71–90%, 91–100%) of yellow starthistle (Centaurea solstitialis) and St John’s wort (Hypericum perforatum) using a multispectral CCD (400–1000 nm) and an unsupervised image classification approach. The study produced its best overall classification accuracies, between 77% and 81%, when weed cover was greater than 50%, using 2 m or 4 m resolutions and when the number of classes was reduced from 9 to 4. In a subsequent study (Lass et al., 2000), a multispectral CCD with 4 m spatial resolution produced the best classification accuracies for yellow starthistle cover (31–70% cover, 71–100% cover) using a Bayesian methodology.

When examining the classification of individual bugweed cover classes, most of the confusion occurred between the low bugweed cover classes (i.e. 1–5%; 6–10%), which could be a result of the limited spectral resolution of the multispectral sensor. However, when compared to the AISA Eagle dataset, the WorldView-2 sensor provided a 9% and 7% improvement in user’s accuracy for detecting bugweed in the 61–75% and 21–30% bugweed cover classes respectively, while similar producer’s accuracies were obtained for the majority of cover categories (no cover, 11–60% and 76–100%). Nonetheless, the bugweed cover categories whose classification accuracies decreased or improved could have been affected by variation in reflectance caused by shadowing, differences in light absorption or non-photosynthetic material such as rock and underlying soil (Andrew and Ustin, 2008). For instance, lower classification accuracies were obtained for low bugweed cover percentages in all datasets (i.e. 1–5%, 6–10%). This could be a result of these low bugweed cover classes being subject to the reflectance of other materials within the 20 m × 20 m sample plots.
Future studies may investigate the masking of background features to improve the detection of low bugweed cover within forest plantations.

The difference in solar and sensor geometry between the AISA Eagle and WorldView-2 imagery may also have contributed to the classification performance within the various bugweed cover groups and could have influenced the overall detection results. For instance, the difference of 20° between the images represents an increase in the classification angle of the target weed samples in the WorldView-2 image. This increase in classification angle could have incorrectly increased the number of classified bugweed pixels on the image (producer’s accuracy) by including background or mixed pixels with similar spectral patterns. Alternatively, a decrease in classification angle could reduce the number of classified bugweed pixels by only detecting pixels with pure reflectance (Lass and Prather, 2004; Lass et al., 2005). Since the sun-sensor configuration of WorldView-2 varies from scene to scene and could be rotated, further research is required to investigate the best settings, including across different seasons, to understand and minimise the role of the Bidirectional Reflectance Distribution Function (BRDF) when detecting and mapping the cover of IAPs.

The use of LiDAR alone for discriminating between the various categories of bugweed cover proved successful, with a slight overall improvement of 0.67% over the WorldView-2 dataset alone. When compared to the performance of the hyperspectral bands only, an overall reduction of 4.33% was noticed. Nonetheless, LiDAR data alone provided class accuracies comparable to those of both optical images individually, with improved detection accuracies for the low bugweed cover groups of 1–5% and 6–10% as well as in the cover categories of no cover, 41–50% and 51–60%.

2.8.2. Classifications performed with imagery and LiDAR

SPLS-DA effectively reduced the hyperspectral and multispectral bands that had been combined with LiDAR to retain only the most relevant information for the mapping of bugweed cover. Utilising 9 components, the hyperspectral dataset produced a 10%
improvement in the overall classification accuracy with individual class accuracies ranging between 50% and 100%. However, individual class accuracies ranged between 67% and 100% for bugweed cover greater than 5%. In comparison, utilising 5 components, the WorldView-2 bands in conjunction with LiDAR produced an 11.67% improvement in the overall classification accuracy compared to using the original multispectral bands. Individual bugweed cover class accuracies were between 50% and 100%. While the incorporation of LiDAR with both remotely sensed datasets improved the user’s and producer’s accuracies compared to using the multispectral and hyperspectral bands alone, the 1–5% cover class was the least correctly detected bugweed cover class. Nonetheless, this study demonstrates the potential of using remotely sensed data in conjunction with LiDAR to detect various levels of bugweed cover, with particular focus on low weed abundance.

Focusing on the best overall classification accuracy, it was observed that among the datasets utilised in this study, those that combined imagery with LiDAR produced the best results. More specifically, the AISA Eagle hyperspectral dataset produced slightly higher accuracies than the WorldView-2 dataset due to its greater spectral capability. However, accuracy differences between the individual bugweed cover classes were often marginal when comparing the user’s and producer’s accuracies obtained for each cover class.

Due to the iterative classification approach implemented in SPLS-DA, stability measures of the overall accuracy for each dataset were provided. The remotely sensed images combined with LiDAR provided classification results with the least variation compared to using only the images for detecting bugweed cover.
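The idea behind such a stability assessment can be sketched in Python as repeated random train/test splits, with the spread of the resulting overall accuracies summarised by the statistics a boxplot displays. This is only an illustration under stated assumptions: the data are synthetic, a simple nearest-centroid classifier stands in for SPLS-DA, and all names (`make_synthetic_plots`, `stability`, the 70% training fraction, the number of iterations) are hypothetical rather than taken from the study.

```python
import random
import statistics


def make_synthetic_plots(seed=1, per_class=20):
    """Synthetic stand-in data: (features, label), features = [band value, height]."""
    rng = random.Random(seed)
    samples = []
    for label, (band_mean, height_mean) in enumerate([(0.30, 2.0), (0.35, 4.0), (0.40, 6.0)]):
        for _ in range(per_class):
            samples.append(([rng.gauss(band_mean, 0.05), rng.gauss(height_mean, 1.0)], label))
    return samples


def fit_centroids(samples):
    """Mean feature vector per class (stand-in classifier, not SPLS-DA)."""
    grouped = {}
    for features, label in samples:
        grouped.setdefault(label, []).append(features)
    return {label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in grouped.items()}


def predict(centroids, features):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    return min(centroids,
               key=lambda label: sum((a - b) ** 2 for a, b in zip(features, centroids[label])))


def stratified_split(samples, train_fraction, rng):
    """Shuffle within each class and split so every class appears in both sets."""
    by_label = {}
    for sample in samples:
        by_label.setdefault(sample[1], []).append(sample)
    train, test = [], []
    for rows in by_label.values():
        rows = rows[:]
        rng.shuffle(rows)
        cut = max(1, int(round(train_fraction * len(rows))))
        train.extend(rows[:cut])
        test.extend(rows[cut:])
    return train, test


def stability(samples, iterations=100, train_fraction=0.7, seed=0):
    """Overall accuracy (%) over repeated splits, summarised as boxplot statistics."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(iterations):
        train, test = stratified_split(samples, train_fraction, rng)
        centroids = fit_centroids(train)
        correct = sum(predict(centroids, features) == label for features, label in test)
        accuracies.append(100.0 * correct / len(test))
    q1, median, q3 = statistics.quantiles(accuracies, n=4)
    return {"min": min(accuracies), "q1": q1, "median": median,
            "q3": q3, "max": max(accuracies), "iqr": q3 - q1}


plots = make_synthetic_plots()
spectral_only = [([features[0]], label) for features, label in plots]
print("spectral only:    ", stability(spectral_only))
print("spectral + height:", stability(plots))
```

A narrower interquartile range for one input dataset would indicate that its accuracy depends less on which samples happen to fall in the training and test sets.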
Such information on the stability of classifiers increases the reliability and robustness of the reported results and is valuable for capturing the influence of changing training and test sample combinations (Fassnacht et al., 2014; Peerbhay et al., 2014a).

The utility of LiDAR-derived height in conjunction with remotely sensed imagery offers an alternative to traditional field-based methods of acquiring detailed information related to the precision mapping of IAPs. While current field survey methods cannot efficiently detect changes in invasive plant cover or capture information on infestation size, shape or abundance, such information can potentially be collected and improved by using remote sensing technologies in conjunction with LiDAR. Therefore, the methodology used in this study can provide valuable information related to the changes in bugweed cover and could be useful in determining the effectiveness of chemical or biological control methods in forest plantations.

2.9. Conclusion

Combining LiDAR information with remotely sensed imagery has proven to be a significant factor in improving the classification accuracy when detecting the abundance of bugweed infestations. Utilising AISA Eagle hyperspectral data and WorldView-2 multispectral bands, bugweed abundance was detectable with class accuracies varying between 33% and 100% for cover from 1% to 100%. Incorporating LiDAR with imagery produced improved class accuracies between 50% and 100%, with an overall classification accuracy of 78.33% for the AISA dataset and 75.00% for the WorldView-2 bands. Utilising LiDAR only for discriminating between the bugweed cover groups revealed a 64% overall accuracy with class accuracies ranging from 30% to 80%, which were comparable to the results produced when using the AISA and WorldView-2 datasets individually. While SPLS-DA worked well for most of the bugweed cover categories, the 1–5% cover class was the least often detected.
Overall, the SPLS-DA algorithm provided a unique framework for dealing with high resolution and high dimensional datasets that were combined with LiDAR for successful classification and variable selection. Wavelengths selected by SPLS-DA coincided with the visible and red-edge regions for both images and were the most effective for detecting the abundance of the weed.

Acknowledgements

This study was carried out with the financial support from the Applied Centre for Climate and Earth Systems Science (ACCESS). We extend our gratitude to Sappi forests-SA for allowing us the privilege of working with LiDAR information.

References

Andrew, M.E., Ustin, S.L., 2008. The role of environmental context in mapping invasive plants with hyperspectral image data. Rem. Sens. Environ. 112 (12), 4301–4317.
ArcMap, 2013. Arcgis version 10.2. Environmental Systems Research Institute (ESRI), California.
Asner, G.P., Hughes, R.F., Vitousek, P.M., Knapp, D.E., Kennedy-Bowdoin, T., Boardman, J., Martin, R.E., Eastwood, M., Green, R.O., 2008. Invasive plants transform the three-dimensional structure of rain forests. Proc. Natl. Acad. Sci. 105, 4519–4523.
Atkinson, J.T., Ismail, R., Robertson, M., 2014. Mapping bugweed (Solanum mauritianum) infestations in Pinus patula plantations using hyperspectral imagery and support vector machines. IEEE J. Sel. Top. Appl. Earth Obs. Rem. Sens. 7, 17–28.
Campbell, P., van Staden, J., 1990. Utilisation of solasodine from fruits for long-term control of Solanum mauritianum. S. Afr. For. J. 155 (1), 57–60.
Carson, H.W., Lass, L.W., Callihan, R.H., 1995. Detection of yellow hawkweed (Hieracium pratense) with high resolution multispectral digital imagery. Weed Technol., 477–483.
Chun, H., Keleş, S., 2010. Sparse partial least squares regression for simultaneous dimension reduction and variable selection. J. Roy. Stat. Soc.: Ser. B (Stat. Methodol.) 72 (1), 3–25.
Chung, D., Keleş, S., 2010. Sparse partial least squares classification for high dimensional data. Stat. Appl. Genet. Mol. Biol. 9 (1).
Congalton, R., Green, K., 1999. Assessing the Accuracy of Remotely Sensed Data: Principles and Practices. Lewis Publishers, Florida.
Copeland, R.S., Wharton, R.A., 2006. Year round production of Pest Ceratitis species (Diptera: Tephritidae) in fruit of the invasive species Solanum mauritianum in Kenya. Ann. Entomol. Soc. Am. 99 (3), 530–535.
DAFF, 2009. Report on Commercial Timber Resources and Primary Roundwood Processing in South Africa. Department of Agriculture, Forestry and Fisheries, Pretoria.
Dye, M., Mutanga, O., Ismail, R., 2011. Examining the utility of random forest and AISA Eagle hyperspectral image data to predict Pinus patula age in Kwazulu-Natal, South Africa. Geocarto Int. 26 (4), 275–289.
ENVI, 2009. Environment for visualizing images: Version 4.7. USA, Exelis Visual Information Solutions. ITT Industries.
Fairbanks, D.H.K., Thompson, M.W., Vink, D.E., Newby, T., van den Berg, H.M., Everard, D.A., 2000. The South African land-cover characteristics database: a synopsis of the landscape. S. Afr. J. Sci. 96, 69–82.
Fassnacht, F., Neumann, C., Forster, M., Buddenbaum, H., Ghosh, A., Clasen, A., Joshi, P., Koch, B., 2014. Comparison of feature reduction algorithms for classifying tree species with hyperspectral data on three central European test sites. IEEE J. Sel. Top. Appl. Earth Obs. Rem. Sens. 7 (6), 2547–2561.
Fuller, D.O., 2005. Remote detection of invasive melaleuca trees (Melaleuca quinquenervia) in south Florida with multispectral IKONOS imagery. Int. J. Rem. Sens. 26 (5), 1057–1063.
Glenn, N.F., Mundt, J.T., Weber, K.T., Prather, T.S., Lass, L.W., Pettingill, J., 2005. Hyperspectral data processing for repeat detection of small infestations of leafy spurge. Rem. Sens. Environ. 95 (3), 399–412.
Goel, P., Prasher, S., Patel, R., Smith, D., DiTommaso, A., 2002. Use of airborne multispectral imagery for weed detection in field crops. Trans.-Am. Soc. Agric. Eng. 45 (2), 443–450.
Huang, C.-Y., Asner, G., 2009. Applications of remote sensing to alien invasive plant studies. Sensors 9 (6), 4869–4889.
Jordaan, L.A., Downs, C.T., 2012. Comparison of germination rates and fruit traits of indigenous Solanum giganteum and invasive Solanum mauritianum in South Africa. S. Afr. J. Bot. 80, 13–20.
Laba, M., Downs, R., Smith, S., Welsh, S., Neider, C., White, S., Richmond, M., Philpot, W., Baveye, P., 2008. Mapping invasive wetland plants in the Hudson River National Estuarine research reserve using quickbird satellite imagery. Rem. Sens. Environ. 112, 286–300.
Lass, L.W., Carson, H.W., Callihan, R.H., 1996. Detection of yellow starthistle (Centaurea solstitialis) and common St. Johnswort (Hypericum perforatum) with multispectral digital imagery. Weed Technol., 466–474.
Lass, L.W., Prather, T.S., 2004. Detecting the locations of Brazilian pepper trees in the everglades with a hyperspectral sensor. Weed Technol. 18 (2), 437–442.
Lass, L.W., Prather, T.S., Glenn, N.F., Weber, K.T., Mundt, J.T., Pettingill, J., 2005. A review of remote sensing of invasive weeds and example of the early detection of spotted knapweed (Centaurea maculosa) and babysbreath (Gypsophila paniculata) with a hyperspectral sensor. Weed Sci. 53 (2), 242–251.
Lass, L.W., Shafii, B., Price, W.J., Thill, D.C., 2000. Assessing agreement in multispectral images of yellow starthistle (Centaurea solstitialis) with ground truth data using a bayesian methodology. Weed Technol. 14 (3), 539–544.
Lass, L.W., Thill, D.C., Shafii, B., Prather, T.S., 2002. Detecting spotted knapweed (Centaurea maculosa) with hyperspectral remote sensing technology 1. Weed Technol. 16 (2), 426–432.
Lawrence, R.L., Wood, S.D., Sheley, R.L., 2006. Mapping invasive plants using hyperspectral imagery and Breiman Cutler classifications (randomforest). Rem. Sens. Environ. 100 (3), 356–362.
Le Maitre, D.C., Richardson, D.M., Chapman, R.A., 2004. Alien plant invasions in South Africa: driving forces and the human dimension. S. Afr. J. Sci. 100, 103–112.
Levine, J.M., Vilà, M., Antonio, C.M.D., Dukes, J.S., Grigulis, K., Lavorel, S., 2003. Mechanisms underlying the impacts of exotic plant invasions. Proc. Roy. Soc. Lond. B Biol. Sci. 270 (1517), 775–781.
Li, L., Cheng, Y.B., Ustin, S., Hu, X.T., Riaño, D., 2008. Retrieval of vegetation equivalent water thickness from reflectance using genetic algorithm (ga)-partial least squares (pls) regression. Adv. Space Res. 41 (11), 1755–1763.
Miao, X., Gong, P., Swope, S., Pu, R., Carruthers, R., Anderson, G.L., Heaton, J.S., Tracy, C.R., 2006. Estimation of yellow starthistle abundance through casi-2 hyperspectral imagery using linear spectral mixture models. Rem. Sens. Environ. 101 (3), 329–341.
Mucina, L., Rutherford, M.C., 2006. The Vegetation of South Africa, Lesotho and Swaziland. South African National Biodiversity Institute, Pretoria.
Müllerová, J., Pergl, J., Pyšek, P., 2013. Remote sensing as a tool for monitoring plant invasions: testing the effects of data resolution and image classification approach on the detection of a model plant species Heracleum mantegazzianum (giant hogweed). Int. J. Appl. Earth Obs. Geoinf. 25, 55–65.
Mundt, J.T., Glenn, N.F., Weber, K.T., Prather, T.S., Lass, L.W., Pettingill, J., 2005. Discrimination of hoary cress and determination of its detection limits via hyperspectral image processing and accuracy assessment techniques. Rem. Sens. Environ. 96, 509–517.
Mundt, J.T., Streutker, D.R., Glenn, N.F., 2006. Mapping sagebrush distribution using fusion of hyperspectral and lidar classifications. Photogramm. Eng. Rem. Sens. 72 (1), 47–54.
Olckers, T., 2011. Biological control of Solanum mauritianum scop. (solanaceae) in South Africa: will perseverance pay off? Afr. Entomol. 19 (2), 416–426.
Olckers, T., Borea, C., 2009. Assessing the risks of releasing a sap-sucking lace bug, Gargaphia decoris, against the invasive tree Solanum mauritianum in New Zealand. Biocontrol 54 (1), 143–154.
Omar, H., 2010. Commercial Timber Tree Species Identification Using Multispectral worldview2 Data. DigitalGlobe Incorporated, USA.
Pallett, R.N., 2005. Precision forestry for pulpwood re-establishment silviculture. S. Afr. Forestry J. 203 (1), 33–40.
Peerbhay, K.Y., Mutanga, O., Ismail, R., 2013. Commercial tree species discrimination using airborne AISA eagle hyperspectral imagery and partial least squares discriminant analysis (PLS-DA) in Kwazulu–Natal, South Africa. ISPRS J. Photogramm. Rem. Sens. 79, 19–28.
Peerbhay, K.Y., Mutanga, O., Ismail, R., 2014a. Does simultaneous variable selection and dimension reduction improve the classification of pinus forest species? J. Appl. Rem. Sens. 8 (1), 085194-085194.
Peerbhay, K.Y., Mutanga, O., Ismail, R., 2014b. Investigating the capability of few strategically placed worldview-2 multispectral bands to discriminate forest species in Kwazulu-Natal, South Africa. IEEE J. Sel. Top. Appl. Earth Obs. Rem. Sens. 7 (1), 307–316.
Peerbhay, K.Y., Mutanga, O., Ismail, R., 2015. Random forests unsupervised classification: the detection and mapping of Solanum mauritianum infestations in plantation forestry using hyperspectral data. IEEE J. Sel. Top. Appl. Earth Obs. Rem. Sens. 8 (6), 3107–3122.
Peerbhay, K., Mutanga, O., Ismail, R., 2016. The identification and remote detection of alien invasive plants in commercial forests: an overview. S. Afr. J. Geoinform. 5 (1), 49–67.
Pontius Jr, R.G., Millones, M., 2011. Death to kappa: birth of quantity disagreement and allocation disagreement for accuracy assessment. Int. J. Rem. Sens. 32 (15), 4407–4429.
R Development Core Team, 2012. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
Roberts, D., Yamaguchi, Y., Lyon, R., 1986. Comparison of Various Techniques for Calibration of AIS Data. NASA STI/Recon Technical Report, 87:12970.
Tesfamariam, S., Liu, Z., 2010. Earthquake induced damage classification for reinforced concrete buildings. Struct. Saf. 32 (2), 154–164.
Ustin, S.L., DiPietro, D., Olmstead, K., Underwood, E., Scheer, G.J., 2002. Hyperspectral remote sensing for invasive species detection and mapping. IEEE Geosci. Rem. Sens. 1653, 1658–1660.
Wold, S., Sjöström, M., Eriksson, L., 2001. PLS-regression: a basic tool of chemometrics. Chemometr. Intell. Lab. Syst. 58 (2), 109–130.
Wolter, P.T., Townsend, P.A., Sturtevant, B.R., 2009. Estimation of forest structural parameters using 5 and 10 meter spot-5 satellite data. Rem. Sens. Environ. 113 (9), 2019–2036.
Wolter, P.T., Townsend, P.A., Sturtevant, B.R., Kingdon, C.C., 2008. Remote sensing of the distribution and abundance of host species for spruce budworm in Northern Minnesota and Ontario. Rem. Sens. Environ. 112 (10), 3971–3982.