Radiometric calibration assessments for UAS-borne multispectral cameras: Laboratory and field protocols


ISPRS Journal of Photogrammetry and Remote Sensing 149 (2019) 132–145


Sen Cao a,b, Brad Danielson b, Shari Clare b,⁎, Shantel Koenig b, Carlos Campos-Vargas a, Arturo Sanchez-Azofeifa a,⁎

a Department of Earth and Atmospheric Sciences, University of Alberta, Edmonton T6G 2E3, Canada
b Fiera Biological Consulting, Edmonton T6E 1Z9, Canada

ARTICLE INFO

Keywords: UAS; Radiometric calibration; Multispectral sensors

ABSTRACT

The main objective of this study was to develop and test a framework that can be used by Unmanned Aerial Systems (UAS) operators with varying technical backgrounds to estimate the accuracy and reliability of multispectral (visible and Near-Infrared or NIR) sensor measurements. We evaluated the performance of two multispectral sensors – the MicaSense RedEdge and the Airinov MultiSpec 4C – in both a laboratory and field setting. In the laboratory, we measured the reflectance of a number of reference target materials using each UAS sensor, and compared the values to those measured using a calibrated spectrometer. We found a strong linear relationship between the measurements made by the MicaSense RedEdge and the spectrometer, while the relationship was much weaker for the Airinov MultiSpec 4C, particularly in the longer wavelength bands (red-edge and NIR). A sub-set of the target materials were selected as ground reference targets for three field calibration exercises. In field calibration assessment No. 1, imagery was collected using each UAS sensor and reflectance values were extracted from pixels covering the ground reference targets. The extracted values were compared to the reflectance values acquired in the laboratory, and both UAS sensors were found to over-estimate reflectance, with lower accuracy in red-edge and NIR bands. Field calibration assessment No. 2 involved a calculation of Normalized Difference Vegetation Index (NDVI) values at field control points using both UAS sensors, and we found a strong linear relationship between the NDVI values and measurements made by a hand-held NDVI sensor, suggesting that the calculation of a normalized band ratio (i.e., NDVI) effectively reduces the reflectance measurement inaccuracy that we observed previously. Field calibration assessment No. 3 included image acquisition of ground reference targets using the MicaSense RedEdge sensor over seventeen sequential field surveys. 
Results revealed measurement variability over time, suggesting that daily differences in solar illumination and atmospheric conditions may influence derived reflectance values. In light of these results, we propose simplified procedures that can be adopted by UAS operators to periodically assess the radiometric fidelity of their multispectral sensors.

⁎ Corresponding authors. E-mail addresses: [email protected] (S. Clare), [email protected] (A. Sanchez-Azofeifa).

https://doi.org/10.1016/j.isprsjprs.2019.01.016 Received 12 July 2018; Received in revised form 10 January 2019; Accepted 20 January 2019. 0924-2716/ © 2019 Published by Elsevier B.V. on behalf of International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS).

1. Introduction

Recent advances in Unmanned Aerial Systems (UAS) have made these instruments increasingly popular for on-demand imagery acquisition for a variety of research and commercial applications (Colomina and Molina, 2014). Compared to traditional airborne or satellite-based platforms, UAS have several unique advantages. These include the ability to acquire ultra-high spatial resolution imagery at a relatively low cost in a wide range of different environments and at time steps that are dictated by the user (Berni et al., 2009; Ambrosia et al., 2003; Yuan et al., 2015; Dall’Asta et al., 2017). In addition, UAS can carry a variety of digital sensors, including multispectral, hyperspectral, thermal, and LiDAR, which makes these systems attractive to a wide range of users in both the private and public sectors (Toth and Józków, 2016; Aasen and Bolten, 2018; Webster et al., 2018; Liu et al., 2018). In particular, the use of multispectral (visible and Near-Infrared or NIR) sensors on UAS platforms for vegetation assessment and monitoring has expanded dramatically over the last decade, most notably in the realm of agriculture and precision farming (e.g., Deng et al., 2018; Jin et al., 2017; Stagakis et al., 2012; Zhang and Kovacs, 2012; Laliberte et al., 2011). More recently, hyperspectral imagers and LiDAR sensors have been employed on UAS for a range of applications, although the use of these sensors is still limited due to cost and availability (Zarco-Tejada et al., 2013; Toth and Józków, 2016).

As the commercial use of multispectral sensors on UAS grows, these systems are increasingly being operated by users with limited technical training in remote sensing or image processing. As a result, the ability of users to carefully consider and evaluate the quality of data acquired from the systems is somewhat limited. This limitation is important, because unlike airborne or satellite-based systems that have been rigorously calibrated and validated (such as the Landsat series, MODIS and Sentinel-2), sensors instrumented on UAS are customized with varied band designations, and the reliability and accuracy of these sensors are seldom reported by the manufacturers. Thus, users must simply trust that the sensors they instrument on their UAS are calibrated correctly “out of the box”, which includes the important assumption that multispectral sensors reliably produce accurate data under a variety of atmospheric conditions and through changing solar illumination conditions. Furthermore, users must trust that their sensors perform reliably through time, despite periodic software or firmware upgrades, and despite regular wear and tear (e.g., heat, vibration, dust, moisture, etc.) that can lead to the degradation of lenses, optical coatings, filters, and even electronics, with unknown consequences for sensor performance (Wang et al., 2012; Pena et al., 2013). These potential sources of error pose a serious threat to the reputation, productivity, and profitability of commercial UAS operators who rely on sensors to collect reliable and accurate information that allows them to track changes in land cover through time and space. Being forced to blindly trust in the long-term reliability of a sensor without having a method to test and assess its accuracy is an unmanaged risk.
For example, it has been our experience, in both research and commercial settings, that UAS multispectral sensors can fail to produce repeatable measurements over time, or exhibit changes in performance following a firmware update (Assmann et al., 2018). Thus, we feel that these sensors require careful monitoring for reliability as part of standard operating procedures. However, this type of reliability monitoring is extremely difficult, primarily due to a general lack of transparency in sensor function as it relates to the conversion of raw image data (i.e., Digital Numbers or DNs) to radiance or surface reflectance values. When a multispectral sensor is used to collect an image, each raw image pixel contains a DN. A DN is a binary-encoded value of the intensity of light reflected from the earth’s surface, as measured by the sensor using a particular combination of exposure settings (e.g., shutter speed, aperture, ISO/gain, etc.) at the instant that the image was obtained. A radiometric conversion model is a set of mathematical equations that allow DNs to be converted to radiance (in units of W·sr⁻¹·m⁻²·nm⁻¹, or watts per steradian per square metre per nanometre), which is the flux of radiant energy being received by the camera. Radiance measured at the camera is a function of the solar illumination (irradiance) at that instant in time and the physical properties of the surface material, both of which affect how much light is absorbed or reflected. The ratio of the reflected radiance to the irradiance is called surface reflectance, a unitless measure that can be compared over time because the effect of the incident illumination has been normalized. Multispectral imagery must be converted from radiance to surface reflectance before it can be used for land management applications, such as assessing the change in vegetation condition through time.
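The normalization of radiance by irradiance described above can be illustrated with a short Python sketch. It assumes a Lambertian (perfectly diffuse) surface, for which the reflectance factor is commonly written as π·L/E. The factor π, the function name, and the sample numbers are illustrative assumptions and do not describe any manufacturer's processing chain.

```python
import math


def reflectance_factor(radiance, irradiance):
    """Unitless reflectance factor of an assumed Lambertian surface.

    radiance:   at-sensor spectral radiance L (W·sr^-1·m^-2·nm^-1)
    irradiance: downwelling spectral irradiance E (W·m^-2·nm^-1)
    """
    if irradiance <= 0:
        raise ValueError("irradiance must be positive")
    return math.pi * radiance / irradiance


# Hypothetical numbers: the same surface under bright and dim illumination.
bright = reflectance_factor(radiance=0.10, irradiance=1.0)
dim = reflectance_factor(radiance=0.05, irradiance=0.5)
```

Both calls return the same value even though the radiance differs by a factor of two, which is the sense in which reflectance can be compared across acquisitions.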
Each multispectral sensor that is manufactured has its own unique physical properties (e.g., minor variations in lens curvature, filter performance, etc.), and as such, every sensor must undergo a system calibration prior to leaving the factory to ensure that raw DN values can be accurately converted to radiance values. While the exact calibration methods used by manufacturers are proprietary, these calibrations generally include a multi-point comparison of sensor measurements against a calibrated spectrometer or set of reference targets (a “standard”). These measurements are then used to derive a set of unique

calibration coefficients that make a generalized radiometric conversion model, which allows for the accurate conversion of DN values to radiance values, which can then be converted to reflectance. The mathematical functions or models that describe the radiometric conversion process for traditional airborne and satellite sensors are typically available to users who process and analyse imagery obtained from these platforms; however, availability of this information is relatively rare for imagery obtained from UAS sensors. Consequently, users of UAS sensors are forced to rely on software that is provided by the sensor manufacturer or is available through a third-party provider (e.g., Pix4D MapperPro) to perform the radiometric conversion behind the scenes during image processing. As a result, the radiometric correction process is generally a “black-box” to most users. This lack of transparency is problematic because UAS sensor performance may change over time, and these sensors have not been in use for long enough for users to know when, or by how much, performance degradation may occur. Thus, it is critical that manufacturers enable their users to perform ongoing assessments of calibration accuracy to maintain confidence and trust in the performance of their sensor. Open access to the radiometric conversion models and calibration coefficients would allow users to perform their own independent calculations to convert sensor DNs to radiance, thereby allowing for independent calibration assessments. Some sensor manufacturers have begun to share this critical information with their user community; however, there are currently no commonly adopted procedures or protocols that allow users to do this. The only user-calibration method prescribed by many multispectral sensor manufacturers is to collect one or more images of a Calibrated Reflectance Panel (CRP) before and after a flight, and these images are later used by image mosaicking software to derive reflectance. 
Through this “one-point calibration” process, users are led to believe that their imagery has been calibrated; however, this approach can only be used to adjust the scale of measured values. In this case, the scale adjustment allows radiance values to be expressed as reflectance, relative to the solar irradiation current to the moment the CRP image was collected. Because this conversion is conducted by proprietary software, it remains opaque to the user, and as a result, this type of calibration cannot be used to independently verify the accuracy or precision of the sensor across its entire measurement range. In the last few years, a number of users have worked towards developing their own radiometric correction and user calibration assessment methods for UAS sensors, both in the laboratory and field. For example, Kelcey and Lucieer (2012) proposed a workflow for the correction of a multispectral sensor onboard UAS, and considered sensor noise, vignetting (the phenomenon of brightness attenuation away from the image center) (Zheng et al., 2009), and lens distortion as sources of error; however, their method was designed for DN correction only, and the conversion of DNs to radiance was lacking. Other studies have used surrogate methods to derive radiance or reflectance based on field measurements and the empirical line approach (Wang and Myint, 2015; Hruska et al., 2012; Crusiol et al., 2017; Pozo et al., 2014). In these studies, ground targets with known reflectance were employed as references, and relationships were built between the reflectance of the targets and the DNs from the multispectral sensor (Smith and Milton, 1999). With different degrees of success, the relationship has been empirically assumed to be linear (Pozo et al., 2014), second-degree polynomial (Crusiol et al., 2017), or logarithmic (Wang and Myint, 2015). 
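As a rough illustration of the empirical line approach described above, the sketch below fits a linear relationship between image DNs and the known reflectance of ground targets, then applies it to other pixels. The DN and reflectance values are invented for illustration, the function names are hypothetical, and the linear form is only one of the options reported in the cited studies.

```python
import numpy as np


def fit_empirical_line(target_dn, target_reflectance):
    """Least-squares fit of reflectance = gain * DN + offset."""
    gain, offset = np.polyfit(target_dn, target_reflectance, deg=1)
    return gain, offset


def apply_empirical_line(dn, gain, offset):
    """Convert DNs to reflectance with a previously fitted line."""
    return gain * np.asarray(dn, dtype=float) + offset


# Hypothetical ground targets: mean DN in the image and known reflectance.
target_dn = np.array([5000.0, 20000.0, 45000.0])
target_reflectance = np.array([0.05, 0.20, 0.45])

gain, offset = fit_empirical_line(target_dn, target_reflectance)
scene_reflectance = apply_empirical_line([10000, 30000], gain, offset)
```

A second-degree polynomial could be fitted by passing `deg=2` to `np.polyfit`, at the cost of needing more targets spread across the brightness range.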
Nevertheless, the performance of UAS sensors used in these studies has seldom been evaluated, and for most of the applications the selection of ground targets tended to be subjective. These limitations likely introduced uncertainty into the mathematical relationships that were developed.

In this context, the main objective of this study was to develop calibration methods that can be used to evaluate and track the accuracy and reliability of UAS multispectral sensors through time. To do this, we developed and employed both laboratory and field methods, with the ultimate aim of developing a field methodology that can be adopted by other UAS operators who do not have access to highly specialized calibration equipment. First, we conducted a laboratory calibration assessment using two multispectral sensors from different manufacturers: the MicaSense RedEdge and the Airinov MultiSpec 4C. This allowed us to compare the measurement performance of each UAS sensor against that of a calibrated hyperspectral scanner (an Analytical Spectral Device or ASD) using a variety of target materials. The best performing target materials were identified and were used as ground reference targets for three different field calibration exercises that were conducted in an agricultural setting. In field calibration assessment No. 1, we compared MicaSense RedEdge reflectance measurements of our ground reference targets in aerial imagery against the known reflectance values acquired in the laboratory. Field calibration assessment No. 2 involved a comparison of reflectance values from field plots measured by the MicaSense and the Airinov multispectral sensors, against measurements taken by a Trimble GreenSeeker Normalized Difference Vegetation Index (NDVI) sensor. Field calibration assessment No. 3 involved a comparison of the reflectance of our ground reference targets collected by the MicaSense RedEdge sensor during seventeen sequential field surveys to explore measurement variability over time. We present the methods used for each of these calibration assessments, along with the accuracy results from each assessment. We conclude the paper by discussing the importance of integrating a calibration assessment procedure into standard UAS operating procedures and propose simplified procedures that other UAS users can adopt to validate the ongoing radiometric fidelity of multispectral sensors for remote sensing studies using UAS.

2. Methodology

2.1. UAS multispectral sensors and image preprocessing

We tested two popular and commercially available multispectral camera models: the MicaSense RedEdge (MicaSense Inc., Seattle, WA, USA; http://www.micasense.com/) and the Airinov MultiSpec 4C (Airinov SAS, Paris, France; http://www.airinov.fr/). Both cameras have narrow, non-overlapping bands in the visible and NIR (VIS-NIR, 400–1000 nm) range. The band positions were designed to measure light in the visible range where plants strongly absorb light to conduct photosynthesis, and light in the longer wavelength range that plants strongly reflect due to leaf cell structure (Table 1). When the MicaSense RedEdge or the Airinov MultiSpec 4C captures a single image, it produces an individual raw photo for each spectral band. This type of camera is commonly used for commercial agricultural monitoring applications, and both cameras that we tested have Downwelling Light Sensors (DLS) that measure and record the intensity of incoming light (irradiance) at the time of each photo capture. This information is stored in the photo metadata and can later be retrieved to help correct for the effects of illumination variability during the process of aggregating individual photos into a mosaic image. We used the auto-settings of both cameras for all calibration assessments.

We used two different UAS to carry the multispectral camera models. The MultiSpec 4C was onboard a senseFly eBee (senseFly, Lausanne, Switzerland; https://www.sensefly.com/; “Multispectral System 1” in Table 2) and the MicaSense RedEdge was onboard a Draganfly X4-P (Draganfly Innovations Inc., Saskatoon, SK, Canada; http://www.draganfly.com/; “Multispectral System 2” in Table 2). This set-up was used for both laboratory and field calibration assessments.

Table 1. Band characteristics of the two multispectral sensors included in this study. FWHM is short for full width at half maximum.

| Camera manufacturer and brand name | Band number | Band name | Center wavelength (nm) | Bandwidth FWHM (nm) |
|---|---|---|---|---|
| MicaSense RedEdge | 1 | Blue | 475 | 20 |
| MicaSense RedEdge | 2 | Green | 560 | 20 |
| MicaSense RedEdge | 3 | Red | 668 | 10 |
| MicaSense RedEdge | 4 | Rededge | 717 | 10 |
| MicaSense RedEdge | 5 | NIR | 840 | 40 |
| Airinov MultiSpec 4C | 1 | Green | 550 | 40 |
| Airinov MultiSpec 4C | 2 | Red | 660 | 40 |
| Airinov MultiSpec 4C | 3 | Rededge | 735 | 10 |
| Airinov MultiSpec 4C | 4 | NIR | 790 | 40 |

2.1.1. Airinov MultiSpec 4C: Image preprocessing

The photographs taken by the Airinov MultiSpec 4C were preprocessed to remove two essential effects: dark current and lens vignetting (Kelcey and Lucieer, 2012). The dark current refers to image noise caused by thermal stimulation of electrons and can be compensated for by subtracting a dark frame from the original photograph. The value of the dark frame is stored in the photo’s metadata. The lens vignetting phenomenon occurs when the brightness of a photo radially reduces towards the image edges (Goldman, 2010). The vignetting effect can be corrected using a vignetting model or a white reference panel. Since the vignetting model is not available for our version of the MultiSpec 4C camera, we used a white Spectralon control panel to correct for vignetting, as well as sensor non-uniformities and angular deviation from nadir (Clemens, 2012; Neale and Crowther, 1994). First, the mean of image values was calculated as the Normalized Brightness Value (NBVref) for each of the green, red, red-edge, and NIR bands of the Spectralon panel photo. We then derived a Correction Coefficient (CCref) for each band of the Spectralon panel photo by dividing the NBVref by the Brightness Value (BVref) for each pixel of the band:

$$CC(x, y)_{ref} = \frac{NBV_{ref}}{BV(x, y)_{ref}} \tag{1}$$

where (x, y) is the pixel location. Finally, the CC value was applied to other photos taken by the Airinov MultiSpec 4C within the same mission to calculate the Calibrated Brightness Value (CBV):

$$CBV(x, y) = CC(x, y) \times BV(x, y) \tag{2}$$

2.1.2. MicaSense RedEdge: Radiometric conversion

The image preprocessing of the MicaSense RedEdge sensor used the radiometric conversion formula provided by MicaSense (https://support.micasense.com/hc/en-us/articles/115000351194-RedEdge-Camera-Radiometric-Calibration-Model), which converts DN values to absolute spectral radiance values (W·sr⁻¹·m⁻²·nm⁻¹). The formula compensates for sensor black-level, sensitivity, gain and exposure settings, and lens vignette effects:

$$L = V(x, y) \times \frac{a_1}{g} \times \frac{p - p_{BL}}{t_e + a_2 y - a_3 t_e y} \tag{3}$$
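The two preprocessing paths of Sections 2.1.1 and 2.1.2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the processing code used in this study: the function names are hypothetical, the dark level is assumed to be a single value subtracted before the panel-based correction, y in Eq. (3) is assumed to be the pixel row index, and the vignette term V(x, y) is assumed to have been evaluated beforehand.

```python
import numpy as np


def multispec_flat_field(panel_bv, image_bv, dark_level=0.0):
    """Eqs. (1)-(2): per-pixel correction derived from a white panel photo."""
    # Dark-current removal (assumed here to be a single scalar per photo).
    panel = np.asarray(panel_bv, dtype=float) - dark_level
    image = np.asarray(image_bv, dtype=float) - dark_level
    nbv_ref = panel.mean()      # NBV_ref: mean brightness of the panel photo
    cc = nbv_ref / panel        # Eq. (1): CC(x, y) = NBV_ref / BV(x, y)_ref
    return cc * image           # Eq. (2): CBV(x, y) = CC(x, y) * BV(x, y)


def rededge_radiance(p, p_bl, gain, exposure, a1, a2, a3, vignette=1.0):
    """Eq. (3): normalized DN to spectral radiance for one band."""
    p = np.asarray(p, dtype=float)
    # y: pixel row index, broadcast across the columns of the image.
    y = np.arange(p.shape[0], dtype=float)[:, None]
    denominator = exposure + a2 * y - a3 * exposure * y
    return vignette * (a1 / gain) * (p - p_bl) / denominator


# Hypothetical 2 x 2 panel photo with brightness falling off unevenly.
panel = np.array([[80.0, 100.0], [100.0, 120.0]])
flattened = multispec_flat_field(panel, panel)

# Hypothetical RedEdge inputs with the row-dependent terms switched off.
radiance = rededge_radiance(np.full((2, 2), 0.5), p_bl=0.1, gain=1.0,
                            exposure=0.001, a1=1e-4, a2=0.0, a3=0.0)
```

Applying the correction to the panel photo itself returns a uniform image at the panel mean, which is a quick sanity check that the coefficient removes the spatial non-uniformity it was derived from.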

In Eq. (3), L is the spectral radiance; V(x, y) is the vignette polynomial function at pixel (x, y); a1, a2, and a3 are the radiometric calibration coefficients; g is the sensor gain setting; p is the normalized DN value; pBL is the black level offset; and te is the image exposure time. All the parameters required for the calculation of the spectral radiance can be found in the photo’s metadata.

Table 2. The UAS platforms and camera models used in this study and aerial image collection parameters for the field calibration experiment of the UAS multispectral sensors. AGL is short for above ground level and GSD is short for ground sample distance.

| | RGB System | Multispectral System 1 | Multispectral System 2 |
|---|---|---|---|
| UAS Platform | senseFly eBee fixed-wing | senseFly eBee fixed-wing | Draganfly X4-P quadcopter |
| Camera Model | Canon ELPH 110 | MultiSpec 4C | MicaSense RedEdge |
| Image sensor(s) | 16 Mpixel | 4-band multispectral (green, red, red-edge, NIR), 4 × 1.2 Mpixel | 5-band multispectral (blue, green, red, red-edge, NIR), 5 × 1.2 Mpixel |
| Altitude | 90 m AGL | 90 m AGL | 120 m AGL |
| Forward Overlap | 65–70% | 85% | 85% |
| Lateral Overlap | 65–70% | 75% | 75% |
| Image resolution (GSD) | 3 cm/pixel | 10 cm/pixel | 8 cm/pixel |

2.2. Laboratory calibration assessment

The first objective of the laboratory experiment was to evaluate the performance and reliability of the two UAS sensors against a full range spectrometer in carefully controlled lighting conditions (Fig. 1). Twenty-nine different reference materials were imaged using the two UAS sensors and an ASD FieldSpec 3 Pro Full Range portable spectrometer (ASD Inc., Boulder, Colorado, USA), and the results were compared using a correlation analysis. The second objective of the laboratory experiment was to select a smaller number of reference materials with known spectral properties to test the reliability of the UAS sensors in a field setting. The calibration target materials we tested in this study were selected because: (1) the materials cover a range of spectral reflectance, thereby allowing us to evaluate sensor performance at low and high brightness in all wavelengths; (2) the materials have the potential for use as field calibration targets because they are Lambertian or near-Lambertian (Pozo et al., 2014); and (3) the materials are durable, commercially available, cost-efficient, and easy to transport (Wang and Myint, 2015). A list of the materials we tested, with their ASD spectra (400–1000 nm), is provided in Table 3.

Fig. 1. The workflow of radiometric calibration in the laboratory: (a) ASD spectra acquisition; (b) the derivation of RedEdge radiance; and (c) the calculation of MultiSpec 4C calibrated DN.

2.2.1. Laboratory procedure

We sampled the reflectance of all calibration target materials using an ASD FieldSpec 3 Pro Full Range portable spectrometer in an environment completely isolated from external light (Fig. 2). The ASD measures spot reflectance of light through a foreoptic with a conical, 25° wide Field of View (FOV) in 1214 discrete bands in the visible and shortwave infrared wavelengths (350–2500 nm). To control the illumination of target materials, we used a pair of 50-Watt broad-spectrum halogen DC lab lamps, supplied by ASD Inc., at a zenith angle of 45° (Fig. 2a). The fluorescent room lighting in the laboratory was turned off, and all doors to the laboratory were closed to restrict the influence of inconsistent illumination. The lens of the ASD was placed 4.5 cm from the target, which translates to a circular footprint on the target with a diameter of 0.02 m (Fig. 2b). For each calibration reference material, three measurements were taken with the ASD, and their locations on the target material were recorded. Each calibration target material was also imaged using the RedEdge sensor onboard a Draganfly X4-P quadcopter (Fig. 2c) and the MultiSpec 4C sensor onboard a senseFly eBee fixed-wing (Fig. 2d). The two cameras were supported by racks at a height of 0.24 m. Illumination conditions were consistent with the ASD imaging.

To analyze the performance of the sensors and materials, we first resampled the ASD spectra of the twenty-nine materials to match the wavelength and bandwidth (Full-Width Half Maximum or FWHM) profile of the RedEdge sensor and MultiSpec 4C sensor, respectively (see Table 1). Each photograph captured by the RedEdge and MultiSpec 4C sensors was then preprocessed using the methods presented in Sections 2.1.1 and 2.1.2. It should be noted that pixel values after preprocessing were expressed as radiance (W·sr⁻¹·m⁻²·nm⁻¹) for the RedEdge photographs and as DNs for the MultiSpec 4C photographs. After preprocessing, three spectra were obtained from each photograph at the locations where the ASD measurements were taken. Finally, for each calibration reference material, we averaged the spectra samples for both the ASD and the two UAS sensors. A correlation analysis between the mean spectra of the UAS sensors and that of the ASD was conducted, and a Pearson correlation coefficient and coefficient of determination (R²) were calculated. The Pearson correlation coefficient and R² are the most commonly explored and best understood indicators to measure the linear (including 1:1) correlation between two variables (here UAS measurements and the ASD measurements) (Sarma, 2010), and have been previously used in similar work (e.g., Smith and Milton, 1999; Kelcey and Lucieer, 2012; Wang and Myint, 2015).
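The band resampling and correlation analysis of Section 2.2.1 can be sketched as follows. The Gaussian spectral response built from the center wavelength and FWHM in Table 1 is an assumption made for illustration (the actual filter response curves may differ), the function names are hypothetical, and the sample values are invented.

```python
import numpy as np


def resample_to_band(wavelengths, spectrum, center, fwhm):
    """Weighted mean of a spectrum under an assumed Gaussian band response."""
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))  # FWHM to std. deviation
    weights = np.exp(-0.5 * ((wavelengths - center) / sigma) ** 2)
    return float(np.sum(weights * spectrum) / np.sum(weights))


def pearson_r_and_r2(uas_values, asd_values):
    """Pearson r and R^2 (equal to r squared for a simple linear fit)."""
    r = float(np.corrcoef(uas_values, asd_values)[0, 1])
    return r, r * r


# Hypothetical flat 50% reflectance spectrum sampled at 1 nm, 400-1000 nm.
wavelengths = np.arange(400.0, 1001.0)
flat_spectrum = np.full(wavelengths.shape, 0.5)
nir_value = resample_to_band(wavelengths, flat_spectrum, center=840.0, fwhm=40.0)

# Hypothetical per-material band means: UAS values linearly related to ASD.
asd_means = np.array([0.05, 0.20, 0.50, 0.90])
uas_means = 2.0 * asd_means + 0.01
r, r2 = pearson_r_and_r2(uas_means, asd_means)
```

A high r only indicates a linear relationship. It does not by itself show agreement along the 1:1 line, so the slope and offset of the fit are worth inspecting alongside it.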

Fig. 2. Laboratory setups of the ASD (a), the RedEdge sensor (c), and the MultiSpec 4C sensor (d), and a schematic of the footprint of the ASD (b).

Table 3. Calibration target materials used in this study and their ASD spectra (only wavelengths from 400 nm to 1000 nm are shown). The materials identified with an asterisk are the two materials that were selected as Ground Reference Panels (GRPs) in the field calibration assessments. Material sample images can be found in Fig. 1s. (The ASD spectra plot column is not reproduced here.)

| Group | Material ID | Material description |
|---|---|---|
| Reference panels | RP1 | Spectralon Reference Panel – White |
| Reference panels | RP2 | Teflon Panel – White |
| Reference panels | RP3 | MicaSense CRP |
| Reference panels | RP4 | Airinov/SenseFly CRP |
| Reference panels | RP5* | Linoleum Panel – Dark Grey |
| Reference panels | RP6 | Linoleum Panel – Light Grey |
| Paper products | P1 | Foam-core Sign Board – Black |
| Paper products | P2 | Foam-core Sign Board – White |
| Paper products | P3 | Poster Board – Black |
| Synthetic/polymer materials | S1 | Vinyl Siding – Dark Grey |
| Synthetic/polymer materials | S2 | Vinyl Siding – Medium Grey |
| Synthetic/polymer materials | S3 | Vinyl Siding – Light Grey |
| Synthetic/polymer materials | S4 | Vinyl Siding – White Grey |
| Synthetic/polymer materials | S5 | PVC Tarp – Grey |
| Synthetic/polymer materials | S6 | PVC Tarp – White |
| Synthetic/polymer materials | S7 | Vinyl Tablecloth – Grey |
| Synthetic/polymer materials | S8 | Vinyl Tablecloth – Black (glossy) |
| Synthetic/polymer materials | S9 | Vinyl Tablecloth – Tan |
| Common fabrics | F1 | Nylon Fabric – Black |
| Common fabrics | F2 | Cotton Canvas – Black |
| Common fabrics | F3* | Fine-Weave Cotton Fabric – Black |
| Common fabrics | F4 | Cotton Denim – Black |
| Common fabrics | F5 | Suiting Fabric – Black |
| Common fabrics | F6 | Knit Suiting Fabric – Black |
| Common fabrics | F7 | Bonded Denim – Black |
| Common fabrics | F8 | White flannel back of Vinyl Tablecloth – Grey |
| Common fabrics | F9 | White flannel back of Vinyl Tablecloth – Tan |
| Common fabrics | F10 | White flannel back of Vinyl Tablecloth – Black |
| Common fabrics | F11 | White flannel back of Vinyl Tablecloth – Patterned |

2.3. Field calibration assessments

In order to test the reliability of the UAS sensors in a field setting with natural illumination conditions, we conducted three separate field calibration assessments. Field calibration assessment No. 1 included image capture of the Ground Reference Panels (GRPs) that were identified from our lab calibration experiment as being the most suitable for use as reference panels in the field. Field calibration assessment No. 2 included image capture of ground validation plots where NDVI values were also measured using a Trimble GreenSeeker handheld crop sensor (model HCS-100; Trimble Inc., Sunnyvale, California, USA; http://www.trimble.com/). Field calibration assessment No. 3 included images that were taken over the course of two months at a variety of different field sites.

Our primary field validation site was located in north-central Alberta, Canada, approximately 130 km north of the City of Edmonton and 10 km west of the Hamlet of Dapp (hereafter referred to as the “Dapp site”) (Fig. 3). The 190 ha field was planted to Spring Wheat in 2017, and we selected a rectangular 10 ha survey plot near the south-central portion of the field as our study area. We visited the site twice during different stages of crop development, beginning shortly after plant emergence (Table 4). Aerial surveys were conducted within two hours of local noon to keep the solar illumination angles consistent. Weather conditions were favorable for flying on both collection dates, with mostly clear and sunny conditions, though scattered clouds developed during the second survey. During each survey a Hobo U30 data logger was used to record incoming solar illumination with a photosynthetically active radiation (PAR) sensor (HOBO S-LIA-M003), along with air temperature and relative humidity (S-THB-M002) at 1 Hz (1 sample per second).
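Field calibration assessment No. 2 compares NDVI values rather than band reflectance. As a reminder of why a normalized band ratio can be less sensitive to calibration error, the sketch below computes NDVI with its standard definition, (NIR − Red) / (NIR + Red), and applies the same hypothetical multiplicative bias to both bands. The reflectance values and the 20% bias are invented, and a bias that is identical in both bands is an idealization.

```python
import numpy as np


def ndvi(nir, red):
    """Normalized Difference Vegetation Index; NaN where NIR + Red is zero."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    total = nir + red
    out = np.full(total.shape, np.nan)
    np.divide(nir - red, total, out=out, where=total != 0)
    return out


# Hypothetical plot reflectance in the NIR and red bands.
nir = np.array([0.50, 0.30])
red = np.array([0.10, 0.15])

ndvi_true = ndvi(nir, red)
# The same 20% over-estimation in both bands cancels in the ratio.
ndvi_biased = ndvi(1.2 * nir, 1.2 * red)
```

An additive offset, or a bias that differs between the red and NIR bands, would not cancel in this way.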

2.3.1. Field procedure

Prior to initiating the UAS aerial surveys, nine high-visibility Ground Control Point (GCP) targets were laid out in the field in a grid that roughly defined the boundaries of the 10 ha plots (Fig. 3a and b). The center of each target was surveyed with a Trimble AgGPS 332 (Trimble Inc., Sunnyvale, California, USA; 12 channel Differential Global Positioning System (DGPS) with Wide Area Augmentation System (WAAS) and OmniStar XP satellite-based differential signal corrections). The surveyed positions were exported as UTM coordinates directly from ‘Field Worker GPS 4.5’ software running on the field collection device. Horizontal position measurement accuracy was nominally ± 0.20 cm. The GCP targets remained in the field during the UAS surveys so that they would be visible in the imagery.

In addition, between five and seven 1 m × 1 m wooden frames were placed on the ground at various locations with a high diversity of vegetation cover prior to each survey. As with the GCPs, these frames remained in place for the duration of the aerial survey and were thus visible in the imagery. NDVI of the ground-cover was measured at the centre of each frame using a Trimble GreenSeeker sensor. The GreenSeeker uses a pair of laser diodes to emit a brief pulse of Red (660 nm) and NIR (780 nm) light, and then measures the light in each wavelength that is reflected by the ground-cover beneath the instrument. The reflected light values are used to calculate the NDVI reading of that spot, which is displayed on an LCD readout. We collected three readings at each plot, manually tabulated the values, and calculated the average. The instrument’s FOV on the ground is oval with a width ranging between 25 cm and 50 cm at the suggested operating heights of 60 cm to 120 cm above the measurement surface. We operated the instrument at approximately 75 cm above the ground- or vegetation-cover (waist-height of the operator) to ensure measurement of a consistent area.

The Linoleum Panel – Dark Grey (RP5) and Fine-Weave Cotton Fabric – Black (F3) were selected as GRP materials to assess the radiometric consistency of the multispectral imagery collected under natural solar illumination conditions. Further discussion of why we selected these materials is provided in Sections 3.1, 4.1.2 and 4.1.3. We placed these GRPs on the ground within the boundaries of the aerial survey scenes and recorded their geographic locations (Fig. 3). The GRPs remained on the ground for the duration of the aerial survey and were visible in the resulting imagery.

2.3.2. Aerial image collection and processing

For each aerial image collection, the RGB System and Multispectral System 1 and 2 (Table 2) were used to sequentially survey the 10 ha field site, with effort taken to minimize the downtime between the two


Fig. 3. Field site near Dapp, AB, Canada, showing the locations of our ground NDVI plots, ground reference panels, and Ground Control Points (GCPs) in field collection 1 (a) and 2 (b). Base maps for (a) and (b) are RGB orthomosaics.

multispectral surveys. Immediately prior to the multispectral surveys, we followed instructions provided by each sensor manufacturer for collecting photos of the supplied CRP. We used Pix4D MapperPro software version 3.3.13 (Pix4D SA, Lausanne, Switzerland) to produce 2D orthorectified reflectance images from each set of multispectral photographs (RedEdge GSD = 8 cm/pixel; MultiSpec 4C GSD = 10 cm/pixel) and 2D orthomosaic images from the RGB photographs (GSD = 3 cm/pixel). Following the Pix4D radiometric processing and calibration workflow, we used the corresponding CRP photos during processing of each set of multispectral photographs. Pix4D MapperPro (hereafter referred to as “Pix4D”) was selected because it is one of the most popular and widely used structure-from-motion software packages, and it includes a radiometric correction function. AgiSoft Photoscan is another widely used software package for this purpose, but at the time this work was conducted, Photoscan did not include a radiometric correction function. The RGB orthomosaic was georeferenced based on the GCP targets. The GCP targets were first marked in the imagery and the precise survey coordinates were registered to provide horizontal and vertical geometric control for the final UAS imagery products. We co-registered

the three image datasets (from Canon ELPH 110, MultiSpec 4C, and MicaSense RedEdge) from each field collection in ENVI Version 5.4 (Exelis Visual Information Solutions, Boulder, Colorado) using the image-to-image registration workflow. The RGB orthomosaic was used as the base image and the reflectance images from both multispectral sensors were registered to the base image using a nearest-neighbor warp process. The native resolution of the multispectral images was preserved during the image-to-image registration. The image registration achieved sub-pixel accuracies for all multispectral images.

2.3.3. Extracting reflectance values from individual photos and mosaic imagery

In order to extract the reflectance values for each GRP in the individual photos taken with the RedEdge sensor, the UTM coordinate collected in the field for each GRP was imported using Pix4D and registered as a new GCP. Using the GCP/MTP Editor tool, the center of each GRP was marked in each photo where the target could be reliably identified, and a list of image filenames and associated image-space coordinates of the target center was exported. All the photos containing GRP targets were converted from DN values to absolute radiance (Eq.

Table 4 Field data collection schedule and general weather conditions during the aerial surveys.

| Survey | Plant development stage | Date & time of survey | Sky conditions | Wind | Temperature (°C) |
| --- | --- | --- | --- | --- | --- |
| 1 | Emergence | 7 June 2017 12:00–13:30 GMT-6 | Sunny & clear, > 3000 m ceiling | 6–8 m/s, gusty | 22–24 |
| 2 | Early development | 29 June 2017 10:50–12:15 GMT-6 | Sunny, scattered clouds, > 3000 m ceiling | 1–3 m/s | 21 |


(3)) and then to surface reflectance by calculating the ratio of radiance to DLS irradiance. For each GRP, the mean reflectance value (in each band) of the 3 × 3 pixel neighborhood around the plot center was extracted. For field calibration No. 2, reflectance values from the center of each NDVI plot were extracted, and an NDVI value was calculated and compared to the corresponding value obtained in the field using the GreenSeeker. To do this, the center of each NDVI plot and GRP was identified within the RGB orthomosaic imagery and marked with a point in ArcGIS 10.5 (Environmental Systems Research Institute, Redlands, California). These points were buffered to make 0.10 m diameter circular extraction points that fell within the ∼0.30 m diameter GreenSeeker measurement locations on the ground (or within the bounds of the panel, in the case of GRPs). In ENVI, the multispectral reflectance mosaics were loaded and the red and NIR bands were used to calculate an NDVI raster from both the MultiSpec 4C and RedEdge datasets. The buffered extraction points were loaded into ENVI and converted to Regions Of Interest (ROI). Each ROI group was manually inspected to remove any pixel that was too close to the NDVI plot frame or the GRP edge to avoid potential spectral signature mixing. Finally, the mean value within each of the NDVI-plot ROIs from both the MultiSpec 4C-NDVI and RedEdge-NDVI raster layers were extracted, along with the mean reflectance values from each band of the RedEdge mosaics for the GRP-ROIs.
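The extraction steps described above (a 3 × 3 pixel neighbourhood mean, an NDVI raster from the red and NIR bands, and a mean over the pixels retained in each ROI) can be sketched as follows. This is a minimal illustration under stated assumptions, not the ENVI, ArcGIS or Pix4D workflow used in the study: the function names `ndvi`, `window_mean` and `roi_mean` are hypothetical, rasters are assumed to be plain NumPy arrays of reflectance, and each ROI is assumed to be a boolean mask.

```python
import numpy as np


def ndvi(red, nir):
    """NDVI = (NIR - Red) / (NIR + Red), computed per pixel.

    Pixels where NIR + Red == 0 are returned as NaN.
    """
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)
    total = nir + red
    out = np.full(total.shape, np.nan)
    np.divide(nir - red, total, out=out, where=total != 0)
    return out


def window_mean(band, row, col, half=1):
    """Mean of the (2*half+1) x (2*half+1) window centred on (row, col).

    half=1 gives the 3 x 3 neighbourhood; the window is clipped at the
    image edges.
    """
    band = np.asarray(band, dtype=float)
    r0, r1 = max(row - half, 0), min(row + half + 1, band.shape[0])
    c0, c1 = max(col - half, 0), min(col + half + 1, band.shape[1])
    return float(np.nanmean(band[r0:r1, c0:c1]))


def roi_mean(raster, mask):
    """Mean of the raster over the pixels flagged True in a boolean ROI mask."""
    raster = np.asarray(raster, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    return float(np.nanmean(raster[mask]))
```

Dropping mixed pixels near a frame or panel edge, as done by manual inspection in the text, corresponds here to setting those pixels to False in the mask before calling `roi_mean`.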

2.3.4. Field calibration assessment No. 1

To evaluate the accuracy and consistency of field reflectance measurements from the RedEdge and MultiSpec 4C sensors, we performed a two-point calibration test using the GRP materials selected from the laboratory trials: RP5 and F3 (Fig. 4 – Field calibration assessment 1). In the two-point calibration, the reflectance of each GRP measured by the UAS sensors in the field was compared to the reflectance measured in the laboratory with the ASD. We first used the process described in Section 2.3.3 to extract the GRP reflectance values from individual photos captured using the RedEdge sensor. Given that the field site was surveyed using 85% forward and 75% lateral overlap, the GRPs were visible in a large number of individual photos, allowing for the collection of multiple reflectance values (14–16) in each RedEdge band. We calculated the mean reflectance and standard deviation for each GRP in each RedEdge band, and compared these values to the ASD-derived reflectance values. We produced a scatterplot using all RedEdge measurements from both field collections, and performed a correlation analysis to calculate the Pearson correlation coefficient and coefficient of determination (R²) between field and lab reflectance measurements. It should be noted that the performance of the MultiSpec 4C sensor was not assessed in this way because we lacked the full solution to convert raw DN to reflectance for this sensor. Finally, reflectance values for each GRP were extracted from the RedEdge and MultiSpec 4C mosaics, and these values were compared to the ASD laboratory reflectance using a correlation analysis. Since only one reflectance value can be extracted from each band of the RedEdge or the MultiSpec 4C mosaics, the standard deviation of the reflectance value was not calculated.

2.3.5. Field calibration assessment No. 2

As an alternative evaluation of the performance of the RedEdge and MultiSpec 4C sensors in a typical-use scenario, field calibration assessment No. 2 involved a multi-point comparison between NDVI values measured by each UAS sensor (as calculated from reflectance mosaics produced by Pix4D) and NDVI values measured using the Trimble GreenSeeker sensor (Fig. 4 – Field calibration assessment 2). The extraction of NDVI values from images obtained using the RedEdge and MultiSpec 4C was described in Section 2.3.3. A correlation analysis was conducted to calculate the Pearson correlation coefficient and coefficient of determination (R²) to compare the results between each UAS sensor and the GreenSeeker sensor.

2.3.6. Field calibration assessment No. 3

Our third field calibration assessment is a simplified version of the process described in Section 2.3.4, in which we conduct a more extensive and longer-term evaluation of the accuracy and consistency of reflectance measurements made by the RedEdge sensor. Following our first field data collection at the Dapp site, we conducted surveys using the RedEdge sensor throughout June and July 2017, at seventeen additional agricultural field sites located throughout central Alberta. For each of these surveys, we deployed the same GRPs (F3 and RP5) that were used at the Dapp site for field calibration assessments 1 and 2, resulting in a more extensive dataset that could be used to evaluate sensor performance through time. In this assessment, we omitted the process of extracting reflectance values from individual photos, and instead focused on

Fig. 4. Field calibration assessment workflow diagram. Field calibration assessment 1 included a comparison of field-derived (MultiSpec 4C and RedEdge) and laboratory-derived (ASD) reflectance values for each of the selected Ground Reference Panels (GRPs). Field calibration assessment 2 included a comparison of NDVI values derived in the field using a handheld Trimble GreenSeeker, to NDVI values derived from imagery captured with the MultiSpec 4C and the RedEdge sensors.


Fig. 5. Scatterplots and linear fit lines between the ASD reflectance and MicaSense RedEdge radiance in different bands using twenty-nine laboratory materials.

Fig. 6. Scatterplots and linear fit lines between the ASD reflectance and MultiSpec 4C calibrated DN values in different bands using twenty-nine laboratory materials.

extracting reflectance values for the GRPs from Pix4D mosaic images only. We then compared the extracted mosaic reflectance values to the ASD-derived reference reflectance values to evaluate sensor performance through time and under different field conditions.

3. Results

3.1. Laboratory calibration assessment

The relationships between radiance of the twenty-nine reference materials imaged by the MicaSense RedEdge sensor and corresponding ASD reflectance for all five image bands (blue, green, red, red-edge, and NIR) are shown in Fig. 5. The relationships are very well represented by linear regressions with all R² values ≥ 0.96. Points become slightly more dispersed in the higher-reflectance range (bright targets) than in the lower-reflectance range (darker targets). The relationships are stronger in the visible bands (R² values for blue = 0.99, green = 0.99, and red = 0.98) and slightly weaker at longer wavelength bands (R² values for red-edge = 0.98 and NIR = 0.96). The same comparison performed with the Airinov MultiSpec 4C measurements shows similar trends (Fig. 6); however, the overall performance of this sensor is lower than that of the RedEdge sensor. While the correspondence of MultiSpec 4C to ASD measurement points is still well represented by a linear regression, the R² values for the green (R² = 0.92), red (R² = 0.93), red-edge (R² = 0.82), and NIR (R² = 0.73) bands are much lower than those of the RedEdge sensor. This indicates low measurement precision, especially at higher reflectance values in the NIR band. The RP5 (Linoleum Panel – Dark Grey) and F3 (Fine-Weave Cotton Fabric – Black) materials were identified as suitable GRPs for our field calibration assessments because both of these materials had low and stable reflectance in all spectral bands when imaged by the MicaSense RedEdge and the MultiSpec 4C sensors (Table 3; Fig. 5; Fig. 6). We rejected other potential materials that exhibited relatively stable reflectance profiles (i.e., RP1, RP2, RP3, RP6, P1, P3, S5 and S8, in Table 3), primarily due to their non-Lambertian properties, and other details discussed below in Sections 4.1.2 and 4.1.3.

3.2. Field assessment results

3.2.1. Field calibration assessment 1

Results from the first field calibration assessment illustrate how accurately each UAS sensor (or the combination of the sensor and image post-processing method) measures reflectance under typical field conditions. Each sub-plot (a through c) in Fig. 7 compares the GRP reflectance values that were measured in the laboratory, to the GRP reflectance values measured by each UAS sensor and extracted from aerial imagery. In each sub-plot there are two clusters of points: a cluster of square points representing the reflectance of GRP F3 (the black fabric panel) and a cluster of circular points representing the reflectance of GRP RP5 (the grey linoleum panel). The spread of points along the x-axis is the same in each sub-plot because these are the reflectance values measured by the ASD, and both targets demonstrated very consistent reflectance across the full VIS-NIR spectrum when measured in the lab. In contrast, the spread of points along the y-axis illustrates the level of variation associated with the reflectance values measured and extracted from the images collected by each UAS sensor. The results of the lab calibration assessment (Figs. 5 and 6) demonstrate that both of the UAS sensors (and in particular, the RedEdge sensor) are well calibrated and can produce accurate reflectance measurements across a wide range of values in a laboratory setting. Given these results, we expected that these same sensors would be capable of reliably reproducing these measurements in a field setting, in which case the point-clusters for each GRP should conform to the laboratory measurements. If that were the case, we would see tight clusters of points with a matching spread of values on the x-axis and y-axis, and the clusters would plot along the 1:1 line shown for reference (Fig. 7). However, we see a number of deviations from this expected result.

The first deviation is that in almost all cases, the image-derived reflectance values are higher than the lab-standard reflectance measurements. This indicates that our field measurements have a degree of inaccuracy and are generally over-estimating reflectance. Second, our image-derived reflectance values demonstrate a considerable range of variation between field collection 1 and collection 2. This is evident by comparing any pair of points in the same image band (color). Ideally,


Fig. 7. Comparison of the reflectance values of GRPs (F3 and RP5) extracted from field imagery, against the reflectance measured in the laboratory; the dashed diagonal line in all plots shows the theoretical ideal 1:1 correspondence. For each scatterplot, the laboratory-derived reflectance is compared to: (a) reflectance from individual RedEdge photos; (b) reflectance from RedEdge Pix4D mosaics; and (c) reflectance from MultiSpec 4C Pix4D mosaics. Reflectance values from all spectral bands of either RedEdge (five bands) or MultiSpec 4C (four bands) in both field collections were used to estimate the relationships and fit lines.

the pairs of points in each cluster should overlap (e.g., both green circles should overlap, both red squares should overlap, etc.). Discrepancies are noticeably worse for RP5 than F3. Finally, it appears that the reflectance values produced by our manual radiometric conversion method (Fig. 7a) correspond more closely to the laboratory measurements than the reflectance values extracted from the Pix4D mosaic images for either the RedEdge (Fig. 7b) or the MultiSpec 4C (Fig. 7c). This suggests that the radiometric conversion being performed by Pix4D is less precise than our manual method.

3.2.2. Field calibration assessment 2

The results from our multi-point comparison of NDVI values indicate that the NDVI values derived from the Pix4D mosaic reflectance images of both the RedEdge (Fig. 8a) and the MultiSpec 4C (Fig. 8b) had very strong linear relationships (R² > 0.99) with the GreenSeeker NDVI measurements. When the NDVI values derived from the two UAS sensors were compared with each other (Fig. 8c), there was also a very high correlation (R² > 0.99) between the values. These results were consistent between collection 1 and collection 2.

Fig. 8. Comparison of NDVI values derived using the Trimble GreenSeeker sensor to: (a) NDVI values derived from the RedEdge mosaic and (b) NDVI values from the MultiSpec 4C mosaic. Plot (c) also shows a comparison of NDVI values derived from the RedEdge and MultiSpec 4C mosaics. Values derived from images captured during field collection 1 are shown in orange, while values derived from images collected during field collection 2 are shown in green. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
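The correlation statistics reported for the NDVI comparison (Pearson correlation coefficient and R²) could be computed along the following lines. This is a minimal sketch rather than the analysis code used in the study: the function names are hypothetical, and the sample NDVI values in the usage lines are invented purely for illustration. For a simple linear regression with one predictor, R² equals the square of the Pearson coefficient, which is what `r_squared` relies on.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape or x.size < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    dx = x - x.mean()
    dy = y - y.mean()
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))


def r_squared(x, y):
    """Coefficient of determination of a one-predictor linear fit (r squared)."""
    return pearson_r(x, y) ** 2


# Hypothetical plot-level NDVI values, for illustration only.
greenseeker_ndvi = [0.20, 0.35, 0.50, 0.65, 0.80]
uas_ndvi = [0.22, 0.36, 0.52, 0.66, 0.81]
print(pearson_r(greenseeker_ndvi, uas_ndvi), r_squared(greenseeker_ndvi, uas_ndvi))
```

A high R² only shows that the two NDVI series vary together linearly; it does not by itself show that they agree in absolute value, so a plot against the 1:1 line remains useful.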

3.2.3. Field calibration assessment 3

In order to evaluate the performance of the RedEdge sensor through time and under a range of different field conditions, we compared reflectance values from mosaic images that were obtained over a period of two months at seventeen different field sites. With the exception of values extracted for the blue band for the RP5 reference panel, average reflectance values extracted from the mosaics were consistently higher than the laboratory-derived reflectance values for both GRPs (Fig. 9 and Table 5). The comparison of field reflectance values through time also shows considerable variability in values across the seventeen flights, and this variability is generally synchronized. For example, the highest reflectance value for reference panel F3 was recorded during flight 7, and this peak in reflectance was consistent across all five bands. Notably, there is a larger mean offset and a larger standard deviation for the red-edge and NIR bands, as compared to the visible bands for both reference panels.
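The mean offsets referred to above can be reproduced directly from the values in Table 5 by subtracting the laboratory reflectance from the average field reflectance in each band. The short sketch below does this arithmetic; the variable and function names are hypothetical, and only the numbers are taken from Table 5.

```python
# Values transcribed from Table 5 (band order: NIR, Red-edge, Red, Green, Blue).
bands = ["NIR", "Red-edge", "Red", "Green", "Blue"]
lab = {
    "F3": [0.023, 0.020, 0.019, 0.018, 0.018],
    "RP5": [0.064, 0.061, 0.063, 0.069, 0.075],
}
field_mean = {
    "F3": [0.141, 0.074, 0.039, 0.043, 0.031],
    "RP5": [0.215, 0.131, 0.081, 0.093, 0.066],
}


def mean_offsets(lab_vals, field_vals):
    """Field-minus-lab reflectance per band, rounded to three decimals."""
    return [round(f - l, 3) for l, f in zip(lab_vals, field_vals)]


# offsets[panel][band] > 0 means the mosaic over-estimates reflectance.
offsets = {
    panel: dict(zip(bands, mean_offsets(lab[panel], field_mean[panel])))
    for panel in lab
}

for panel, per_band in offsets.items():
    print(panel, per_band)
```

The only negative offset is the blue band of RP5, and the largest offsets for both panels fall in the NIR and red-edge bands, in line with the description above.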

4. Discussion

4.1. Laboratory calibration assessment

4.1.1. UAS multispectral sensor performance

Our laboratory results indicate that the RedEdge sensor performs with high accuracy and precision in all bands (blue, green, red, red-edge, and NIR) under strictly controlled conditions. One of the key


the two bands used to calculate NDVI and several other important vegetation indices, and is one of the primary reasons that multispectral cameras are purchased for instrumenting on UAS. Unfortunately, we cannot make a definitive assessment of the performance of the Airinov MultiSpec 4C sensor because the manufacturer of the sensor does not provide enough information about the function of the instrument or the contents of the image metadata to allow us to convert raw pixel values to calibrated radiance values. For the same reason, we were unable to assess the field performance of the MultiSpec 4C sensor through its individual photos in our field calibration assessment in the same manner as the MicaSense RedEdge. Our inability to independently evaluate the performance of the MultiSpec sensor underscores the critical importance of industry transparency in this quickly evolving and rapidly growing technological space. UAS sensors are increasingly being used in a wide range of industries, and both the users and consumers of the data implicitly trust that the information is accurate and reliable. The results from our study, however, give us pause and highlight the critical need for transparency if trust is to be maintained with the user community. Indeed, some sensor manufacturers have acknowledged this need and have recently demonstrated a willingness to respond to consumer concerns and user community requests for information. For instance, MicaSense has recently made firmware changes to the RedEdge camera that made important calibration factors accessible and clearly readable in the image metadata, and further provided technical notes describing the process to calculate calibrated reflectance from raw imagery. Without this information, we would not have been able to perform the manual radiometric conversion process that we relied on for our assessment of the RedEdge camera. 
The evolution of UAS as reliable remote sensing technology depends on the continuation of such industry transparency, as a technically savvy user community will not trust these sensors as measurement devices if they remain black-box instruments.

Fig. 9. Comparison of laboratory-derived reflectance and field-derived reflectance for GRPs F3 and RP5. Imagery was acquired using the RedEdge sensor over a period of two months at seventeen different field sites (listed by Flight ID#). Reflectance values were extracted from mosaic images processed using Pix4D software.

4.1.2. Ground reference panel material selection

One of our goals was to identify a set of common, easily accessible materials with consistent reflectance properties that could be inexpensively acquired and used as routine field-calibration targets. We were motivated to do this for our own operational usage, with the end goal of sharing recommendations for appropriate reference panel materials with the UAS multispectral imaging community. Through this study, we found that identifying such materials was more difficult than anticipated, and we discuss the key challenges associated with selecting materials across reflectance levels (low: black; high: white; moderate: grey) in the VIS-NIR spectral range.

components of this strong performance was the consistent lighting conditions that were employed during this assessment. The surface reflectance of a target is calculated as a ratio of the surface radiance to incident irradiance, and if the incident irradiance remains constant, the radiance is a linear function of surface reflectance. Thus, the radiance values acquired during this laboratory assessment had, in theory, a linear relationship with the reflectance values sampled by the ASD sensor in all spectral bands. This explains the strong correlation (R² ≥ 0.96) between our MicaSense RedEdge measurements and the ASD measurements across the full range of target materials. In contrast, the MultiSpec 4C preprocessed DN values had a relatively low correspondence with the ASD measurements, especially in longer wavelength bands. We found these results disappointing, and in particular highlight the poor measurement precision observed in the NIR band at high reflectance values. The NIR band is critical because of its ability to provide measurements of the spectrum in which plant leaves are highly reflective due to the photon diffusing at the air-cell interfaces in the spongy mesophyll (Woolley, 1971). This band is one of

(a) Low Reflectance – Black Materials

Many ‘black’ materials do not have uniformly low reflectance throughout the entire VIS-NIR range. We tested eight different black fabrics made from a range of synthetic and natural fibres and found a surprising number of materials have relatively high reflectance at different parts of the VIS-NIR spectrum (Table 3). The best black reference

Table 5 The mean and standard deviation of reflectance values for GRPs F3 and RP5 as measured in the laboratory and in the field. Average field reflectance values were calculated from imagery acquired using the RedEdge sensor over a period of two months at seventeen different field sites. Reflectance values were extracted from mosaic images processed using Pix4D software.

| Image band | F3: Lab reflectance | F3: Average field reflectance | F3: Standard deviation of field values | RP5: Lab reflectance | RP5: Average field reflectance | RP5: Standard deviation of field values |
| --- | --- | --- | --- | --- | --- | --- |
| NIR | 0.023 | 0.141 | 0.051 | 0.064 | 0.215 | 0.067 |
| Red-edge | 0.020 | 0.074 | 0.026 | 0.061 | 0.131 | 0.039 |
| Red | 0.019 | 0.039 | 0.010 | 0.063 | 0.081 | 0.023 |
| Green | 0.018 | 0.043 | 0.012 | 0.069 | 0.093 | 0.023 |
| Blue | 0.018 | 0.031 | 0.006 | 0.075 | 0.066 | 0.016 |


material we found was a dense cotton fabric composed of fine woven fibres (see F3 in Table 3), but we cannot explain why it has such a good flat reflectance profile while other similar materials did not (e.g., black cotton canvas, black denim). We can only speculate that the cause is the dye used in the fabric. Paints commonly contain titanium and other metallics that produce absorption or reflection features at specific wavelengths and their formulas vary widely by manufacturer. This is the reason why we avoided testing paints in our trials. Foam-core Sign Board – Black and Black Poster Board (e.g., P1 and P3 in Table 3) are the next-best alternative black reference materials; however, unprotected paper is not a field-durable material, as it quickly degrades in wet or dirty conditions, and with frequent transport and handling. This lack of durability has cost implications for UAS operators who frequently deploy their multispectral sensors, making this material a less than desirable solution.

indicating that this panel was too small to provide reliable results. We suspect that the spectral signature of vegetation and shadows influenced the reflectance signature of pixels that we extracted from the panel, despite our efforts to avoid pixels on the edge of the panel. Panel size is also the reason we could not use the MicaSense CRP (see RP3 in Table 3) as a GRP. The CRP that comes with the MicaSense RedEdge camera is a material perfectly suited for use as a pre-flight calibration target because it has a moderate reflectance that is flat across the entire VIS-NIR range, and is highly Lambertian. These attributes also make the CRP an ideal material for use as a GRP, but unfortunately, the panel provided by MicaSense is too small (15 cm × 15 cm) to be effectively visible in imagery collected at typical survey altitudes. We have made inquiries, but at the time of publication we do not know what the CRP is made from, nor do we know where to source more of this material.

Our black reference panel was constructed by affixing the best-performing black fabric (F3) to a 1 m × 1 m panel of plywood. In our analysis, we only sampled the nine pixels closest to the panel center; however, this panel was large enough that we could exclude a buffer 2 pixels wide around the edges of the panel, and still sample ∼64 pixels (8 × 8 pixels) from the interior of the panel. The size of the point clusters in Fig. 7 shows that the range of reflectance values extracted from these sample pixels was much smaller than the range of values extracted from the center of the grey panel (RP5). Also, the agreement between the laboratory-derived reflectance and the UAS-derived values for F3 was closer than for the other reference panel we tested. This suggests that we were getting a more accurate reading of the reflectance of the F3 panel, and there was less spectral contamination from adjacent pixels.
The F3 black panel may be less susceptible to spectral contamination by shadows than the grey panel, but we suggest that size of the panel is the best way to reduce the potential effects of both shadows and tall vegetation on the performance of the reference panel. Size of the field reference panel should be scaled to the expected GSD of the imagery, with a minimum buffer of two or three pixels around the edges of the panel that are excluded from reflectance comparisons. For extremely high-resolution imagery (0.01–0.02 m GSD) users should consider increasing the size of this buffer.
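The sizing rule above can be checked with simple arithmetic: a panel's width in pixels is its physical width divided by the GSD, and the usable interior is what remains after removing the buffer on both sides. The sketch below is illustrative only; the function names are hypothetical, and the default 2-pixel buffer and the 0.08 m GSD are taken from the text.

```python
import math


def interior_width_px(panel_width_m, gsd_m, buffer_px=2):
    """Whole pixels left across a panel after excluding an edge buffer on each side."""
    if gsd_m <= 0:
        raise ValueError("gsd_m must be positive")
    usable = panel_width_m / gsd_m - 2 * buffer_px
    return max(0, math.floor(usable))


def min_panel_width_m(gsd_m, interior_px, buffer_px=2):
    """Smallest panel width that leaves interior_px usable pixels plus the buffers."""
    return (interior_px + 2 * buffer_px) * gsd_m


gsd = 0.08  # m/pixel, RedEdge at 120 m AGL as stated in the text
for name, width in [("F3", 1.00), ("RP5", 0.45), ("RP6", 0.20)]:
    n = interior_width_px(width, gsd)
    print(name, "interior:", n, "x", n, "pixels")
```

With these inputs the 1 m F3 panel keeps an 8 × 8 pixel interior, matching the ∼64 pixels quoted above, while the 0.45 m RP5 panel keeps a single pixel and the 0.20 m RP6 panel keeps none.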

(b) High Reflectance – White Materials

High quality white materials with flat reflectance profiles such as Spectralon (e.g., RP1 in Table 3) are expensive, difficult to acquire, and too easy to damage for use outside the lab environment. Teflon panels (e.g., RP2 in Table 3) are a reasonable alternative, though this material does not have a completely flat reflectance profile. Some of the vinyl/plastic materials we tried had a surface that is too shiny and produced specular reflection, making them unsuitable (e.g., S5 and S8 in Table 3). Further, we recommend against using bright white materials as field reference panels, as we found they tend to introduce new problems. For example, white objects either saturate the image sensors, or force the camera to adjust the shutter speed and sensor gain, resulting in correct exposure of the white object but under-exposure of the surrounding features (i.e., vegetation). Our laboratory calibration assessment results also show that the multispectral image sensors perform with less precision in the very high range of reflectance (> 0.8), and therefore, bright white targets are not ideal calibration reference targets.

(c) Moderate Reflectance – Grey Materials

Our recommendation for a field reference material is grey linoleum (see RP5 and RP6 in Table 3). This material is durable, inexpensive, nearly Lambertian, and has a reflectance profile that is reasonably flat across the VIS-NIR range, similar to the MicaSense CRP. The specific material we used is frequently referred to as “Lino block” and is used for relief cut printmaking. This material is readily available from art and craft stores, or online retailers of art supplies. However, as we discuss further below, the size of the GRP is important for field calibration efforts.

4.2. Field calibration assessments

Although the lab calibration assessment demonstrated that the UAS sensors we tested are reasonably well calibrated in a controlled environment, the field calibration assessments yielded a number of important results regarding the reliability of field-derived reflectance measurements. Firstly, when extracting pixel values from targets with known reflectance from field survey image mosaics, we found low accuracy in the measured reflectance values, with a general trend towards an over-estimation of reflectance (Fig. 7). In particular, we found lower accuracy in longer wavelength bands (red-edge and NIR), with a greater magnitude of over-estimation and greater variability in measurement range than visible bands. When we assessed the performance of the RedEdge sensor through time and across different field conditions, we found poor correspondence in reflectance values between image collections. Finally, we found that the radiometric conversion performed by Pix4D produces less precise reflectance results than deriving reflectance values manually from individual photos. The relationship between the laboratory observation and the field observation of our GRPs could provide a possible way to remove the over-estimation and inconsistency that we observed in the reflectance mosaic images. We could apply a two-point calibration correction to all of the field survey image mosaics using the lab-field reflectance relationships of the two GRPs. This follows the basic principle of the empirical line correction method and would involve applying a linear shift to all pixel values in our reflectance raster image, based on the equation of a linear regression line fit to our GRP measurement points

4.1.3. Ground reference panel size selection

The samples of grey linoleum material (see RP5 and RP6 in Table 3) that we used for our lab scans were too small to use as effective field reference panels. When conducting aerial surveys with the RedEdge camera at 120 m AGL, the GSD of resulting images is 0.08 m/pixel. Our two linoleum panels were 0.20 m × 0.20 m (RP6) and 0.45 m × 0.45 m (RP5). In the RedEdge mosaic images, RP6 was less than 3 pixels in width and was very difficult to locate in the imagery. Further, it was extremely difficult to extract values from a pixel in the center of the reference panel that was free of edge effects and not spectrally mixed with the ground or vegetation surrounding the panel. Although we tried placing RP6 as a GRP at our field sites, we had to cease using it in the analysis for our field calibration assessments. The larger dimensions of the RP5 panel made it easier to locate in field imagery and allowed for extraction of pixel values from the center of the panel. However, when we extracted reflectance values for this target from individual RedEdge photos, we found that the standard deviation of our extracted values was high, and there were significant deviations from our ASD measured reference values (see Fig. 7a),


(similar to the red line shown in Fig. 7) with the goal of aligning our measured image values to the expected (lab reference) values. A unique correction would need to be determined for each band, and for each image collection, which would be labour intensive and would require a high degree of technical knowledge to execute. Regarding the variability in reflectance values observed over time, we note again that this variability appears to be synchronized across all five of the sensor bands, which suggests the cause is not random noise that affects each band independently, but rather there is a larger systematic driver of this variation. Importantly, we do not think that this variation was driven by progressive degradation of our GRPs because we selected very durable materials (see Section 2.2) and protected the GRPs from damage or degradation during transport and storage. Further, if there had been progressive degradation of our GRPs, the measured reflectance values would have a gradual rather than variable change, as observed in Fig. 9. The source of this variability is more likely to be driven by changes in illumination caused by solar illumination angle (time of day, or day of year), cloud conditions, or atmospheric effects (e.g., water vapor content, particulate content, etc.), which have not been accounted for in this analysis. We also noticed that the longer wavelength bands (red-edge and NIR) presented greater variability in reflectance than other visible bands. The reasons for this are two-fold. First, the lower accuracy in red-edge and NIR bands as we observed in our laboratory calibration assessments was likely to persist in the field measurements. Second, red-edge and NIR reflectance is more sensitive to the water vapor content in the atmosphere than other visible bands (Richards, 2013). Performing the two-point calibration correction outlined above could potentially remove some or all of these affects. 
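The two-point correction described above can be sketched in a few lines. This is only an illustration under stated assumptions: the panel values are invented, the function names are hypothetical, and the line is defined exactly by two GRPs rather than fitted to many points.

```python
import numpy as np

def two_point_empirical_line(field_dark, field_bright, lab_dark, lab_bright):
    """Return (gain, offset) of the line mapping field reflectance to lab reflectance."""
    if np.isclose(field_bright, field_dark):
        raise ValueError("the two panels must differ in measured reflectance")
    gain = (lab_bright - lab_dark) / (field_bright - field_dark)
    offset = lab_dark - gain * field_dark
    return gain, offset

def correct_band(band, gain, offset):
    """Apply the linear correction to every pixel of one reflectance band."""
    return gain * np.asarray(band, dtype=float) + offset

# Hypothetical values for one band of one image collection (not from the study)
gain, offset = two_point_empirical_line(
    field_dark=0.12, field_bright=0.58,  # GRP values extracted from the mosaic
    lab_dark=0.10, lab_bright=0.50,      # lab reference values for the same GRPs
)
mosaic_band = np.array([[0.12, 0.35], [0.58, 0.81]])  # stand-in for a raster band
corrected = correct_band(mosaic_band, gain, offset)
```

Because a unique correction is needed for each band and each image collection, the fit would have to be repeated per band and per survey.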
We are not entirely sure why our reflectance results from Pix4D exhibit lower accuracy than when we performed our own radiometric conversion. One potential explanation for these results is that our manual conversion used only the irradiance information provided by the DLS to convert radiance to reflectance for individual images, while the Pix4D method uses some combination of the DLS data and the photos of the CRP to scale reflectance values for the entire mosaic image composed of composite pixel values. Producing these composite mosaic images is technically complicated, and it is possible that this process is subject to error. How and why these differences exist are questions that we intend to investigate in future work; however, we recognize that most users, including ourselves, will continue using the mosaic reflectance products produced by Pix4D or similar software packages. As such, the issue of primary importance is to understand what may drive the uncertainties, and how errors can be limited.

This requires establishing a process to monitor ongoing data quality. We suggest that users who intend to use the raw reflectance from mosaic images as inputs to pixel-based land cover classifications or temporal change analysis take steps to assess whether their sensor performance remains stable over time. To this end, a recommended best practice for UAS imagery service providers is to have a local test field site that they can easily access for the purpose of collecting imagery of multiple ground reference panels. A test field that is available for use on a frequent and ongoing basis will allow for regular testing of sensors, with the aim of measuring reflectance at the beginning and end of a field season, or anytime hardware or firmware changes are made, to ensure there are no significant issues with the sensor measurements over time. To achieve this, users can establish their own set of GRPs made of materials that are durable and long lasting.
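One way to make this kind of regular testing concrete is a simple control chart for each band of each GRP. The sketch below is an assumption for illustration, not a rule taken from the study: it sets limits at the mean plus or minus k sample standard deviations of the first few surveys, and the reflectance values are invented.

```python
import statistics

def control_limits(baseline, k=2.0):
    """Return (low, high) as mean -/+ k sample standard deviations of the baseline."""
    mean = statistics.mean(baseline)
    sd = statistics.stdev(baseline)
    return mean - k * sd, mean + k * sd

def flag_surveys(history, baseline_count=5, k=2.0):
    """Indices of later surveys that fall outside limits set by the first surveys."""
    low, high = control_limits(history[:baseline_count], k)
    return [
        i for i, value in enumerate(history)
        if i >= baseline_count and not (low <= value <= high)
    ]

# Hypothetical NIR reflectance of one GRP over successive test-field surveys
nir_history = [0.50, 0.52, 0.49, 0.51, 0.50, 0.51, 0.58, 0.50]
flagged = flag_surveys(nir_history)  # surveys to investigate or reject
```

The choice of k and of the baseline length is a judgment call. A flagged survey is not necessarily a sensor fault, since illumination and atmospheric conditions can also move the values.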
We liked the flat spectral response of the materials we chose as GRPs, but realistically, the materials can span a range of color and brightness values. Panels should be large enough to be easily identified in imagery collected at typical operating altitudes. The size should scale with GSD, such that the panel is a minimum size of 9 × 9 pixels, allowing for examination of 9 pixels (a 3 × 3 block) at the center of the panel, with a 3-pixel wide exclusion buffer.

The procedure we suggest for conducting regular calibration field tests includes five basic steps: (1) produce a reflectance raster image; (2) locate the positions of the GRPs in the image; (3) extract the reflectance pixel values from those positions; (4) plot the reflectance values from each band; and (5) track the variation in reflectance values over time. This procedure will result in the calculation of a set of confidence thresholds that can be used to alert the user to anomalous results that should be investigated or rejected.

An interesting finding in this study is that, despite the problems noted above with respect to inaccurate and inconsistent reflectance measurements, especially in the longer wavelength bands of both UAS sensors, the NDVI data produced from image mosaics were accurate and precise with respect to both the GreenSeeker and each other. This is an encouraging result and demonstrates exactly the role that these commercial multispectral image sensors were designed to fulfil. Calculating a normalized band ratio effectively removes the effects of reflectance measurement inaccuracy that we observed. Furthermore, our results from field calibration assessment 2 (Fig. 8) are reliable and comparable over time. This is significant for applications such as monitoring the development and state of health of agricultural crops or other vegetation over time.

For UAS operators who provide imagery services for agricultural applications, we suggest periodic validation of field results using an approach similar to our second field calibration assessment, where field values obtained from a handheld NDVI sensor were compared to UAS sensor values. The Trimble GreenSeeker is not an expensive scientific instrument, and while we did not test its accuracy against the ASD in the laboratory, this sensor provides a good source of ground-based validation measurement that is independent from the multispectral UAS sensors.
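The statement above that a normalized band ratio suppresses reflectance error can be illustrated with a small numerical check. The sketch assumes, purely for illustration, that the error acts as a multiplicative factor on each band; the reflectance values and factors are invented. A factor common to both bands cancels exactly, while band-specific factors do not.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return (nir - red) / (nir + red)

# Hypothetical true reflectance of a vegetated surface
true_nir, true_red = 0.45, 0.05
reference = ndvi(true_nir, true_red)

# Same multiplicative over-estimation in both bands: the factor cancels
common_gain = ndvi(1.3 * true_nir, 1.3 * true_red)

# Different over-estimation per band: NDVI shifts, though only modestly here
band_specific = ndvi(1.3 * true_nir, 1.1 * true_red)
```

This is only one possible mechanism. Additive offsets, or errors that differ strongly between the red and NIR bands, would not cancel in the same way.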
The GreenSeeker calculation of NDVI is based on an active laser pulse measurement rather than a measurement of reflected sunlight, so it should provide consistent readings regardless of atmospheric or changing solar illumination conditions. Though it lacks sophistication (such as the ability to record or geo-tag measurements), it is inexpensive and easy to use. Other investigators have shown that this simple instrument is a powerful field validation tool that can be used to evaluate the performance of UAS multispectral image products in agricultural applications (Pauly, 2014, 2016).

To perform a comparison between image pixels and ground check points, it is not necessary to co-register images, or use the expensive GIS software that we used. There are a number of open source, free, or inexpensive GIS and image analysis software packages available that can enable a user to perform the required tasks. The procedure we suggest for validating NDVI values includes the following steps: (1) establish NDVI plots in the field and take a measurement from the centre of the plot with the GreenSeeker sensor (or any other independent instrument); (2) produce an NDVI raster image; (3) locate the positions of the NDVI plots in the image; (4) extract the NDVI pixel values from the image at those locations; and (5) compare the image NDVI values to ground measured values.

The key to this process is that the ground plots measured by the handheld instrument must be physically marked so that they are clearly visible in the mosaic imagery. Visual registration of the targets is the only way to be absolutely certain that the correct group of image pixels can be located to directly compare to ground-measured points. Users should not rely on coordinates obtained from generic handheld GPS units for marking the ground measurement plots, as these coordinates will not be accurate enough to ensure a precise comparison of measured values.
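Steps (3) to (5) of the NDVI validation procedure above can be sketched with NumPy alone, assuming the NDVI raster is already loaded as a 2-D array and the plot centres have been located visually as (row, column) pixel positions. The raster, plot positions, ground readings, and function names below are all hypothetical.

```python
import numpy as np

def plot_mean(ndvi_raster, row, col, half_window=1):
    """Mean NDVI in a square window centred on (row, col); plot must not touch the edge."""
    window = ndvi_raster[row - half_window: row + half_window + 1,
                         col - half_window: col + half_window + 1]
    return float(np.mean(window))

def compare(image_values, ground_values):
    """Return slope, intercept, R^2 and RMSE of image NDVI against ground NDVI."""
    x = np.asarray(ground_values, dtype=float)
    y = np.asarray(image_values, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)       # least-squares line y = slope*x + intercept
    r = np.corrcoef(x, y)[0, 1]                  # Pearson correlation
    rmse = float(np.sqrt(np.mean((y - x) ** 2)))
    return slope, intercept, r ** 2, rmse

# Synthetic stand-in for an NDVI mosaic with three marked plots
raster = np.full((12, 12), 0.2)
raster[1:4, 1:4] = 0.30
raster[5:8, 5:8] = 0.55
raster[8:11, 1:4] = 0.80
plots = [(2, 2), (6, 6), (9, 2)]   # visually located plot centres (row, col)
ground = [0.32, 0.53, 0.79]        # hypothetical handheld sensor readings

image = [plot_mean(raster, r, c) for r, c in plots]
slope, intercept, r2, rmse = compare(image, ground)
```

With only a handful of plots the regression statistics are indicative at best, so more plots spanning a wide NDVI range would give a more useful comparison.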
If coordinates are used to mark the ground measurement plots, GPS devices with centimeter accuracy, such as a survey-grade DGPS, should be used instead of generic handheld units; however, this equipment is typically very expensive and we do not expect that all users would have access to it.

5. Conclusion

This study used both laboratory and field calibration assessments to evaluate the performance of two multispectral sensors (MicaSense RedEdge and Airinov MultiSpec 4C) onboard two different Unmanned Aerial Systems (UAS). In the laboratory assessment, we compared reflectance values of twenty-nine different materials measured by an Analytical Spectral Device (ASD), to measurements of the same materials using a RedEdge and MultiSpec 4C sensor. We found that the RedEdge measurements, after being converted to radiance, had a strong linear relationship with ASD measurements, especially in the visible bands (blue, green, and red). The relationship between the ASD measurements and the MultiSpec 4C measurements was weaker, perhaps in part because the full radiometric correction model for this sensor was unavailable.

The results from our field calibration assessments suggest that users should be cautious when using raw reflectance values generated from UAS multispectral image sensors. Based on the results from this study, we suggest that the accuracy of the raw reflectance values may be low, which has important implications for pixel-based image classifications that rely on UAS imagery, particularly if the imagery is used to track change through time. Instead, we suggest that normalized index images would be more reliable for this purpose than absolute reflectance, especially if classification or change detection is to be performed on sets of multispectral imagery collected over a range of time or under different illumination conditions.

Based on our experience operating UAS multispectral sensors, and in light of the results from this study, we feel strongly that users of UAS technology should establish standard procedures for regularly testing the calibration of their instruments. While calibrating a consumer multispectral camera against a high-quality lab spectrometer is a technical and time-consuming process, we have provided a number of options that are both practical and accessible to most UAS operators.
These calibration and validation methods can be performed in the field on a periodic basis, and will reliably verify sensor accuracy, as well as alert users to sensor drift or sudden changes in sensor performance. Ultimately, this will help to ensure that the data obtained from UAS are both high quality and reliable.

Acknowledgements

We thank Dr. Martin Sharp for loaning us the Airinov MultiSpec 4C and SenseFly eBee system. We thank Dr. Guillermo Hernandez Ramirez, Kris Guenette, and Meng Jin for their contributions towards the fieldwork portions of the study. We thank James Jackson for allowing us to access his wheat field as our field site. We thank Dr. Benoit Rivard and Dr. Jilu Feng for access to and assistance with the laboratory spectrometry equipment. This work was carried out with the aid of a grant from Mitacs Accelerate.

Appendix A. Supplementary material

Supplementary data to this article can be found online at https://doi.org/10.1016/j.isprsjprs.2019.01.016.

References

Aasen, H., Bolten, A., 2018. Multi-temporal high-resolution imaging spectroscopy with hyperspectral 2D imagers–From theory to application. Remote Sens. Environ. 205, 374–389.
Ambrosia, V.G., Wegener, S.S., Sullivan, D.V., Buechel, S.W., Dunagan, S.E., Brass, J.A., Stoneburner, J., Schoenung, S.M., 2003. Demonstrating UAV-acquired real-time thermal data over fires. Photogramm. Eng. Remote Sens. 69, 391–402.
Assmann, J.J., Kerby, J.T., Cunliffe, A.M., Myers-Smith, I.H., 2018. Vegetation monitoring using multispectral sensors-best practices and lessons learned from high latitudes. bioRxiv, 334730.
Berni, J.A.J., Zarco-Tejada, P.J., Suárez, L., Fereres, E., 2009. Thermal and narrowband multispectral remote sensing for vegetation monitoring from an unmanned aerial vehicle. IEEE Trans. Geosci. Remote Sens. 47, 722–738.
Clemens, S.R., 2012. Procedures for correcting digital camera imagery acquired by the AggieAir remote sensing platform.
Colomina, I., Molina, P., 2014. Unmanned aerial systems for photogrammetry and remote sensing: a review. ISPRS J. Photogramm. Remote Sens. 92, 79–97.
Crusiol, L.G.T., Nanni, M.R., Silva, G.F.C., Furlanetto, R.H., da Silva Gualberto, A.A., Gasparotto, A.C., De Paula, M.N., 2017. Semi professional digital camera calibration techniques for Vis/NIR spectral data acquisition from an unmanned aerial vehicle. Int. J. Remote Sens. 38, 2717–2736.
Dall’Asta, E., Forlani, G., Roncella, R., Santise, M., Diotri, F., di Cella, U.M., 2017. Unmanned aerial systems and DSM matching for rock glacier monitoring. ISPRS J. Photogramm. Remote Sens. 127, 102–114.
Deng, L., Mao, Z., Li, X., Hu, Z., Duan, F., Yan, Y., 2018. UAV-based multispectral remote sensing for precision agriculture: a comparison between different cameras. ISPRS J. Photogramm. Remote Sens. 146, 124–136.
Goldman, D.B., 2010. Vignette and exposure calibration and compensation. IEEE Trans. Pattern Anal. Mach. Intell. 32, 2276–2288.
Hruska, R., Mitchell, J., Anderson, M., Glenn, N.F., 2012. Radiometric and geometric analysis of hyperspectral imagery acquired from an unmanned aerial vehicle. Remote Sens. 4, 2736–2752.
Jin, X., Liu, S., Baret, F., Hemerlé, M., Comar, A., 2017. Estimates of plant density of wheat crops at emergence from very low altitude UAV imagery. Remote Sens. Environ. 198, 105–114.
Kelcey, J., Lucieer, A., 2012. Sensor correction of a 6-band multispectral imaging sensor for UAV remote sensing. Remote Sens. 4, 1462–1493.
Laliberte, A.S., Goforth, M.A., Steele, C.M., Rango, A., 2011. Multispectral remote sensing from unmanned aircraft: Image processing workflows and applications for rangeland environments. Remote Sens. 3, 2529–2551.
Liu, K., Shen, X., Cao, L., Wang, G., Cao, F., 2018. Estimating forest structural attributes using UAV-LiDAR data in Ginkgo plantations. ISPRS J. Photogramm. Remote Sens. 146, 465–482.
Neale, C.M.U., Crowther, B.G., 1994. An airborne multispectral video/radiometer remote sensing system: development and calibration. Remote Sens. Environ. 49, 187–194.
Pauly, K., 2014. Applying conventional vegetation vigor indices to UAS-derived orthomosaics: issues and considerations. In: Proceedings of the International Society of Precision Agriculture (ICPA).
Pauly, K., 2016, July. Towards calibrated vegetation indices from UAS-derived orthomosaics. In: Proc. of the 13th Int. Conf. on Precision Agriculture, International Society of Precision Agriculture, Monticello, Illinois.
Pena, M.A., Cruz, P., Roig, M., 2013. The effect of spectral and spatial degradation of hyperspectral imagery for the Sclerophyll tree species classification. Int. J. Remote Sens. 34, 7113–7130.
Pozo, S.D., Rodríguez-Gonzálvez, P., Hernández-López, D., Felipe-García, B., 2014. Vicarious radiometric calibration of a multispectral camera on board an unmanned aerial system. Remote Sens. 6, 1918–1937.
Richards, J.A., 2013. Remote Sensing Digital Image Analysis: An Introduction, fifth ed. Springer, Berlin, New York.
Sarma, D.D., 2010. Geostatistics with Applications in Earth Sciences. Springer Science & Business Media, pp. 31–35.
Smith, G.M., Milton, E.J., 1999. The use of the empirical line method to calibrate remotely sensed data to reflectance. Int. J. Remote Sens. 20, 2653–2662.
Stagakis, S., González-Dugo, V., Cid, P., Guillén-Climent, M.L., Zarco-Tejada, P.J., 2012. Monitoring water stress and fruit quality in an orange orchard under regulated deficit irrigation using narrow-band structural and physiological remote sensing indices. ISPRS J. Photogramm. Remote Sens. 71, 47–61.
Toth, C., Józków, G., 2016. Remote sensing platforms and sensors: A survey. ISPRS J. Photogramm. Remote Sens. 115, 22–36.
Wang, C., Myint, S.W., 2015. A Simplified Empirical Line Method of Radiometric Calibration for Small Unmanned Aircraft Systems-Based Remote Sensing. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 8, 1876–1885.
Wang, D., Morton, D., Masek, J., Wu, A., Nagol, J., Xiong, X., Levy, R., Vermote, E., Wolfe, R., 2012. Impact of sensor degradation on the MODIS NDVI time series. Remote Sens. Environ. 119, 55–61.
Webster, C., Westoby, M., Rutter, N., Jonas, T., 2018. Three-dimensional thermal characterization of forest canopies using UAV photogrammetry. Remote Sens. Environ. 209, 835–847.
Woolley, J.T., 1971. Reflectance and transmittance of light by leaves. Plant Physiol. 47, 656–662.
Yuan, C., Zhang, Y., Liu, Z., 2015. A survey on technologies for automatic forest fire monitoring, detection, and fighting using unmanned aerial vehicles and remote sensing techniques. Can. J. For. Res. 45, 783–792.
Zarco-Tejada, P.J., Guillén-Climent, M.L., Hernández-Clemente, R., Catalina, A., González, M.R., Martín, P., 2013. Estimating leaf carotenoid content in vineyards using high resolution hyperspectral imagery acquired from an unmanned aerial vehicle (UAV). Agric. For. Meteorol. 171–172, 281–294.
Zhang, C., Kovacs, J.M., 2012. The application of small unmanned aerial systems for precision agriculture: a review. Precis. Agric. 13, 693–712.
Zheng, Y., Lin, S., Kambhamettu, C., Yu, J., Kang, S.B., 2009. Single-image vignetting correction. IEEE Trans. Pattern Anal. Mach. Intell. 31, 2243–2256.