ARTICLE IN PRESS
Computers & Geosciences 31 (2005) 900–912 www.elsevier.com/locate/cageo
Localising missing plants in squared-grid patterns of discontinuous crops from remotely sensed imagery$ J.M. Robbez-Massona,, J.C. Folteˆteb a
Agro.M, Laboratoire d’e´tude des Interactions Sol-Agrosyste`me-Hydrosyste`me, UMR LISAH Agro.M/INRA/IRD (no. 1221), 2 place Viala, 34060 Montpellier cedex 01, France b Universite´ de Franche-Comte´, UMR The´MA (no. 6049), UFR Lettres SHS, 32 rue Me´gevand, 25030 Besanc- on cedex, France Received 13 July 2004; received in revised form 1 February 2005; accepted 25 February 2005
Abstract The purpose of this work is to localise and characterise missing plants on very high resolution (VHR) aerial images of agricultural parcels, in the case of discontinuous crops like wine and olive tree, which are planted according to a squared-grid pattern. It aims to establish an assisted, image processing system for remote sensed images, allowing the inventory the missing or withering plants, and the monitoring of their evolution during time. The global approach considers the planted parcel as a topological graph of vertices, whose reciprocal location conforms to a set of geometrical rules about orientation and length. The proposed system initiates the graph from the original image; then it adds missing vertices and refines its knowledge of the spatial pattern on an iterative basis. Quality indicators are assigned at each added vertex, and several stopping criteria are estimated for each iteration, permitting an automated use of the algorithm. Test cases have been conducted on two data sets of three parcels each: olive groves and goblet vineyards. The results are compared to validation data. They show an efficient reconstruction of the geometry and satisfactory omission–commission errors; they allow drawing up a typology of the major errors, and propose calibration parameters based on a sensitivity analysis. The main improvements include essentially the preprocessing, filtering step of the initial image. The process is being used for Languedocian vineyards (France), and may be potentially usable for other problematic with the same kind of spatial patterns. r 2005 Elsevier Ltd. All rights reserved. Keywords: Remote sensing; Digital image analysis; Point pattern; Graph; Olea europae L.; Vitis vinifera L.
1. Introduction
$
Code on server at http://www.iamg.org/CGEditor/ index.htm. Corresponding author. Tel.: +33 4 99 61 25 22; fax: +33 4 67 63 26 14. E-mail address:
[email protected] (J.M. Robbez-Masson).
Numerous crops are planted according to a complex spatial pattern: vine (Vitis vinifera L.) and olive tree (Olea europae L.) are such examples covering millions of hectares each in southern Europe, and representing an obviously economic interest. For these crops, observation from the sky currently shows regularly spaced individuals, broadly separated
0098-3004/$ - see front matter r 2005 Elsevier Ltd. All rights reserved. doi:10.1016/j.cageo.2005.02.013
ARTICLE IN PRESS J.M. Robbez-Masson, J.C. Folteˆte / Computers & Geosciences 31 (2005) 900–912
from each other by bare soil. In the case of vineyards, four kinds of spatial patterns that are frequent in Mediterranean areas, can be distinguished (Champagnol, 1984). One among the most commonly encountered is named ‘‘goblet’’, i.e. regularly planted vines according to a roughly squared grid. This training mode is traditionally used in Languedoc-Roussillon (France) as in most Mediterranean countries of Europe, therefore it can be encountered in younger wine-producing countries too (Australia, New-Zealand, USA). Other kinds of plantations (e.g. in forestry) show similar patterns. The geometrical description of such patterns potentially gives information to the agronomists about crop density and yields, pesticide amounts (spread on a plant basis), runoff and erosion risks (interaction between row orientation, slope and aspect), etc. Pattern regularity is an obvious indicator for localising missing or withering individuals, assessing the spreading of pests, or giving guidance for ground sampling. It may also allow distinguishing between (irregularly located) weeds and (regularly planted) crops (Onyango and Marchant, 2003). Finally it may be used for characterising the vigour of each individual. More and more indeed, the agricultural management according to precision agriculture concepts, leads to change in the management practices from the parcel scale to the plant scale. Accurately locating each plant is therefore a preliminary need to be addressed (Tisseyre et al., 1999). Remote sensing (RS) is an invaluable information source for observing and monitoring the plant vigour, especially since very high spatial resolution (VHR) images became widely available: for vineyards, numerous studies were run in the optical domain (see the review by Hall et al. (2002)), and more and more in the hyperspectral one (Madeira Netto et al., 2004). Training modes have been assessed using Fourier transforms (Wassenaar et al., 2002) or wavelets (Ranchin et al., 2001). An approach for localising individual vines in wire-trained, well-maintained vineyards, has recently been developed (Hall et al., 2003). As far as olive trees are concerned, an automatic recognition and inventory tool at the plant scale has been realised (Kay et al., 1998), and the issue of separating weeds and olive trees has been resolved by simple radiometric means (Pen˜a-Barraga´n et al., 2004). Such approaches are currently being applied in the framework of the European MARS project,1 whose aims are the assessment of the production capacity and the implementation of an efficient land parcel identification system (Tsiligirides, 1998).
1 European Commission, 2004. Agrifish unit—Vineyards and olive tree activity, http://mars.jrc.it/marspac/olivine/ default.htm.
901
While more effort is usually dedicated to locating and characterising present plants in the parcels by RS, none of the current projects have taken an interest in the missing ones. This is one of the objectives of the research program named GesSol (Lagacherie et al., 2001, 2004), whose purpose is to develop a characterisation approach for withering vineyards. Our work takes place into the framework of the GesSol program. It deals with the issue of searching for missing plants in discontinuous crops, by using aerial digital images. In the context of this kind of crop, we argue that a spatial analysis-based approach focused on individualised objects is more relevant than the usual approach based on image processing, even if digital images are the generic medium of most spatial data. Our approach might be usable for other crops arranged according to a squared pattern, as in numerous kinds of orchards. In the following text, whatever the plant species (vine, olive tree or other) and its age, individuals will be named ‘‘plants’’ when referring to the actual object.
2. Materials and methods Starting from images of cultivated parcels where individual plants are separated on the ground, we propose hereafter to consider them as discrete objects. Therefore a first preprocessing step locates each plant (Section 2.1) and produces a set of objects, on which an original algorithm is applied to find missing plants (Section 2.2). The indicators allowing the control of the geometrical and thematic quality of the precedent steps are presented in Section 2.3. Then test cases have been developed (Section 2.4) and the evaluation conditions are presented (Section 2.5). 2.1. Recognition of plants as elementary objects Starting from a grey level image, the first step involves recognition of visible plants from soil, as clusters of adjacent pixels, to create a categorical image with two classes. Assuming the strong difference between their respective reflectance, the specific radiometric behaviour of these two elements should allow discriminating them by simple thresholding. This assumption is supported by earlier works aiming at extracting vegetal elements from grey level images of crop parcels (Hall et al., 2003; Marchant, 1996; Perez et al., 2000; Tillett et al., 2001; Aitkenhead et al., 2003). In the case of rather irregularly maintained crop parcels, some plants are missing or withering, and weeds may fill a certain area. Therefore the image thresholding leads to numerous confusions, because numeric values representing either soil or plant have a strong variability. Locally, the identification proves to be easier if the plant’s grey
ARTICLE IN PRESS 902
J.M. Robbez-Masson, J.C. Folteˆte / Computers & Geosciences 31 (2005) 900–912
levels are compared with the values of their spatial neighbourhood, i.e. using a filtering of the initial image. With olive groves, Kay et al. (1998) used a morphological approach to recognise blobs of given size, compacity and radiometry, but the approach failed to proceed efficiently with relatively connected vines. In the case of wire-trained vineyards, Wassenaar et al. (2001) used conjointly radiometry and morphology; but their work is not aimed at solving the case of goblet vineyards. We tried several textural enhancement processors, like the ‘‘extreme value sharpening’’ filter (Klette and Zamperoni, 1996) and the LoG filter (Sotak and Boyer, 1989), but the results were not satisfactory. That is why a specific filtering procedure has been adapted to the particular structure of our crop images. Accordingly, a specific filtering procedure has been adapted to the particular structure of our crop images mainly made of dark dots on a light background: at each cell, we computed the difference between two arithmetic means: one taken on image values collected in a small ring-shaped neighbourhood, and one collected in a larger one. This difference is expected to be lower for plants than for soils. Then the filtered image is thresholded in order to separate the two classes in the form of unambiguous ‘‘blobs’’ representing one single plant each. This was done here in an interactive way by choosing the threshold value and visually checking the obtained result. The last step consists of identifying in the two-class thresholded image the adjacent pixels which belong to the ‘‘plant’’ class by a labelling process (Rosenfeld and Kak, 1982) and to get their centroid coordinates. 2.2. Identification of missing plants Centroids of plants identified above are now considered as vertices in a graph. The main principle of the algorithm is to build and use the topological and geometrical relationships between these vertices, in order to identify the ‘‘missing vertices’’ from the properties of the edges connecting existing ones. Lines are assumed to be roughly perpendicular; then determining missing vertices only needs the knowledge of a single orientation angle ap . A preliminary estimate of the inter-plant spacing ep is also assumed. These global parameters can be either measured on the ground, on the computer screen, or automatically estimated. 2.2.1. Definition of the graph Let S ¼ fSk g, k ¼ 1 . . . n be the set of n vertices. The graph A ¼ fAij g, i ¼ 1 . . . n, j ¼ 2 . . . ði 1Þ describes all the edges between each pair (Si, Sj) of vertices. Each edge Aij is described by its parent vertices Si and Sj, its length l and its orientation angle a. Given l and ep , the
integer number of intervals between Si and Sj can be estimated (see further on). Furthermore, each edge will be considered as ‘‘active’’ in the graph if three conditions are jointly satisfied:
First-order contiguity: Si and Sj are contiguous at the
first order if no other vertex interposes itself between them. Considering n circles Ck of radius r centred at each vertex, contiguity is defined as a Boolean function set to false if the edge Aij intersects one of the ðn 2Þ circles Ck,k6¼i,k6¼j, and true otherwise. Conform orientation: the orientation a of the edge Aij is in compliance if it is coarsely similar, either to the global orientation ap of the parcel, or to its perpendicular orientation, with a tolerance of da7 ða ap Þmod 90odaðdegreesÞ, ða ap þ 90Þmod 90oda. The use of an angular tolerance aims to compensate for the small imperfections of the rows between the identified vertices. Indeed, these imperfections can have a strong influence on the search for the valid edges. However, it must be avoided that this angular tolerance draws lines between plants pertaining to different alignments, whatever the length l of the edge be. If l corresponds to an estimated number m of intervals, then this constraint is defined by 1 1 daptan . 2ðm þ 1Þ
Minimal
distance: The edge Aij linking Si and Sj is sufficiently distant from all other vertices of S, i.e. dðAij ; S k Þ4dr;
8k ¼ 1 . . . n; kai; kaj.
This condition can be tested by checking the absence of intersection between the edge and all the circles of radius dr (dror) centred on Sk , kai, kaj (Fig. 1). Let AOK be the subset of selected edges from A as defined by the three above conditions. 2.2.2. Properties of the selected edges Let us now consider crop parcels with plants separated by a quite regular spacing of ep . In the case of a parcel without missing plants, the length l of each edge is similar to the spacing ep . The distribution histogram of the lengths is very concentrated around the modal value mode(l). In the case of a parcel containing missing plants, the expected length of the majority of the edges is still similar to ep , but may be equivalent to a multiple of ep too. It is assumed that the value mode(l) constitutes a good estimate of ep . Then the number of
ARTICLE IN PRESS J.M. Robbez-Masson, J.C. Folteˆte / Computers & Geosciences 31 (2005) 900–912
903
splitting one edge of length l into two edges of length smaller and closer from the modal value. For this reason the statistical distribution of distances l becomes increasingly grouped, and the mode is increasingly stable. 2.2.4. Computation of the coordinates of a new vertex Considering the graph AOK for a given iteration, let l max be the maximal value of l in AOK and Aij the corresponding edge. If m is the estimate of the number of missing vertices, let Sk be the vertex to be added. The point P0 is defined as
the point located at an equal distance between Si and Fig. 1. Tolerance angle da around a for searching complying orientations.
Sj, if mo2.
the point located at an interval of distance mode(l) of Si on the segment linking Si and Sj if mX2.
missing vertices m on an edge of length l can be estimated as l , m ¼ int modeðlÞ ‘‘int’’ being the function ‘‘lower integer to’’. In this way, all the edges like m41 lead to detection of the anomalies of the complete graph. 2.2.3. Algorithm steps Due to interrelationship between some parameters, the value of ep needs to be initiated. Then the algorithm (named reconstruction algorithm thereafter) operates by iterations. Each iteration consists of three steps:
In the first case, one single vertex is missing then the new vertex is positioned onto the point P0. In the second case, because several vertices are to be successively positioned, errors due to the estimation of mode(l) may arise. Then the model will look for a second orientation line perpendicular to Aij: if there are two vertices Si0 and Sj0 whose edge Ai0 j0 intersects Aij and whose orientation makes a roughly perpendicular angle with Aij, let us define the point P1 as the intersection between Aij and Ai0 j0 . If P1 lies in a range of consistent distances of its four parent vertices, i.e. an integer number of spacing, then the new vertex Sk is located into P1, else the P0 location is kept. Fig. 2 illustrates this operation.
(1) the graph A is created or updated, then AOK is derived from A and mode(l) is estimated; (2) the edge Aij of maximum length max(l) is selected in AOK; (3) a new vertex is added on the edge Aij at a precise distance from its parent vertices (cf. Section 2.2.4).
Si’
Sj
there are no more edges whose length l is higher than
2 mode(l); a predefined number of new added vertices has been reached. This option allows defining a posteriori the optimal number of new vertices to be kept (cf. Section 2.3).
At each iteration, the most frequently observed spacing mode(l) is estimated and then a vertex is added,
=
At the first iteration, the graph consists only of vertices identified after the image preprocessing step. Then the next iterations lead to create new vertices until a stopping criterion is satisfied. Two stopping criteria can be chosen by the user:
P1 P0
Si
Sj’
ep Fig. 2. Defining new coordinates during reconstruction: case of a spacing between Si and Sj greater than 2 modal spacings ep (mX2).
ARTICLE IN PRESS 904
J.M. Robbez-Masson, J.C. Folteˆte / Computers & Geosciences 31 (2005) 900–912
2.3. Control criterion of geometrical regularity The set of plants builds a spatial point pattern. An )ideal parcel* should show very regular rows of plants, without holes. If we consider this perfect parcel as a theoretical reference, the regularity level of the spatial point pattern can be used as a control criterion for the different processing steps. The aggregation coefficient R (Clark and Evans, 1954; Morrison, 1970) is used here for measuring the regularity level of the point pattern. It is usually employed in several domains to discriminate between concentrated, random and regular spatial point patterns. This index is based on the nearest neighbourhood distance of each point: the average values of the observed distances are compared to the average value expected in the case of complete spatial randomness, according to the Poisson law: !, rffiffiffi n 1X n R¼ di 2 , n i¼n s where di is the nearest neighbourhood distance of the point i, n the total number of points and s the area of the studied zone. In the case of a perfectly maintained crop parcel, the spatial point pattern would be a regular, squared grid involving an R value of 2. Then an R value smaller than 2 can be used as an indicator of the geometrical disorder
of the plants, for example if whole parts of the parcel are composed of missing plants, or if weeds or trees are present. 2.4. Data sets Two data sets of three parcels each have been used in order to test the algorithm: (i) olive tree parcels, and (ii) vine parcels (Fig. 3) Each parcel has been characterised by its major row orientation angle ap (degrees) and row spacing ep by a Fourier transform (Wassenaar et al., 2002). 2.4.1. Olive tree parcels These parcels have been extracted from the panchromatic band of a 1-m resolution orthophoto around Arles (France—lat. ¼ 431410 N, lon. ¼ 41520 E). This image was acquired by IGN in September 1997 for the needs of the European OLISTAT project, whose aim is the monitoring of olive groves (Kay et al., 1998; Masson, 2002). Olive trees are currently visible in the form of distinct, unambiguous dark blobs on a background of light soil, which has been homogenised by the cultivation practices. 2.4.2. Vineyard parcels Three parcels have been chosen within the lower ‘‘La Peyne’’ watershed, around Montpellier (France— 431320 N, 31200 E). This catchment area of about 70 km2,
Fig. 3. Images of data set.
ARTICLE IN PRESS J.M. Robbez-Masson, J.C. Folteˆte / Computers & Geosciences 31 (2005) 900–912
strongly dominated by vine cultivation, is representative of the French Mediterranean coastal plain. In the framework of research programs aiming at recognising the soil surface features by RS data (Wassenaar, 2001), aerial photographs were acquired by helicopter at several dates and transformed into orthophotos. The chosen image was taken in May 1998 into the visible domain and resampled to a 0.25 m spatial resolution. The red band was retained, because it is intended to provide the best contrast between vine (low reflectance) and soil surface (high). In this case, soil, shadow and vegetation are far more intricated than for olive trees. 2.5. Validation conditions We did not compare directly the obtained results with ground data in our case—first because both images and validation data should have been collected at the very same time—but more because it would have combined together errors between ground and image on the one hand, and modelling errors from the image on the other hand. It was chosen here to get validation data directly from the image interpretation, as a database of individually digitised points informed as present or missing, according to the spatial pattern made out by a trained expert. For each parcel, the visually recognised points Pv (plants and missing ones) are compared to the whole set of automatically recognised points Pa. The two spatial point patterns are matched by means of a spatial join: each point Pa is joined to its closer neighbour among the points Pv, if the distance between them is not above an a priori fixed value (for the present application, it corresponds to a third of the initial modal distance). Some points Pa have no corresponding points Pv: they constitute the commission error; some other points of Pv were not recognised by the reconstruction algorithm and are omission errors. The matching process yields two matrices:
Geometrical
error matrix: The errors are counted without considering the ‘‘plant’’ and ‘‘missing one’’ categories, i.e. the accuracy is measured only from a globally geometrical point of view. Total error matrix: It holds both for commission and omission errors. The plants which are not recognised by the preprocessing step and that are identified as missing ones by the reconstruction step are then counted as errors.
At each iteration, the two matrices allow the assessment of the global quality by means of a Kappa coefficient (Congalton, 1991), named‘‘geometrical Kappa’’ index Kg and ‘‘total Kappa’’ index Kt, respectively.
905
3. Results Olive tree and vine images were filtered, and then thresholded on a visual basis in order to keep only unique blobs on the label images. Results obtained on the selected parcels will be first described globally in terms of behaviour and performance (Section 3.1); secondly, omission and commission errors will be quantified (Section 3.2) while a typology of the main errors encountered will be proposed (Section 3.3); finally a sensitivity analysis will be conducted on the filtering, thresholding and stopping modalities, with the aim to identify the most appropriate operation conditions of the proposed method (Section 3.4). Results take into account 1022 olive trees and 3143 vines. 3.1. Geometrical performance of the reconstruction step For each parcel, R, Kg and Kt are computed for the spatial point pattern before adding the first new vertex and after stopping the processing (Table 1). The systematic increase of R shows that adding new vertices leads logically to a more regular pattern. This increase is all the more important since the number of missing plants to be found is greater. Increase of the two Kappa indices represents the likelihood gain obtained at the end of the recognition step. Therefore, increase of Kg is clearly greater than for Kt. This difference is due to preprocessing errors, since some plants that were not identified during this step were rebuilt later by the reconstruction step (Table 2). Roughly 2/3 of the vertices are located more accurately than one image cell, and 95% more accurately than two cells. The 1% of the worst matched points (i.e. with the greatest distances) are almost always located at the parcel borders, or—in the case of parcel V77—where two big trees are actually situated (north). The standard deviation s of matching distances is in almost all cases clearly less than the image resolution, and considered as satisfactory. The parcel O02 is the only one with a greater value, to be directly related with the western part of the image, in which the olive trees do not demonstrate a regular pattern at all. 3.2. Thematic performance Beyond the values of global indicators presented earlier, numerous extra information allows establishing and displaying the thematic performance (Fig. 4): results of each step can be overlaid on each other, and qualitatively or quantitatively analysed in a GIS. As validation data are available, one can compute the omission and commission errors, locate and explain them. In the case of olive tree parcels, 71% of actually missing trees (22/31) are recognised as such.
ARTICLE IN PRESS J.M. Robbez-Masson, J.C. Folteˆte / Computers & Geosciences 31 (2005) 900–912
906
Table 1 Evolution of control criteria from beginning to end of processing R coefficient
Parcel
Kt coefficient
Kg coefficient
Beginning
End
Beginning
End
Beginning
End
O01 O02 O03
1.778 1.663 1.858
1.800 1.759 1.868
0.941 0.757 0.979
0.974 0.838 0.990
0.939 0.757 0.979
0.966 0.772 0.990
Variation for olive groves (O) (%) V40 V41 V77
1.603 1.652 1.612
1.711 1.743 1.698
0.746 0.790 0.810
0.897 0.927 0.931
0.746 0.784 0.810
+2.5
Variation for vineyard parcels (V) (%)
+5.1
+5.9
+2.0
+17.5
0.768 0.810 0.898 +5.7
Table 2 Statistical distribution of matching distances between validation points and vertices: quantiles of relative distances to modal spacing d max =e (%), for 67, 90, 95 and 99%, and standard deviation of distances Parcel
O01 O02 O03 V40 V41 V77
Cumulative frequency 0.67
0.90
0.95
0.99
0.15 0.15 0.14 0.16 0.17 0.13
0.21 0.23 0.24 0.20 0.26 0.19
0.26 0.34 0.30 0.22 0.31 0.22
0.59 0.90 0.48 0.26 0.83 0.72
Spacing e (m)
Resolution (m)
Std. Dev. s (m)
7.00 8.00 9.00 1.50 1.50 1.75
1.00 1.00 1.00 0.25 0.25 0.25
0.71 1.31 0.88 0.10 0.19 0.20
Values to be compared with row spacing and resolution of image. For instance, 90% of vertices for parcel V77 are located closer than 0.19 modal spacing to a validation point.
Unrecognised trees are either located at the parcel borders (O01 and O02 parcels), or marked by the algorithm as being incorrectly located due to the irregularity of the spatial pattern (O02). In this last parcel, commission occurs for the same reason. Globally, these well-aligned trees, having a good contrast with a rather plain background, lead to satisfactory results. As far as vineyards are concerned, there are 278 actually missing vines, of which 69% are directly recognised by the program. Omission come mainly from orientation irregularity (in V77 parcel mean orientation was approximately 71 different between the northern and southern part), and in the same proportion from actual missing vines located at the parcel borders (V40 and V77). 90% of commission errors were due to the fact that we chose, during the preprocessing, the threshold level in order to eliminate blobs made from several vines. This choice made a lot of actual plants disappear and the algorithm
built them again, but marked them as missing instead of present. 3.3. Typology of main errors Examination of test cases allows proposing a typology describing the most widely encountered errors and their occurrence conditions. It does not have the aim to enumerate all the possible conditions—because some errors can cancel themselves or conversely be linked together—but just to characterise the main limits of the system in order to improve it. Fig. 5 schematises these cases. 3.3.1. Omission errors
O1: The missing plant lies outside of convex polygon containing the set of labelled vertices. This error is due to the current location function that is only able
ARTICLE IN PRESS J.M. Robbez-Masson, J.C. Folteˆte / Computers & Geosciences 31 (2005) 900–912
907
Fig. 4. From aerial image (top left), preprocessing step extracts each parcel—here, V41—filters it and thresholds result (top right). On other hand, validation data are available (bottom left). Thus results of algorithm can be assessed after mapping (bottom right). Finally all these steps can be overlaid for interpreting possible causes of error.
to generate a new vertex between two existing vertices. An enhancement would be to allow the generation of new vertices outside the links between two existing vertices. O2: A vertex corresponding to an actual missing plant has been generated, but too close to an existing vertex (S4 in the Fig. 5). Therefore, such vertices like S4 are mainly border artefacts created by our current preprocessing step. They may also correspond to an unexpected location of the neighbouring vertices. Currently the procedure does create the vertex S4 but marks it with a special value of the quality attributes, so it can easily be deleted a posteriori. A more efficient preprocessing filter would greatly reduce this problem. O3: Other cases. There just are no existing couples of vertices making edges with consistent geometrical properties (angle and length). This could be due to changes in planting pattern inside of the parcel, or to differential effects of the preprocessing according to plant vigour.
has been generated during labelling where n must have been. This vertex lies between two existing others. The locating algorithm will generate up to ðn þ 1Þ new vertices when no one should have been. C2: The context is similar to C1 but the blob is made of an odd number n of plants. The vertex lies on an existing plant if the shape of the blob is rather linear (the most common case). C3: The preprocessing step erroneously removed the less vigorous plants. The reconstruction algorithm generates again a vertex considered as ‘‘missing’’, when actually there is an existing plant. C4: Complex shape of the parcel: the algorithm creates a new vertex outside of the parcel limit. A special attribute keeps track of this, so that such vertices can easily be deleted a posteriori. Co: Other cases, like O3.
C1, C2 and C3 errors may easily be corrected a posteriori, examining the mean grey level of the original image under the new generated vertices.
3.3.2. Commission errors
3.4. Sensitivity analysis
C1: The preprocessing step generated a blob actually
Sensitivity analysis involves assessing the influence of the preprocessing options (neighbourhood size,
made of an even number, n, of plants. A single vertex
ARTICLE IN PRESS 908
J.M. Robbez-Masson, J.C. Folteˆte / Computers & Geosciences 31 (2005) 900–912
Fig. 5. Main error cases.
thresholding) and the automatic stopping criteria on the recognition quality. Concerning the neighbourhood size, tests have shown that modal spacing could be considered as the optimal diameter of the external neighbourhood. Bigger windows lead to the aggregation of many individual plants, whereas smaller windows lead to splitting the visually recognised plants into several blobs. Considering a certain neighbourhood size, the same filtering can provide different results in relation to the thresholding value. For assessing the relationship between this choice and the global performance of the method, three thresholding rules have been compared:
maximisation of R, named Rmax; maximisation of the number of
plants, named
Nbmax; visual interpretation and minimisation of the patches which represent more than a single plant, named Visual.
For all the possible threshold values, the relevance of these three criteria is evaluated by Kg, computed f
rom the matching between the resulting points (plants and missing ones) and the visually recognised points (Fig. 6). The curves of the parcels O01 and O03 are very similar and prove to be of little interest, because recognition here is relatively easy and the first two criteria give the same results. The curves of the other parcels show that Rmax seems not to be very efficient, being always lower than others. The maximal values of Kg are obtained with Nbmax or Visual, the difference being generally small. On the other hand, if one takes Kt into account too (Table 3), the Nbmax criterion proves to be the more efficient. Such advantage is the consequence of the definition of the maximum number of plants by thresholding that minimises the C3 error. The obtained results show that Nbmax is the more appropriate automatic option. However, note that the use of Nbmax may involve the creating of blobs composed of several plants, and that an automatic thresholding can be easily controlled by visual interpretation. The results presented above were derived from the ‘‘maximal length’’ stopping criterion (named Lmax). Other criteria can be used to automatically stop the reconstruction step. In this way, the maximal value of R (Rmax) can be compared with Lmax. For each criterion the number of missing plants is compared to the number corresponding to the maximal value of Kt (Table 4). The stopping of the algorithm defined by the two criteria is often close to the maximal value of Kt: these results show that both criteria are similar and relevant. However, parcels O02 and V41 involve a higher deviation, showing that one must be cautious about the use of automatic stopping criteria.
4. Discussion and perspectives 4.1. Benefits The proposed approach yields excellent to satisfactory results according to the complexity of the spatial patterns in the original images. With results obtained on two diverse data sets, one can observe that:
The algorithm rebuilds the geometry of an incomplete
graph of vertices—with a location accuracy of all vertices being better than 25% of the mean spacing in more than 90% of the cases. Recognition and placement of missing plants leads to omission and commission errors that are either small—or greatly improvable if using best quality images available and a more efficient preprocessing procedure.
ARTICLE IN PRESS J.M. Robbez-Masson, J.C. Folteˆte / Computers & Geosciences 31 (2005) 900–912
909
Fig. 6. Global behaviour of method in relation to thresholding criterion.
Table 3 Maximum Kappa coefficients in relation to thresholding criterion Parcel
Thresholding criterion
Maximum Kg index
Maximum Kt index
O01
Rmax Nbmax
0.976 0.974
0.955 0.966
O02
Rmax Nbmax Visual
0.794 0.869 0.857
0.784 0.858 0.788
O03
Rmax Nbmax
0.973 0.990
0.959 0.990
V40
Rmax Nbmax Visual
0.869 0.909 0.898
0.826 0.869 0.768
V41
Rmax Nbmax Visual
0.899 0.915 0.929
0.844 0.864 0.811
V77
Rmax Nbmax Visual
0.872 0.923 0.931
0.825 0.894 0.898
Recall that for O01 and O03 parcels, Nbmax and Visual results are equal.
Table 4 Number of missing vertices found according to stopping criteria Parcel
Kmax
Dmax
Rmax
O01 O02 O03
18 25 4
18 29 4
17 29 4
V40 V41 V77
190 156 185
191 151 185
191 141 183
The
stopping criteria using Lmax and Rmax are helpful in setting limits outside of which the search of new vertices is completed.
The algorithm produces—from a parcel database overlaying a (very) high spatial resolution image—a geographical database of vertices directly usable in a GIS, into the precision agriculture framework as for vinecrawler (Hall et al., 2003). Vertices are described by numerous specific quality indices—that can be mapped individually or in combination—allowing the photointerpreter to fit the model onto its specific needs. We
ARTICLE IN PRESS 910
J.M. Robbez-Masson, J.C. Folteˆte / Computers & Geosciences 31 (2005) 900–912
assessed the quality of the model and produced a typology of the most frequent errors, potentially helpful for qualifying the results and improving the recognition process. The prior knowledge required for processing a parcel reduces to a rather coarse estimate of the row spacing and of their main orientation. A set of tolerances expressed as distances or angles, have to be set by the user (the program proposes default values, whose relevance has been verified by our experiments). From a methodological point of view, the proposed approach lets to co-operate the radiometric and spatial dimensions of the information, through the building of a point graph whose geometrical and topological properties are exploited. Principle aims to be generic, potentially usable on other kinds of images than aerial photographs of crop areas: images characterised by squared, regular patterns (electronic parts, industrial devices, urban elevation gridsy) on which location and control of irregularities are searched for. 4.2. Drawbacks Performance is strongly linked to the quality of the preprocessing step: while an excellent recognition of missing vertices can be performed on an original image with unambiguous blobs, the quality of the original image influences the geometrical (location of new vertices) and thematic (omission and commission errors) qualities. Parcel borders (if not accurately digitised), connected blobs (inappropriate filtering and thresholding) and parcel shape (concavity of the perimeter, inclusions) may introduce consequent errors, and until now this process could not be used without postchecking made by a trained operator, using the set of quality indicators given by the program. 4.3. Perspectives Performance and potential of the tool are currently being checked on Languedocian vineyards in the framework of the GesSol program (Labrador et al., 2003; Llobera and Robbez-Masson, 2002), in which around 96 stands of 200 vines each have their vigour monitored for 2 years, and for which very high spatial resolution images have been taken with a slow drone (Asseline et al., 1999). The use of a near-infrared band might allow a best discrimination of vegetation, so might a wise combination of bands, as suggested by Pen˜a-Barraga´n et al. (2004). Hyperspectral imagery would be able to capture more accurate information about vine vigour and water balance, compensating for its currently coarse spatial resolution by a very detailed radiometry (Madeira Netto et al., 2004).
Improvement of the filtering step might follow several alternative or complementary avenues:
weighting
the filter kernel by the number of cells taken into account, in order to reduce the border artefacts; using a local normalisation of the grey levels, allowing a unique threshold value for the whole parcel; using morphological operators like ultimate eroded sets (Matheron, 1975; Serra, 1988).
A posteriori, an overlay of all vertices (either considered as present, or added by algorithm) on the initial image should reduce most of O2 and C3 errors, which cause by far the most frequent problems. First trials conducted onto the vineyard images seem to be very encouraging.
5. Conclusion A novel algorithm for identification and localisation of geometrical irregularities on images representing point grids has been implemented. It uses both radiometric and spatial dimension of images by the creation of a vertex graph whose geometry and topology are exploited. It allows the reconstruction of a complete graph on an iterative basis, by relying on the evolution of local and global quality indicators. Conditions and limitations of use, as sensitivity of the system, have been explored with relevant data sets. Results show that the approach could be used in the framework of precision agriculture on vineyard and orchard images, which are becoming increasingly and widely available. Potential applications include, of course, inventory and monitoring operations on vegetable biomass, but also on variables or processes made known by their variability (diseases and pests, soil or erosion problems).
Acknowledgements We would like to acknowledge receipts of Grants from French Ministe`re de l’E´cologie et du De´veloppement Durable (GesSol program no 1: environmental functions of soil), which permitted us to explore the ideas discussed in this paper. Many thanks to Philippe Lagacherie who managed the contract and gave us the opportunity for this work. Jean-Marc Robbez-Masson is grateful to Ecole Nationale Supe´rieure Agronomique de Montpellier (Agro.M) for funding a 3-month research visit, and to Universite´ de Besanc- on for welcoming him during this time. The image of olive groves has been downloaded from the MARS Project
ARTICLE IN PRESS J.M. Robbez-Masson, J.C. Folteˆte / Computers & Geosciences 31 (2005) 900–912
911
Fig. 7. Screen copy of program implemented. Central window shows processing: on labelled image (coloured blobs), reconstruction algorithm locates new vertices one by one—here five were added, according to operator’s instructions. These are entered into right window: initialisation of orientation and spacing parameters; stopping criterion (either maximal length Lmax, or a given number of new vertices); some progression indicators are given here (e.g. total number of vertices, including new ones), like in lower part of left window, where statistical parameters are iteratively updated.
web site,2 and we thank the Joint Research Centre of European Commission for the availability of their work. Vineyard image has been taken by Tom Wassenaar. Thanks to David Pleydell and to an anonymous reviewer for help with English.
Appendix A The software implemented for this work can be used in interactive and in batch mode. It uses raw or Idrisi images. Attributes of edges, locations and attributes of vertices are saved as tabulated text files. These are directly readable in a spreadsheet, a DBMS or a GIS as tables or point shapefiles. Each vertex is identified unambiguously by:
Two key fields—parcel and vertex ID#—in order to
create a single database of plants covering a whole administrative area. Its complete lineage: IDs of parents, detailed or global quality indicators, and the whole set of statistical indicators at the moment of its creation.
The statistical distribution of edge lengths and orientation is also saved. Fig. 7 shows a working session. Code has been written in Pascal for Windowss. The executable in French and English languages, a short help and some tutorial data sets are freely available from the authors,3 or from the IAMG server. 2 3
http://mars.jrc.it/marspac/olivine/olicount.htm. http://sol.ensam.inra.fr/Produits/Vineyard.
References Aitkenhead, M.J., Dalgetty, I.A., Mullins, C.E., McDonald, A.J.S., Strachan, N.J.C., 2003. Weed and crop discrimination using image analysis and artificial intelligence methods. Computers and Electronics in Agriculture 39 (3), 157–171. Asseline, J., De Noni, G., Chaume, R., 1999. Note sur la conception et l’utilisation d’un drone lent pour la te´le´de´tection rapproche´e (on the conception and use of a slow drone for close remote sensing). Photo Interpre´tation: Images Ae´riennes et Spatiales 37 (3–9), 42–47. Champagnol, F., 1984. Ele´ments de Physiologie de la Vigne et de Viticulture Ge´ne´rale. Dehan, Montpellier 351pp. Clark, P., Evans, F., 1954. Distance to nearest neighbor as a measure of spatial relationships in populations. Ecology 35, 445–453. Congalton, R.G., 1991. A review of assessing the accuracy of classifications of remotely sensed data. Remote Sensing of Environment 37 (1), 35–46. Hall, A., Lamb, D.W., Holzapfel, B., Louis, J., 2002. Optical remote sensing applications in viticulture—a review. Australian Journal of Grape and Wine Research 8 (1), 36–47. Hall, A., Louis, J., Lamb, D., 2003. Characterising and mapping vineyard canopy using high-spatial-resolution aerial multispectral images. Computers & Geosciences 29 (7), 813–822. Kay, S., Le´o, O., Peedell, S., 1998. Computer-assisted recognition of Olive trees in digital imagery. In: Proceedings of ISPRS conference. Budapest, pp.6. Klette, R., Zamperoni, P., 1996. Handbook of Image Processing Operators. Wiley, Chichester, UK 416pp. Labrador, M., Robbez-Masson, J.M., Asseline, J., 2003. Analyse des relations sol-vigne (analysis of soil-vine relationship). In: Proceedings of SAlon des Technologies de l’Image et du Son (SATIS), Paris. Lagacherie, P., Collin-Bellier, C., Goma-Fortin, N., 2001. Evaluation et analyse de la variabilite´ spatiale de la mortalite´ des ceps dans un vignoble languedocien a` partir
ARTICLE IN PRESS 912
J.M. Robbez-Masson, J.C. Folteˆte / Computers & Geosciences 31 (2005) 900–912
de photographies ae´riennes a` haute re´solution (Assessment and analysis of spatial variability of vine plant mortality from high-resolution aerial images). Journal International Science Vigne Vin 35 (3), 141–148. Lagacherie, P., et al., 2004. De´gradations physiques des sols de vigne et impacts sur la ressource en eau en milieu me´diterrane´en viticole. Ministe`re de l’ame´nagement du territoire et de l’environnement; Programme GESSOL, Montpellier 11pp. Llobera, P., Robbez-Masson, J.M., 2002. Contribution a` l’e´tude du de´pe´rissement du vignoble languedocien. Inventaire automatise´ de pieds de vigne manquants sur des photographies ae´riennes a` haute re´solution traite´es dans un SIG, et recherche de causes explicatives au de´pe´rissement (contributing to study withering of languedocian vineyards. Automated inventories of missing vine plants on highresolution aerial photographs in a GIS, and research of explanatory causes). In: Proceedings of Journe´es Nationales d ‘Etude des Sols (JNES), Orle´ans. Madeira Netto, J.S.d., Bedidi, A., Martins, E., RobbezMasson, J.M., 2004. Spectral reflectance as a tool to discriminate soil types in the la Peyne watershed (France). In: Proceedings of Global Workshop on Digital Soil Mapping (IUSS/Pedometrics), Montpellier. Marchant, J.A., 1996. Tracking of row structure in three crops using image analysis. Computers and Electronics in Agriculture 15 (2), 161–179. Masson, J., 2002. Report of the JRC activities on Olive GIS in the 5 Member States (FR, GR, I, SP, PO). European Commission—Unite´ surveillance de l’agriculture par te´le´de´tection (MARS Unit), Ispra 38pp. Matheron, G., 1975. Random Sets and Integral Geometry, 1st edn. Wiley, New York 288pp. Morrison, J.J., 1970. A link between cartographic theory and mapping practice: the nearest neighbour statistic. Geographical Review 60, 494–510. Onyango, C.M., Marchant, J.A., 2003. Segmentation of row crop plants from weeds using colour and morphology. Computers and Electronics in Agriculture 39 (3), 141–155. Pen˜a-Barraga´n, J.M., Jurado-Expo´sito, M., Lo´pez-Granados, F., Atenciano, S., Sa´nchez-de la Orden, M., Garcı´ a-Ferrer, A., Garcı´ a-Torres, L., 2004. Assessing land-use in olive groves from aerial photographs. Agriculture, Ecosystems & Environment 103 (1), 117–122.
Perez, A.J., Lopez, F., Benlloch, J.V., Christensen, S., 2000. Colour and shape analysis techniques for weed detection in cereal fields. Computers and Electronics in Agriculture 25 (3), 197–212. Ranchin, T., Naert, B., Albuisson, M., Boyer, G., Astrand, P., 2001. An automatic method for vine detection in airborne imagery using wavelet transform and multiresolution analysis. Photogrammetric Engineering and Remote Sensing 67 (1), 91–98. Rosenfeld, A., Kak, A.C., 1982. Digital Picture Processing, second ed. Academic Press, Orlando, FL 349pp. Serra, J., 1988. Image Analysis and Mathematical Morphology, II: Theoretical Advances. Academic Press, London 411pp. Sotak, G.E., Boyer, K.L., 1989. The Laplacian-of-Gaussian kernel: a formal analysis and design procedure for fast, accurate convolution and full-frame output. Computer Vision, Graphics and Image Processing 48, 147–189. Tillett, N.D., Hague, T., Miles, S.J., 2001. A field assessment of a potential method for weed and crop mapping on the basis of crop planting geometry. Computers and Electronics in Agriculture 32 (3), 229–246. Tisseyre, B., Ardoin, N., Sevila, F., 1999. Precision viticulture: precise location and vigour mapping aspects. In: Proceedings of Second European Conference on Precision Agriculture. Odense, Denmark, pp. 319–330. Tsiligirides, T.A., 1998. Remote sensing as a tool for agricultural statistics: a case study of area frame sampling methodology in Hellas. Computers and Electronics in Agriculture 20 (1), 45–77. Wassenaar, T., 2001. Reconnaissance des e´tats de surface du sol en milieu viticole me´diterrane´en par te´le´de´tection a` tre`s haute re´solution spatiale (recognition of soil surface features in languedocian mediterranean vineyards by very high spatial resolution remote sensing). Ph.D. Dissertation, INRA-ENSA.M, Montpellier, 228pp. Wassenaar, T., Baret, F., Robbez-Masson, J.M., Andrieux, P., 2001. Sunlit soil surface extraction from remotely sensed imagery of perenial, discontinuous crop areas: the case of Mediterranean vineyards. Agronomie: Agriculture and Environment 21, 235–245. Wassenaar, T., Robbez-Masson, J.M., Andrieux, P., Baret, F., 2002. Vineyard identification and description of spatial crop structure by per-field frequency analysis. International Journal of Remote Sensing 23 (17), 3311–3325.