
Ontology-based classification of remote sensing images using spectral rules

Samuel Andrés (1,*), Damien Arvor (1,2), Isabelle Mougenot (1), Thérèse Libourel (1), Laurent Durieux (1)

(1) UMR 228 Espace Dev (UM, UR, UG, UA, IRD), Maison de la Télédétection, 500 rue JF Breton, 34093 Montpellier Cedex 5, France, [email protected], [email protected], [email protected], [email protected]

(2) UMR LETG-Rennes CNRS 6554, Université Rennes 2, Place du Recteur Henri Le Moal, 35043 Rennes Cedex, [email protected]

Abstract

Earth Observation data is of great interest for a wide spectrum of scientific domain applications. Enhanced access to remote sensing images for "domain" experts thus represents a great advance, since it allows users to interpret remote sensing images based on their domain expertise. However, such an advantage can also turn into a major limitation if this knowledge is not formalized and is therefore difficult to share with, and to be understood by, other users. In this context, knowledge representation techniques such as ontologies should play a major role in the future of remote sensing applications. We implemented an ontology-based prototype to automatically classify Landsat images based on explicit spectral rules. The ontology is designed in a very modular way in order to achieve a generic and versatile representation of concepts we consider of utmost importance in remote sensing. The prototype was tested on four subsets of Landsat images and the results confirmed the potential of ontologies to formalize expert knowledge and classify remote sensing images.

Keywords: Ontology, Expert Knowledge, Remote Sensing, Image classification, Description Logics

1. Introduction

Earth Observation data is of great interest for a wide spectrum of domain applications such as crop and biodiversity monitoring, health-environment analysis, urban planning, etc. Remote sensing images are thus expected to play an increasing role in future research led by experts from various scientific domains. However, a gap has long existed between what these "domain" experts can expect from remote sensing data and what remote sensing experts can produce [1]. During the last decade, new image processing approaches have emerged and brought substantial advances to reduce the gap between domain and remote sensing experts. As an example, Geographic Object-Based Image Analysis (GEOBIA) is now considered as a new paradigm in remote sensing [2] since it allows users to classify images based on their domain expert knowledge. However, putting remote sensing images into the hands of domain experts can also enhance the subjectivity of the image interpretation process [3]. Thus, remote sensing needs to be accompanied by technological enhancements that allow management, aggregation and sharing of the knowledge of remote sensing and domain experts [4]. Knowledge management thus represents a major issue to be dealt with to advance remote sensing. Knowledge representation techniques such as ontologies appear to be especially promising. The use of ontologies in geosciences has been extensively discussed [5], with a special emphasis on geo-information retrieval, geo-processing, data sharing and geo-information integration [6, 7, 8]. With regard to remote sensing, Arvor et al. [4] put a special emphasis on the way ontologies may contribute to remote sensing image interpretation. A few notable seminal papers have been published on this issue. Kohli et al. [9] proposed a Generic Slum Ontology (GSO) to serve as a framework to identify slums in remote sensing images. Durand et al. [10] and Forestier et al. [11, 12] implemented ontology-based recognition methods to classify urban and then coastal areas based on a priori expert knowledge. More recently, Belgiu et al. [13] proposed a method to automatically integrate ontologies designed by domain experts in a GEOBIA process implemented with the eCognition software (Trimble, Sunnyvale, CA, USA). Aryal et al. [14] developed an Environmental Spatio-Temporal Ontology (ESTO) and tested its validity for land cover classification on WorldView-2 imagery. However, ontology-based remote sensing applications remain rare and limited to date.

First, the number of concepts in the ontologies (i.e. the number of classes to be identified in the images) as well as the number of objects to classify are usually restricted. Second, the ontologies are often coupled with image processing software at the classification step, i.e. to assign image objects to concepts of the ontologies. The reasoning potential of the ontologies is thus often unexplored. Hence new ontology-based studies are urgently needed to further reduce the semantic gap between domain and remote sensing experts. The objective of the present paper is to introduce a novel ontological approach for remote sensing image classification. We developed a prototype system through which remote sensing expert knowledge is formalized and analysed to guide the interpretation of satellite images. The added value of this system is twofold: i) the ontological approach is coupled with open source image processing software at the pre-processing step (Orfeo Toolbox, OTB, [15]) and ii) the system fully explores the potential of ontologies since the image classification process (i.e. the assignment of an image object to a semantic class) is performed thanks to a reasoner algorithm. For the purpose of this study, the ontology prototype was built upon the expert remote sensing knowledge expressed in Baraldi et al. [16], which consists in rule sets used to classify Landsat images. Our work thus mainly intended to formalize these rules in an ontology, i.e. to first define concepts and relations from the expert knowledge and then set the classification rules. In this regard, it is worth noting that the added value of an ontology-based approach when compared to the original rule-based approach is entirely methodological and not thematic. The originality of using ontologies lies in their ability to shift the paradigm of image analysis from a numerical to a symbolic approach. Whereas traditional implementations of classification rules integrate the knowledge directly in the algorithm (as is the case, for example, in the SpectralRuleBasedClassifier of Orfeo Toolbox [15]), our approach extracts the knowledge in order to consider it as an object in its own right. This makes it possible to better structure the knowledge used for remote sensing image interpretation and opens new perspectives. For example, it is possible 1) to modify the modular knowledge without changing the classification algorithm, 2) to handle the knowledge with dedicated management tools (e.g. Protégé [17]) and 3) to share and re-use the knowledge with other colleagues, potentially from other scientific domains, using common standards (e.g. Description Logics, Web Ontology Language). After briefly recalling some basics on ontologies (section 2), we detail the ontological components of the system (section 3).

In section 4, we set out the implementation of the prototype and present the results obtained on four subsets of Landsat images. Finally (section 5), we discuss our results and introduce new prospects and challenges for future developments.

2. Ontologies for knowledge management

An ontology is usually defined as a formal (machine-understandable), explicit (with all concepts and relations explicitly formalized) specification of a shared (agreed by a community) conceptualization [18]. A conceptualization is an abstract, simplified view of the world that we want to represent for a specific purpose. Two types of ontologies are usually considered, i.e. framework ontologies and domain ontologies. A framework (or foundation, upper or top-level) ontology contains high-level knowledge that is not designed for a specific scientific domain. For instance, the Basic Formal Ontology (BFO) is designed for use in supporting information retrieval, analysis and integration in scientific and other domains [19]. By contrast, a domain ontology contains knowledge that is specific to a scientific domain (e.g. remote sensing). Both ontologies must therefore be connected so that a domain ontology becomes an extension of a framework ontology. In the present study, we focused our work on the implementation of a domain ontology dedicated to the interpretation of remote sensing images. Domain ontologies are structured into two levels in order to distinguish the world of concrete representations from the world of concepts. Both levels put together therefore constitute a complete knowledge base. The first level, called ABox, is assertional and represents a given contingent situation in the world. The second level, called TBox, is terminological and represents the different kinds (types) of individuals by concepts. The TBox may refer to the part of an ontology, called the domain ontology, where experts conceptualize their knowledge in a specific scientific domain (e.g. remote sensing). There exist different paradigms for modeling ontologies. Two prominent approaches are Description Logics (DL) [20] and rule formalisms. The DL formalism is especially used in the Semantic Web and serves as a basis for OWL (Web Ontology Language), a very popular language designed to build ontologies, which benefits from different syntaxes for serialization [21]. An OWL ontology consists of a set of axioms (including concept, role and individual axioms) that capture the intended semantics of the domain of interest.

To some extent, knowledge representation rules may be considered as a form of axioms, so that rules can be modeled using OWL [22]. OWL can also be extended to incorporate rules, and DL axioms can be translated into rules. Consequently, the knowledge base can easily be shared with other scientists. Additionally, the ontology can be used to infer knowledge thanks to the ability of the DL formalism to make ontologies machine-understandable. Indeed, knowledge formalization based on logic rules allows new complex concepts to be built and implicit knowledge to be deduced. Knowledge representations and their logic rules can be handled by inference engines, usually called reasoners, to automatically infer knowledge, i.e. to make some implicit knowledge explicit. In the present work, we propose to use both ontology capabilities: (i) to share knowledge between conceptual expertise in remote sensing and image content, and (ii) to make some implicit knowledge explicit using inference engines, i.e. to assign pixels to semantic descriptors.
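To fix ideas, the TBox/ABox split and the role of the reasoner can be illustrated with a deliberately naive sketch: concept definitions (a toy "TBox") are written as predicates over feature values, individuals and their asserted facts form a toy "ABox", and "reasoning" makes the implicit concept memberships explicit. The concept and individual names below are hypothetical, and the logic is only a stand-in for the DL machinery (OWL, FaCT++) actually used in the paper.

# Toy illustration only (not the DL reasoner used in the paper): a "TBox" of
# concept definitions expressed as predicates, an "ABox" of individuals with
# asserted facts, and a naive step that makes implicit memberships explicit.
tbox = {
    "HighNDVIPixel": lambda facts: facts.get("ndvi", 0.0) > 0.7,
    "BrightPixel":   lambda facts: facts.get("bright", 0.0) > 0.6,
}

abox = {
    "pixel_1": {"ndvi": 0.82, "bright": 0.31},   # asserted facts
    "pixel_2": {"ndvi": 0.12, "bright": 0.75},
}

# "Reasoning": assign each individual to every concept whose definition it satisfies.
inferred = {ind: [c for c, defn in tbox.items() if defn(facts)]
            for ind, facts in abox.items()}
print(inferred)  # {'pixel_1': ['HighNDVIPixel'], 'pixel_2': ['BrightPixel']}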

3. Modular ontological approach

We distinguished three levels of knowledge, as proposed by Falomir et al. [23]: the reference conceptualization, the contextual knowledge and the image facts. These parts of knowledge concern the image structure representation, the remote sensing experience and the image content, respectively. Together, they draw a global model (fig. 1) with, (i) horizontally, the conceptual (concept description) and concrete (instance description) levels, i.e. the TBox and ABox respectively, and, (ii) vertically, the image and image interpretation domains. In this diagram, the bottom right box is the place for image interpretation [24], i.e. the ontological system calls upon a reasoner to infer implicit knowledge about the individuals (image pixels) of the ontology in order to assign them to concepts (spectral categories) described in the contextual knowledge. Individuals here refer to image pixels, but we call them "image objects" because our long-term objective is to integrate the prototype into a GEOBIA approach.

Figure 1. Structure of the knowledge base comprising two levels (conceptual and concrete) and two domains (image and image interpretation).

3.1. The reference conceptualization

The reference conceptualization can be viewed as a general model to describe image objects in remote sensing images. It captures a few top-level concepts shared between two packages: the image structure package introduces the ImageObject and ImageObjectFeature concepts, and the image processing package introduces the PseudoSpectralIndex and SpectralBand concepts (fig. 2). These concepts are part of the reference conceptualization because they refer to elementary concepts a remote sensing expert relies on to describe their own contextual knowledge. Concepts related to spectral bands, spectral indices or texture indices are used by experts to interpret a remote sensing image, but they are not defined by these experts. For example, whereas the definition of a vegetation type (e.g. forest) may vary according to the geographical area or the expert's knowledge, the concept of a spectral band is not expected to vary. The ImageObject concept allows image objects to be described according to their characteristics. These characteristics refer to the ImageObjectFeature concept, so that both concepts are linked by a hasFeature relation, e.g. ImageObject hasFeature ImageObjectFeature.

ImageObject ≡ ∃hasFeature.ImageObjectFeature ⊓ ∀hasFeature.ImageObjectFeature   (Axiom 1)

ImageObjectFeature ≡ ∃ofProcessing.(PseudoSpectralIndex ⊔ SpectralBand) ⊓ ∃numericalValue.xsd:decimal[operator X]   (Axiom 2)

where operator is <, >, <=, >=, =, != and X is a quantitative value

Listing 1: Concept equivalences for ImageObject and ImageObjectFeature.

A characteristic of an image object is the result of a processing task (e.g. an NDVI value after computing a spectral index), although the processing task itself is not computed by the ontology. Consequently, the ImageObjectFeature concept is related to concepts of the image processing package (the PseudoSpectralIndex and SpectralBand concepts) through an ofProcessing relation. Based on these concepts and relations, the description of any image object (in any type of remote sensing image) can be defined using Description Logics syntax as in listing 1. In axiom 2, the numericalValue allows the spectral rules to be included by considering the image object values.
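As a purely illustrative counterpart to Axioms 1 and 2, the same pattern can be sketched as a plain data structure: an image object carries a set of features, each feature pointing to the processing (spectral band or pseudo-spectral index) that produced it and to a numerical value. The Python names below are hypothetical; in the actual prototype these descriptions are OWL assertions.

from dataclasses import dataclass
from typing import List

# Illustrative mirror of the reference conceptualization (Axioms 1 and 2),
# not the authors' OWL/C++ implementation.
@dataclass
class ImageObjectFeature:
    of_processing: str      # e.g. "ndvi", "blue", "ratioTM1divTM2"
    numerical_value: float  # value produced by the (external) processing task

@dataclass
class ImageObject:
    has_feature: List[ImageObjectFeature]

pixel = ImageObject(has_feature=[
    ImageObjectFeature("ndvi", 0.82),
    ImageObjectFeature("nir", 0.41),
])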

3.2. The contextual knowledge

The contextual knowledge serves to represent the remote sensing expert knowledge. This knowledge depends on the objectives and the experience of the expert (or of the expert community) and must be coherent with the data to be interpreted. This is why it is called "contextual knowledge". It differs from the reference conceptualization by its semantics: it is not an "objective" representation of the image structure, but a "subjective" description of image interpretation rules. Thus, the contextual knowledge is an extension of the reference conceptualization that contains all the rules defined by an expert to interpret a remote sensing image.

In the present study, the remote sensing expert knowledge is derived from the method proposed by Baraldi et al. [16] to automatically interpret Landsat TM and ETM+ images. In this method, also called SIAM for Satellite Image Automatic Mapper, pixels are assigned to semantic spectral categories (up to 46 classes) on a per-pixel basis. It is worth mentioning that the spectral categories do not correspond to land cover classes but should rather be considered as semantic descriptions of spectral signatures. For example, a spectral category named Strong Vegetation with High Near Infrared Spectral Category (SVHNIR_SC) might correspond to different land cover classes, i.e. broadleaved deciduous forests, vegetated croplands or pastures.

There are two main reasons why we chose this method to implement our ontology-based prototype. First, it proposes an effective way to describe quantitative data (i.e. remote sensing images) with qualitative (semantic) information (i.e. spectral categories). Since the method is based on a semantic description of spectral signatures, it is particularly well suited to being converted into ontologies, where concepts are semantically described. Second, the definition of an ontology given in section 2 rests on three important terms, i.e. formal, explicit and shared. Being shared means that an ontology must represent relatively consensual knowledge in a given scientific domain. As the methodology is the result of an extensive review of methods applied to Landsat data, we consider that it represents consensual knowledge about Landsat-derived spectral signatures. Additionally, the method is based on explicitly defined spectral rules, as opposed to the supervised or unsupervised "black-box" classification algorithms used in remote sensing applications (Maximum Likelihood, Support Vector Machine, etc.). Finally, although the methodology is based on a shared and explicit conceptualization, it is not formal, i.e. it is not machine-understandable. Thus, we intend to formalize these rules with Description Logics as part of the contextual knowledge of the ontology-based prototype.

The method proposed by Baraldi et al. [16] is based on three spectral rule sets applied to pre-processed top-of-atmosphere calibrated Landsat bands (TM1, TM2, ..., TM7, where TM stands for Thematic Mapper) and eleven derived spectral indices (Bright, Vis, NIR, MIR1, MIR2, TIR, MIRTIR, NDSI, NDBBBI, NDVI, NDBSI; see acronyms in table 1). Thus, the contextual knowledge was split into four packages that refer to (i) the description of these input pre-processed images (e.g. TM1, NDVI, etc.) and (ii) the three rule sets of the methodology (named Feature Space Partition, Spectral Rules and Spectral Categories by Baraldi et al. [16]) (fig. 2).

3.2.1. Feature processing

We defined in the ontology all the spectral bands and spectral indices required to implement the spectral rule sets. Since they are the result of image processing tasks, we called this package "Feature processing" (although these processes are not computed by the ontology). First, the seven spectral bands of Landsat TM images were defined as instances of the SpectralBand concept and named blue, green, red, nir, mir1, tir and mir2 to refer to the TM1, TM2, ..., TM7 bands respectively.
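For reference, the band-to-instance naming just described can be written as a simple lookup; this is an illustrative snippet only (the actual instances live in the OWL ontology), making explicit that the thermal band TM6 corresponds to tir and TM7 to mir2.

# Landsat TM bands as named instances of SpectralBand (section 3.2.1); illustrative only.
SPECTRAL_BAND_INSTANCES = {
    "TM1": "blue", "TM2": "green", "TM3": "red", "TM4": "nir",
    "TM5": "mir1", "TM6": "tir", "TM7": "mir2",
}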

Figure 2. Concepts, relations and instances included in the ontology at the conceptual level (reference conceptualization and contextual knowledge).

Acronym    Complete name
TM         Thematic Mapper
ETM+       Enhanced Thematic Mapper Plus
OLI        Operational Land Imager
Vis        Visible
NIR        Near Infrared
MIR        Mean Infrared
TIR        Thermal Infrared
MIRTIR     Combination index between Mean Infrared and Thermal Infrared
NDSI       Normalized Difference Snow Index
NDBBBI     Normalized Difference Built-up areas and Barren land Index
NDVI       Normalized Difference Vegetation Index
NDBSI      Normalized Difference Bare Soil Index

Table 1: List of acronyms.

Second, we identified two sub-concepts of the PseudoSpectralIndex concept: SIAMindex and Ratio (fig. 2). The SIAMindex concept is used to include the eleven spectral indices initially computed by Baraldi et al. [16]. These indices are defined as eleven instances of the SIAMindex concept (named bright, vis, nir, mir1, mir2, tir, mirtir, ndsi, ndbbbi, ndvi and ndbsi). It is worth mentioning that the nir, mir1, mir2 and tir instances belong to both the SIAMindex and SpectralBand concepts. The Ratio concept was created to handle spectral rules such as "TM1 > TM2", which cannot be efficiently handled in the ontology. We translated such rules into ratio indices, i.e. TM1/TM2 > 1. All in all, thirty-five instances of the Ratio concept were required to implement all the spectral rules proposed in Baraldi et al. [16]. These ratio indices are named ratioTM1divTM2 (corresponding to TM1/TM2), ratioTM4divMaxTM123 (corresponding to TM4/max{TM1, TM2, TM3}), etc.

3.2.2. Feature Space Partition

The first rule set is applied to the eleven spectral indices and is intended to partition the feature space by assigning a categorical value (low, medium or high) to any pixel depending on its corresponding index value. For example, a high NDVI label (i.e. "HighNDVI") is assigned to any pixel whose Normalized Difference Vegetation Index (NDVI) value is higher than 0.70. To express such a rule in the ontology, the ImageObjectFeature concept is used to connect the ImageObject with the relevant characteristics, i.e. it links an interval of values (e.g. "> 0.7") with a characteristic (e.g. ndvi). New concepts referring to the different possible feature partitions (e.g. the HighNDVI, MediumNDVI, LowNDVI, HighBright, MediumBright, LowBright concepts, etc.) are introduced and described in the Description Logics formalism following the same structure as in listing 2.

HighNDVI ≡ ImageObjectFeature ⊓ ∃ofProcessing.{ndvi} ⊓ ∃numericalValue.xsd:decimal[>0.7]   (Axiom 3)

Listing 2: Concept equivalence for HighNDVI.
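Outside the ontology, this partition step is plain thresholding of the index images. The sketch below only illustrates that numerical counterpart: the 0.7 bound for HighNDVI comes from the text, whereas the low/medium boundary is a hypothetical placeholder (the actual thresholds are those of Baraldi et al. [16]).

import numpy as np

# Assign a categorical label (low / medium / high) to every pixel of an NDVI image.
# high_bound (> 0.7) is taken from the text; low_medium_bound is a placeholder.
def partition_ndvi(ndvi, low_medium_bound=0.3, high_bound=0.7):
    labels = np.full(ndvi.shape, "MediumNDVI", dtype=object)
    labels[ndvi <= low_medium_bound] = "LowNDVI"
    labels[ndvi > high_bound] = "HighNDVI"
    return labels

print(partition_ndvi(np.array([0.1, 0.5, 0.9])))  # ['LowNDVI' 'MediumNDVI' 'HighNDVI']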

3.2.3. Spectral Rules

The second rule set refers to logical expressions applied to discriminate fourteen spectral rules (as Baraldi et al. [16] call them), such as the thick clouds, vegetation, water or shadow spectral rules. These spectral rules are only based on the calibrated Landsat spectral bands. As an example, the Vegetation Spectral Rule label (VSR) is defined as in equation 1:

VSR = (TM2 >= 0.5 * TM1) and (TM2 >= 0.7 * TM3) and (TM3 < 0.7 * TM4) and
      (TM4 > max{TM1, TM2, TM3}) and (TM5 < 0.7 * TM4) and
      (TM5 >= 0.7 * TM3) and (TM7 < 0.7 * TM5)     (1)

where TMi corresponds to the pixel value of the i-th band of calibrated Landsat TM data.

The "Spectral Rules" are included in the ontology through the ImageObject concept. This concept was extended to include fourteen sub-concepts (e.g. VSR) corresponding to the fourteen identified classes. Each class is described in the Description Logics formalism following the same structure as in listing 3.

VSR ≡ ImageObject
  ⊓ ∃hasFeature.(∃ofProcessing.{ratioTM2divTM1} ⊓ ∃numericalValue.xsd:decimal[>=0.5])
  ⊓ ∃hasFeature.(∃ofProcessing.{ratioTM2divTM3} ⊓ ∃numericalValue.xsd:decimal[>=0.7])
  ⊓ ∃hasFeature.(∃ofProcessing.{ratioTM3divTM4} ⊓ ∃numericalValue.xsd:decimal[<0.7])
  ⊓ ∃hasFeature.(∃ofProcessing.{ratioTM4divMaxTM123} ⊓ ∃numericalValue.xsd:decimal[>1])
  ⊓ ∃hasFeature.(∃ofProcessing.{ratioTM5divTM3} ⊓ ∃numericalValue.xsd:decimal[>=0.7])
  ⊓ ∃hasFeature.(∃ofProcessing.{ratioTM5divTM4} ⊓ ∃numericalValue.xsd:decimal[<0.7])
  ⊓ ∃hasFeature.(∃ofProcessing.{ratioTM7divTM5} ⊓ ∃numericalValue.xsd:decimal[<0.7])   (Axiom 4)

Listing 3: An example of spectral rule definition corresponding to equation 1.
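Numerically, Axiom 4 mirrors equation (1) rewritten through the ratio indices of section 3.2.1 (assuming strictly positive reflectance values). The sketch below shows that rewritten rule as array operations; it is an illustration, not the OTB or owl_cpp code of the prototype, and the tm dictionary of band arrays is a hypothetical input.

import numpy as np

def vegetation_spectral_rule(tm):
    # Equation (1) expressed with ratio indices; tm maps band names ('TM1'...'TM7')
    # to arrays of top-of-atmosphere reflectance (assumed strictly positive).
    max_tm123 = np.maximum(np.maximum(tm["TM1"], tm["TM2"]), tm["TM3"])
    return ((tm["TM2"] / tm["TM1"] >= 0.5) &
            (tm["TM2"] / tm["TM3"] >= 0.7) &
            (tm["TM3"] / tm["TM4"] < 0.7) &
            (tm["TM4"] / max_tm123 > 1.0) &
            (tm["TM5"] / tm["TM4"] < 0.7) &
            (tm["TM5"] / tm["TM3"] >= 0.7) &
            (tm["TM7"] / tm["TM5"] < 0.7))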

3.2.4. Spectral Categories

Finally, the third rule set identifies the final spectral categories based on the combination of the linguistic labels resulting from the Feature Space Partition and Spectral Rules rule sets. For instance, the Strong Vegetation with High Near Infrared Spectral Category (SVHNIR_SC) is detected when the following rules are verified (eq. 2):

SVHNIR_SC = VSR and HighNDVI and HighNIR and not (HighMIR1 or HighMIR2 or HighNDBSI)     (2)

where VSR is described in equation 1 and HighNDVI, HighMIR1, HighMIR2, HighNDBSI and HighNIR refer to high pixel values (with the thresholds introduced in Baraldi et al. [16]) measured for different spectral indices, namely NDVI, MIR1, MIR2, NDBSI and NIR respectively. As for the "Spectral Rules", the ImageObject concept is extended to include 46 sub-concepts corresponding to the descriptions of the 46 spectral categories, as in listing 4. It is noteworthy that many rules include a negative condition (for instance, HighNDVI "and not" HighMIR1, as in equation 2). We dealt with such conditions by considering the inverse rule (HighNDVI and (LowMIR1 or MediumMIR1)).

SVHNIR_SC ≡ VSR ⊓ ∃hasFeature.HighNDVI ⊓ ∃hasFeature.HighNIR
  ⊓ ∃hasFeature.(LowMIR1 ⊔ MediumMIR1)
  ⊓ ∃hasFeature.(LowNDBSI ⊔ MediumNDBSI)
  ⊓ ∃hasFeature.(LowMIR2 ⊔ MediumMIR2)   (Axiom 5)

Listing 4: An example of spectral category definition corresponding to equation 2.
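As a numerical counterpart to Axiom 5 (again an illustration only, with hypothetical variable names), the category combines the VSR mask with the partition labels, the negated conditions being replaced by their inverse "Low or Medium" labels as described above.

import numpy as np

def svhnir_category(vsr, labels):
    # vsr: boolean array from the VSR spectral rule; labels: dict mapping an index
    # name (e.g. 'NDVI') to the array of partition labels computed for that index.
    return (vsr &
            (labels["NDVI"] == "HighNDVI") &
            (labels["NIR"] == "HighNIR") &
            np.isin(labels["MIR1"], ["LowMIR1", "MediumMIR1"]) &
            np.isin(labels["NDBSI"], ["LowNDBSI", "MediumNDBSI"]) &
            np.isin(labels["MIR2"], ["LowMIR2", "MediumMIR2"]))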

3.3. The image facts

While both the reference conceptualization and the contextual knowledge constitute the TBox of the ontology, the image facts refer to contingent knowledge stored in the ABox. They correspond to instances that are automatically computed to provide semantic descriptions of image objects based on concepts from the reference conceptualization, but regardless of any interpretation related to the contextual knowledge. For instance, an image object in the Landsat image is described based on its characteristics, i.e. its spectral band and spectral index values. The description is written in the Description Logics formalism using the concepts and instances already introduced in the reference conceptualization and the contextual knowledge.
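Concretely, populating the ABox amounts to turning every pixel into an individual and asserting its feature values with the hasFeature, ofProcessing and numericalValue constructs of the reference conceptualization. The sketch below emits such assertions as generic triples for one pixel; the identifiers and serialization are hypothetical, the actual prototype building these assertions with owl_cpp from OTB outputs.

# Generate illustrative ABox assertions for one pixel (identifiers are hypothetical).
def pixel_assertions(pixel_id, feature_values):
    # feature_values maps a processing name (band, index or ratio) to its value.
    triples = [(pixel_id, "rdf:type", ":ImageObject")]
    for i, (name, value) in enumerate(feature_values.items()):
        feature_id = f"{pixel_id}_feature_{i}"
        triples += [(pixel_id, ":hasFeature", feature_id),
                    (feature_id, "rdf:type", ":ImageObjectFeature"),
                    (feature_id, ":ofProcessing", f":{name}"),
                    (feature_id, ":numericalValue", value)]
    return triples

print(pixel_assertions(":pixel_42", {"ndvi": 0.82, "ratioTM1divTM2": 1.1}))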

4. Implementation and results

The reference conceptualization, the contextual knowledge and the image facts were implemented in order to form a knowledge base to be analyzed by a reasoner program so as to infer new knowledge, i.e. to assign pixels to spectral categories (fig. 1). In this section, we introduce the implementation structure and the first results.

4.1. Implementation structure

The implementation involves two distinct steps: one is ontological (building the TBox) while the other is procedural (populating the ABox with image facts) [24]. At the ontological step, we implemented the reference conceptualization and the contextual knowledge using Protégé [17], a free ontology editor framework. At the procedural step, we automatically computed the image facts to be integrated as instances of the reference conceptualization. The image facts provide a quantitative description of the image content in order to be as objective as possible. The image processing tasks leading to the image facts were computed with the Orfeo Toolbox (OTB) [15].

Landsat images were calibrated to top-of-atmosphere reflectance and used to compute the eleven spectral indices and thirty-five ratio indices required to apply the three rule sets. We then used the owl_cpp library [25] to integrate both approaches, i.e. to combine OWL-DL ontologies with OTB. This way, all image pixels are semantically described with Description Logics using the concepts of the ontology. Finally, the knowledge base is handled by a reasoner based on Description Logics (FaCT++ [26]). This reasoner makes the instantiation links to the most specific concepts defined in the contextual knowledge explicit, i.e. instances of image objects are assigned to any concept of the ontology whose definition matches the object's description.

4.2. Results

Whereas data-driven approaches can easily be validated by computing statistical indices, the validation of knowledge-driven approaches is challenging. Indeed, the contribution of the ontology-based prototype with regard to a traditional rule-based classification algorithm is not measurable, since both approaches lead to the same results. Additionally, it is not possible to compare our results with another ontology-based classification, for three reasons. First, there is only one ontology of the classification rules of Baraldi et al. [16] to date (the one introduced in this paper). Second, if another ontology of these classification rules existed, it would still lead to similar maps since the rules would be the same. Third, if we compared our results with maps produced by another ontology-based approach, we would actually compare the quality of the knowledge put into the ontology (i.e. the knowledge of Baraldi et al. [16] versus the knowledge of someone else), not the quality of the ontology-based prototype. For these reasons, the validation of ontology-based approaches remains an open issue. As a consequence, we did not use ground data to validate our approach, as is usually the case in remote sensing applications. Nonetheless, the accuracy of the classification has already been widely discussed and ranges from 81% to 98.2% depending on the classes to be discriminated [16]. In the present study, we tested our approach on four subsets of Landsat images, including three Landsat 5 TM images and one Landsat 8 OLI image (fig. 3). The tests were only performed on small image subsets (300 x 300 pixels) due to operational considerations. We then compared the output of the ontology-based prototype with maps obtained by computing the spectral rules using the R software [27].

Both results were entirely similar and thus confirmed the relevance of the symbolic ontology-based approach, whereby pixels are classified by a reasoner.
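The comparison itself is straightforward once both classifications are rasterized to the same grid; a minimal sketch follows (the array names are hypothetical placeholders for the two 300 x 300 label maps, one from the reasoner and one from the direct rule implementation).

import numpy as np

def compare_maps(ontology_labels, rule_based_labels):
    # Fraction of pixels on which the two classifications agree (1.0 = identical).
    return float(np.mean(ontology_labels == rule_based_labels))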

5. Discussion and perspectives

Although our ontological approach has proven to be valid, numerous conceptual and technical issues deserve to be discussed in order to raise new prospects and identify the next challenges to be addressed.

5.1. Conceptual and technical issues

The main technical issue refers to operational considerations. Our implementation is based on the FaCT++ reasoner and performs some very computationally expensive operations, such as realizing the very large ABox by computing the types of all individuals, i.e. pixels. Indeed, the reasoning process applied to assign spectral categories to pixels is quite slow due to the large number of concepts and relations combined with the large number of instances to be dealt with (i.e. 90,000 pixels). For example, the hasFeature relation needs to be instantiated for each feature of each image object. The processing time required to reason on an ontology grows exponentially with its complexity, i.e. the complexity of the concept descriptions, the relationships between concepts and the number of instances to be analyzed. In order to overcome this problem, we implemented an alternative methodology, which consists in reasoning on one instance (e.g. one pixel) at a time in order to ensure a linear relation between the processing time and the number of instances to be classified (fig. 4). Although this approach resulted in significant time savings, our prototype system still falls short of what real production applications require when compared with traditional image processing software. Yet we consider that it remains relevant for future applications, and especially for GEOBIA, since the reasoning operation would be carried out on a restricted number of image objects resulting from a segmentation process. Furthermore, since ABox reasoning affects the performance of the system in drastic ways, we will investigate a range of optimization mechanisms with a particular focus on querying large ABoxes. We propose a few courses of action for scaling up reasoning capabilities. First, we will store the content of the ABox in a triple store in order to efficiently handle the large set of instances; the RDF (Resource Description Framework) store may additionally store all entailed triples explicitly.

Figure 3. Ontology-based classifications of four subsets of Landsat images. The complete definitions of the legend acronyms are in Baraldi et al. [16].

Figure 4. Processing time depending on the number of pixels to be classified. The red line is obtained by reasoning on one instance (e.g. one pixel) at a time. The blue line is obtained by reasoning on the entire knowledge base. In this example, only three concepts are classified in order to allow comparison of both approaches.

Second, we will focus on the OWL 2 DL fragments based on a predefined set of entailment rules (e.g. OWL 2 RL), which may make better use of materialization techniques. Third, we will also explore the capabilities of a scalable distributed system to support efficient materialization.

5.2. Prospects

To date, the prototype system performs a pixel-based classification applied to Landsat data. This was an important step to implement and test the architecture of the system. However, the prototype actually fits into a GEOBIA approach. GEOBIA relies on two main steps: i) the segmentation step, in order to delineate "homogeneous" image objects, and ii) the classification step, to assign image objects to classes [28], usually based on a trial-and-error approach.

A major asset of GEOBIA lies in its ability to allow users (e.g. domain experts) to perform both steps based on their expert knowledge in a given scientific domain (ecology, agronomy, urban planning, etc.). However, although GEOBIA's ability to put remote sensing images into the hands of domain experts represents a great advance, it can also turn into a major limitation since it enhances the subjectivity of the image interpretation process [3]. In this regard, it appears relevant to formalize the knowledge used to interpret images so that it can easily be shared with and understood by other users. One main advantage of the object-based approach is that it allows a large variety of image object features to be analyzed. Liu et al. [29] consider three levels of features that can be derived from a segmented image: i) Level 1 features are properties of a single image object (e.g. radiometry, texture, geometry), ii) Level 2 features focus on relations between two objects (e.g. spatial relations) and iii) Level 3 features are spatial patterns in which more than two objects are involved (e.g. aligned objects). Whereas the integration of Level 1 features in the prototype system is quite straightforward (new concepts such as Texture and Geometry can be included as sub-classes of the ImageObjectFeature concept), including Level 2 and Level 3 features in the ontology is quite challenging. The main problems with spatial and temporal relations come from the fact that i) experts usually describe them with vague terms (e.g. "close to") which are difficult to formalize, and ii) they can be modeled in ontologies either as relationships between objects or as concepts, the latter allowing a richer description (with distance and orientation information, for instance) but increasing the complexity of the reasoning step. Some recent papers on the consideration of spatial relations in GEOBIA may serve as a guide to overcome such issues [14, 30, 31]. Furthermore, we consider that the prototype system is still incomplete since it assigns image objects to spectral categories, which are not directly useful for domain experts. Thus, we should now connect remote sensing knowledge with domain expert knowledge to derive land cover/land use classes or ecological habitats. In this regard, we are currently working on domain ontologies to describe land cover classes and ecological habitats according to the Land Cover Classification System (LCCS; [32]) and the General Habitat Categories (GHC; [33]), respectively. Finally, our system falls within a top-down approach where a domain ontology representing expert knowledge is first produced and then analysed to interpret the data. Such an approach has long been supported by the ontological community because it is clearly knowledge-driven.

However, it has also been criticized because of the difficulty of formalizing expert knowledge and reaching an ontological commitment across a scientific community. A bottom-up approach may thus be preferred to reflect the diversity of viewpoints in a domain and the heterogeneity of data [34]. We suggest that the process trees built by GEOBIA experts using dedicated software (e.g. eCognition) implicitly contain expert knowledge and should thus be converted into ontologies, creating new knowledge bases that would make up a large body of information to be analyzed thanks to ontological capacities (ontology mapping, reasoning, etc.). In this regard, the reference conceptualization proposed in the present paper could serve as a framework to describe these process trees.

6. Conclusion

We introduced an ontology-based prototype system dedicated to the classification of remote sensing images. The system was tested for the pixel-based classification of Landsat images based on explicitly defined spectral rules. The ontology is used to formalize expert knowledge in a domain of interest and to guide the image interpretation process thanks to the FaCT++ reasoner. Our results confirmed the feasibility of the approach but also highlighted major limitations, especially with regard to the required processing time. Future work on implementing the system within a GEOBIA approach should at least partially overcome this issue, since the segmentation step drastically reduces the number of objects to be processed and consequently the processing time.

Acknowledgements

This work was conducted as part of (i) the PO FEDER GUYANE 2007-2013 program in the frame of the CARTAM-SAT project; (ii) the European Union FP7 BIO SOS project (BIOdiversity Multi-Source Monitoring System: from Space To Species; grant agreement 263435) and (iii) the GEOSUD project (ANR-10-EQPX-20) of the "Investissements d'Avenir" program managed by the French National Research Agency.

References

[1] V. Herbreteau, G. Salem, M. Souris, J.-P. Hugot, J.-P. Gonzalez, Thirty years of use and improvement of remote sensing, applied to epidemiology: From early promises to lasting frustration, Health & Place 13 (2007) 400–403.

[2] T. Blaschke, G. J. Hay, M. Kelly, S. Lang, P. Hofmann, E. Addink, R. Queiroz Feitosa, F. van der Meer, H. van der Werff, F. van Coillie, D. Tiede, Geographic Object-Based Image Analysis - Towards a new paradigm, ISPRS Journal of Photogrammetry and Remote Sensing 87 (2014) 180–191.

[3] M. Belgiu, L. Dragut, J. Strobl, Quantitative evaluation of variations in rule-based classifications of land cover in urban neighbourhoods using WorldView-2 imagery, ISPRS Journal of Photogrammetry and Remote Sensing 87 (2014) 205–215.

[4] D. Arvor, L. Durieux, S. Andrés, M.-A. Laporte, Advances in Geographic Object-Based Image Analysis with ontologies: A review of main contributions and limitations from a remote sensing perspective, ISPRS Journal of Photogrammetry and Remote Sensing 82 (2013) 125–137.

[5] F. Reitsma, J. Laxton, S. Ballard, W. Kuhn, A. Abdelmoty, Semantics, ontologies and eScience for the geosciences, Computers & Geosciences 35 (2009) 706–709.

[6] A. Buccella, A. Cechich, P. Fillottrani, Ontology-driven geographic information integration: A survey of current approaches, Computers & Geosciences 35 (2009) 710–723.

[7] H. Pundt, Y. Bishr, Domain ontologies for data sharing - an example from environmental monitoring using field GIS, Computers & Geosciences 28 (2002) 95–102.

[8] U. Visser, H. Stuckenschmidt, G. Schuster, T. Vögele, Ontologies for geographic information processing, Computers & Geosciences 28 (2002) 103–117.

[9] D. Kohli, R. Sliuzas, N. Kerle, A. Stein, An ontology of slums for image-based classification, Computers, Environment and Urban Systems 36 (2012) 154–163.

[10] N. Durand, S. Derivaux, G. Forestier, C. Wemmert, P. Gançarski, O. Boussaid, A. Puissant, Ontology-Based Object Recognition for Remote Sensing Image Interpretation, IEEE, 2007, pp. 472–479.

[11] G. Forestier, A. Puissant, C. Wemmert, P. Gançarski, Knowledge-based region labeling for remote sensing image interpretation, Computers, Environment and Urban Systems 36 (2012) 470–480.

[12] G. Forestier, C. Wemmert, A. Puissant, Coastal image interpretation using background knowledge and semantics, Computers & Geosciences 54 (2013) 88–96.

[13] M. Belgiu, B. Hofer, P. Hofmann, Coupling formalized knowledge bases with object-based image analysis, Remote Sensing Letters 5 (2014) 530–538.

[14] J. Aryal, A. Morshed, R. Dutta, Land cover class extraction in GEOBIA using environmental spatial temporal ontology, South-Eastern European Journal of Earth Observation and Geomatics, Special Issue, Thessaloniki, Greece, 2014, pp. 429–434.

[15] J. Inglada, E. Christophe, The Orfeo Toolbox remote sensing image processing software, IEEE, 2009, pp. IV-733–IV-736.

[16] A. Baraldi, V. Puzzolo, P. Blonda, L. Bruzzone, C. Tarantino, Automatic Spectral Rule-Based Preliminary Mapping of Calibrated Landsat TM and ETM+ Images, IEEE Transactions on Geoscience and Remote Sensing 44 (2006) 2563–2586.

[17] The Protégé ontology editor and knowledge acquisition system, 2015. http://protege.stanford.edu/.

[18] T. R. Gruber, Toward principles for the design of ontologies used for knowledge sharing?, International Journal of Human-Computer Studies 43 (1995) 907–928.

[19] K. Munn, B. Smith, Applied ontology: an introduction, Ontos Verlag; [distributed in] North and South America by Transaction Books, Frankfurt; Piscataway, NJ, 2008.

[20] F. Baader, D. Calvanese, D. McGuinness, D. Nardi, P. Patel-Schneider, Description Logic Handbook, 2nd Edition, Cambridge University Press, 2007.

[21] B. C. Grau, I. Horrocks, B. Motik, B. Parsia, P. Patel-Schneider, U. Sattler, OWL 2: The next step for OWL, Web Semantics: Science, Services and Agents on the World Wide Web 6 (2008) 309–322.

[22] A. Krisnadhi, F. Maier, P. Hitzler, OWL and Rules, in: A. Polleres, C. d'Amato, M. Arenas, S. Handschuh, P. Kroner, S. Ossowski, P. Patel-Schneider (Eds.), Reasoning Web. Semantic Technologies for the Web of Data, volume 6848, Springer Berlin Heidelberg, Berlin, Heidelberg, 2011, pp. 382–415.

[23] Z. Falomir, E. Jiménez-Ruiz, M. T. Escrig, L. Museros, Describing Images Using Qualitative Models and Description Logics, Spatial Cognition & Computation 11 (2011) 45–74.

[24] S. Andrés, Les ontologies dans les images satellitaires, interprétation sémantique des images, Ph.D. thesis, Université Montpellier 2, Montpellier, 2013.

[25] M. K. Levin, A. Ruttenberg, A. M. Masci, L. G. Cowell, owl_cpp, a C++ Library for Working with OWL Ontologies, volume 833, CEUR Workshop Proceedings, Buffalo, NY, USA, 2011, pp. 255–257.

[26] D. Tsarkov, I. Horrocks, FaCT++ Description Logic Reasoner: System Description, in: U. Furbach, N. Shankar (Eds.), Automated Reasoning, volume 4130, Springer Berlin Heidelberg, Berlin, Heidelberg, 2006, pp. 292–297.

[27] R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2015.

[28] S. Lang, Object-based image analysis for remote sensing applications: modeling reality - dealing with complexity, in: T. Blaschke, S. Lang, G. J. Hay (Eds.), Object-Based Image Analysis, Springer Berlin Heidelberg, Berlin, Heidelberg, 2008, pp. 3–27.

[29] Y. Liu, Q. Guo, M. Kelly, A framework of region-based spatial relations for non-overlapping features and its application in object based image analysis, ISPRS Journal of Photogrammetry and Remote Sensing 63 (2008) 461–475.

[30] R. Oliva-Santos, F. Maciá-Pérez, E. Garea-Llano, Ontology-based topological representation of remote-sensing images, International Journal of Remote Sensing 35 (2014) 16–28.

[31] C. Pierkot, S. Andrés, J. F. Faure, F. Seyler, Formalizing spatio-temporal knowledge in remote sensing applications to improve image interpretation, Journal of Spatial Information Science (2013).

[32] A. Di Gregorio, Food and Agriculture Organization of the United Nations, United Nations Environment Programme, Land cover classification system: classification concepts and user manual: LCCS, number 8 in Environment and Natural Resources Series, Food and Agriculture Organization of the United Nations, Rome, software version 2 edition, 2005.

[33] R. G. H. Bunce, M. J. Metzger, R. H. G. Jongman, J. Brandt, G. de Blust, R. Elena-Rossello, G. B. Groom, L. Halada, G. Hofer, D. C. Howard, P. Kovář, C. A. Mücher, E. Padoa-Schioppa, D. Paelinx, A. Palo, M. Perez-Soba, I. L. Ramos, P. Roche, H. Skånes, T. Wrbka, A standardized procedure for surveillance and monitoring European habitats and provision of spatial data, Landscape Ecology 23 (2008) 11–25.

[34] K. Janowicz, Observation-Driven Geo-Ontology Engineering, Transactions in GIS 16 (2012) 351–374.