JID:FSS AID:7762 /FLA
[m3SC+; v1.304; Prn:8/11/2019; 11:03] P.1 (1-17)
Available online at www.sciencedirect.com
ScienceDirect
Fuzzy Sets and Systems ••• (••••) •••–•••
www.elsevier.com/locate/fss
A Munsell colour-based approach for soil classification using Fuzzy Logic and Artificial Neural Networks
M.C. Pegalajar a, L.G.B. Ruiz a, M. Sánchez-Marañón b, L. Mansilla a
a Department of Computer Science and Artificial Intelligence, University of Granada, Spain
b Department of Soil Science and Agricultural Chemistry, University of Granada, Spain
Received 12 March 2019; received in revised form 30 October 2019; accepted 4 November 2019
Abstract
Munsell soil-colour charts are widely used for soil classification. These charts contain 238 standardised colours in small rectangular chips arranged in seven charts and encoded in the Munsell system. Each chart uses three coordinates well correlated with the visual colour attributes: hue, value and chroma. The colour of a soil sample is commonly estimated by visual comparison between the actual soil colour and the Munsell chips, looking for the closest one and taking its Munsell notation. However, the visual determination of soil colour with Munsell charts is a difficult task, because matching the colour of a soil sample with a single standard Munsell chip depends on the subjectivity of the observer. For this reason, to avoid misclassification caused by subjective matching, we propose an intelligent method that uses artificial neural networks and fuzzy logic to provide the Munsell chips closest to the unknown colour of a soil sample.
© 2019 Elsevier B.V. All rights reserved.
Keywords: Fuzzy Logic; Artificial Neural Networks; Colour matching; Munsell soil-colour charts
1. Introduction
Sight is one of the most important senses used in science. It allows us to understand and interpret the surrounding environment, and to perceive numerous forms, shapes, distances and sizes, as well as other environmental qualities such as textures, materials and colour. All of these features are essential to understand reality. Among them, colour is worthy of mention because of its particular importance for interpreting our surroundings. Colour is commonly considered an inherent property of any object. Nevertheless, in reality, it is a sensation, similar to hunger or fatigue. The perception of a specific colour is caused by many factors, for instance, the light, the surface of the observed object or the interpretation of the brain in response to light stimuli. Like other
E-mail addresses:
[email protected] (M.C. Pegalajar),
[email protected] (L.G.B. Ruiz),
[email protected] (M. Sánchez-Marañón),
[email protected] (L. Mansilla). https://doi.org/10.1016/j.fss.2019.11.002 0165-0114/© 2019 Elsevier B.V. All rights reserved.
M.C. Pegalajar et al. / Fuzzy Sets and Systems ••• (••••) •••–•••
Fig. 1. Simplified illustrative example of the Munsell Soil-Colour Charts and the organization of the chips on the different cards. (For interpretation of the colours in the figure(s), the reader is referred to the web version of this article.)
sensations, colour can be measured and quantified [1]. Moreover, colour is an attribute from which much information can be extracted about particular characteristics of a material, such as its age, state, structure and chemistry. For this reason, colour is commonly used in diverse industries, e.g., the textile industry [2], the food industry [3], the pharmaceutical industry [4] or the construction industry [5], not to mention its great artistic importance [6].

Pedology, or soil science, is one of the scientific disciplines in which colour is an essential feature for observing and distinguishing specific soil properties. An expert can draw conclusions about soil composition, organic-matter content, weathering, topography and fertility-related soil features from colour properties [7]. A range of standard soil colours is gathered in the Munsell Soil-Colour Charts (MSCC), and soil scientists routinely employ these charts to determine soil colour by visual comparison with the standardised colours. These are small coloured rectangular papers encoded in the Munsell system, usually called colour chips. The Munsell system was proposed for soil colour measurement in the 1920s and has become a worldwide standard for measuring soil colour [8]. Colour charts can help to solve a variety of problems.

Munsell charts were devised as a standard set of coloured cards arranged by shade, which defines the first coordinate of this colour space: hue. Each hue card groups a set of colours organized by lightness (value) and intensity/saturation (chroma) in the vertical and horizontal directions, respectively. Thus, the colour space is determined by three variables: hue, value and chroma (HVC). Fig. 1 illustrates this idea in a simplified graph. The charts are organized according to the hue (H), i.e., the shade of colour.
In the Munsell system this property is divided into five principal hues: Red (R), Yellow (Y), Green (G), Blue (B) and Purple (P), along with five intermediate hues halfway between adjacent principal hues: Yellow-Red (YR), Green-Yellow (GY), Blue-Green (BG), Purple-Blue (PB) and Red-Purple (RP) [6,9]. In the same way, the V coordinate (value) represents the degree of brightness, that is to say, the visual perception of lightness or darkness. The last parameter, chroma (C), stands for the saturation or intensity of the actual colour. In brief, according to the Munsell system, a colour can be defined by three criteria: HVC. By specifying these three parameters, a colour can be located in a 3D space [10]. In the MSCC, H runs along the applicate (z) axis, V is placed on the ordinate axis and C grows along the abscissa axis. As illustrated in Fig. 1, H would be the ‘page number’ of the book, distinguished by a specific colour, and the other two parameters, V and C, would define the corresponding chip arranged on the card.

The Munsell colour system has been adopted to solve a variety of problems. Consequently, there are diverse methodologies and a body of research literature related to identifying colour using these charts. An interesting and ingenious study was presented by Caves et al. [11], who used Munsell colours to create different stimuli and thereby demonstrated the categorical perception of signal-based colouration in a songbird. Another related work was recently published by Imbery et al. [10], who used a Munsell-based test to predict dental shade matching and value discernment in first-year dental students. Although several application areas have been mentioned, the most common use of the Munsell system is in soil science or pedology. This research area gathers plenty of literature associated with the identification and quantification of soil samples.
For example, Mikhailova et al. [12] used an inexpensive colour sensor to develop soil organic carbon and total nitrogen prediction models in Russia. Aitkenhead et al. [13] proposed a Munsell-based method to estimate topsoil organic matter from field observations using environmental factors from the National Soil Inventory of Scotland.

The primary motivation for proposing this methodology is that there are not many studies on soil colour classification with Intelligent Systems using Machine Learning techniques. What is more, little research employs smartphones or common cameras for sample collection; the remaining studies carry out this process in a controlled environment, normally a laboratory with very expensive and specific instrumentation. As a result, very few relatively recent works can be found in the literature. Among them, one can highlight the computer-implemented intelligent alignment method for a colour sensing device developed by Ataieyan et al. [14]. They suggest combining Artificial Neural Networks (ANN) and k-Nearest Neighbour (k-NN) to process a small set of different material types and to identify a given colour. Han et al. [15] developed a smartphone-based soil colour sensor for soil classification using spectroscopy. Another proposal based on smartphones is found in [16], where inexpensive sensor technologies are used and an application is developed to integrate a low-cost sensor with GPS technologies, testing its ability to match the Munsell system. This study is probably among the most similar to ours; the main drawback of that approach is that the mobile phone merely relays the information, and all the data must be uploaded to other units to be processed. Besides, the classification is based on geographic information systems using a specialized database, without any intelligent method for that purpose. In [17], however, an ANN-based method is suggested to identify relationships between soil colour and a range of chemical and physical properties. Nevertheless, the authors leave aside the matter of how the data are obtained and directly employ a public dataset. In our study, by contrast, data acquisition is carried out from scratch, as is the method that processes those data. Another interesting example is proposed by Noshadi et al. [18], who used satellite remote sensing data to provide a measurement of soil colour.
They carried out an analysis to find linear and non-linear relationships between the input measurements from remote sensing data and Munsell soil colour attributes by using Multiple Linear Regression and ANN models.

On the other hand, Fuzzy Logic is a form of logic whose underlying modes of reasoning are approximate instead of fixed and exact. It uses multivalued logic to reason and find solutions, and it is based on “if-then” rules that can be set up with the aid of experts in the area of application [19]. Furthermore, Fuzzy Systems (FS) have proved to be a powerful tool in varied scenarios. For example, knowledge-based fuzzy inference systems were applied in [20] for vine vigour and precocity prediction using the colour of soil surfaces, among other variables, in order to improve viticultural practices. Haiou et al. [21] designed a regularized adaptive fuzzy model for the diagnosis of crop nutrient deficiency symptoms based on colour characteristics from images. Zádorová et al. [22] proposed a method of delineating soils by utilizing fuzzy classification models to predict colluvial soil areas. Rizzo et al. [23] also developed a method for the delineation of soil units in toposequences by using spectral responses of soils. They applied a fuzzy k-means method for clustering samples so as to identify distinct soil classes and obtain pedological profiles. Finally, fuzzy membership values were used in [24] to predict the detailed spatial variation of soil properties. Most of the research found so far focuses on applying clustering methods to soil measurements previously treated by matching against some predefined database. Thus, the classification relies on static boundaries which define the limit of each class. This is one of the issues that this work also intends to address. A similar solution to ours was proposed in [25], where mobile phone data are used to derive land use information.
However, the success rate of their system is somewhat lower than one would expect from a reliable system. We leverage Fuzzy Logic, combined with the ANN, to attain a better classification rate.

Owing to the satisfactory results and effectiveness achieved in assorted studies, such as those cited above, ANN and FS have become two of the most popular models for classification and prediction in real applications in soil science [26–30], among many others. Accordingly, in this research, these two machine learning techniques were adopted to compute and model the colour properties associated with a given soil sample. Since soil classification is currently a growing area in which much remains to be done, this research attempts to fill this gap in the literature by designing an intelligent system to identify Munsell soil colour by means of FS and ANN. Therefore, while the main drawback of previous works is that they focus on a specific task of the classification process in very controlled environments, we propose a methodology capable of classifying soil samples by using a common digital camera or smartphone. This supports expert decisions, lowers the cost of soil classification and improves its accuracy.

The rest of the paper is structured as follows. Section 2 introduces some basic aspects and concepts needed to understand the problem to be solved. Section 3 presents our proposed methodology for soil classification based on colour properties and details the data used. Section 4 shows the experimentation with the proposed approaches and their results. Section 5 gathers the main conclusions and proposes future research.
Fig. 2. Example of how lighting conditions affect the colour of an image.
2. Preliminaries
Colour is a common feature used in many situations because of its simplicity and its robustness to scale changes, object positions and partial occlusions [31]. Colour can be defined in a simple way as the result of the visual perception of different wavelengths of light. A specific colour is usually described according to the reflectance in a wavelength range (λ). Visible light refers to the portion of λ which is visible to the human eye, and it represents a very narrow range of the electromagnetic spectrum; specifically, it lies between about 700 and 400 nanometres, limits beyond which lie the infrared and the ultraviolet, respectively [32]. As pointed out above, colour is a sensation caused in the brain by three major factors: observer, object and light [33]. A colour image can be mathematically defined as a product of three variables as a function of the wavelength λ, integrated over the visible spectrum ω [34]:

\[ E_k(x, y, \lambda) = \int_{\omega} R(x, y, \lambda)\, L(\lambda)\, S_k(\lambda)\, d\lambda \qquad (1) \]
where R(x, y, λ) defines the surface reflectance, L(λ) represents the illumination and S_k(λ) stands for the sensor properties. The index k denotes the device’s response in each of the three channels: red (R), green (G) and blue (B). E_k(x, y, λ) is the image corresponding to the kth channel.

Lighting conditions are an important factor in determining a given colour. For that reason, the light source and a white reference need to be defined so as to calibrate measuring instruments, thus ensuring more precise outcomes [9]. The white reference is used to establish a standard white to which the rest of the colours of a specific gamut are referred. Hence, colour constancy appears as one of the most important issues to face. Colour constancy is the ability to minimize the influence of changing illumination. While human vision exhibits good colour constancy, artificial systems have trouble with this task. For artificial systems, the success or failure rate is closely linked to the image processing that is intended to solve the problem [34,35]. Mathematically, this can be simplified by developing a procedure to transform the (R_1, G_1, B_1) coordinates of an image taken under an unknown illuminant into other coordinates (R_a, G_a, B_a) for the same image, which requires knowing the lighting conditions. Fig. 2 depicts an example of the colour constancy problem under three different illuminants.

As can be seen from the aforementioned features, colour is an imprecise concept. The same idea, e.g., blue, can refer to many colours, and all of them could be equally valid. For this reason, some authors have proposed formal definitions for colour modelling. Chamorro-Martinez et al. [36] introduce fuzzy colour definitions, and thus illustrate the correspondence between perceptual categories and their digital representation.
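The transformation from (R_1, G_1, B_1) to (R_a, G_a, B_a) discussed in the colour-constancy paragraph above can be illustrated with a minimal sketch. It assumes a simple per-channel (diagonal) scaling driven by the white reference, which is only one of several possible corrections and is not necessarily the procedure used in this work. The function name `white_balance` and the sample values are hypothetical.

```python
def white_balance(rgb, white_sample, target=(255.0, 255.0, 255.0)):
    """Scale each channel so that the measured white reference maps to `target`.

    Assumption: a diagonal (per-channel) correction is adequate; real
    pipelines may need a more elaborate chromatic adaptation.
    """
    corrected = []
    for value, white, goal in zip(rgb, white_sample, target):
        if white <= 0:
            raise ValueError("white reference channel must be positive")
        # Clip to the valid 8-bit range after scaling.
        corrected.append(min(255.0, value * goal / white))
    return tuple(corrected)


# Hypothetical reading of the white reference under a yellowish illuminant.
white_reference = (240, 230, 200)
print(white_balance((120, 115, 100), white_reference))
```

With this scaling, the white reference itself is mapped to (255, 255, 255), and every other pixel is corrected by the same channel-wise factors.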
In addition, it is necessary to measure similarities among colours, so that the system can compare linguistic labels in a logical way, e.g., green and yellow should be more alike than green and red. To do so, there are proposals such as that of Seaborn et al. [37], who categorize colours by means of fuzzy membership to provide similarities and discrepancies between two colours. Likewise, many tools for fuzzy system definition are available, such as [38], a unified open-source Java library which implements the new IEEE standard. It is therefore necessary to briefly introduce some concepts regarding Fuzzy Rule-Based Systems in order to help readers understand basic FS aspects. Using the definitions in [36], a fuzzy colour is a linguistic label whose semantics is represented as a fuzzy subset of colours, and the colour space is represented as a collection of fuzzy colours defined by membership functions. In this way, a specific colour is not a single label but a set of labels, each with a membership degree; in doing so, FS takes truth degrees as the mathematical basis for modelling the phenomenon of vagueness. For example, if we ask someone about an image “is red the colour
Fig. 3. Representation of the (a) RGB, (b) xyY, (c) L*a*b* and (d) Munsell colour distribution.
shown?”, the classical approach allows two answers: yes (true/1) or no (false/0); the fuzzy solution, however, could provide several alternatives, each with a degree of membership: completely red (1), highly red (0.85), medium red (0.5), slightly red (0.35) or definitely not red (0).

In any case, a point of an image must be modelled in a specific colour space as a set of parameters. A colour space is a mathematical model designed to organize and represent colours. Consequently, selecting an appropriate colour space is a very important decision for image processing [39]. Colour spaces typically employ three variables to designate a specific colour. The range of colours that a colour space can represent is called its gamut. Colour spaces differ from each other in shape, in the number of colours they are able to represent and in colour distribution, among other aspects. Some colour spaces also depend on the particular device. Colour coordinates from a device-independent space are the same on all output media; the CIE L*a*b* space defined by the International Commission on Illumination, and the xyY space with its equivalent XYZ representation, which is based on human perception of light, belong to this group. On the other hand, a device-dependent colour space, such as RGB or CMYK, provides distinct coordinates for the same colour on different output devices [40]. Nowadays, four of the most commonly used colour spaces are RGB, CIE xyY, CIE L*a*b* and HVC.

The RGB colour space determines a colour by combining the three primary colours: red (R), green (G) and blue (B). In this way, the responses r(λ), g(λ) and b(λ) over the wavelength λ in each channel define the total amounts of R, G and B that determine a colour. The value of each component ranges from 0 to 255, where 0 corresponds to darkness and 255 to full brightness. Fig. 3a illustrates this colour distribution.
The xyY colour space was created by the Commission Internationale de l’Eclairage to address deficiencies in the RGB space. Colour is determined by three new components or tristimulus values: X, Y, Z. This space was created to represent all colours that the human sight is able to perceive and its coordinates are determined by the equation (2) [7,41]:

\[ X = k \int_{l_1}^{l_2} P(\lambda)\, x(\lambda)\, d\lambda, \quad Y = k \int_{l_1}^{l_2} P(\lambda)\, y(\lambda)\, d\lambda, \quad Z = k \int_{l_1}^{l_2} P(\lambda)\, z(\lambda)\, d\lambda \qquad (2) \]
where l1 and l2 are the limits of the visible light range, 380 and 780 nm, respectively; P(λ) is the spectral reflectance distribution under a given illuminant; and x(λ), y(λ), z(λ) are the colour stimuli perceived by a standard observer.
k is an arbitrary constant chosen for normalization. The XYZ tristimulus values are commonly expressed in the xyY colour space, which is a bijective function of the XYZ values, computed as follows:

\[ x = \frac{X}{X + Y + Z}, \quad y = \frac{Y}{X + Y + Z}, \quad Y = Y \qquad (3) \]

In this manner, as depicted in Fig. 3b, a given colour is defined by the x, y components, which determine the shade, while Y indicates brightness.

The CIE L*a*b* colour space was also proposed by the Commission Internationale de l’Eclairage, in 1976, with a view to solving the non-uniform perception of the previous colour space xyY (see Fig. 3b). As a result, CIE L*a*b* presents a perceptually linear structure, i.e., the Euclidean distance between two colours in this space is proportional to the colour difference between them as perceived by a standard observer [42]. For this reason, it has been used in diverse applications for colour image processing and computational colour science [43–45]. The CIE L*a*b* coordinates are obtained by applying a nonlinear transformation to the XYZ values. In this colour space, L* determines brightness (in the range [0, 100]), a* defines red and green colours and b* indicates yellow and blue levels in the same range. Fig. 3c shows the colour distribution in this colour space.

Lastly, the Munsell colour system was proposed by Albert Henry Munsell in order to set up a rational way to describe colour using a decimal notation instead of colour labels [46]. As discussed above, this colour space is structured into three parameters: hue (H), value (V) and chroma (C). Fig. 3d displays how colour is arranged in a three-dimensional shape. The first dimension refers to the colour hues (R, Y, G, B and P) together with five intermediate hues which combine adjacent main hues, e.g., PB, RP. The V coordinate is associated with the colour lightness, and it varies along the vertical axis from 0 (darkness) to 10 (lightness).
The C dimension represents the colour purity or saturation, and it is aligned along the horizontal axis [47]. In the same way as the CIE xyY and RGB colour spaces, the HVC space does not have a linear numeric expression, yet it is designed in constant intervals of H, V and C.

In real-world problems, it is often necessary to work with several colour spaces at the same time because of the nature of the problem. Each colour space presents different advantages and drawbacks. For this reason, there are diverse equations to transform coordinates from one colour space into another [9,40,48]. As this section is intended to be an introductory description of the most relevant concepts for a proper understanding of the entire work, we refer the reader to the previous references for a deeper description of the colour space transformations, since including them here might hamper the readability of this work. In this study, the RGB, CIE L*a*b* and Munsell systems, as well as the XYZ tristimulus values, were adopted to solve soil classification.

3. Methodology
Some fundamental aims of soil science are to describe, catalogue and extract properties of the soil, as well as to measure, predict and observe processes that happen in the environment [49]. There are diverse characteristics of the soil, such as aeration, organic matter composition, reactions, genesis, pedoclimate and fertility, for which colour can be indispensable for a proper understanding of the soil and its evolution. For this reason, soil colour is often used to obtain information related to soil components, properties and other knowledge [7]. Since colour provides a lot of information about soil quality, its visual characterization with the Munsell system has been a standard for soil classification for many decades.

Although the Munsell colour space can represent a far higher number of colours, only a subset of them is used for soil classification, because not all of these colours occur in real-world soils. This subset of colours is known as the Munsell Soil-Colour Charts (MSCC). In this study, the edition from the year 2000 [50] was used, which incorporates seven cards with a total of 238 chips distributed among them: 10R (35 chips), 2.5YR (37), 5YR (31), 7.5YR (35), 10YR (36), 2.5Y (31) and 5Y (31). On the other hand, colour identification is a complex task, as each human eye perceives colour differently [1] and a precise match with the correct chip is rarely achieved. As a result, soil scientists allow approximations in the soil colour identification process.

Consequently, this study seeks to establish the basis for an artificial soil scientist capable of classifying soil by its colour. Hence, an intelligent system is proposed whose output is a set of Munsell colour chips for a given input image. Fig. 4 presents the general system scheme. A picture of a soil sample is taken with a digital camera (or mobile phone).
The image is processed by applying Artificial Neural Networks (ANN) and Fuzzy Systems (FS) to obtain a set of Munsell colour chips whose colour is similar to a given colour from a digital device.
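The idea of returning a set of similar chips, rather than a single one, can be sketched as follows. This is only an illustration under stated assumptions: the triangular membership functions, their widths, the minimum operator used as fuzzy AND, and the names `Chip`, `chip_membership` and `closest_chips` are hypothetical and are not taken from the fuzzy system of this work. Hues are placed on the conventional 0-100 Munsell hue circle (R spans 0-10, YR 10-20, Y 20-30).

```python
from dataclasses import dataclass

# Numeric position of the seven MSCC hue cards on the 0-100 Munsell hue circle.
HUE_POSITION = {"10R": 10.0, "2.5YR": 12.5, "5YR": 15.0, "7.5YR": 17.5,
                "10YR": 20.0, "2.5Y": 22.5, "5Y": 25.0}


@dataclass(frozen=True)
class Chip:
    hue: str
    value: float
    chroma: float


def triangular(distance, width):
    """Membership 1 at distance 0, decreasing linearly to 0 at `width`."""
    return max(0.0, 1.0 - abs(distance) / width)


def chip_membership(predicted, chip, widths=(5.0, 2.0, 2.0)):
    """Degree to which a predicted (H, V, C) triple matches a chip.

    Assumption: the three coordinates are combined with the minimum t-norm.
    """
    h, v, c = predicted
    degrees = (
        triangular(h - HUE_POSITION[chip.hue], widths[0]),
        triangular(v - chip.value, widths[1]),
        triangular(c - chip.chroma, widths[2]),
    )
    return min(degrees)


def closest_chips(predicted, chips, top=3):
    """Return up to `top` (degree, chip) pairs with non-zero membership."""
    scored = [(chip_membership(predicted, chip), chip) for chip in chips]
    scored = [item for item in scored if item[0] > 0.0]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top]


# Hypothetical HVC prediction lying between the 10R and 2.5YR cards.
candidates = [Chip("10R", 8, 1), Chip("2.5YR", 8, 1), Chip("5Y", 3, 2)]
for degree, chip in closest_chips((11.0, 8.0, 1.0), candidates):
    print(f"{chip.hue} {chip.value}/{chip.chroma}: {degree:.2f}")
```

Returning ranked candidates with a membership degree mirrors the approximations that soil scientists already allow when a sample falls between two chips.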
Fig. 4. Overview of the proposed system.
Fig. 5. Example of the 2.5YR Munsell colour chart with a reference white.
The proposed methodology was designed in two main stages. Since digital devices provide an RGB measurement, the first step is to compute the HVC values related to a given image using neural networks. Subsequently, the second step is the selection, by means of fuzzy logic, of the Munsell colour chips which are most similar to the transformed HVC values.

3.1. Dataset
For this study, all data were captured manually. Three devices were used in order to obtain a representative dataset: images were taken with two mobile phones (Nokia C301 and Samsung Galaxy S2) and one digital camera (Canon EOS 1100D). Since the system must learn the MSCC adequately, the data should not be contaminated by subjective information, so as to guarantee good abstraction. Images were taken on a grey background, adding a reference white to set the camera white balance according to the light source. Fig. 5 shows an example of a photograph of the 2.5YR Munsell chart. Note that a good training process is frequently achieved if a large quantity of data is available. In this work, the number of distinct instances is determined by the number of chips in the MSCC; the entire classification space is therefore covered once all charts are included. In addition, seven pictures were taken of each Munsell card with each of the devices. An image-segmentation algorithm was applied to extract the colour chips. As a result, chips are associated with different RGB values which depend on the device used and the particular lighting conditions. Images were taken on several days, at different hours and under different lighting conditions, as mentioned in previous work [51]. Accordingly, in this study, a significant amount of data was collected considering different situations. The number of instances reaches a total of 4998: 1666 per device and 7 per colour chip and device. The measurements were obtained by an image-segmentation algorithm detailed in the following section. An example of the data obtained for the 10R
Table 1
Example of the RGB values for the 10R 8/1 chip taken by each device in different daylight conditions.

                     Nokia            Samsung          Canon
Day        Time      R    G    B      R    G    B      R    G    B
28th May   13:00     255  225  199    253  231  222    227  218  200
28th May   14:00     255  227  206    245  227  213    225  215  202
28th May   16:00     255  236  214    248  236  223    232  224  207
29th May   12:00     250  224  199    248  227  214    234  221  206
29th May   13:00     255  228  195    251  233  219    231  220  204
29th May   14:00     246  224  195    255  233  222    225  216  198
29th May   16:30     255  233  210    255  236  226    232  224  208
8/1 chip is illustrated in Table 1. This table reveals the substantial variations caused by the device and time, i.e., the colour constancy problem.
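The size of the dataset and the variation visible in Table 1 can be checked with a short sketch. The readings below are transcribed from Table 1; the helper names `channel_ranges` and `dataset_size` are hypothetical and introduced only for illustration.

```python
# RGB readings of chip 10R 8/1 transcribed from Table 1 (seven captures per device).
TABLE_1 = {
    "Nokia": [(255, 225, 199), (255, 227, 206), (255, 236, 214), (250, 224, 199),
              (255, 228, 195), (246, 224, 195), (255, 233, 210)],
    "Samsung": [(253, 231, 222), (245, 227, 213), (248, 236, 223), (248, 227, 214),
                (251, 233, 219), (255, 233, 222), (255, 236, 226)],
    "Canon": [(227, 218, 200), (225, 215, 202), (232, 224, 207), (234, 221, 206),
              (231, 220, 204), (225, 216, 198), (232, 224, 208)],
}


def channel_ranges(readings):
    """Return (max - min) for R, G and B across the captures of one device."""
    return tuple(max(channel) - min(channel) for channel in zip(*readings))


def dataset_size(chips=238, pictures_per_chart=7, devices=3):
    """Number of instances: every chip appears once in each picture of its chart."""
    return chips * pictures_per_chart * devices


for device, readings in TABLE_1.items():
    print(device, channel_ranges(readings))

print(dataset_size())           # 4998 instances in total
print(dataset_size(devices=1))  # 1666 instances per device
```

For this single chip, the same device gives readings that differ by up to 19 levels in one channel across the seven captures (the blue channel of the Nokia phone), which illustrates why the raw RGB values cannot be matched directly against a fixed reference.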
3.2. Image-segmentation algorithm

The previous section describes the data used in this study, obtained by photographing the Munsell Soil-Colour Charts. Those images must be processed in order to acquire information from them automatically. Therefore, the creation of an image-segmentation algorithm that acquires colour coordinates from an image is an essential task here. Otherwise, it would be necessary to examine all pictures manually, and the probability of making a measurement error would be very high because of the number of chips involved in the process. Consequently, an algorithm capable of obtaining colour measurements was designed. The algorithm receives an image of a specific Munsell chart and identifies the chips that appear in the picture. It returns the corresponding set of chips, defined by their RGB coordinates along with the chip description, i.e., Hue, Value and Chroma.

The aim of the algorithm is to compute the centre point of each chip so that a representative sample of the colour pixels of that chip can be taken. The mode is used as the descriptive measure of the RGB values for that sample. The centre points are located from three consecutive reference chips: once these three centres are established, the vertical and horizontal distances between chips can be calculated, and the rest of the centres can be obtained. The process is as follows:

1. The algorithm receives an image as input. The image contains a Munsell chart and the white reference, as in Fig. 5.
2. The edges of the chart and the white reference are identified by converting the RGB image to a grayscale image and then to a binary image.
3. Object boundaries are traced, removing small objects from the binary image.
4. Each pixel is labelled by measuring the properties of the image regions, and the area of each region is calculated.
5. Since the second largest object is always the white reference, the centre of this object is calculated and a white colour sample is taken in order to adjust the white balance properly.
6. Based on steps 2 and 3, the Munsell chart is cropped, removing the white reference and the rest of the image.
7. From this point, the algorithm seeks the centres of all chips by using three adjacent chips as a guide. The image is transformed into the CIE L*a*b* colour space and those values are used to compute a threshold for identifying the boundaries of the three chips.
8. The distance between two chips is calculated from the coordinates of the centres of the three chips obtained in the previous step.
9. Finally, a representative set of pixels from each chip is used to obtain one measurement per chip by applying the mode.

3.3. Calculation of the HVC values with Artificial Neural Networks (ANN)
As previously mentioned, no single colour space is best for every problem; a given problem may be solved more effectively in one colour space than in another. For this reason, three approaches are proposed for calculating the HVC values from a picture:
Fig. 6. Representation of the three proposed approaches.
Fig. 7. Artificial Neural Networks architecture for the HVC prediction.
1. The first approach estimates the HVC coordinates from the widely used RGB colour space (see Fig. 6a).
2. The second approach transforms the input into the CIE tristimulus values XYZ and the CIE L*a*b* colour parameters in order to check whether the choice of input colour variables affects how the models fit. This conversion is carried out using transformation matrices from RGB to CIE XYZ and CIE L*a*b* [9,40], as depicted in Fig. 6b.
3. The last approach is based on a reference colour gamut. That is to say, the best device is used as a reference, and the measurements of the other two devices are converted to that reference. Thus, the ANN receives a three-element RGB vector as input and transforms it according to the best device. This transformation is performed by two additional ANNs, which were trained with all the measurements obtained in the first step. For example, an ANN could take the Nokia's RGB values and return the corresponding measurements of the camera, i.e., the R, G, B coordinates shown in Fig. 6c.
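The colour-space conversion used by the second approach can be illustrated with a minimal sketch. The paper cites [9,40] for the transformation but does not state which RGB primaries or reference white were used, so the sRGB transfer curve, the sRGB-to-XYZ matrix and the D65 white point below are assumptions, and the function names are hypothetical.

```python
def srgb_to_xyz(r, g, b):
    """Convert 8-bit RGB to CIE XYZ, ASSUMING sRGB encoding and a D65 white (Y = 1)."""
    def linearise(c):
        c = c / 255.0
        # Undo the sRGB transfer curve (assumed encoding of the photographs).
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearise(r), linearise(g), linearise(b)
    # Standard linear-sRGB to XYZ (D65) matrix.
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl
    return x, y, z


def xyz_to_lab(x, y, z, white=(0.95047, 1.0, 1.08883)):
    """Convert CIE XYZ to CIE L*a*b* relative to a reference white (D65 assumed)."""
    delta = 6.0 / 29.0

    def f(t):
        # Cube root above the threshold, linear segment below it.
        return t ** (1.0 / 3.0) if t > delta ** 3 else t / (3 * delta ** 2) + 4.0 / 29.0

    fx, fy, fz = (f(v / w) for v, w in zip((x, y, z), white))
    return 116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)


def rgb_to_lab(r, g, b):
    """Chain both conversions: RGB -> XYZ -> L*a*b*."""
    return xyz_to_lab(*srgb_to_xyz(r, g, b))


# Example call with an arbitrary RGB triplet.
lab = rgb_to_lab(243, 204, 180)
```

Under these assumptions, either the XYZ triplet or the L*a*b* triplet would replace the RGB triplet as the three-element input of the networks.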
In all cases, an ANN was trained to compute the HVC values extracted from the images. The ANN receives the colour coordinates and predicts the three Munsell coordinates. The full dataset was divided into 70% for training and 30% for testing. Note that a poor regression of the HVC values would degrade the next stage. Therefore, a major effort was made to minimize the error of the ANN phase and, consequently, to improve the set of chips that is provided. Thus, the ANN estimates the HVC coordinates from the three coordinates of each input (RGB, XYZ, L*a*b*). The possible H values had to be mapped to an equivalent numeric representation: 10 (10R), 12.5 (2.5YR), 15 (5YR), 17.5 (7.5YR), 20 (10YR), 22.5 (2.5Y) and 25 (5Y). Three ANNs were used, one for each coordinate of the HVC space (see Fig. 7). The input {c1, c2, c3} represents the corresponding colour coordinates: RGB, CIE L*a*b*, CIE XYZ or reference gamut. Each network has three inputs, one hidden layer and one output neuron. Several numbers of hidden neurons were tested, and 60 gave the best results. The sigmoid function was used as the activation function, and the Levenberg-Marquardt procedure was used as the learning algorithm, with the Mean Square Error (MSE) as the error measure. Finally, the data were normalized to the interval [0, 1].

3.4. Fuzzy Logic System for the selection of Munsell colour chips
This phase identifies the set of Munsell chips most similar to a given input image. The system receives the HVC coordinates from the ANN models and returns the chips that best match those predictions. The problem involves three linguistic variables, which must be defined: hue, value and chroma. The H variable has 7 classes: 10, 12.5, 15, 17.5, 20, 22.5 and 25. The V variable has 8 labels: 2, 2.5, 3, 4, 5, 6, 7 and 8.
Fig. 8. Fuzzy set representation of the (a) hue, (b) value and (c) chroma coordinates.
The C parameter takes 6 values: 1, 2, 3, 4, 6 and 8. These designs are illustrated in Fig. 8 and follow the triangular function μA(x) [51]:

\[
\mu_A(x) =
\begin{cases}
0, & \text{if } x \le a \\
\dfrac{x-a}{m-a}, & \text{if } a < x \le m \\
\dfrac{b-x}{b-m}, & \text{if } m < x < b \\
0, & \text{if } x \ge b
\end{cases}
\tag{4}
\]
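As an illustration only, the triangular function of Eq. (4) and the way a rule base built on it could rank chips (one rule per chip, product t-norm, chips sorted by activation) can be sketched as follows. The paper states that the parameters a, m and b were set by expert soil scientists but does not list them, so placing the feet of each triangle at the neighbouring labels is an assumption, as are the helper names and the five-chip subset used in the example.

```python
# Numeric hue encoding and label sets as given in the text.
HUE_NAMES = {10: "10R", 12.5: "2.5YR", 15: "5YR", 17.5: "7.5YR",
             20: "10YR", 22.5: "2.5Y", 25: "5Y"}
HUES = sorted(HUE_NAMES)
VALUES = [2, 2.5, 3, 4, 5, 6, 7, 8]
CHROMAS = [1, 2, 3, 4, 6, 8]


def triangular(x, a, m, b):
    """Eq. (4): membership degree of x in the triangular fuzzy number (a, m, b)."""
    if x <= a or x >= b:
        return 0.0
    if x <= m:
        return (x - a) / (m - a)
    return (b - x) / (b - m)


def fuzzy_numbers(labels):
    """One (a, m, b) triple per label. ASSUMPTION: feet at the adjacent labels;
    edge labels mirror the gap to their only neighbour."""
    sets = {}
    for i, m in enumerate(labels):
        a = labels[i - 1] if i > 0 else m - (labels[1] - m)
        b = labels[i + 1] if i < len(labels) - 1 else m + (m - labels[-2])
        sets[m] = (a, m, b)
    return sets


H_SETS, V_SETS, C_SETS = (fuzzy_numbers(ls) for ls in (HUES, VALUES, CHROMAS))


def rank_chips(h, v, c, chips):
    """Activate one rule per chip with the product t-norm, keep positive
    activations and sort them in decreasing order."""
    ranked = []
    for chip_h, chip_v, chip_c in chips:
        activation = (triangular(h, *H_SETS[chip_h])
                      * triangular(v, *V_SETS[chip_v])
                      * triangular(c, *C_SETS[chip_c]))
        if activation > 0:
            name = f"{HUE_NAMES[chip_h]} {chip_v:g}/{chip_c:g}"
            ranked.append((name, activation))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Hypothetical five-chip subset (the real system holds 238 rules).
CHIPS = [(10, 8, 1), (10, 8, 2), (10, 7, 1), (12.5, 8, 1), (10, 8, 4)]
ranked = rank_chips(10.5, 7.8, 1.2, CHIPS)
```

With these assumed triangles, a prediction of H = 10.5, V = 7.8 and C = 1.2 activates four of the five rules, and 10R 8/1 is ranked first.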
This function depends on three parameters, a, m and b, which define the shape and location of the membership function. All parameters were configured according to the judgement of expert soil scientists. This is the first step in representing the reality of the problem. Since the labels of each variable are numerical (non-categorical), fuzzy numbers were employed to define each fuzzy set. The rule-based system incorporates 238 if-then rules, one per chip. For example, the rule for the 10R 8/1 chip is defined as « IF H is 10 and V is 8 and CHROMA is 1, then 10R », where the logical 'and' is implemented by the product t-norm. The system proceeds as follows. The three HVC coordinates predicted by the ANN in the previous stage are entered into the system, and the activation value of each of the 238 fuzzy rules is calculated. Chips with a positive activation value are then selected and sorted according to that value. The final design of the system is shown in Fig. 9. The fuzzy part should contain a total of 238 membership functions; however, for a better understanding of the overall scheme, only a few of them are depicted.

4. Results
To minimize the errors of our models, the topology of the ANN was explored by trial and error. To avoid overfitting, the networks were trained only while the test error remained lower than the training error. The best topology was found with 60 hidden neurons. Finally, a correlation test was performed on the test data to check the consistency of the results. Table 2 shows the error obtained with each device. The errors in this table were normalized to the range from 0 to 1. These results show that the H error is worse than that of the other two components in all cases. The H coordinate is essential for obtaining the set of chips properly. The V component presents a better fit, although its error with the mobile phones differs strongly from that of the digital camera. Nonetheless, this is not the case for the third component C
Fig. 9. General outline of the proposed system.
Table 2
Normalized ANN errors of the three devices for train and test data.

Device      RGB                       CIE XYZ                   CIE L*a*b*
            H       V       C         H       V       C         H       V       C
Training
  Canon     0.0030  0.0006  0.0001    0.0024  0.0010  0.0001    0.0025  0.0006  0.0001
  Nokia     0.0067  0.0044  0.0007    0.0055  0.0051  0.0007    0.0063  0.0045  0.0006
  Samsung   0.0065  0.0057  0.0006    0.0081  0.0095  0.0009    0.0057  0.0070  0.0005
Test
  Canon     0.0019  0.0011  0.0001    0.0022  0.0014  0.0001    0.0021  0.0010  0.0001
  Nokia     0.0057  0.0041  0.0006    0.0063  0.0045  0.0007    0.0058  0.0040  0.0008
  Samsung   0.0084  0.005   0.0005    0.0073  0.0045  0.0004    0.0071  0.0047  0.0006
Table 3
De-normalized ANN error for all devices in training and test.

Device      RGB                       CIE XYZ                   CIE L*a*b*
            H       V       C         H       V       C         H       V       C
Training
  Canon     1.8848  0.0363  0.0080    1.4786  0.0637  0.0083    1.5528  0.0398  0.0076
  Nokia     4.2114  0.2795  0.0439    3.4125  0.3233  0.0429    3.9253  0.2857  0.0394
  Samsung   4.0463  0.3644  0.0374    5.0378  0.6058  0.0556    3.5930  0.4504  0.0313
Test
  Canon     1.2082  0.0694  0.0048    1.3815  0.0912  0.0049    1.2989  0.0618  0.0077
  Nokia     3.5850  0.2593  0.0392    3.9124  0.2876  0.0461    3.5944  0.2570  0.0494
  Samsung   5.2796  0.3179  0.0331    4.5638  0.2849  0.0249    4.4650  0.2984  0.0367
which achieves the lowest error, regardless of the device. Besides, the errors in RGB, CIE XYZ and CIE L*a*b* were very similar. Furthermore, when the errors are de-normalized (see Table 3), the results are below one, except for the H coordinate. Likewise, the difference is considerably larger with the mobile phones. Among the devices, the lowest error is attained by the Canon camera. This result is consistent with the data in Table 1, since the measurements of the digital camera were more constant and less sensitive to changes in daylight, whereas the measurements of the mobile phones vary more abruptly with the light conditions. Finally, the correlation coefficients between real and predicted values were very high in all cases (Table 4). Therefore, the three models developed provide an adequate prediction of the HVC values using devices of different quality and price. Virtually all the coefficients are above 0.9, except for the H parameter of the Samsung phone in the RGB space. There is hardly any difference between using one colour rendering system or
Table 4
Correlation coefficients for all devices in the training stage.

Device      RGB                       CIE XYZ                   CIE L*a*b*
            H       V       C         H       V       C         H       V       C
Canon       0.9758  0.9934  0.9992    0.9738  0.991   0.9992    0.9754  0.9943  0.9987
Nokia       0.9247  0.9724  0.9946    0.9179  0.9696  0.9934    0.9239  0.9735  0.9940
Samsung     0.8929  0.9713  0.9949    0.9056  0.9705  0.9960    0.9109  0.9684  0.9941
Table 5
Errors in the algorithm for transforming colour coordinates in the reference gamut.

Device      Train       Test        De-norm Train   De-norm Test   Correlation coefficient
Nokia       0.0007088   0.0006210   44.5024         35.726         0.9958
Samsung     0.0006550   0.0005251   52.5885         34.142         0.9962
Table 6
Comparison of the errors obtained with the reference gamut and with the first approach (Table 3).

Device      Reference gamut                 First approach (RGB, Test)
            H         V         C           H         V         C
Nokia       4.71020   0.25970   0.02870     3.58500   0.25930   0.03920
Samsung     7.44860   0.36990   0.02250     5.27960   0.31790   0.03310
another. This indicates that each system converges to the same optimum in all cases. As a result, we cannot confirm whether using alternative colour parameters improves the results. Taking into account several observations made during this study, one more approach was examined using a reference gamut. Since the three previous approaches were addressed separately, the three ANNs were trained separately. The colour sensitivity of the three RGB channels of a given device can be considered constant, so changes in the measurements would be caused by lighting conditions, i.e., by the colour constancy problem. Consequently, new ANNs were used to address this problem. They take the measurements obtained with a particular device and transform those coordinates into the reference gamut, i.e., the values that would have been obtained if the reference device had been employed. During this training process, the ANN attempts to resolve the colour constancy among measurements that belong to the same chip. In the previous results, the Canon camera performed better than the mobile phones, particularly with regard to the H coordinate. This is due to the device's sensitivity and to the fact that the digital camera is more stable under light changes. Therefore, the last approach takes the colours of the digital camera as the reference gamut, as it provides the most precise measurements. Hence, by transforming the RGB values from a mobile phone into the reference gamut, the ANN of the reference gamut can be used to predict the HVC values regardless of the device. In this case, the best number of hidden neurons was 50. The same activation function and learning rule were used as before. Table 5 lists the errors for this last approach, whose final goal is the same as that of the previous proposals: to compute the HVC values from a given RGB colour sample.
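The data flow of the reference gamut approach can be sketched as a chain of small networks. This is only a structural illustration: the weights below are random rather than trained with Levenberg-Marquardt, so the outputs carry no meaning; the linear output layer, the class and function names, and the reading that the 50-neuron figure applies to the device-to-reference network are all assumptions.

```python
import math
import random


class TinyMLP:
    """One hidden layer of sigmoid units followed by linear outputs (assumed).
    Weights are random placeholders standing in for trained parameters."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # Each row holds the input weights plus one trailing bias term.
        self.hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)]
                       for _ in range(n_hidden)]
        self.output = [[rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
                       for _ in range(n_out)]

    def __call__(self, x):
        h = [1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(row, x)) + row[-1])))
             for row in self.hidden]
        return [sum(w * v for w, v in zip(row, h)) + row[-1] for row in self.output]


def predict_hvc(rgb, hvc_nets, to_reference=None):
    """Scale RGB to [0, 1], optionally map it into the reference gamut,
    then query one single-output network per Munsell coordinate (H, V, C)."""
    x = [c / 255.0 for c in rgb]
    if to_reference is not None:
        x = to_reference(x)
    return tuple(net(x)[0] for net in hvc_nets)


# Three HVC networks of the reference device (60 hidden neurons each).
reference_hvc = [TinyMLP(3, 60, 1, seed=s) for s in (1, 2, 3)]
# One device-to-reference network for a phone (50 hidden neurons, 3 outputs).
phone_to_reference = TinyMLP(3, 50, 3, seed=4)

direct = predict_hvc((243, 204, 180), reference_hvc)
via_reference = predict_hvc((243, 204, 180), reference_hvc, phone_to_reference)
```

Calling `predict_hvc` without `to_reference` corresponds to feeding the RGB values straight into the HVC networks, while passing a device-to-reference network inserts the extra transformation step described above.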
To examine the efficiency of the reference gamut, it is necessary to compute the HVC values based on the new coordinates, as well as on those obtained by the three ANNs used for the digital camera. Table 6 compares the results of this solution with those of the previous one. The error trends are very similar to those of the previous solution, which does not use the reference gamut. The H parameter is again the worst-predicted coordinate, with greater differences than the other two variables. Of the two mobile phones, the Nokia obtains better results, as in the preceding outcomes. Only in the C coordinate is the error lower than in the first solution. However, this improvement is sufficient for the solution to be considered further. Consequently, the following results cover these two solutions (Table 6), but not those for CIE XYZ and CIE L*a*b*. The results indicate that the first approach, which estimates the HVC coordinates directly from RGB (Fig. 6a), is good enough to continue with the Fuzzy System in order to classify these outputs and obtain the corresponding set of chips. For this task, Table 7 shows the results obtained per device using the ANN of the RGB values (Solution A) and
Table 7
Success rate in classifying the correct Munsell colour chip by the fuzzy system using the ANN of the RGB values (Solution A) and the ANN of the reference gamut (Solution B).

Device      Solution A                  Solution B
            Success (%)   Fail (%)      Success (%)   Fail (%)
Canon       94.1176       5.8824        94.1176       5.8824
Nokia       76.4706       23.5294       73.3193       26.6807
Samsung     73.3193       26.6807       68.0672       31.9328
Table 8
Detailed percentages of successes and failures for all Munsell charts of the solution A.

Chart     Canon                       Nokia                       Samsung
          Success (%)   Fail (%)      Success (%)   Fail (%)      Success (%)   Fail (%)
10R       88.5714       11.4286       72.8571       27.1429       81.4286       18.5714
10YR      91.6667       8.3333        63.8889       36.1111       69.4444       30.5556
2.5Y      96.7742       3.2258        88.7097       11.2903       74.1935       25.8065
2.5YR     94.5946       5.4054        81.0811       18.9189       79.7297       20.2703
5Y        93.5484       6.4516        80.6452       19.3548       59.6774       40.3226
5YR       96.9697       3.0303        90.9091       9.0909        72.7273       27.2727
7.5YR     97.1429       2.8571        60.0000       40.0000       74.2857       25.7143
the ANN of the reference gamut (Solution B). A classification is counted as correct when the right chip is included in the output set; otherwise, it is counted as a failure. Both solutions present similar results, although the success rate of Solution B, which corresponds to the reference gamut, is slightly lower. Note that the results of Solutions A and B for the Canon camera are identical because the reference gamut was taken from this device. Regarding the devices, the digital camera achieves the best results throughout, notably better than the two mobile phones. Besides, the Nokia improves on the Samsung outcomes in all cases, even though the difference is not significant. For this reason, the process described in Fig. 6a is considered the best option, and Table 8 details the classification performed through Solution A, broken down per chart. Several conclusions can be drawn. There is no common chart, i.e., hue, that is particularly problematic in terms of classification. The worst-classified chart is 10R for the Canon, 7.5YR for the Nokia and 5Y for the Samsung. This gives an idea of how much the outcomes depend on the device used. Moreover, the errors of the digital camera remain relatively constant, while the results for some charts vary considerably with the mobile phones. This is logical, as the HVC approximations obtained with the digital camera were better than those of the mobile phones. Finally, to illustrate the overall operation of the system, an actual example of its output is shown in Fig. 10. In this case, the system receives a colour measurement with coordinates (243, 204, 180) in the RGB colour space, obtained by applying the aforementioned process. Once a soil specimen is taken, the sample is entered into the intelligent system. The HVC parameters are calculated in the ANN stage, and the fuzzy functions are activated to provide a specific set of chips. In this example, the image matches the 10R 8/4 chip of the Munsell charts. The system returns this chip as the most similar one, with a membership degree of 0.6789, together with a set of chips whose colour is quite close to the input colour. Moreover, the remaining proposed chips hardly differ in colour, so any of them could be considered correct by a non-expert.

5. Conclusions
A novel methodology is proposed for soil classification using Munsell Soil-Colour Charts. These charts have been used in many real scenarios, and soil colour is determined by matching the sample with one of the chips. In our proposed methodology, we address this arduous and tedious task by combining machine learning techniques and using affordable devices. Our solution is based on Artificial Neural Networks (ANN) and Fuzzy Logic Systems
Fig. 10. Illustrative example of a concrete application of the proposed system.
(FS). The result of this combination is a black-box system that acts as a classifier. This artificial soil classifier accepts an image as input and outputs a set of colour chips as similar as possible to the given soil sample. In addition, this work attempts to lay the basis for an artificially intelligent soil scientist that uses colour properties. Our methodology was analysed using three distinct devices and different variations of the main approach. It is important to note that, although all measurements were taken without controlled lighting conditions, the estimation error was low. Besides, both stages of the system were studied separately to analyse the quality of the fit at each step. First, the ANN was tested using several parameters as well as three different approaches; all three provided adequate predictions of the HVC values for every device used. Secondly, some experiments were carried out with the FS, and we found that the classifications obtained with the two proposed colour inputs were similar, although the success rate of the reference gamut was slightly lower. Regarding the device, the best outcomes were obtained with the Canon digital camera owing to the higher quality of its photos. We did not find any improvement in the HVC values when the RGB image of the mobile phone was referred to those of a more accurate digital camera. Accordingly, a single ANN for each mobile phone is enough to optimize its performance. However, the efficiency of each ANN in carrying out this task depends on the ability of the mobile phone to capture colours. The latter also influences the selection made by the FS, i.e., the Munsell colour chips that most resemble a real soil colour. Moreover, Hue has proven to be the most difficult Munsell coordinate to predict in all cases. However, no particular Munsell chart was problematic to model, because the worst-modelled chart varied depending on the device. Colour is an essential feature in soil science, from which a soil scientist can draw several conclusions. As a result, the implementation of an intelligent system capable of identifying soils based on colour properties is of great interest. In addition, the measurement and evaluation of a soil sample entail a long and costly procedure under laboratory conditions. This justifies the development and application of an effective method to estimate those properties by using chromatic
information. It could, therefore, be said that these Soft Computing techniques are becoming a valuable tool for soil scientists. Furthermore, since the original RGB measurements of the mobile phones obtain good results in the reference gamut solution, a new hypothesis arises: using a more stable reference gamut might improve the HVC approximations. Moreover, as future work, a genetic algorithm could be implemented to find the optimal topology of the neural networks in order to enhance their prediction and, therefore, to minimize the classification error of the artificial classifier. In addition, replacing this regression model with others, or combining it with other techniques such as support vector machines, could be an interesting direction for future research.

6. Abbreviations
ANN     Artificial Neural Network.
B       Blue.
BG      Blue-Green.
C       Chroma.
CIE     Commission Internationale de l'Eclairage.
FS      Fuzzy System.
G       Green.
GY      Green-Yellow.
H       Hue.
HVC     Hue, Value and Chroma.
MSCC    Munsell Soil-Colour Charts.
P       Purple.
PB      Purple-Blue.
R       Red.
RGB     Red, Green and Blue.
RP      Red-Purple.
V       Value.
Y       Yellow.
YR      Yellow-Red.
Acknowledgements
This work was developed with the support of the Department of Computer Science and Artificial Intelligence of the University of Granada, Approximate Reasoning and Artificial Intelligence Group (TIC111).
References

[1] F. Stanco, D. Tanasi, A. Bruna, V. Maugeri, Automatic Color Detection of Archaeological Pottery with Munsell System, Springer Berlin Heidelberg, Berlin, Heidelberg, 2011, pp. 337–346.
[2] M.Y. Wong, Y. Zhou, H. Xu, Big Data in Fashion Industry: Color Cycle Mining From Runway Data, 2016, http://aisel.aisnet.org/amcis2016/Decision/Presentations/29/.
[3] R.L. Blaszczyk, U. Spiekermann, Bright modernity: color, commerce, and consumer culture, in: R.L. Blaszczyk, U. Spiekermann (Eds.), Bright Modernity: Color, Commerce, and Consumer Culture, Springer International Publishing, Cham, 2017, pp. 1–34.
[4] H. Kobayashi, M. Ito, A. Nishio, Resin Container Filled with Antifungal Pharmaceutical Composition, Google Patents, 2016, http://patentimages.storage.googleapis.com/e4/68/94/bc549ef6d70e74/US9480746.pdf.
[5] J.C.d.O. Cesar, Chromatic harmony in architecture and the Munsell color system, Color Res. Appl. 43 (2018) 865–871, https://doi.org/10.1002/col.22283.
[6] Y. Wang, Study on Colors of Modern Historical Buildings of Wuhan University Based on Munsell Color System, Atlantis Press, 2017.
[7] M. Sánchez-Marañón, Color indices, relationship with soil characteristics, in: Encyclopedia of Agrophysics, Springer, 2011, pp. 141–145.
[8] N.P. Kirillova, J. Grauer-Gray, A.E. Hartemink, T.M. Sileova, Z.S. Artemyeva, E.K. Burova, New perspectives to use Munsell color charts with electronic devices, Comput. Electron. Agric. 155 (2018) 378–385, https://doi.org/10.1016/j.compag.2018.10.028.
[9] R.A. Viscarra Rossel, B. Minasny, P. Roudier, A.B. McBratney, Colour space models for soil science, Geoderma 133 (2006) 320–337, https://doi.org/10.1016/j.geoderma.2005.07.017.
[10] T.A. Imbery, D. Tran, M.A. Baechle, J.L. Hankle, C. Janus, Dental shade matching and value discernment abilities of first-year dental students, J. Prosthodontics 27 (2018) 821–827, https://doi.org/10.1111/jopr.12781.
[11] E.M. Caves, P.A. Green, M.N. Zipple, S. Peters, S. Johnsen, S. Nowicki, Categorical perception of colour signals in a songbird, Nature 560 (2018) 365–367, https://doi.org/10.1038/s41586-018-0377-7.
[12] E.A. Mikhailova, R.Y. Stiglitz, C.J. Post, M.A. Schlautman, J.L. Sharp, P.D. Gerard, Predicting soil organic carbon and total nitrogen in the Russian Chernozem from depth and wireless color sensor measurements, Eurasian Soil Sci. 50 (2017) 1414–1419, https://doi.org/10.1134/s106422931713004x.
[13] M.J. Aitkenhead, D. Donnelly, L. Sutherland, D.G. Miller, M.C. Coull, H.I.J. Black, Predicting Scottish topsoil organic matter content from colour and environmental factors, Eur. J. Soil Sci. 66 (2015) 112–120, https://doi.org/10.1111/ejss.12199.
[14] P. Ataieyan, P.A. Moghaddam, E. Sepehr, Estimation of soil organic carbon using artificial neural network and multiple linear regression models based on color image processing, J. Agric. Mach. 8 (2018) 137–148, https://doi.org/10.22067/jam.v8i1.59228.
[15] P. Han, D. Dong, X. Zhao, L. Jiao, Y. Lang, A smartphone-based soil color sensor: for soil type classification, Comput. Electron. Agric. 123 (2016) 232–241, https://doi.org/10.1016/j.compag.2016.02.024.
[16] R. Stiglitz, E. Mikhailova, C. Post, M. Schlautman, J. Sharp, R. Pargas, B. Glover, J. Mooney, Soil color sensor data collection using a GPS-enabled smartphone application, Geoderma 296 (2017) 108–114, https://doi.org/10.1016/j.geoderma.2017.02.018.
[17] M.J. Aitkenhead, M. Coull, W. Towers, G. Hudson, H.I.J. Black, Prediction of soil characteristics and colour using data from the National Soils Inventory of Scotland, Geoderma 200–201 (2013) 99–107, https://doi.org/10.1016/j.geoderma.2013.02.013.
[18] E. Noshadi, H.A. Bahrami, S. Alavipanah, Prediction of surface soil color using ETM+ satellite images and artificial neural network approach, Int. J. Agric. 3 (2013) 87, http://www.researchgate.net/publication/236667076_Prediction_of_Surface_Soil_Color_using_ETM_Satellite_Images_and_Artificial_Neural_Network_Approach.
[19] D. Wu, W.W. Tan, A type-2 fuzzy logic controller for the liquid-level process, in: FUZZ-IEEE, 2004, pp. 953–958, http://s3.amazonaws.com/academia.edu.documents/7070250/a%20type-2%20fuzzy%20logic%20controller%20for%20the%20liquid-level%20process.pdf?AWSAccessKeyId=AKIAIWOWYYGZ2Y53UL3A&Expires=1548415318&Signature=Tmmbz1dRZybl9ShnbQcJQZ3wrT0%3D&response-content-disposition=inline%3B%20filename%3DA_type-2_fuzzy_logic_controller_for_the.pdf.
[20] C. Coulon-Leroy, B. Charnomordic, D. Rioux, M. Thiollet-Scholtus, S. Guillaume, Prediction of vine vigor and precocity using data and knowledge-based fuzzy inference systems, J. Int. Sci. Vigne Vin 46 (2012) 185–205, http://www.researchgate.net/profile/Cecile_CoulonLeroy/publication/235336038_Prediction_of_vine_vigor_and_precocity_using_data_and_knowledge-based_fuzzy_inference_systems/links/02e7e521b055d7c06e000000.pdf.
[21] G. Haiou, Y. Shujuan, J. Feng, X. Shaohua, Z. Yuhu, J. Baoshi, Diagnosis model of crop nutrient deficiency symptoms based on regularized adaptive fuzzy neural network [J], Trans. Chin. Soc. Agric. Mach. 5 (2012) 029, http://en.cnki.com.cn/Article_en/CJFDTotalNYJX201205029.htm.
[22] T. Zádorová, V. Penížek, L. Šefrna, M. Rohošková, L. Borůvka, Spatial delineation of organic carbon-rich Colluvial soils in Chernozem regions by Terrain analysis and fuzzy classification, Catena 85 (2011) 22–33, https://doi.org/10.1016/j.catena.2010.11.006.
[23] R. Rizzo, J.A. Demattê, M.P.C. Lacerda, Soil vis-NIR spectra and fuzzy K-means on definition of soil mapping units in topossequences, Rev. Bras. Ciênc. Solo 39 (2015) 1533–1543, https://doi.org/10.1590/01000683rbcs20140694.
[24] A.X. Zhu, F. Qi, A. Moore, J.E. Burt, Prediction of soil properties using fuzzy membership values, Geoderma 158 (2010) 199–206, https://doi.org/10.1016/j.geoderma.2010.05.001.
[25] T. Pei, S. Sobolevsky, C. Ratti, S.-L. Shaw, T. Li, C. Zhou, A new insight into land use classification based on aggregated mobile phone data, Int. J. Geogr. Inf. Sci. 28 (2014) 1988–2007, https://doi.org/10.1080/13658816.2014.913794.
[26] N.J. Rodríguez-Fernández, J. Muñoz Sabater, P. Richaume, P.d. Rosnay, Y.H. Kerr, C. Albergel, M. Drusch, S. Mecklenburg, SMOS near-real-time soil moisture product: processor overview and first validation results, Hydrol. Earth Syst. Sci. 21 (2017) 5201–5216, https://doi.org/10.5194/hess-21-5201-2017.
[27] A. Tannouche, K. Sbai, Y. Ounejjar, A. Rahmani, A real time efficient management of onions weeds based on a multilayer perceptron neural networks technique, Int. J. Farming Allied Sci. 4 (2015) 161–166, http://ijfas.com/wp-content/uploads/2015/03/161-166.pdf.
[28] S. Lameri, F. Lombardi, P. Bestagini, M. Lualdi, S. Tubaro, Landmine detection from GPR data using convolutional neural networks, in: Signal Processing Conference (EUSIPCO), 2017 25th European, IEEE, 2017, pp. 508–512.
[29] M.V. Biezma, D. Agudo, G. Barron, A Fuzzy Logic method: predicting pipeline external corrosion rate, Int. J. Press. Vessels Piping 163 (2018) 55–62, https://doi.org/10.1016/j.ijpvp.2018.05.001.
[30] S.S. Kale, P.S. Patil, Data mining technology with fuzzy logic, neural networks and machine learning for agriculture, in: Data Management, Analytics and Innovation, Springer, 2019, pp. 79–87.
[31] S. Bazeille, I. Quidu, L. Jaulin, Color-based underwater object recognition using water light attenuation, Intell. Serv. Robot. 5 (2012) 109–118, https://doi.org/10.1007/s11370-012-0105-3.
[32] P.B. Shelley, Communicating through visible light: Internet of things perspective, Curr. Sci. 111 (2016) 1903, https://www.currentscience.ac.in/Volumes/111/12/1903.pdf.
[33] H. Levkowitz, Color Theory and Modeling for Computer Graphics, Visualization, and Multimedia Applications, Springer Science & Business Media, 1997, http://www.springer.com/us/book/9780792399285.
[34] V. Agarwal, B.R. Abidi, A. Koschan, M.A. Abidi, An overview of color constancy algorithms, J. Pattern Recognit. Res. 1 (2006) 42–54.
[35] J.T. Barron, Convolutional color constancy, in: Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 379–387.
[36] J. Chamorro-Martínez, J.M. Soto-Hidalgo, P.M. Martínez-Jiménez, D. Sánchez, Fuzzy color spaces: a conceptual approach to color vision, IEEE Trans. Fuzzy Syst. 25 (2017) 1264–1280, https://doi.org/10.1109/TFUZZ.2016.2612259.
[37] M. Seaborn, L. Hepplewhite, J. Stonham, Fuzzy colour category map for the measurement of colour similarity and dissimilarity, Pattern Recognit. 38 (2005) 165–177, https://doi.org/10.1016/j.patcog.2004.05.001.
[38] J.M. Soto-Hidalgo, J.M. Alonso, G. Acampora, J. Alcala-Fdez, JFML: a Java library to design fuzzy logic systems according to the IEEE Std 1855-2016, IEEE Access 6 (2018) 54952–54964, https://doi.org/10.1109/ACCESS.2018.2872777.
[39] D.J. Bora, A.K. Gupta, F.A. Khan, Comparing the performance of L* A* B* and HSV color spaces with respect to color image segmentation, arXiv preprint, arXiv:1506.01472, 2015, http://arxiv.org/ftp/arxiv/papers/1506/1506.01472.pdf.
[40] D. Pascale, A review of rgb color spaces... from xyy to r’g’b’, Babel Color 18 (2003) 136–152, http://www.babelcolor.com/index_htm_files/A%20review%20of%20RGB%20color%20spaces.pdf.
[41] G. Hoffmann, CIE color space (2000), http://www.labri.fr/perso/granier/Cours/IOGS/color/ciexyz29082000.pdf. (Accessed January 2009).
[42] P. Colantoni, J.-B. Thomas, A. Trémeau, Sampling CIELAB color space with perceptual metrics, Int. J. Imaging Robot. 16 (2016) 1–22, http://jbthomas.org/Journals/2016IJIR.pdf.
[43] K.M. Goh, Z.b. Ismaail, 8 Colour Quantization of Colour Construct Code in CIELAB Colour Space Using K-Means Clustering and Hungarian Assignment, Springer International Publishing, Cham, 2016, pp. 671–681.
[44] K. Misue, H. Kitajima, Design tool of color schemes on the CIELAB space, in: 2016 20th International Conference Information Visualisation (IV), 2016, pp. 33–38.
[45] Y.N. Vodyanitskii, A.T. Savichev, The influence of organic matter on soil color using the regression equations of optical parameters in the system CIE-L*a*b*, Ann. Agrarian Sci. 15 (2017) 380–385, https://doi.org/10.1016/j.aasci.2017.05.023.
[46] A.H. Munsell, A Color Notation, Munsell Color Company, 1919.
[47] P. Silva, E. Roquero, M. Rodríguez-Pascua, T. Bardají, P. Huerta, J. Giner, R. Pérez-López, Development of a Numerical System and Field-Survey Charts for Earthquake Environmental Effects Based on the Munsell Soil Color Charts, 2013, http://www.researchgate.net/profile/J_Giner-Robles/publication/257918864_Development_of_a_numerical_system_and_field-survey_charts_for_earthquake_environmental_effects_based_on_the_Munsell_Soil_Color_Charts/links/00b4952612423a78e4000000.pdf.
[48] I. Farup, A computational framework for colour metrics and colour space transforms, PeerJ Computer Science 2 (2016) e48, https://doi.org/10.7717/peerj-cs.48.
[49] R. Chandramohan, Studies on the Effect of Graphite on Soil Cec Values Estimated Using Hexamine Cobalt Trichloride Cohex Method and Other Physical Properties, 2018, http://baadalsg.inflibnet.ac.in/bitstream/10603/203115/10/10_chapter1.pdf.
[50] Munsell Color Company, Munsell Soil Color Charts, Munsell Color Co., Baltimore, MD, 2000.
[51] M.C. Pegalajar, M. Sánchez-Marañón, L.G.B. Ruíz, L. Mansilla, M. Delgado, Artificial Neural Networks and Fuzzy Logic for Specifying the Color of an Image Using Munsell Soil-Color Charts, Springer International Publishing, Cham, 2018, pp. 699–709.