Self-Organizing Maps applied to ecological sciences



Ecological Informatics 6 (2011) 50–61


Ecological Informatics journal homepage: www.elsevier.com/locate/ecolinf

Tae-Soo Chon
Division of Biological Sciences, Pusan National University, Pusan 609-735, Republic of Korea

Article info

Article history: Received 7 November 2010 Accepted 7 November 2010 Available online 26 November 2010

Abstract

Ecological data are considered to be difficult to analyze because numerous biological and environmental factors are involved in a complex manner in environment–organism relationships. The Self-Organizing Map (SOM) has advantages for information extraction (i.e., without prior knowledge) and the efficiency of presentation (i.e., visualization). It has been implemented broadly in ecological sciences across different hierarchical levels of life. Recent applications of the SOM, which are reviewed here, include the molecular, organism, population, community, and ecosystem scales. Further development of the SOM is discussed regarding network architecture, spatio-temporal patterning, and the presentation of model results in ecological sciences. © 2010 Elsevier B.V. All rights reserved.

Contents

1. Introduction
2. Algorithm and structure
3. Application
3.1. Molecules and genes
3.2. Organisms
3.2.1. Response to toxic substances
3.2.2. Pro-ecological activities
3.3. Communities and populations
3.3.1. Benthic macroinvertebrates
3.3.2. Algae
3.3.3. Fish
3.3.4. Other taxa
3.3.5. Inter-taxa
3.3.6. Population
3.4. Ecosystems
3.4.1. Climate change
3.4.2. Remote sensing
3.4.3. Water resources
3.4.4. Assessment
3.4.5. Management
4. Future development
4.1. Convergence and stability
4.2. Data hierarchy
4.3. Temporal data
4.4. Adaptive and modular networks
4.5. Supervised SOM
4.6. Sensitivity
4.7. Visualization
4.8. Combined application
5. Conclusions
Acknowledgements
References

E-mail address: [email protected]. 1574-9541/$ – see front matter © 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.ecoinf.2010.11.002

1. Introduction

Due to unprecedented industrial development and population aggregation since the 20th century, human-induced disturbances have been manifested on a global basis and are ubiquitous across different hierarchical levels of life (e.g., molecular, cellular, organism, population, community, and ecosystem). Analysis of ecosystem data is essential for providing a comprehensive view of the complexity of environment–organism relationships. Ecological data, however, are considered difficult to analyze because numerous biotic and abiotic components are involved in ecosystem processes at all hierarchical levels of life. The components are related not only within but also between the hierarchical levels, eventually leading to trans-disciplinary holism (Odum and Barrett, 2005). Conventional multivariate methods are limited in revealing the non-linear and complex dynamics frequently encountered when analyzing and synthesizing ecological data, because they generally apply only to linear data and are less flexible in handling noise and uncertainty. Consequently, more feasible models are needed in ecology to deal with the complex nature of the system and its data. Biologically inspired machine intelligence and modeling have been proposed recently for analyzing complex data, motivated by the desire to understand the ecological (i.e., evolution) and physiological (i.e., neural processes) functioning of life systems. The properties of biologically inspired models are inherently analogous to life systems in that they are adaptive and self-organized when dealing with the data. The development of these types of models consequently led to the establishment of ecological informatics. In regard to the application of logic and control in ecological systems, the possibility of using artificial neural networks in ecology was envisioned as early as the 1980s by Odum (1983).
Ecological informatics is defined as an interdisciplinary framework that promotes the use of advanced computational technology for elucidating the principles of information processing at and between all levels of ecosystems (http://www.waite.adelaide.edu.au/ISEI/). The techniques used in ecological informatics could be general tools for information processing in ecosystem data, including ordination, classification, prediction, data characterization, identifying environment–organism causality relationships, decision making, etc. Methods in ecological informatics are closely associated with biologically inspired computation, including artificial neural networks, evolutionary and genetic algorithms, fuzzy logic, support vector machines, radial basis functions, individual-based models, cellular automata, fuzzy models, etc. Extensive reviews and examples regarding ecological informatics can be found in Lek and Guegan (2000), Lek et al. (2005), and Recknagel (2006). Through both supervised and unsupervised learning methods, artificial neural networks have been extensively used in ecological informatics based on biologically inspired machine intelligence. Supervised learning (e.g., Multi-Layer Perceptron; MLP) is conducted for data estimation (e.g., prediction and environment–community causality relationships) based on a priori knowledge, while unsupervised learning is used when deriving information from data (e.g., ordination and classification) without previous knowledge. In this system, solutions are obtained through an adaptive process to reach a global extremum of energy (i.e., information) in the system. Significant developments achieved in unsupervised learning include the Self-Organizing Map (SOM) (Kohonen, 1982a,b), Adaptive Resonance


Theory (ART) (Grossberg, 1980), the Hopfield Network (Hopfield, 1982), and other related models. The SOM (Kohonen, 1982a,b, 2001) has been used extensively for the extraction of information from complex data in various fields, including engineering, agriculture, health, etc. (Ritter et al., 1992; Kaski et al., 1998; Kohonen, 2001). The SOM is an efficient means of creating maps of multi-dimensional and complex data in order to approximate the probability density function of the input data and show the data in a more comprehensible fashion and in fewer dimensions (Kohonen, 2001). Consequently, the SOM makes it possible to extract information from a complex system and achieve flexibility in data presentation while the system stability is maintained through convergence to an equilibrium map (Ritter and Schulten, 1986, 1988). Applications to environmental science were reviewed by Kalteh et al. (2008) and Céréghino and Park (2008) regarding water resources classification. SOMs and related networks (e.g., Walley and O'Connor, 2001) have been applied to ecological data since the mid-1990s for community patterning (e.g., Chon et al., 1996; Foody, 1999). They have recently been used as one of the major models in biologically inspired machine intelligence in ecology (Lek et al., 2005; Recknagel, 2006; Chon and Park, 2006). This article outlines the application of the SOM across different hierarchical levels of life as well as the future development of the SOM in regard to network development, architectural modification, spatio-temporal patterning, and presentation of model results in ecological sciences.

2. Algorithm and structure

A comprehensive understanding of data can generally be achieved through networking that allows self-organizing processes. The SOM extracts information from multi-dimensional biological and/or environmental data and maps it into fewer dimensions (conveniently 2 or 3 dimensions).
In the SOM, a linear array of $M^2$ artificial neurons (i.e., computational nodes), with each neuron being represented as $j$ (Fig. 1), is arranged in a limited number of dimensions (usually two) for the convenience of visualization (Kohonen, 1982a,b; Chon et al., 1996). Suppose data for a community containing $n$ species or environmental factors (i.e., $n$ dimensions), where the density of species $i$ is expressed as a vector, $x_i$. Vector $x_i$ is considered to be an input layer for the SOM. Each node, $j$, is connected to each input node, $i$. The connection weights, $w_{ij}(t)$, change adaptively at each iteration of the calculation, $t$, until convergence is reached through minimization of the difference, $d_j(t)$, between input data $x_i$ and the weight $w_{ij}(t)$, summed over the $N$ input nodes:

$$d_j(t) = \sum_{i=0}^{N-1} \left(x_i - w_{ij}(t)\right)^2$$
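As a minimal sketch of this distance computation, the difference $d_j$ can be evaluated for every node at once, and the node with the smallest value taken as the best match. The sketch uses plain Python; the function names `squared_distances` and `best_matching_unit` are illustrative assumptions and do not come from the original description.

```python
def squared_distances(x, weights):
    """Return d_j = sum_i (x_i - w_ij)^2 for every node j.

    x       : input vector of length N (e.g., densities of N species)
    weights : list of weight vectors, one per node j, each of length N
    """
    return [sum((xi - wij) ** 2 for xi, wij in zip(x, w)) for w in weights]


def best_matching_unit(x, weights):
    """Return the index j of the node with the smallest d_j."""
    d = squared_distances(x, weights)
    return min(range(len(d)), key=d.__getitem__)


# Hypothetical example: three nodes, two input variables.
nodes = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
sample = [1.2, 0.9]
print(best_matching_unit(sample, nodes))
```

With these hypothetical values the second node (index 1) lies closest to the sample, so it would be the one selected for adaptation.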

Initially, the weights are randomly assigned small values. The neuron responding maximally to a given input vector is chosen as the winner, i.e., the neuron whose weight vector has the shortest distance to the input vector. The chosen neuron, and possibly its neighboring neurons, are allowed to adapt by changing weights to further reduce the distance between the weight vector and the input vector as shown below:

$$w_{ij}(t+1) = w_{ij}(t) + \eta(t)\left(x_i - w_{ij}(t)\right) Z_j,$$

where $Z_j$ is assigned 1 for the chosen (and its neighboring) neuron(s) and is assigned 0 for the remaining neurons. The term $\eta(t)$ denotes a fractional increment of correction for learning. The radius defining the neighborhood is usually set to a larger value early in the training process and is gradually reduced as convergence is reached. The weights of the best-matching unit and of the neurons close to it are adjusted toward the input vector through iterative calculation in the SOM lattice. The network is flexible for data organization, and so further development in network architecture has been proposed to deal with complexity in ecological systems, which will be discussed later.

Fig. 2a shows groupings of communities after training by the SOM. To indicate the distances between the groups more clearly, a U-matrix (Ultsch, 1993) can be used to discriminate between the groups. The degree of association among the grouped sample sites was also determined with Ward's clustering method (Ward, 1963), as shown in Fig. 2b. Fig. 2 additionally shows the groupings that were determined with Ward's clustering method, depicted by thick lines. For a detailed description of the algorithm, refer to Zurada (1992), Haykin (1994), and Kohonen (2001); for ecological applications, refer to Chon et al. (1996) and Lek and Guegan (2000).

Fig. 1. Schematic diagram of SOM. [Diagram: a data matrix of sample units (SU1 to SUp) by variables (xi) supplies the input layer (x1, ..., xi, ..., xn), which is connected through the weights wij to the neurons j of the output layer.]
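The training procedure described in this section can be sketched as follows. This is an illustrative implementation under stated assumptions, not the code used in the studies cited. The linear decay of $\eta(t)$ and of the neighborhood radius, the rectangular grid, the random order of sample presentation, and all function names (`train_som`, `u_matrix`, `mean_quantization_error`) are choices made here for illustration. The U-matrix is computed in a simplified form, as the mean distance between each node's weight vector and those of its adjacent grid nodes.

```python
import random


def sq_dist(x, w):
    """Squared distance d_j between input x and a node's weight vector w."""
    return sum((xi - wi) ** 2 for xi, wi in zip(x, w))


def winner(x, weights):
    """Index of the node whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda j: sq_dist(x, weights[j]))


def train_som(data, rows, cols, n_iter=2000, eta0=0.5, seed=0):
    """Train a rows x cols SOM; node j sits at grid cell (j // cols, j % cols)."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Weights start as small random values, as stated in the text.
    weights = [[rng.uniform(0.0, 0.1) for _ in range(dim)]
               for _ in range(rows * cols)]
    radius0 = max(rows, cols) / 2.0
    for t in range(n_iter):
        frac = 1.0 - t / n_iter        # assumed linear decay schedule
        eta = eta0 * frac              # learning rate eta(t)
        radius = radius0 * frac        # shrinking neighborhood radius
        x = rng.choice(data)           # present one sample per iteration
        wr, wc = divmod(winner(x, weights), cols)
        for j, w in enumerate(weights):
            r, c = divmod(j, cols)
            # Z_j = 1 for the winner and nodes within the radius, else 0.
            z = 1 if (r - wr) ** 2 + (c - wc) ** 2 <= radius ** 2 else 0
            if z:
                for i in range(dim):
                    w[i] += eta * (x[i] - w[i])
    return weights


def mean_quantization_error(data, weights):
    """Mean squared distance between each sample and its winning node."""
    return sum(sq_dist(x, weights[winner(x, weights)]) for x in data) / len(data)


def u_matrix(weights, rows, cols):
    """Mean distance from each node's weights to its adjacent grid nodes."""
    u = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    dists.append(sq_dist(weights[r * cols + c],
                                         weights[rr * cols + cc]) ** 0.5)
            u[r][c] = sum(dists) / len(dists) if dists else 0.0
    return u


# Hypothetical data: two groups of samples described by two variables.
samples = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
           [1.0, 0.9], [0.9, 1.0], [0.95, 0.95]]
som = train_som(samples, rows=3, cols=3)
for row in u_matrix(som, 3, 3):
    print(["%.2f" % v for v in row])
```

Large values in the printed U-matrix mark boundaries between groups of nodes, which is the information the thick lines and clustering in Fig. 2 convey for real data. Grouping the trained weight vectors with Ward's method would require an additional hierarchical clustering step that is not shown here.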

3. Application

3.1. Molecules and genes

Along with rapid developments in molecular biology, the SOM has demonstrated increased feasibility of information extraction in illustrating molecular structure and functional characterization. The SOM has been used in the clustering of high-dimensional molecular composition data since the early 1990s (e.g., Ferrán and Ferrara, 1992 and others). Several accounts of research using the SOM have been reported in the identification of chemicals in cell and molecular biology as well as in medical applications. Review of these topics, however, is beyond the scope of this paper. SOMs increased the feasibility of information extraction in both molecular genetics and microbiology, thus bridging the academic fields of molecular biology and ecology to elucidate the effects of selection pressures (e.g., environmental changes, geographic differences, tolerance, competition, and predation) on molecular genetics. Giraudel et al. (2000) applied a SOM along with fuzzy clustering to detect genetic structure in brown trout Salmo trutta based on microsatellite data. Roux et al. (2007) developed tools to discriminate between the genetic structures of different populations of an insect pest, the diamondback moth, Plutella xylostella, collected from widely distributed areas, and were able to identify pest populations from different geographic origins using Inter Simple Sequence Repeat (ISSR) markers. A SOM was trained on the genetic structures of pig populations, and individuals could be classified according to breed origin (Nikolic et al., 2009). The SOM has also been applied to clustering spatial patterns of surnames, which may offer a new strategy for better Y-chromosome sampling design in retrospective genetic studies. Genetic responses to environmental changes have also been illustrated with the use of SOMs.
Gene expressions from microarray data of plants (paper birch, Betula papyrifera) exposed to elevated CO2 and O3 gases during leaf maturation and senescence were analyzed by a SOM to illustrate the clustering of a fixed amount of gene expression in response to environmental changes (Kontunen-Soppela et al., 2010). The SOM has also been applied to the field of molecular genetics in regard to the potential for epidemic disease. There is a high potential

Fig. 2. Classification of the samples according to the trained SOM. a) The SOM units classified into 8 clusters, b) the dendrogram according to the Ward linkage method based on Euclidean distance, c) geographical locations of the sampling sites matched to the clusters according to the SOM. From Park et al. (2007).


for disease irruption due to global warming. Wu X.-L. et al. (2004) applied a human gene-based strategy for the census of orthologous genes and produced maps of biologically relevant transcriptional patterns using a SOM. Wu L.Y. et al. (2009) further investigated the ability of SOMs to solve the haplotype assembly problem with a minimum error correction model. A SOM-based algorithm can efficiently reconstruct haplotype pairs in a highly accurate manner under realistic parameter settings. Considering the current levels of population aggregation and environmental disturbance, epidemics and epizootics, including vector-borne diseases, could be prevalent in the future. Consequently, there may be a growing need for ecological informatics as applied to data for epidemics and related molecular/cellular studies regarding the eruption of pathogens and vectors in the future. In addition to the study of gene expression with the use of the SOM at the population level, gene expression has also been examined at the individual level in behavioral data. Choi et al. (2006) investigated the variation among mutant strains of a fruit fly, Drosophila melanogaster, and detected corresponding movement behaviors in response to genotypes. These studies demonstrate that techniques based on artificial neural networks could be useful for analyzing complex ecological and behavioral events induced by changes in genetic information.

3.2. Organisms

3.2.1. Response to toxic substances

At the organism level, SOMs have primarily been used to analyze behavioral data in response to the existence of environmental stressors (e.g., toxic chemicals). As described previously, SOMs have been used in the analysis of behavioral data in response to different genotypes (Choi et al., 2006). However, the majority of applications of SOMs to behavioral data have focused on the detection of response behaviors of individual animals in ecological risk assessment.
Specimens have been reported to be sensitive to sub-lethal exposure of environmental stressors such as toxic chemicals since the late 1980s (Lemly and Smith, 1986 and others). Behavioral data are complex, and so methods for analyzing behavior without a priori knowledge are required in behavioral monitoring. SOMs have been used to efficiently identify the patterns of response behavior in individual animals by selecting the behavioral parameters (e.g., speed (mm/s), stop number (stop frequency during 1 h), etc.) in the data (Chon et al., 2004; Park et al., 2005; Ji et al., 2007). The SOM made it feasible to classify behavioral patterns and organize behaviors in different categories (i.e., sub-clustering) according to response to chemicals. Recently, behavioral classification with the SOM has been expanded for use with humans in clinical applications. For example, the human gait signature was analyzed with a combination of wavelets and a SOM (Lakany, 2008). In this study, joint motion trajectories were extracted with the use of wavelets to identify spatio-temporal features, and the data were subsequently input into a SOM to reveal the classification of walking patterns of individuals. As a result, it was possible to extract features that successfully discriminated between those individuals with and without impaired locomotion.

3.2.2. Pro-ecological activities

The ability of the SOM to identify patterns in complex behavioral data has been further demonstrated at a large scale in the socioeconomic behavior of humans. The idea of ecological value has garnered special attention in regard to conservation policy. Consumers have undergone a change in social attitude that has resulted in an increased feeling of responsibility for personal habits and lifestyles as they relate to environmental problems (Stone et al., 1995).
With the use of a SOM, Mostafa (2009) analyzed variables affecting green consumption, such as altruistic values, environmental concern, skepticism towards environmental claims, and attitudes toward green


consumption. Psychographic segmentation of the green consumer improved clustering quality based on the SOM training. Pisati et al. (2009) examined the patterns of multiple deprivation regarding social exclusion in human society in Ireland and applied a SOM to provide a differentiated and interpretable picture of the structure of multiple deprivation. The SOM showed considerable additional discriminatory power in relating an individual's experience to his/her economic circumstances. The results suggest that a SOM approach can provide a valuable additional ‘methodological platform for analyzing the shape and form of social exclusion.’ Conservation policy and the sustainable management of ecosystems could cause conflicts between humans of different social statuses in a similar manner, and the type of information extracted with the SOM described previously could be suitable for analyzing the complex behaviors of humans and ecosystems in the future. The reconciliation of environmental conflicts requires sufficient communication between the social members involved, and the internet is essential in this regard. Prieto et al. (2008) recognized the importance of weblogs in creating new social networks and visualized the evolution of such web-based social networks with the use of the SOM. When neural networks apply SOMs and biologically inspired learning algorithms, they become suitable for either analyzing data or providing necessary information through data treatment. The SOM was extended to the integrated modeling of environmental and economic systems. Shanmuganathan et al. (2006) applied the SOM at regional (i.e., ecosystem response to human influence) and global (i.e., environmental–economic system and trade-off) scales to inform sustainable environmental management.

3.3. Communities and populations

The demonstration of response to natural and anthropogenic environmental changes has been a key issue at the hierarchical level of communities, and SOMs have been used to monitor such community-level responses. Initially, community response was characterized to represent environmental impact by ordination and classification. Subsequently, community response assessments and indicator systems were developed by analyzing large-scale data. Among various taxa, those considered to be good indicators, such as benthic macroinvertebrates, algae, and fish in freshwater ecosystems (Hellawell, 1986), have been widely analyzed with the use of SOMs.

3.3.1. Benthic macroinvertebrates

SOMs have been used in numerous accounts of community patterning of benthic macroinvertebrates in disturbed streams since the 1990s (e.g., Chon et al., 1996). The patterns in benthic macroinvertebrate communities identified by SOMs include community parameters (e.g., richness, diversity index, etc.) (Compin et al., 2005; Park et al., 2003b), functional feeding groups (Compin and Céréghino, 2007), hierarchical presentation (Park et al., 2004), effects of dam removal (Tszydel et al., 2009), and overall community occurrence patterns (Kwak et al., 2005). Obach et al. (2001) used a SOM in combination with a Radial Basis Function (RBF) network to predict the abundance of aquatic insect populations based on relevant environmental factors. Typically, the SOM improves understanding of community compositions in various ecosystems. A SOM was used to examine the variation in benthic macroinvertebrate biodiversity in ponds in agricultural areas (Ruggiero et al., 2008). In another study, researchers were able to cluster different groups of benthic macroinvertebrates from different sites in geographic (kryal, rhithral, and krenal) areas in glacial streams (Lencioni et al., 2007).
The structure of aquatic insect communities in bromeliads in an East-Amazonian rainforest in French Guiana was illustrated with the use of a SOM (Jabiol et al., 2009).


In addition to identifying community patterns, SOMs have also been used to understand community development trends. Temporal changes in the community patterns of benthic macroinvertebrates have been traced with a SOM (Chon et al., 1996; Song et al., 2007). Evaluation of ecological state in time series data was consequently possible through pattern recognition. SOMs have also been used in the characterization of restored stream sites by clustering benthic macroinvertebrate communities (Song et al., 2006). With the accumulation of data, SOMs have also been efficient in identifying patterns in large-scale data sets. Using a SOM, Céréghino et al. (2003) analyzed the regional distribution of lotic macroinvertebrate species in the Adour–Garonne drainage basin in southwestern France and provided stream classification based on characteristic species assemblages. They found that any change in species composition within a given region (e.g., EPTC region) could be considered as a biological indicator of environmental change. Benthic macroinvertebrates were used in a SOM to represent community grouping that reflected large-scale natural and anthropogenic impacts in Korea (Park et al., 2007). Fig. 2a shows an application of the SOM to large-scale community data. Benthic macroinvertebrates were collected at 1970 sample sites located in relatively clean to intermediately polluted areas in South Korea from 1997 to 2002. The resulting clusters accordingly represented geographic regions (Fig. 2b, c) (Park et al., 2007). This further indicates the ability of the SOM to define ecoregions.

3.3.2. Algae

Algal communities exhibit complex reactions to environmental impacts, and they are important ecosystem indicators. Algae play a key role in the productivity and ecological integrity of ecosystems, and as such, they are regarded as important for water quality monitoring and for understanding the state of aquatic ecosystems.
The Multi-Layer Perceptron (MLP) has been used to predict and understand environment–population causal relationships in algae (French and Recknagel, 1994; Scardi, 1996). Environmental stressors and water quality indices are checked with the use of a SOM through unsupervised learning. Data sets that are large in both size and time span have been accumulated, and the resulting reports were recently presented. The complicated factors residing in such data sets make the SOM suitable for ordination and clustering of the response of algal communities to environmental changes (Joo and Jeong, 2005). Tison et al. (2005) used a SOM with algal communities to characterize ecosystems, such as hydro-ecoregions, and then investigated the typology of diatom communities as a result of the influence of hydro-ecoregions. By comparing the diatom communities in natural and disturbed sites, indicators for different types and levels of anthropogenic disturbance were found throughout the French hydrosystem. A SOM was efficient in revealing environmental impact on algal communities and was suitable for presenting eutrophication states in lakes (Recknagel et al., 2006b). Joo and Jeong (2005) used algal communities to identify patterns in the eutrophication process of a river, revealing the seasonal occurrence of an algal genus (e.g., Anabaena in spring and summer) that resulted in the occurrence of a seasonal cyanobacteria bloom. Cyanobacteria blooms were also analyzed with the SOM and MLP in combination to identify the major environmental factors in the clustered communities (Oh et al., 2007). Complex algal dynamics were unraveled and forecasted by the combined use of a SOM and an Evolutionary Algorithm (EA) through organized data processing of ordination, clustering, and rule discovery (Recknagel et al., 2006a; Chan et al., 2007). Morin et al.
(2009) examined the linkage between diatom community structure and pesticide occurrence in a large-scale data set in southwest France. The typology of the diatoms collected was determined using artificial neural networks, which generated patterns in diatom community composition that indicated which species were influenced by pesticides combined with organic pollutants.

Tison et al. (2007) further examined nation-wide data and focused on the biogeographical variation in diatom assemblages in natural or near-natural conditions. They then used a SOM to characterize anthropogenic disturbances and resulting algal community changes. The SOM was also efficient in developing a water quality indicator system based on algal community responses. Gevrey et al. (2004) conducted a water quality assessment using diatom assemblages and reported 12 representative assemblages in presenting different ecological states. Park et al. (2006b) used a SOM to determine species within a diatom distribution in France as well as their structural index to understand the species associated with the distribution gradient. Coste et al. (2009) developed a system based on a long-term, large-scale algal data set to improve the Biological Diatom Index (BDI) in France for the surveillance of water quality. The physico-chemical and biological datasets were explored with both classical analysis (Principal Component Analysis (PCA)) and neural networks (SOM) in combination to identify new key species and ecological profiles. Recently, Rimet et al. (2009) reported on the BDI for Lake Geneva for 1974 to 2007. The SOM identified clear seasonal patterns during water stratification and mixing phases as well as influential environmental factors.

3.3.3. Fish

SOMs were also efficiently used to pattern fish communities, and were feasible in characterizing species diversity and in revealing environmental effects. Brosse et al. (2001) classified fish assemblages with a SOM and verified the classification with a PCA. Penczak et al. (2005) investigated spatial variations in fish assemblage structures and diversity and determined that fish species were grouped according to geomorphology: small streams, middle or lower courses of main channels, or large tributaries. Lasne et al.
(2007) used floodplain data in a SOM to report on the role of hydrological connectivity in determining biodiversity in fish communities and revealed a gradient of flow preferences. The effects of physical changes in river flow such as dam construction (Jabiol et al., 2009) were identified by changes in fish community. The distribution patterns of endemic fishes were characterized by a SOM after construction of the Three Gorges Dam in China and the survival rates of fish after dam filling, including species in danger of extinction, were reported (Park et al., 2003a). An indicator system was also developed with the use of fish communities and SOMs. Using multimetric data, information was extracted that showed the links between water body stressors and biotic integrity (Manolakos et al., 2007), providing an efficient data analysis and visualization tool for assessing the effects of anthropogenic stressors on the fish population through the fish metrics. The SOM was employed to identify a pattern in the sampling sites, and graphics can be used to explore the complex interrelationships in understanding the effects of the environment on fish communities. Long-term, temporal data have also been examined with the use of SOMs in fish communities. Hyun et al. (2005) used a SOM to pattern long-term fisheries data from South Korea from 1954 to 2001 to reflect the temporal impact of anthropogenic stressors (i.e., war and economic development) on fish communities. Kangur et al. (2007) examined long-term changes in fish communities in Lake Peipsi, on the border between Estonia and Russia. A SOM was applied to extract changes in the fish community in the shallow, eutrophic lake using the commercial fishery statistics recorded for 64 years between 1931 and 2002 (excluding 1940–1949 during World War II). The fish community changed gradually, although some abrupt changes were also recorded, varying from a community of clean- and cold-water species to one reflecting ongoing eutrophication.

3.3.4. Other taxa

The use of SOMs for determining patterns has been extended to other taxa. Both avian and terrestrial communities have been examined with SOMs. Recently, avifaunal assemblage patterns were

examined according to habitat preference, showing clusters characterized by seasonality and other environmental and habitat properties (Özesmi and Özesmi, 1999; Lee W.S. et al., 2006, 2007; Lee et al., 2009). Lee J. et al. (2007) classified breeding bird communities along an urbanization gradient using a SOM and defined the relationships between environmental variables and breeding bird assemblage patterns. SOMs have also been applied to plant communities in terrestrial settings. A SOM ordination method was used in conjunction with Detrended Correspondence Analysis (DCA) and PCA in a gradient analysis and successfully displayed quadrats in species space to reveal ecological gradients (Zhang, J. et al., 2008). Foody and Dash (2007) identified and mapped the C3 and C4 composition of the grasslands in the northern Great Plains of the United States based on remotely sensed data. By modifying a SOM, Ghosh et al. (2009) constructed a model for detecting context-sensitive change in plant communities. In another study, terrestrial collembolan macroinvertebrates were examined with a SOM to compare communities found in a peat bog with those found in the surrounding forest (Lek-Ang et al., 2007).

3.3.5. Inter-taxa

Recent improvements in surveying methods have led to the concurrent collection of data for different types of taxa. These different taxa (e.g., consumers/macroinvertebrates and decomposers/microorganisms), when studied in conjunction, can provide a comprehensive and broad understanding of an ecosystem. Song et al. (2005) conducted integrated studies on communities of macroinvertebrates and microorganisms in polluted aquatic systems with the use of SOMs. The scope of each taxon changed in conjunction with the degree of pollution in aquatic ecosystems. Kim et al. 
(2008) also examined the structural and functional relationships between macroinvertebrates and microorganisms with the use of a SOM, and associations between them were revealed (e.g., a negative correlation between Acidovorax sp., from polluted sites, and Gammaridae, mostly from the clean site). It can be foreseen that a broad scope of taxa, covering producers, consumers, and decomposers, will provide essential information on ecological integrity in ecosystems in the future. The SOM will be suitable for analyzing this type of complex inter-taxa data, regarding both information extraction and model presentation. Communities across different taxa need to be collected concurrently in different ecosystems, such as lakes and rivers, in order to reveal their ecological integrity. The “Ecostar” project, which aims to develop ecological health indices for lakes in Korea (2008–2012), is a good example in this regard, surveying major taxa covering algae, fish, macroinvertebrates, plants, and other taxa. Plankton communities have been extensively surveyed in lakes and reservoirs and have been analyzed by the SOM (e.g., French and Recknagel, 1994; Oh et al., 2007; Rimet et al., 2009). Fishes were also collected and analyzed by the SOM (e.g., Kangur et al., 2007). In lakes, however, communities of macroinvertebrates have scarcely been investigated by the SOM except for some taxa (e.g., chironomid assemblages (Penczak et al., 2006)). More studies on macroinvertebrates are needed to accomplish multi-taxa surveys in lentic conditions. In streams, in contrast to lakes, various taxa including benthic macroinvertebrates have been extensively analyzed with the SOM, as stated above. Microorganisms, however, have not been analyzed in either ecosystem, except for a few cases reporting macroinvertebrate–microorganism associations in polluted streams (Song et al., 2005; Kim et al., 2008). 
Systematic survey plans and standardized methods would be desirable for collecting major taxa concurrently in the target ecosystems in the future.

3.3.6. Population

Population ecology often focuses on spatial and temporal variation in population dynamics (e.g., competition and predation). Population

dynamics have mainly been investigated with the use of MLPs to reveal environment–organism causal relationships (e.g., French and Recknagel, 1994; Scardi, 1996). Consequently, the SOM is not used as frequently in population studies as it is in community studies. In one study, a SOM was applied to population data to characterize environmental conditions contributing to population establishment. Gevrey et al. (2006) estimated potential pest invasion with the use of a SOM. Park and Chung (2006) assessed the risk posed to pine trees by an insect pest with the use of two different artificial neural networks, one supervised and one unsupervised, in conjunction. Eight tree measurement variables were used to classify the trees, and the SOM proved to be useful for the classification of input vectors and for the analysis of relationships between input variables. Griebeler and Seitz (2006) examined Markovian metapopulation models and reduced their dimensionality with the use of a SOM by grouping nearly identical states to obtain more accurate and more reliable estimates.

3.4. Ecosystems

3.4.1. Climate change

Revealing the impact of environmental disturbances, including global climate change, is currently a key issue in ecosystem studies. The SOM has been useful for examining complex environmental data and providing information for ecosystem management policies. Estimation of rainfall has been one of the main concerns in climate change. Parikh et al. (1999) employed a SOM and an evolutionary algorithm for the recognition and tracking of synoptic-scale storm systems based on optical flow and cloud motion information from global satellite-based datasets. Nishiyama et al. (2007) identified the typical synoptic patterns causing heavy rainfall during the rainy season in Japan, defined by the amount of water vapor contained in a vertical column of the atmosphere. Lin and Wu (2009) combined a SOM and an MLP in order to forecast typhoon rainfall. 
Clustering and discrimination analyses were first performed with a SOM, followed by the use of an MLP to construct the relationship between input and output data. The SOM and wavelet transformation have also been used in conjunction for clustering spatial and temporal precipitation data to extract dynamic and multiscale features from a non-stationary precipitation time series (Hsu and Li, 2010). Recently, nonlinear climatology and paleoclimatology data were analyzed by the SOM to capture the spatial and temporal variability in atmospheric circulation data sets (Reusch, 2010). Chang et al. (2010) assessed the effect of meteorological variables on evaporation estimation with the SOM and showed that the topological structure of the SOM could give a meaningful map presenting the clusters of meteorological variables.

3.4.2. Remote sensing

Considering the complexity of remotely sensed data, the SOM was suitable for information extraction from satellite data (Richardson et al., 2003). Foody and Dash (2007) applied a stepwise discriminant analysis to map the C3 and C4 composition patterns of grasslands in the northern Great Plains of the United States based on remotely sensed data. Plant disease was detected from remote sensing data by Moshou et al. (2005). A SOM was used in a context-sensitive technique to create a map depicting changes in temporal remote sensing images (Ghosh et al., 2009). Land cover classification in China was possible with an iterative self-organizing data analysis applied to spectroradiometer time series data (Xia et al., 2008).

3.4.3. Water resources

Conservation of water resources has been regarded as critical because of both shortage and contamination. Water types have been classified by the SOM to reveal the complexity residing in spatial and temporal variation (Kalteh et al., 2008; Céréghino and

Park, 2008). Bowden et al. (2005, 2006) reported classification of water resources regarding different types of salinity, flow, and water level. Jeong et al. (2010) applied the SOM to large sets of catchment-wise data in the Nakdong River in Korea to reveal stream modification patterns. The SOM and a counter propagation network were used as a fluvial hazard management tool, dealing with the resource data in a hierarchical manner to predict reach-scale geomorphic condition (Besaw et al., 2009). Hydrodynamics is also important in influencing ecosystems. Two-stage SOMs were applied to quantify uncertainty in high-resolution coupled hydrodynamic-ecosystem models by reducing the dimensionality of the problem (Allen et al., 2007). Distributions were illustrated in association with real hydrodynamic data in marine ecosystems.

3.4.4. Assessment

In addition to providing environmental indicators at the community level, as described above, assessment can also be conducted at the ecosystem level (e.g., Tran et al., 2003), based on a combination of environmental factors and demographic and biological data. Aguilera et al. (2001) applied a SOM to nutrient data (e.g., ammonia and nitrite) in order to create an activation map and a quadrat system methodology providing a better classification system for differentiating coastal water quality. Assessment of groundwater in a semiarid zone was conducted with the SOM, which provided an easy and rapid means of water quality evaluation (Sánchez-Martos et al., 2002). Trophic status assessment is also important for water resources management. Lu and Lo (2002) applied the SOM to a historical database of environmental data (e.g., temperature, pH, and dissolved oxygen) for diagnosing reservoir water quality. 
Mele and Crowley (2008) evaluated soil biological quality by analyzing a range of environmental (e.g., soil, chemical and physical) and biological (e.g., genetic and biochemical signatures) variables. A SOM was used to identify a pattern in a large-scale data set to create a worst-case definition in a pesticide risk assessment (Sørensen et al., 2010). One goal in the study of ecosystems has been the quantification of total ecosystem quality. Prior to the initiation of any ecosystem management policy, an objective evaluation of the system is required to understand its current quality in relation to maturity and disturbance. The energy states used in ecological aspects have been defined by Jørgensen (Exergy, 1992) and Odum (Emergy, 1983). Parameters representing ecosystem quality have been revealed through the use of a SOM. Park et al. (2006a) showed that exergy responded differently to different water body types by using a SOM. 3.4.5. Management In addition to detecting environmental changes and evaluating ecosystem quality, SOMs are also useful in the creation of ecosystem management policies. A SOM was applied as a tool for predicting the performance of an integrated constructed wetland agroecosystem and assessment of nutrient removal performance, allowing real time control of the outflow water quality of the wetland (Zhang, L. et al., 2008, 2009). The effects of phosphorus removal have also been detected with the use of a SOM and a PCA in conjunction by analyzing and interpreting data from a P-removal Sequencing Batch Reactor (SBR) (Aguado et al., 2008).
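
Most of the applications reviewed in Section 3 share one basic workflow: a SOM is trained on a samples-by-variables matrix (e.g., sampling sites by taxa or by water quality variables), and each sample is then assigned to its best-matching unit on the map. The following Python sketch illustrates that workflow under stated assumptions. It is not the procedure of any study cited above: the function names `train_som` and `best_matching_units`, the 4 × 4 grid, the linear decay schedules, and the synthetic "clean" and "polluted" site data are all hypothetical choices made for illustration.

```python
import numpy as np


def train_som(data, rows=4, cols=4, n_iter=2000, lr0=0.5, sigma0=2.0, seed=0):
    """Online Kohonen training on a rows x cols rectangular grid (illustrative)."""
    rng = np.random.default_rng(seed)
    n_samples = data.shape[0]
    # Initialise each unit's weight vector from a randomly chosen sample.
    weights = data[rng.integers(0, n_samples, rows * cols)].astype(float)
    # Grid coordinates of every unit, used by the neighborhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                       # learning rate decays to 0
        sigma = sigma0 * (1.0 - frac) + 0.5 * frac    # radius shrinks from sigma0 to 0.5
        x = data[rng.integers(0, n_samples)]          # one randomly drawn sample
        bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
        grid_d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
        h = np.exp(-grid_d2 / (2.0 * sigma ** 2))     # Gaussian neighborhood around the BMU
        weights += lr * h[:, None] * (x - weights)    # pull units toward the sample
    return weights


def best_matching_units(data, weights):
    """Index of the closest unit (Euclidean distance) for every sample."""
    d2 = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


# Hypothetical data: 20 "clean" and 20 "polluted" sites described by 5 variables.
rng = np.random.default_rng(1)
clean = rng.normal(loc=0.0, scale=0.1, size=(20, 5))
polluted = rng.normal(loc=1.0, scale=0.1, size=(20, 5))
sites = np.vstack([clean, polluted])

weights = train_som(sites)
units = best_matching_units(sites, weights)   # map unit assigned to each site
```

Sites that share a unit, or that fall on neighboring units, have similar variable profiles. In practice the grid size, the training schedule, and any data transformation would need to be chosen for the data set at hand.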

4. Future development

4.1. Convergence and stability

As described in the previous sections, SOMs based on the Kohonen network have been widely applied across different hierarchical levels in biological systems because of their feasibility for information extraction and applicability to prediction, assessment, and production of management policies. The efficiency of information extraction demonstrated in the SOM (e.g., de Bodt et al., 2004) mainly stems from topology preservation through dimension compression (Kohonen, 1982a,b, 2001). Stability and convergence properties of SOM topology maps have been treated mathematically from various aspects, including stationary mapping and Markov processes (Ritter and Schulten, 1986, 1988), energy functions (Tolat, 1990; Jockusch and Ritter, 1994), and the maximum-entropy principle (Grabec, 1990). When dimensions are compressed from multivariate data, however, loss of information is unavoidable. Computational methods need to be developed to guarantee stability and topology preservation to the maximum extent. Mathematical issues regarding the SOM, including points that need to be proven in order to provide a framework for formulating mathematical questions, are discussed further by Fort (2006). Further developments of algorithms for topology conservation have been proposed, including quantification of neighborhood preservation in self-organizing feature maps (Bauer and Pawelzik, 1992). The enhancement of convergence in topology preservation has been described in various ways: neighborhood interaction function (Lo and Bavarian, 1991), reduced neighborhood width (Flanagan, 1998), metric topology preservation transportation (Bezdek and Pal, 1995), preservation of ordering relationship (Jin et al., 2004), reducing variability of neighborhood structure (Rousset et al., 2006), and convergence in vector quantization (Lepetz et al., 2007). Network architecture in the SOM is flexible and has been continuously developing, so further advancement in key algorithms such as neighborhood quantification is expected regarding data application and ecosystem types in the future.

4.2. Data hierarchy

The feasibility of information extraction was further elaborated by the hierarchical data organization of the SOM. The hierarchical data structure has been demonstrated with various methods, including algorithm development, combined application, and network modification. Mangiameli et al. (1996) discussed the superiority of the SOM over hierarchical clustering methods. For grouping messy empirical data, Chen et al. (1995) demonstrated the superiority of the SOM for hierarchical data organization over conventional hierarchical clustering methods regarding accuracy and robustness. The clustering properties of hierarchical SOMs were also demonstrated by the indexing method (Lampinen and Oja, 1992). Further hierarchical data structure can be achieved with the use of network modification. In one study, a secondary SOM (Pascual-Marqui et al., 2001) was used, and the SOM was enhanced when the sets of data associated with a particular weight vector were partitioned into a finer set of clusters. Vesanto and Alhoniemi (2000) applied hierarchical agglomerative and partitive clustering to prototypes that were initially produced by a SOM; the prototypes were then clustered in a second stage. Further development in hierarchical presentation was achieved through modification of algorithms, including the multiple hierarchical overlapped SOM (Suganthan, 2001), an index system applied to an adaptive double SOM (Ressom et al., 2003), and long-term hierarchical organization (Carpinteiro et al., 2007). The SOM and the Adaptive Resonance Theory (ART) were also used in combination, applied to a hierarchical organization of benthic macroinvertebrate communities in polluted streams, which efficiently reflected the impact of environmental factors (Park et al., 2004).

4.3. Temporal data

Long-term data are accumulating rapidly as field surveys are continuously conducted for long-term ecological research. Recently, models dealing with long-term temporal data have been presented, and algorithms have been proposed to facilitate information extraction from temporal data. The SOM was a feasible method for identifying patterns in temporal data (e.g., Voegtlin and Dominey, 2001; Simon

et al., 2007). Information extraction from temporal data with SOMs and the associated model development have been reviewed by Hammer et al. (2004), Strickert and Hammer (2005), and Strickert et al. (2005). Time series data can be analyzed either by combining other networks with the SOM or by algorithm development. Cao (2003) used the SOM and support vector machines (SVMs) for forecasting time series data. In the first stage, the SOM was used as a clustering algorithm to partition the whole input space into several disjoint regions, while multiple SVMs were then constructed by finding the most appropriate kernel function and the optimal free parameters. One of the major algorithm developments was the Temporal Kohonen Map (TKM) (Chappell and Taylor, 1993). The TKM extends the SOM with recurrent self-connections so that the neurons act as leaky integrators with an exponentially decaying memory of past inputs. Subsequently, the recurrent SOM (RSOM) (Koskela et al., 1998) was devised by integrating the directions of the deviations from the weights, consequently storing more information than the weighted distance of the TKM for successful time series prediction (Angelovič, 2005). While the TKM does not directly use the temporal contextual information of input sequences in weight updating, direct learning of the temporal context was possible with the RSOM (Angelovič, 2005). It allows model building using a large amount of data with only a little a priori knowledge. The recurrent SOM was utilized to train on the line movement of oligochaetes for classifying movement sequences in response to chemical treatments (Son et al., 2006). It is also notable that periods at various scales were detected by a recurrent oscillatory SOM (Kaipainen and Ilmonen, 2003). Additional algorithms applying recursive processes have recently been proposed. The recursive SOM (RecSOM) was developed by including context comparisons in addition to weights in the network (Voegtlin, 2000), while the TKM and RSOM only deal with temporal effects on weights. 
The SOM for structured data (SOMSD) arranges the neurons on a rectangular d-dimensional lattice and, with a richer representation of context, allows better quantization of time series data (Hammer et al., 2004). Strickert and Hammer (2005) developed the Merge SOM (MSOM) for the unsupervised sequence processing of temporal data. Vančo and Farkaš (2010) compared three models for processing tree-structured data, namely the SOM for structured data (SOMSD), the Merge SOM (MSOM), and the recursive SOM (RecSOM), by focusing on each unit's memory depth and on its capability to differentiate among the trees. According to Vančo and Farkaš (2010), SOMSD had the highest sensitivity to structural information that affected tree clustering.
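
The difference between the TKM and the RSOM described above can be made concrete with a small sketch of their winner-selection rules. The update formulas used here are the commonly stated forms of the two models (leaky integration of unit activity for the TKM, leaky integration of difference vectors for the RSOM). The function names, the parameter names `d` and `alpha`, the fixed two-unit weight set, and the toy sequence are assumptions for illustration, and weight updating is omitted.

```python
import numpy as np


def tkm_winners(seq, weights, d=0.9):
    """TKM-style winner selection (sketch): each unit leaks and integrates its activity,
    V_i(t) = d * V_i(t-1) - 0.5 * ||x(t) - w_i||^2, and the unit with the largest V wins."""
    activity = np.zeros(len(weights))
    winners = []
    for x in seq:
        dist2 = ((weights - x) ** 2).sum(axis=1)
        activity = d * activity - 0.5 * dist2
        winners.append(int(activity.argmax()))
    return winners


def rsom_winners(seq, weights, alpha=0.3):
    """RSOM-style winner selection (sketch): each unit integrates the difference vector,
    y_i(t) = (1 - alpha) * y_i(t-1) + alpha * (x(t) - w_i), and the smallest ||y_i|| wins."""
    y = np.zeros_like(weights, dtype=float)
    winners = []
    for x in seq:
        y = (1.0 - alpha) * y + alpha * (x - weights)
        winners.append(int((y ** 2).sum(axis=1).argmin()))
    return winners


# Two fixed units at 0 and 1, and a sequence that jumps from 0 to 1 at the last step.
weights = np.array([[0.0], [1.0]])
seq = np.array([[0.0], [0.0], [1.0]])

tkm_no_memory = tkm_winners(seq, weights, d=0.0)         # [0, 0, 1], same as a plain SOM
tkm_long_memory = tkm_winners(seq, weights, d=0.9)       # [0, 0, 0], history outweighs the jump
rsom_no_memory = rsom_winners(seq, weights, alpha=1.0)   # [0, 0, 1], same as a plain SOM
rsom_long_memory = rsom_winners(seq, weights, alpha=0.3) # [0, 0, 0]
```

With no memory (`d = 0` or `alpha = 1`) both rules reduce to the ordinary best-matching-unit search, whereas with memory the winner depends on the recent input history. The RSOM state `y` is a vector per unit, so it keeps the direction of past deviations, which the scalar TKM activity cannot.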

4.4. Adaptive and modular networks

Based on the flexibility of training processes, network architecture has been greatly expanded in recent years. Owing to its self-organizing properties, the network architecture can be easily modified, and the adaptive properties of SOM networks have been made clear through network modification. A series of studies has been developed regarding growing or modular networks (Tokunaga and Furukawa, 2009). The modular forms originally stemmed from operator maps, allowing transformation from vector space to function space (Kohonen, 1993), and from subspaces that presented the given data distribution in a set of infinite union subspaces (Kohonen et al., 1997). A growing SOM was one initial example (Bauer and Villmann, 1997), with the growth of nodes and a connective hypercubical output space in a self-organizing feature map. By allowing the growth of nodes, the network can approximate the input space more accurately and parsimoniously and can more easily deal with a dynamic input distribution (Marsland et al., 2002). The map output space was further enlarged by creating an extension into one of the existing dimensions (Villmann and Bauer, 1998). Adaptive formation of the nodes in the GSOM (Growing SOM) (Marsland et al., 2002) was proposed to provide flexibility so that the growth of nodes may occur when the system does not sufficiently match the

input. The architecture of the GSOM was further modified according to hierarchical structure (Dittenbach et al., 2002). The flexibility of network architecture has been further demonstrated in recent years with the sophisticated formation of modules. A SOM of SOMs (SOM2) was proposed by Furukawa (2005). In SOM2, the mapped objects are themselves SOMs, which differs from SOMs simply being used in combination. SOM2 has a hierarchical structure consisting of a single parent SOM and a set of child SOMs. Consequently, a SOM2 is an architecture that organizes a product manifold represented as (child SOM) × (parent SOM). Furthermore, a SOM2 can be extended to a SOMn.

4.5. Supervised SOM

The supervised SOM, as originally proposed by Kohonen (2001), is another alternative for broadening the scope of the SOM and is useful for tracing the whole pattern of input–output relationships. This network has been used for research in the fields of drug discovery and quantitative structure–activity relationships (Xiao et al., 2005; Bayram et al., 2004). Melssen et al. (2006, 2007) proposed models combining the counter propagation network, supervised SOMs, and partial least squares to pattern the input and output maps, capturing the multivariate structure and the input–output relationships and handling both linear and non-linear regression problems efficiently.

4.6. Sensitivity

Sensitivity analysis, a tool for studying the behavior of a system, ascertains how much the system's output depends on each or some of its input parameters. Numerous methods have been introduced for both global and local sensitivity analysis (Saltelli et al., 2004). Conventionally, sensitivity analyses in artificial neural networks have been conducted with MLPs (Lek and Guegan, 2000) and seldom with SOMs. With sensitivity analyses, however, the SOM would be suitable for patterning data without prior knowledge and for diagnosing the system response to errors in variables. Gonzalez et al. 
(1997) conducted a sensitivity analysis of the SOM as an adaptive one-pass, non-stationary clustering algorithm in a study of color quantization in image sequences. They revealed that the SOM was highly sensitive to the size (i.e., number of clusters) of the network to be adapted. Kropp (1998) also discussed sensitivity in city systems by inputting a sequence of small changes to the data about a given city; it was then possible to observe whether and how the city evolved towards the characteristics of another group. Paini et al. (2010) recently conducted sensitivity analyses in the prediction of invasive species by altering presence/absence data to estimate the risk of pest establishment.

4.7. Visualization

Data presentation in the SOM is efficient in providing a comprehensive view of complex input data. Commonly used visualization methods were reviewed by Vesanto (1999), who provided the information required for different presentations and exploratory data visualization. The process of visualization in general consists of vector quantization and vector projection (e.g., Sammon's mapping) (Vesanto, 1999). Other improvements were reported using a correlation matrix (Dzemyda, 2001) and through incorporation with other networks. A genetic algorithm (GA) was linked with the SOM to perform embedded hybrid visualization, using visualization as the core of the data analysis step for the optimization procedure (Wang et al., 2004). An alternative to the SOM was also suggested, using growing cell structure networks for visualizing high-dimensionality data sets (Wong and Cartwright, 2005). The model enabled generation of the

deterministic projection matrix, which outperforms the SOM projection method. Similä (2006) combined local linear quantile regression with the SOM to provide a fully operational method for visualization of the variable at the upper and lower tails. Yin (2002, 2008) discussed a model yielding results similar to multidimensional scaling (MDS) and reported that the model produced a quantitative or metric scaling and could aid adaptive embedding of highly nonlinear manifolds. Recently, Xu et al. (2010) used a polar SOM that not only preserved the data topology and the inter-neuron distance but also visualized the differences among clusters in terms of weight and feature. The visualization map could be divided into tori and circular sectors by radial and angular coordinates. Considering that visualization is increasingly valued in dealing with complex data and that SOMs are flexible in information and data presentation, more advancement in visualization is expected in the future.

4.8. Combined application

The SOM is flexible in linking with other models, and a combined or hybrid model can be tailored more specifically to the given conditions. The MLP has been a popular choice for combination with the SOM. Considering that the MLP is trained by supervised learning, the combined models are able to exercise both supervised and unsupervised learning concurrently in the process of information extraction. The SOM could be used for partitioning the data initially, and the MLP could be used to reveal input–output relationships within the partitioned data. The combined application was conducted on diatom assemblages (Gevrey et al., 2004; Tison et al., 2005) and algal blooms (Oh et al., 2007). The SOM and MLP were also used for patterning and predicting climate data (Lin and Wu, 2009; Dai and Chen, 2006). Evolutionary algorithms were efficiently combined with the SOM for unraveling algal population/community dynamics. 
Through organized data treatments of ordination, clustering, and rule discovery by the combined networks, prediction of algal dynamics and comprehensive outlining for management were possible, along with improved understanding of the complex community–environment relationships in the target lakes (Recknagel et al., 2006a; Chan et al., 2007). Kita et al. (2010) used the SOM and a GA to pattern populations at different scales: the GA for extracting information at the subpopulation level and the SOM for organizing patterns at the overall population level. The use of the subpopulation search algorithm improved the local search performance. The combined use of the SOM and a GA was also applied to prediction and classification of reservoirs (Hakimi-Asiabar et al., 2009). Synoptic-scale storm systems were recognized and tracked by using a SOM and an evolutionary algorithm (Parikh et al., 1999). Besides the MLP and evolutionary algorithms (EA), ART (Chon et al., 2000; Park et al., 2004), SVM (Cao, 2003), fuzzy logic (Mitra, 1994; Giraudel et al., 2000; Wong et al., 2008), the counter propagation network (Melssen et al., 2006, 2007; Besaw et al., 2009), RBF (Obach et al., 2001), and wavelets (Hsu and Li, 2010) were also used in combination with the SOM for analyzing community and/or environment data. Along with the biologically inspired models, statistical models were incorporated with the SOM for analysis of ecological data (Zhang, J. et al., 2008). Lencioni et al. (2007) applied multivariate methods (PCA and Multiple Linear Regression (MLR)) and neural networks (SOM and MLP) together. PCA and SOM classified the stations (i.e., glacial and non-glacial), while MLR and MLP detected the best predictors. Coste et al. (2009) used PCA and SOM in combination for analysis of the physico-chemical and the biological data sets to draw new key species profiles, and de Bodt et al. (2004) used statistical tools to assess the sensitivity of the SOM. 5. 
Conclusions As is well known, monitoring and management of environmental disturbances are critical for maintaining sustainability in ecological

systems. As illustrated previously, the SOM appears to be suitable for providing comprehensive views of complex data through dimension compression and topology preservation. The efficiency of SOMs in information extraction was demonstrated across different hierarchical levels of life, from molecules to ecosystems. At the molecular level, SOMs were useful for identifying structural and functional patterns in the relationships between molecular genetics and species occurrence, contributing to filling the gap between molecular biology and ecology. At the individual level, complex behavioral data were efficiently analyzed to monitor responses of animals (e.g., toxin detection) and humans (e.g., pro-ecological activities). The feasibility of SOMs was demonstrated in diagnosing impacts of disturbances and in establishing indicator systems at the community and population levels. At the ecosystem level, SOMs were suitable for revealing the effects of environmental disturbances (e.g., climate change) and were applicable to monitoring ecosystem quality and to creating management policies. Application of the SOM to complex data relies on its feasibility in information extraction and presentation, and future implementation of the SOM will depend on the development of computational methodology, including topology preservation and stability in convergence. Development of the SOM can also be based on network architecture, ensuring convergence in complex spatial and temporal data and adaptability to evolve toward more efficient and sophisticated SOMs that deal with the complexity in ecological processes. Along with the development of SOMs along the lines of ecology and informatics, SOMs could be applicable to a wide range of research areas in biological and environmental sciences. 
Acknowledgements

This study was supported by the CAER (Center for Aquatic Ecosystems Restoration) of the Eco-STAR project (Title: Evaluation of ecological integrity in lake ecosystem (08-IV-11)) from the Ministry of Environment, Republic of Korea.