On the statistical analysis of ocean wave directional spectra

On the statistical analysis of ocean wave directional spectra

Ocean Engineering 189 (2019) 106361 Contents lists available at ScienceDirect Ocean Engineering journal homepage: www.elsevier.com/locate/oceaneng ...

5MB Sizes 2 Downloads 63 Views

Ocean Engineering 189 (2019) 106361

Contents lists available at ScienceDirect

Ocean Engineering journal homepage: www.elsevier.com/locate/oceaneng

On the statistical analysis of ocean wave directional spectra Jesús Portilla-Yandún a, b, Francesco Barbariol c, *, Alvise Benetazzo c, Luigi Cavaleri c a

Centro de Modelizaci� on Matem� atica, Escuela Polit�ecnica Nacional, Quito, Ecuador Department of Mechanical Engineering, Escuela Polit�ecnica Nacional, Quito, Ecuador c Institute of Marine Sciences (ISMAR), Italian National Research Council (CNR), Venice, Italy b

A R T I C L E I N F O

A B S T R A C T

Keywords: Ocean waves Directional spectrum Spectral partitioning Self-organizing maps Wave statistical analysis

Given the growing availability of directional spectra of ocean waves, we explore two different statistical ap­ proaches to mine large spectra databases: Spectral Partitions Statistics (SPS) and Self-Organizing Maps (SOM). The first method is not new in the literature, while the second one is for the first time here applied to directional wave spectra. The main goal is to improve the characterization of the directional wave climate at a site, providing a more complete and consistent description than that obtained from traditional statistical methods based on integral spectral parameters (e.g., Hs, Tm, θm). Indeed, while the use of integral parameters allows a direct application of standard techniques for statistical analysis, important information related to the physics of the processes may be overlooked (e.g., the presence of multiple wave systems, for instance locally and remotely generated). The two proposed methods do not exclude integral parameters analysis, but they further allow ac­ counting for different events (e.g., with different genesis) independently. Although SPS and SOM are equally valid for both numerical model and observational data, we illustrate their potential using a 37-year long (1979–2015) model dataset of directional wave spectra at a study site in the western Mediterranean Sea. We show that standard integral parameters fail to show the complex and even multimodal conditions at this site, that are otherwise revealed by the directional spectra statistical analysis. Although the processing pathways and the resulting indicators of both SPS and SOM are substantially different, we observe that their results are mutually consistent, and provide a better insight into the physical processes at work.

1. Introduction The description of ocean waves by integral spectral parameters is the standard for most engineering and scientific applications. The main reason is historical, as waves have been described from the onset by their basic (although stochastic) parameters like significant wave height, mean period, and mean direction, which are typically analyzed using standard statistical methods. Some of these methods have been adapted from other fields of science like hydrology or meteorology: examples are time series analysis, extreme value theory and wind roses to characterize wind vectors (see, for instance, Ochi, 1978, Ochi, 2005, and Holthuijsen, 2007). However, even from a single look at the sea surface it is evident that its behavior is not one-dimensional (1D, time) but actually three-dimensional (3D, space-time). The first numerical wave prediction models were 1D, simply because the necessary tools were not yet developed. It was the concept of the wave spectrum (Pierson and Marks, 1952), particularly in the frequency-direction domain, which allowed to move towards the directional description of waves (then incorporated in

numerical prediction models) and, progressively but more slowly, the development of measuring techniques to estimate the directional spec­ trum (e.g., wave poles, three axis accelerometers, radar mapping, SAR and stereo imagery). The advantage of spectral analysis over the ones based on average integral parameters is that the wave spectrum contains a very rich amount of information as it represents the full energy dis­ tribution over the frequency-direction domain. For instance, the prin­ ciple of wave superposition inherent to the spectrum allows us to visualize different wave trains and to say something about their origin (e.g., Portilla-Yandún et al., 2015). In the last two decades, thanks to the developments in computing technology, meteorological centers, apart from their standard fore­ casting services, have started to produce long historical databases of wave spectral data (e.g., the ERA-Interim and ERA-5 datasets by the European Centre for Medium-Range Weather Forecasts; Dee et al., 2011; Copernicus Climate Change Service (C3S), 2017). In addition, measuring techniques to estimate the directional spectrum have also become more reliable and accessible and in the last decade the number

* Corresponding author. E-mail address: [email protected] (F. Barbariol). https://doi.org/10.1016/j.oceaneng.2019.106361 Received 20 November 2018; Received in revised form 22 August 2019; Accepted 23 August 2019 Available online 25 September 2019 0029-8018/© 2019 Elsevier Ltd. All rights reserved.

J. Portilla-Yandún et al.

Ocean Engineering 189 (2019) 106361

Fig. 1. (Left panel) Study site, western Mediterranean Sea (red pointer: 7.00� E, 41.00� N). (right panel) Wind rose at the study site (EI, 1979–2015); frequency of occurrence of winds (in percentage) is plotted radially. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)

of directional measuring buoys has increased significantly in number and coverage (see e.g., JCOMM, 2019). Another massive source of wave spectral data is provided by satellites, in particular the Synthetic Aper­ ture Radars on board the ERS1, ERS2, EnviSat, Sentinel, CFOSAT mis­ sions (Ardhuin et al., 2015). All these new technologies require the parallel development of new approaches and techniques for the statis­ tical description of the directional wave spectra. This is relevant for practical and scientific applications because it relates to the accuracy of estimates and projections (e.g., in engineering design, navigation, climate analysis). Integral parameters do provide a good description of wave charac­ teristics, but only under basic unimodal conditions (i.e., single peaked spectrum). However, in general the wave spectrum is made of two or more wave systems, reflecting the dynamical behavior of the atmo­ spheric forcing, appearing as wind sea and swell. In these conditions, integral parameters start to lose significantly their meaning, and whereas the significant wave height is not a very sensitive parameter, others like the mean period and mean direction can be heavily affected. The reason is that period and direction correspond to vector means related to the two coordinate variables of the directional wave spectrum, namely frequency and direction. In this paper we explore the applicability of two methods for the statistical analysis of directional wave spectra: Spectral Partitions Statis­ tics (SPS) and Self-Organizing Maps (SOM). The first one involves the physical identification of main wave features, followed by their statis­ tical characterization. This methodology has already been presented and applied in other studies (Portilla et al., 2009, Portilla-Yandún et al., 2015, 2016). The second one derives from, and it is based on, neural network techniques (Kohonen, 2001). This methodology has been pre­ viously applied in wave climate studies using spatial data based on in­ tegral parameters (Barbariol et al., 2016; Camus et al., 2011a), but it is here applied for the first time for directional wave spectra processing. In the present paper, the aim is to contribute to the statistical analysis of ocean waves based upon the frequency-direction spectrum, considering that this is the modern standard variable for statistical wave description, including numerical modeling and observations. We present the outcome of these two techniques highlighting their strengths and ad­ vantages with respect to standard types of analyses based on integral parameters. Our scope here is to illustrate the benefit of different types of spectral analysis, which may open the door to a better understanding of the wave physical processes and contribute to several marine related applications (e.g., structural design, operations planning, coastal engineering). The paper is structured as follows. In Section 2 we first introduce the working dataset, and give a brief overview of the climate conditions at the study site. We then specify the methodology starting with the typical

representation of waves according to standard statistical analysis based on integral parameters. In this section we also present the theoretical basis of both SPS and SOM. In Section 3 we present the main results from the two approaches, separately. Since the pathways and output formats of them are significantly different, in Section 4 we take advantage of some of their features that allow making a comparative analysis. We assess for instance cross-sea conditions, and the time evolution of storms from the spectral point of view. We summarize our results and conclu­ sions in Section 5. Finally, specific technical details of SOM are included in the Appendix. 2. Materials and methods 2.1. Dataset and study site To demonstrate the applicability of the proposed methodologies we use data from the ERA-Interim database (henceforth EI) from the Eu­ ropean Centre for Medium-Range Weather Forecasts (ECMWF, 2011). Model data has been favored here in order to facilitate the illustration because these spectra are homogeneous in format and continuous in time, with long enough records for a robust statistical analysis. Never­ theless, we point out that the methodologies presented here can be equally applied to observations including in-situ, or remotely sensed data. The EI reanalysis comes from fully coupled atmosphere-ocean-waveland system, covering the period from 1979 until 2019 (see Dee et al., 2011 for details). EI contains several atmospheric and oceanic param­ eters, including directional wave spectral data, which are derived from the WAM model (Komen et al., 1994). Although not specifically at spectral level, EI data have been extensively quality controlled and verified (Dee et al., 2011; Stopa and Cheung, 2014). The main limitation of EI is related to its spatial resolution, about 1� , from which not all the details of the atmospheric circulation, affected by the local orography, are expected to be captured. Nevertheless, the meso-scale phenomena driving the regional climate patterns are relatively well accounted for (Stopa and Cheung, 2014). The wave spectrum in EI is discretized in 30 frequencies ranging from 0.035 to 0.55 Hz in geometric sequence, and 24 equally spaced directions. The time resolution is 6 h, making a total of 54056 spectra per grid point, in the period considered for the study (1979-2015). As a practical example, a specific site in the western Mediterranean Sea has been chosen. This site is interesting, because the local wave conditions are significantly more complex than what is expected for an inner sea. Due to its geographical location within the mid-latitude convective cells (i.e., Ferrel and Hadley cells), the Mediterranean marks the limit between the southern warm and dry tropical conditions 2

J. Portilla-Yandún et al.

Ocean Engineering 189 (2019) 106361

Fig. 2. Statistical representation of the wave climate at the study site of Fig. 1, based on integral spectral parameters. (left panel) Wave rose plot, representing the histogram of the long-term joint occurrence of Hs and θm ; frequency of occurrence is plotted radially (in percentage). (right panel) Histogram of the long-term joint occurrence of Hs and Tm ; frequency of occurrence is plotted in color scale (in percentage); the black dashed line represents the breaking limit (steepness 2πHs =gT 2m ¼ 1/15, see e.g., Holthuijsen, 2007). (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)

in Africa and the relatively temperate and stormier zone of Europe. In addition, the Azores high-pressure center in the Atlantic strongly affects the dominant West to East atmospheric circulation, driving the regional weather at synoptic scale. The wave climate of the Mediterranean Sea has been the subject of extensive previous analyses (e.g., Ardhuin et al., 2007; Bertotti and Cavaleri, 2009; Cavaleri et al., 1991), so that a wealth of information is available for reference and comparison. In terms of waves, being an enclosed sea this area is in principle relatively simple, with waves generated locally, and excluding the possibility of remotely generated swells. In such conditions, the potential of spectral ap­ proaches can be better appreciated showing more details than the traditional approach based on integral parameters. One example is the not so evident, but possible, occurrence of multiple wave systems. The general meteorological conditions of the area are presented in Fig. 1. The left panel shows the reference area and study point. In the right panel, local wind statistics is provided in the form of a wind rose. The convention used along this document for all directional data (wind and waves) is oceanographic, i.e., showing flow direction. As it can be seen, the predominant winds are towards 120� and 150� , corresponding to the so-called Mistral wind. Since waves are a direct effect of winds, one may anticipate a similar condition for waves.

Tm . This representation is informative of the local relationship between these two parameters (e.g., the wave steepness). In this particular case, we observe that the most frequent wave conditions are concentrated in the range Tm ¼ ½3; 4� s, and Hs ¼ ½0:5; 1:0� m. However, heavy conditions with Tm ¼ 9s and Hs ¼ 7 m can also be found. For each wave period, the upper limit of the distribution in the right panel is related to the physical limit of wave growth due to breaking, determined by the wave steepness (see e.g., Holthuijsen, 2007). The representation of waves by integral parameters provides a good level of understanding about the local wave conditions, but shortcom­ ings arise in many practical applications. The most common are related to multi-modal sea states. Users are often confronted with simultaneous wave systems with different characteristic frequency and direction, and such situations are critical in harbor operations, offshore engineering, navigation, structural design, among others (e.g., Hegermiller et al., 2017; Kjeldsen, 1991). Whereas information about the presence of multi-modal states is actually present in the wave spectrum, by the computation of integral wave parameters, that information is lost and cannot be recovered, because the whole spectral distribution originally available has been encapsulated into these parameters. Therefore, any insight related to the spectral structure (e.g., multi-modality) requires returning to the original raw data. However, in practice this step is generally not trivial, because most processing techniques and systems in place are designed to provide only integral parameters. Therefore new approaches and methodologies have to be implemented in order to facilitate the exploitation of the information contained in the wave spectrum.

2.2. Standard statistical wave analysis The common description of waves is generally done using few inte­ gral parameters that account for their main properties, e.g., significant wave height, mean wave period, and mean wave direction: Hs , Tm and θm , respectively. For example Fig. 2 shows typical bi-variate distribu­ tions. The left panel shows a wave rose plot, in which the emphasis is on the directional distribution of waves, in combination with their energy (Hs , θm ), with the radial axis indicating the empirical frequency of occurrence. Consistent with the wind distribution (Fig. 1, right panel), waves in the SE quadrant dominate the climate (52% of total occur­ rence). These contain also the most severe sea states (Hs >4 m) confined within the 120–150� sector (SE), accounting for 35% of the total. The other three sectors, i.e., NE, NW and SW, account for the 19%, 17%, and 12%, respectively, there too with severe sea states (Hs >3 m), especially in the SW sector. It should be noted that the details of the less recurrent, hence apparently less relevant, waves get shadowed in this representa­ tion in favor of the more recurrent ones. As it will be shown later, one of the advantages of the statistical analysis of directional spectra is the more insightful view of the wave conditions in all the directional sectors. Since for most applications not only the magnitude and direction of waves are important, but also their period, the right panel of Fig. 2 shows another typically used bi-variate distribution, combining Hs and

2.3. Spectral Partitions Statistics (SPS) A method for deriving spectral statistics using partitioning is described in Portilla-Yandún et al., 2015. This method is based on the principle of linear superposition of different wave trains, inherent in the wave spectrum. This is possible because a spectral partition is physically understood as a self-consistent cluster of energy bins that represents an independent wave event, with particular characteristics and meteoro­ logical genesis. In general, the wave spectrum may contain one wind driven wave system and as many swell partitions as the complexity of the local conditions allow. In turn, the wind sea condition may corre­ spond to different stages of development (e.g., young or old wind seas). Integral parameters of the individual clusters are much more meaningful than those calculated for the whole spectrum, because the loss of con­ sistency of total integral parameters (e.g., Hs, Tm, θm) in multi-modal spectra arises from the fact that these correspond to vector means of the energy components. An important advantage of partitioning is that a 3

J. Portilla-Yandún et al.

Ocean Engineering 189 (2019) 106361

Fig. 3. SPS representation of the wave climate at the study site of Fig. 1. (left panel) long term occurrence distribution of spectral partitions in (f, θ). The four main wave systems are indicated with black contours. (Right panel) partial Hs distributions presented in box-whiskers plot format: red continuous line indicates the median value (percentile 50th), the blue boxes indicate the 25th and 75th percentiles, the whiskers are placed at 1.5 times the inter-quartile range on each side, and the red crosses give a general idea of possible extreme values. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)

substantial data reduction takes place while the essence of the spectral information is preserved. The purpose of partitioning is thus to identify all the possible energy clusters within the spectrum. In practice, partitions can be found making an analogy of the wave spectrum with an image, or alternatively with a 3D topographic relief. Both schemes have advantages and can also be combined. For the partitioning step as such, we rely here on the so-called mountaineer scheme (Hasselmann et al., 1996). This algorithm uses the analogy with a topographic relief in which the steepest ascent energy path from every point of the spectrum is followed and traced until a local peak is reached. The collection of all the points associated to the same local peak constitutes a partition (this scheme is described in detail in Portilla et al., 2009). In principle, the solution for a real spectrum (e.g., signed, non-planar, non zero) is unique. However, spectra can be subject to natural variability (specially observed ones) or other types of spurious features, which cannot necessarily be considered as significant wave systems. Therefore, a post-processing step is usually needed to auto­ matically assess what are the relevant features. This is achieved through image processing tools, using a filtering and thresholding sequence based on a smoothing filter (see specific details in Portilla-Yandún, 2018a,b). Once the whole spectral time-series are decomposed in partitions, these can be characterized by their specific attributes (e.g., Hs , Tm , θm ). In turn, these attributes can be used to derive related spectral statistics. The main purpose is to identify the most relevant long-term spectral wave features. Therefore, for a skillful descriptor we use the empirical distribution of partitions, obtained as the collection of the peak co­ ordinates (fp ; θp ) of each partition of the whole series. This is simply an occurrence distribution in the original bivariate (f; θ) domain (see Portilla-Yandún et al., 2015 for details, and the results of section 3.1). The long-term wave systems are defined from the clusters of this occurrence distribution of partitions. In practice, this is carried out using the same partitioning algorithm as for a single energy spectrum. In turn, the physical interpretation is that waves with similar generation con­ ditions correspond to similar spectral characteristics. An advantage of this descriptor is the capability to reveal distinctly the different wave spectral features. Recently, with the release of massive spectral data from the EI archive, a global atlas containing wave spectral statistical indicators has been produced (GLOSWAC; https://www.modemat.epn.edu.ec/nereo/; see Portilla-Yandún, 2018a for details). The SPS analysis presented here

is derived from the output of this atlas. 2.4. Self-Organizing Maps (SOM) The second approach is based on the so-called Self-Organizing Maps (SOM) technique, which is a neural network technique typically used for data mining purposes (e.g., the classification of data, vector quantiza­ tion and projection). The SOM technique (Kohonen, 2001) has been applied in different fields of science, including oceanography (Falcieri et al., 2013; Liu et al., 2006; Solidoro et al., 2007). In previous wave studies, SOM has been used to characterize ocean waves at specific lo­ cations at integral parameters level (Barbariol et al., 2016; Camus et al., 2011a; Reguero et al., 2013). Typical local features emerge from the SOM outputs, allowing portraying the wave climate, originally repre­ sented by a huge number of (Hs , Tm ; θm ) triplets. From them, SOM generates a reduced number of prototypes that represent classes of sea states with similar characteristics. In addition, the visualization capa­ bilities of SOM allow displaying multiple integral/peak parameters on a 2D lattice (Camus et al., 2011a), called map. This is a feature that en­ ables to project a multivariate dataset onto a lower-dimensional space (generally 2D), which can be seen as a proxy of the original dataset. The 2D SOM lattice, is made of several elements called units and has a much smaller size than the input dataset. A disadvantage of SOM relative to other clustering algorithms (e.g., the “Maximum Dissimilarity Algo­ rithm”) is the limitation for representing the boundaries of the original dataset, i.e., extremes, because the classification makes emphasis on the more recurrent conditions. Nevertheless other strategies can be applied to overcome this deficiency (see Barbariol et al., 2016; Camus et al., 2011b). Data compression and classification in SOM is carried out by means of a competitive–cooperative learning technique. At each iteration, the SOM units compete among themselves to be the best-matching unit (BMU), i.e., the closest to the input data according to a prescribed metric (competitive stage), and they organize themselves due to lateral inhi­ bition connections (cooperative stage). The classification metric is based on squared Euclidean distances dij between the input data Sðti ; cÞ and the SOM unit S* ðui ; cÞ, defined as: N X

Sðti ; ck Þ

dij ¼ k¼1

4

S* uj ; ck

��2

J. Portilla-Yandún et al.

Ocean Engineering 189 (2019) 106361

analysis, revealed as a good compromise between significant data compression and a fair representation of the input dataset (the data reduction factor is about 540). The SOM lattice quality is typically assessed as the goodness of the map in representing the original dataset and in preserving the topology of the input dataset. This is evaluated by means of error metrics (see Kohonen, 2001; Kohonen et al., 2009) like the mean quantization error (MQE, the average Euclidean data-to-BMU distance), and the topographic error (TE, the percentage of data that do not have first and second BMUs adjacent in the map). As there is not a consensus on the benchmark against which these error metrics have to be assessed (e.g., MQE is case-dependent, varying with the normaliza­ tion of data, the dimension of the map and other factors), here we have also used other metrics, routinely used in geophysical sciences (e.g., the 2D cross-correlation coefficient and Normalized Root Mean Square Error). Specific details about the SOM implementation and its quality assessment are included in the Appendix.

Table 1 Summary of the wave systems identified in Fig. 3, their corresponding flow di­ rections, their associated driving winds and their frequency of occurrence ac­ cording to SPS. Wave System

Flow direction

Driving Wind

Frequency of occurrence (SPS)

SPS1 SPS2 SPS3 SPS4

South-East North-West North-East South-West

Mistral Sirocco Libeccio Grecale

52% 17% 19% 12%

where ti refers to the i-th spectrum, ck to the vectorized spectral position (f,θ), and uj to the j-th map unit. In general, inputs have to be normalized to make them homogeneous, and then de-normalized to obtain the output. The number of output BMUs is chosen arbitrarily, generally as a trade-off between an accurate representation of the input dataset and a significant data reduction factor. The more the BMUs (i.e., the smaller the reduction factor), the better the representation. The lateral inhibi­ tion among the map units is based upon the map topology using a neighboring function that expresses how much a BMU relates to its neighbors at each step of the learning process. In this study the SOM technique is applied for the first time to directional wave spectra. For this we make use of the open-source SOM toolbox for MATLAB® (Vesanto et al., 2000; Vatanen et al., 2015). This toolbox encompasses several functions for pre-processing, SOM classi­ fication and post-processing. The SOM has been chosen as a 10 � 10 square lattice (100 prototypical spectra), which, after a preliminary

3. Results In this Section we present the results from the individual perspective of each of the proposed methodologies, SPS and SOM, pointing out their advantages and limitations. 3.1. Spectral Partitions Statistics (SPS) Fig. 3 shows the typical output of SPS for representing the wave

Fig. 4. SOM: 10 � 10 map representation of the wave climate at the study site of Fig. 1. The 100 prototypical 2D spectra are represented by their spectral parameters. The outer squares (left color bar) indicate the frequency of occurrence of the prototype with respect to the total (in percentage). The inner squares (right color bar) indicate their Hs magnitude. The gray arrows indicate the total mean direction (θm ) and period (Tm ) (length scale shown at the bottom-right corner). The black, blue, and green arrows indicate the peak direction and periods (θp , Tp ) of the primary, secondary, and tertiary peaks respectively. The black solid line shows the transition of mean wave direction over the four quadrants. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.) 5

J. Portilla-Yandún et al.

Ocean Engineering 189 (2019) 106361

Fig. 5. SOM-MDA representation of the wave climate at the study site of Fig. 1, through 8 prototypical 2D spectra normalized on the spectral peak. For each prototype the title shows the frequency of occurrence and the mean Hs of the represented sea states. The lower panels show the 1D projections into frequency and direction. The blue dashed lines correspond to the prototypes; black solid lines indicate the average, and the gray zone the standard deviation. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)

climate at a study site, in this case at the Mediterranean Sea location of Fig. 1. The left panel shows the occurrence frequency distribution of all partitions found in the time-series. Conveniently, this distribution is derived and presented in the same spectral domain (f; θ) as the input data. The color scale (third dimension) indicates the number of parti­ tions registered in every specific spectral bin. As it can be observed, partitions do not occur evenly over the whole spectral domain, but tend to cluster naturally in specific sectors. Moreover, these clusters are very well defined, suggesting a strong relationship with the physical condi­ tions at the reference site (e.g., meteorological, geographical). There­ fore, this distribution can be associated to the possible wave systems physically existing at the reference point. A direct visual inspection suggests the presence of four clusters, namely one in each quadrant. In practice, as mentioned before, this detection is carried out automatically by applying the partitioning algorithm to this distribution (black con­ tours in Fig. 3) as it was a single spectrum. Following the oceanographic convention for the directions, the wave system 1 (SPS1) flows to about 140� , with characteristic frequencies from 0.1 to 0.2 Hz, SPS2 flows to 330� , SPS3 flows to 40� with a secondary lobe at about 90� , SPS4 flows to 220� . Table 1 summarizes the situation including the traditional names of their corresponding driving winds. This solution is probably not very intuitive at first look, especially considering the expectations build from

integral parameters (right panel of Fig. 1 and left panel of Fig. 2), which suggest the dominance of the Mistral condition (SE). It is important to note that the identification of these systems simplifies enormously the analysis of spectral data, because it implies that at the reference location wave conditions can be classified only in one of these four groups. This principle traverses the concept of the sea state (wind sea or swell), which can be simply seen as one of the many attributes of a specific wave system. Indeed, the specification of these four long-term wave systems allows their individual characterization, using for instance their respective typical integral parameters, i.e., Hs, Tm, and θm. In fact, these last two are already available in the occurrence distribution of Fig. 3 (left panel), and any other parameter can be readily obtained by processing the partitions time series. Since the significant wave height is usually a parameter of immediate interest, for a comprehensive view of the local wave conditions, Fig. 3 (right panel) includes the individual distribu­ tions of Hs for each component and for the total spectra. For the sake of conciseness, this is given in a box-plot format. Note that the total energy E (and not Hs) corresponds to the sum of its components (i.e., Hs ¼ 4√E). Quite in agreement with the information provided by total integral parameters (Fig. 1), in Fig. 3 (right panel) we observe that the Mistral condition (SPS1) is the dominant system, both in terms of mean values and also for extremes. Nevertheless, the other wave systems 6

J. Portilla-Yandún et al.

Ocean Engineering 189 (2019) 106361

(particularly SPS3 and SPS4) can also have significant magnitudes. From the meteorological perspective, it is interesting to note that these four conditions are relatively well known, having indeed ancient associated names (SPS1: Mistral, SPS2: Sirocco, SPS3: Libeccio, and SPS4: Grecale; see Table 1). As said before, many characteristics of these wave systems can be determined from the original time series, including their seasonal and inter-annual variability. However, the scope here is to evaluate the skills of spectral statistical methods.

respectively) contain information about how the prototypes relate to the represented spectra. The blue lines correspond to the prototypes, while the black lines correspond to the mean of the represented spectra, with the gray regions showing the standard deviation. In general, there is a fair agreement, particularly in the directional distributions. On the other hand, there are significant differences between the 1D frequency spectra. This is mainly related to the inclusion of the calm conditions (Hs < 1:0 m) in the processing. It is worth noting that within the 8 prototypes we can observe 4 unimodal conditions (upper panels in Fig. 5), one for each quadrant, in agreement with the results obtained from SPS (Fig. 3). As expected, the most common and energetic condition is Mistral with 30.6% of the cases, and representative (average) Hs ¼ 2:10 m, Libeccio (NE) represents 13.9% of the cases with Hs ¼ 1:20 m. Grecale contains 4.4% frequency of the cases with Hs ¼ 1:64 m. Sirocco (NW) is relatively recurrent (11.7%), albeit with mild values (Hs ¼ 0:85 m). In addition, we detect 4 bimodal conditions (lower panels in Fig. 5), of which the combination MistralLibeccio is relatively recurrent (22.5%) albeit with relatively low mag­ nitudes (Hs ¼ 0:48m). Other combinations are less frequent with rela­ tively mild magnitudes (Hs about 1 m). Therefore, the SOM approach directly indicates the existence of waves in all quadrants of the spectral domain including bi-modal states. These results are qualitatively consistent with those obtained with SPS, both providing a better insight into the spectral structure. In any case, the spectral and standard approaches lead to consistent results. Indeed, comparing Fig. 2 with Figs. 3 and 4, it is clear that all the methods agree in some parameters. For instance, the dominant wave propagation direction is SE, with recurrence of about 52% according to STD and SPS, and 50% for SOM. For the three methods, this is also the quadrant where the most severe sea states occur. This consistency of SOM with STD and SPS is visible also in the other quadrants (NE: 16% vs. 19%; SW: 8% vs. 12%; NW: 26% vs. 17%).

3.2. Self Organizing Maps (SOM) The main result of SOM is the classification of the sea states at the study site, represented by a reduced number of the most typical condi­ tions, and presented in the form of a map. In this case, the map was defined as a 10 � 10 square grid, comprising therefore 100 typical states. However, since such a large number of spectra cannot be easily dis­ played, for an overview we have represented them in Fig. 4 through their main integral (Hs , Tm , θm ) and peak (Tp , θp ) parameters. In that Figure, the inner color of each grid element indicates the magnitude of the significant wave height, whose scale is displayed on the right. The gradual change in magnitude (color) is a direct result of the principle of minimum distance inherent to SOM: similar spectra are assigned to the same class (BMU), and similar classes are placed next to each other. The outer color shows the frequency of occurrence (scale on the left) of each prototype with respect to the total. In this representation the map con­ tains information of the empirical probability distribution of the spectral parameters. In addition, each grid element includes four arrows, each displayed with a different color (gray, black, blue, and green). Arrow direction represents respectively the mean direction, and the peak di­ rection of the primary, secondary, and tertiary wave systems. The length of the arrows is proportional to the wave period. The peaks of the pro­ totypes have been detected using the algorithm of Aguilera (2016). Two thresholds have been set to keep only the most energetic peaks, the secondary (tertiary) peak should have at least 10% the energy of the first (second), and should be clearly separated in direction (at least 30� dif­ ference). In order to compare the SOM results with the previous results (left panels of Figs. 2 and 3), the conditions corresponding to each Cartesian quadrant have been indicated in the map (black line in Fig. 4). The Mistral condition (SE) appears here also as the dominant one, taking 64 out of the 100 BMUs, and including the highest Hs values. Also consistent with the SPS results, the Grecale condition (SW) is the second most important, followed by Libeccio (NE), and then the Sirocco condi­ tion (NW) with lower magnitudes and recurrence. It is interesting to note that the map derived from 2D spectra (Fig. 4) is different from the corresponding map derived from integral parameters (e.g., Hs , Tm , θm ; see, for instance Barbariol et al., 2016, their Fig. 4). Working with 2D spectra offers also the possibility of accounting for multi-modal sea states, which are directly visible in the map as grid elements with sec­ ondary and tertiary peaks (arrows). This will be discussed in more de­ tails in the next Section. Further on, a detailed inspection of Fig. 4 suggests that some of the prototypes of the map are redundant to some extent. The Mistral con­ dition for instance, contains several entries with similar characteristics that can be eventually regrouped and represented by a smaller number of prototypes. Note that this reduction cannot be done directly by imposing a smaller grid size of the map, because this would eliminate the less recurrent but relevant conditions, in favor of the more recurrent. Therefore, this reduction has to be carried out in a post-processing step. To this end, a Maximum Dissimilarity Algorithm (MDA, Kennard and Stone, 1969) has been applied to the 100 prototypes initially found. MDA uses the same principles of SOM, based on squared Euclidean distances, to detect the most dissimilar conditions. This results in the 8 prototypes shown in Fig. 5, which is a very compact representation of the 54056 original spectra. To improve visualization, the spectra shown in Fig. 5 have been normalized with respect to their peaks. In addition, the 1D spectra shown in Fig. 5 (projected onto frequency and direction,

4. Analysis In the previous section, the main overall results of the two spectral methods have been outlined independently. In general, both methods offer a deeper understanding of the spectral wave conditions from different perspectives, showing an overall qualitative consistency. In this section we show that these methodologies can be further exploited depending on the specific application of interest. In addition, we note that due to the different steps of these methodologies and their output formats, SPS and SOM cannot be always directly compared, but when­ ever a comparison is possible their consistency is confirmed. For a comparative analysis we explore here two possible applications. The first is related to the assessment of cross-sea conditions (Section 4.1) and the second to the time evolution of complicated multi-modal stormy situations (Section 4.2). 4.1. Cross-sea conditions Although not affected by long distance incoming swells as in the oceans, still the strong and often rapid variations of the Mediterranean weather may lead to locally complicated wave spectra involving crossseas. These are notoriously dangerous to navigation with a large num­ ber of accidents reported in these conditions. Apart from the manoeu­ vring difficulties, cross-seas may be associated to a higher probability of rogue wave occurrence under certain conditions (e.g., Cavaleri et al., 2012). For planning purposes, avoiding navigation under such condi­ tions is a matter of a cost and risk balance. Using integral parameters only, this estimation becomes intricate and to some extent subjective, because some metrics need to be defined, among others, in terms of the crossing angle and the magnitude of the individual events (e.g., Cavaleri et al., 2012; Li, 2016). However, only recently wave forecasting systems started providing related information, but a long-term statistics is hardly available. In these cases, there is advantage in using spectral approaches. 7

J. Portilla-Yandún et al.

Ocean Engineering 189 (2019) 106361

PðXÞ is the marginal exceedance probability of X. Such equation can be applied to any pair of wave systems and for any Hs threshold value. For illustration, we derive here the crossing probabilities of every pair of the SPS components (SPS1, SPS2, SPS3, SPS4), based on a threshold of 1 m. The results obtained are presented in Fig. 6. In this figure, both the horizontal and vertical axes indicate the wave system number. The upper part of the grid diagonal indicates the probability in color, while its corresponding numeric value can be read in the lower part. As can be seen, the largest value (0.88%) corresponds to the pair SPS1-SPS4 (Mistral and Grecale). This accounts roughly for about 13 events per year. The combination SPS1-SPS3 (Mistral and Libeccio) is also relatively frequent (0.66%), followed by lower percentages of the other combi­ nations. This is qualitative consistent with the SOM results (Fig. 4). We quantitatively compare these results with those from SOM in Fig. 5. This comparison requires however some previous considerations. By default SOM looks for the most recurrent conditions, and those naturally include calm conditions. In turn, the joint probability PðX; YÞ derived from SPS is based on a threshold (e.g., 1 m). For instance, the combination SPS1-SPS4 (Mistral-Grecale) results in 0.88% crossing (Fig. 6). In SOM this combination is associated to the prototypes 7 and 8 (Fig. 5), with together total recurrence of 12.8%, and characteristic Hs of 0.61 and 0.83 m. A more equitable comparison is possible by postprocessing the SOM output via the MDA algorithm. For that we take only the sea states with total Hs � 1:41 m, corresponding to two com­ ponents with partial Hs ¼ 1:0 m. After this post-processing, the crossing level of Mistral-Grecale becomes 0.68%, thus very consistent with SPS. Similarly, the Mistral-Libeccio pair, produces a crossing level of 0.66% in SPS (SPS1-SPS3). This is related to prototype 6 in SOM (Fig. 5) with recurrence of 22.5% and representative Hs ¼ 0:48 m, which after post-processing becomes 0.68%. And Grecale-Sirocco (prototype 5 at 4.0%, with Hs ¼ 0.97m) becomes 0.59%, while this corresponds to 0.22% in SPS (SPS2-SPS4). Further insights in these crossing conditions can be derived. For instance from Fig. 6 we observe that the cumulative crossing probability of any pair of wave systems (at the given threshold Hs > 1:0 m) is about 2% (or 8 days per year). On the other hand, looking also at the timeseries we observe that the bimodal Mistral-Grecale condition occurs mainly during the months of December and January, accounting for about 50% of their total occurrence. This shows that this condition is mainly active during a relatively narrow winter period. Similarly from

Fig. 6. Cross-sea probabilities between pairs of components at the study site according to the SPS wave systems shown in Fig. 3, with Hs ¼ 1:0 m threshold.

As seen in the previous section, in SOM cross-sea conditions result from the classification of the spectra and are readily visible from both the map and from the prototypes after MDA. Indeed, in Fig. 4, cross-sea states can be identified by the presence of multiple arrows (related to significant first, secondary and tertiary peaks). These are generally located in the transition zones between the different quadrants. The map shows significant bi-modality between Mistral and Grecale, and between Mistral and Libeccio. From Fig. 5, the prototypical cross-sea spectra can be readily obtained. In turn, cross-sea conditions are not standard output in SPS, but granted the availability of the individual series of the spectral wave components, the crossing probability among them can be estimated from their joint occurrence probability, given for instance their indi­ vidual Hs time-series (e.g., X and Y). PðX; YÞ ¼ PðYjXÞPðXÞ where PðYjXÞ is the conditional exceedance probability of Y given X, and

Fig. 7. Storm of March 1993. Wind fields over the Mediterranean Sea showing critical moments of the time evolution of the storm. Isotachs are at 2 ms 1 interval. Arrow length is proportional to the local wind speed. The black dot points to the location whose spectra are analyzed in this paper. Data from EI dataset (1� horizontal resolution). 8

J. Portilla-Yandún et al.

Ocean Engineering 189 (2019) 106361

Fig. 8. Storm of March 1993 at the study site of Fig. 1. (left) Time evolution of the wave conditions according to integral parameters (STD, blue), SPS (green) and SOM (magenta): total and partial Hs in the upper plot; direction and period in the lower plot (for STD and SOM, gray arrows represent mean and black ones represent peak values). (right) Directional spectrum on 26th March 1993 at 12. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)

SOM we estimate that about 39% of the total spectra are bimodal at any level. This implies a high level of superposition, including of course the lower magnitudes.

of increasing complexity of the spectra. For each storm, to introduce the event, we show first the meteorological evolution and then the wave evolution according to the different statistics. However, we could have done the opposite as well using the meteo fields to verify the results because, as we will show, the temporal meteorological evolution at a location is “stored” (at least in the short-to mid-term scale) in the directional wave spectra that we analyse using SPS and SOM.

4.2. Insight into the storm evolution from wave spectra While a detailed analysis of the local meteorological conditions is out of the scope here, it is instructive to analyze some specific events. This will illustrate the possible complex conditions in the Mediterranean Sea showing that a comprehensive directional analysis is mandatory to un­ derstand the evolution of a wave field. In this section, we provide a compact description of three severe storms using the results of the standard statistics based on integral parameters and of the two proposed approaches based on directional wave spectra, which are therefore intercompared. Note that for the sequence of storms we follow the principle

4.2.1. Storm of March, 1993 A storm with Mistral and Grecale conditions (SPS1-SPS4) occurred in March 1993 over the Mediterranean Sea region. The evolution of the meteorological situation is shown in Fig. 7. A strong northerly (Mistral) energetic wind reached the Alps (Fig. 7a) on March 25th at 00 UT (henceforth we will omit “UT”, still intending UT time). The wind field shows a slight anticyclonic behavior (turning clockwise in the Alboran

Fig. 9. - Storm of January 1987. Wind fields over the Mediterranean Sea showing critical moments of the time evolution of the storm. Isotachs are at 2 ms 1 interval. Arrow length is proportional to the local wind speed. The black dot points to the location whose spectra are analyzed in this paper. Data from EI dataset (1� horizontal resolution). 9

J. Portilla-Yandún et al.

Ocean Engineering 189 (2019) 106361

south-westerly coasts of Italy. A detailed analysis is given in Cavaleri et al. (1991). On January 10th at 12 a deep low-pressure system (not shown) entered the Mediterranean Sea through the Gibraltar strait. The associated and rapidly moving front leads to a sustained Libeccio wind in most of the western Mediterranean Sea (Fig. 9a). On January 10th at 18 (panel b) the front moves East, and it is enhanced by a vigorous Mistral, which covers the whole western Mediterranean Sea on January 11th at 06 (panel c). At 18 (panel d) the Mistral is still very strong, moving progressively eastwards. In Fig. 10, STD shows that Libeccio waves get quickly substituted by Mistral waves at the peak of the storm (see spectrum in Fig. 11, right panel). In turn, SPS shows that a bimodal Mistral-Libeccio condition develops (see spectrum in Fig. 11, left panel), with Libeccio waves pre­ vailing after a few more days. The skill of SOM in this case is similar to that of integral parameter statistics, with a rather abrupt transition be­ tween the two systems. The growing stage of the storm (until January 11th at 00; original spectrum in Fig. 11, left panel) is represented by unimodal Libeccio prototypes, while the peak of the storm (original spectrum in Fig. 11, right panel) is represented by unimodal Mistral prototypes. This abrupt change is consistent with the two prototype clusters being very distant in the SOM lattice, hence very different. It is also worth noting that the peak of the storm is represented by prototype spectra significantly smaller than the actual one (Hs ¼ 5.53 m).

Fig. 10. Time evolution of the wave conditions for the storm of January 1987, according to integral parameters (STD, blue), SPS (green) and SOM (magenta): total and partial Hs in the upper plot; direction and period in the lower plot (for STD and SOM, gray arrows represent mean and black ones represent peak values). (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)

4.2.3. Storm of December, 1979 The storm of December 1979 is also well known because it gave rise to the second historical flood of Venice (Italy). On December 21st at 12 (Fig. 12a) an energetic Grecale wind is concentrated on most of the western part of the Mediterranean Sea. This was part of a more general strong northerly flow disrupted by the complicated orography of the north side of the basin. An atmospheric pressure minimum (not shown) develops over the Mediterranean Sea with a consequent Sirocco wind west of Sardinia. On December 22nd at 00 (panel b) the minimum is fully developed, leading to a very complicated situation involving cyclogen­ esis. The minimum moves progressively to North-East (December 22nd at 06, panel c), and finally at 18 (panel d) gets out of the reference area followed by a mild West to East flow. Fig. 13 (left panel) shows that the time evolution and the complex wave conditions are well represented (especially by SPS). The right panel shows the multi-modal spectrum on December 22nd at 06.

Sea). Twelve hours later (at 12, Fig. 7b) the air flow covers the Alps and the Appennines leading to a strong Grecale wind in the Ligurian Sea, north of Corsica. The storm condition is fully established on March 26th at 00 (Figs. 7c) and 12 (Fig. 7d). The wave evolution is framed in Fig. 8, summarizing the results of the three approaches: 1) standard statistics based on integral parameters (STD), 2) SPS and 3) SOM. The left panel, upper plot, displays the evolution of the total and partial Hs , and the evolution of the wave period and direction (lower plot). The right panel shows the instanta­ neous spectrum on March 26th at 12. From STD, it is only possible to infer a change in direction from Mistral to Grecale, followed by a return to the original Mistral. Contrarily, SPS and SOM give a much more detailed account. Mistral dominates the first part of the storm, with a subsequent superposition of Gracale. This bimodal condition lasts for about two days, imposing a very broad directional spectrum (right panel).

5. Summary and conclusions In this paper we have presented two approaches for the statistical analysis of long time series of directional wave spectra, namely Spectral Partitions Statistics (SPS) and Self-Organizing Maps (SOM). We have applied these methodologies to a 37-year long dataset of directional

4.2.2. Storm of January, 1987 This is one of the most severe storms that affected the Western Mediterranean Sea region, leading to extensive damages all along the

Fig. 11. Directional spectra on January 11th, 1987, at 00 (left) and 18 (right). 10

J. Portilla-Yandún et al.

Ocean Engineering 189 (2019) 106361

Fig. 12. Storm of December 1979. Wind fields over the Mediterranean Sea showing critical moments of the time evolution of the storm. Isotachs are at 2 ms 1 interval. Arrow length is proportional to the local wind speed. The black dot point indicates the reference point. Wind data correspond to EI at 1� horizon­ tal resolution.

Fig. 13. Storm of December 1979 at the study site of Fig. 1. (left) Time evolution of the wave conditions according to integral parameters (STD, blue), SPS (green) and SOM (magenta): total and partial Hs in the upper plot; direction and period in the lower plot (for STD and SOM, gray arrows represent mean and black ones represent peak values). (right) Directional spectrum on December 22nd, 1979 at 06, with Mistral, Grecale and Sirocco wave systems. Partial Hs from SPS are 1.75, 1.59, and 1.74 m, respectively. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)

wave spectra derived from the ERA-Interim reanalysis. The reference location in the Mediterranean Sea was chosen to show that even in enclosed seas, complex wave conditions can be found. Therefore spectral methods are relevant even in supposedly simple basins. Naturally, in the open ocean, granted the presence of multiple swell systems, we may say that spectral approaches are mandatory. We can itemize our results as follows:

2. The long-term statistics produced by SPS enable the identification of wave systems at a spot and their characterization, using partial in­ tegral parameters. At the study site in the Mediterranean Sea, this has revealed regimes with 4 wave systems (Mistral, Libeccio, Grecale, Si­ rocco), more complex than the one provided by the standard integral parameters, which suggests the dominance of a single Mistral wave system. Moreover, we have shown that SPS can help quantifying cross-sea conditions (for a given partial Hs threshold) and a very detailed interpretation of the storm evolution, consistent with the analysis of the meteorological situation (see for instance Fig. 13, left panel). 3. The prototypical spectra produced by SOM enable the long-term classification of the directional spectra at a spot. This has allowed identifying both unimodal and multimodal spectra at the study site, thus giving also indications about the occurrence of cross-seas. In

1. Given a dataset of directional spectra at a given location, or in a specific area, using SPS and SOM it is possible to provide related extensive and exhaustive statistics, which are consistent with the standard statistics based on integral parameters, but much more informative as they explicitly operate in the frequency-direction domain.

11

J. Portilla-Yandún et al.

Ocean Engineering 189 (2019) 106361

particular, SOM has the advantage of providing the spectral shape for the typical conditions, also useful for a more detailed interpretation of storm evolution (as well as for other applications that need directional spectrum as input, e.g., coastal engineering analysis). However, we have shown that SOM might under-represent the peaks as it is actually not designed to represent extremes. But this limita­ tion may be overcome by adopting ad-hoc strategies to adapt it for the representation of extremes (see for instance Barbariol et al., 2016). 4. Granted that SPS and SOM provide by default different outputs (i.e., formats and metrics), we have shown that, whenever their results can be directly or indirectly compared, they are fully consistent. This is even more the case, when, as possible, one is adapted to specify the same kind of results as the other. 5. While here we have demonstrated only some of the advantages and application of statistical analysis of directional spectra, both SPS and

SOM can be used to provide statistics for other, different, applica­ tions. These can concern, e.g., growing sea conditions, prolonged storms and extreme statistics. Acknowledgements J. Portilla acknowledges funding from project EPN-PIJ-1503. F. Barbariol was partially supported by the Flagship Project RITMARE (The Italian Research for the Sea-coordinated by the Italian National Research Council and funded by the Italian Ministry of Education, University and Research) and by funds from the Italian Ministry of Environment for the implementation of the Marine Strategy Framework Directive in Italian waters (Convention 2015; in actuation of the D.lgs. 190/2010). The authors gratefully acknowledge Francesco Marcello Falcieri for the fruitful discussions. Luciana Bertotti has provided the graphics of the three storms.

Appendix. Self-Organizing Maps In this Appendix we provide further details of the SOM implementation, related to data pre-processing and quality assessment. Data preparation Before applying SOM, directional spectra have to be pre-processed to comply with the requirements of the SOM toolbox for MATLAB (Vesanto et al., 2000). First, too low energy spectra are rejected (Hs < 5 cm). This is about 0.1% of the total input spectra. Then, the 30 � 24 spectral matrix is reshaped into a 720 � 1 array and all time-series are collected into a single matrix that makes the SOM input. Normalization of the input data is not needed in this case, as spectral components are homogeneous variables. SOM setup SOM has been set-up in order to project the 720-variate input dataset onto a 2D sheet map (cylindrical and toroidal options are also available) with rectangular lattice, thus allowing each map node, except those on the map boundaries, to be connected to 4 other nodes. The batch learning algorithm coupled with the linear initialization has been chosen to ensure repeatability, i.e. the repetition of learning with same dataset and parameters leads to the same map (Kohonen et al., 2009). The number of iterations has been set to 1000. SOM quality assessment Quality of the map is related to its goodness in representing the input dataset. This can be assessed by means of typical error metrics like: � the mean quantization error (MQE), given by the average squared Euclidean distance (data-to-BMU); � the topographic error (TE), computed as the percentage of data which do not have adjacent first and second BMUs in the map); P P * ðS ðf ;θ Þ SÞðS* ðfm ;θn Þ S Þ ffiffiffiffiffiffiffiP ffiffiffiffiffimffiffiffiffiffiffiffiffinffiffiffiffiiffiffiffimffiffiffiffiffinffiffiffiffiffiP ffiffiffiffiffiffiffiffiP ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi, � 2D cross-correlation coefficient CCi ¼ qffiffiP * 2 ð m ðS ðf ;θ Þ SÞ Þð m ðS* ðfm ;θn Þ S Þ2 Þ P P P P * n i m n n S ðf ;θ Þ S ðf ;θ Þ * m n m n i n n where S ¼ m MN , S ¼ m MN (m ¼ 1, …M; n ¼ 1, …,N); � root mean square difference NRMSDi ¼ variance density.

pffiP ffiffiffiffiffiffiffiP ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi2ffiffi ðSi ðfm ;θn Þ S* ðfm ;θn ÞÞ m Pn P � 100 (%), expressed as a percentage (hence, normalized) of the total spectral S ðf ;θ Þ m

n i

m

n

For the map in Fig. 4, we obtained MQE ¼ 2.21 m2 s rad 1 and TE ¼ 18%. The mean values of CCi and NRMSDi are 0.79 and 0.34%, respectively, pointing out to a globally fair representation by the BMUs. Their standard deviations (0.20 and 0.61%, respectively) indicate however a variation of the SOM-original dataset agreement over the dataset. In order to assess how this agreement varies over the map, that is how well each prototype resembles the dataset spectra assigned to it, we have also computed these error metrics for each j-th BMU of the map (Fig. 4). Mean and standard deviation for each SOM unit are shown in Fig. A1. The best agreement is found for moderate-severe unimodal Mistral spectra (mean CC above 0.9; mean NRMSD between 0.1 and 0.2%; bottom part of the map in Fig. A1), which are recurrent and have well-defined and simple spectral shapes. The worst agreement is found for calm sea states (mean CC around 0.4–0.8; mean NRMSD in the 0.3–1.1% range; rows 1–2, column 3–5 in Fig. A1), which despite being very frequent states might have complex shapes. Calms also show the largest variance in CC and NRMSD. In addition, SOM tends to better represent frequent and moderate sea states with respect to extremes (see for instance Barbariol et al., 2016), therefore the representation of calms, which can be interpreted as lower-bound extremes, might be consequently affected.

12

J. Portilla-Yandún et al.

Ocean Engineering 189 (2019) 106361

Fig. A1. SOM quality assessment. CC (left panel) and NRMSD (right panel) between the prototype spectra (BMUs) and the spectra assigned to the BMUs. The color of the square represents the mean value of CC and NRMSD, respectively; the color of the inner circle represents their standard deviation.

The agreement between the map and the original dataset can also be made in terms of integral parameters, comparing the time series of the BMU integral parameters (Hs * , Tm * , θm * ) with those of the original dataset (Hs , Tm , θm ). The statistical indicators in Table A1 show that globally Hs and Tm are very precisely reproduced, with CC close to 1 (0.99 for Hs , 0.94 for Tm ), small RMSD (0.17 m for Hs , 0.4 s for Tm ) and no bias. θm is in general agreement, with directional variance of residuals (i.e., θm θm * ) of 0.2 and directional bias of 1.1� . The circular CC of 0.71 indicates that, when applied to directional spectra, SOM is less skillful in representing the evolution of directional spectral features. This is probably related to the limitation of SOM in representing extremes. For the same reason, locally the highest Hs and Tm are represented by smaller values, as demonstrated by the errors on the 99th percentiles in Table A1 : -4.6% for Hs and 0.9% for Tm . Similarly, the lowest Hs and Tm are represented by larger values, with errors of 15.8% and 2.8% on the 10th percentiles of Hs and Tm , respectively. On the contrary, errors on the 50th percentiles are significantly smaller, 0.4% (Hs ) and 0.1% (Tm ). As θ is a circular variable, directions close to 0� or 360� are interpreted as extreme values in the spectral space, hence poorly represented. Luckily, at the study site sea states propagating towards the North are also quite rare and mild. Therefore, when relevant conditions are placed in the 0� sector a rotation should be applied.

Table A1 SOM quality assessment. Statistical indicators of the agreement of map and original dataset in terms of integral parameters. Pearson’s Correlation Coefficient (CC), Root Mean Square Difference (RMSD), bias and error on p-th percentile (ΔXp ¼ ðXp X*p Þ=Xp � 100) for Hs and Tm time series. Circular Correlation Coefficient (CCC), directional variance of residuals (i.e., θm siduals) to 1. Hs Tm

θm

θm * ) and directional bias for θm time series. Directional variance of residuals ranges from 0 (meaning no variation of re­

CC

RMSD

bias

ΔX10 (%)

0.99 0.94

0.17 m 0.4 s

0.00 m 0.0 s

15.8 2.8

CCC

Directional variance of residuals

Directional bias

0.71

0.2

1.1�

ΔX50 (%) 0.4 0.1

ΔX99 (%) 4.6 0.9

As a further check, since the SOM prototypes are not necessarily spectra of the original dataset, but correspond to subset centroids, we have verified their consistency by assessing the high-frequency tail of these omni-directional centroid spectra. We found that the high-frequency tails of the SOM BMUs generally follow a f 4 decay in the equilibrium range, and f 5 at the highest frequencies, in agreement with ERA-I spectra and numerical model studies (e.g. Ardhuin et al., 2010). SOM-MDA The Maximum Dissimilarity Algorithm (MDA) has been applied to the 100 SOM BMUs to further isolate the 8 most dissimilar ones, in terms of their spectral shape and according to maximum squared Euclidean distance criterion (see Section 3.2). The result is shown in Fig. 5. The remaining 92 BMUs have been assigned to the selected 8 prototypes on the basis of minimum Euclidean distance criterion, as shown in Fig. A2. Unimodal BMUs have been mostly assigned to unimodal prototypes #1 to #4 in Fig. 5, while bimodal BMUs that are placed in between unimodal ones in Fig. 4, have been mostly assigned to bimodal prototypes #5 to #8.

13

J. Portilla-Yandún et al.

Ocean Engineering 189 (2019) 106361

Fig. A2. SOM-MDA quality assessment. Map BMUs assignment after MDA. The prototype number refers to the prototypes in Fig. 5.

References

Holthuijsen, L.H., 2007. Waves in Oceanic and Coastal Waters. Cambridge University Press. JCOMM, 2019. Data buoy collaboration panel. http://www.jcommops.org/dbcp/netwo rk/maps.html. (Accessed 27 June 2019). Kennard, R.W., Stone, L.A., 1969. Computer aided design of experiments. Technometrics 11 (1), 137–148. http://doi.org/10.1080/00401706.1969.10490666. Kjeldsen, S.P., 1991. The practical value of directional ocean wave spectra. Meas. Model. Predicting Appl. Directional Ocean Wave Spectra 381–387. Kohonen, T., 2001. Self-Organizing Maps, vol. 30. Springer Series in Information Sciences. http://doi.org/10.1007/978-3-642-56927-2. Kohonen, T., Nieminen, I.T., Timo, H., 2009. On the quantization error in SOM vs. VQ: a critical and systematic study. In: Advances in Sel-Organizing Maps. Springer, p. 374. Komen, G.J., Cavaleri, L., Donelan, M., Hasselmann, K., Hasselmann, S., Janssen, P.A.E. M., 1994. Dynamics and Modelling of Ocean Waves. Cambridge University Press. Retrieved from. https://books.google.it/books?id¼7sg_PN_PDUkC. Li, X.M., 2016. A new insight from space into swell propagation and crossing in the global oceans. Geophys. Res. Lett. 43 (10), 5202–5209. http://doi.org/10.1002/201 6GL068702. Liu, Y., Weisberg, R.H., He, R., 2006. Sea surface temperature patterns on the West Florida Shelf using growing hierarchical self-organizing maps. J. Atmos. Ocean. Technol. 23 (2), 325–338. http://doi.org/10.1175/JTECH1848.1. Ochi, M.K., 1978. On long-term statistics for ocean and coastal waves. Coast. Eng. Proc. 1 (16). Ochi, M.K., 2005. Ocean Waves: the Stochastic Approach. Cambridge University Press. Pierson, W.J., Marks, W., 1952. The power spectrum analysis of ocean-wave records. Eos. Trans. Am. Geophys. Union 33 (6), 834–844. http://doi.org/10.1029/TR033i 006p00834. Portilla-Yandún, J., 2018. The global signature of ocean wave spectra. Geophys. Res. Lett. 45 (1), 267–276. http://doi.org/10.1002/2017GL076431. Portilla-Yandun, J., 2018. Open access atlas of global spectral wave conditions based on partitioning, Proceedings of the 37th International Conference on Ocean, Offshore and Arctic Engineering. Proc. ASME 51333, 11B: V11BT12A051. https://doi.org/10. 1115/OMAE2018-77230. Portilla-Yandún, J., Cavaleri, L., Van Vledder, G.P., 2015. Wave spectra partitioning and long term statistical distribution. Ocean Model. 96, 148–160. http://doi.org/10.10 16/j.ocemod.2015.06.008. Portilla-Yandún, J., Salazar, A., Cavaleri, L., 2016. Climate patterns derived from ocean wave spectra. Geophys. Res. Lett. 43 (section 2), 1–8. http://doi.org/10.1002/201 6GL071419. Portilla, J., Ocampo-Torres, F.J., Monbaliu, J., 2009. Spectral partitioning and identification of wind sea and swell. J. Atmos. Ocean. Technol. 26 (1), 107–122. http://doi.org/10.1175/2008JTECHO609.1. Reguero, B.G., M�endez, F.J., Losada, I.J., 2013. Variability of multivariate wave climate in Latin America and the Caribbean. Glob. Planet. Chang. 100, 70–84. http://doi. org/10.1016/j.gloplacha.2012.09.005. Solidoro, C., Bandelj, V., Barbieri, P., Cossarini, G., Fonda Umani, S., 2007. Understanding dynamic of biogeochemical properties in the northern Adriatic Sea by using self-organizing maps and k-means clustering. J. Geophys. Res. 112 (C7), 1–13. http://doi.org/10.1029/2006JC003553.

Aguilera, C., 2016. Matworks file exchange (on line). last visited July 2019. https://la. mathworks.com/matlabcentral/fileexchange/12275-extrema-m-extrema2-m. Ardhuin, F., Bertotti, L., Bidlot, J.-R., Cavaleri, L., Filipetto, V., Lefevre, J.-M., Wittmann, P., 2007. Comparison of wind and wave measurements and models in the Western Mediterranean Sea. Ocean Eng. 34 (3–4), 526–541. http://doi.org/10.101 6/j.oceaneng.2006.02.008. Ardhuin, F., Rogers, E., Babanin, A.V., Filipot, J.-F., Magne, R., Roland, A., et al., 2010. Semiempirical dissipation source functions for ocean waves. Part I: definition, calibration, and validation. J. Phys. Oceanogr. 40 (9), 1917–1941. https://doi.org/ 10.1175/2010JPO4324.1. Ardhuin, F., Collard, F., Chapron, B., Girard-Ardhuin, F., Guitton, G., Mouche, A., Stopa, J.E., 2015. Estimates of ocean wave heights and attenuation in sea ice using the SAR wave mode on Sentinel-1A. Geophys. Res. Lett. 42 (7), 2317–2325. http:// doi.org/10.1002/2014GL062940. Barbariol, F., Falcieri, F.M., Scotton, C., Benetazzo, A., Carniel, S., Sclavo, M., 2016. Wave extreme characterization using self-organizing maps. Ocean Sci. 12 (2), 403–415. http://doi.org/10.5194/os-12-403-2016. Bertotti, L., Cavaleri, L., 2009. Large and small scale wave forecast in the Mediterranean Sea. Nat. Hazards Earth Syst. Sci. 9, 779–788. Camus, P., Cofi~ no, A.S., Mendez, F.J., Medina, R., 2011. Multivariate wave climate using self-organizing maps. J. Atmos. Ocean. Technol. 28 (11), 1554–1568. http://doi. org/10.1175/JTECH-D-11-00027.1. Camus, P., Mendez, F.J., Medina, R., Cofi~ no, A.S., 2011. Analysis of clustering and selection algorithms for the study of multivariate wave climate. Coast Eng. 58 (6), 453–462. http://doi.org/10.1016/j.coastaleng.2011.02.003. Cavaleri, L., Bertotti, L., Lionello, P., 1991. Wind wave cast in the Mediterranean Sea. J. Geophys. Res. 96 (C6), 10739. http://doi.org/10.1029/91JC00322. Cavaleri, L., Bertotti, L., Torrisi, L., Bitner-Gregersen, E., Serio, M., Onorato, M., 2012. Rogue waves in crossing seas: the Louis Majesty accident. J. Geophys. Res.: Oceans 117 (C11), 1978–2012. Copernicus Climate Change Service (C3S), 2017. ERA5: fifth generation of ECMWF atmospheric reanalyses of the global climate. Copernicus climate change service climate data store (CDS). https://cds.climate.copernicus.eu/cdsapp#!/home. (Accessed 12 August 2019). Dee, D.P., Uppala, S.M., Simmons, A.J., Berrisford, P., Poli, P., Kobayashi, S., et al., 2011. The ERA-Interim reanalysis : configuration and performance of the data assimilation system. Q. J. R. Meteorol. Soc. 137 (April), 553–597. http://doi.org/10.100 2/qj.828. European Centre for Medium-range Weather Forecast (ECMWF), 2011. The ERA-Interim reanalysis dataset. Copernicus Climate Change Service (C3S). Available from. https ://www.ecmwf.int/en/forecasts/datasets/archive-datasets/reanalysis-datasets/er a-interim. September 2019. Falcieri, F.M., Benetazzo, A., Sclavo, M., Russo, A., Carniel, S., 2013. Po River plume pattern variability investigated from model data. Cont. Shelf Res. 87, 84–95. http://doi.org/10.1016/j.csr.2013.11.001. Hegermiller, C.A., Rueda, A., Erikson, L.H., Barnard, P.L., Antolinez, J.A.A., Mendez, F. J., 2017. Controls of multimodal wave conditions in a complex coastal setting. Geophys. Res. Lett. 44 (24), 12315–12323. http://doi.org/10.1002/2017GL075272. Hasselmann, S., Brüning, C., Hasselmann, K., Heimbach, P., 1996. An improved algorithm for the retrieval of ocean wave spectra from synthetic aperture radar image spectra. J. Geophys. Res. 101, 16615–16629. https://doi.org/10.1029/ 96JC00798.

14

J. Portilla-Yandún et al.

Ocean Engineering 189 (2019) 106361

Stopa, J.E., Cheung, K.F., 2014. Intercomparison of wind and wave data from the ECMWF reanalysis interim and the NCEP climate forecast system reanalysis. Ocean Model. 75, 65–83. http://doi.org/10.1016/j.ocemod.2013.12.006. Vatanen, T., Osmala, M., Raiko, T., Lagus, K., Sysi-Aho, M., Ore�si�c, M., Honkela, T., L€ ahdesm€ aki, H., 2015. Self-organization and missing values in SOM and GTM. Neurocomputing 147, 60–70.

Vesanto, J., Himberg, J., Alhoniemi, E., Parhankangas, J., 2000. SOM Toolbox for Matlab 5. Technical Report A57, vol. 2, p. 59 (0). http://www.cis.hut.fi/somtoolb ox/package/papers/techrep.pdf.

15