Comparison between two GAMs in quantifying the spatial distribution of Hexagrammos otakii in Haizhou Bay, China


Fisheries Research 218 (2019) 209–217 Contents lists available at ScienceDirect Fisheries Research journal homepage: www.elsevier.com/locate/fishres ...



Xiaoxiao Liu a, Jing Wang a, Yunlei Zhang a, Huaming Yu c, Binduo Xu a, Chongliang Zhang a, Yiping Ren a,b, Ying Xue a,⁎

a Laboratory of Fisheries Ecosystem Monitoring and Assessment, Fisheries College, Ocean University of China, Qingdao 266003, China
b Laboratory for Marine Fisheries Science and Food Production Processes, Pilot National Laboratory for Marine Science and Technology (Qingdao), Qingdao 266237, China
c College of Oceanic and Atmospheric Sciences, Ocean University of China, Qingdao 266100, China

ARTICLE INFO

Handled by A.E. Punt

ABSTRACT

Species distribution models (SDMs) can be used to quantify the relationships between species distribution and environmental variables. The predictive skill of SDMs depends on whether appropriate explanatory variables and intrinsic processes are included in the model. In addition to abiotic environmental variables, biotic variables could also have significant impacts on the spatial distribution of marine organisms. Correlations between some explanatory variables will cause multicollinearity, which could result in overfitting of models and erroneous projections/forecasts of species distribution. Application of dimension reduction techniques such as principal component analysis (PCA) could be used to retain important information and avoid collinearity. We compared the performance of the generalized additive model (GAM) and the PCA-based GAM in predicting the spatial distribution of Hexagrammos otakii in Haizhou Bay, incorporating abiotic and biotic variables in these models. Results showed that the PCA-based GAM was able to reduce the multicollinearity introduced by explanatory variables and improve the performance of GAMs, according to a cross-validation test and predicted species distribution. Incorporating prey abundance in PCA-based GAM could improve the predictive skill of SDMs. The method proposed in this study could be extended to other marine organisms to enhance our understanding of the ecological mechanisms underlying the distribution of target species.

Keywords: Generalized additive model; Species distribution model; Principal component analysis; Biotic variables; Habitat

1. Introduction

Fish spatial distributions are essential for understanding the dynamics of marine ecosystems. In recent years, the impacts of environmentally driven changes on the spatial distribution of species in marine ecosystems have attracted extensive attention (e.g., Youcef et al., 2013; Sagarese et al., 2014; Peck et al., 2016). Understanding the linkage between species distributions and habitats can provide necessary information for predicting the impact of climate change and establishing spatial and/or temporal management strategies (Hsieh et al., 2010; Manderson et al., 2011; Arrizabalaga et al., 2015).

Species distribution models (SDMs) are essential for predicting the distributions and facilitating the protection of species due to the difficulties in monitoring species distributions over large spatial areas (Liu et al., 2013). The predictive skill of SDMs depends on the accuracy and availability of input data (Dambach and Rödder, 2011). In addition to abiotic environmental variables, prey abundance could also have significant impacts on the spatial distribution of marine organisms (Bi et al., 2011; Murase et al., 2013; Arrizabalaga et al., 2015; Zerbini et al., 2016). SDMs with only abiotic variables treat environmental characteristics as proxies of prey distribution (Torres et al., 2008). Incorporation of biotic variables, rather than proxies, allows biotic interactions to be directly included in SDMs to improve the predictive skill of these models (Xue et al., 2018).

Several multivariate statistical methods such as multiple linear regression, logistic regression, generalized linear models (GLM) and generalized additive models (GAM), as well as machine learning methods such as random forest, are commonly used when studying the relationship between environmental variables and habitats of species (Ahmadi-Nedushan et al., 2006; Jordaan et al., 2010; Palamara et al., 2012; Jyväsjärvi et al., 2013). GAMs allow for complex non-linear relationships between response variables and predictors rather than a priori fitted parametric models (Kitchens and Rooker, 2014). Thus, they may be more flexible

Corresponding author at: Room 106 Fisheries College, 5 Yushan Road, Qingdao 266003, China. E-mail address: [email protected] (Y. Xue).

https://doi.org/10.1016/j.fishres.2019.05.019 Received 29 October 2018; Received in revised form 29 May 2019; Accepted 31 May 2019 Available online 06 June 2019 0165-7836/ © 2019 Elsevier B.V. All rights reserved.


each station. Environmental parameters such as water depth, temperature and salinity were measured using a CTD at each station, and GPS was used to record the sampling location. The abundance of H. otakii ranged from 1 to 1350 in... The length of H. otakii was 31–235 mm, and the overall mean length was 75 ± 1.04573 mm (mean ± SE). The sediment data for Haizhou Bay were provided by the College of Environmental Science and Engineering at the Ocean University of China.

and robust for modelling the spatial distribution of species (Hastie and Tibshirani, 1990; Leathwick et al., 2006; Ptacnik et al., 2008; Chang et al., 2010; Schmiing et al., 2013; Zerbini et al., 2016). However, some explanatory variables are likely to be strongly correlated with each other (Pérez et al., 1998; Saraceno et al., 2005; Ribeiro et al., 2012). This multicollinearity might result in overfitting and lead to considerable uncertainty in SDMs, which could result in erroneous projections/forecasts of species distributions. Several dimension reduction techniques such as the stepwise method, removal of predictor variances, residual regression, ordinary least squares and ridge regression have been used to select appropriate explanatory variables in habitat modeling and eliminate multicollinearity between variables (Straka et al., 2012; Kroll and Song, 2013). However, these methods may discard some significant explanatory variables and information. Principal component analysis (PCA) would be preferred in that case to retain important information and avoid collinearity (Buisson et al., 2008). The principal components (PCs) are used as new explanatory variables in the GAM. This PCA-based GAM can reduce the multicollinearity between explanatory variables, and the new explanatory variables derived by PCA can retain most of the information in the original data (Ahmadi-Nedushan et al., 2006). Zhao et al. (2014) compared the performance of the regular GAM and the PCA-based GAM when modeling the relationship between diversity indices and environmental variables. They suggested that the PCA-based GAM is a better approach with higher prediction accuracy. However, few studies have been conducted to evaluate the performance of the PCA-based GAM in predicting the spatial distribution of marine species.

Hexagrammos otakii is a benthic cold-water fish living in coastal waters (Sun et al., 2014). It has become one of the most economically important species in Chinese waters, and plays an important role in the ocean ecosystem (Sui et al., 2017). The relationship between the distribution of H. otakii and environmental variables tends to be complex and needs more in-depth research (Xing et al., 2015). Xu et al. (2018) found that H. otakii preys on more than 60 prey species, with shrimps being the major prey group during all seasons. H. otakii has been overfished in recent years (Sun et al., 2014). Therefore, studies on the spatial distribution of H. otakii and its relationship with environmental factors are necessary for the scientific management of its population.

The regular GAM and the PCA-based GAM are used here to predict the spatial distribution of H. otakii in Haizhou Bay, China. The performance and predictive skill of the two GAMs were compared using cross-validation and a map of predicted distribution. Biological implications are discussed. The proposed method is expected to improve the predictive skill of habitat models and enhance our understanding of the ecological mechanisms of species distributions.

2.2. Variables selection

Several abiotic and biotic variables were considered as candidate explanatory variables in the modeling (Table 1). According to Xing et al. (2015), the abiotic variables considered for inclusion in the model were longitude, latitude, bottom temperature, bottom salinity, depth, sediment type and season. Stomach content analysis of H. otakii collected from Haizhou Bay (Xu et al., 2018) suggests that Metapenaeopsis dalei, Heptacarpus futilirostris and Latreutes planirostris are the major prey species in spring, and that M. dalei and Loligo chinensis were preyed upon mainly in autumn. Consequently, these were selected as biotic variables. The biomass data of prey were collected synchronously in each survey in Haizhou Bay and were measured as catch density (g/h). Correlation analyses and variance inflation factors (VIFs) were used to select variables before modeling. Highly correlated explanatory variables were excluded from the GAMs.

2.3. Generalized additive models

2.3.1. Regular GAM

A two-stage GAM was used to reduce the impact of zero catches (Li et al., 2015). The first-stage GAM estimates the probability of presence of H. otakii (p) using a logit link function with a binomial error model:

$$\text{GAM 1:}\quad z \sim \mathrm{Bi}(1, p), \qquad \operatorname{logit}(p) = s(x_1) + \dots + s(x_n) + \text{sediment type} \tag{1}$$

where s is a spline smoother, and x_1, …, x_n are the explanatory variables. The second-stage GAM estimates the log-transformed biomass of H. otakii (d) using an identity link function and a Gaussian error distribution (Berry and Welsh, 2002):

$$\text{GAM 2:}\quad \ln(x) \sim N(\ln(d), \sigma^2), \qquad \ln(d) = s(x_1) + \dots + s(x_n) + \text{sediment type} \tag{2}$$

The overall log-transformed biomass of H. otakii, ln(B), was estimated by combining the results from the first and second stages of the GAM:

$$\ln(B) = \ln(p) + \ln(d) \tag{3}$$
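Equation (3) is the logarithm of the product B = p · d, so the combined estimate is the presence probability multiplied by the conditional biomass. The following is a minimal sketch of that combination step only. The function names and the numbers are illustrative assumptions, and in practice the two linear predictors would come from the fitted smooth terms of the stage-1 and stage-2 models.

```python
import math


def inverse_logit(eta):
    """Map a stage-1 linear predictor on the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def combined_log_biomass(eta_presence, log_density):
    """Eq. (3): ln(B) = ln(p) + ln(d).

    eta_presence -- stage-1 predictor, logit(p) (hypothetical input)
    log_density  -- stage-2 predictor, ln(d) (hypothetical input)
    """
    p = inverse_logit(eta_presence)
    return math.log(p) + log_density


# Illustrative values only: logit(p) = 0 gives p = 0.5, and ln(d) = ln(100).
log_b = combined_log_biomass(0.0, math.log(100.0))
biomass = math.exp(log_b)  # back-transformed, equal to p * d
```

Working on the log scale keeps the sum in Eq. (3) additive, which is convenient when both stages are reported as log-transformed predictions.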

2.3.2. PCA-based GAM

For the PCA-based GAM, we applied PCA to the uncorrelated explanatory variables, excluding sediment type because PCA is not suitable for categorical variables. The PCA-based GAM of H. otakii can be described as follows:

2. Materials and methods

2.1. Data collection

Haizhou Bay is a typical open bay along the coast of the central Yellow Sea with a total area of 876.39 km². It acts as an important fishing ground for many marine species (Li et al., 2017). Samples were collected from bottom trawl surveys during spring (April–May) and autumn (September–October) in 2011 and 2013–2017 in Haizhou Bay and adjacent areas (34°25′–35°35′N, 119°25′–121°5′E). The survey stations were selected by stratified random sampling (Xu et al., 2015). The survey area was divided into five areas according to differences in water depth, latitude and other geographic factors (Fig. 1). Xu et al. (2015) developed a simulation approach to evaluate and optimize the sampling effort for the fishery-independent survey in the study area. A total of 24 stations were randomly selected in 2011, and the number of sampling stations was optimized to 18 after 2013 (Xu et al., 2015). The surveys were conducted using a single trawl fishing boat with a power of 220 kW. The towing time was about 1 h at a speed of 2–3 kn at

$$\text{PCA-based GAM 1:}\quad \operatorname{logit}(p) = \sum_{i=1}^{n} s(\mathrm{comp}_i) + \text{sediment type} \tag{4}$$

$$\text{PCA-based GAM 2:}\quad \ln(d) = \sum_{i=1}^{n} s(\mathrm{comp}_i) + \text{sediment type} + \varepsilon \tag{5}$$

where comp_i are the PCs, and n is the number of PCs that were chosen for the PCA-based GAM. The overall log-transformed biomass of H. otakii was estimated by combining the results from the first and second stages of the PCA-based GAM:

$$\ln(B) = \ln(p) + \ln(d) \tag{6}$$

The PCA and GAM were conducted using the gam package in R 3.2.5.
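To illustrate why principal component scores can replace correlated predictors in Eqs. (4) and (5), the sketch below standardizes a few continuous variables, extracts the leading components of their correlation matrix, and returns the scores that would serve as comp_i. It is a hedged illustration and not the authors' R workflow: the power-iteration eigen-solver is a simple stand-in for a library routine, the data are synthetic, and every name is hypothetical.

```python
import math
import random


def standardize(columns):
    """Centre each variable and scale it to unit sample variance."""
    out = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        out.append([(v - mean) / sd for v in col])
    return out


def correlation_matrix(z):
    """Correlation matrix of already standardized variables."""
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z] for zi in z]


def leading_eigenpair(m, iters=1000):
    """Power iteration for the dominant eigenpair of a symmetric matrix."""
    k = len(m)
    v = [1.0 / (i + 1) for i in range(k)]  # asymmetric start vector
    for _ in range(iters):
        w = [sum(m[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    value = sum(v[i] * m[i][j] * v[j] for i in range(k) for j in range(k))
    return value, v


def pca_scores(columns, n_components):
    """Return (share of variance explained, scores) for the leading PCs."""
    z = standardize(columns)
    m = correlation_matrix(z)
    k, n = len(z), len(z[0])
    explained, scores = [], []
    for _ in range(n_components):
        value, vec = leading_eigenpair(m)
        explained.append(value / k)
        scores.append([sum(vec[j] * z[j][r] for j in range(k)) for r in range(n)])
        # Deflation: remove the component just found before the next pass.
        m = [[m[i][j] - value * vec[i] * vec[j] for j in range(k)] for i in range(k)]
    return explained, scores


# Synthetic stand-ins for depth, bottom temperature and bottom salinity.
rng = random.Random(42)
depth = [rng.uniform(5.0, 40.0) for _ in range(60)]
sbt = [18.0 - 0.2 * d + rng.gauss(0.0, 1.0) for d in depth]
sbs = [rng.uniform(28.0, 32.0) for _ in range(60)]
explained, scores = pca_scores([depth, sbt, sbs], n_components=2)
```

Although the synthetic depth and temperature series are strongly correlated, the resulting score vectors are mutually uncorrelated by construction, which is the property the PCA-based GAM relies on.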


Fig. 1. Survey areas in Haizhou Bay and adjacent areas in 2011 and 2013–2017.

Table 1 Candidate explanatory variables considered in the generalized additive model (GAM) for Hexagrammos otakii in Haizhou Bay.

| Variables | Spring | Autumn |
| --- | --- | --- |
| Sediment type | coarse sand/middle coarse sand/middle fine sand/sand/silty sand/clayey sand, silt, sand-silt-clay/silty clay | coarse sand/middle fine sand/sand/silty sand/clayey sand, silt, sand-silt-clay/silty clay |
| Longitude | 119°25'12"–121°4'58" | 119°25'12"–121°4'58" |
| Latitude | 34°25'12"–35°34'58" | 34°25'12"–35°34'58" |
| Depth | 6.37–36.64 m | 3.77–39.86 m |
| Sea bottom temperature (SBT) | 9.16–17.95 °C | 17.77–22.86 °C |
| Sea bottom salinity (SBS) | 28.36–32.03 | 27.75–31.94 |
| Prey species | Metapenaeopsis dalei, Heptacarpus futilirostris, Latreutes planirostris | Metapenaeopsis dalei, Loligo chinensis |

distribution of H. otakii in 2017 using the regular GAM and PCA-based GAM fitted to data from 2011–2016. The predicted distribution of H. otakii was overlaid with the observed abundance in 2017, and the consistency between these distributions was examined. Environmental data used to predict the distribution of H. otakii in 2017 were simulated using the Finite Volume Community Ocean Model (FVCOM) (Chen et al., 2003, 2016). The biotic data used to drive the models were derived using ordinary kriging.

2.4. Model validation

The performances of the two GAMs were evaluated using cross-validation, in which 80% of the data were randomly selected for building the model, and the remaining 20% were used to test the model. This subsampling process was repeated 100 times for each cross-validation. The correlation coefficient between the observed and predicted values was calculated for each run. The predicted (B) and observed (d) values can be compared by using the following linear regression model:

$$B = a + b\,d \tag{7}$$

where parameter a is the systematic bias of the predicted values, and b is a slope parameter. A value of b close to 1 indicates that observed and predicted values have similar spatial patterns (Chang et al., 2010).

3. Results

3.1. Correlations between variables

A significant correlation (|r| > 0.6) was found between depth and bottom temperature in spring (Tables 2 and 3). The VIFs for the explanatory variables ranged from 1.17 to 2.96 in spring and 1.15 to 2.56 in autumn. Depth had a higher VIF (2.96) than SBT (2.84) in spring. Thus, depth was excluded from the explanatory variables for spring.
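For readers less familiar with the screening statistic, the variance inflation factor of predictor j is defined from the R² obtained when that predictor is regressed on all the others. As a worked illustration, assume purely for the sake of the example that depth and SBT were the only two predictors, so that R² reduces to the squared pairwise correlation from Table 2:

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2},
\qquad
\text{two predictors only: } R_j^2 = r^2
\;\Rightarrow\;
\mathrm{VIF} = \frac{1}{1 - (-0.72)^2} = \frac{1}{0.4816} \approx 2.08 .
```

The reported spring values for depth (2.96) and SBT (2.84) exceed this pairwise figure, which is what one would expect when the remaining predictors also explain part of each variable.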

2.5. Comparison and evaluation of model performance

The main purpose of this study was to compare and evaluate the performances of the regular GAM and the PCA-based GAM in quantifying the spatial distribution of H. otakii in Haizhou Bay. We used the following three methods: (1) Comparison of the regression R² and slope between the observed and predicted values from cross-validation. (2) Comparison of the proportion of the 100 simulation runs in which the same explanatory variables were identified as significant for the regular GAM or PCA-based GAM. An explanatory variable was considered significant in influencing the spatial distribution of H. otakii if the proportion of the 100 runs in which it was identified as significant was higher than 0.5 (Zhao et al., 2014). (3) Prediction of the spatial

3.2. Principal component analysis

The first four PCs explained 84.15% (± 1.72%) and 80.94% (± 0.92%) of the total variance on average over the 100 simulation runs for spring and autumn, respectively. In spring, longitude, SBS and latitude had the highest positive loadings, while SBT had the highest negative loading for the first PC (Fig. 2). H. futilirostris and latitude had the highest positive loadings, and L. planirostris had the highest negative loading for the second PC. Longitude and M. dalei had the highest


Table 2 Correlation coefficients between candidate explanatory variables in spring.

| | Longitude | Latitude | Depth | SBT | SBS | M. dalei | H. futilirostris | L. planirostris |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Longitude | | −0.08 | 0.49 | −0.53 | 0.24 | 0.25 | −0.13 | −0.13 |
| Latitude | −0.08 | | 0.54 | −0.49 | 0.49 | 0.15 | 0.05 | 0.03 |
| Depth | 0.49 | 0.54 | | −0.72 | 0.46 | 0.29 | −0.15 | −0.14 |
| SBT | −0.53 | −0.49 | −0.72 | | −0.39 | −0.17 | −0.02 | 0.05 |
| SBS | 0.24 | 0.49 | 0.46 | −0.39 | | 0.25 | 0.12 | 0.17 |
| M. dalei | 0.25 | 0.15 | 0.29 | −0.17 | 0.25 | | −0.01 | 0.01 |
| H. futilirostris | −0.13 | 0.05 | −0.15 | −0.02 | 0.12 | −0.01 | | 0.59 |
| L. planirostris | −0.13 | 0.03 | −0.14 | 0.05 | 0.17 | 0.01 | 0.59 | |

Note: Values in the table are correlation coefficients; those with |r| > 0.6 are treated as significant.

the northern and center of the study area, which was fairly consistent with the observed abundance of H. otakii in 2017. In autumn, the regular GAM identified higher abundances mainly located in the northwestern coastal areas of Haizhou Bay, which was inconsistent with the observations. The PCA-based GAM predicted similar distributions of H. otakii to the observations, with higher abundances mainly distributed in the northeastern areas, and lower abundance tending to occur in the southern and coastal areas of Haizhou Bay (Fig. 5).

positive loadings for the third PC. The fourth PC reflected the influence of longitude and latitude. In autumn, depth and M. dalei had the highest positive loadings, while SBS had the highest negative loading in the first PC (Fig. 3). M. dalei had the highest positive loading in the second PC. Latitude and L. chinensis had the highest positive and negative loadings, respectively, in the third PC. The fourth PC mainly reflected the influence of L. chinensis and SBS.

3.3. Generalized additive models

In the regular GAM, latitude, longitude, SBT, SBS, and the biomass of M. dalei, H. futilirostris and L. planirostris had significant effects on the distribution of H. otakii in spring (Table 4), while latitude, depth, SBS and the biomass of M. dalei and L. chinensis had significant effects in autumn (Table 5). In the PCA-based GAM, the first two PCs were identified as significant in most simulation runs for spring, while all of the PCs were identified as significant factors for autumn, especially the first PC. However, sediment type was significant in fewer than 35 of the 100 runs.

4. Discussion

4.1. Spatial distribution of H. otakii in Haizhou Bay

Haizhou Bay is an important spawning and feeding ground for many economically important species (Sun et al., 2014). Variation in environmental variables may have significant impacts on the spatial distribution of H. otakii. This study indicated that latitude, bottom temperature, bottom salinity, longitude and the distributions of prey species including M. dalei, H. futilirostris and L. planirostris had significant impacts on the distribution of H. otakii during spring in Haizhou Bay. In autumn, bottom salinity, latitude, depth and the biomass of prey species such as M. dalei and L. chinensis were identified as significant factors. The main abiotic environmental variables are similar to those identified by Xing et al. (2015), but our study found that biotic variables (such as the biomass of prey species) were also important determinants of the spatial distribution of H. otakii.

Water temperature is one of the most important environmental factors affecting the survival, growth and distribution of many marine organisms (Kempf et al., 2013; Youcef et al., 2013). Haizhou Bay is located on the edge of the Yellow Sea cold water mass, which occurs in mid-spring and summer. Consequently, the bottom temperature in Haizhou Bay is relatively low in spring, which has an impact on the distribution of H. otakii (Xing et al., 2015). Salinity is also an important environmental factor that affects the survival and growth of marine organisms. Variation in salinity will directly affect the metabolism and feeding behavior of fish (Hu et al., 2012). An influence of salinity on the distribution of H. otakii in Haizhou Bay was also found in this study. H. otakii is a cold-temperate species, and it is mainly distributed in the area north of 35°N (Xing et al., 2015). Therefore, waters at higher latitudes offer better habitat conditions for H. otakii. The

3.4. Cross validation

The average correlation coefficient between the observed and predicted abundance over the 100 simulation runs was greater than 0.5 for both the regular GAM and the PCA-based GAM (Fig. 4). The mean correlation coefficient for the PCA-based GAM was 0.594 (± 0.12) in spring and 0.680 (± 0.13) in autumn, both significantly higher than those for the regular GAM. The PCA-based GAM tended to have smaller intercepts than the regular GAM (Table 6), and the slope coefficient was closer to 1 for the PCA-based GAM, implying that the PCA-based GAM performed better than the regular GAM in predicting the spatial distribution of H. otakii in Haizhou Bay.

3.5. Predicted distributions of H. otakii

Fig. 5 shows maps of predicted and observed abundance (log-transformed) of H. otakii during spring and autumn of 2017. In spring, the regular GAM predicted that higher abundances would be mainly concentrated in the northeastern area of Haizhou Bay. In contrast, the PCA-based GAM predicted that H. otakii would be mainly distributed in

Table 3 Correlation coefficients between candidate explanatory variables in autumn.

| | Longitude | Latitude | Depth | SBT | SBS | M. dalei | L. chinensis |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Longitude | | −0.14 | 0.54 | 0.12 | 0.38 | 0.04 | −0.01 |
| Latitude | −0.14 | | 0.23 | 0.17 | 0.49 | 0.17 | 0.17 |
| Depth | 0.54 | 0.23 | | −0.12 | 0.31 | 0.18 | 0.17 |
| SBT | 0.12 | 0.17 | −0.12 | | 0.23 | −0.16 | 0.04 |
| SBS | 0.38 | 0.49 | 0.31 | 0.23 | | 0.02 | 0.19 |
| M. dalei | 0.04 | 0.17 | 0.18 | −0.16 | 0.02 | | 0.27 |
| L. chinensis | −0.01 | 0.17 | 0.17 | 0.04 | 0.19 | 0.27 | |


Fig. 2. Box plots of loadings of environmental variables in each principal component over 100 simulation runs in spring. M. d is Metapenaeopsis dalei, H. r is Heptacarpus rectirostris and L. p is Latreutes planirostris.

Fig. 3. Box plots of loadings of environmental variables for each principal component over 100 simulation runs in autumn. M. d is Metapenaeopsis dalei and L. c is Loligo chinensis.


The reliability and accuracy of SDMs depend on the availability and accuracy of input data and the selection of appropriate predictors, which is in turn related to the physiological requirements of species (Dambach and Rödder, 2011). In addition to physical environmental variables, prey availability is one of the main drivers regulating the spatial distribution of marine species. Prey species were found to have significant impacts on the spatial distribution of H. otakii. Incorporation of prey abundance may allow biotic interactions to be directly included in species distribution models to improve their predictive skill. Including prey distribution data in habitat models can also clarify the role of bottom-up processes in species distribution. Therefore, it is necessary to consider predator–prey interactions in future habitat modeling.

Table 4 Proportion of 100 simulation runs in which a factor was identified as a significant factor in spring.

Regular GAM

| Factors | Stage 1 (%) | Stage 2 (%) |
| --- | --- | --- |
| Sediment | 32 | 23 |
| Longitude | 38 | 68 |
| Latitude | 72 | 98 |
| SBT | 72 | 92 |
| SBS | 36 | 92 |
| M. dalei | 72 | 46 |
| H. futilirostris | 54 | 0 |
| L. planirostris | 73 | 44 |

PCA-based GAM

| Factors | Stage 1 (%) | Stage 2 (%) |
| --- | --- | --- |
| Sediment | 3 | 24 |
| PC1 | 86 | 99 |
| PC2 | 59 | 84 |
| PC3 | 21 | 13 |
| PC4 | 29 | 42 |

Note: Proportions higher than 50% mark factors treated as significant.
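The 50% rule behind this table can be written down in a few lines. The sketch below is illustrative only: the significance level of 0.05 is an assumption (the text does not state the level used in each run), the function names are hypothetical, and the p-values are made up so that they reproduce two stage-1 entries of Table 4.

```python
def proportion_significant(p_values, alpha=0.05):
    """Share of simulation runs in which a term's p-value falls below alpha."""
    return sum(1 for p in p_values if p < alpha) / len(p_values)


def is_influential(p_values, alpha=0.05, threshold=0.5):
    """Apply the 'more than half of the runs' rule to one explanatory variable."""
    return proportion_significant(p_values, alpha) > threshold


# Made-up p-values arranged to match two stage-1 entries of Table 4:
# latitude significant in 72 of 100 runs, sediment type in 32 of 100.
latitude_runs = [0.01] * 72 + [0.20] * 28
sediment_runs = [0.01] * 32 + [0.20] * 68
latitude_flag = is_influential(latitude_runs)  # proportion 0.72 exceeds 0.5
sediment_flag = is_influential(sediment_runs)  # proportion 0.32 does not
```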

4.2. Comparison of two GAMs

The performance of the regular GAM and the PCA-based GAM was evaluated. Cross-validation suggested that the PCA-based GAM had better predictive performance. It was also able to reduce the multicollinearity introduced by several explanatory variables in the GAM and improve the performance of species distribution models. The relationships between the spatial distribution of fish and environmental variables tend to be nonlinear, complex, and hard to understand. In the PCA-based GAM, the significant factors can be identified from the loadings of the PCs that are found to be significant. In this study, the significant factors selected by the PCA-based GAM are consistent with those selected by the regular GAM. The PCA-based GAM has the same biological significance as the regular GAM, in addition to better statistical performance in cross-validation and prediction.

Table 5 Proportion of 100 simulation runs in which a factor was identified as a significant factor in autumn.

Regular GAM

| Factors | Stage 1 (%) | Stage 2 (%) |
| --- | --- | --- |
| Sediment | 26 | 29 |
| Longitude | 28 | 43 |
| Latitude | 83 | 78 |
| Depth | 49 | 59 |
| SBT | 43 | 43 |
| SBS | 89 | 50 |
| M. dalei | 68 | 89 |
| L. chinensis | 74 | 67 |

PCA-based GAM

| Factors | Stage 1 (%) | Stage 2 (%) |
| --- | --- | --- |
| Sediment | 8 | 13 |
| PC1 | 100 | 41 |
| PC2 | 68 | 46 |
| PC3 | 77 | 33 |
| PC4 | 60 | 36 |

Note: Proportions higher than 50% mark factors treated as significant.

influence of depth may be related to the seasonal deep-shallow water migration of H. otakii in the coastal waters of China seas. Water temperature and salinity may shift due to climate change (Vanderwal et al., 2013), which may have significant impacts on the spatial distribution of H. otakii over time.

4.3. Implications and future improvements

Ecosystem-based fishery management (EBFM) has been widely considered as an essential approach for addressing the crisis of fishery management around the world. International fishery management is shifting toward EBFM to provide a comprehensive framework of sustainable fishery management (Pikitch et al., 2004; Dolan et al., 2016).

Fig. 4. Probability distribution of correlation coefficients between the observed and predicted abundance of Hexagrammos otakii from the cross validation based on two GAMs: (a) Regular GAM in spring; (b) PCA-based GAM in spring; (c) Regular GAM in autumn; (d) PCA-based GAM in autumn.

Table 6 The coefficients for the regression models of observed and predicted abundance of Hexagrammos otakii in the cross-validation of two GAMs.

| Coefficients | Spring, regular GAM | Spring, PCA-based GAM | Autumn, regular GAM | Autumn, PCA-based GAM |
| --- | --- | --- | --- | --- |
| Correlation coefficient | 0.50 ± 0.14 | 0.59 ± 0.12 | 0.52 ± 0.19 | 0.68 ± 0.13 |
| Intercept (a) | 0.74 ± 0.78 | 0.27 ± 0.95 | 0.62 ± 0.62 | −0.01 ± 0.49 |
| Slope (b) | 0.82 ± 0.18 | 0.99 ± 0.22 | 0.79 ± 0.19 | 1.03 ± 0.17 |

Note: SDs were provided for these values.
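The quantities summarised in Table 6 come from a repeated 80/20 hold-out procedure followed by the regression of Eq. (7). A minimal sketch of that loop is given below under clearly stated assumptions: a straight-line fit stands in for the GAM, the data are synthetic, and the function names, seeds and sample size are hypothetical choices made only for illustration.

```python
import math
import random


def linear_fit(x, y):
    """Least-squares intercept a and slope b for y = a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def repeated_holdout(x, y, n_runs=100, train_frac=0.8, seed=1):
    """Return one (correlation, intercept a, slope b) tuple per random split."""
    rng = random.Random(seed)
    idx = list(range(len(x)))
    cut = int(round(train_frac * len(idx)))
    runs = []
    for _ in range(n_runs):
        rng.shuffle(idx)
        train, test = idx[:cut], idx[cut:]
        # Placeholder model: a straight line stands in for the fitted GAM.
        a_fit, b_fit = linear_fit([x[i] for i in train], [y[i] for i in train])
        observed = [y[i] for i in test]
        predicted = [a_fit + b_fit * x[i] for i in test]
        # Eq. (7): regress the predicted values (B) on the observed values (d).
        a, b = linear_fit(observed, predicted)
        runs.append((pearson(observed, predicted), a, b))
    return runs


# Synthetic predictor and response, used only to exercise the loop.
rng = random.Random(7)
x = [rng.uniform(0.0, 10.0) for _ in range(100)]
y = [2.0 + 0.5 * xi + rng.gauss(0.0, 0.5) for xi in x]
runs = repeated_holdout(x, y)
mean_r = sum(r for r, _, _ in runs) / len(runs)
```

Averaging the three entries of each tuple over the runs, together with their standard deviations, would give a summary in the same form as Table 6.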

The practical implementation of EBFM requires operational tools to monitor ecosystem status, to assess the ecological impacts of fishing, and to evaluate fishery-management strategies (Smith et al., 2007). However, EBFM is generally hampered by a lack of information about habitats and other ecosystem components (Pikitch et al., 2004). Understanding the spatiotemporal dynamics of species distributions and their relationships with environmental conditions is crucial for EBFM and for evaluating the effects of climate change. In this study, the PCA-based GAM incorporating prey abundance as explanatory variables could provide more accurate information on species distributions, which can be used to define the initial spatial allocation of functional groups in the simulation process of spatially explicit ecosystem models, such as ECOSPACE and OSMOSE (Grüss et al., 2014, 2016). The predictions of the PCA-based GAM could also serve to develop spatial management strategies, such as the establishment of Marine Protected Areas (MPAs) and the identification of Essential Fish Habitats (EFH), which in turn will contribute to EBFM. Understanding the spatial distribution of H. otakii and its relationship with explanatory variables can also provide a reliable scientific basis for fisheries management in China.

In future, more biotic and abiotic predictors (such as predators, competitors, population memory, population size, chlorophyll, oxygen, fishing pressure and hydrodynamic forces) should be considered in the evaluation of model performance (Loots et al., 2010; Planque et al., 2011; Cormon et al., 2014). Previous studies suggested that machine learning methods such as random forest (RF) can avoid overfitting (Chen et al., 2013) and allow simple estimation of parameters (Li et al., 2015). The performance of RF and the PCA-based GAM in quantifying the spatial distribution of species should be evaluated. The method outlined in this study could be applied to other marine species to improve the predictive skill of habitat models and enhance our understanding of the ecological mechanisms of species distributions.

Fig. 5. The predicted (colour contours) and observed (black circles) distribution of Hexagrammos otakii in 2017 based on two GAMs. (a) Regular GAM in spring; (b) PCA-based GAM in spring; (c) Regular GAM in autumn and (d) PCA-based GAM in autumn.

Mexico. Fish. Oceanogr. 23, 460–471. https://doi.org/10.1111/fog.12081.
Kroll, C.N., Song, P., 2013. Impact of multicollinearity on small sample hydrologic regression models. Water Resour. Res. 49, 3756–3769. https://doi.org/10.1002/wrcr.20315.
Leathwick, J.R., Elith, J., Hastie, T., 2006. Comparative performance of generalized additive models and multivariate adaptive regression splines for statistical modelling of species distributions. Ecol. Model. 199, 188–196. https://doi.org/10.1016/j.ecolmodel.2006.05.022.
Li, M., Zhang, C., Xu, B., Xue, Y., Ren, Y., 2017. Evaluating the approaches of habitat suitability modelling for whitespotted conger (Conger myriaster). Fish. Res. 195, 230–237. https://doi.org/10.1016/j.fishres.2017.07.024.
Li, B., Cao, J., Chang, H., Wilson, C., Chen, Y., 2015. Evaluation of effectiveness of fixed-station sampling for monitoring American lobster settlement. North Am. J. Fish. Manag. 35 (5), 942–957. https://doi.org/10.1080/02755947.2015.1074961.
Liu, C., White, M., Newell, G., Griffioen, P., 2013. Species distribution modeling for conservation planning in Victoria, Australia. Ecol. Model. 249, 68–74. https://doi.org/10.1016/j.ecolmodel.2012.07.003.
Loots, C., Vaz, S., Planque, B., Koubbi, P., 2010. What controls the spatial distribution of the North Sea plaice spawning population? Confronting ecological hypotheses through a model selection framework. ICES J. Mar. Sci. 67, 244–257. https://doi.org/10.1093/icesjms/fsp238.
Manderson, J., Palamara, L., Kohut, J., Oliver, M.J., 2011. Ocean observatory data are useful for regional habitat modeling of species with different vertical habitat preferences. Mar. Ecol. Prog. Ser. 438, 1–17. https://doi.org/10.3354/meps09308.
Murase, H., Kitakado, T., Hakamada, T., Matsuoka, K., Nishiwaki, S., Naganobu, M., 2013. Spatial distribution of Antarctic minke whales (Balaenoptera bonaerensis) in relation to spatial distributions of krill in the Ross Sea, Antarctica. Fish. Oceanogr. 22, 154–173. https://doi.org/10.1111/fog.12011.
Palamara, L., Manderson, J., Kohut, J., Oliver, M.J., Gray, S., Goff, J., 2012. Improving habitat models by incorporating pelagic measurements from coastal ocean observatories. Mar. Ecol. Prog. Ser. 447, 15–30. https://doi.org/10.3354/meps09496.
Peck, M.A., Arvanitidis, C., Butenschön, M., Canu, D.M., Chatzinikolaou, E., Cucco, A., Domenici, P., Fernandes, J.A., Gasche, L., Huebert, K.B., et al., 2016. Projecting changes in the distribution and productivity of living marine resources: a critical review of the suite of modelling approaches used in the large European project VECTORS. Estuar. Coast. Shelf Sci. https://doi.org/10.1016/j.ecss.2016.05.019.
Pérez, F.F., Ríos, A.F., Castro, C.G., Fraga, F., 1998. Mixing analysis of nutrients, oxygen and dissolved inorganic carbon in the upper and middle North Atlantic Ocean east of the Azores. J. Mar. Syst. 16, 219–233. https://doi.org/10.1016/S0924-7963(01)00003-3.
Pikitch, E.K., Santora, C., Babcock, E.A., Bakun, A., Bonfil, R., Conover, D.O., Dayton, P., Doukakis, P., Fluharty, D., Heneman, B., et al., 2004. Ecosystem-based fishery management. Science 305, 346–347. https://doi.org/10.1126/science.1098222.
Planque, B., Loots, C., Petitgas, P., Lindstrøm, U., Vaz, S., 2011. Understanding what controls the spatial distribution of fish populations using a multi-model approach. Fish. Oceanogr. 20 (1), 1–17. https://doi.org/10.1111/j.1365-2419.2010.00546.x.
Ptacnik, R., Lepistö, L., Willén, E., Brettum, P., Andersen, T., Rekolainen, S., Solheim, A.L., Carvalho, L., 2008. Quantitative responses of lake phytoplankton to eutrophication in Northern Europe. Aquat. Microb. Ecol. 42, 227–236. https://doi.org/10.1007/s10452-008-9181-z.
Ribeiro, J., Carvalho, G.M., Goncalves, J.M.S., Erzini, K., 2012. Fish assemblages of shallow intertidal habitats of the Ria Formosa lagoon (South Portugal): influence of habitat and season. Mar. Ecol. Prog. Ser. 446, 259–273. https://doi.org/10.3354/meps09455.
Sagarese, S.R., Frisk, M.G., Cerrato, R.M., Sosebee, K.A., Musick, J.A., Rago, P.J., 2014. Application of generalized additive models to examine ontogenetic and seasonal distributions of spiny dogfish (Squalus acanthias) in the Northeast (US) shelf large marine ecosystem. Can. J. Fish. Aquat. Sci. 71, 847–877. https://doi.org/10.1139/cjfas-2013-0342.
Saraceno, M., Provost, C., Piola, A.R., 2005. On the relationship between satellite-retrieved surface temperature fronts and chlorophyll a in the western South Atlantic. J. Geophys. Res. 110, 1–16. https://doi.org/10.1029/2004JC002736.
Schmiing, M., Afonso, P., Tempera, F., Santos, R.S., 2013. Predictive habitat modeling of reef fishes with contrasting trophic ecologies. Mar. Ecol. Prog. Ser. 474, 201–216. https://doi.org/10.3354/meps10099.
Smith, A.D.M., Fulton, E.J., Hobday, A.J., Smith, D.C., Shoulder, P., 2007. Scientific tools to support the practical implementation of ecosystem-based fisheries management. ICES J. Mar. Sci. 64, 633–639. https://doi.org/10.1093/ICESJMS/FSM041.
Straka, M., Syrovátka, V., Helešic, J., 2012. Temporal and spatial macroinvertebrate variance compared: crucial role of CPOM in a headwater stream. Hydrobiologia 686, 119–134. https://doi.org/10.1007/s10750-012-1003-6.
Sun, Y., Zan, X., Xu, B., Ren, Y., 2014. Growth, mortality and optimum catchable size of Hexagrammos otakii in Haizhou Bay and its adjacent waters. Periodical of Ocean University of China 44 (9), 46–52 (in Chinese).
Sui, H., Xue, Y., Ren, Y., Zou, Y., Yu, L., 2017. Studied on the ecological groups of fish communities in Haizhou Bay, China. Periodical of Ocean University of China 47 (12), 59–71 (in Chinese).
Torres, L.G., Read, A.J., Halpin, P., 2008. Fine-scale habitat modeling of a top marine predator: do prey data improve predictive capacity? Ecol. Appl. 18, 1702–1717. https://doi.org/10.1890/07-1455.1.
Vanderwal, J., Murphy, H., Kutt, A., 2013. Focus on poleward shifts in species’ distribution underestimates the fingerprint of climate change. Nat. Clim. Change 3 (3), 239–243. https://doi.org/10.1038/nclimate1688.
Xing, L., Xu, B., Zhang, C., Ren, Y., 2015. Environmental influence on the distribution of Hexagrammos otakii inhabiting Haizhou Bay and its adjacent waters. Periodical of Ocean University of China 45 (6), 045–050 (in Chinese).

Acknowledgements

We are grateful to colleagues and students in the Laboratory of Fisheries Ecosystem Monitoring and Assessment for their work in field sampling and sample analyses. This study was supported by the National Key R&D Program of China (2017YFE0104400), the National Natural Science Foundation of China (31772852) and the Marine S&T Fund of Shandong Province for Pilot National Laboratory for Marine Science and Technology (Qingdao) (2018SDKJ0501-2).


