Accepted Manuscript Title: R language-based methods for revealing taxonomic and functional diversity and tracing process of fish fauna Authors: Bin Kang, Xiaoxia Huang, Yunzhi Yan, Yunrong Yan, Hungdu Lin PII: DOI: Reference:
S2215-0161(18)30184-5 https://doi.org/10.1016/j.mex.2018.11.003 MEX 405
To appear in: Received date: Accepted date:
22 September 2018 2 November 2018
Please cite this article as: Kang B, Huang X, Yan Y, Yan Y, Lin H, R language-based methods for revealing taxonomic and functional diversity and tracing process of fish fauna, MethodsX (2018), https://doi.org/10.1016/j.mex.2018.11.003 This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
*Title: Max. 20 words. A good title should contain the fewest possible words that adequately describe the content of a paper. *Authors:
R language-based methods for revealing taxonomic and functional diversity and tracing process of fish fauna
*Affiliations:
1College
Bin Kang1, Xiaoxia Huang2, Yunzhi Yan3, Yunrong Yan4, Hungdu Lin4 of Fisheries, Ocean University of China, Qingdao 266003, China laboratory of atmospheric environment and processes in the boundary layer over the lowlatitude plateau region, School of Earth Science, Yunnan University, Kunming 650091, China 3College of Life Sciences, Anhui Normal University, Wuhu 241002, China 4College of Fisheries, Guangdong Ocean University, Zhanjiang 524088, China
[email protected] [email protected] 2Key
CC E
A
SC R
IP T
Xiaoxia Huang,
[email protected] Yunzhi Yan,
[email protected] Yunrong Yan,
[email protected] Hungdu Lin,
[email protected]
A
N
U
Species richness, Taxonomic diversity, Functional diversity, Species turnover/nestedness, Conservation
M
Agricultural and Biological Sciences
PT
*Co-authors: full names and emails. [NOTE: it is the corresponding authors responsibility to inform all co-authors if submitting as a companion paper to a research article] *Keywords: At least 3 keywords. There is no limit on the no. of keywords you can list. Please remember that effective keywords should not repeat words appearing in your title, and should be neither too general nor too narrow. *SECTION: Agricultural and Biological Sciences Biochemistry, Genetics and Molecular Biology Chemical Engineering Chemistry Computer Science Earth and Planetary Sciences Energy Engineering Environmental Science Immunology and Microbiology Materials Science Mathematics Medicine and Dentistry Neuroscience Pharmacology, Toxicology and Pharmaceutical Science Physics and Astronomy Psychology Social Sciences Veterinary Science and Veterinary Medicine
ED
*Contact email: Include institutional email address of the corresponding author
Method Article * Title: R language-based methods for revealing taxonomic and functional diversity and tracing process of fish fauna
*Authors: Bin Kang1, Xiaoxia Huang2, Yunzhi Yan3, Yunrong Yan4, Hungdu Lin4
1College
of Fisheries, Ocean University of China, Qingdao 266003, China laboratory of atmospheric environment and processes in the boundary layer over the lowlatitude plateau region, School of Earth Science, Yunnan University, Kunming 650091, China 3 College of Life Sciences, Anhui Normal University, Wuhu 241002, China 4College of Fisheries, Guangdong Ocean University, Zhanjiang 524088, China
SC R
2Key
IP T
*Affiliations:
*Contact email:
U
[email protected]
CC E
PT
ED
M
A
N
Graphical abstract
A
The figure showed the spatial pattern of an integrated index of multifaceted diversity in the Yangtze River Basin, considering the higher fish species richness and functional contribution. Headwater (characterized by abundant endemic species), the middle mainstream (necessary to fish migration), and the lakes (showing higher diverse species) were primary zones for conservation.
ABSTRACT
This paper presented the fish species richness at geographical unit of the Yangtze River. According to the fish taxonomic catalogs and biological traits, R language method was used to determine taxonomic diversity and functional diversity and the components of each unit. Regression analysis was used to test the varying tendency of taxonomic and functional diversity corresponding to the change of species richness. Functional diversity is compared against taxonomic diversity in capturing the structure of dynamic ecosystem. The β-diversity indices of taxonomy and function were calculated and decomposed to evaluate the role of species turnover and nestedness in the formation process of fish spatial pattern. An integrated diversity index, balancing α and β diversity of species richness, taxonomic and functional diversity, is used to screen the prior conservation zone.
*Keywords: Species richness, Taxonomic diversity, Functional diversity, Species turnover/nestedness, Conservation
SPECIFICATIONS TABLE • Agricultural and Biological Sciences
More specific subject area:
Biodiversity and conservation, fisheries
Method name:
GIS, R language, Regression analysis
Name and reference of original method
Baselga, A., 2012. The relationship between species replacement and dissimilarity derived from turnover and nestedness. Glob. Ecol. Biogeogr. 9, 134-143. Casanoves, F., Pla, L., Rienzo, J.A.D, Díaz, S. (2011) FDiversity: a software package for the integrated analysis of functional diversity. Methods Ecol. Evol., 2, 233-237. Clarke, K.R., Warwick, R.M. (1998) A taxonomic distinctness index and its statistical properties. J. Appl. Ecol., 35, 523-531. Fishbase: http://www.fishbase.org/search.php Shuttle Radar Topography Mission database: https://lta.cr.usgs.gov/SRTM,
IP T
Subject Area
SC R
Resource availability
*Method details
PT
ED
M
A
N
U
Geological parameters To determine the spatial pattern process of fish diversity at taxonomic and functional aspects concerning to the environmental characters, we divided the Yangtze River Basin could be into 11 sub-basins as Jinsha, Min-Tuo, Jialing, Han, Wu, the upper mainstem, the middle mainstem, the lower mainstem, Dongting, Poyang, and further into 56 geographic units, according to the natural river system and annual discharge. The longitude, latitude, altitude, and channel length data of each unit were derived through the Spatial Analyst Tool in ArcGIS 10.0 [1]. The longitude and latitude extents of a unit were represented by a rectangle envelope spanning the minimum and maximum x, y coordinates of the region, the four points and the centre of gravity of an envelope enclosing the unit were computed using the toolset. At the end, the maximum, minimum and mean values of each factor were calculated by comparing and averaging the values of each grid within the sub-basin through the Zonal Statistics tool of Spatial Analyst. The altitude data with a spatial resolution of 3 arc-seconds (~90 meters) was provided by the Shuttle Radar Topography Mission database (https://lta.cr.usgs.gov/SRTM) download from the International Scientific & Technical Data Mirror Site (Computer Network Information Centre, Chinese Academy of Sciences, http://datamirror.csdb.cn). The hydrologic analysis in Spatial Analyst Toolbox of ArcGIS was used to model the movement of water across a basin. An eight-direction (D8) flow model [2] was used to determine the direction of flow (a direction of steepest descent) from every cell in the elevation raster. After creating a hydrologically conditioned elevation model (the sinks were identified and filled) from a digital elevation model (DEM), the flow accumulation for each cell location could be calculated. Supposing when enough water flows through a cell, there should be a stream passing through it, thus defining the stream network. The length of all streams combine in a unit was summed as channel lengths.
A
CC E
Taxonomic diversity approach Taxonomic diversity (TD), incorporating the degree to which the species are taxonomically related to each other, can be used to describe biological diversity of an area more than the numbers of species present. For example, TD of an assemblage containing n species belonging to x genus is larger than that of another assemblage containing n species belonging to y genus (x>y). The Average Taxonomic Distinctness (AvTD), defined as the average taxonomic path length, was calculated based on the taxonomic distance through a classification tree at four taxonomic levels (species, genus, family, order) between every pair of species, with an advantage of not being affected by the number of species or the sampling efforts [3]. The AvTD (Δ+) is determined from data consisting only of presence or absence of taxa: Δ+ = [∑∑i
In order to estimate the volume filled in the T dimensional space by the community of interest, the quick hull algorithm was used to compute the minimum convex hull including all considered species. First, a functional distance matrix based on biological traits of each species was computed, using Gower’s distance, to compile a metric able to accommodate nominal, ordinal, continuous and missing data. Then, a Principal Coordinates Analysis (PCoA) was constructed to represent the multivariate traits space occupied by the fishes. Following a trade-off between information quality and computation time, we finally kept the species coordinates on the first three axes as the values of three synthetic functional traits describing fish functional strategies. All calculations of functional diversity indices were performed in the R statistical and programming environment using the ‘FD’ and ‘vegan’ packages [4] (Appendix B).
M
A
N
U
SC R
IP T
The β-diversity and its components The pairwise β-diversity (species richness and taxonomic diversity) of units’ fish communities was measured using the Jaccard dissimilarity index (βjac) based on the number of species, and divided into β-diversity turnover (βjtu) and β-diversity nestedness (βjne) [6]. The βjtu measures the proportion of species that would be replaced between assemblages if both assemblages had the same number of species, hence accounts for species replacement without the influence of richness differences; βjne reflects the increasing dissimilarity between nested assemblages resulting from the increasing differences in species richness. The calculation formulas were: βjac = βjtu + βjne, βjac = (b + c) / (a + b + c), βjtu = 2min (b + c) / (a + 2min (b, c), βjne = {[max (b, c) - min (b, c)]/ (a + b + c)}*{a / [a + 2min (b, c)]}, Where a is the number of species present in both sites, b is the number of species present in the first site but not in the second, and c is the number of species present in the second site but not in the first. Based on the convex hull volume, functional β-diversity = (functional space not shared) / (total functional space filled) was decomposed into functional turnover and functional nestedness components as: Functional β-diversity = [V(C1)+V(C2)-2V(C1∩C2)]/[V(C1)+V(C2)-V(C1∩C2)], Functional turnover = [2min(V(C1), V(C2)) - 2V(C1∩C2)] / [2min(V(C1), V(C2)) - V(C1∩C2)], Functional nestedness = {│V(C1) -V(C2)│/ [V(C1)+V(C2)-V(C1∩C2)]}*{V(C1∩C2) / [2min(V(C1), V(C2)) - V(C1∩C2)]}, Where V(C1) and V(C2) are the volume of the convex hulls of each of the two communities(C1 and C2), V(C1∩C2) is their intersection. According to equation of βjac, a = V(C1∩C2), b = V(C1) - V(C1∩C2), and c = V(C2) - V(C1∩C2). For each unit, we calculated the mean β-diversity by computing the average of the pairwise comparisons between each focal basin and all the remaining basins. All calculations of functional diversity indices were performed in the R statistical and programming environment using the ‘FD’, ‘betapart’, and ‘vegan’ packages [4] (Appendix C).
CC E
PT
ED
Integrated diversity index The 9 aspects of calculated diversity indices (species richness, β-species richness turnover, β-species richness nestedness, taxonomic diversity, β-taxonomic diversity turnover, β-taxonomic diversity nestedness, functional diversity, β-functional diversity turnover, β-functional diversity nestedness), each reflecting an aspect of fish fauna, were normalized to eliminate the gaps of original numerical differences: yi = [xi−min(xj)] / [max(xj)−min(xj)], (1≤i≤n, 1≤j≤n), where xi is the original data, max(xj) and min(xj) are the maximum and minimum values of the sample data, respectively, and yi is the normalized data. A trade-off integrated index was determined by summing all the normalize values, and then sorted to screen the prior zones for conservation; IDI = ∑9i=1 yi, where i means different aspect of diversity. Supplementary material and/or Additional information:
Appendix A. R codes for calculating taxonomic diversity index
A
# Set Working Directory setwd(“file directory”) # Load the package library(vegan) # Read the data file fish<-read.csv("filename ",header=T, row.names=1,sep=",") tax<-read.csv("filename ",header=T, row.names=1, sep=",") #Transform the dataframe for diversity calculation #(369 fish * 56 basin transform to 56 basin*369 fish) t_fish<-t(fish) # Taxonomic distances from a classification table with variable step lengths taxdis <- taxa2dist(tax, varstep=TRUE) # Calculate taxonomic diversity and distinctness mod <- taxondive(t_fish, taxdis) # List the result mod
Appendix B. R codes for calculating functional diversity index
Appendix C. R codes for calculating β-diversity index, and its component
SC R
IP T
# Set Working Directory setwd(“file directory”) # Load the package library(vegan) library(FD) # Read the data file fish<-read.csv("filename ",header=T, row.names=1,sep=",") tax<-read.csv("filename ",header=T, row.names=1, sep=",") trait<-read.csv("filename ",header=T, row.names=1, sep=",") #Prepare trait data for FRic calculation gower.f<-gowdis(trait) # Gower’s distance analysis pcoa.g<-pcoa(gower.f,rn=NULL ) # PCoA analysis Fun_pcoa<-pcoa.g$vectors[,1:3] # the first three axes of PCoA were kept as # the synthetic functional traits #Transform the dataframe for diversity calculation t_fish<-t(fish) #(369 fish * 56 basin transform to 56 basin*369 fish) # Calculate functional diversity indices FD.alpha<-dbFD(Fun_pcoa, t_fish) # List the result FD.alpha$FRic
A
CC E
PT
ED
M
A
N
U
# Set Working Directory setwd(“file directory”) setwd("file directory ") # Load the packages library(vegan) library(FD) library(betapart) # Read the data file fish<-read.csv("filename ",header=T, row.names=1,sep=",") tax<-read.csv("filename ",header=T, row.names=1, sep=",") trait<-read.csv("filename ",header=T, row.names=1, sep=",") #Prepare trait data for calculation gower.f<-gowdis(trait) # Gower’s distance analysis pcoa.f<-pcoa(gower.f,rn=NULL ) # PCoA analysis Fun_pcoa<-pcoa.f$vectors[,1:3] # keep the first three axes of PCoA as # the synthetic functional traits #Prepare taxonomic data for calculation gower.t<-gowdis(tax,ord = c("classic")) # Gower’s distance analysis pcoa.t<-pcoa(gower.t,rn=NULL ) # PCoA analysis tax_pcoa<-pcoa.t$vectors[,1:3] # the first three axes of PCoA were kept as # the synthetic taxonomic traits #Transform the dataframe for diversity calculation t_fish<-t(fish) #(N fish * M basin transform to M basin*N fish) #Analysis the beta indice between sub-basins SR_beta<- beta.pair(t_fish, index.family="jaccard") FD_beta<-functional.beta.pair(t_fish, Fun_pcoa, index.family="jaccard") TD_beta<-functional.beta.pair(t_fish, tax_pcoa, index.family="jaccard") #Get the results SR.jtu<-as.matrix(SR_beta$beta.jtu) SR.jne<-as.matrix(SR_beta$beta.jne) SR.jac<-as.matrix(SR_beta$beta.jac) FD.jtu<-as.matrix(FD_beta$funct.beta.jtu) FD.jne<-as.matrix(FD_beta$funct.beta.jne) FD.jac<-as.matrix(FD_beta$funct.beta.jac) TD.jtu<-as.matrix(TD_beta$funct.beta.jtu) TD.jne<-as.matrix(TD_beta$funct.beta.jne) TD.jac<-as.matrix(TD_beta$funct.beta.jac) #average beta between sub-basins, for M sub-basins, the summary should be divided by (M-1). Beta_bs<-cbind(colSums(FD.jac)/ (M-1), colSums(FD.jtu)/ (M-1), colSums(FD.jne)/ (M-1), colSums(TD.jac)/ (M-1), colSums(TD.jtu)/ (M-1), colSums(TD.jne)/ (M-1), colSums(SR.jac)/ (M-1), colSums(SR.jtu)/ (M-1), colSums(SR.jne)/ (M-1)) colnames(Beta_bs)=c('avg_FD.jac', 'avg_FD.jtu', 'avg_FD.jne', 'avg_TD.jac', 'avg_TD.jtu', 'avg_TD.jne', 'avg_SR.jac', 'avg_SR.jtu', 'avg_SR.jne') Beta_bs # Load the package library(pheatmap) # Heatmap of beta-diversity between sub-basins pheatmap(FD.jtu,main="FD.jtu")
U
SC R
IP T
pheatmap(FD.jne,main="FD.jne") pheatmap(FD.jac,main="FD.jac") pheatmap(TD.jtu,main="TD.jtu") pheatmap(TD.jne,main="TD.jne") pheatmap(TD.jac,main="TD.jac") pheatmap(SR.jtu,main="SR.jtu") pheatmap(SR.jne,main="SR.jne") pheatmap(SR.jac,main="SR.jac") #Calculate the proportion of turnover and nestedness components in the overall dissimilarity #of functional beta diversity and taxonomic beta diversity B1<- FD.jtu*FD.jac^-1 B2<- FD.jne*FD.jac^-1 B2[lower.tri(B2)]=0 B1[upper.tri(B1)]=0 FD_beta.r<- B1+B2 pheatmap(FD_beta.r, cluster_rows = F, cluster_cols = F, show_rownames = T, show_colnames = T) B1<- TD.jtu*TD.jac^-1 B2<- TD.jne*TD.jac^-1 B2[lower.tri(B2)]=0 B1[upper.tri(B1)]=0 TD_beta.r<- B1+B2 pheatmap(TD_beta.r, cluster_rows = F, cluster_cols = F, show_rownames= T, show_colnames = T) B1<- SR.jtu*SR.jac^-1 B2<- SR.jne*SR.jac^-1 B2[lower.tri(B2)]=0 B1[upper.tri(B1)]=0 SR_beta.r<- B1+B2 pheatmap(SR_beta.r, cluster_rows = F, cluster_cols = F, show_rownames= T, show_colnames = T)
N
Acknowledgement:
A
CC E
PT
ED
M
A
This work was supported by the National Natural Science Foundation of China (No. 41476149, No. 31560181). The authors give great thanks to two anonymous reviewers for their constructive suggestions.
*References:
A
CC E
PT
ED
M
A
N
U
SC R
IP T
[1] P.A. Burrough, R.A. McDonell, Principles of Geographical Information Systems, Oxford University Press, New York, 1998. [2] S.K. Jenson, J.O. Domingue, Extracting Topographic Structure from Digital Elevation Data for Geographic Information System Analysis, Photogrammetric Engineering and Remote Sensing 54 (1988) 1593-1600. [3] K.R. Clarke, R.M. Warwick, A taxonomic distinctness index and its statistical properties, J. Appl. Ecol. 35 (1998) 523-531. [4] R Development Core Team R, A language and Environment for Statistical Computing, R Foundation for Statistical Computing, 2012. [5] F. Casanoves, L. Pla, J.A.D. Rienzo, S. Díaz, FDiversity: a software package for the integrated analysis of functional diversity. Methods Ecol. Evol. 2 (2011) 233-237. [6] A. Baselga, The relationship between species replacement and dissimilarity derived from turnover and nestedness. Glob. Ecol. Biogeogr. 9 (2012) 134-143.