Inventory incompleteness and collecting priority on the plant diversity in tropical East Africa

Biological Conservation xxx (xxxx) xxxx

Contents lists available at ScienceDirect

Biological Conservation journal homepage: www.elsevier.com/locate/biocon

Policy Analysis

Shengwei Wang a,b, Yadong Zhou a,c,*, Paul Mutuku Musili d, Geoffrey Mwachala d, Guangwan Hu a,c, Qingfeng Wang a,c

a Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, Hubei 430074, PR China
b University of Chinese Academy of Sciences, Beijing 100049, PR China
c Sino-Africa Joint Research Center, Chinese Academy of Sciences, Wuhan, Hubei 430074, PR China
d East African Herbarium, National Museums of Kenya, P. O. Box 45166 00100, Nairobi, Kenya

ARTICLE INFO

Keywords: Tropical East Africa flora; Biodiversity database; Inventory incompleteness; Collecting bias; Collecting-priority; Explanatory variables

ABSTRACT

Inventory incompleteness has seriously affected the accuracy of the spatial distribution pattern of biodiversity, but the causes of incompleteness and the priority investigation with quantitative methods have received far less attention. In this study, we constructed a plant database of tropical East Africa, evaluated and explained the inventory incompleteness, and identified the priority collecting area. The results showed that the spatial distribution pattern of collection density and species richness is very uneven in tropical East Africa, with 16 % of regions having zero-collection, and more than half of the regions having inventory incompleteness. Species collection and completeness are mainly affected by species richness and road density, followed by national boundaries and insecurity in some areas. We quantitatively selected priority investigation areas in tropical East Africa to supplement biodiversity data in the area. We recommend prioritizing collections especially around western Kenya, southern Tanzania, and around the border of Tanzania and Kenya. Future work should focus on improving the digitization of specimens and the strengthening of cooperation among countries, for these are the best ways to raise awareness of the biodiversity patterns in tropical East Africa.

1. Introduction

Large distribution datasets of plants are essential for understanding regional species richness and biogeographic patterns in the context of global biodiversity, and for predicting biodiversity loss due to environmental change or anthropogenic activities (Gaston, 2000; Hampton et al., 2013; Troia and McManamay, 2017; Qian et al., 2018). Nonetheless, biodiversity data are incomplete in many regions (Hortal et al., 2007; Soberón et al., 2007; Yang et al., 2013). For example, researchers often survey hotspots or key habitats, such as forests, mountains or wetlands, at high frequency (Pearman et al., 2008; Troia and McManamay, 2017). This uneven distribution of data can seriously affect the study of the temporal and spatial distribution patterns of biodiversity in a given area (Ahrends et al., 2011; Yang et al., 2013; Ficetola et al., 2015; Qian et al., 2018). Therefore, estimating plant richness from available data and understanding geographical sampling bias and its influencing factors can help overcome data limitations and make future investigation and conservation research more efficient (Ladle and Hortal, 2013; Yang et al., 2013, 2014).

Species distribution modeling (SDM) is a widely used method to determine species diversity and composition patterns at large spatial scales (Pearson et al., 2007; Gomes et al., 2018). SDMs predict the potential distribution of species from presence data and environmental factors, so the species richness of a region can be estimated (Zhang et al., 2012, 2017). MaxEnt, which was developed specifically to model species distributions with presence-only data, is one of the most widely used SDM algorithms. It has been shown to perform best when few presence records are available (Phillips et al., 2006; Wisz et al., 2008; Zhang et al., 2012) and to be least affected by position error in occurrences (Graham et al., 2008; Zhang et al., 2012, 2017). Based on the species distribution simulations, estimates of richness, and the current distribution data, scarcely collected areas can be identified for further collection activities. With appropriate settings in MaxEnt, it is possible to model species with few records (Raes and ter Steege, 2007; Zhang et al., 2012, 2017). To illustrate, Raes and ter Steege (2007) used MaxEnt to predict the plant diversity of Borneo, which is a good assessment of species richness and collection bias in the area.

The uneven collection of plants is caused by a variety of factors rather than a single one. The number of collections is usually positively related to the density of human populations (Kuper et al., 2006; Botts and Alexander, 2011) and the density of roads or navigable rivers (Reddy and Dávalos, 2003; Botts and Alexander, 2011; Yang et al., 2014). In addition, collection density has been reported to be related to environmental factors (Romo et al., 2006), the residence of botanists (Moerman and Estabrook, 2010), the location of the herbarium (Yang et al., 2014), the status of biodiversity (Parnell et al., 2010; Reddy and Dávalos, 2003), and differences in altitude (Yang et al., 2014). However, the importance of these factors differs greatly among regions. For instance, plant collection in the Australian and South American outback is largely confined to several major roads (Nelson et al., 1990; Crisp et al., 2010), and the collection of plants in Thailand is mainly concentrated in densely populated areas (Nelson et al., 1990), whereas densely populated areas are surprisingly under-sampled in China (Yang et al., 2014). Understanding the factors behind incomplete collection is instructive for determining priority collection areas. At the same time, determining priority collection areas is an important way to quickly supplement biodiversity data and accurately determine protected areas.

Tropical East Africa (TEA) is known as one of the most biodiverse areas in the world, with two important biodiversity hotspots, the Eastern Afromontane and the Coastal Forests of Eastern Africa (Mittermeier et al., 2011). It lies on the east side of the western Rift branch and mainly includes five countries: Tanzania, Kenya, Uganda, Rwanda, and Burundi (Fig. 1). TEA has a high diversity of plants, accounting for about a quarter of tropical plants in Africa (Mutke et al., 2001, 2011), mainly owing to its vast area (1.83 million km²), complex topography, and long geological history. The Flora of Tropical East Africa project (FTEA editors, 1948–2012) is the first comprehensive survey of tropical East African plants (Zhou et al., 2017), recording their morphological characteristics in detail and listing the latitude and longitude of each cited specimen in a separate monograph, Flora of Tropical East Africa - Index of collecting localities (Polhill, 1988). However, the pattern of collection and inventory completeness in tropical East Africa is not yet clear. Understanding the spatial pattern of its collection and identifying priority collection areas will complement the data on biodiversity in tropical East Africa. Moreover, these biodiversity data are critical to addressing the challenges of sustainable development and decision-making (Sousa-Baena et al., 2014a, 2014b).

In this paper, we aim to evaluate the completeness of primary plant diversity data and the reasons behind the uneven collection in tropical East Africa. Specifically, we seek to answer the following questions: (1) what is the species collection pattern and the inventory incompleteness in tropical East Africa? (2) what are the factors affecting the species collection and the inventory incompleteness? and (3) what strategies can be given for future plant collection activities?

* Corresponding author at: Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, Hubei 430074, PR China. E-mail address: [email protected] (Y. Zhou).

https://doi.org/10.1016/j.biocon.2019.108313
Received 10 June 2019; Received in revised form 6 September 2019; Accepted 25 October 2019
0006-3207/ © 2019 Elsevier Ltd. All rights reserved.

Please cite this article as: Shengwei Wang, et al., Biological Conservation, https://doi.org/10.1016/j.biocon.2019.108313

2. Methods

2.1. Species distributional data

We extracted the latitude and longitude of all cited specimens of each plant from the Flora of Tropical East Africa (FTEA editors, 1948–2012) and Flora of Tropical East Africa - Index of collecting localities (Polhill, 1988), and downloaded available plant herbarium records for tropical East Africa from the Global Biodiversity Information Facility data portal (GBIF, http://www.gbif.org/) and RAINBIO (Dauby et al., 2016, http://rainbio.cesab.org). To improve data quality, we standardized the scientific names with the R package "plantlist" (Zhang, 2015), which assigns the families and genera of plants according to ‘The Plant List’ version 1.1 (available at http://www.theplantlist.org). We found many repetitions in our data (one plant with many records at the same location), especially in the RAINBIO and GBIF records. Since the records in RAINBIO and GBIF carry no specimen collection numbers, we could not ascertain whether the duplicates result from repetition in the database or from duplicate specimen collection. We therefore rounded latitude and longitude to four decimal places, which reduces errors introduced by converting between latitude and longitude formats, then checked the coordinates of each species, deleted duplicates, and kept only one record of each. In this way, each species is represented only by records from non-repeated locations.

We then excluded any records with obvious geographic errors (such as records falling in the sea or in the center of Lake Victoria), and finally obtained 16,143 species (including varieties and subspecies, belonging to 2306 genera and 248 families) and a total of 175,797 presence records. Of these, 3581 species had more than 15 records and 8574 had more than 5 records.
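The de-duplication step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the study used R, and the `Record` type, the `deduplicate` function, and the sample coordinates are hypothetical names and values introduced here, not taken from the authors' workflow.

```python
from typing import Iterable, List, NamedTuple


class Record(NamedTuple):
    """One occurrence record (hypothetical structure for illustration)."""
    species: str
    lat: float
    lon: float


def deduplicate(records: Iterable[Record], decimals: int = 4) -> List[Record]:
    """Keep one record per (species, rounded latitude, rounded longitude)."""
    seen = set()
    unique: List[Record] = []
    for rec in records:
        # Round coordinates to four decimal places, as described in the text.
        key = (rec.species, round(rec.lat, decimals), round(rec.lon, decimals))
        if key not in seen:
            seen.add(key)
            unique.append(Record(*key))
    return unique


# Made-up sample data, not records from the study.
records = [
    Record("Species A", -1.23456, 36.81234),
    Record("Species A", -1.23461, 36.81231),  # same point after rounding
    Record("Species A", -3.07000, 37.35000),
    Record("Species B", -1.23456, 36.81234),  # other species, same point
]
cleaned = deduplicate(records)
print(len(cleaned))  # 3
```

Keeping the species name in the key means that two species recorded at the same rounded location are both retained, which matches the per-species check described in the text.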

Fig. 1. Location of research area. (a) Geographical location of tropical East Africa in Africa. (b) Five countries in tropical East Africa.

to study the relationship between the explanatory variables and the collection density and inventory incompleteness. All explanatory variables were log10-transformed to obtain the best model fit and approximately normally distributed residuals. Since inventory incompleteness is a zero-inflated response variable (i.e., some cells have no records; Zuur et al., 2010), we used a log10(n + 1) transformation to reduce extreme deviations from normality (Ballesteros-Mejia et al., 2013). The extraction of the grid cell data and the making of the distribution maps were performed in ArcGIS 10.2, while data analysis was carried out in R version 3.5.2 (R Core Team, 2018). The relative importance of the explanatory variables was calculated with the "calc.relimp" function in the R package "relaimpo" (Groemping, 2006).
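As a hedged illustration of the transformations described above, the sketch below fits a single-predictor OLS line in plain Python after log10-transforming the predictor and applying log10(n + 1) to the response. The function names and the toy numbers are hypothetical. The study itself used R, and the relative-importance step from "relaimpo" is not reproduced here.

```python
import math
from typing import List, Sequence, Tuple


def log10_transform(values: Sequence[float]) -> List[float]:
    """log10 transform for strictly positive explanatory variables."""
    return [math.log10(v) for v in values]


def log10_plus1(values: Sequence[float]) -> List[float]:
    """log10(n + 1) transform, usable when some cells hold zero."""
    return [math.log10(v + 1.0) for v in values]


def ols_fit(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Return (slope, intercept, r_squared) of a least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("predictor has no variance")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared


# Toy values only (not data from the study): one predictor and one response per cell.
road_density = [2.0, 15.0, 40.0, 120.0, 300.0]
response = [0.0, 3.0, 10.0, 25.0, 90.0]
slope, intercept, r2 = ols_fit(log10_transform(road_density), log10_plus1(response))
print(round(slope, 3), round(intercept, 3), round(r2, 3))
```

A multi-predictor model with a decomposition of explained variance would need a linear-algebra library; this sketch only covers the single-predictor fits of the kind shown in Figs. 3 and 4.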

2.2. Species distribution modelling

To further refine the diversity estimates in the study area, we used MaxEnt software (Version 3.4.1, available at https://biodiversityinformatics.amnh.org/open_source/maxent/) to model species distributions. MaxEnt was set to use all the species presence records for model building by setting the ‘random test percentage’ to zero (Raes et al., 2009) and the maximum number of background points to 100000. We used linear and quadratic features for species with ≥5 and < 15 records, and added hinge features for species with ≥15 records (Raes et al., 2009; Zhang et al., 2012, 2017). The climate layers were derived from WorldClim (http://www.worldclim.org/) at 30 arc-second resolution. We used Pearson correlation analysis (Pearson r < 0.8; Appendix S1) to avoid multicollinearity. Among variables correlated above this threshold, only the ecologically most meaningful were kept. Finally, 9 environmental variables (Bio2: Mean Diurnal Range; Bio3: Isothermality; Bio10: Mean Temperature of Warmest Quarter; Bio12: Annual Precipitation; Bio13: Precipitation of Wettest Month; Bio14: Precipitation of Driest Month; Bio15: Precipitation Seasonality; Bio18: Precipitation of Warmest Quarter; Bio19: Precipitation of Coldest Quarter) and altitude were selected as the environmental predictors. To reduce the impact of model over-fitting, we made a set of assumptions. When the number of records was ≥15, we assumed that predicted areas with a Cloglog value greater than 0.8 were presence areas. When the number of records was ≥5 and < 15, we assumed that predicted areas with a Cloglog value greater than 0.9 were presence areas.
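The record-count rules above (feature classes and Cloglog cut-offs) can be summarized in a small Python sketch. It does not call MaxEnt. The function names and constants are hypothetical, and treating species with fewer than 5 records as "not modelled" is an assumption made here for illustration.

```python
from typing import Optional, Tuple

MIN_RECORDS = 5         # below this, no model is built in this sketch (assumption)
HINGE_MIN_RECORDS = 15  # from this count upward, hinge features are added


def feature_classes(n_records: int) -> Optional[Tuple[str, ...]]:
    """Feature classes used for a species with the given number of records."""
    if n_records < MIN_RECORDS:
        return None
    if n_records < HINGE_MIN_RECORDS:
        return ("linear", "quadratic")
    return ("linear", "quadratic", "hinge")


def cloglog_threshold(n_records: int) -> Optional[float]:
    """Cloglog cut-off: 0.9 for 5 to 14 records, 0.8 for 15 or more."""
    if n_records < MIN_RECORDS:
        return None
    return 0.9 if n_records < HINGE_MIN_RECORDS else 0.8


def is_presence(cloglog: float, n_records: int) -> bool:
    """True when the predicted Cloglog value exceeds the species' cut-off."""
    threshold = cloglog_threshold(n_records)
    return threshold is not None and cloglog > threshold


print(feature_classes(8))     # ('linear', 'quadratic')
print(is_presence(0.85, 20))  # True
print(is_presence(0.85, 8))   # False
```

The stricter cut-off for sparsely recorded species reflects the text's aim of limiting over-prediction when a model rests on few records.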

3. Results

3.1. Spatial distribution of inventory incompleteness

The number of plant collections per grid cell in tropical East Africa ranged from 0 to 3463. The collection density ranged from 0 to 4.4 specimens per km², and about 16 % of the grid cells did not have any collection record (Fig. 2a). The raw data showed that the collection density of tropical East African plants was extremely uneven, and the areas with very low collection density were mainly in the northern and northeastern parts of Kenya, the south of Tanzania, and the mid-west regions (Fig. 2a). The predicted plant richness of tropical East Africa ranged from 1 to 5044 per grid cell, and areas with high species richness were mainly in the coastal zone, central and western Kenya, the northeastern and southwestern regions of Tanzania, western Burundi, and Rwanda (Fig. 2b). The inventory incompleteness (Fig. 2c; Appendix S2) varied between 1 and 3650, with an average value of 94.34. About 70 % of the grid cells had inventory completeness between 0 and 0.1 (Appendix S2). This indicates that the collection in most parts of tropical East Africa is incomplete. The collection density and inventory incompleteness were positively related (r² = 0.397; Appendix S3), but the relationship was not very strong, indicating that some areas with a high density of collections were not necessarily completely sampled.

2.3. Incompleteness assessment of species collection

Grid cells based on longitude and latitude of different cell sizes (0.25° × 0.25°, 0.5° × 0.5°, and 1° × 1°) are often used for spatial pattern analysis of plant species (Stropp et al., 2016; Zhang et al., 2016; Ye et al., 2019). Here, we chose 0.25° × 0.25° grid cells for plant information statistics and analysis. We calculated the recorded number of species (Srec) and the number of potential species (Smxt) predicted by MaxEnt. The species richness (Sr) in a grid cell consists of predicted species and recorded species, with each species being counted only once in the grid cell. The inventory completeness is Vc = Srec / Sr (Colwell and Coddington, 1994). Therefore, the formula for inventory incompleteness is Vinc = 1 / Vc, where the larger the value, the more incomplete the inventory.
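The definitions above can be written out as a short Python sketch. It is an illustration only: the function names are hypothetical, the floor-based cell index is one possible way to bin coordinates into 0.25° cells, and returning infinity for a cell with no recorded species is an assumption, since the text does not say how Vinc is handled when Srec = 0.

```python
import math
from typing import Set, Tuple


def grid_cell(lat: float, lon: float, size: float = 0.25) -> Tuple[int, int]:
    """Index of the size-degree grid cell containing a coordinate (assumed scheme)."""
    return (math.floor(lat / size), math.floor(lon / size))


def inventory_scores(recorded: Set[str], predicted: Set[str]) -> Tuple[float, float]:
    """Return (Vc, Vinc) for one grid cell.

    Sr is the union of recorded and predicted species, so each species
    is counted only once. Vc = Srec / Sr and Vinc = 1 / Vc.
    """
    s_rec = len(recorded)
    s_r = len(recorded | predicted)
    if s_r == 0:
        return (math.nan, math.nan)  # no species at all: scores undefined
    v_c = s_rec / s_r
    # Assumption: a cell with predicted species but no records is treated
    # as infinitely incomplete.
    v_inc = s_r / s_rec if s_rec > 0 else math.inf
    return (v_c, v_inc)


v_c, v_inc = inventory_scores({"a", "b"}, {"b", "c", "d", "e"})
print(v_c, v_inc)  # 0.4 2.5
```

In the example, two of the five species expected in the cell have been recorded, so the inventory is 40 % complete and Vinc is 2.5.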

3.2. Interpretation of collection status

Through the analysis of the explanatory variables (Figs. 3 and 4; Appendix S4) by generalized least squares model (Figs. 3 and 4; Appendix S5), we found that the collection density was positively correlated with population density, road density, altitude range, and species richness (p < 0.001; Fig. 3a–d), but showed no significant relationship with nature reserves (r² = 0.0007; Fig. 3e). The total response contrast of all the explanatory variables to collection density was 0.6887. The relative importance of species richness was the highest (50.95 %). The relative importances of road density, population density, and altitude range were 24.64 %, 13.11 %, and 11.26 %, respectively (Fig. 3f). Inventory incompleteness in tropical East Africa was significantly positively correlated with population density and road density, significantly negatively correlated with species richness, and not correlated with elevational range or the proportion of nature reserves (p < 0.001; Fig. 4a–f). The total response contrast of all the explanatory variables to inventory incompleteness was 0.4714. The relative importance of species richness remained the highest (40.67 %), the relative importance of road density increased to 33.45 %, and the relative importance of population density was 17.35 % (Fig. 4f).

2.4. Priority of investigation

We defined a priority area for investigation as a place with a high degree of incomplete collection and high species richness. Here, we took the method used to prioritize protected areas (Albuquerque and Beier, 2015) as a reference and used the weighted species richness for the calculations. The weighted species richness (WR) is calculated as $WR = \sum_{i=1}^{n} (1/c_i)$, where $c_i$ is the number of sites occupied by species $i$, and the values are summed for the $n$ species that occur in that site (Williams et al., 1996). The priority of investigation is P = WR × Vinc. We calculated the geographic distances (km) from all the grids in tropical East Africa to the nearest priority investigation areas to analyze their association with the explanatory variables of incompleteness and anthropogenic factors such as habitat encroachment, armed conflict, and historical factors.

2.5. Explanatory variables of the incompleteness

In this investigation, we used road density (http://sedac.ciesin.columbia.edu/), population density (https://www.protectedplanet.net/), altitude difference (http://www.diva-gis.org/), and species richness in the grid for analysis. Ordinary least squares (OLS) model was applied

3.3. Priority collection

Fig. 2. Geographical patterns of (a) collection density, where white squares denote no collection data; (b) predicted species richness, including the data predicted by MaxEnt software and all recorded data (de-duplicated); (c) inventory incompleteness of tropical East Africa, based on grids of 0.25° × 0.25°.

The weighted species richness ranged from 0 to 175 per grid cell (Fig. 5a), and its spatial distribution was highly consistent with the predicted plant richness (r² = 0.5219; p < 0.001; Appendix S6). However, the weighted species richness was most prominent in areas with narrowly distributed plants. The geographical distribution of collection priority was also uneven. Priority areas were concentrated mainly in the northeast of the coastal zone of Kenya, western Kenya, northern Tanzania (especially the area near the national boundary with Kenya) and southwestern Tanzania (Fig. 5b). There was a significant negative correlation between collection priority and inventory completeness (p < 0.001), but the association was not very strong (r² = 0.3550; Fig. 5b; Appendix S7), indicating that although many regions may be incompletely collected, they do not necessarily require priority collection.
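For readers who want to see how the weighted species richness and priority scores reported here follow from their definitions (WR as the sum of 1/ci over the species in a site, and P = WR × Vinc), a minimal Python sketch is given below. The function names, site labels, species labels, and Vinc values are hypothetical, and the sketch assumes that each site's species set has already been assembled.

```python
from collections import Counter
from typing import Dict, Set


def weighted_richness(site_species: Dict[str, Set[str]]) -> Dict[str, float]:
    """WR per site: sum of 1/c_i over its species, c_i = sites occupied by species i."""
    occupancy = Counter(
        species for members in site_species.values() for species in members
    )
    return {
        site: sum(1.0 / occupancy[species] for species in members)
        for site, members in site_species.items()
    }


def priority(wr: Dict[str, float], v_inc: Dict[str, float]) -> Dict[str, float]:
    """Priority of investigation per site: P = WR * Vinc."""
    return {site: wr[site] * v_inc[site] for site in wr}


# Made-up example: species "b" is widespread, "a" and "c" are each in one site.
sites = {"g1": {"a", "b"}, "g2": {"b", "c"}, "g3": {"b"}}
wr = weighted_richness(sites)
p = priority(wr, {"g1": 2.0, "g2": 1.0, "g3": 3.0})
print({site: round(value, 3) for site, value in wr.items()})
print({site: round(value, 3) for site, value in p.items()})
```

Because each species contributes 1/ci, a site holding a species found nowhere else gains a full point from it, while a widespread species adds little. This is why WR emphasizes narrowly distributed plants.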

Fig. 3. Single predictor relationships between explanatory variables and collection density of tropical East Africa (a–e). Blue lines are ordinary least squares (OLS) model fits, with gray bands indicating 95 % confidence intervals. (f) The relative importance of each variable in explaining the collection density. Significance: *** < 0.001; ** < 0.01; * < 0.05; ns not significant. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article).

Fig. 4. Single predictor relationships between explanatory variables and inventory incompleteness of tropical East Africa (a–e). Blue lines are ordinary least squares (OLS) model fits, with gray bands indicating 95 % confidence intervals. (f) The relative importance of each variable in explaining the inventory completeness. Significance: *** < 0.001; ** < 0.01; * < 0.05; . < 0.1. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article).

4. Discussion

The “data mining” and “knowledge discovery” methods can be used to reduce the gap in biodiversity data coverage (Soberón et al., 2000; Sousa-Baena et al., 2014a); thus, data sources and methods of discovery play a key role in biodiversity research. The flora of tropical East Africa is represented by 0.096 specimens/km², which is far from the ideal of 1–3 specimens/km² for tropical regions (Schmid, 1989; Sousa-Baena et al., 2014a). Our data sources are still limited: only about 50 % of the species were involved in the SDMs, and hence many species with few collections still require further information mining and field collection. Biodiversity investigations are the cornerstone of biodiversity data (Russo et al., 2019), but the digitization and sharing of specimens after investigation is also important. Therefore, we strongly suggest the digitization and sharing of specimens from the local herbaria of museums, universities, botanical gardens, and some private collectors (Moerman and Estabrook, 2010), which will be very beneficial to future research on species spatial distribution in tropical East Africa. In our study, we used the species distribution model to predict plant richness in tropical East Africa, thus filling a large number of collection blanks and constructing the spatial distribution pattern of plant diversity.

The reasons behind inventory incompleteness are often not single but interrelated. Although explanatory variables cannot directly establish causality, they can give some preliminary indications (Ballesteros-Mejia et al., 2013). In tropical East Africa, these factors mainly include road accessibility, perceived species richness, regional security, and some local policy factors. Among them, species richness has the most significant impact on collection density. However, we must also acknowledge that high species richness is partly a result of high collection density. This is circular reasoning, but we believe that richness-driven collection density is a plausible hypothesis, because collecting specimens is a subjective human process, and collectors are usually more willing to visit areas with high species richness ("diversity tracking"; Romo et al., 2006) and convenient areas (high population and road density; Crisp et al., 2010; Ficetola et al., 2015), which is consistent with the conclusions of most researchers (Balmford et al., 2001; Kreft and Jetz, 2007; Yang et al., 2014). For these reasons, presence records tend to accumulate around specific areas and are not randomly collected (Tobler et al., 2007; Loiselle et al., 2008), which causes SDMs to generally under-predict the extent of species ranges (Feeley and Silman, 2011). Nevertheless, inventory incompleteness is still present in high-profile plant diversity hotspot areas, which may be because no complete plant survey has been carried out there and researchers have collected only the plants of their own concern.

Although this study showed that collection density is not closely related to protected areas, we still believe that protected areas play a role in plant collection because nature reserves are more likely to attract the interest of researchers (Parnell et al., 2010; Reddy and Dávalos, 2003). Most protected areas are hotspots of biodiversity, usually characterized by low disturbance, pristine ecosystems and unique habitats, and thus have high plant diversity and endemic richness with a higher degree of species protection (Yang et al., 2014). In addition, many protected areas are more secure. For example, inspections in Kenyan reserves are usually guided and protected by Kenya Wildlife Service (KWS) or Kenya Forest Service (KFS) staff and rangers. In tropical East Africa, nature reserves account for 83 % of the total area (UNEP-WCMC, 2019), but most of the protected areas are designed to protect animals, and researchers who care about tropical East Africa also seem to prefer to study the animals here (Holmes et al., 2017; Holechek and Valdez, 2017). Nonetheless, our findings suggest that collection in many protected areas is seriously inadequate. Therefore, we call for the strengthening of plant inspections of nature reserves in the future, including the animal protected areas.

Is it necessary to give priority to areas where the degree of incomplete inventory is high, especially in tropical East Africa where the inventory is so incomplete? Sosef et al.'s (2017) assessment of African specimen collection through the RAINBIO database indicates that forest areas need to be prioritized, and that the Eastern Arc and coastal forests of Tanzania need priority collection. Our results show that most of the areas with inventory incompleteness or zero collection in tropical East Africa are located in the savannah region. The savannah region of tropical East Africa has a wide area, a harsh environment, and a high degree of similarity in plant species. With limited funding and resources, collectors have reason to choose safer and more accessible areas for specimen collection. It seems that the savannah area is still unnoticed by collectors and difficult to investigate fully. The contribution of species with few records can be emphasized to some extent by the calculation of WR (Henriksson et al., 2016). Species with fewer specimens are more likely to be uncommon, and many of them may be endemic or endangered. Our research provides regional recommendations for identifying priority collection areas in the savannah region through the analysis of WR. Therefore, strengthening priority inspections of areas with high WR is of great significance for understanding the species composition and for the protection of these areas.

More explanations for inventory incompleteness can be given based on the determination of the collection priority. Firstly, differences in collection density among survey regions may be an important factor in inventory incompleteness, especially in areas with high species diversity. For example, we found that the inventory of the southern side of Mount Kilimanjaro is more complete than that of the northern side. Secondly, national borders have contributed to inventory incompleteness, which may be due to political or security considerations between countries, as in the case of Mount Elgon (located in Uganda, bordering Kenya); the uneven collection on Kilimanjaro may also reflect this factor. Thirdly, there are many mosaic-type forest patches in the extensive savannah region that lack investigation. These fragmented forests provide important habitats for many endangered and endemic species (Newmark and Mcneally, 2018). The inaccessibility of these areas is mainly due to the lack of roads and to the lack of inspections for safety reasons related to animals or tribes. Fourthly, although the coastal forests of East Africa are recognized as a global biodiversity hotspot (Mittermeier et al., 2011), they seem to have received little attention. The species collection of coastal zones is still incomplete, especially in the coastal areas of southern Tanzania. We found that the two major incomplete collection areas in Tanzania are where rivers meet the sea, with intricate water networks that form complex terrain such as wetlands and arid deltas, which may make sampling difficult.

There are still opportunities to improve the inventory completeness and coverage of flora data in tropical East Africa. Our study evaluated the current status of plant collection in tropical East Africa, analyzed possible causes of inventory incompleteness, and identified priority collection strategies. We propose strengthening the digitization of specimens held by organizations and individuals, strengthening cooperative investigations between countries, breaking through the bottleneck of administrative units in plant investigation, and establishing priority inspection routes to improve plant inspections in the region (Stephenson et al., 2017). This will be of great significance to the development of plant resources, land use, and biodiversity conservation in East African countries.

Fig. 5. Geographical patterns of (a) weighted species richness and (b) collection priority; (c) geographical distance of priority collection center (Cpri > 3000).

Declaration of Competing Interest

The authors declare that no conflict of interest exists in the submission of this manuscript, and that the manuscript is approved by all authors for publication.

Acknowledgements

We are grateful to the botanists who contributed to the compilation of the Flora of Tropical East Africa. We thank the Global Biodiversity Information Facility and RAINBIO for making their data publicly available online. Thanks to the students from Central China Normal University and Hubei University who helped us to digitize the books. We thank Wenjing Yang from the School of Geography and Environment, Jiangxi Normal University, for providing data analysis guidance. We also thank Anne C. Ochola from Wuhan Botanical Garden, Chinese Academy of Sciences, for help in revising the English. This study was supported by the fund of Sino-Africa Joint Research Center, CAS, China (Y323771W07 and SAJC201322) and National Natural Science Foundation of China (31800176).

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.biocon.2019.108313.

