Ecological Modelling 210 (2008) 414–430
available at www.sciencedirect.com
journal homepage: www.elsevier.com/locate/ecolmodel
Nest selection by snow petrels Pagodroma nivea in East Antarctica
Validating predictive habitat selection models at the continental scale

Frédérique Olivier a,∗, Simon J. Wotherspoon b,1

a Institute of Antarctic and Southern Ocean Studies, University of Tasmania, Private Bag 77, Hobart 7001, Tasmania, Australia
b School of Mathematics and Physics, University of Tasmania, Private Bag 37, Hobart 7001, Tasmania, Australia
Article info

Article history:
Received 4 January 2006
Received in revised form 29 July 2007
Accepted 7 August 2007
Published online 1 October 2007

Keywords:
Habitat selection
Nest distribution
Model validation
Generalized linear model (GLM)
Classification tree (CT)

Abstract

Little is known about the factors controlling the distribution and abundance of snow petrels in Antarctica. Studying habitat selection through modeling may provide useful information on the relationships between this species and its environment, which is especially relevant in a climate change context, where habitat availability may change. Validating the predictive capability of habitat selection models with independent data is a vital step in assessing the performance of such models and their potential for predicting species’ distribution in poorly documented areas. From the results of ground surveys conducted in the Casey region (2002–2003, Wilkes Land, East Antarctica), habitat selection models based on a dataset of 4000 nests were created to predict the nesting distribution of snow petrels as a function of topography and substrate. In this study, the Casey models were tested at Mawson, 3800 km away from Casey. The location and characteristics of approximately 7700 snow petrel nests were collected during ground surveys (summer 2004–2005). Using GIS, predictive maps of nest distribution were produced for the Mawson region with the models derived from the Casey datasets, and predictions were compared to the observed data. Model performance was assessed using classification matrices and receiver operating characteristic (ROC) curves. Overall correct classification rates for the Casey models varied from 57% to 90%. However, two geomorphologically different sub-regions (coastal islands and inland mountains) were clearly distinguished in terms of habitat selection, both by the Casey model predictions and by the specific variations in the coefficients of terms in new models derived from the Mawson datasets. Observed variations in the snow petrel aggregations were found to be related to local habitat availability. We discuss the applicability of various types of models (GLM, CT) and investigate the effect of scale on the prediction of snow petrel habitats. While the Casey models created with data collected at the nest scale did not perform well at Mawson due to regional variations in nest micro-characteristics, the predictive performance of models created with data compiled at a coarser scale (habitat units) was satisfactory. Substrate type was the most robust predictor of nest presence between Casey and Mawson. This study demonstrates that it is possible to predict at the large scale the presence of snow petrel nests based on simple predictors such as topography and substrate, which can be obtained from aerial photography. Such methodologies have valuable applications in the management and conservation of this top predator and associated resources, and may be applied to other Antarctic, Sub-Antarctic and lower-latitude species and in a variety of habitats.

∗ Corresponding author. Tel.: +61 3 6226 7482; fax: +61 3 6226 2973. E-mail address: [email protected] (F. Olivier).
1 Tel.: +61 3 6226 2729; fax: +61 3 6226 2410.
0304-3800/$ – see front matter. Crown Copyright © 2007 Published by Elsevier B.V. All rights reserved. doi:10.1016/j.ecolmodel.2007.08.006
1. Introduction
The habitat-association approach to ecology has been used for a variety of purposes, including conservation and ecological management (Fielding and Bell, 1997; Johnson et al., 2004), and habitat selection models are increasingly used to understand and manage wildlife. The idea that species have predictable habitat requirements that are part of a species’ identity, often referred to as a species’ niche, dominates habitat selection theory (Jones, 2001). It also underlies the statistical models commonly used to assess multivariate species-habitat relationships (Fielding and Haworth, 1995; Karl et al., 2000; Guisan and Thuiller, 2005). These often spatially explicit habitat models can later be used to develop species distribution maps. The approach has been used to develop predictive models for estimating population sizes and geographical ranges in poorly documented areas and for identifying the potential impact of habitat modifications (Stillman and Brown, 1994; Suarez-Seoane et al., 2002). The majority of habitat ecological modeling is based on generalized linear modeling (GLM) approaches, with logistic regression most commonly used for species distribution modeling because such models consider the presence-absence of the target (Rushton et al., 2004; Guisan and Thuiller, 2005). However, the relationships generalized from regression modeling are purely inferential. With no pre-existing knowledge of nest selection by snow petrels in Antarctica, statistical models established at the FitSite (Casey) first helped identify the possible processes determining nest distribution and produce a conceptual, process-based model (Olivier and Wotherspoon, 2005). The statistical validation of the models at the TestSite was necessary to confirm and refine the understanding of such processes.
To strengthen interpretations of statistical inference, it is important to assess the performance and validity of habitat selection models (Fielding, 2002; Pearce and Ferrier, 2000). Because models have their greatest utility when they can be used predictively (Rushton et al., 2004) rather than just in an exploratory manner, model assessment is a vital step in species distribution modeling. Model validation consists in the evaluation of two components (Pearce and Ferrier, 2000): reliability or calibration (the agreement between predicted probabilities of occurrence and observed proportions of sites occupied), and discrimination capacity (the ability of a model to correctly distinguish between occupied and unoccupied sites). Validation methods for habitat selection models have been widely discussed (Boyce et al., 2002; Fielding and Bell, 1997;
Manel et al., 2001), highlighting that a fair and meaningful way of assessing a model is to compare model predictions with observed data (Araújo et al., 2005). Any approach to ecological modeling has little merit if the predictions cannot be, or are not, assessed for their accuracy using independent data (Verbyla and Litvaitis, 1989). However, given the costs and logistical constraints associated with data collection, validation of habitat selection models is often limited to cross-validation based on the dataset used to build the model, or to independent datasets collected in the close vicinity of the model-building data. Validation of models for a widely distributed species with completely new data that are spatially distant from the training dataset is less common (Mladenoff et al., 1999; Lindenmayer et al., 1994), and becomes problematic for species distributed at the continental scale. There is a growing interest in building predictive models of species distributions over large geographic areas, resulting, for example, from the need to assess the potential effect of climate-related habitat changes on populations (Peterson et al., 2002; Thuiller, 2003). In the Antarctic, such effects of climate change may be detected by studying the ecological requirements of bio-indicator species at the continental scale. With limited access to the Antarctic continent, it is difficult to either locate or estimate local breeding populations of top predators, which remain relatively unstudied, such as the snow petrel Pagodroma nivea, Forster. The large-scale compilation work of Croxall et al. (1995) determined that the snow petrel is ubiquitous, present at at least 258 locations around the Antarctic, and breeds at 195 of these. It only briefly mentions ecological factors influencing snow petrel continental and regional distributions.
Processes of nest site formation and selection were suggested in the early work of Brown (1966). Other studies have also proposed topography as the major factor determining the distribution of snow petrel colonies (Ryan et al., 1989). More recently, detailed habitat selection models were established for the snow petrel at Casey, in East Antarctica (Olivier and Wotherspoon, 2005). If proved robust, habitat selection models predicting the presence of nest sites as a function of environmental factors may help distribution and abundance studies, and contribute to a better understanding and management of this top predator. The objectives of this paper are: (i) to briefly provide an up-to-date summary of baseline information on the distribution and abundance of snow petrels in the Mawson region; (ii) to test the habitat selection models established at Casey, that is, to evaluate the predictive ability of the models when transposed to different areas (potentially Antarctica-wide); (iii) to create habitat selection models based on the local Mawson data and systematically compare them with the Casey models in order to better understand the ecological requirements of the species, locally and potentially at the large scale.
2. Methods

2.1. Study area
Although variations exist due to the differing geomorphological histories of the various Antarctic ice-free regions, a certain homogeneity remains in the landscape throughout East Antarctica. For this reason, the validation of habitat selection models originally created at Casey (referred to hereafter as the FitSite) was purposely conducted as far from Casey as logistics permitted, in a regional area of comparable size. Mawson (67°36′S, 62°52′E), 3800 km west of Casey (Fig. 1), was selected as the TestSite. The total ice-free area representing available habitat was 3153 ha at Casey and 3973.5 ha at Mawson (total including the Rookery Islands and Robinson groups and all Framnes ranges). Despite strong similarities in the geomorphology of coastal areas in East Antarctica, local differences exist, and are mostly related to the local climate and geological/glaciological history of the area. At the TestSite, two separate types of landscapes were distinguished (Fig. 1).
• TestIslands (TestIs): under maritime influence in summer, Holme Bay generally reaches subzero temperatures and is frequently subject to medium-strength katabatic winds. Low-level coastal islands (maximum elevation 151 m) and a small number of nunataks (small rocky outcrops emerging from the ice cap) surround the station. Except for a few larger islands (Welch Island, 125 ha, Table 1), the ice-free area of individual islands is generally limited to a few hectares.
• TestMountains (TestMtns): in contrast, the Framnes Mountains emerge at higher elevation within a 70 km² area mostly covered by blue ice flowing to the coast. At this elevation, temperatures may be 8–10 degrees lower than on the coast, although local areas exposed to solar radiation can warm up considerably in the absence of wind. Composed of several ranges mostly oriented in a S(W)–N(E) direction, TestMtns displays a more compact landscape: ice-free areas characterized by steep slopes are concentrated in ranges of at most 9 km in length and 100 m to 2 km in width. Due to logistic limitations, only five of these ranges were thoroughly surveyed (Mt Henderson, Northern, Central and Southern Massom ranges, David and Mt Horden ranges, Fig. 1).
Fig. 1 – (a) Location of Mawson (TestSite) and the two main sub-regions studied: Holme Bay Islands (TestIs) and the Framnes Mountains (TestMtns). Dotted circles are detailed in b and c. (b) 3D view of coastal habitats looking east from Mawson station. (c) 3D view of a section of habitat in the Framnes Mountains (Northern Massom range). Black dots are nest locations; grey squares (various shades of grey) delimit the 200 × 200 m sites. Elevation of significant features is given in m. The grey arrows show the direction of prevailing winds.
Table 1 – Locations surveyed and number of nests recorded per location and proportion of nests correctly predicted with the FitSite model including topography and substrate (TopoSubst)

| Sub-region | Location surveyed | Ice-free area (ha) | Area surveyed (ha) | % of ice-free area searched | Nests observed (total) | Nests observed (non-active) | Density averaged for area (nests/ha) | % of nests predicted by the Casey model |
|---|---|---|---|---|---|---|---|---|
| Framnes Mountains (TestMtns) | Northern Massom | 545.1 | 316.2 | 58.0 | 2705 | 306 | 8.6 | 87.5 |
| | Central Massom | 521.2 | 206.1 | 39.5 | 575 | 94 | 2.8 | 91.1 |
| | Southern Massom | 116.4 | 44.4 | 38.2 | 0 | 0 | 0.00 | – |
| | David range | 766.2 | 259.7 | 33.9 | 455 | 95 | 1.8 | 87.5 |
| | Mt Henderson range | 442.3 | 228.7 | 51.7 | 2740 | 574 | 11.9 | 87.8 |
| | Mt Horden range | 324.0 | 108.3 | 33.4 | 140 | 35 | 1.3 | 86.4 |
| | Total/Average | 2715.2 | 1163.4 | 42.5 | 6615 | 1104 | 4.4 | 87.9 |
| Holme Bay Islands (TestIs) | Arrow Is | 20.7 | 15.1 | 73.3 | 43 | 3 | 2.8 | 65.1 |
| | Bechervaise Is | 67.9 | 47.1 | 69.3 | 113 | 24 | 2.4 | 66.4 |
| | Canopus Is | 30.2 | 22.5 | 74.5 | 6 | 0 | 0.3 | 100.0 |
| | Departure rocks | | | | 0 | 0 | 0.0 | – |
| | Dyer Is | 18.8 | 17.8 | 94.9 | 27 | 0 | 1.5 | 77.8 |
| | East Budd Is | 18.4 | 18.4 | 100.0 | 40 | 4 | 2.2 | 45.0 |
| | Evans Is | 58.1 | 52.8 | 90.9 | 78 | 6 | 1.5 | 60.3 |
| | Flat Is | 25.3 | 25.1 | 99.1 | 16 | 0 | 0.6 | 56.3 |
| | Jocelyn Is group (incl. Is to the south) | 46.0 | 41.5 | 90.3 | 174 | 10 | 4.2 | 75.3 |
| | Kitney Is | 2.3 | 2.3 | 100.0 | 8 | 0 | 3.4 | 0.0 |
| | Klung Is | 66.3 | 35.1 | 53.0 | 35 | 1 | 1.0 | 88.6 |
| | Mawson (incl. Entrance, Hump Is) | 35.9 | 33.7 | 93.8 | 36 | 14 | 1.1 | 47.2 |
| | Nost Is | 27.9 | 25.1 | 89.9 | 0 | 0 | 0.0 | |
| | Ring rocks | 45.2 | 34.2 | 75.8 | 64 | 2 | 1.9 | 60.9 |
| | Peak jones rocks | | | | 0 | 0 | | – |
| | Rouse Is | 16.8 | 15.4 | 91.4 | 113 | 1 | 7.3 | 77.0 |
| | Smith rocks | 13.4 | 13.0 | 96.5 | 41 | 0 | 3.2 | 75.6 |
| | Stinear Is (incl. islands to the north) | 41.4 | 40.0 | 96.6 | 67 | 3 | 1.7 | 92.5 |
| | Teyssier Is | 15.6 | 15.6 | 100.0 | 75 | 2 | 4.8 | 90.7 |
| | Trevillian Is | 9.3 | 9.3 | 100.0 | 8 | 0 | 0.9 | 0.0 |
| | Welch Is | 124.5 | 90.6 | 72.8 | 162 | 4 | 1.8 | 98.8 |
| | Total/Average | 684.1 | 554.7 | 87.5 | 1106 | 74 | 2.2 | 75.0 |
2.2. Environmental descriptors
Model validation required the survey methods and modeling protocols applied at the TestSite to be consistent with those of the FitSite. These were thoroughly described previously (Olivier et al., 2003; Olivier and Wotherspoon, 2005) and are only briefly described in this study, with an emphasis on the small adaptations made for the TestSite (Mawson). A 2 m resolution elevation surface (MEANDEM) was generated in Arcinfo from the original Mawson spot heights (AAD 2002). Several topographic variables were derived with the spatial analyst extension of ArcGIS (ESRI, 2002; see Table 2): slope, aspect, aspect to the prevailing winds (WASPECT), with the average prevailing wind direction estimated at 125° for the TestSite (Bureau of Meteorology, 2005), and curvature (MEANCURV). Although seemingly redundant with other topographic parameters, curvature made it possible to distinguish significant geomorphological features at the FitSite, such as prominent areas (convex ridges, subject to wind erosion) and accumulation areas (concave, where boulders and scree potentially accumulate). Geomorphological variables describing the percentage cover of each type of substrate in habitat units mapped in the field were also created according to the methodology previously used at the FitSite. Permafrost scree (SNOWSCREE) was added to the list of substrate variables (Table 2). Previously unrecorded at the FitSite, this substrate results from the breakdown of mother rock due to the cold and was found only in TestMtns. Like moraine sediment (MORSED), it was highly unfavourable to the presence of nests. To keep model calculation similar to that of the FitSite, SNOWSCREE and MORSED % cover were combined as one input variable. Topographic and substrate values extracted from the layers described above were then attributed to individual nests.
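As an illustration of the aspect-to-prevailing-wind variable, the following minimal Python sketch applies the rule given for WASPECT in Table 2 (A = Abs(MEANASPECT − 125), or 360 − A if A > 180). The function name `wind_aspect` and its arguments are hypothetical and are not part of the original GIS workflow; aspects are assumed to be compass bearings between 0° and 360°.

```python
def wind_aspect(mean_aspect_deg: float, wind_dir_deg: float = 125.0) -> float:
    """Angle (0-180 degrees) between a site's aspect and the prevailing wind.

    Low values: the site faces into the wind; high values: it faces away.
    The 125 degree default is the TestSite wind direction quoted in the text.
    """
    a = abs(mean_aspect_deg - wind_dir_deg)
    # Angles larger than 180 degrees are measured the short way round.
    return 360.0 - a if a > 180.0 else a


# Hypothetical aspects, for illustration only.
for aspect in (125.0, 305.0, 350.0):
    print(aspect, wind_aspect(aspect))
```

With these inputs, a slope facing 125° returns 0 (into the wind), one facing 305° returns 180 (directly away from it), and one facing 350° returns 135.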
2.3. Snow petrel nests
Surveys of snow petrel nests were conducted in TestIs during December 2004, and in TestMtns from January to mid-February 2005. This short, 2-month period ensured adequate consistency of the data because nest occupation could be confirmed easily during the egg incubation or chick rearing periods. Snow petrels are commonly thought to be colonial, nesting in pre-existing holes and rock crevices. All nests, active or abandoned, were recorded and their level of activity specified, based on the definition of an active nest established by van Franeker et al. (1990; “apparently occupied nest site”). Each nest was thoroughly described with a set of descriptors designed to characterize micro-environmental conditions at the nest (Table 2). The same observer described the nests at both the FitSite and the TestSite, ensuring consistency in the classification of habitats. A non-exhaustive comparative analysis of selected nest micro-characteristics at both FitSite and TestSite is presented here to explain the differences observed in the models. Analyzing similarities and differences in the density distribution of nest properties proved useful as a pre-modeling exploratory phase but also as a post-modeling explanatory phase. GPS locations of individual nests were recorded with a Trimble Geoexplorer handheld GPS. Differential correction of the signal enabled the nest location to be recorded at a resolution higher than that of the topographic surfaces, therefore minimizing one source of model error with spatial data. The Windows CE interface, which supported the GIS-based data collection software (Terrasync), was useful for keeping an accurate record of the areas searched while in the field (delimited a priori on a moving map background). Exhaustive searches for snow petrel nests were conducted over randomly selected predefined areas, generally 200 × 200 m grid sites (Fig. 1b and c), except for small islands where the ice-free area was considerably smaller than 4 ha (in which case the entire island was surveyed).

Variables describing nest density were attributed to each nest. The standard ratio “Number of nests/Area” is sensitive to the boundaries chosen for the area over which density is calculated. Therefore, we preferred a nest-centered approach to the calculation of densities (Olivier and Wotherspoon, 2005) and attributed to each nest two values describing the clustering of the spatial distribution of nests for modeling purposes: the number of neighbours within 30 m (NN30COUNT) and the average distance to neighbours within 30 m (NESTDIST, Table 2). These values express part of the spatial dependence (or autocorrelation: the location of one nest is not independent from that of others; Doligez et al., 2003) related to conspecific attraction of this semi-colonial species (Olivier and Wotherspoon, 2005; Keitt et al., 2002).
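The nest-centered density measures can be sketched in a few lines of Python. This is only an illustration under stated assumptions: nest coordinates are taken to be projected x, y positions in metres, "within 30 m" is read as a strict inequality, and the class boundaries of the 1 to 4 coding follow Table 2 with ties assigned arbitrarily. The function names and the toy coordinates are hypothetical.

```python
import math


def neighbour_stats(nests, radius=30.0):
    """For each nest (x, y) in metres, return (number of neighbours within
    `radius`, mean distance to those neighbours or None if there are none)."""
    stats = []
    for i, (xi, yi) in enumerate(nests):
        near = []
        for j, (xj, yj) in enumerate(nests):
            if j == i:
                continue
            d = math.hypot(xi - xj, yi - yj)
            if d < radius:  # assumption: strictly closer than 30 m
                near.append(d)
        mean = sum(near) / len(near) if near else None
        stats.append((len(near), mean))
    return stats


def nestdist_code(mean_dist):
    """Code the mean neighbour distance from 1 (isolated) to 4 (dense).
    Handling of exact 10 m and 20 m values is an assumption."""
    if mean_dist is None:
        return 1
    if mean_dist < 10.0:
        return 4
    if mean_dist < 20.0:
        return 3
    return 2


# Toy coordinates (hypothetical): three clustered nests and one isolated nest.
nests = [(0.0, 0.0), (3.0, 4.0), (0.0, 10.0), (100.0, 100.0)]
for count, mean in neighbour_stats(nests):
    print(count, mean, nestdist_code(mean))
```

Because each nest carries its own neighbourhood, the result does not depend on where the boundary of a search site happens to fall, which is the motivation given in the text for preferring this measure to a nests-per-area ratio.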
Table 2 – Names and description of the variables used in the models and their range of variation per sub-region (TestIslands and TestMtns)

| Variable name | Description | Range of values |
|---|---|---|
| MEANDEM | Elevation averaged for an area of 2 m surrounding each nest location (pixel resolution of the original DEM file) | TestIs: 0–124 m; TestMtns: 354–1229 m |
| WASPECT | Aspect to the prevailing winds, 2 m resolution. Angles to the prevailing winds were calculated as A = Abs(MEANASPECT of the site considered − 125), or 360 − Abs(MEANASPECT of the site considered − 125) if A > 180 (Abs: “absolute value of”). 125 was the average direction of most prevailing winds (BOM, 2005). The resulting WASPECT variable is such that areas facing into the winds had a low angle, and areas away from the wind had a high angle | 0–180° |
| MEANSLOPE | Slope, 10 m resolution | TestIs: 0–42°; TestMtns: 0–61° |
| Observed nest aspect | Orientation of the main nest entrance (categorical: N, NE, E, SE, S, SW, W, NW) | |
| MEANCURV | Convexity expressed as curvature calculated from the 10 m resolution DEM file (arbitrary values; a negative sign expresses concavity, a positive sign convexity; derived with ArcView extension, Behren, 2000) | TestIs: −5.59 to +4.3; TestMtns: −6 to +17 |
| BOULDERBIG | % cover of large boulders (more than 1.2 m in diameter) | TestIs: 0–80%; TestMtns: 0–45% |
| BOULDSMALL | % cover of small boulders (diameter between 0.4 and 1.2 m) | TestIs: 0–45%; TestMtns: 0–40% |
| BARESUBST | % cover of bare substrate (bare geological substrate, often with cracks resulting from erosion) | TestIs: 0–100%; TestMtns: 0–80% |
| MORSED | % cover of moraine sediment (various sedimentary deposits of sand, small rocks and boulders less than 0.4 m in diameter). In the Framnes, permafrost SNOWSCREE was assimilated under this category and combined with MORSED as unfavourable habitat | TestIs: 0–100%; TestMtns: 0–50% |
| SCREE | % cover of scree (broken rocks of various sizes with sharp angles) | TestIs: 0; TestMtns: 0–50% |
| SNOW | % cover of snow (interstitial snow) | TestIs: 0–80%; TestMtns: 0–40% |
| Nest type | 4 categories: under one boulder, under 2 boulders, under flat rock, in a crack of bare substrate | |
| NESTDIST | Average distance to neighbours located within 30 m of each nest | TestIs: 0–29.9 m; TestMtns: 0–29.8 m |
| NESTDISTCODE | NESTDIST coded from 1 to 4 (1: nest with no neighbour within 30 m; 2: 20 m < NESTDIST < 30 m; 3: 10 m < NESTDIST < 20 m; 4: 0 m < NESTDIST < 10 m) | 1, 2, 3, 4 |
| NESTNB | Number of neighbours within 30 m of a given nest | TestIs: 0–15; TestMtns: 0–87 |
| Nest cavity size | Relative rating conducted by one observer for nest cavity volume | S, M, L, XL, XXL |
| Nest cavity depth | In cm, 3 categories | 0–30 cm, 30–60 cm, >60 cm |
| Nest snow blockage | Accessible (A), accessible with snow or ice (AS, AI), blocked with snow or ice (BS, BI) | A, AS, AI, BS, BI |

2.4. Evaluating predictive performance of the models fitted at Casey in the test region

Resource selection functions (RSF; Manly et al., 2002; Boyce et al., 2002) were the main tool used to model nesting habitat selection by snow petrels at the FitSite (Olivier and Wotherspoon, 2005). Based on logistic regression and implemented with generalized linear models (GLMs), the RSF-based models returned the probability of presence of a nest given a certain combination of environmental predictors (substrate and topography). A probability threshold above which a nest was considered present was selected to generate predictive nest distribution maps that maximize the percentage of presences correctly predicted. Such functions were also extended to the prediction of abundance over defined areas (Boyce and McDonald, 1999). GLMs were preferred to generalized additive models (GAMs) for their predictive robustness (Frescino et al., 2001). In our case, models implemented as GAMs tended to overfit this type of data and became too region-specific. In contrast, GLMs are more robust, generalize better to new regions, and are more easily implemented in a standard GIS interface. Significant linear, quadratic and interaction terms of the GLMs were selected based on model deviance, as an indicator of model goodness of fit, and on Akaike’s Information Criterion (AIC; Akaike, 1973).

As an alternative, classification tree (CT) models (Breiman et al., 1984; De’Ath and Fabricius, 2000) were also used to predict nest selection at Casey. CT models estimate the probability of class membership based on the proportion of observations of each class (presence or absence in this study) at any terminal node of the tree. These proportions can in turn be used to produce predictive maps representing the geographic distribution of probabilities of occurrence. The ability of these models to generalize to new regions was also tested. Detailed comments on the performance of all types of modeling approaches for our specific purpose were previously reported (Olivier and Wotherspoon, 2005, 2006). The relative performance of these modeling techniques in various situations has also been extensively reviewed (Guisan and Zimmermann, 2000).

Models were fitted with data summarized over a selection of scales, from the 200 × 200 m grid site to the nest level (Olivier and Wotherspoon, 2005). Because the performance of a habitat selection model should be related to the scale of analysis (Karl et al., 2000), we tested models generated at the habitat unit scale and at the nest scale. Four of the FitSite models were applied to the TestSite: TOPO, TOPOSubst, NestTOPOSubst and NestCT (Table 3). Predictive maps returning the probability of snow petrel nest presence were constructed by applying the coefficients of the FitSite models to the environmental surfaces created for the TestSite using ArcGIS. The probabilities of presence returned by the model were compared with the location of nests determined during field surveys at the TestSite.

Measures of predictive performance based on confusion matrices (2 × 2 classification tables showing the number of correct and incorrect predictions made by the models against observations) were used as the main tools to check the predictive ability of the models (Fielding and Bell, 1997). The proportion of accurately predicted nest presences (sensitivity) and accurately predicted absences (specificity) were estimated for a range of probability thresholds. The objective probability cut point determined by the minimum total misclassification of both presences and absences was used to compare models (Fielding and Bell, 1997). Therefore models
Table 3 – Description of GLM models and their coefficients with logistic regression equations used to calculate a value of Logit(P) and subsequently nest probability of occurrence
Variable: Constant, Meandem, Meanslope, Meancurv, Waspect, Boulderbig, Morsed, Baresubst, Scree, [Baresubst]∧2, [Meancurv]∧2, [Boulderbig]∧2, [Waspect]*[meanslope]*[scree], [Slope]*[waspect]*[boulderbig]
TOPO
TOPOSubst
−1.1615407
−1.721
0.0525286 0.0020434 −0.0037908
0.02713 0.004283 −0.004017 0.1163 0.00674
NestTOPOSubst
TestSiteNestTOPOSubst
−1.729 0.01206 0.0441 0.0004901 −0.007403 0.1226 −0.02191 0.03262
−1.5182689 0.0302475 0.1230183 0.1432326 0.0024996 0.4940184 −0.0817645 0.0225338
−0.0003037 0.000001719 −0.0008735
−0.0003985
NestCTa >82.8 >79.25 >7.5/>12.5 <27.5 >67.5
−0.00004069 −0.001438
−0.0060232 0.00001153
0.00001656
The logistic regression equation used to calculate Logit(P) is: Logit(P) = intercept + coefficient * variable + . . . The inclusion of “Nest” in a model name indicates that the model input variables were compiled at the nest scale rather than at the habitat unit scale. “∧2”, squared; “>”, more than; “<”, less than.
a For the classification tree model (CT), the thresholds are given individually in the table but are combined as follows in the model: Con([boulderbig] > 12.5, Con([morsed] < 27.5, 1, 0), Con([curvature] >= 79.25, Con([boulderbig] >= 7.5, 1, Con([meandem] >= 82.82, 1, 0)), Con([scree] >= 67.5, 1, 0))), where “Con(. . ., . . ., . . .)” is an IF-THEN-ELSE formula defining the conditions of occurrence of nests with thresholds (ArcGIS).
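To make the two model forms in Table 3 concrete, the sketch below transcribes the nested Con(...) expression of the classification tree footnote into Python and shows how a Logit(P) value converts to a probability of occurrence. It is a minimal illustration and not the ArcGIS implementation used in the study: the helper names `con`, `nest_ct` and `probability_from_logit` are hypothetical, and the coefficient values in the final example are invented placeholders, not values from Table 3.

```python
import math


def con(condition, if_true, if_false):
    """IF-THEN-ELSE helper mirroring the Con(...) formula of the footnote."""
    return if_true if condition else if_false


def nest_ct(boulderbig, morsed, curvature, meandem, scree):
    """Return 1 (nest predicted present) or 0, following the nested
    thresholds quoted for the NestCT model."""
    return con(
        boulderbig > 12.5,
        con(morsed < 27.5, 1, 0),
        con(
            curvature >= 79.25,
            con(boulderbig >= 7.5, 1, con(meandem >= 82.82, 1, 0)),
            con(scree >= 67.5, 1, 0),
        ),
    )


def probability_from_logit(intercept, coefficients, values):
    """P = 1 / (1 + exp(-Logit(P))), with
    Logit(P) = intercept + sum(coefficient * variable)."""
    logit = intercept + sum(coefficients[k] * values[k] for k in coefficients)
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical habitat unit: 20% large boulders, 10% moraine sediment.
print(nest_ct(boulderbig=20, morsed=10, curvature=0, meandem=50, scree=0))

# Placeholder coefficients for illustration only (NOT taken from Table 3).
demo_coefficients = {"meanslope": 0.05, "waspect": -0.004}
demo_values = {"meanslope": 20.0, "waspect": 90.0}
print(probability_from_logit(-1.0, demo_coefficients, demo_values))
```

Higher-order terms of a GLM (squared terms or three-way products) can be handled by the same function if each product is computed beforehand and passed in as its own named variable.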
could be compared when their respective performances were at their best. The effect of the choice of a cut point was also examined (the local TestSite cut points were identified and compared to predictive results obtained with the FitSite cut point). To complement these results, receiver operating characteristic (ROC) curves were generated for each model. ROC curves plot the proportion of correct presence predictions (sensitivity) against the proportion of false positives (1 − specificity) for a range of probability cut-off values (0–1). They therefore provide a summary measure of model performance (Pearce and Ferrier, 2000) and a threshold-independent method of comparing models of different nature (Brotons et al., 2004) using the area under the curve (AUC; Mason and Graham, 2002). Because simple predictive classification tables cannot be built unless multiple abundance classes are created, the performance of the abundance model was assessed by comparing the abundance predicted by the model in a 30 m radius around each nest to the effective measure of density calculated for each nest (NESTDIST). The distribution of the difference, which represented model errors, was analyzed to assess model performance.
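The performance measures described in this section can be sketched as follows. The code is a minimal, dependency-free illustration rather than the software used in the study: the function names and the toy probabilities are hypothetical, a nest is predicted present when the probability is at or above the cut point (an assumption), and the AUC is computed through its rank interpretation (the chance that a randomly chosen presence receives a higher probability than a randomly chosen absence).

```python
def confusion(probs, observed, cut):
    """Counts of the 2 x 2 classification table at a probability cut point."""
    tp = sum(1 for p, y in zip(probs, observed) if p >= cut and y == 1)
    fn = sum(1 for p, y in zip(probs, observed) if p < cut and y == 1)
    tn = sum(1 for p, y in zip(probs, observed) if p < cut and y == 0)
    fp = sum(1 for p, y in zip(probs, observed) if p >= cut and y == 0)
    return tp, fp, tn, fn


def sensitivity_specificity(probs, observed, cut):
    """Proportion of presences and of absences correctly predicted."""
    tp, fp, tn, fn = confusion(probs, observed, cut)
    return tp / (tp + fn), tn / (tn + fp)


def best_cut(probs, observed, cuts):
    """Cut point with the fewest total misclassifications
    (the first candidate wins in case of ties)."""
    def errors(cut):
        _, fp, _, fn = confusion(probs, observed, cut)
        return fp + fn
    return min(cuts, key=errors)


def auc(probs, observed):
    """Area under the ROC curve from pairwise comparisons; ties count 0.5."""
    pos = [p for p, y in zip(probs, observed) if y == 1]
    neg = [p for p, y in zip(probs, observed) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): predicted probabilities and observed presence (1)
# or absence (0) at six locations.
probs = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
observed = [1, 1, 0, 1, 0, 0]
cuts = [i / 10 for i in range(1, 10)]

cut = best_cut(probs, observed, cuts)
print(cut, sensitivity_specificity(probs, observed, cut), auc(probs, observed))
```

Evaluating the same predictions at a cut point borrowed from another region, instead of the locally optimal one, changes sensitivity and specificity without changing the AUC, which is why the two kinds of measure are reported side by side.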
2.5. Modeling of nest selection with the TestSite data and systematic comparison with FitSite models

In light of the results of the application of the FitSite models to the TestSite, we created original habitat selection models for the TestSite at the nest scale (model called TestSiteNestTOPOSubst). Nest-scale selection models related the presence-absence of nests to environmental predictors calculated for a small area around each nest and/or to nest micro-characteristics (Olivier and Wotherspoon, 2005). At this scale, habitat selection was evaluated by comparing used versus available habitats (Manly et al., 1993), and such binomial models also required non-nest (absence) points to be artificially generated (Zaniewski et al., 2002; Gross et al., 2002). In this study, because exhaustive searches were conducted in pre-defined areas, it was possible (observer error aside) to generate “reliable” absence data with random points (Olivier and Wotherspoon, 2006). Two sets of random points (TestIs, 1162 random points and TestMtns, 6981 random points) were generated following the method previously selected as the most appropriate for this species at the FitSite (Olivier and Wotherspoon, 2005). Random points were generated in the sites searched, at least 30 m away from nests (a buffer to avoid placing random points in high-density habitats); their number slightly exceeded the number of nests found in the area, to compensate for the potential effect of prevalence (Manel et al., 2001). The proportion of the deviance explained by each variable for the TestSite overall was compared to that of the FitSite.

Given the inaccuracies found in the FitSite model predictions, we modeled habitat selection for the two sub-regions of the TestSite in order to identify the coefficients (and variables) that differed between regions: FitSite, TestIs, TestMtns. In light of the obvious differences between coastal and mountain habitats, and similarly to a study conducted by Osborne and Suarez-Seoane (2002), we chose to keep TestIs and TestMtns separate in further analyses. First, we constructed models using the same variables as those selected in the final models for the FitSite, but recalculated and compared one by one the coefficients and the proportion of deviance explained by each variable. The predictive performance of these models applied to TestSite data was evaluated as described above (ROC curves). Second, models were selected for the TestSite with a manual stepwise selection method based on deviance and reduction in AIC (Akaike Information Criterion; Burnham and Anderson, 2002). The effect of the inclusion of spatial dependence terms (NESTDIST) was investigated with the model that included all main effects.
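The generation of absence points described in this section (random points inside the searched sites, at least 30 m away from any nest) can be illustrated with a simple rejection-sampling sketch. It rests on several assumptions that are not in the original: the search site is treated as a single 200 × 200 m rectangle in projected metres, the function name `random_absences` and the toy nest coordinates are hypothetical, and the actual procedure followed Olivier and Wotherspoon (2005), whose details are not reproduced here.

```python
import math
import random


def random_absences(nests, n_points, site=(0.0, 0.0, 200.0, 200.0),
                    buffer_m=30.0, seed=0, max_tries=100_000):
    """Draw random points inside a rectangular search site, rejecting any
    candidate closer than `buffer_m` to a recorded nest."""
    rng = random.Random(seed)
    xmin, ymin, xmax, ymax = site
    points = []
    tries = 0
    while len(points) < n_points and tries < max_tries:
        tries += 1
        x, y = rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)
        if all(math.hypot(x - nx, y - ny) >= buffer_m for nx, ny in nests):
            points.append((x, y))
    return points


# Toy nests (hypothetical). One more absence point than nests is requested,
# echoing the slight excess of random points described in the text.
nests = [(50.0, 50.0), (150.0, 120.0)]
absences = random_absences(nests, n_points=len(nests) + 1)
print(absences)
```

The `max_tries` guard simply stops the loop if the buffer leaves too little room in a densely occupied site; in that case fewer points than requested are returned.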
3. Results
3.1. Distribution and abundance of snow petrels in the Mawson area

A total of 1107 snow petrel nests were located in 19 islands of TestIs and 6605 nests were located in TestMtns (Table 1). Of these nests, the proportion of apparently occupied nests (van Franeker et al., 1990) varied from 94.0% in TestIs to 81.5% in TestMtns, indicating a lesser pressure for available habitat in the latter (a large proportion of previously occupied nest sites were abandoned, some of them highly degraded; the latter represented 3.6% of the nests in TestMtns, none in TestIs, and were previously unobserved at the FitSite). In TestIs, nest distribution was relatively homogeneous at the large scale, with very few sites found completely empty of snow petrels (two), and generally low nest densities (average: 2.2 nests/ha, Table 1). At the search site scale (200 m), nests were found in small aggregations or small pockets of suitable habitat (especially groups of boulders or cracks). In TestMtns, densities per search site were comparable to those of the FitSite (average 4.4 nests/ha, Olivier et al., 2003), although very high-density aggregations were less common (see further analysis of nest-centered densities). Snow petrel distribution was much more discontinuous, with large patches or entire ranges recorded with no nest. Despite seemingly harsher living conditions, TestMtns supports 80% of the snow petrel population recorded.
3.2. Application of the Casey model to the Mawson area
The application of the model TOPO was satisfactory in TestIs with sensitivity and specificity around 70%, but the performance of this model was limited in TestMtns despite 72.9% of correct absence predictions (Table 4). If used with the FitSite cut point (0.43), the same model over predicted nest presence in TestMtns, bringing the model overall prediction
rate to a falsely satisfactory result (76.7% on average, against 59.7% when used for the entire TestSite area at a cut point of 0.5). This underlines the importance of the choice of cut point values when estimating prediction performance as it dramatically alters the calculation of predictive performance for each model when applied to a new area. Cut points were generally lower for TestIs than at the FitSite and higher in TestMtns. For models generated locally such variation did not occur between sub-regions (constant cut point of 0.7 for the TestSite model overall). Extreme values of the cut point were also better avoided for a more realistic estimation of performance: the model NestTOPOSubst is an example where overall performance of the model was satisfactory only with an unrealistically high/low cut point. Detailed examination of the density distributions for predicted probability values (Fig. 2) underlined the lack of discrimination of this model, especially for TestMtns (all areas associated with a high probability of nest occurrence). For TestIs, absences were correctly discriminated by the default density distribution of model probabilities whereas probability of presence homogeneously ranged from 0 to 1. ROC curves (Fig. 3) confirmed the low performance of this model by showing simultaneous proportions of false positive and true positive being no higher than those expected by chance (AUC < 0.5), especially for TestMtns and relatively unsatisfactory in TestIs. Despite low sensitivity values, NestCT returned satisfactory results at the TestSite, with prediction rates similar to that of the FitSite in TestMtns (Table 4, Fig. 4). In general, the performance of models created with data compiled at the nest scale was lower than that of models created at the habitat scale, making the latter more transferable. Indeed, prediction rates obtained with TOPOSubst were higher for the TestSite than with the FitSite training data, with an overall efficiency of 80.9%. 
This showed TOPOSubst as a potentially robust model for applications at the large scale. Overall, the performance of the models in TestIs was similar to that obtained at the FitSite, especially on the basis of topographic parameters (TOPO performed reasonably well, but only in coastal areas). In TestMtns, the improved performance of TOPOSubst was due to the addition of the substrate terms in this model, suggesting an added importance of the substrate in the habitat selection process for this sub-region. When tested over the entire TestSite, such differences were masked by an overall medium performance of the model.
Table 4 – Model performance evaluation for several models tested in TestIs and TestMtns separately and the entire TestSite, with results previously obtained at the FitSite for comparison
Location   Performance measure   TOPO      TOPOSubst   NestTOPOSubst   NestCT model   Mawson nest model
           Scale                 Habitat   Habitat     Nest            Nest           Nest
TestIs     Cut point             0.28      0.38        0.17            0.5            0.7
           Sensitivity (%)       69.3      75.0        70              52.7           77.5
           Specificity (%)       72.8      80.3        70              92.3           78.6
TestMtns   Cut point             0.58      0.63        0.95            0.5            0.7
           Sensitivity (%)       38.9      96.2        40              72.9           82.4
           Specificity (%)       72.9      80.1        40              89.3           81.9
TestSite   Cut point             0.5       0.55        0.93            0.5            0.7
           Sensitivity (%)       60.3      78.7        60.2            67.9           81.7
           Specificity (%)       59.1      82.3        62.1            90.8           80.4
           Efficiency (%)        59.7      80.9        61.0            90.2           –
FitSite    Cut point             0.43      0.43        0.5             0.5            –
           Sensitivity (%)       62.1      68.2        73.8            78.5           –
           Specificity (%)       66        77.6        77.2            82.8           –
           Efficiency (%)        64        73.5        76.0            81.1           –
–, no data.
Fig. 2 – Density distribution of probabilities attributed by the models to observed nest presence and absence for 3 models established at the FitSite applied at TestIs and TestMtns separately (a–c) and the GLM model created with the entire TestSite data set (d). X axes are probability values attributed by the model (0–1) to either pixels that contained a nest (presences) or pixels that did not contain a nest (absences). Vertical axis is the number of nests that were attributed each probability value (from 0 to 1, horizontal axis).
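The sensitivity of the performance measures in Table 4 to the choice of cut point can be made concrete with a small worked example. The sketch below assumes that a pixel is classified as a presence when its predicted probability is at or above the cut point, and that "efficiency" denotes the overall correct classification rate. The probabilities and observations are invented for illustration and are not data from the study.

```python
def classification_rates(probs, observed, cut_point):
    """Return (sensitivity, specificity, efficiency) at a given cut point.

    probs: predicted probabilities of nest presence.
    observed: 1 for an observed nest, 0 for an absence point.
    """
    tp = fp = tn = fn = 0
    for p, obs in zip(probs, observed):
        predicted = p >= cut_point  # assumed classification rule
        if predicted and obs:
            tp += 1
        elif predicted and not obs:
            fp += 1
        elif not predicted and obs:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)            # presences correctly predicted
    specificity = tn / (tn + fp)            # absences correctly predicted
    efficiency = (tp + tn) / len(observed)  # overall correct rate
    return sensitivity, specificity, efficiency


# Hypothetical predictions for four nests followed by four absence points.
probs = [0.9, 0.6, 0.45, 0.3, 0.8, 0.5, 0.35, 0.1]
observed = [1, 1, 1, 1, 0, 0, 0, 0]

low = classification_rates(probs, observed, 0.4)   # (0.75, 0.5, 0.625)
high = classification_rates(probs, observed, 0.7)  # (0.25, 0.75, 0.5)
```

With the same predictions, raising the cut point from 0.4 to 0.7 trades sensitivity for specificity, which is why a cut point tuned in one region can give a misleading impression of performance in another.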
A nest abundance model established at the habitat scale was also tested. In TestIs, errors were normally distributed with the abundance model returning on average 1.6 nests more than calculated with NNCOUNT (min: −9, 1st quartile: −1, median: 2, 3rd quartile: 4, max: 30). In TestMtns, abundance models did not perform as well and predicted abundances were generally underestimated by a factor of 2 or more (min: −71, 1st quartile: −16, mean: −6, median: −4, 3rd quartile: 5, max: 40).
Fig. 3 – ROC curves obtained with the data from the two types of nest habitats, TestIs and TestMtns, for 3 of the FitSite nest selection models and for the model fitted with the test data (MawsonNestTopoSubst). Sensitivity is the proportion of nests correctly predicted by the model. Specificity is the proportion of absences correctly predicted by the model, therefore 1 – specificity is the proportion of false presences returned by the model. The 45-degree line represents the true positive and false positive values expected by chance; the larger the area between this line and the model ROC curve (when above it), the better the model.
Fig. 4 – Visualization of model outputs for 3 Casey (FitSite) nest selection models applied at Mawson (TestSite) and the model created with TestSite data for a portion of the North Masson Range (within TestMtns). Same 3D perspective as Fig. 1c. Locations of nests are dots (black or grey). The increasing probability of presence is represented in grayscale (from 0 in black to 1 in white).
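The ROC comparison used in this section (Fig. 3) can be summarised by the area under the curve (AUC). A minimal sketch is given below. It relies on the standard equivalence between the AUC and the probability that a randomly chosen presence receives a higher predicted probability than a randomly chosen absence, with ties counted as one half. The probabilities are hypothetical and the function name `roc_auc` is chosen only for illustration; the study itself may have computed the curves differently.

```python
def roc_auc(presence_probs, absence_probs):
    """Area under the ROC curve from pairwise comparisons.

    Counts the share of (presence, absence) pairs in which the presence
    has the higher predicted probability; ties contribute one half.
    """
    wins = 0.0
    for p in presence_probs:
        for a in absence_probs:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presence_probs) * len(absence_probs))


# Hypothetical model probabilities at nests and at absence points.
presence = [0.9, 0.6, 0.45, 0.3]
absence = [0.8, 0.5, 0.35, 0.1]

auc = roc_auc(presence, absence)       # 10 of 16 pairs ranked correctly: 0.625
reversed_auc = roc_auc(absence, presence)  # 0.375, i.e. worse than chance
```

A value of 0.5 corresponds to the 45-degree chance line. Values below 0.5 indicate that the model tends to rank absences above presences.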
3.3. Comparison with models built with the TestSite nest data
To highlight the environmental predictors that induced significant differences in model performance between regions, variable coefficients were compared systematically between the models created at the FitSite and the models built with the same variables using the datasets collected at the TestSite. First, we considered the models that were selected at the FitSite for having the highest explanatory power (Models 1 and 2, Fig. 5a; see also Olivier and Wotherspoon, 2006). While MEANSLOPE was a major explanatory factor of nest distribution in the generally flat coastal areas, the presence of nests was mostly correlated with elevation (MEANDEM) in TestMtns (about 20% of the deviance explained across models). The sign of coefficients was also a potential indication of varying selection processes: while the probability of nest presence increased with elevation at the FitSite, it decreased at both TestIs and TestMtns (negative coefficients). This trend was confirmed by the nest density distributions (Fig. 6) where 90% of nests were located at lower elevations (350–850 m over
a 350–1250 m range for TestMtns and 0–25 m over a 0 to 125 m range for TestIs). At the FitSite, neither slope nor elevation strongly correlated with nest presence although curvature (MEANCURV) did. The positive contribution of BOULDERBIG to nest presence was found similar between TestIs and TestMtns (explaining 64–70% of the deviance) and was 30% higher than for the selected FitSite models. The negative correlation of the binomial response with MORSED explained a large part of the deviance in the FitSite models (11–37%) but not at the TestSite where this substrate was generally not mutually exclusive of BOULDERBIG. The effect of SCREE was significant but low in the TestSite models. This substrate was not encountered in TestIs and present in far lower proportions in TestMtns than at the FitSite. Relative contributions of other terms varied but remained low. In a second stage, we compared the coefficients of models comprising all variables as main effects only (Fig. 5b). This simple model (Main effects) explained 68.4% of the deviance for TestMtns and 63.6% of the deviance for TestIs and therefore had a higher predictive power than at the FitSite (43.3%). The FitSite model was enhanced by interactive terms (those selected in models 1 and 2: deviance explained by SLOPE*ASPECT*SCREE or BOULDERBIG*2 whose deviance was minimal for the TestSite). No other major interactive term appeared in the stepwise selection process at the TestSite. Nest presence was negatively correlated to BARESUBST at both
Fig. 5 – Comparison of model coefficients and their proportion of deviance between 2 regions at Mawson (TestIs, in grey, TestMtns, in white) and FitSite (in black). (a) With the models selected at the FitSite for their predictive performance. (b) With simple models built with all main effects considered. Coefficient values for each area are reported on the Y-axes.
Fig. 6 – Density distribution of topographic parameters for the nests surveyed in TestIs, in the TestMtns and at the FitSite for comparison. Vertical axis: number of nests, horizontal axis: respective values of variables (units defined in Table 2).
TestIs and TestMtns, and to snow and moraine sediment at TestIs only. The comparison of the ROC curves (Fig. 3) indicated that the performance of the model locally created with the TestSite data compiled at the nest scale was higher than that of the habitat scale model transferred from the FitSite. Similar to the FitSite model in its prediction of absences (high specificity values), the TestSite nest model had a higher sensitivity at medium and low specificity values, therefore having a higher capacity in predicting presences. The inclusion of a spatial dependence term (NESTDISTCODE) increased the proportion of deviance explained by the models (by about 3.5–69.9% for TestIs and by 12–81.5% for TestMtns) but its contribution was lower than at the FitSite where it accounted for 25% of the total deviance with 58.4% of the deviance explained (Fig. 5b).
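The "proportion of deviance explained" compared throughout this section can be written out for a presence-absence model as one minus the ratio of residual to null deviance. The sketch below assumes ungrouped binary observations, for which the binomial deviance reduces to minus twice the log-likelihood, and a null model that predicts the overall prevalence everywhere. The fitted probabilities are invented and the function names are hypothetical.

```python
import math


def binomial_deviance(observed, probs):
    """Deviance of a binary (0/1) response given predicted probabilities."""
    eps = 1e-12  # keep log() finite if a probability is exactly 0 or 1
    total = 0.0
    for y, p in zip(observed, probs):
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -2.0 * total


def deviance_explained(observed, fitted_probs):
    """1 - residual deviance / null deviance (null model = prevalence)."""
    prevalence = sum(observed) / len(observed)
    null_dev = binomial_deviance(observed, [prevalence] * len(observed))
    resid_dev = binomial_deviance(observed, fitted_probs)
    return 1.0 - resid_dev / null_dev


# Hypothetical presences (1) and absences (0) with fitted probabilities.
observed = [1, 1, 1, 0, 0, 0]
fitted = [0.9, 0.8, 0.6, 0.4, 0.2, 0.1]
d2 = deviance_explained(observed, fitted)  # roughly 0.60
```

The contribution of a single term, such as a spatial dependence term, can be read as the change in this quantity when the term is added to the model.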
3.4. Nest characteristics compared between locations
Because the FitSite model did not return satisfactory predictive performance at the nest scale, nest selection may differ between FitSite and TestSite due to nest micro-characteristics. Therefore, we compared the distribution of nest characteristics between FitSite, TestIs and TestMtns in order to investigate diverging model results. Based on the calculation of the average nearest neighbour distance and count, the nest-centered 30 m radius densities were comparable between TestIs and TestMtns and between TestSite and FitSite with minor differences (Fig. 6). At the FitSite and at TestIs, the average distance to neighbours was about 15 m, but slightly higher for TestMtns, indicating a sparser distribution of nests, despite maximum densities reaching 80 nests (vs. 150 at the FitSite). The largest proportion of nests with no or few neighbours within 30 m was found in TestIs and maximum densities did not exceed 15 nests (respectively 13%, 1.6% and 4.4% of nests had no neighbours at TestIs, TestMtns and FitSite). While the FitSite was partitioned between low-density and high-density habitats, the distribution of densities was more homogeneous at the TestSite, potentially explaining the lesser effect of clustering (NESTDISTCODE) in the models. A higher diversity was present in the types of nest aggregation observed at the TestSite and indicated a lesser constraint of the substrate on nest dispersion (Olivier and Wotherspoon, 2006). Indeed, no strong relationships between clustering and habitat type (% cover in substrate) were identified at the TestSite. Scree was non-existent in TestIs, where landscape was dominated by bare substrate interrupted with sparse, small pockets of boulders where a few nests were “forced” close together (explaining a large proportion of nests, 14.5%, being unusually close to neighbours; Fig. 6, NESTDIST). Cracks in the bare substrate also accounted for a large number of nests in TestIs (26.3%, vs. 12.6% in TestMtns and 18.7% at the FitSite; Fig. 7). Field observations confirmed that bare substrate generally offered fewer potential cavities at the TestSite than at the FitSite, probably due to stronger wind erosion, but cavities were used proportionally more in TestIs. However, nest size did not appear to be related to substrate type for the TestSite. Despite a variety of nest types, a strong uniformity was present in the level of concealment of nests. The distribution of cavity size and cavity depth was very similar between locations (Fig. 7). The optimal (or most frequently selected) slope at the nest scale was similar between FitSite and TestIs, but significantly steeper in TestMtns (Fig. 6). On the other hand, the nest aspects most frequently recorded at FitSite largely differed from those at TestSite and between TestIs and TestMtns (Fig. 7). In TestIs, nests with SE to SW aspects made up 46.4% of the nests despite facing the exposed slopes (southeasterly winds). In TestMtns, the most common nest aspects were W to N (37.1%). While a large number of nests were affected by snow blockage at FitSite (20%), the proportion of nests blocked or partially obstructed with snow (for data collected within the same period) was negligible at TestSite. However, the occurrence of ice in nests of TestSite was more common (all aspects combined; Fig. 7 shows aspect and nest conditions). Overall, snow blockage and aspect were unrelated at TestSite.
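The nearest neighbour distance and the nest-centered 30 m count compared in this section can be derived from nest coordinates alone. The following is a minimal sketch, assuming planar coordinates in metres and a brute-force comparison of all nest pairs. The coordinates are hypothetical, and the exact definitions behind the variables NESTDIST and NNCOUNT in the original models are not restated here, so the correspondence is an assumption.

```python
import math


def neighbour_stats(nests, radius=30.0):
    """For each nest, return (distance to nearest neighbour,
    number of other nests within `radius` metres).

    Requires at least two nests; compares every pair (O(n^2)).
    """
    stats = []
    for i, (x, y) in enumerate(nests):
        dists = [
            math.hypot(x - ox, y - oy)
            for j, (ox, oy) in enumerate(nests)
            if j != i
        ]
        nearest = min(dists)
        count = sum(1 for d in dists if d <= radius)
        stats.append((nearest, count))
    return stats


# Hypothetical nest positions (metres): a small cluster and one isolated nest.
nests = [(0.0, 0.0), (3.0, 4.0), (0.0, 20.0), (200.0, 0.0)]
stats = neighbour_stats(nests)

# Share of nests with no neighbour within 30 m (one of four here).
isolated_share = sum(1 for _, count in stats if count == 0) / len(stats)
```

For several thousand nests a spatial index would be preferable to the pairwise loop, but the quantities obtained are the same.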
4. Discussion
4.1. Model predictions highlight differential habitat selection related to landscape differences
Models established at the FitSite performed differently in the two TestSite habitats (TestIs and TestMtns). Creating separate models for TestIs and TestMtns helped characterize the differences between these habitats and compare them to the FitSite. The varying relationship between nest selection and topography (aspect and elevation) expressed by model coefficients and their respective proportion of total model deviance highlighted small ecological differences between TestSite and FitSite, and also between TestIs and TestMtns. Despite its moderate contribution to model deviance, WASPECT was identified as an important variable in the comparison. While nest snow blockage at the FitSite was specifically avoided by nesting in areas exposed to the prevailing winds (Olivier and Wotherspoon, 2005), a large proportion of nests in TestMtns were located on the edge of the snow accumulation areas (commonly called “blizz-tails”) and facing North (away from the prevailing winds). Despite the presence of favorable substrates, few nests were found on slopes facing the prevailing winds (East facing slopes), except in protected gullies. Overall, there may be a trade-off between the chances of having a nest blocked by snow early in the summer and seeking better shelter from winds, which are generally stronger at the TestSite than at FitSite (average over the last 10 years: 14.5 knots for FitSite vs. 24 knots for TestSite, Bureau of Meteorology, 2005). Independence between nest snow blockage and aspect (Fig. 7) confirmed that the lower snow precipitation and stronger winds encountered at the TestSite limit snow deposition in general and in nests facing away from the wind in particular. In TestIs, the second largest proportion of nests was also recorded on North facing slopes. 
The majority of nests were located on the southern slopes of islands, which face the prevailing winds, therefore indicating a selection process different from that observed in TestMtns (confirmed by opposite signs of the coefficient attributed to WASPECT). Selection here may be explained by a second factor addressed at the FitSite (Olivier and Wotherspoon, 2005): the presence of nest cavities can also be correlated to WASPECT because of the erosion process, which predominantly generates cracks or boulders into the wind. Nest availability then determines nest selection despite a higher level of wind exposure.
4.2. Evaluation and explanation of model robustness
While the model TOPOSubst predictions based on substrate variables performed best in TestMtns, its good predictive performance in TestIs relied mostly on topographic factors, due to the observed differences in substrate covers. Despite these variations, overall model predictions remained satisfactory for TestSite overall and for TestIs and TestMtns, separately.
Fig. 7 – Distribution of nest micro-characteristics for TestIs, TestMtns and FitSite. The nest aspect was recorded by the observer in the field. Proportions of various levels of snow blockage were plotted for each aspect, (codes for the three variables in Table 2: Accessible (A), Accessible with snow or ice, AS, AI; blocked with snow or ice, BS, BI). Vertical axis: number of nests, horizontal axis: respective category of variables (as defined in Table 2: 1B: nest under one boulder, 2B: nest under two boulders, Crack: nest in a crack of mother rock, Flat rock: nest under flat rock).
The models establishing selection at the habitat scale (e.g. with variables averaged over habitat units) for the FitSite performed better at the TestSite than in their training area. The robustness of the FitSite model was due to the fact that the multiple logistic regression coefficients expressed selection trends intermediate between those observed in TestIs and TestMtns. This explains its overall good performance. The choice of probability thresholds and, more generally, the choice of several alternative methods to validate the models may also help achieving a better evaluation of overall model performance. Conducting the validation phase with new data was most valuable to detect such differences in the variety of models applied to interpret and predict habitat selection (Verbyla and Litaitis, 1989). The GLM approach adopted as our main modeling strategy, appeared appropriate for extrapolation at the large scale. GLM’s robustness is based on regression coefficients that multiply the values of each term in the model, thus allowing potentially large variations of environmental factors to be incorporated and their effect on habitat selection detected (large differences in the proportions of certain substrates, for example). On the other
hand, the use of classification tree (CT) models with data compiled at the nest scale returned satisfactory results. However, this threshold-based modeling procedure has to be used with caution and may not be transferable in general because local thresholds determining habitat selection may vary between regions (here, elevation range varies between FitSite, TestIs and TestMtns).
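The contrast drawn above between coefficient-based GLM predictions and threshold-based tree rules can be sketched with a toy example. Everything in it is hypothetical: the coefficients, the variable names `boulder_cover` and `elevation`, and the 100 m split are invented to illustrate the mechanism and are not taken from the fitted models.

```python
import math


def glm_probability(intercept, coefs, values):
    """Logistic GLM: each coefficient multiplies the value of its term,
    and the linear predictor is mapped to a probability."""
    eta = intercept + sum(coefs[name] * values[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-eta))


def tree_rule(values, elevation_threshold=100.0):
    """Toy single-split tree: presence is predicted only below a fixed
    elevation, whatever the substrate."""
    return values["elevation"] < elevation_threshold


# Hypothetical coefficients and two sites with identical boulder cover (%)
# but very different elevations (m).
coefs = {"boulder_cover": 0.04, "elevation": -0.002}
island = {"boulder_cover": 50.0, "elevation": 20.0}
mountain = {"boulder_cover": 50.0, "elevation": 600.0}

p_island = glm_probability(-1.0, coefs, island)      # about 0.72
p_mountain = glm_probability(-1.0, coefs, mountain)  # about 0.45
```

In this sketch the GLM lowers the probability gradually as elevation rises, whereas the fixed split excludes every mountain site outright, which illustrates why a threshold learned over one elevation range may transfer poorly to another.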
4.3. Model scale versus species range
As shown by Karl et al. (2000), the scale at which the data were compiled to generate the original habitat selection models affected the transposability of the FitSite models to the TestSite. The effect of scale on the predictive performance of habitat selection models in new areas was detected and emphasized in prior studies (Luck, 2002a; Thuiller et al., 2004) and varies depending on the species considered. Either nest site quality mostly defines the niche of the species (Hooge et al., 1999) and is found to be homogeneous between separate regions, or the habitat selection process may operate based on the characteristics of wider areas (ex: mountain range,
Storch and Frynta, 1999). With the latter, models established at coarser scale return better results if local variability is too high. In our case, the selection process operating at the nest level seemed to involve environmental factors exhibiting a higher variability between locations around the Antarctic. Despite a relative homogeneity in the average nest size between locations (which may be a trait of the snow petrel’s niche), the comparison of the distribution of nest micro-characteristics between FitSite, TestIs and TestMtns confirmed detectable differences in nest entrance aspect (see above) and especially in the type of nests used. A larger than average proportion of cracks was used as nests in TestIs, making the distribution of nest types atypical in this area. The habitat unit scale was therefore considered more appropriate to extrapolate snow petrel habitat selection in East Antarctica. In the TOPOSubst model, a certain homogeneity was conferred to the habitat selection process at the large scale because, at all three locations, selected habitats were positively correlated to the same substrate types (BOULDERBIG and SCREE). However, at TestIs, SCREE and BOULDERBIG were in short supply and replaced by a high proportion of bare substrate on the islands (SCREE was in fact non-existent at TestIs). Instead, BARESUBST was positively related to nest presence because it also provided available nest cavities on South facing slopes. This is an example where alternative substrates can also provide suitable habitat, potentially as a result of adaptation. This change in selection associated with availability is referred to as functional response (Mysterud and Ims, 1998). The recent occupation of such nests in TestIs supported the hypothesis that snow petrels adapted to occupy cracks because they are the dominant type of suitable nest cavities available there. In other Antarctic areas (A. 
Peninsula), snow petrels may even be found nesting on open ledges (like Cape Petrels Daption capense) when it is the only available habitat (Monteath, 1996). Furthermore, avian species can exhibit a considerable range-wide variation not only in habitat selection patterns but also between individuals who can vary their perceptions of habitat quality (Jones, 2001). GLM effectively model the ecological (realized) niche rather than the fundamental niche due to their intrinsic empirical nature (Guisan and Zimmermann, 2000), potentially capturing locally many factors other than the species’ range of tolerance to environmental variables (such as competition: Leathwick and Austin, 2001). The use of new habitats such as nest cracks (some snow petrels are nesting in cracks very similar in size and space to those used by Wilson’s storm petrels nearby) may be the result of an adaptation to local conditions, which extends the species’ range. It is important to establish models in a situation that is less specific/most representative of the species potential range to later maximize presence predictions in new areas. The predictive performance of habitat selection models may depend on the difference between the absolute/overall range of tolerance of the species for the set of environmental factors attributed to nest selection and the species’ range observed in the specific area where the original selection model was implemented. At the FitSite (Casey), snow petrel niche was wide enough to encompass a variety of situations, which enabled accurate prediction of presence in two different environments, where the species observed range differed. A model built with data
collected in TestIs only, for example, may be too “localized” (Boyce et al., 2002), preventing the correct capture of the entire environmental range of the species (Thuiller et al., 2004). As a result, it would have a lesser predictive value when applied back to FitSite area. This illustrates the danger of over-fitting the models to region-specific species-habitat relationships (Luck, 2002b). On the other hand, by maximizing species range in the data used for model building, we run the risk of creating models with a low discriminative power. Falsely satisfactory results may be obtained with models that over predict areas of potential nest presence but do not discriminate absences (such as NestTOPOSubst). Spatially partitioning data prior to analysis was in fact suggested in order to improve the fit of the models predicting species presence over large geographical areas (Osborne and Suarez-Seoane, 2002). The separation of Holme Bay and Framnes Mountains habitats was an application of this type of partitioning as it was possible to estimate coefficients for separate localities (Boyce et al., 2002).
4.4. Conclusion: predictive performance and model applications
Finally, when estimating predictive performance, the intended use of the models also has to be considered. One potential use of habitat selection models for snow petrels is to obtain accurate population estimates for monitoring and management (Olivier and Wotherspoon, 2005). To do so, we aimed to reliably discriminate the habitats where snow petrel nests were absent in order to focus population surveys and abundance estimations on the areas where snow petrel nests may be present. Therefore, we may favor models that return the best absence predictions (higher specificity). Validation confirmed that abundance models were of limited applicability by themselves and would be best used in conjunction with the binomial presence-absence models, to guide field surveys as an indication of higher density habitats. Despite their possible lack of accuracy, the large-scale application of habitat selection models may also be useful to guide the preparation of population surveys in new areas by helping the interpretation of topographic and substrate information (if available from aerial photography), for example in the design of stratified sampling. On the other hand, the snow petrel is widely distributed around Antarctica and probably a useful species to monitor some effects of climate change on snow and ice cover. In this case, a presence-oriented model may be more appropriate to study and predict effects of climate change on habitats, nest selection and subsequently on snow petrel populations. Despite the diversity of habitats in which snow petrels are encountered, this study showed that it was possible to predict nest presence satisfactorily, based on topography and substrate. Model testing confirmed the ubiquitous nature of the predictors included in the models, suggesting that the selection mechanisms highlighted are probably consistent around the Antarctic. 
It is therefore adequate to relate habitat selection to a set of environmental factors, which, when subjected to changes of global nature (warming), may directly require the adaptation of the species.
The application of habitat selection models is of course not limited to the snow petrel. Such models offer potential to understand the ecology and predict the distribution of other poorly known (Sub-)Antarctic species such as the Wilson’s storm petrel (Oceanites oceanicus). They can also be applied to other types of Antarctic habitats important to seabirds, such as sea-ice and oceanic foraging grounds. In every instance, the approach to estimating the performance of predictive spatial distribution models will depend on their purpose.
Acknowledgements
Many thanks to Wade Fairley, who participated in long and hard hours of data collection at Mawson for 2 months during the summer 2004–2005 and to Dr George Jackson for his review of the manuscript. We thank all Mawson expeditioners from 2004/2005 summer for their assistance and support. The Australian Antarctic Division provided logistical support for the survey (Project ASAC 2704). GPS equipment was kindly supplied by the Australian Antarctic Data Center, along with the most knowledgeable help of David Smith and Roger Handsworth. Rupert Summerson produced the digital elevation models. Part of this study was supported by a grant from the Australian Geographic Society.
References
Akaike, H., 1973. Information theory as an extension of the maximum likelihood principle. In: Petrov, N.B., Csaki, F. (Eds.), Second International Symposium on Information Theory. Akademiai Kiado, Budapest.
Araújo, M.B., Pearson, R.G., Thuiller, W., Erhard, M., 2005. Validation of species-climate impact models under climate change. Global Change Biol. 11, 1504–1513.
Boyce, M.S., McDonald, L.L., 1999. Relating populations to habitats using resource selection functions. Trends Ecol. Evol. 14, 268–272.
Boyce, M.S., Venier, P.R., Nielsen, S.E., Schmiegelow, F.K.A., 2002. Evaluating resource selection functions. Ecol. Model. 157, 281–300.
Breiman, L., Friedman, J., Olshen, R., Stone, C., 1984. Classification and Regression Trees. Wadsworth.
Brotons, L., Thuiller, W., Araujo, M.B., Hirzel, A., 2004. Presence-absence versus presence-only modelling methods for predicting bird habitat suitability. Ecography 27, 437–448.
Brown, D.A., 1966. Breeding biology of the Snow Petrel Pagodroma nivea (Forster). ANARE Sci. Rep. Serie B. Zool 1, 1–63.
Bureau of Meteorology, 2005. Climate averages for Antarctic sites - Mawson. http://www.bom.gov.au/climate/averages/tables/cw_300001.shtml.
Burnham, K.P., Anderson, D.R., 2002. Model Selection and Multimodel Inference. A Practical Information-Theoretic Approach. Springer.
Croxall, J.P., Steele, W.K., McInnes, S.J., Prince, P., 1995. Breeding distribution of the snow petrel Pagodroma nivea. Mar. Ornith. 23, 69–99.
De’Ath, G., Fabricius, K.E., 2000. Classification and regression trees: a powerful yet simple technique for ecological data analysis. Ecology 81, 3178–3192.
Doligez, B., Cadet, C., Danchin, E., Boulinier, T., 2003. When to use public information for breeding habitat selection? The role of environmental predictability and density dependence. Anim. Behav. 66, 973–988.
ESRI, 2002. Arcmap 8.3 Software. Copyright 1999-2002 All rights reserved. http://www.esri.com.
Fielding, A.H., 2002. What are the appropriate characteristics of an accuracy measure. In: Scott, J.M., Heglund, P.J., Morrison, M.L., Haufler, J.B. (Eds.), Predicting Species Occurrence: Issues of Accuracy and Scale. Island Press, Washington.
Fielding, A.H., Bell, J.F., 1997. A review of methods for the assessment of prediction errors in conservation presence/absence models. Environ. Conserv. 24, 38–49.
Fielding, A.H., Haworth, P.F., 1995. Testing the generality of bird-habitat models. Conserv. Biol. 9, 1466–1481.
Frescino, T., Edwards, T.C., Moisen, G.G., 2001. Modelling spatially explicit forest attributes using generalized additive models. J. Veg. Sci. 12, 15–26.
Gross, J.E., Kneeland, M.C., Reed, D.E., Reich, R.M., 2002. GIS-Based habitat models for mountain goats. J. Mammal. 83, 218–228.
Guisan, A., Thuiller, W., 2005. Predicting species distribution: offering more than simple habitat models. Ecol. Lett. 8, 993–1009.
Guisan, A., Zimmermann, N.E., 2000. Predictive habitat distribution models in ecology. Ecol. Model. 135, 147–186.
Hooge, P.N., Stanback, M.T., Koenig, W.D., 1999. Nest site selection in the Acorn Woodpecker. Auk 116, 45–54.
Johnson, C.J., Seip, D.R., Boyce, M.S., 2004. A quantitative approach to conservation planning: using resource selection functions to map the distribution of mountain caribou at multiple spatial scales. J. Appl. Ecol. 41, 238–251.
Jones, J., 2001. Habitat selection studies in avian ecology: a critical review. Auk 118, 557–562.
Karl, J.W., Heglund, P.J., Garton, E.O., Scott, J.M., Wright, N.M., Hutto, R.L., 2000. Sensitivity of species-habitat relationship model performance to factors of scale. Ecol. Appl. 10, 1690–1705.
Keitt, T.H., Bjornstad, O.N., Dixon, P.M., Citron-Pousty, S., 2002. Accounting for spatial pattern when modelling organism-environment interactions. Ecography 25, 616–625.
Leathwick, J.R., Austin, M.P., 2001. Competitive interactions between tree species in New Zealand’s old-growth indigenous forests. Ecology 82, 2560–2573.
Lindenmayer, D.B., Cunningham, R.B., Donnelly, C.F., 1994. The conservation of arboreal marsupials in the montane ash forests of the central highlands of Victoria, south east Australia: 6. The performance of statistical models of the nest tree and habitat requirements of arboreal marsupials applied to new survey data. Biol. Conserv. 70, 143–147.
Luck, G.W., 2002a. The habitat requirements of the rufous treecreeper (Climacteris rufa). 1. Preferential habitat use demonstrated at multiple spatial scales. Biol. Conserv. 105, 383–394.
Luck, G.W., 2002b. The habitat requirements of the rufous treecreeper (Climacteris rufa). 2. Validating predictive habitat models. Biol. Conserv. 105, 395–403.
Manel, S., Ceri Williams, H., Ormerod, S.J., 2001. Evaluating presence-absence models in ecology: the need to account for prevalence. J. Appl. Ecol. 38, 921–931.
Manly, B.F.J., McDonald, L.L., Thomas, D.L., 1993. Resource Selection by Animals: Statistical Design and Analysis for Field Studies. Chapman & Hall, London.
Manly, B.F.J., McDonald, L.L., Thomas, D.L., McDonald, T.L., Erickson, W.P., 2002. Resource Selection by Animals: Statistical Design and Analysis for Field Studies, 2nd Ed. Chapman & Hall, London.
Mason, S.J., Graham, N.E., 2002. Areas beneath the relative operating characteristics (ROC) and levels (ROL) curves: statistical significance and interpretation. Q. J. R. Meteorol. Soc. 128, 2145–2166.
Mladenoff, D.J., Sickley, T.A., Wydeven, A.P., 1999. Predicting gray wolf landscape recolonization: logistic regression models vs new field data. Ecol. Appl. 9, 37–44.
Monteath, C., 1996. Antarctica Beyond the Southern Ocean. Harper Collins, Auckland.
Mysterud, A., Ims, R.A., 1998. Functional responses in habitat use: availability influences relative use in trade-off situations. Ecology 79, 1435.
Olivier, F., Wotherspoon, S.J., 2005. GIS based applications of resource selection functions to the prediction of snow petrel distribution and abundance in East Antarctica: comparing models at multiple scales. Ecol. Model. 189, 105–129.
Olivier, F., Wotherspoon, S.J., 2006. Modelling habitat selection using presence-only data: case study of a colonial hollow nesting bird, the snow petrel. Ecol. Model. 195, 187–204.
Olivier, F., Lee, A., Woehler, E.J., 2003. Distribution and abundance of snow petrels Pagodroma nivea in the Windmill Islands, East Antarctica. Polar Biol. 27, 257–265.
Osborne, P.E., Suarez-Seoane, S., 2002. Should data be partitioned spatially before building large-scale distribution models? Ecol. Model. 157, 249–259.
Pearce, J., Ferrier, S., 2000. Evaluating the predictive performance of habitat models developed using logistic regression. Ecol. Model. 133, 225–245.
Peterson, A.T., Ortega-Huerta, M.A., Bartley, J., Sanchez-Cordero, V., Soberon, J., Buddemeier, R.H., Stockwell, D.R.B., 2002. Future projections for Mexican faunas under global climate change scenarios. Nature 416, 626–629.
Rushton, S.P., Ormerod, S.J., Kerby, G., 2004. New paradigms for modelling species distributions? J. Appl. Ecol. 41, 193–200.
Ryan, P.G., Watkins, B.P., Lewis Smith, R.I., Dastych, H., Eicker, A., Foissner, W., Heatwole, H., Miller, W.R., Thompson, G., 1989. Biological survey of Robertskollen, western Dronning Maud Land: area description and preliminary species lists. S. Afr. J. Ant. Res. 19, 10–20.
Stillman, R.A., Brown, A.F., 1994. Population sizes and habitat sizes of upland breeding birds in the South Pennines, England. Biol. Conserv. 69, 307–314.
Storch, D., Frynta, D., 1999. Evolution of habitat selection: stochastic acquisition of cognitive clues? Evol. Ecol. 13, 591–600.
Suarez-Seoane, S., Osborne, P.E., Alonso, J.C., 2002. Large-scale habitat selection by agricultural steppe birds in Spain: identifying species-habitat responses using generalized additive models. J. Appl. Ecol. 39, 755–771.
Thuiller, W., 2003. BIOMOD: optimising predictions of species distribution and projecting potential future shifts under global change. Global Change Biol. 9, 1353–1362.
Thuiller, W., Brotons, L., Araujo, M.B., Lavorel, S., 2004. Effects of restricting environmental range of data to project current and future species distribution. Ecography 27, 165–172.
van Franeker, J.A., Bell, P.J., Montague, T.L., 1990. Birds of Ardery and Odbert islands, Windmill Islands, Antarctica. Emu 90, 74–80.
Verbyla, D.L., Litvaitis, J.A., 1989. Resampling methods for evaluating classification accuracy of wildlife habitat models. Environ. Manage. 13, 783–787.
Zaniewski, A.E., Lehmann, A., Overton, J.M., 2002. Predicting species spatial distributions using presence-only data: a case study of native New Zealand ferns. Ecol. Model. 157, 261–280.