J. Great Lakes Res. 29 (Supplement 1):183–189 Internat. Assoc. Great Lakes Res., 2003
Agreement Among Observers Classifying Larval Sea Lamprey (Petromyzon marinus) Habitat
Katherine M. Mullett1,* and Roger A. Bergstedt2
1U.S. Fish and Wildlife Service, Marquette Biological Station, 1924 Industrial Parkway, Marquette, Michigan 49855-1699
2U.S. Geological Survey, Great Lakes Science Center, Hammond Bay Biological Station, 11188 Ray Road, Millersburg, Michigan 49759
*Corresponding author.
ABSTRACT. Estimates of larval sea lamprey (Petromyzon marinus) abundance are used to rank Great Lakes tributaries for lampricide treatment. Observers subjectively stratify habitat into three categories: type I = preferred, type II = acceptable, type III = unacceptable. Agreement was evaluated among eight observers classifying habitats in small discrete plots in two Lake Superior tributaries, the Rock and Chocolay rivers, and among four observers classifying and measuring the amount of each habitat type along random transects in the Rock River. Agreement among the eight observers classifying habitat plots was high (Chocolay, κ = 0.742 and Rock, κ = 0.785). The amounts of types I, II, and III habitat estimated were statistically different among observers. However, the amount of variability found in the classification and measurement of habitat by observers had little effect on the ranking of 51 streams considered for lampricide treatment.
INDEX WORDS: Observer agreement, sea lamprey, larval habitat classification.
INTRODUCTION

Sea lampreys (Petromyzon marinus) are one of the most destructive of the exotic fish species that invaded the Great Lakes (Smith 1968). Although several alternatives for controlling sea lampreys have been developed over the course of the management program (Applegate et al. 1961, Hanson and Manion 1980, Hunn and Youngs 1980, Lavis et al. 2003, Twohey et al. 2003), treating their larval stage in streams with the lampricide 3-trifluoromethyl-4-nitrophenol (TFM) remains the primary control tool (Brege et al. 2003). Because application of TFM targets the larval stage, assessment of larval sea lamprey populations is essential to sea lamprey management. During 1953 to 1988, most larval assessments produced qualitative indices that monitored when
populations were re-established following TFM treatments, along with their relative abundance, growth rates, and spatial distributions. The selection of a stream for lampricide treatment was based on insights about its production of larvae and past treatment history, coupled with indications of the relative abundance of large larvae relative to other streams (Koonce et al. 1993, Slade et al. 2003). During 1982, the Great Lakes Fishery Commission formally adopted a policy of integrated management of sea lampreys, changing the target level of control from maximum suppression to management at the economic injury level (Koonce et al. 1993). The policy recognized that analytical models were needed to set control goals and allocate resources among streams across the entire Great Lakes basin. The use of models required a transition from qualitative to quantitative larval assessments to produce the estimates of larval sea lamprey abundance used to rank streams. Because larval sea lampreys are not randomly
distributed within a stream, stratified sampling showed promise for increasing the precision of population estimates (Elliott 1993). During 1988–1995, a quantitative assessment sampling (QAS) protocol was developed that stratified the stream area for sampling based on qualitative descriptions of larval sea lamprey habitat (Slade et al. 2003). The QAS protocol provided a template for stratification that has two components. The first component is the classification and measurement of habitat. The second component is the electrofishing sample tied to each habitat type that measures larval density. Under the QAS protocol, population estimates are calculated from the estimated stream area in each of three habitat types and the estimated density of larvae from electrofishing samples taken in each habitat type (Slade et al. 2003). To divide the stream area into sampling strata, field personnel subjectively divide the stream bottom along randomly placed transects into three habitat types based on lamprey preference (I = preferred, II = acceptable, III = unacceptable; Dustin et al. 1989). Although the habitat types were defined in only descriptive terms, data collected over several years supported the utility of the definitions in terms of relative larval abundance among types (Slade et al. 2003). However, assigning habitat type is subjective, and it is important to ensure consistency among observers. During 1996, a study was conducted to examine the level of agreement among field personnel who subjectively assigned habitat type. The objectives of this study were to measure agreement among observers in the classification of a fixed set of marked habitat plots and in the classification and measurement of habitat along a set of randomly chosen transects.

METHODS

This study was designed to address the habitat classification component in two ways.
The first, and of fundamental importance, was to measure the agreement among observers classifying habitat in small discrete plots (habitat classification). The second, and of equal practical importance, was to measure agreement among observers who classified segments of habitat along transects. To do this, observers identified transition points between different habitat types and measured the amounts of each type (habitat measurement).
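Agreement for the first component (classification of discrete plots by several observers) is quantified in this study with the multi-rater kappa of Fleiss (1981). The block below is a minimal sketch of that calculation and not the authors' code. It assumes the data are arranged as a plots × categories table of counts; the function name `fleiss_kappa`, the dictionary keys, and the example counts are all hypothetical.

```python
from math import nan, sqrt


def fleiss_kappa(counts):
    """Multi-rater kappa following Fleiss (1981).

    counts[i][j] = number of the m observers who assigned plot i to
    category j (assumed layout; every plot must have the same m).
    """
    n = len(counts)        # number of subjects (plots)
    k = len(counts[0])     # number of categories
    m = sum(counts[0])     # number of ratings (observers) per plot
    if m < 2 or any(len(row) != k or sum(row) != m for row in counts):
        raise ValueError("every plot needs the same number (>= 2) of ratings")

    # Overall proportion of ratings in each category, and p * q per category.
    p = [sum(row[j] for row in counts) / (n * m) for j in range(k)]
    pq = [pj * (1.0 - pj) for pj in p]
    sum_pq = sum(pq)
    if sum_pq == 0:
        raise ValueError("all ratings fall in one category; kappa is undefined")

    denom = n * m * (m - 1)

    # Agreement within each category (nan if a category was never used).
    kappa_j = []
    for j in range(k):
        disagreement = sum(row[j] * (m - row[j]) for row in counts)
        kappa_j.append(1.0 - disagreement / (denom * pq[j]) if pq[j] > 0 else nan)

    # Composite agreement across all categories.
    sum_sq = sum(x * x for row in counts for x in row)
    kappa = 1.0 - (n * m * m - sum_sq) / (denom * sum_pq)

    # Standard errors under the hypothesis of chance agreement only.
    se_kappa_j = sqrt(2.0 / denom)
    skew = sum(pqj * ((1.0 - pj) - pj) for pj, pqj in zip(p, pq))
    se_kappa = sqrt(2.0) / (sum_pq * sqrt(denom)) * sqrt(sum_pq ** 2 - skew)

    return {"kappa": kappa, "kappa_j": kappa_j,
            "se_kappa": se_kappa, "se_kappa_j": se_kappa_j}


if __name__ == "__main__":
    # Hypothetical counts: 4 plots, 8 observers, habitat types I, II, III.
    example = [[8, 0, 0], [6, 2, 0], [0, 3, 5], [0, 0, 8]]
    result = fleiss_kappa(example)
    low = result["kappa"] - 2 * result["se_kappa"]
    high = result["kappa"] + 2 * result["se_kappa"]
    print(f"kappa = {result['kappa']:.3f} (+/- 2 s.e.: {low:.3f} to {high:.3f})")
```

Working from per-plot counts rather than per-observer labels reflects the fact that this form of kappa does not need to know which observer gave which rating, only how many ratings fell into each category for each plot.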
Habitat Classification

The kappa statistic has been used to measure the degree of agreement between raters making subjective decisions and takes into consideration the extent of agreement due to chance alone. It was initially used in cases where two raters each evaluated the same subject (Cohen 1960). Fleiss (1981) developed a generalized kappa for cases where there were multiple raters, the raters of one subject were not necessarily the same as those of another subject, and the number of categories to which subjects were assigned was greater than two.

Habitat was qualitatively divided into three categories (types I, II, or III) based on substrate components and whether it is loosely compacted enough to permit burrowing of larvae. Type I habitat (preferred by larvae) is loosely compacted to permit burrowing. It consists of sand, fine organic matter, and detritus or aquatic vegetation and is usually found in depositional areas of the stream. Type II habitat is acceptable for burrowing but is not preferred by larvae if type I is available. It consists of sand and may contain some gravel and rubble but has little fine organic matter. Type III habitat is unacceptable for burrowing, consisting of bedrock, hardpan clay, rubble, or coarse gravel (Slade et al. 2003).

Agreement was evaluated among individual observers assigning habitat type to discrete plots of stream bottom in two Lake Superior tributaries, the Chocolay (Marquette County, MI) and Rock (Alger County, MI) rivers. The Chocolay River study area was a 740 m section of stream and the Rock River study area was a 1,194 m section of stream. The two streams were different in character. The Rock River was sinuous with a mix of habitat types. Water depth varied from a few cm to about 1.20 m, and woody debris was frequently found throughout the stream. The study area in the Chocolay River was a relatively straight stretch of stream that consisted mostly of rubble, boulder, and gravel.
Water depth was less than 0.60 m and woody debris was sparse.

For each study area, a group of eight observers was selected at random from a pool of 20 field personnel and two sets of random numbers were generated to identify 90 sample locations. The first set of random numbers (decimals constrained between 0 and 1) was multiplied by the length of the study area to provide the distance of each plot from the upstream limit. The second set of random numbers (also decimals constrained between 0 and 1) was multiplied by stream width at each site to determine the distance from the bank. The day before observation, locations were marked with a numbered brick. At the time of inspection, a 30 × 50 cm frame was placed around the numbered brick to define the plot of substrate to be classified. Observers walked the study area (as a group) and secretly recorded their individual observations of habitat type for each plot. The observers had to visit the sample sites as a group because they often needed to feel the substrate with their hands to assign a classification. If each observer visited the site independently, the manual inspection would cause a change in habitat so that a subsequent observer might not view an identical sample. Each observer visually inspected the plot, probed it with a metal stake, and sometimes felt the substrate with his or her hands. When any observer desired to feel the substrate, a sample was retrieved and all observers had the opportunity to view and touch the retrieved sample.

Agreement was assessed with the kappa statistic (Fleiss 1981). The measure of agreement within an individual category was

$$\hat{\kappa}_j = 1 - \frac{\sum_{i=1}^{n} x_{ij}(m - x_{ij})}{n m (m - 1)\, \bar{p}_j \bar{q}_j} \qquad (1)$$

the composite measure of agreement across all categories was

$$\hat{\kappa} = 1 - \frac{n m^{2} - \sum_{i=1}^{n} \sum_{j=1}^{k} x_{ij}^{2}}{n m (m - 1) \sum_{j=1}^{k} \bar{p}_j \bar{q}_j} \qquad (2)$$

the standard error of $\hat{\kappa}_j$ was

$$\mathrm{s.e.}_0(\hat{\kappa}_j) = \sqrt{\frac{2}{n m (m - 1)}} \qquad (3)$$

and the standard error of $\hat{\kappa}$ was

$$\mathrm{s.e.}_0(\hat{\kappa}) = \frac{\sqrt{2}}{\left(\sum_{j=1}^{k} \bar{p}_j \bar{q}_j\right) \sqrt{n m (m - 1)}} \sqrt{\left(\sum_{j=1}^{k} \bar{p}_j \bar{q}_j\right)^{2} - \sum_{j=1}^{k} \bar{p}_j \bar{q}_j (\bar{q}_j - \bar{p}_j)} \qquad (4)$$

where:
$\hat{\kappa}_j$ = value of kappa for category j
$x_{ij}$ = number of ratings on subject i into category j
m = number of ratings (observers)
n = number of subjects (plots)
k = number of categories
$\bar{p}_j$ = overall proportion of ratings in category j, and
$\bar{q}_j$ = 1 − $\bar{p}_j$

Kappa equaled 1 if there was complete agreement; was between 0 and 1 if agreement was greater than or equal to chance agreement (p < 0.01); and was less than 0 if agreement was less than chance agreement. Landis and Koch (1977) and Fleiss (1981) provided arbitrary benchmarks for interpreting both kappa statistics: < 0.40 (poor), 0.40–0.75 (fair to good), > 0.75 (excellent).

Habitat Measurement

Agreement in habitat measurement was assessed by having four observers classify segments of habitat along transects, identify the transitions from one habitat type to another, and measure the length of each habitat segment to estimate the total amount of types I, II, and III in a stream. The study area was a 9,721 m section in the Rock River. Eighty random numbers (decimals constrained between 0 and 1) were multiplied by the length of the study area to provide the distance from the upstream limit where transects were located. Transects were oriented perpendicular to the shoreline and were marked at least 1 day in advance with stakes on each shore. Observers worked individually and completed the 80 transects in six days. All observers assessed the same transects on the same day, and the order of observers was varied daily. On reaching a transect, each observer attached a tape measure to the stake on the left bank (facing upstream) and stretched it across the stream. The stretched tape was the transect. The observer then walked along the transect and visually observed the substrate lying directly beneath the transect line. The observer probed it with a metal stake or felt it with their hands or feet, and separated the transect into linear segments of types I, II, or III habitat. A change in habitat greater than 0.1 m in length along the transect began a new segment. The length of each segment was measured to the nearest 0.1 m.
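The field procedure yields, for each observer and transect, an ordered list of typed segments measured to the nearest 0.1 m. The sketch below shows one way the per-transect bookkeeping could be done. It is an illustration under assumptions, not the authors' data handling: the function name `summarize_transect`, the (habitat type, length) tuple layout, and the example segments are hypothetical.

```python
def summarize_transect(segments):
    """Total length of each habitat type and overall width for one transect.

    segments: ordered (habitat_type, length_m) pairs along the tape, with
    habitat_type one of "I", "II", "III" (assumed layout). Lengths are
    accumulated in whole tenths of a metre, matching the 0.1 m measurement
    resolution and avoiding floating-point drift in the sums.
    """
    tenths = {"I": 0, "II": 0, "III": 0}
    for habitat, length_m in segments:
        if habitat not in tenths:
            raise ValueError(f"unknown habitat type: {habitat!r}")
        if length_m <= 0:
            raise ValueError("segment lengths must be positive")
        tenths[habitat] += round(length_m * 10)

    totals = {habitat: value / 10 for habitat, value in tenths.items()}
    width = sum(tenths.values()) / 10  # all segments together span the stream
    return totals, width


if __name__ == "__main__":
    # Hypothetical transect: left bank to right bank.
    example = [("I", 1.2), ("III", 3.0), ("II", 0.5), ("I", 0.8)]
    totals, width = summarize_transect(example)
    print(totals, width)
```

Because the segments partition the tape from bank to bank, the stream width falls out of the same sums and needs no separate measurement.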
Segments with the same habitat type were summed to calculate the total length of types I, II, and III habitat along each transect. The sum of the segment lengths along each transect was also calculated to provide stream width. Although lengths were measured along transects, the measures related to subdividing stream width and are hereafter referred to as habitat width. For each observer, the mean width of each habitat type was calculated from all transects, and a randomized block analysis of variance (Sokal and Rohlf 1981) was used to test for differences among observers in the mean width of each habitat type:

$$y_{ij} = \mu + \beta_i + \tau_j + e_{ij} \qquad (5)$$

where:
$y_{ij}$ = observed value (habitat width)
$\mu$ = mean response
$\beta_i$ = block effect (transect)
$\tau_j$ = treatment effect (observer), and
$e_{ij}$ = residual effect
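The randomized block model above can be fitted by partitioning sums of squares. The following is a minimal sketch, assuming one width per transect and observer with no missing cells. The function name `randomized_block_anova` and the example data are hypothetical, and the R2 computed here (share of total variation explained by blocks plus observers) is an assumption about how that quantity is defined in Table 1. The p value is left out because it requires the F distribution, which the standard library does not provide.

```python
def randomized_block_anova(y):
    """Randomized block ANOVA for y[i][j] = width on transect i by observer j."""
    b = len(y)       # number of blocks (transects)
    t = len(y[0])    # number of treatments (observers)
    if b < 2 or t < 2 or any(len(row) != t for row in y):
        raise ValueError("need a complete table with >= 2 blocks and treatments")

    grand = sum(sum(row) for row in y) / (b * t)
    block_means = [sum(row) / t for row in y]
    treat_means = [sum(y[i][j] for i in range(b)) / b for j in range(t)]

    ss_total = sum((y[i][j] - grand) ** 2 for i in range(b) for j in range(t))
    ss_treat = b * sum((tm - grand) ** 2 for tm in treat_means)
    # Residual: what remains after removing block and observer effects.
    ss_error = sum(
        (y[i][j] - block_means[i] - treat_means[j] + grand) ** 2
        for i in range(b) for j in range(t)
    )

    df_treat = t - 1
    df_error = (b - 1) * (t - 1)
    mse = ss_error / df_error
    return {
        "F": (ss_treat / df_treat) / mse,   # F-ratio for the observer effect
        "MSE": mse,
        "R2": 1.0 - ss_error / ss_total,    # assumed definition, see lead-in
        "df": (df_treat, df_error),
    }


if __name__ == "__main__":
    # Hypothetical widths (m): 3 transects (rows) by 3 observers (columns).
    widths = [[1.0, 2.0, 3.0], [2.0, 3.0, 5.0], [3.0, 5.0, 6.0]]
    print(randomized_block_anova(widths))
```

Blocking on transect removes the large transect-to-transect differences in width from the error term, which is why small differences among observers can still produce a large F-ratio.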
Stream Treatment Ranking

All streams surveyed in a given year are ranked for treatment based on the cost per metamorphosing sea lamprey killed. In the stream treatment selection model, the total larval abundance for each stream is estimated by multiplying the stream's larval density (larvae/m²) by the square meters of an adjusted value for type I habitat (in the stream selection model during 1997, this was defined as the sum of type I habitat and 10% of the type II habitat; Schleen et al. 1996, Slade et al. 2003). To estimate the number of larvae that actually metamorphosed and left the stream, the total larval abundance was multiplied by the proportion of sea lamprey larvae predicted to metamorphose (Slade et al. 2003). The cost to treat the stream was then divided by that result to calculate the cost per metamorphosing sea lamprey removed. A list of the streams in ascending order of cost per metamorphosing sea lamprey removed served as the ranked list of streams scheduled for treatment the following year. Larvae in the stream that are not metamorphosing are not considered in the stream ranking process. The amount of time and dollars available dictated the number of streams that would be treated.

To assess the effect of the variability in habitat measurement demonstrated by the four observers, a sensitivity analysis was used that changed the habitat estimates in the stream treatment selection model based on observed variability. Monte Carlo simulations replaced the original adjusted habitat values of individual streams with random values from a normal distribution that reflected the variability demonstrated by the four observers. Abundance of all larvae, abundance of metamorphosing
larvae, and costs per metamorphosing larvae removed were re-calculated based on this variability. The new costs per metamorphosing larvae removed were used to re-rank the streams from the original list and the top 29 were scheduled for treatment (during 1997, the amount of time and dollars available dictated that 29 streams would be treated).

RESULTS

Habitat Classification

Based on the benchmarks given by Landis and Koch (1977), agreement calculated across all habitat types was good (κ = 0.742) in the Chocolay River and excellent (κ = 0.785) in the Rock River (Fig. 1). Of the 90 plots in the Chocolay River, 56 had complete agreement among the eight observers, 15 had one observer disagreeing, and 19 had greater disagreement. Of the 90 plots in the Rock River, 57 had complete agreement, 18 had one observer disagreeing, and 15 had greater disagreement. Certain combinations of sediments seemed to be present when more than one observer disagreed. The first combination consisted of very fine sand and silt, but contained no detritus or vegetation. This occurred in six plots in the Rock River and nine plots in the Chocolay River, and more of the individual classifications of these plots were type I than type II (70 vs. 50). The second
FIG. 1. Kappa statistic (± 2 standard errors) from habitat classifications by eight observers examining 90 sample plots in the Rock and Chocolay rivers. The horizontal line at κ = 0.75 denotes the boundary between excellent and fair to good agreement.
combination consisted of very coarse sand, but had a substantial amount of detritus, vegetation, or overlying fine organic matter. This occurred in seven plots in the Rock River, with almost equal numbers of individual classifications as type I or type II (27 vs. 29). The third combination consisted of very coarse, shallow (5 to 8 cm deep) sand amid rubble and/or boulders. This occurred in 11 plots in the Chocolay River. In 10 of those plots, observers were almost evenly divided between types II and III (38 for type II vs. 42 for type III). In the eleventh plot, nearly equal numbers of observers classified it as each of the three habitat types (three, two, and three observers classifying it as type I, II, or III, respectively). The substrate in this plot largely fit the definition of type I habitat. However, the substrate surface had a compacted peat-like covering that would have been difficult for a larval sea lamprey to penetrate. The substrate in one plot in the Chocolay River consisted of bedrock covered by 4 cm of fine organic material. Of the eight observers, seven classified it type III and one classified it type I. Although training can reduce disagreement among observers, these types of situations will continue to introduce some error into our stratification.

Habitat Measurement

Mean width of the stream and the mean width for each habitat type were calculated from the transect data collected by each observer. Although there was evidence of non-normality in the distribution of residuals, violation of this assumption has little effect on the significance level of the F-test (Miller 1986). The amount of each type of larval habitat measured differed significantly among observers (Table 1 and Fig. 2). The greatest difference in measurements was 1.7 m in type II habitat. The maximum difference in type I habitat was 1.3 m and in type III habitat was 0.5 m.
Stream width was also significantly different among the observers even though the greatest difference was only 0.1 m.

TABLE 1. Results of four analyses of variance relating habitat width (m) to observer with transect as a blocking variable. One analysis was conducted for each habitat type in the Rock River: I, II, III, and the adjusted type I habitat (sum of type I and 10% of the type II). The coefficient of determination (R2), mean square error (MSE), and the F-ratio and p value for the effect of observer are reported for each analysis.

Habitat            R2      MSE      F-ratio   P
Type I             0.81    29.12    21.95     < 0.01
Type II            0.80    41.81    28.09     < 0.01
Type III           0.96     6.65    15.76     < 0.01
Adjusted type I    0.83    22.86    20.65     < 0.01

Stream Treatment Ranking

The practical significance of the variability in habitat classification was evaluated by the (potential) change in the ranking of streams for lampricide treatment, based on a change in the amount of adjusted type I habitat, and consequently, a change in the abundance estimate. The streams in the treatment selection model were ranked based on available habitat, estimated density of larval sea lampreys, number of sea lampreys predicted to metamorphose, and the cost to remove metamorphosing sea lampreys (Christie et al. 2003, Slade et al. 2003). Because the amount of habitat in the model is combined with the larval sea lamprey density to provide total larval abundance, changes in the amount of habitat affect the larval estimates and likewise the number of sea lampreys estimated to metamorphose. A mean adjusted type I habitat width (sum of type I habitat and 10% of the type II habitat) was determined for each observer and converted to a proportion. The mean proportion of adjusted type I habitat for the four observers was 0.40 with a standard deviation of 0.06 (Table 2). Monte Carlo simulations were run with the adjusted type I habitat values for 51 individual streams randomly drawn from a normal distribution based on the variability demonstrated by the four observers (Table 2). In all 1,000 simulations, the first 24 streams from the original ranking retained their treatment status (remained above the 29-stream cut-off line). The few streams at the end of the list that were affected during the simulations would still have had > 50% probability of remaining on the treatment list. Any streams that were originally below the cut-off line had < 40% probability of moving above the cut-off line on the treatment list. Streams that moved in or out of the block of streams scheduled for treatment were originally ranked in positions 25 through 35.

DISCUSSION

As in the studies of Roper and Scarnecchia (1995) and Hannaford et al. (1997), training was an important part
FIG. 2. Mean widths (± S.D.) for habitat types I–III and the mean adjusted type I habitat width calculated from measurements by four observers examining 80 transects in the Rock River.
TABLE 2. Proportion of type I, type II, and adjusted type I (sum of type I and 10% of the type II) habitat for each observer, and the mean, standard deviation, and CV for four observers who classified and measured habitat at 80 transects in the Rock River.

                      Proportion    Proportion    Proportion adjusted
                      type I        type II       type I habitat
Observer 1            0.26          0.61          0.32
Observer 2            0.41          0.40          0.45
Observer 3            0.32          0.49          0.37
Observer 4            0.40          0.47          0.45
Mean                  0.35          0.49          0.40
Standard Deviation    0.07          0.09          0.06
CV                    20%           18%           15%
of observer agreement. Hannaford et al. (1997) showed that observers who received hands-on training in stream habitat assessment demonstrated less variability than observers who were given only a form with definitions of technical terms. To make the stratification process used to estimate larval sea lamprey abundance useful, the sea lamprey management program relies on consistent habitat classification. Sea lamprey management field personnel attend annual training sessions that consist of stream visits by groups of about ten field personnel, who individually classify plots using methods similar to those described in this paper (habitat classification) and discuss their observations until agreement is reached.

Roper and Scarnecchia (1995) studied the ability of trained observers to classify habitat into pools, riffles, and glides, and then further classify pools and riffles into one of five subgroups. They found that certain physical characteristics of streams could influence the consistency with which habitat types are classified. In this study, the two rivers were different in character. In the Rock River, type I habitat was distributed throughout the width of the stream and most of the plots were classified as either type I or type II habitat. Type III habitat observed was usually bedrock, boulder, or tightly packed rubble and gravel. Unlike the Rock River, the Chocolay River plots were mostly type III habitat. Any type I habitat encountered was along the margin of the stream, usually within 0.3 to 0.6 m of the bank. Type III habitat was usually boulders or boulders and large rubble with shallow pockets of coarse sand and gravel intermixed. Although we have only the two streams to compare, we did not find that observers agreed more in one stream than in the other.

After 1997, a change was made to how much type II habitat contributed to larval sea lamprey abundance.
While initial data supported including 10% of type II habitat in the adjusted type I habitat, analysis conducted with additional years of data indicated that 27.5% of the type II habitat should be used (Slade et al. 2003). The 10% adjusted type I habitat values in the stream selection process were replaced with 27.5% adjusted type I values and the streams were re-ranked. Streams moved from one to four positions from their rank in the original list, but most of the streams that were originally ranked for treatment retained their status. The only exceptions were the two streams originally ranked as 28 and 30. These two streams traded positions, with one moving below and the other above the cut-off line.
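The ranking calculation and its sensitivity to habitat variability can be sketched as follows. This is an illustration under stated assumptions, not the stream treatment selection model itself: all function names, dictionary keys, and example streams are hypothetical, and each simulated habitat value is assumed to be drawn from a normal distribution centred on the original value with a standard deviation of `cv` times that value (for instance the 15% CV of adjusted type I habitat in Table 2).

```python
import random


def adjusted_type1(type1, type2, type2_fraction=0.10):
    """Adjusted type I habitat: type I plus a fraction of type II."""
    return type1 + type2_fraction * type2


def cost_per_metamorph(density, area, prop_metamorph, cost):
    """Treatment cost divided by the predicted number of metamorphosing larvae."""
    metamorphs = density * area * prop_metamorph
    if metamorphs <= 0:
        raise ValueError("predicted number of metamorphosing larvae must be positive")
    return cost / metamorphs


def treatment_probability(streams, n_treated, cv, n_sims=1000, seed=1):
    """Fraction of simulations in which each stream makes the treatment list.

    streams: dicts with keys name, density (larvae/m2), area (adjusted
    type I habitat, m2), prop_metamorph, cost (assumed layout).
    """
    rng = random.Random(seed)
    hits = {s["name"]: 0 for s in streams}
    for _ in range(n_sims):
        costs = []
        for s in streams:
            # Assumed perturbation: normal draw around the original area,
            # floored at a tiny positive value so the cost stays defined.
            area = max(rng.gauss(s["area"], cv * s["area"]), 1e-9)
            unit_cost = cost_per_metamorph(
                s["density"], area, s["prop_metamorph"], s["cost"])
            costs.append((unit_cost, s["name"]))
        # Cheapest cost per metamorphosing larva ranks first.
        for _, name in sorted(costs)[:n_treated]:
            hits[name] += 1
    return {name: count / n_sims for name, count in hits.items()}


if __name__ == "__main__":
    # Hypothetical streams; none of these values come from the study.
    streams = [
        {"name": "A", "density": 2.0, "area": 1000.0, "prop_metamorph": 0.05, "cost": 5000.0},
        {"name": "B", "density": 1.0, "area": 500.0, "prop_metamorph": 0.05, "cost": 5000.0},
        {"name": "C", "density": 0.9, "area": 520.0, "prop_metamorph": 0.05, "cost": 5000.0},
    ]
    print(treatment_probability(streams, n_treated=2, cv=0.15))
```

Changing `type2_fraction` from 0.10 to 0.275 in `adjusted_type1` before ranking corresponds to the re-ranking described above, and streams whose unit costs are close to the cut-off are the ones expected to change status most often in the simulations.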
Analysis of how the variability among observers affected the stream treatment ranking was year specific (1997); each year a new set of streams and their associated larval estimates and costs would need to be considered. The analysis suggested that the variability in agreement among observers, while statistically significant, did not seriously affect the ranking of streams for treatment. Streams at the top of the treatment list generally have the largest production of metamorphosing sea lampreys. During 1997, 95% of the metamorphosing sea lampreys targeted for treatment lived in the first 15 streams on the list. The treatment status of these streams was not affected by the variability demonstrated by the observers. The streams that were affected were at the end of the list where fewer transformers were predicted. Any movement of streams across the cut-off line cumulatively represented less than 1.0% of the transformers targeted for treatment during 1997.

ACKNOWLEDGMENTS

We thank the Great Lakes Fishery Commission for supporting this endeavor and John W. Heinrich, Gerald T. Klar, and the staff of the Marquette Biological Station for their participation. We appreciate the assistance with statistical analysis provided by Jean V. Adams. We especially thank the late Dr. Philip A. Doepke, Drs. William L. Robinson and Frank A. Verley from Northern Michigan University, and Robert J. Young from the Canadian Department of Fisheries and Oceans for their guidance and comments. This article is Contribution 1196 of the U.S. Geological Survey Great Lakes Science Center.

REFERENCES

Applegate, V.C., Howell, J.H., Moffett, J.W., Johnson, B.G.H., and Smith, M.A. 1961. Use of 3-trifluoromethyl-4-nitrophenol as a selective sea lamprey larvicide. Great Lakes Fish. Comm. Tech. Rep. No. 1.
Brege, D.C., Davis, D.M., Genovese, J.H., McAuley, T.C., Stephens, B.E., and Westman, W.R. 2003. Factors responsible for the reduction in quantity of the lampricide, TFM, applied annually in streams tributary to the Great Lakes from 1979 to 1999. J. Great Lakes Res. 29 (Suppl. 1):500–509.
Cohen, J. 1960. A coefficient of agreement for nominal scales. Educ. Psychol. Meas. 20:37–46.
Dustin, S.M., Schleen, L.P., Popowski, J., and Klar, G.T. 1989. Sea lamprey management in the Great Lakes 1989. Great Lakes Fish. Comm. Ann. Rep. 1989.
Elliott, J.M. 1993. Some methods for the statistical analysis of samples of benthic invertebrates. Freshwater Biol. Assoc. Sci. Publ. No. 25.
Fleiss, J.L. 1981. Statistical methods for rates and proportions, second edition. New York: John Wiley and Sons.
Hannaford, M.J., Barbour, M.T., and Resh, V.H. 1997. Training reduces observer variability in visual-based assessments of stream habitat. J. N. Am. Benthol. Soc. 16(4):853–860.
Hanson, L.H., and Manion, P.J. 1980. Sterility method of pest control and its potential role in an integrated sea lamprey (Petromyzon marinus) control program. Can. J. Fish. Aquat. Sci. 37:2108–2117.
Hunn, J.B., and Youngs, W.D. 1980. Role of physical barriers in the control of sea lamprey (Petromyzon marinus). Can. J. Fish. Aquat. Sci. 37:2118–2122.
Koonce, J.F., Eshenroder, R.L., and Christie, G.C. 1993. An economic injury level approach to establishing the intensity of sea lamprey control in the Great Lakes. N. Am. J. Fish. Manage. 13:1–14.
Landis, J.R., and Koch, G.G. 1977. The measurement of observer agreement for categorical data. Biometrics 33:159–174.
Lavis, D.S., Hallett, A., Koon, E.M., and McAuley, T.C. 2003. History of and advances in barriers as an alternative method to suppress sea lampreys in the Great Lakes. J. Great Lakes Res. 29 (Suppl. 1):362–372.
Miller, R.G., Jr. 1986. Beyond ANOVA, Basics of Applied Statistics. New York: John Wiley & Sons.
Roper, B.B., and Scarnecchia, D. 1995. Observer variability in classifying habitat types in stream surveys. N. Am. J. Fish. Manage. 15:49–53.
Schleen, L.P., Young, R.J., and Klar, G.T. 1996. Integrated management of sea lampreys in the Great Lakes 1995. Great Lakes Fish. Comm. Ann. Rep. 1995.
Slade, J.W., Adams, J.V., Christie, G.C., Cuddy, D.W., Fodale, M.F., Heinrich, J.W., Quinlan, H.R., Weise, J.G., Weisser, J.W., and Young, R.J. 2003. Techniques and methods for estimating abundance of larval and metamorphosed sea lampreys in Great Lakes tributaries, 1995 to 2002. J. Great Lakes Res. 29 (Suppl. 1):137–151.
Smith, S.H. 1968. Species succession and fishery exploitation in the Great Lakes. J. Fish. Res. Board Can. 25:667–693.
Sokal, R.R., and Rohlf, F.J. 1981. Biometry, second edition. New York: W.H. Freeman and Company.
Twohey, M.B., Heinrich, J.W., Seelye, J.G., Fredricks, K.T., Bergstedt, R.A., Kaye, C.A., Scholfield, R.J., McDonald, R.B., and Christie, G.C. 2003. The sterile-male-release technique in Great Lakes sea lamprey management. J. Great Lakes Res. 29 (Suppl. 1):410–423.

Submitted: 21 December 2000
Accepted: 24 June 2002
Editorial handling: Robert J. Young