Smooth age length keys: Observations and implications for data collection on North Sea haddock


Fisheries Research 105 (2010) 2–12

Contents lists available at ScienceDirect

Fisheries Research journal homepage: www.elsevier.com/locate/fishres

Traiani Stari a, Katharine F. Preedy a,∗, Eddie McKenzie a, William S.C. Gurney a, Michael R. Heath a, Philip A. Kunzlik b, Douglas C. Speirs a

a Department of Mathematics and Statistics, University of Strathclyde, Richmond St, Glasgow G1 1XH, United Kingdom
b Marine Scotland, Marine Laboratory, Aberdeen, United Kingdom

Article info

Article history: Received 27 March 2009; received in revised form 17 February 2010; accepted 22 February 2010.

Keywords: Age length key comparison, Haddock, Sampling protocols

Abstract

Age at length keys (ALKs), which give the probability of age given length, are a fundamental component of many age-based fish stock assessment methods. Usually, ALKs are compiled from readings of otoliths or scales taken from length-stratified sub-samples of fishery landings or research vessel trawl catches. The assessment process is data intensive when there are numerous fleets and assessment sub-regions to be sampled over an annual cycle, making the collection and analysis of material costly and time consuming. Hence, the data are almost always sparse, often with voids that require in-filling before use in assessments. Though robust statistical procedures for automatically in-filling data voids have been developed, they have not been widely adopted, and procedures often remain manual. Here we use Generalized Linear Models to derive the probability of age given length from sparse data gathered during the International Bottom Trawl Survey (IBTS) and the Scottish commercial sampling program. The keys are used to test statistically for differences between ALKs from different sampling regions, differing gear geometries, different sampling programs and fish at different life stages. The results of the comparisons suggest that ALKs from the commercial sampling program are not, in general, comparable to those generated by the IBTS program. We also found that ALKs from Nephrops trawls are significantly different from those generated by other trawls over the same sampling region. The tests also suggest that age at length distributions differ not only between but also within IBTS roundfish areas, in part due to differences between ALKs for mature and immature fish. These differences are an important factor when considering a reduction in the resolution of sampling areas and when combining data from countries with differing fleet compositions. They also raise important questions about the present protocol for collection of otolith data from IBTS survey trawls.
© 2010 Elsevier B.V. All rights reserved.

∗ Corresponding author. Tel.: +44 0141 5483599; fax: +44 0141 5222079. E-mail address: [email protected] (K.F. Preedy).
0165-7836/$ – see front matter © 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.fishres.2010.02.004

1. Introduction

Much of the analysis of fish stocks considers age structured populations (Hayes, 1993; Morton, 2008). In particular, the stock assessments used to provide management advice, from which Total Allowable Catches may subsequently be derived, commonly rely on age structured sequential population analysis (SPA). However, it is expensive and time consuming to age fish, so the raw measurements from both survey and commercial catch samples are structured by length. A method for transforming length structured data to age structured data is, therefore, an important part of the toolbox in stock management, and the most common tool is an age at length key (ALK). In addition, any estimate of the growth rate of fish requires a knowledge of the mean length at age of the species in question and, in many cases, a vital first step in the calculation is the estimation of distributions of age at length. In addition to their utility in data transformation, comparison between ALKs can give insight into spatial and temporal changes in stock structure.

The cost of collecting age data on fish provides a strong incentive to minimise the level of collection whilst retaining sufficient spread of effort to obtain a representative sample of the population. This has specific resonance currently in EU waters of the northeast Atlantic because the current implementation of the revised EU data collection regulation (Anon., 2008a) encourages the adoption of internationally integrated ‘population’ ALKs (Anon., 2008b) without any significant reference to spatial or fishing gear-based factors that may result in heterogeneous keys.

Most commonly, ALKs are calculated through a two-stage sampling procedure whereby a sample of fish that is used to obtain a length-frequency distribution is then sub-sampled for age to derive a distribution of ages at each length (Pope, 1988). The number of fish caught at that length can then be divided between the appropriate ages according to the distribution calculated. By summing

T. Stari et al. / Fisheries Research 105 (2010) 2–12

across all lengths the number of fish at each age is obtained. However, because aging is not done in situ, the size of the age sample may be reduced. For instance, in the case of the International Council for the Exploration of the Sea (ICES) International Bottom Trawl Survey (IBTS), age data are collected by interpreting the annual ring structure of otoliths and, if an otolith is unreadable, there may be no opportunity to sample from another fish. As a result, particularly at large sizes where few fish are caught, there may be insufficient data to calculate a distribution, or no age data for a given length, even where fish of that length have been caught. Without some degree of smoothing, this means that the transformation from length to age structure can remove fish from the population. This problem can be exacerbated by input errors (Daan, 2001) where records have to be removed as part of a quality control process. As a result, ALKs are necessarily noisy and full of gaps.

This has been addressed by Martin and Cook (1990) using a maximum likelihood estimate (MLE) and later by Morton (2008) combining information on mean length at age and seasonality. However, the former method makes a priori assumptions about the distribution of length at age, which may vary between years, and the latter requires a knowledge of the length distribution, which is often inferred from the length distribution in caught samples using a comparison to VPA estimates which are age structured. Kvist et al. (2000) used continuation ratio logits (CRLs) followed by Generalized Linear Models (GLMs) to investigate what might be the major sources of variation in the age distribution of sand eel samples. Rindorf and Lewy (2001) extended this work by using the fact that the mean length of fish increases with age to model both the CRLs and GLMs simultaneously, and Ibaibarriaga et al. (2007) used multinomial models to investigate the development of stage- rather than age-classified processes in anchovy.
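The two-stage procedure described above, and the way voids in a raw key can remove fish from the population, can be illustrated with a minimal Python sketch. The function names (`raw_alk`, `numbers_at_age`) and the toy numbers are hypothetical and chosen only for illustration; they are not taken from the paper's data or from its Appendix A code.

```python
from collections import defaultdict

def raw_alk(aged_fish):
    """Build a raw ALK {length: {age: proportion}} from (length, age) pairs."""
    counts = defaultdict(lambda: defaultdict(int))
    for length, age in aged_fish:
        counts[length][age] += 1
    alk = {}
    for length, by_age in counts.items():
        total = sum(by_age.values())
        alk[length] = {age: n / total for age, n in by_age.items()}
    return alk

def numbers_at_age(length_freq, alk):
    """Divide the catch at each length between ages, then sum over lengths.

    Returns (numbers at age, number of fish with no age assigned).
    """
    at_age = defaultdict(float)
    unassigned = 0
    for length, n_fish in length_freq.items():
        if length not in alk:        # a void: no fish aged at this length
            unassigned += n_fish
            continue
        for age, proportion in alk[length].items():
            at_age[age] += n_fish * proportion
    return dict(at_age), unassigned

# Hypothetical toy data: (length in cm, age) for the aged sub-sample,
# and numbers caught per length class for the full sample.
aged = [(30, 2), (30, 2), (30, 3), (31, 3), (33, 3), (33, 4)]
length_freq = {30: 60, 31: 40, 32: 25, 33: 10}

alk = raw_alk(aged)
at_age, lost = numbers_at_age(length_freq, alk)
print(at_age)   # numbers at ages 2, 3 and 4
print(lost)     # fish at 32 cm, for which the raw key has no entry
```

In this toy case the 25 fish caught at 32 cm cannot be assigned an age because no fish of that length was aged, which is the kind of gap a smoothed key is intended to fill.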
The models investigated by Rindorf and Lewy (2001) were relatively complex, requiring the definition of a polynomial function in order that every possible distribution of length at age is covered. The assumption that length at age is normally distributed greatly simplifies the model, and Gerritsen et al. (2006) used this to consider differences in the age at length of haddock in the Irish Sea. This approach allows the estimation of age distributions for missing lengths with little leverage from outlier lengths, together with easy comparison between ALKs and an intuitive visualisation of the stock structure. However, despite the simplicity of implementation, this modelling approach has seen little take up in the context of ALKs. Therefore, we describe, explicitly, a method (similar to that of Kvist et al. (2000) and Gerritsen et al. (2006)) of calculating ALKs which uses Generalized Linear Modelling to estimate the distribution of age at a given length as a relatively smooth function of length, so that the model is smoothed with respect to both age and length. The fitting is done using Maximum Likelihood Estimation, so the process gives an estimate of how robust the key is and easily enables formal testing of the equality of ALKs using a Likelihood Ratio Test. This approach is then used to consider ALKs for haddock (Melanogrammus aeglefinus) in the North Sea, comparing those generated from different sampling areas within the ICES International Bottom Trawl Survey (IBTS) program and the Scottish commercial sampling program. We then utilise the fact that this process of modelling ALKs allows easy comparison of keys to investigate in more detail some of the differences observed.

2. Methods

For convenience, we will refer to fish in age classes a = 1, 2, . . . , A and in length classes (in cm) k = 1, 2, . . . , K. The sequences of values of age and length classes need not start at one, nor do they need to be equally wide, but they must be consecutive. We model age as a function of length through the likelihood function, as follows. Suppose in a chosen area for a given


length $k$, $N_k$ fish have been aged and there are $n_{ak}$ fish in age class $a$, where $a = 1, 2, \ldots, A$. Suppose the distribution of age at length $k$ is given by $P(k) = \{p_1(k), p_2(k), \ldots, p_A(k)\}$, where $p_a(k)$ is the probability that a fish in length class $k$ is in age class $a$. Then, up to a multiplicative constant, the likelihood, $L_k$, of observing $\{n_{1k}, n_{2k}, \ldots, n_{Ak}\}$ fish is given by

$$L_k = p_1(k)^{n_{1k}}\, p_2(k)^{n_{2k}} \cdots p_A(k)^{n_{Ak}} = \prod_{a=1}^{A} p_a(k)^{n_{ak}} \tag{1}$$

and over the entire data set (incorporating all lengths) the likelihood, $L$, is

$$L = \prod_{k=1}^{K} L_k. \tag{2}$$

A complete ALK is given by $P(k)$, a probability distribution of age as a function of length. Estimation of $P(k)$ is achieved using ordinal regression, which supports a variety of forms depending on how the probability distributions $\{p_a(k)\}$ are parametrized as functions of length $k$. We use the continuation ratio logit model (Dobson, 2002), in which the likelihood function $L$ is reparametrized as the product of likelihoods $L_a$ that can be estimated separately for each value of $a = 1, \ldots, A-1$. The probability distributions $p_a(k)$ can then be derived from them for each $a$ using Generalized Linear Modelling methodology. We use glm and predict in R (R project, 2009). Let

$$\pi_a(k) = \frac{p_a(k)}{\sum_{i=a}^{A} p_i(k)}, \quad a = 1, 2, \ldots, A-1, \tag{3}$$

$$p_A(k) = 1 - \sum_{a=1}^{A-1} p_a(k) \tag{4}$$

and let $N_{ak} = \sum_{i=a+1}^{A} n_{ik}$ for $a = 1, \ldots, A-1$; $k = 1, \ldots, K$. Then Eq. (1) can be rewritten as

$$L_k = \prod_{a=1}^{A-1} \pi_a(k)^{n_{ak}} \left(1 - \pi_a(k)\right)^{N_{ak}} \tag{5}$$

and Eq. (2) becomes

$$L = \prod_{a=1}^{A-1} \prod_{k=1}^{K} \pi_a(k)^{n_{ak}} \left(1 - \pi_a(k)\right)^{N_{ak}} \tag{6}$$

$$L = \prod_{a=1}^{A-1} L_a \tag{7}$$

where the $L_a$ are distinct Binomial likelihoods. Therefore we can estimate the probability distribution $\{\pi_a(k)\}$ as a function of $k$ for each value of $a$ separately by using logistic regression to estimate $(\alpha_a, \beta_a)$ in the equation

$$\pi_a(k) = \frac{1}{1 + e^{\alpha_a + \beta_a k}}. \tag{8}$$

Eqs. (3) and (4) allow us to obtain

$$p_a(k) = \begin{cases} \pi_1(k), & a = 1, \\ \pi_a(k) \prod_{i=1}^{a-1} \left(1 - \pi_i(k)\right), & a = 2, 3, \ldots, A-1, \\ \prod_{i=1}^{A-1} \left(1 - \pi_i(k)\right), & a = A. \end{cases} \tag{9}$$
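The estimation steps in Eqs. (5)–(9) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation (which uses glm and predict in R): the Newton-Raphson fitter stands in for a GLM routine, and the function names (`fit_logistic`, `fit_alk`, `age_probs`) and the toy counts are hypothetical.

```python
import math

def _log1pexp(x):
    """Numerically stable log(1 + exp(x))."""
    return x + math.log1p(math.exp(-x)) if x > 0 else math.log1p(math.exp(x))

def _loglik(alpha, beta, xs, succ, fail):
    """Binomial log-likelihood (up to a constant) with pi = 1/(1+exp(alpha+beta*x))."""
    total = 0.0
    for x, s, f in zip(xs, succ, fail):
        eta = alpha + beta * x
        total -= s * _log1pexp(eta) + f * _log1pexp(-eta)
    return total

def fit_logistic(lengths, succ, fail, iters=100, tol=1e-10):
    """Newton-Raphson fit of Eq. (8); returns (alpha, beta, log-likelihood)."""
    mean = sum(lengths) / len(lengths)
    xs = [k - mean for k in lengths]          # centre lengths for stability
    alpha = beta = 0.0
    ll = _loglik(alpha, beta, xs, succ, fail)
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, s, f in zip(xs, succ, fail):
            n = s + f
            pi = math.exp(-_log1pexp(alpha + beta * x))
            g = n * pi - s                    # d loglik / d(alpha + beta*x)
            w = n * pi * (1.0 - pi)           # minus the second derivative
            g0 += g
            g1 += g * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det      # Newton direction
        d1 = (h00 * g1 - h01 * g0) / det
        step = 1.0
        while True:                           # step-halving safeguard
            new_a, new_b = alpha + step * d0, beta + step * d1
            new_ll = _loglik(new_a, new_b, xs, succ, fail)
            if new_ll >= ll or step < 1e-8:
                break
            step /= 2.0
        done = abs(new_ll - ll) < tol
        alpha, beta, ll = new_a, new_b, new_ll
        if done:
            break
    return alpha - beta * mean, beta, ll      # undo the centring

def fit_alk(counts, ages):
    """Fit one logistic model per age class except the last (Eqs. (5)-(8)).

    counts is {length: {age: n_ak}}; returns ([(alpha_a, beta_a)], log L).
    """
    lengths = sorted(counts)
    params, loglik = [], 0.0
    for j, a in enumerate(ages[:-1]):
        n_ak = [counts[k].get(a, 0) for k in lengths]
        N_ak = [sum(counts[k].get(i, 0) for i in ages[j + 1:]) for k in lengths]
        alpha, beta, ll = fit_logistic(lengths, n_ak, N_ak)
        params.append((alpha, beta))
        loglik += ll
    return params, loglik

def age_probs(params, k):
    """Eq. (9): probabilities p_a(k) for all age classes at length k."""
    probs, not_younger = [], 1.0
    for alpha, beta in params:
        pi = 1.0 / (1.0 + math.exp(alpha + beta * k))
        probs.append(pi * not_younger)
        not_younger *= 1.0 - pi
    probs.append(not_younger)
    return probs

# Hypothetical toy counts of aged fish by length (cm) and age class.
counts = {25: {1: 9, 2: 1}, 30: {1: 6, 2: 4}, 35: {1: 2, 2: 6, 3: 2},
          40: {1: 1, 2: 4, 3: 5}, 45: {2: 2, 3: 8}, 50: {2: 1, 3: 9}}
params, loglik = fit_alk(counts, [1, 2, 3])
print(age_probs(params, 37))   # smoothed key at a length with no aged fish
```

Because each age class has its own two-parameter curve, the modelled key is defined at every length, including lengths such as 37 cm in the toy data where no fish were aged.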

This modelling approach has many advantages: the functional form has been chosen to allow wide variation in the estimated probability distributions of age at length, and it allows us to interpolate smoothly over “missing information” where no or few fish have been caught at a given length. In addition to this, because the method uses Maximum Likelihood Estimation, it is possible to test formally whether two keys, estimated from different data sets, can be distinguished using a Likelihood Ratio Test. The procedure assumes that exactly the same age classes, numbering $M$ say, are used for each data set. Suppose that the estimation of the two ALKs results in likelihoods $L_1$ and $L_2$, which we transform to log likelihoods $l_1$ and $l_2$ (i.e. $l = \ln L$). Now the two data sets are combined and a single ALK is estimated, using the same $M$ age classes and resulting in


Fig. 1. The left hand map shows ICES roundfish areas (RFAs) and the right hand map shows Scottish demersal sampling areas (DSAs). Both maps are taken from the Marine Scotland, Scotia Sea Going Manual.

a log likelihood $l_c$. The appropriate Likelihood Ratio Test here uses the test statistic $\Lambda = 2(l_1 + l_2 - l_c)$. Under the null hypothesis that the two underlying ALKs are actually identical, $\Lambda$ is known to have a $\chi^2$ distribution with $2(M - 1)$ degrees of freedom, and so the test is easily performed. Appendix A contains a recipe for generating and comparing these keys in R (R project, 2009), including the code implementing both the modelling and the generation of the likelihood.
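The test itself needs only the three maximised log likelihoods. The Python sketch below assumes they have already been obtained; the function names and the numerical log likelihoods are hypothetical. Because the degrees of freedom, 2(M − 1), are always even here, the upper tail of the chi-squared distribution has a closed form and no statistics library is required.

```python
import math

def chi2_sf_even(x, df):
    """P(X > x) for a chi-squared variable with an even number of degrees of freedom.

    Uses the closed form exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

def compare_keys(l1, l2, lc, n_age_classes):
    """Likelihood ratio test of two ALKs against the pooled key.

    l1, l2: log likelihoods of the separate keys; lc: log likelihood of the
    key fitted to the combined data; n_age_classes: M in the text.
    Returns (test statistic, degrees of freedom, p-value).
    """
    stat = 2.0 * (l1 + l2 - lc)
    df = 2 * (n_age_classes - 1)
    return stat, df, chi2_sf_even(max(stat, 0.0), df)

# Hypothetical log likelihoods for two data sets with M = 3 age classes.
stat, df, p_value = compare_keys(-100.0, -120.0, -225.0, 3)
print(stat, df, p_value)
```

A small p-value is evidence that the two keys differ, while a large one means only that no difference could be detected, a distinction the paper returns to when discussing sparse data sets.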

3. Case studies

3.1. Case 1: calculation of age at length keys from IBTS North Sea Quarter 1 surveys

We consider data collected in the Quarter 1 IBTS survey (from January to March) in 1991 for roundfish areas (RFAs) 1 and 2 as defined by the ICES IBTS sampling protocol (Anon., 2004) (see

Fig. 2. The top charts show the raw data from RFA 1 (left) and RFA 2 (right) in 1991. The middle graphs show the raw ALKs and the bottom charts show the ALKs generated by the modelling process. The scale gets lighter through the age classes 1 . . . 6 and 7+. In the original key for RFA 2 there are some lengths for which no fish have been aged. In addition, there are lengths for which no fish of a given age has been found despite it being seen in the surrounding length classes, for instance 6-year-olds in length class 54 cm. The modelling process smooths over length and age to produce better distributions, particularly for lengths with few data points or none.


Fig. 3. The top chart shows the ALK generated by mature fish and the bottom chart that generated by immature fish from RFA 1 Quarter 1 2005 for age classes ≤ 2, 3, 4, 5 and ≥ 6 getting lighter with age. The two keys are not comparable.

Fig. 1). Since the haddock spawning season runs from approximately February to May and juvenile haddock settle in the year they are spawned, all fish sampled in the bottom trawl survey must be at the very youngest age 1. Thus, we consider fish in age classes a = 1, 2, . . . , 6 and ≥ 7, of lengths k = 1, 2, . . . , 80 cm. The middle graphs in Fig. 2 show the raw ALKs. These can be very rough estimates as they depend crucially on the number of fish of each length in the sample which are aged. There are gaps where no fish have been aged (e.g. 56, 60 and 61 cm in RFA 1) and some bars where only one fish has been caught, so all fish at that length appear to be a single age even though there are younger fish in the length class above or older fish in the length class below (e.g. all fish of 66 cm are age 7 and all fish of 67 cm are age 6 in RFA 1). There

are also instances where there are no fish of a given age in some length classes but they do exist in the surrounding length classes (e.g. 42 cm for age 6 or 42–45 cm and 47 cm for age 7 in RFA 1). The bottom graphs show the ALK generated by the modelling process described above. It eliminates such problems to produce a more robust ALK. We then compare the two ALKs: that generated by the data from Quarter 1, 1991 in RFA 1 (the left hand column of Fig. 2) with that generated by the data from RFA 2 (the right hand column of Fig. 2). It is clear from both the raw and modelled ALKs that the age at length of fish aged 3 and above in RFA 2 tends to be greater than that of those in RFA 1, and this can be seen formally by performing the Likelihood Ratio Test, which gives a p-value of $5.78 \times 10^{-7}$ for the hypothesis that the keys are comparable. Thus,

Fig. 4. Numbers of fish aged by length and age in RFA 3 Quarter 3 2005. The top left graph shows the raw commercial data, which are much better for large fish. The top right graph shows the raw IBTS data, which are much better for small fish. The darkest bars represent fish of age classes 1–2, getting lighter through age classes 3 . . . 6 and 7+. In the bottom graphs cyan represents fish age 4+. The data sets produce comparable ALKs (bottom left and right show modelled ALKs for commercial and IBTS data, respectively).


Fig. 5. Numbers of fish aged by length and age in RFA 1 Quarter 1 (January–March) 2005. The top graphs show raw data and the bottom graphs the associated modelled ALKs. The left column is from commercial sampling and the right from IBTS sampling. The darkest bars represent fish of age classes 1–2, getting lighter through age classes 3 . . . 6 and 7+. The data sets produce comparable ALKs for lengths ≥ 36 cm but not for all lengths.

we can distinguish between the two ALKs at the 99% confidence level.

3.2. Case 2: comparison of mature and immature fish

Haddock mature at age 2–3 and RFA 1 has a large number of spawning haddock. We therefore consider whether there may be a difference between the ALKs generated by mature and immature fish. We use IBTS data from RFA 1 Quarter 1 2005 because the data set from this period contains a good distribution of both mature and immature fish and because at this time the protocol for assigning the maturity stage of fish is well established. Comparing fish of ages 2, . . . , 6 we reject the hypothesis that ALKs for mature and immature fish are comparable at the 99% confidence level ($p = 5.09 \times 10^{-3}$). Fig. 3 allows us to see that mature fish are older at length than immature fish. This is consistent with Heath et al. (2003), who found that age, hepatosomatic condition and length were all correlated with maturity.

3.3. Case 3: comparison of commercial landings and discard data

In the following examples we compare the IBTS data with commercial catch data collected by Marine Scotland in compliance with EU catch reporting requirements. Commercial haddock extraction is sampled within Scottish demersal sampling areas, which are smaller than the ICES RFAs but nest within them so that, for instance, demersal sampling areas (DSAs) 1, 2 and 5 combine to make up RFA 1 (see Fig. 1). The data sets have simply been combined to make up keys which correspond to the same areas. Commercial trawls also use different gear and target areas where larger fish congregate. As a result, data from small fish are scarce in the commercial data whilst data from large fish are scarce in the IBTS data (see Figs. 4–6). If the ALKs are comparable over lengths for which there are good data then it is reasonable to suppose that the fish come from the same population and hence the data sets could be combined to produce more robust ALKs.

Data on landings are collected by sampling the catch of selected vessels. Commercial landings are commonly sorted by the crew, species by species, into size categories, for example, small, medium and large. Each category is then sampled for length and age according to the two-stage sampling process. The numbers at length are calculated by multiplying up the weight sampled to the weight of fish in each of the categories declared for the vessel landings. Data from different vessels within the same sample strata are combined using the vessels’ landed weights as a weighting factor, and then raised to ‘fleet’ level by the ratio of fleet landed weights to the sum of the sampled vessels’ landed weights. Age data are collected in the same manner as for the IBTS data. There is a minimum length at which it is legal to land haddock: 30 cm. Fish below this length are discarded and discard quantities are estimated by observers on volunteer ships logging the proportion of the catch weight discarded. Discards are also sampled for length and age and raised to the numbers discarded at length and age using simple ratio estimates.

We consider first RFA 3 in Quarter 3 (July–September) of 2005. Fig. 4 shows the data sets from the commercial survey (left) and IBTS (right). The data set here is small and the bias towards small fish in the IBTS survey and larger fish in the commercial survey is particularly apparent. There are no fish aged 1 in the commercial data and very few greater than age 4 in the IBTS data, so the keys are produced for ages ≤ 2, 3 and ≥ 4. The log likelihood ratio test gives a p-value of 0.841, suggesting that it is not possible to distinguish between the two keys despite the tiny number of larger fish in the IBTS data. In this case the data sets can be combined to give a more robust estimate of age at length distributions, which is very useful given the dearth of data for larger fish.
However, it should be noted that the IBTS data set is extremely small so the key would have to be very different in order to reject the null hypothesis. RFA 1 Quarter 1 in 2005 has more data for both surveys though there is still very little data for smaller fish in the commercial data set (Fig. 5). For smaller fish, there are many fish aged 1 and 2 in


Fig. 6. Numbers of fish aged by length and age in RFA 3 Quarter 1 2005. The top graphs show raw data and the bottom graphs the associated modelled ALKs. The left column is from commercial sampling and the right from IBTS sampling. In the top graphs the darkest bars represent fish of age classes 1–2, getting lighter through age classes 3 . . . 6 and 7+. There are few fish under age 4 in the commercial data so darkest bars represent fish of age ≤ 4.

the IBTS data and very few in the commercial data. As a result, the keys are not comparable when all lengths are considered ($p = 9.575 \times 10^{-7}$). In contrast, in RFA 3 in Quarter 1 of 2005 (Fig. 6) there are many more fish aged 1 and 2 at small lengths, and at intermediate lengths a larger proportion of the fish are younger in the IBTS data. Indeed, there are very few fish aged 3 or under in the commercial data set and not enough fish of age 7 and over in either data set, so the ALKs represent fish ≤ 4, 5 and 6+ only. The age keys are not comparable (p = 0.003) even when data are restricted to mid-length fish where data are good for both IBTS and commercial data (≥ 36 and ≤ 42 cm and age classes ≤ 5, ≥ 6).

There is a variety of explanations which might account for the difference between the keys for the commercial data and the survey data. The two main differences between commercial and IBTS fishing patterns are gear and spatial distribution. Commercial trawlers use a variety of net geometries, particularly, as in the case of Nephrops trawls, when they are targeting a specific species. They also target areas where they are most likely to find big fish. The first scenario might give rise to different ALKs if behavioural changes through the haddock life cycle make them more vulnerable to different gears at different ages. (For instance, older fish might spend more time higher up the water column.) The second scenario might lead to different ALKs if there was aggregation of fish by age. This might be caused by spatially clustered settlement patterns or by spawning aggregation if, as in Section 3.2, mature and immature fish generate different ALKs. In Sections 3.4 and 3.5 we consider whether different gears and location might affect the ALKs produced.

3.4. Case 4: comparison of gear configurations

The composition of fishing fleets varies from country to country. For instance, the Scottish fleet has a large Nephrops fishery

and no use of beam trawls, which form a large proportion of the Dutch fleet. If age data from the various countries are to be combined then it is necessary to understand any difference in the age at length of fish caught by different gear configurations in order that fleet composition can adequately be accounted for in the estimation of keys from the combined data set. In the light of the results from Section 3.3, which show spatial differences in ALKs, we restrict our comparison to data gathered within single demersal sampling areas. In many cases there are insufficient data for robust comparison, but we discuss two contrasting examples which show the importance of careful consideration of gear geometry when combining data sets.

Fig. 7 shows a comparison of data from the seine net vs light trawl in demersal sampling area 1 (Shetland) for Quarter 1 of 2005. There were few data for ages 1–3, so the classes considered were ≤ 4, 5, 6 and ≥ 7. The ALKs were found to be comparable, with a p-value of 0.307. In contrast, the Nephrops trawl was compared to all other trawls combined in demersal sampling area 5 (Forties) for Quarter 3 of 2006 (see Fig. 8). The ALKs were found not to be comparable using age classes ≤ 2, 3, 4, 5, 6 and ≥ 7. However, there were only 7 fish of ages 4 and 5 (these would correspond to the year classes of 2001 and 2002, weak recruitment years after the massive 1999 recruitment year). Therefore, the analysis was repeated using age classes ≤ 2, 3–4, 5–6 and ≥ 7. The hypothesis of comparable keys was again rejected, with a p-value of $9.765 \times 10^{-12}$. The possibility that the lack of large fish reported from the Nephrops trawl might affect results was excluded by restricting the analysis to fish of length ≤ 47 cm. The hypothesis of comparable keys was still rejected, with a p-value of $2.244 \times 10^{-10}$, despite the reduced size of the data set.

It is clear, therefore, that the ALKs generated by fish caught by boats fishing for Nephrops are very different from those generated by boats targeting other species.


Fig. 7. Numbers of fish aged by length and age in demersal sampling area 1 (Shetland) Quarter 1 2005. The top graphs show raw data and the bottom graphs the associated modelled ALKs. The left column is from sampling of commercial seine nets and the right from sampling of commercial light trawls. The darkest bars represent fish in age classes ≤ 4, getting lighter through age classes 5, 6 and 7+. These keys are found to be comparable with a p-value of 0.307.

3.5. Case 5: comparison of spatial areas

Unfortunately, much of the age data reported in the IBTS survey is reported by RFA and it is not, therefore, possible to subdivide

IBTS data within each RFA. However, Scottish demersal sampling areas 1 (Shetland), 2 (Viking) and 5 (Forties) nest exactly to form RFA 1. We compare fish in age classes ≤ 3, 4, 5, 6, 7, 8, 9 and ≥ 10 for DSAs 1 vs 2 but, due to the distribution of ages in DSA 5 we

Fig. 8. Numbers of fish aged by length and age in RFA 3 Quarter 1 2005. The top graphs show raw data and the bottom graphs the associated modelled ALKs. The left column is from commercial sampling and the right from IBTS sampling. The darkest bars represent fish in age class ≤ 2 getting lighter through age classes 3 . . . 6 and 7+. The ALKs generated by this data are not comparable even when analysis is restricted to fish of length ≤ 47 cm.


Fig. 9. Commercial data for RFA 1 decomposed to DSAs 1, 2 and 5 on the first, second and third rows, respectively. The left hand column shows raw data and the right hand column shows the ALKs generated. The darkest bars represent fish in age class ≤ 2, getting lighter through age classes 3 . . . 8 and 9+. DSAs 1 and 2 are not comparable but DSAs 2 and 5 are, and DSAs 1 and 5 are comparable at the 95% confidence level but not at the 90% confidence level.

consider fish in age classes ≤ 4, 5, 6, 7, 8 and ≥ 9 when comparing DSA 1 vs DSA 5 and DSA 2 vs DSA 5. DSAs 1 and 2 are not comparable ($p = 5.610 \times 10^{-5}$) and this remains true comparing fish in age classes ≤ 4, 5, 6, 7, 8 and ≥ 9. In contrast, DSAs 2 and 5 are comparable (p = 0.528) and so are DSAs 1 and 5 at the 95% confidence level but not at the 90% confidence level (p = 0.067). Fig. 9 shows the raw commercial data set and the ALKs it generates, and it is very apparent that large fish in Viking tend to be younger at length than those in Shetland but that, at smaller lengths, the fish are younger at length in Shetland than in Viking. Forties bears more similarity to Viking than Shetland but exhibits the same tendency for larger fish to be older as is seen in Shetland. The ALKs in this figure also highlight a limitation of this method. Although it is useful for in-filling data for missing lengths, extreme care should be taken when considering extrapolation beyond the data set: it is clearly highly unlikely that fish of less than 20 cm will be in age class 6. The differences between these ALKs suggest that either growth rates vary on a much smaller scale than previously considered, or there is indeed a degree of spatial aggregation by age.

4. Assumptions and limitations

The structure of the likelihood function defined by Eqs. (1)–(7) is perfectly general and the main modelling assumption lies in how we represent $\pi_a(k)$ as an explicit function of length $k$. We do this here using logistic regression as in Eq. (8), but other functional forms are possible. We note that $\pi_a(k)$ is the probability of the event E: that a sampled fish of length $k$ is found to be in age class $a$, conditional on the fact that it is no younger than $a$. In detail, we model the log-odds in favour of E as a linear function of $k$, a form often found to be robust in applications.
In practical terms, we are assuming that the log-odds of E vary monotonically with k, which is not unreasonable since we would expect them to tend to decrease with k, as older fish will tend to be larger. The limitations of the approach are mainly related to the limitations of the quality and availability of the data. Obviously, we require a reasonable amount of data over a good range of lengths and corresponding ages for each of those lengths. In particular, we

require consecutive age classes, and sometimes data are so scarce that we need to collapse age classes to ensure enough data are available for meaningful estimation. Thus, in the example shown in Fig. 4, because of lack of fish in certain age classes, we reduce age to three classes (2 years and younger, 3 years, 4 years and older). This allows us to estimate an ALK which we can interpret clearly, and which applies well to the data we have. On the other hand, this collapsing of age classes may lessen the interpretive usefulness of the model. Problems of data scarcity are very common in this area but will affect any method of estimation of an ALK. An advantage of our method is that we can still estimate a meaningful and useful ALK and any limitations in its interpretation are clear.

Finally, we note a cautionary point about data quality and the interpretation of tests. In our examples we test whether ALKs based on two different data sets are the same. We do this here with a view to combining the data sets (and so getting a richer and better informed ALK) if they are found to be the same. In the example shown in Fig. 4, the formal test indicates that the two ALKs are the same, but the obvious scarcity of IBTS data makes it clear that the sensible decision here is the more conservative one that there is insufficient evidence to detect any difference between them.

5. Discussion

The continuation ratio logit model applied to distributions of age at length derived using GLM methodology allows the production of smooth and more robust ALKs. The use of Maximum Likelihood Estimation avoids some of the infelicities apparent in the present method of constructing ALKs from IBTS and commercial sampling data. Given the scarcity of data and the expense of collection, it is desirable to amalgamate data sets wherever possible without biasing results, as this will reduce the impact of any recording or reading errors and generate more robust ALKs.
Hayes (1993) used Fisher’s exact test (Conover, 1999) to compare ALKs on Georges Bank. This allowed comparison of keys with few data points but only allowed each length class to be compared separately rather than the keys in their entirety. The methods developed initially by Kvist et al. (2000) and Rindorf and Lewy (2001), for which a straightforward-to-implement presentation is formulated in this paper, allow for easy comparison of keys in a single test. Thus, noisy data sets can be compared after collation with a view to combining them into something more robust. In addition, easy comparison of ALKs is invaluable for analysis of spatial and temporal differences in population structure and could also help in investigations of whether there is any variation in age selectivity by different gears. There are limitations to the detail that can be achieved, but the method allows biologically realistic ALKs to be generated from what are necessarily often scarce data.

The comparison of keys generated over the same areas by the IBTS survey and by samples from commercial trawls suggests that they are not sampling the same population of fish (either because of location or gear selectivity). It is clear that different gears can have an effect (as in the case of Nephrops trawls). However, the clear difference between ALKs generated from commercial data in DSAs 1 and 2 suggests that the distribution of age at length is not uniform at the level of the roundfish area, and the difference in the ALKs generated by mature and immature fish in RFA 1 suggests that spawning aggregation may affect the spatial variability of age at length. This has important implications for the otolith sampling protocol within the IBTS survey, which sets a quota of 10 otoliths per length for each roundfish sampling area but does not put conditions on how those otoliths are distributed over the trawls within each area.

Acknowledgements

The authors would like to thank Helen Fraser for drawing the problem underlying this paper to their attention and for much useful discussion on the subject, the referees for helpful comments and for drawing their attention to previous work on the problem, and Marine Scotland for supporting the research through ROAME MF0761.

Appendix A. R code for generating and comparing ALKs


References

Anon., 2004. Manual for the International Bottom Trawl Surveys, Revision VII. http://datras.ices.dk/Documents/Manuals/ManualVII.doc.

Anon., 2008a. Council Regulation (EC) No. 199/2008 of 25 February 2008 Concerning the Establishment of a Community Framework for the Collection, Management and Use of Data in the Fisheries Sector and Support for Scientific Advice Regarding the Common Fisheries Policy.

Anon., 2008b. Commission Decision of 6 November 2008 Adopting a Multiannual Community Programme Pursuant to Council Regulation (EC) No. 199/2008 Establishing a Community Framework for the Collection, Management and Use of Data in the Fisheries Sector and Support for Scientific Advice Regarding the Common Fisheries Policy (2008/949/EC).

Conover, W.J., 1999. Practical Nonparametric Statistics, 3rd edn. John Wiley & Sons, New York.

Daan, N., 2001. The IBTS database: a plea for quality control. CM 2001/T:03. International Council for the Exploration of the Sea.

Dobson, A.J., 2002. An Introduction to Generalized Linear Models, 2nd edn. Chapman & Hall/CRC, London.

Gerritsen, H.D., McGrath, D., Lordon, C., 2006. A simple method for comparing age-length keys reveals significant regional differences within a single stock of haddock, Melanogrammus aeglefinus. ICES J. Mar. Sci. 63, 1096–1100.

Hayes, D.B., 1993. A statistical method for evaluating differences between age-length keys with application to Georges Bank haddock, Melanogrammus aeglefinus. Fish. Bull. 91, 550–557.

Heath, M.R., MacKenzie, B.R., Ådlandsvik, B., Backhaus, J.O., Begg, G.A., Drysdale, A., Gallego, A., Gibb, F., Gibb, I., Harms, I.H., Hedger, R., Kjesbu, O.S., Logemann, K., Marteinsdottir, G., McKenzie, E., Michalsen, K., Nielsen, E., Scott, B.E., Strugnell, G., Thorsen, A., Visser, A., Wehde, H., Wright, P.J., 2003. An Operational Model of the Effect of Stock Structure and Spatio-temporal Factors on Recruitment: Final Report of the EU-STEREO Project FAIR-CT98–4122. Contract Report 10/03, Fisheries Research Service.

Ibaibarriaga, L., Bernal, M., Motos, L., Uriarte, A., Borchers, D.L., Lonergan, M.E., Wood, S.N., 2007. Characterization of stage-classified biological processes using multinomial models: a case study of anchovy (Engraulis encrasicolus) eggs in the Bay of Biscay. Can. J. Fish. Aquat. Sci. 64, 539–553.

Kvist, T., Gislason, H., Thyregod, P., 2000. Using Continuation-Ratio Logits to analyze the variation of the age composition of fish catches. J. Appl. Stat. 27 (3), 303–319.

Martin, I., Cook, R.M., 1990. Combined analysis of length and age-at-length data. Journal du Conseil 46, 187–199.

Morton, R., 2008. Comparison of methods for estimating age composition with application to southern bluefin tuna (Thunnus maccoyii). Fish. Res. 93, 22–28.

Pope, J.G., 1988. Collecting fisheries assessment data. In: Gulland, J.A. (Ed.), Fish Population Dynamics: The Implications for Management, 2nd edn. John Wiley & Sons.

R Development Core Team, 2009. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0. http://www.R-project.org.

Rindorf, A., Lewy, P., 2001. Analyses of length and age distributions using Continuation-Ratio Logits. Can. J. Fish. Aquat. Sci. 58, 1141–1152.