Fisheries Research, 9 (1990) 143-163
Elsevier Science Publishers B.V., Amsterdam
The statistical design of comparative fishing experiments
M.O. Bergh¹, E.K. Pikitch¹, J.R. Skalski² and J.R. Wallace¹
¹Fisheries Research Institute, WH-10, University of Washington, Seattle, WA 98195 (U.S.A.)
²Center for Quantitative Science, HR-20, University of Washington, Seattle, WA 98195 (U.S.A.)
(Accepted for publication 31 January 1990)
ABSTRACT
Bergh, M.O., Pikitch, E.K., Skalski, J.R. and Wallace, J.R., 1990. The statistical design of comparative fishing experiments. Fish. Res., 9: 143-163.
Experimental designs for alternate tow gear experiments are described and evaluated, considering situations with and without randomized complete blocks. Variance components relevant to the performance of these designs are identified, and a method for estimating them is presented and applied using data for the Pacific multispecies groundfishery. The aim of the Pacific experiment was to estimate the magnitude of changes in fish length for various species and total tow catch value due to changes in cod end mesh size. Alternative experimental designs are compared with regard to the sample sizes needed to reject the null hypothesis that there is no treatment effect. For the specific application examined, there is a 4-10-fold reduction in sample size when randomized complete blocks are used within a vessel trip, compared with an unblocked experimental design using only one treatment type per vessel trip.
INTRODUCTION
The importance of proper design for fishing gear research is emphasized in the papers by Pope (1963) and Pope et al. (1975). As pointed out by these authors, careful planning in gear experimentation ensures that conclusions from the study are objective, and therefore statistically and legally defensible, and minimizes the effort required to reach these ends. This paper considers in more detail the experimental design for one of the common field techniques used in gear research, namely the method of alternate tows. Work using the alternate tow method has been reported by Jensen and Hennemuth (1966) and Smolowitz (1983), for ICNAF stocks. With this technique, different fishing nets are compared (with respect to response variables such as fish length and catch per unit effort) by towing them in alternating order. In contrast to the covered cod end technique (Margetts, 1956, 1959; Otterlind, 1959; Hodder and May, 1964; Robertson, 1983; Robertson et al., 1986), and the parallel trawl technique, the method of alternate trawls
can be performed under close to normal commercial fishing conditions. There are several advantages of performing gear research under commercial operating conditions. First, the results will be more readily accepted by fishers and managers. There is thus a better chance that the results of experimentation will have an impact on management. In addition, the costs of field research are greatly reduced, since the use of costly research charters is avoided. The trouser trawl technique has been used in Europe for experimentation under commercial conditions, but because of logistical constraints it was not considered here. Nevertheless, the principles of design outlined here are applicable to experiments conducted with trouser trawls.
In designing a gear research experiment, two procedures invariably recommended by statisticians are: (1) the randomization of treatment types and (2) the use of complete statistical blocks. Randomization is needed to eliminate systematic effects, including possible subjective inputs by the scientific researchers (or skipper and crew). The subdivision of the experiment into randomized blocks containing balanced random applications of all treatment types can be shown to reduce the sample sizes needed to obtain statistically defensible results. However, the frequent changes of nets or cod ends necessitated by a randomized block design increase the workload and inconvenience for the research team and for the vessel crew, and reduce available fishing time, and hence trip revenues. Thus, the overall advantages of blocking need to be demonstrated in simple quantitative terms, particularly when research is to be performed under commercial operating conditions using donated vessel time. This means that the sample sizes required for some common statistical test must be calculated for each kind of design.
In this paper, a method for estimating the sample sizes needed to reject an experimental hypothesis with a given design at a pre-set level of statistical significance and power is presented for an alternate tow gear experiment. The methods are applied to the design of an experimental gear study for the summer of 1988 for the Pacific groundfishery. Here this study is referred to as the proposed study. We focus on the relative advantage of blocking, where the most natural definition of a block is a vessel trip at sea. However, blocks which are smaller than complete trips are also considered. The key elements in the calculation procedure are the identification of important response variables, the choice of appropriate mathematical transformations, estimation of variance components, consideration of alternative hypotheses and the calculation of sample sizes for different experimental designs. We also include a discussion of the importance of obtaining data sets without missing data. To avoid the occurrence of missing points in the data set, procedures at sea must be carefully specified. Scientifically valueless results need to be rejected and the treatment application must be repeated in a statistically acceptable manner. Also, when a vessel can terminate a trip early,
due to factors outside scientific control, the blocking procedure must be modified appropriately and a method for doing this is suggested. The methods presented here are applied to an analysis of alternative experimental designs for a gear study of the American west coast groundfishery. The treatment types considered are trawl net cod ends with different net mesh sizes or types (i.e. diamond mesh or knotless square mesh). The general approach and design considerations, however, have application to a much broader class of gear studies and treatment types.
Aims of gear research
Figure 1 shows a hypothetical yield trajectory in a fishery which, although initially at equilibrium, is subjected suddenly to a change in the gear type. In this example, the gear change envisaged is an increase in the cod end mesh size. Legislated changes such as this are often made if it is suspected that a larger sustainable yield might occur, or if the stock can be better conserved according to some criterion of a safe minimum stock size. The frequent aim in gear research is to predict future yields following a change in gear regulations. A traditional approach to this kind of research has developed over the years, particularly for research into the effect of mesh size. This is to estimate the net selectivity factors which give the proportion of fish of different sizes caught by the proposed net type. By converting these estimates to values of fishing mortality as a function of age, one can then use models of population dynamics to predict future yields in the fishery (Beverton and Holt, 1957;
[Figure 1: plot not reproduced; x-axis: Years]
Fig. 1. A hypothetical yield curve for an age-structured bottomfish population which is initially at equilibrium, and is subjected to a sudden increase in the cod end mesh size in use by the fishing fleet. Effort is assumed to be constant over the entire time period. Initially the yield drops sharply, but the new equilibrium yield could be larger than before (the new sustainable yield could also be smaller).
Gulland, 1963). With this approach, the emphasis is on measuring the length-specific selectivity properties of the net and then doing yield-per-recruit calculations to obtain the shape of the future yield curve in Fig. 1. Emphasis is generally placed on estimating the change in sustainable yield, $\Delta y''$ in Fig. 1, rather than the immediate consequences of changing selectivities, shown as $\Delta y'$. Even though the rationale for pursuing selectivity studies is that $\Delta y''$ may be positive, immediate effects are undoubtedly important and deserve more attention. For example, if short-term losses are very severe, the fishing industry could face financial disaster and not survive to experience the increased sustainable yields forecast. For the fishery with which this work was concerned, earlier studies (Pikitch, 1987; Vaga and Pikitch, 1988) indicated that sustainable yields would increase with increasing mesh sizes. However, concern about the short-term consequences of mesh size changes was expressed and a decision was reached to focus the gear study on the immediate consequences of a mesh size change, such as changes in the average cash value of a tow, i.e. an attempt would be made to estimate the quantity $\Delta y'$.
Randomization and blocking
An important requirement for a defensible experimental design is that the order in which gear treatment types are applied should be random (see Pope, 1963; Pope et al., 1975). The random application of treatment types can be achieved using tables of random numbers. The intent is to eliminate systematic effects which might bias the results for a particular treatment type. However, even if treatment types are randomly applied, human intervention can jeopardize the objectivity of the experiment. Additional measures may therefore be necessary to preserve objectivity, such as preventing the vessel skipper from finding out which cod end type is in use. As this is practically impossible, compromises can be worked out which limit the skipper's ability to respond to different cod end types, such as nominating cod end type from the random table after the skipper has decided on the tow location and target species.
Blocking refers to different aspects of gear study field work. Under normal conditions, vessels set to sea for a number of days on a trip, during which several distinct tows are made. An important question that arises in gear studies is whether to carry all the gear or treatment types on the same fishing trip, or whether only one treatment type need be carried per vessel. We consider two alternative designs. In Design A, the experimental unit is a trip and each fishing vessel carries a single gear type. Replication is achieved in Design A by multiple trips with the same type of gear. As an alternative, during each trip the vessel could carry the full complement of gear types. In
this design, denoted as Design B, gear types would be tested using a randomized block design (RBD). Within each block, all gear types are used and the order of use is randomized independently for each block. We restrict our analysis of the relative merits of different experimental designs to an evaluation of Design A versus Design B.
METHODS
Design response variables
For the gear study of the Pacific groundfish fishery, which is the example used here, the quantity identified as being crucial was the cash value of a tow per unit tow time, C. This response was of great interest to the fishermen, whose donated vessel time was critical to the project. Furthermore, complex and variable pricing structures militated against the use of catch weight as a critical response variable with post hoc conversion to revenues. We define the variable $c_{l,q}(m)$ as the catch from a single tow using net m, for species q and length class l. Also let $\tau$ be the tow duration in hours. When there are Q marketable species in a fishery, the cash value of a tow per unit tow duration using gear type m, $C_m$, is

$$C_m = \frac{1}{\tau}\sum_{q=1}^{Q}\sum_{l=1}^{L(q)} p_{l,q}\, w_{l,q}\, c_{l,q}(m) \qquad (1)$$

where $p_{l,q}$ and $w_{l,q}$ are, respectively, price per unit weight in dollars, and body weight for species q (q = 1, ..., Q) and length class l (l = 1, ..., L(q)). Changes in $C_m$ as a result of gear types with different mesh sizes should be accompanied by changes in the mean length of fish caught. Logically, catches should decrease as mesh sizes increase. Mean length estimates thus provide an additional check on the tow cash value results and are important response variables. Mean fish lengths by species and sex for a single tow are given by $\bar{l}_q^{\,s}$, where

$$\bar{l}_q^{\,s} = \frac{1}{n_q^s}\sum_{i=1}^{n_q^s} l_{i,q}^{\,s} \qquad (2)$$

where $n_q^s$ is the total number of fish from species q, with sex s in the catch of an unspecified tow, and $l_{i,q}^{\,s}$ is the length of the ith fish caught.
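As an illustration of eqns. (1) and (2), the following Python sketch computes the cash value per tow hour and a mean length for one tow. The dictionary layout keyed by (length class, species), the function names and all numbers are hypothetical and chosen only for the example; the catch is taken to be in numbers of fish so that multiplying by body weight gives catch weight.

```python
def cash_value_per_hour(catch, price, weight, tow_hours):
    """Eq. (1): C_m = (1/tau) * sum over q and l of p[l,q] * w[l,q] * c[l,q](m).

    catch, price and weight are dicts keyed by (length_class, species);
    this data layout is an assumption made for illustration.
    """
    total = sum(price[key] * weight[key] * n for key, n in catch.items())
    return total / tow_hours


def mean_length(lengths):
    """Eq. (2): mean length of the fish of one species and sex in one tow."""
    return sum(lengths) / len(lengths)


# Hypothetical tow with two species (numbers of fish per length class).
catch = {(30, "dover"): 40, (35, "dover"): 25, (50, "sablefish"): 10}
price = {(30, "dover"): 0.5, (35, "dover"): 0.6, (50, "sablefish"): 1.0}   # dollars per unit weight
weight = {(30, "dover"): 0.4, (35, "dover"): 0.5, (50, "sablefish"): 2.0}  # body weight per fish

value = cash_value_per_hour(catch, price, weight, tow_hours=2.0)
dover_mean = mean_length([30, 32, 34])
```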
ANOVA models corresponding to experimental designs
The null hypothesis for the comparative field study is "There is no treatment effect" on the mean cash value of the tow or the mean length of fish caught for a given species.
Design A
For Design A, the test of gear effect is performed using a one-way analysis of variance with subsampling. The different gear types constitute the treatments, the trips constitute the replicates and the tows are the subsamples. A total of T trips are completed. However, since there are r treatment types, only t trips are completed with each treatment type, so that T = rt. The model is

$$Y_{m,i,j} = \mu + \alpha_m + \beta_i + \epsilon_{m,i,j} \qquad (3)$$

where $Y_{m,i,j}$ = the observed value for the jth tow (j = 1, 2, ..., k) of the ith trip (i = 1, 2, ..., t) using the mth gear type (m = 1, 2, ..., r); $\mu$ = the overall mean response; $\alpha_m$ = the effect of the mth gear type; $\beta_i$ = the effect of the ith set of r trips; $\epsilon_{m,i,j}$ = error terms associated with the variation in the jth tow (j = 1, 2, ..., k) of the ith trip (i = 1, 2, ..., t) with the mth gear type (m = 1, 2, ..., r).
We postulate that in ANOVA Model (3), the effects of the gear, trip and tow are random effects each contributing to the overall variance in $Y_{m,i,j}$. The trip component of variance ($\sigma_T^2$) is the variance in response from trip to trip. A second component of variation is the variance in response from tow to tow within a trip ($\sigma_H^2$). The variance between tows under Model (3) is estimated by (Johnson and Leone, 1977)

$$\hat{\sigma}_H^2 = \frac{\sum_{m=1}^{r}\sum_{i=1}^{t}\sum_{j=1}^{k}\left(Y_{m,i,j}-\bar{Y}_{m,i}\right)^2}{rt(k-1)} \qquad (4)$$

where $\bar{Y}_{m,i}$ is the mean value across tows for each mesh m and trip i combination

$$\bar{Y}_{m,i} = \frac{\sum_{j=1}^{k} Y_{m,i,j}}{k}$$

The variance in response from trip to trip has both the variation between tows and trips, and consequently the trip component of variance is estimated by (Johnson and Leone, 1977)

$$\hat{\sigma}_T^2 = \frac{\sum_{m=1}^{r}\sum_{i=1}^{t}\left(\bar{Y}_{m,i}-\bar{Y}_{m}\right)^2}{r(t-1)} - \frac{\hat{\sigma}_H^2}{k} \qquad (5)$$

where $\bar{Y}_m$ is the mean of $\bar{Y}_{m,i}$ across all trips with the same mesh type

$$\bar{Y}_{m} = \frac{\sum_{i=1}^{t} \bar{Y}_{m,i}}{t}$$
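A minimal Python sketch of the balanced estimators in eqns. (4) and (5) is given below. The nested-list layout `y[m][i][j]` (gear type, trip, tow), the function name and the small data set are assumptions made for illustration; they are not taken from the Oregon-Washington data.

```python
def variance_components_design_a(y):
    """Estimate tow and trip variance components from eqs. (4) and (5).

    y[m][i][j] is the response for gear type m, trip i and tow j, with the
    same number of trips (t) per gear type and tows (k) per trip.
    """
    r, t, k = len(y), len(y[0]), len(y[0][0])
    trip_means = [[sum(trip) / k for trip in gear] for gear in y]
    gear_means = [sum(means) / t for means in trip_means]

    # Eq. (4): pooled within-trip (tow-to-tow) variance.
    ss_tow = sum((obs - trip_means[m][i]) ** 2
                 for m in range(r) for i in range(t) for obs in y[m][i])
    var_tow = ss_tow / (r * t * (k - 1))

    # Eq. (5): variance of trip means, corrected for the tow contribution.
    ss_trip = sum((trip_means[m][i] - gear_means[m]) ** 2
                  for m in range(r) for i in range(t))
    var_trip = ss_trip / (r * (t - 1)) - var_tow / k
    return var_tow, var_trip


# Hypothetical balanced data: r = 2 gear types, t = 2 trips each, k = 2 tows.
y = [[[1.0, 3.0], [5.0, 7.0]],
     [[2.0, 4.0], [6.0, 8.0]]]
var_tow, var_trip = variance_components_design_a(y)
```

With sparse data the corrected trip estimate from eq. (5) can come out negative; how to treat that case is a separate decision that this sketch does not make.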
Design B
The ANOVA model for Design B can be written as

$$Y_{m,j} = \mu + \alpha_m + \beta_j + \gamma_{m,j} + \epsilon_{m,j} \qquad (6)$$

where $Y_{m,j}$ = the observed value for the mth gear type (m = 1, 2, ..., r), on the jth trip (j = 1, 2, ..., T); $\mu$ = the overall mean response; $\alpha_m$ = the effect of the mth gear type; $\beta_j$ = the effect of the jth trip; $\gamma_{m,j}$ = the interaction factor between the mth gear type and jth trip; $\epsilon_{m,j}$ = error terms associated with the variation in the jth trip (j = 1, 2, ..., T) and the mth gear type (m = 1, 2, ..., r).

Null hypothesis
For Designs A and B, the null hypothesis is

$$H_0: \alpha_1 = \alpha_2 = \dots = \alpha_r \qquad (7)$$

against the alternative hypothesis

$$H_a: \text{not all } \alpha_m \text{ are equal}, \quad m = 1, 2, \dots, r \qquad (8)$$

The advantage of Design B is that treatment types are compared on the same trip. This eliminates the trip component of variance from the mean square error which appears in the denominator of the F-statistic used to test the null hypothesis. The result is a reduction in the sample size needed to achieve the same significance level and power.
Degrees of freedom
For Design A, the unit of observation is the trip observation. There are exactly T such observations, i.e. T/r per treatment type, so that for Design A, the denominator degrees of freedom for the F-test of the null hypothesis (7) is

$$\nu_2 = T - (r-1) - 1 = T - r \qquad (9)$$

In Design B, there are Trk tow-specific observations. There are T - 1 degrees of freedom associated with the trip factor, (r - 1) mesh factor degrees of freedom, (T - 1)(r - 1) trip-mesh interaction factor degrees of freedom, and one degree of freedom associated with the grand mean. Therefore, for Design B

$$\nu_2 = Trk - (T-1) - (r-1) - (T-1)(r-1) - 1 = rT(k-1) \qquad (10)$$

For the F-test of the null hypothesis (7), the numerator degrees of freedom, $\nu_1$, will be r - 1 for both designs.
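The degrees of freedom in eqns. (9) and (10) are simple to tabulate. The following Python sketch does so; the function and argument names are hypothetical, and the values of T, r and k are illustrative only.

```python
def denominator_df(design, trips, treatments, tows_per_trip=None):
    """Denominator degrees of freedom nu_2 from eqs. (9) and (10)."""
    if design == "A":
        return trips - treatments                        # eq. (9): T - r
    if design == "B":
        if tows_per_trip is None:
            raise ValueError("Design B needs tows_per_trip (k)")
        return treatments * trips * (tows_per_trip - 1)  # eq. (10): rT(k - 1)
    raise ValueError("design must be 'A' or 'B'")


def numerator_df(treatments):
    """Numerator degrees of freedom nu_1 = r - 1 for both designs."""
    return treatments - 1


# Illustrative values: T = 40 trips, r = 2 cod end types, k = 8.
nu2_a = denominator_df("A", trips=40, treatments=2)
nu2_b = denominator_df("B", trips=40, treatments=2, tows_per_trip=8)
```

For the same number of trips, Design B has many more denominator degrees of freedom because every tow, rather than every trip, contributes an observation.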
Background to sample size calculations
The aim in optimizing the design of an experiment is to determine the design which leads to the rejection of the null hypothesis with the smallest sampling effort or cost. For commercial vessels, the number of tows per trip is dictated by conditions which are outside scientific control. The costs of a fishing trip are far greater than the costs associated with a tow or the effort required to perform an extra tow on an existing trip. Therefore, the relative costs of Designs A and B will be compared on the basis of the total number of trips, T, required to reject the null hypothesis at a given level of significance and power, keeping the number of tows per trip equal for both designs. The value of T required to reject the null hypothesis can be calculated by using the non-central F parameter, $\delta$ (see Cochran and Cox, 1957; Scheffé, 1959; Peng, 1967) in conjunction with appropriate statistical tables. The non-central F parameter is related to the parameter $\phi$ used in Pearson and Hartley's (1962) tables by the formula

$$\phi = \frac{\delta}{\sqrt{\nu_1 + 1}} \qquad (11)$$

Myers (1972) defines $\phi^2$ as the variance between population means (population means = the mean response value obtained with a particular treatment type) under the alternative hypothesis, multiplied by the number of replicates made with each treatment type and divided by the population error variance (population error variance = the mean square error for the ANOVA under the alternative hypothesis). Peng's formulae for $\phi$ are consistent with those of Myers. Using Myers (1972)

$$\phi^2 = \frac{R\sum_{j=1}^{r}\left(\mu_j-\mu\right)^2}{(\nu_1+1)\,\sigma^2} \qquad (12)$$

where R is the number of replicates with each treatment type in the experiment, $\mu_j$ is the true population mean for the jth treatment type under the alternative hypothesis (note that under the null hypothesis $\mu_j = \mu$ for all j = 1, 2, ..., r), $\mu$ is the grand mean and $\sigma^2$ is the population error variance, for which the mean square error from the ANOVA is an estimate. In the following description, we derive expressions for $\phi$ for Designs A and B in terms of anticipated differences in response to gear type (as defined below), the number of trips, T, required to reject the null hypothesis at a given statistical power and significance level, the variance components, $\sigma_T^2$ and $\sigma_H^2$, and the total number of treatment replicates per trip, k. Under the alternative hypothesis, we assume that
$$|\mu-\mu_1| = |\mu-\mu_2| = \dots = |\mu-\mu_r|$$

(this assumption can easily be dropped; see Scheffé, 1959; Peng, 1967). Also, let

$$|\mu-\mu_1| + |\mu-\mu_2| + \dots + |\mu-\mu_r| = \Delta$$

and

$$(\mu-\mu_1) + (\mu-\mu_2) + \dots + (\mu-\mu_r) = 0$$

so that

$$|\mu-\mu_j| = \frac{\Delta}{r}, \quad j = 1, 2, \dots, r \qquad (13)$$

Therefore, for r = 2

$$\mu-\mu_1 = -(\mu-\mu_2)$$

Substituting 2 for r, $\Delta/2$ for $(\mu_j-\mu)$, and T/2 for R in eqn. (12) (in Design A there will be T/2 replicates of each treatment type) gives the following expression for $\phi^2$ for Design A

$$\phi_A^2 = \frac{\Delta^2 T}{8\sigma_A^2} \qquad (14)$$

For Design B, R = Tk and, therefore, substituting in eqn. (12) where necessary

$$\phi_B^2 = \frac{\Delta^2 T k}{4\sigma_B^2} \qquad (15)$$

For the two designs considered here

$$\sigma_A^2 = \sigma_T^2 + \frac{\sigma_H^2}{k} \qquad (16)$$

and

$$\sigma_B^2 = \sigma_H^2 \qquad (17)$$

i.e. $\sigma_A^2$ has contributions from both the trip and tow components of variance, but the trip variance component is absent from $\sigma_B^2$. From eqns. (14)-(17), the formulae for $\phi$ are

$$\phi_A = \frac{\Delta\sqrt{T}}{2\sqrt{2\left(\sigma_T^2 + \sigma_H^2/k\right)}} \qquad (18)$$

and

$$\phi_B = \frac{\Delta\sqrt{Tk}}{2\sigma_H} \qquad (19)$$
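Equations (18) and (19) can be evaluated directly once the variance components are known. The Python sketch below does this for two treatment types; the function names and all numerical inputs are hypothetical. The helper `trips_ratio` follows from setting eqn. (14) equal to eqn. (15) and, as a simplifying assumption, ignores the fact that the tabulated value of $\phi$ required for a given power also depends on $\nu_2$.

```python
import math


def phi_design_a(delta, trips, var_trip, var_tow, tows_per_trip):
    """Eq. (18): phi for Design A with two treatment types."""
    var_a = var_trip + var_tow / tows_per_trip            # eq. (16)
    return delta * math.sqrt(trips) / (2.0 * math.sqrt(2.0 * var_a))


def phi_design_b(delta, trips, var_tow, tows_per_trip):
    """Eq. (19): phi for Design B; the trip variance drops out (eq. 17)."""
    return delta * math.sqrt(trips * tows_per_trip) / (2.0 * math.sqrt(var_tow))


def trips_ratio(var_trip, var_tow, tows_per_trip):
    """T_A / T_B giving equal phi in eqs. (14) and (15)."""
    return 2.0 * (tows_per_trip * var_trip + var_tow) / var_tow


# Hypothetical inputs: delta = 0.2, sigma_T^2 = 0.2, sigma_H^2 = 0.5, k = 8.
phi_a = phi_design_a(0.2, trips=40, var_trip=0.2, var_tow=0.5, tows_per_trip=8)
phi_b = phi_design_b(0.2, trips=40, var_tow=0.5, tows_per_trip=8)
ratio = trips_ratio(var_trip=0.2, var_tow=0.5, tows_per_trip=8)
```

The ratio grows with the size of the trip variance component relative to the tow component, which is the quantitative reason blocking within trips pays off.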
From eqns. (18) and (19), the crucial information for evaluating different designs is the estimates of $\sigma_T^2$ and $\sigma_H^2$. In the following section, a method for estimating $\sigma_T^2$ and $\sigma_H^2$ is described. The application of this method to data from the west coast groundfishery is presented for the dollars per tow hour response variable. The resultant variance components are then used in a subsequent section to evaluate the performance of Designs A and B with respect to sample size.
Estimating the variance components
In this section, a method is described for estimating $\sigma_T^2$ and $\sigma_H^2$ using prior records of response observations for tows conducted in an uncontrolled, unbalanced manner which are nevertheless grouped by fishing trip. Using data such as these, $\sigma_T^2$ and $\sigma_H^2$ can be estimated using the ANOVA described in eqn. (3) with r = 1, i.e. assuming that there is only one treatment type. Note that when r = 1, T = t. For the completely balanced case, variance component estimates can therefore be obtained from eqns. (4) and (5) by setting r = 1 and m = 1. The tow component of variance is therefore

$$\hat{\sigma}_H^2 = \frac{\sum_{i=1}^{T}\sum_{j=1}^{k}\left(Y_{1,i,j}-\bar{Y}_{1,i}\right)^2}{T(k-1)} \qquad (20)$$

where $\bar{Y}_{1,i}$ is now the mean value across tows for each trip i (lumping all gear types into one category, m = 1)

$$\bar{Y}_{1,i} = \frac{\sum_{j=1}^{k} Y_{1,i,j}}{k}$$

As before, the variance in response from trip to trip has both the variation between tows and trips, and so the corrected trip-to-trip component of variance is given by

$$\hat{\sigma}_T^2 = \frac{\sum_{i=1}^{t}\left(\bar{Y}_{1,i}-\bar{Y}_{1}\right)^2}{t-1} - \frac{\hat{\sigma}_H^2}{k} \qquad (21)$$

where $\bar{Y}_1$ is the mean of $\bar{Y}_{1,i}$ across all trips

$$\bar{Y}_{1} = \frac{\sum_{i=1}^{t} \bar{Y}_{1,i}}{t}$$

Equation (21) is derived from an equation of the form

$$\hat{\sigma}_T^2 = \frac{MS_{trip} - \hat{\sigma}_H^2}{k} \qquad (22)$$
where $MS_{trip}$ is the mean square for the trip effect. For an unbalanced data set, k in eqn. (22) is replaced by $k_0$ (Johnson and Leone, 1977)

$$k_0 = \frac{1}{T-1}\left(N - \frac{\sum_{i=1}^{T} n_i^2}{N}\right)$$

where $n_i$ is the number of tows in the ith trip, and N is the total number of tows in the data set, i.e. $N = \sum_{i=1}^{T} n_i$.
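For an unbalanced set of trips, the estimation can be sketched in Python as below. The $k_0$ expression coded here is the usual one for an unbalanced one-way random-effects layout and is assumed, not confirmed, to match the Johnson and Leone (1977) form; the within-trip mean square with N - T degrees of freedom is likewise the standard unbalanced analogue of eqn. (20). The function name and data are hypothetical.

```python
def variance_components_unbalanced(trips):
    """Tow and trip variance components for tows grouped by trip.

    trips is a list with one inner list of tow responses per trip; trips
    may contain different numbers of tows (at least one trip needs two).
    """
    T = len(trips)
    n = [len(tows) for tows in trips]
    N = sum(n)
    trip_means = [sum(tows) / len(tows) for tows in trips]
    grand_mean = sum(sum(tows) for tows in trips) / N

    # Within-trip mean square; reduces to eq. (20) when all n_i equal k.
    ss_within = sum((obs - mean) ** 2
                    for tows, mean in zip(trips, trip_means) for obs in tows)
    var_tow = ss_within / (N - T)

    # Mean square for the trip effect.
    ss_between = sum(n_i * (mean - grand_mean) ** 2
                     for n_i, mean in zip(n, trip_means))
    ms_trip = ss_between / (T - 1)

    # Assumed standard form of k0; equals k in the balanced case.
    k0 = (N - sum(n_i ** 2 for n_i in n) / N) / (T - 1)
    var_trip = (ms_trip - var_tow) / k0       # eq. (22) with k replaced by k0
    return var_tow, var_trip, k0


# Hypothetical data: a balanced set and an unbalanced set of three trips.
balanced = variance_components_unbalanced([[1.0, 3.0], [5.0, 7.0], [9.0, 11.0]])
unbalanced = variance_components_unbalanced([[1.0, 3.0], [5.0, 7.0, 9.0], [10.0]])
```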
Application to the Pacific groundfishery
For the example to be used here, the multispecies Pacific groundfishery, a database from an earlier study (Pikitch, 1986), referred to here as the Oregon-Washington database, contained measurements at the level of tow for an unbalanced and uncontrolled set of gear types. Specifically, the data were obtained by observers stationed aboard commercial groundfish trawl vessels operating out of the major fishing ports of the Oregon and southern Washington coast. During each trip, the skipper of the vessel controlled the location and gear type used for each tow, and the number of tows performed. Observers sampled the catch and estimated the total landed weight of each species obtained in each tow, and recorded length frequency information for selected species. Records of prices paid per unit weight and species were obtained from fish processing plants in order to estimate the cash value per unit tow duration. Prior to estimating trip and tow variance components of the cash value response variable, the data were split into two categories corresponding to two major fishing strategies employed by the fleet. These strategies are denoted as follows.
(1) Rockfish: tows directed at a mixture of rockfish (Sebastes spp.) using roller gear.
(2) Flatfish: tows directed at a deepwater (> 100 fathoms) assemblage consisting primarily of Dover sole (Microstomus pacificus), sablefish (Anoplopoma fimbria) and thornyheads (Sebastolobus spp.) using mud or combination mud-roller gear.
The variance component estimates that are obtained from this data set still contain variance associated with gear type which could not be removed due to the uncontrolled appearance of different gear types in the data set, and should therefore be viewed as larger than one would expect to find in the data set obtained from the controlled gear selectivity study.
Sample size calculations
The value of $\phi$ can be found from tables provided that one knows the significance level $\alpha$, the power $(1-\beta)$, and the numerator ($\nu_1$) and denominator ($\nu_2$) degrees of freedom for the F-test. In all our analyses, the following fixed values were specified:

$$\alpha = 0.20 \text{ (two-tailed)}, \quad 1-\beta = 0.80 \quad \text{and} \quad \nu_1 = 1 \qquad (23)$$

It is assumed that the direction of the change in response due to gear type can always be predicted if the different gear types are different cod end mesh sizes, i.e. larger mesh sizes should catch a larger mean length of fish and should produce a smaller catch per unit effort than other smaller mesh sizes. Therefore, when there are only two treatment types, the F-test at $\alpha = 0.20$ (two-tailed) is equivalent to a one-tailed t-test at $\alpha = 0.10$. Specification of the number of sampling trips, T, and the number of replicates per trip, k, fixes $\nu_2$ via eqns. (9) and (10). The value of $\phi$ needed to obtain the significance and power specified in (23) for given $\nu_2$, denoted $\phi'$, is given in statistical tables (Pearson and Hartley, 1962). These numbers decrease with increasing values of $\nu_2$ and T. However, the predicted observed value of $\phi$ under the alternative hypothesis, denoted as $\phi^*$, which is determined through eqns. (18) and (19) by the values of T, $\sigma_T^2$, $\sigma_H^2$, k and $\Delta$, is an increasing function of $\nu_2$ and T. If all information, except T, is specified, then the plots of $\phi'$ and $\phi^*$ versus T will either intersect, or make a closest approach at some value of T. The smallest value of $\phi^*$ permissible for the required power must be equal to or larger than the value of $\phi'$ at the approximate (approximate because the x-axis, T, takes integer values only) intersection point. This procedure is summarized by the following algorithm.
1. Input the values of $\sigma_T^2$, $\sigma_H^2$, k, $\alpha$, $1-\beta$, and $\Delta$.
2. Provide an initial value for $\nu_2$.
3. Model A: $T = \nu_2 + 2$; Model B: T = the integer closest to $\nu_2/(2(k-1))$.
4. The value of $\phi^*$ (as defined above) for each of the models, for a given value of the denominator degrees of freedom, $\nu_2$, is

$$\text{Model A:} \quad \phi^* = \frac{\Delta\sqrt{T}}{2\sqrt{2\left(\sigma_T^2 + \sigma_H^2/k\right)}}$$

$$\text{Model B:} \quad \phi^* = \frac{\Delta\sqrt{Tk}}{2\sigma_H}$$

5. If $\phi^*(\nu_2) < \phi'(\nu_2)$ and $\phi^*(\nu_2+1) > \phi'(\nu_2+1)$ proceed to Step 8.
6. If $\phi^*(\nu_2) > \phi'(\nu_2)$, decrease $\nu_2$ by one, and go back to Step 3.
7. If $\phi^*(\nu_2) < \phi'(\nu_2)$, increase $\nu_2$ by one, and go back to Step 3.
8. The calculation is complete. The required number of trips is given by the value of T corresponding to $\nu_2 + 1$.
RESULTS
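The search in Steps 1-8 of the sample size calculations can be sketched in Python as follows. This is a simplified variant, not the authors' program: it scans T upwards directly instead of adjusting $\nu_2$ one step at a time, which locates the same crossing point when $\phi'$ decreases and $\phi^*$ increases with T. The function `placeholder_phi` is a hypothetical stand-in for a lookup of $\phi'$ in power tables such as those of Pearson and Hartley (1962); its values are invented for the example, as are all other inputs.

```python
import math


def predicted_phi(design, trips, delta, var_trip, var_tow, k):
    """phi* from eq. (18) (Design A) or eq. (19) (Design B), two treatments."""
    if design == "A":
        var_a = var_trip + var_tow / k
        return delta * math.sqrt(trips) / (2.0 * math.sqrt(2.0 * var_a))
    return delta * math.sqrt(trips * k) / (2.0 * math.sqrt(var_tow))


def denominator_df(design, trips, k):
    """nu_2 for two treatment types: T - 2 (Design A) or 2T(k - 1) (Design B)."""
    return trips - 2 if design == "A" else 2 * trips * (k - 1)


def required_trips(design, delta, var_trip, var_tow, k, required_phi,
                   max_trips=100000):
    """Smallest T for which phi*(T) >= phi'(nu_2(T)).

    required_phi is a caller-supplied function nu_2 -> phi'; in practice it
    would come from power tables for the chosen alpha and power.
    """
    first = 3 if design == "A" else 1     # Design A needs nu_2 = T - 2 >= 1
    for trips in range(first, max_trips + 1):
        nu2 = denominator_df(design, trips, k)
        phi_star = predicted_phi(design, trips, delta, var_trip, var_tow, k)
        if phi_star >= required_phi(nu2):
            return trips
    raise ValueError("no T up to max_trips reaches the required phi")


def placeholder_phi(nu2):
    """Hypothetical phi' curve that decreases with nu_2 (not table values)."""
    return 2.0 + 1.0 / nu2


# Hypothetical inputs: delta = 0.4, sigma_T^2 = 0.2, sigma_H^2 = 0.5, k = 8.
trips_a = required_trips("A", 0.4, 0.2, 0.5, 8, placeholder_phi)
trips_b = required_trips("B", 0.4, 0.2, 0.5, 8, placeholder_phi)
```

Passing the table lookup in as a function keeps the search separate from the choice of significance level and power, so the same code can be reused with any tabulated or computed $\phi'$.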
Figure 2 shows a plot of CV (coefficient of variation) versus mean cash value per trawl hour for both flatfish and rockfish fishing, for the different gear types represented in the Oregon-Washington study. The CVs were roughly constant across various values for mean cash value per trawl hour. The best known distributions for positive random variables for which the CVs are constant (i.e. independent of the means) are the log-normal and the gamma distributions. Use of the log-normal distribution is consistent with a statistical model in which catchability fluctuations are much larger than the sampling error in determining the cash value of tows (see Butterworth and Andrew, 1987), which is plausible for this data set. This choice is also consistent with the choice made by de la Mare (1986) following a comprehensive analysis of catch per unit effort data. We therefore followed de la Mare's prescription by applying a logarithmic transformation to the tow cash value response variable. No transformation was used for the mean length random variable. Table 1 shows estimates of the variance components ($\mu$, $\sigma_T^2$ and $\sigma_H^2$) of the natural logarithm of tow cash value, $\log(C_m)$, obtained from the study of the Oregon-Washington groundfishery. The data have been split into two
[Figure 2: plot not reproduced; x-axis: MEAN, y-axis: CV]
Fig. 2. Data from a discard study of the Pacific groundfishery (Pikitch, 1986, 1987). The x-axis shows the mean value of a tow in the rockfish portion of the fishery (⊙) in dollars, plotted against the coefficient of variation (CV, dimensionless) as a percentage for different cod end mesh sizes. The superscript numbers in the figure refer to the lower value for ranges of mesh sizes at 0.5-inch intervals. Data for the flatfish portion of the fishery are represented by △.
TABLE 1
Variance component estimates for the logarithm of tow cash value (in dollars per hour of tow time) in the flatfish and rockfish portions of the Pacific groundfish fishery. These estimates are based on data (Pikitch, 1986) from 139 fishing trips, with a total of 376 rockfish tows and 502 flatfish tows

Variance        Rockfish   Flatfish
$\sigma_T^2$    0.391      0.170
$\sigma_H^2$    1.368      0.454
TABLE 2
Variance component estimates for the mean length per tow for eight species of importance in the Pacific groundfishery - Arrowtooth Flounder Atheresthes stomias (Arrowtooth), Petrale Sole Eopsetta jordani (Petrale), English Sole Parophrys vetulus (English), Dover Sole Microstomus pacificus (Dover), Sablefish or Black Cod Anoplopoma fimbria (Sablefish), Pacific Ocean Perch Sebastes alutus (P.O.P.), Yellowtail Rockfish Sebastes flavidus (Yellowtail) and Widow Rockfish Sebastes entomelas (Widow). M. = males; F. = females. These estimates are based on data (Pikitch, 1986) from 139 fishing trips, with a total of 376 rockfish tows and 502 flatfish tows

Sex/species      Mean length   $\sigma_T^2$   $\sigma_H^2$
M. Arrowtooth    37.05         39.87          0.67
F. Arrowtooth    39.62         52.18          1.06
M. Petrale       32.58          9.48          3.84
F. Petrale       38.90         10.13         10.33
M. English       26.18         10.38          3.02
F. English       31.18                        8.66
M. Dover         34.68          4.28          2.89
F. Dover         39.09          8.79          7.33
M. Sablefish     50.45         15.50         10.59
F. Sablefish     52.71         15.19         28.52
M. P.O.P.        37.32          3.65          0.82
F. P.O.P.        39.16          3.25          1.96
M. Widow         39.01          0.82          7.54
F. Widow         40.71          4.28          7.59
M. Yellowtail    42.78          2.44          3.72
F. Yellowtail    45.06          6.39          2.01
categories of fishing strategy: fishing targeting either rockfish, or flatfish, at the edge of the continental shelf. Table 2 gives estimates of the mean and the variance components of the untransformed values of mean lengths per tow; $\mu$, $\sigma_T^2$ and $\sigma_H^2$, for each sex-species category (no distinction between flatfish- and rockfish-directed catches here). The sample sizes (trips) needed for detecting changes in the logarithm of the tow cash value are given in Table 3, for fishing targeted at either flatfish
TABLE 3
Number of vessel trips required to reject the null hypothesis, "There is no mesh size effect", at $\alpha = 0.10$ (one-tailed) with a power of $(1-\beta) = 0.80$ for the logarithm of tow cash value, in the flatfish and rockfish portions of the Pacific groundfishery for Designs A and B. It was assumed that there are 8 tows per trip for flatfish trips and 16 tows per trip for rockfish trips. The column headed alternative shows the response for one net relative to the other net. Although either an increase or a decrease is considered for the alternative hypothesis, we consider that the direction of change is predictable, given the difference in the treatment levels, therefore the one-tailed test is used

Alternative         Design A   Design B
Fishing strategy: flatfish
1/1.05 or 1.05      3444       436
1/1.10 or 1.10       902       116
1/1.15 or 1.15       418        52
1/1.20 or 1.20       246        32
1/1.50 or 1.50        52         7
1/2 or 2              18         3
Fishing strategy: rockfish
1/1.05 or 1.05      7240       649
1/1.10 or 1.10      1896       168
1/1.15 or 1.15       882        78
1/1.20 or 1.20       518        47
1/1.50 or 1.50       104        10
1/2 or 2              38         4
or rockfish. The range of possible changes under the alternative hypothesis (between a control and test net) considered is: Δ = ±0.049, ±0.095, ±0.140, ±0.182, ±0.405 and ±0.693, which are equivalent to changes in tow cash value of 1/1.05 or 1.05, 1/1.10 or 1.10, 1/1.15 or 1.15, 1/1.20 or 1.20, 1/1.50 or 1.50 and 1/2 or 2. These results were calculated using 16 tows per rockfish trip and 8 tows per flatfish trip, as was indicated from the Oregon-Washington study data (Pikitch, 1986). Tables 4 and 5 show the sample sizes needed for detecting mean length changes for percentage changes under the alternative hypothesis of Δ = 1%, 2%, ..., 10%. A comparison of Table 3 with Tables 4 and 5 shows that the tow cash value response variable is by far the more demanding in terms of sampling effort when one compares tow cash value changes of 20% with length changes of ~3%. This was expected, given the greater variability of catch per unit effort data compared with fish length data.

The other important result is the substantial reduction in sample size which results from using a randomized block design, comparing Design B with Design A. Four- to five-fold reductions are common in Table 3, but order of magnitude reductions occur in Tables 4 and 5. It was estimated that for the proposed gear study in the Pacific groundfishery (see Introduction), a total of 40 rockfish and 40 flatfish trips could be conducted. For the flatfish tow cash value, Design A requires about a 50% change to reject the null hypothesis, but for rockfish a change of between 50 and 100% is needed to reject the null hypothesis with 40 trips. The required sample sizes with Design B are much smaller, and changes in tow cash value of 20 and 50% for flatfish and rockfish, respectively, would be detectable given the expected sample size of 40 trips. The required mean length changes for Design A are between ~3 and 10%, but decrease to only 1 or 2% with Design B.

TABLE 4
Number of trips required to reject the null hypothesis of no treatment effect at α = 0.10 (one-tailed) with a power of (1 - β) = 0.80 for the mean length per tow of the species and sexes (M. = male; F. = female) indicated using Design A with eight tows per trip. P is the percentage change between two nets with different mesh sizes under the alternative hypothesis

Sex/species         1%     2%     3%     4%     5%     6%     7%     8%     9%    10%
M. Arrowtooth    10528   2632   1168    658    420    292    214    164    130    104
F. Arrowtooth    12054   3012   1338    752    482    334    246    188    148    120
M. Petrale        3394    848    376    212    136     94     70     54     44     36
F. Petrale        2730    682    302    170    108     78     58     44     36     30
M. English        5676   1418    630    354    226    156    116     88     72     58
F. English           .      .      .      .      .      .      .      .      .      .
M. Dover          1394    348    154     86     58     40     30     24     20     16
F. Dover          2296    574    254    142     92     66     48     38     30     24
M. Sablefish      2390    596    264    148     96     68     50     40     32     26
F. Sablefish      2440    610    270    152     98     70     52     40     32     26
M. P.O.P.          974    242    108     62     40     28     22     18     14     12
F. P.O.P.          824    206     92     54     34     24     18     14     12     10
M. Widow           418    104     48     28     18     14     10      8      8      6
F. Widow          1140    284    126     72     48     34     26     20     16     14
M. Yellowtail      574    142     66     38     24     18     14     10     10      8
F. Yellowtail     1182    294    130     76     50     34     26     20     16     14

TABLE 5
Number of trips required to reject the null hypothesis "There is no change in mean length" at α = 0.10 (one-tailed) with a power of (1 - β) = 0.80 for the mean length per tow of the species and sexes (M. = male; F. = female) indicated using Design B with eight tows per trip (four per treatment type)

Sex/species         1%     2%     3%     4%     5%     6%     7%     8%     9%    10%
M. Arrowtooth       11      3      2      1      1      1      1      1      1      1
F. Arrowtooth       15      4      2      2      1      1      1      1      1      1
M. Petrale          81     20      9      4      3      2      2      2      1      1
F. Petrale         154     38     17      6      4      4      3      3      2      2
M. English          99     24     11      4      3      2      2      2      2      2
F. English           .      .      .      .      .      .      .      .      .      .
M. Dover            54     13      6      3      2      2      1      1      1      1
F. Dover           108     27     12      4      4      3      2      2      2      2
M. Sablefish        94     23     10      4      3      3      2      2      2      2
F. Sablefish       232     58     25      9      6      4      4      3      3      3
M. P.O.P.           13      4      2      1      1      1      1      1      1      1
F. P.O.P.           28      7      4      2      1      1      1      1      1      1
M. Widow           112     28     12      4      4      3      2      2      2      2
F. Widow           103     25     11      4      4      3      2      2      2      2
M. Yellowtail       45     11      5      2      2      2      1      1      1      1
F. Yellowtail       22      5      3      2      1      1      1      1      1      1

Arguments in favor of Design B were presented in preliminary meetings with fishermen and their representatives. They expressed concern that the time spent changing cod ends would severely hamper fishing operations, and it was suggested that the research group consider the feasibility of reducing the number of random cod end changes per trip. The additional number of trips required for a design in which cod ends are randomly changed every second tow instead of every tow, denoted Design B', was calculated. To determine the new variance components of response variables, tow-by-tow data points within either the flatfish or the rockfish category were arranged in pairs of consecutive tows. Data for tows without unique consecutive counterparts were left out of the calculation. A new database was created using the mean tow value for each pair as the fundamental observation. The variance components of the logged pair-mean tow cash values are shown in Table 6.

TABLE 6
Variance component estimates for the logarithm of the mean cash value of consecutive pairs of tows in the flatfish and rockfish portions of the Pacific groundfish fishery

Variance component    Rockfish (pairs)    Flatfish (pairs)
Trip                             0.332               0.133
Tow                              0.726               0.245

The tow component is about one-half of the tow variance component for the unpaired data, suggesting that successive tow responses are roughly independent (the fact that the tow component of variance for the paired data set is slightly higher than one-half of that for the unpaired data set is evidence for a small amount of autocorrelation in the tow-to-tow data, which might be due to skill at maintaining high catch rates at certain times).
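The halving argument for paired tows can be checked with a small numerical sketch. It assumes, purely for illustration, that two consecutive tow residuals share a common variance and a lag-1 correlation `rho`; the function names and the simulation settings are hypothetical and are not taken from the study.

```python
import random
import statistics


def pair_mean_variance(sigma2_tow, rho=0.0):
    """Variance of the mean of two consecutive tow residuals.

    Var((x1 + x2) / 2) = (2 * sigma2 + 2 * rho * sigma2) / 4
                       = sigma2 * (1 + rho) / 2,
    so independent tows (rho = 0) give exactly one-half.
    """
    return sigma2_tow * (1.0 + rho) / 2.0


def implied_lag1_correlation(sigma2_pair, sigma2_tow):
    """Invert the relation above: rho = 2 * (pair variance / tow variance) - 1."""
    return 2.0 * sigma2_pair / sigma2_tow - 1.0


def simulate_pair_variance(sigma2_tow, rho, n_pairs=20000, seed=1):
    """Monte Carlo check using normally distributed residuals (an assumption)."""
    rng = random.Random(seed)
    sd = sigma2_tow ** 0.5
    pair_means = []
    for _ in range(n_pairs):
        first = rng.gauss(0.0, sd)
        # Second residual has the same variance and correlation rho with the first.
        second = rho * first + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, sd)
        pair_means.append((first + second) / 2.0)
    return statistics.pvariance(pair_means)
```

Under this simplified model, a pair-mean tow component slightly above one-half of the unpaired component corresponds to a small positive `rho`, which is the interpretation given in the text.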
TABLE 7
Number of vessel trips required to reject the null hypothesis "There is no mesh size effect", at α = 0.10 (one-tailed) with a power of (1 - β) = 0.80 for the logarithm of tow cash value, for flatfish and rockfish fishing strategies in the Pacific groundfishery for Design B'. A value of 8 tows per trip has been used for the flatfish trips and 16 tows per trip for rockfish trips

Alternative        Design B'
Fishing strategy: flatfish
1/1.05-1.05              475
1/1.10-1.10              134
1/1.15-1.15               57
1/1.20-1.20               33
1/1.50-1.50                7
1/2-2
Fishing strategy: rockfish
1/1.05-1.05              672
1/1.10-1.10              182
1/1.15-1.15               88
1/1.20-1.20               51
1/1.50-1.50               10
1/2-2                      4
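The alternatives in the tow cash value tables are ratios, while the test is applied to the logarithm of tow cash value. The sketch below shows the conversion from a ratio to the log-scale change Δ, and the simple arithmetic for the extra trips under Design B' relative to Design B. The function names are hypothetical; the rockfish trip counts are those listed for Design B in Table 3 and Design B' in Table 7.

```python
import math


def log_effect(ratio):
    """Log-scale change corresponding to a ratio of tow cash values between nets."""
    return math.log(ratio)


def extra_trips(design_b, design_b_prime):
    """Additional trips needed under Design B' for each alternative."""
    return [bp - b for b, bp in zip(design_b, design_b_prime)]


ratios = [1.05, 1.10, 1.15, 1.20, 1.50, 2.0]
# A ratio r and its reciprocal 1/r give +log(r) and -log(r), hence the "±" values.
deltas = [round(log_effect(r), 3) for r in ratios]

rockfish_b = [649, 168, 78, 47, 10, 4]         # Design B, 16 tows per trip
rockfish_b_prime = [672, 182, 88, 51, 10, 4]   # Design B', 16 tows per trip
rockfish_extra = extra_trips(rockfish_b, rockfish_b_prime)
```

The rounded `deltas` agree with the values of Δ quoted in the text.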
Sample sizes for Design B' are shown in Table 7. The extra number of trips required to achieve the same power as for Design B is small, so that a decrease in the number of random cod end changes made per trip, from one every tow to one every second tow, is acceptable.

DISCUSSION
Trips are subject to unpredictable events which often lead to trip termination. Therefore one might often not be able to complete a predefined randomized block at sea. To minimize the loss of usable data, smaller blocks within trips can be devised. The smallest possible block contains one replicate of each treatment type. Such blocks have the dual advantage of minimizing the loss of complete blocks due to trip termination and eliminating the additional interblock variance which would still be present in RCBs at the level of trip. This is illustrated in Fig. 3, which compares the performance of two blocking procedures given an early trip termination.

[Figure 3: Advantages of single replicate blocks. Tow sequences for Design B and Design C over Trips 1, 2, ..., n, with an unforeseen trip termination marked.]
Fig. 3. Schematic comparison of two randomized block designs for testing four treatment types (denoted 1-4). For Design B, blocks are defined at the level of trip, whereas for Design C, each consecutive set of four tows constitutes a block. Both designs result in three complete blocks per trip when all 12 tows are completed. However, in cases where a trip ends early (e.g. after four tows, as illustrated in the figure), Design C will result in a greater number of complete blocks than Design B.

The procedure for completing single replicate blocks requires that samplers be furnished with a pre-defined strategy for dealing with events that arise at sea. Tows without scientific value (e.g. due to hang-ups, damaged nets, etc.) are deemed to be aborted tows which must not be included in the block sequence. In the event of an aborted tow, the sequence of treatment types required to complete the block must be re-randomized so that the skipper does not impart a subjective modification to the strategy for the next tow which depends on the mesh size used in the aborted tow.

In the example of the Pacific groundfish study cited here, we used the response variable, cash value per tow hour, for calculating sample sizes in different designs. Data on the statistical properties of the tow cash value and fish length response variables were available from a previous study in the Oregon-Washington region of the Pacific groundfishery (Pikitch, 1986), in which gear types were represented in an uncontrolled, unbalanced manner. The main point made is that a major reduction in the required sample size is achieved when tows within a vessel trip are treated as a randomized block, compared with when this is not done. The use of smaller blocks within trips minimizes missing values due to trip terminations, and probably leads to additional reductions in variance and in the sample sizes required to reject the null hypothesis, although we did not explicitly show this. The accuracy of the estimates of required sample sizes depends on the accuracy of the estimates of the trip and tow components of variance. Because the ANOVA which was used to estimate these components of variance could not adequately remove the variance due to various gear types in use in the fishery, these variance estimates should be regarded as larger than the actual variance components.
This has most likely resulted in slightly larger sample sizes than would be required in practice. In planning any long-term research project, one has to consider that the experimental design process will be iterative, with the design of each phase updated on the basis of variance estimates from the previous phase. The first study in such a series is the most risky since little might be known about the magnitude of relevant variance components.
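The blocking rules described in this Discussion can be sketched in code. The example below is illustrative only: the function names are hypothetical, and counting complete replicates as the smallest number of completed tows over treatment types is an assumption made here to compare a trip-level randomization with single replicate blocks when a trip ends early.

```python
import random


def single_replicate_blocks(treatments, n_blocks, rng):
    """Tow sequence made of consecutive blocks, each holding every treatment once."""
    sequence = []
    for _ in range(n_blocks):
        block = list(treatments)
        rng.shuffle(block)  # independent random order within each block
        sequence.extend(block)
    return sequence


def trip_level_sequence(treatments, replicates, rng):
    """Tow sequence randomized over the whole trip (block defined at trip level)."""
    sequence = list(treatments) * replicates
    rng.shuffle(sequence)
    return sequence


def complete_replicates(completed_tows, treatments):
    """Number of full sets of all treatment types among the completed tows."""
    return min(completed_tows.count(t) for t in treatments)


def rerandomize_remaining(remaining_treatments, rng):
    """After an aborted tow, draw a fresh random order for the treatments still
    needed to complete the block (the aborted treatment is still needed)."""
    order = list(remaining_treatments)
    rng.shuffle(order)
    return order


rng = random.Random(0)
treatments = [1, 2, 3, 4]
small_blocks = single_replicate_blocks(treatments, 3, rng)   # 12 planned tows
whole_trip = trip_level_sequence(treatments, 3, rng)         # 12 planned tows

# Trip terminated after four tows: single replicate blocks always keep one
# complete replicate, whereas the trip-level order keeps at most one.
kept_small = complete_replicates(small_blocks[:4], treatments)
kept_whole = complete_replicates(whole_trip[:4], treatments)
```

With single replicate blocks, the number of complete replicates retained after an early termination is the number of completed tows divided by the number of treatment types, rounded down; with a trip-level randomization it can be lower.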
ACKNOWLEDGMENTS
We thank Don Gunderson and two anonymous reviewers for comments on an earlier draft. This work was supported by Saltonstall-Kennedy Grant NA88-ABH-00017 and Washington Sea Grant Program Grant NA86AA-DSG044 to E.K. Pikitch, and by NMFS Northwest and Alaska Fisheries Center, the West Coast Fisheries Development Foundation and NMFS Southwest Region.
REFERENCES
Beverton, R.J.H. and Holt, S.J., 1957. On the dynamics of exploited fish populations. Fish. Invest. London, (2) (19): 533 pp.
Butterworth, D.S. and Andrew, P.A., 1987. On the appropriateness of approaches used at ICSEAF to obtain TAC estimates from catch-effort data for hake. Collect. Sci. Pap. Int. Comm. SE Atl. Fish., 14: 161-192.
Cochran, W.G. and Cox, G.M., 1957. Experimental Designs. 2nd Edn. Wiley, New York, 611 pp.
De la Mare, W.K., 1986. Further consideration of the statistical properties of catch and effort data, with particular reference to fitting population models to indices of relative abundance. Rep. Int. Whal. Comm., 36: 419-423.
Gulland, J.A., 1963. Approximations to the selection ogive, and their effect on the predicted yield. In: The Selectivity of Fishing Gear, ICNAF Spec. Publ. 5, pp. 102-105.
Hodder, V.M. and May, A.W., 1964. The effect of catch size on the selectivity of otter trawls. ICNAF Res. Bull., 1: 29-35.
Jensen, A.C. and Hennemuth, R.C., 1966. Size selection and retainment of silver and red hake in nylon codends of trawl nets. ICNAF Res. Bull., 3: 86-101.
Johnson, N.L. and Leone, F.C., 1977. Statistics and Experimental Design in Engineering and the Physical Sciences. Vols. I and II. Wiley, New York, 1082 pp.
Margetts, A.R., 1956. A mesh experiment with sisal, cotton and nylon codends. Int. Counc. Expl. Sea. C.M. 1956, Comp. Fish. Comm., No. 73.
Margetts, A.R., 1959. A preliminary report on the International Arctic Mesh Experiment. Int. Counc. Expl. Sea. C.M. 1959, Comp. Fish. Comm., No. 81.
Myers, J.L., 1972. Fundamentals of Experimental Design. 2nd Edn. Allyn and Bacon, Boston, 465 pp.
Otterlind, G., 1959. Report on Swedish trawl experiments in the southern Baltic. Int. Counc. Expl. Sea. C.M. 1959, Comp. Fish. Comm., No. 120.
Pearson, E.S. and Hartley, H.O., 1962. Biometrika Tables for Statisticians. Vol. 1. 2nd Edn. Cambridge.
Peng, K.C., 1967. The Design and Analysis of Scientific Experiments. Addison-Wesley, London.
Pikitch, E.K., 1986. Impacts of management regulations on the catch and utilization of rockfish in Oregon. Proceedings of the International Rockfish Symposium, 20-22 October, at Anchorage, Alaska, Alaska Sea Grant Rep. No. 87-2.
Pikitch, E.K., 1987. Use of a mixed-species yield-per-recruit model to explore the consequences of various management policies for the Oregon flatfish fishery. Can. J. Fish. Aquat. Sci., 44(Suppl. II): 349-359.
Pope, J.A., 1963. A note on experimental design. In: The Selectivity of Fishing Gear. ICNAF Spec. Publ. 5, pp. 175-184.
Pope, J.A., Margetts, A.R., Hamley, J.M. and Akyuz, E.F., 1975. Manual of methods for fish stock assessment. FAO Fish. Tech. Pap. No. 41.
Robertson, J.H.B., 1983. Square mesh cod-end selectivity experiments on whiting (Merlangius merlangus (L.)) and haddock (Melanogrammus aeglefinus (L.)). Int. Counc. Expl. Sea. C.M. 1984/B:30, Fish Capture Comm.
Robertson, J.H.B., Emslie, D.C., Ballantyne, K.A. and Chapman, C.J., 1986. Square and diamond mesh trawl codend selection trials on Nephrops norvegicus (L.). Int. Counc. Expl. Sea. C.M. 1986/B:12, Fish Capture Comm.
Scheffé, H., 1959. The Analysis of Variance. Wiley, New York, 477 pp.
Smolowitz, R.J., 1983. Mesh size and the New England groundfishery - applications and implications. NOAA Tech. Rep. NMFS SSRF-771.
Vaga, R.M. and Pikitch, E.K., 1988. West Coast Groundfish Mesh Size Study. Phase I Final Report. Submitted to the program officer of Saltonstall-Kennedy Grant No. NA-86-ABH00035, 68 pp.