Chapter | fourteen
The Nuclear Genome: Neutral and Adaptive Markers in Fisheries Science
Stefano Mariani (1), Dorte Bekkevold (2)
(1) School of Environment and Life Sciences, University of Salford, Manchester, United Kingdom
(2) National Institute of Aquatic Resources, Technical University of Denmark, Silkeborg, Denmark
CHAPTER OUTLINE
Abbreviations
14.1 Introduction
    14.1.1 Structure, Variability, and Size of the Nuclear Genome
    14.1.2 Genetic Patterns and Processes and the Relevance to Fisheries
    14.1.3 Neutral versus Adaptive Variation
14.2 Methodology: The Nuclear “Tool Kit” for Stock Identification
    14.2.1 Microsatellites
    14.2.2 Restriction-Assisted Methods: From AFLPs to Reduced Genomic Representation
    14.2.3 Single Nucleotide Polymorphisms
    14.2.4 Candidate Gene Approach
    14.2.5 Transcriptomics and Proteomics
14.3 Matching Each Question with the Right Tool
    14.3.1 Stock Structure
    14.3.2 Mixed Stock Analysis and Individual Assignment
    14.3.3 Seascape Genetics
    14.3.4 Effective Population Size
14.4 Conclusions
Acknowledgments
References

Stock Identification Methods. http://dx.doi.org/10.1016/B978-0-12-397003-9.00014-X Copyright © 2014 Elsevier Inc. All rights reserved.
298 Nuclear Genomic Markers
ABBREVIATIONS
AB    Ascertainment bias
AFLP  Amplified fragment length polymorphism
bp    Base pairs
cDNA  Complementary DNA
CU    Conservation unit
DNA   Deoxyribonucleic acid
EST   Expressed sequence tag
ESU   Evolutionarily significant unit
FST   Fixation index expressing the variance in allele frequencies among populations
GBS   Genotyping by sequencing
GIS   Geographic information system
GSI   Genetic stock identification
HSP   Heat-shock protein
IA    Individual assignment
IbD   Isolation by distance
IUU   Illegal, unreported, unregulated (used preceding the term “fisheries”)
m     Migration rate
MHC   Major histocompatibility complex
mRNA  Messenger RNA
MSA   Mixed stock analysis
mtDNA Mitochondrial DNA
MU    Management unit
Nb    Effective number of breeders
Nc    Census population size
Ne    Effective population size
NGS   Next-generation sequencing
PCR   Polymerase chain reaction
RE    Restriction enzyme
RGR   Reduced genomic representation
RNA   Ribonucleic acid
SNP   Single nucleotide polymorphism
SSR   Simple sequence repeat
STR   Short tandem repeat
14.1 INTRODUCTION
14.1.1 Structure, Variability, and Size of the Nuclear Genome
The genome of a teleost fish generally contains over one billion nucleotides, with elasmobranchs exhibiting up to four times that size (Gregory et al., 2007). The book you are reading contains over one million letters, so you would need 1000 books of comparable size to put together a string of information that resembles the amount of potential genetic polymorphisms in one single animal! If we then consider that one shoal of Atlantic herring (Clupea harengus) may naturally be formed by millions of individuals and that the world’s major fisheries are sustained by over 800 different species (Anderson, 2003) with varying distributions, sizes, and life histories, it becomes apparent that the wealth of genetic information available for scrutiny is beyond staggering.

Typically, the enormous information load in the nuclear genome varies greatly in its nature and significance. Recent findings of the Encyclopedia of
DNA Elements (ENCODE) project have shown that 80% of the human genome serves some biochemical purpose (Pennisi, 2012), highlighting the heavy involvement of untranslated RNAs and epigenetic mechanisms in the regulation of gene expression. However, while the population-level consequences of these new discoveries remain to be assessed, the vast majority of nuclear DNA variants are still assumed, in a population genetic context, to be of negligible functional significance to the organism.

Nuclear genes that encode proteins are generally substructured into two sets of regions, exons and introns: the former contain the information for the production of RNAs and proteins, while the latter are mainly involved, to varying degrees, in the regulation of transcription from DNA to RNA and in the splicing of the final RNA transcripts (Figure 14.1). A notable consequence of such diversity of sequence roles is that different genomic regions exhibit different evolutionary constraints and evolve at remarkably different rates (Zhang and Hewitt, 2003): the exons of the most functionally important genes typically show low substitution rates, in the region of $10^{-8}$ to $10^{-9}$ mutations per base per generation (Roach et al., 2010), while the less constrained repetitive noncoding regions exhibit mutation rates of $10^{-3}$ to $10^{-4}$ per site per generation (Estoup and Angers, 1998). As a result, every genome will have “hot spots” and “cold spots” of variability, accrued at different speeds over the evolutionary history of a species.

[Figure 14.1 image not reproduced. Labels, from top to bottom: chromosome of $1.5 \times 10^{8}$ nucleotide pairs, containing about 3000 genes; 0.5% of chromosome, containing 15 genes (Gene 2 and Gene 13 labeled, separated by intergenic regions); one gene of $10^{5}$ nucleotide pairs (5′ regulatory region, Exon 1, Intron 7, Exon 11, 3′ untranslated region); DNA transcription to primary RNA transcript (5′ to 3′); RNA splicing to mRNA (5′ to 3′).]

FIGURE 14.1 Schematic representation of the gene structure along a eukaryotic chromosome: the singled-out gene is composed of 12 exons, 11 introns, a 5′ regulatory region, and a 3′ untranslated region. The final (mature) spliced mRNA only contains exon information. From Wirgin and Waldman (2005). (For color version of this figure, the reader is referred to the online version of this book.)

For the purpose of stock identification, and in any analogous applications that hinge on population genetics theory, there are three main factors that must be considered for the choice and use of molecular markers. First, the type of marker must be chosen so as to provide a measure of the biological processes that are thought to primarily determine the pattern of interest: nonfunctional genomic regions are employed to investigate genetic drift and gene flow (“neutral markers”), while functional regions can be used to measure the influence of natural selection (“adaptive markers”). Second, the variability of the chosen marker must be sufficient to provide the statistical power needed to detect the effect of interest. Third, the number of markers used must guarantee a realistic and unbiased view of the process investigated.

Over the past decade, significant strides have been made that put scientists in an optimal position to choose the type, the variability, and the number of genetic markers for stock identification tasks. The ultimate goal of this chapter is to offer a clear view of the advantages and limitations of nuclear markers and to provide a robust framework to harness the former and meet the challenges presented by the latter (Table 14.1).
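The contrast between the two mutation-rate ranges quoted in this subsection can be made concrete with a little arithmetic. The sketch below is only an illustration: the population size is a hypothetical value chosen for readability, and the two rates are single points picked from the ranges given in the text, not estimates for any particular species.

```python
# Order-of-magnitude rates taken from the ranges quoted in the text
# (per site or locus, per generation). Picking one value from each
# range is an illustrative assumption.
EXON_RATE = 1e-8       # constrained exon sites: 1e-8 to 1e-9
MICROSAT_RATE = 1e-4   # repetitive noncoding regions: 1e-3 to 1e-4


def new_mutations_per_generation(mu: float, n_individuals: int) -> float:
    """Expected number of new mutations at one site per generation
    in a diploid population (two gene copies per individual)."""
    return 2 * n_individuals * mu


def generations_per_mutation(mu: float) -> float:
    """Average number of generations between mutations along one lineage."""
    return 1.0 / mu


if __name__ == "__main__":
    n = 1_000_000  # hypothetical population size, for illustration only
    for label, mu in [("exon site", EXON_RATE), ("microsatellite", MICROSAT_RATE)]:
        print(
            f"{label}: {new_mutations_per_generation(mu, n):g} new mutations "
            f"per generation, one mutation per lineage every "
            f"{generations_per_mutation(mu):g} generations"
        )
```

Under these assumed values the two classes of sequence differ by four orders of magnitude in the rate at which new variants arise, which is the sense in which the text speaks of “hot spots” and “cold spots”.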
14.1.2 Genetic Patterns and Processes and the Relevance to Fisheries
When genetic markers were first introduced in the realm of fisheries science, they rapidly gained considerable popularity, as they seemed to guarantee a new, reliable way to infer relatedness among individuals, populations, and species, which was firmly rooted in the established biological processes of inheritance (Utter et al., 1974). Population genetics had been a fertile ground of investigation since the early 1900s, hence it provided a solid theoretical framework against which to contrast and interpret empirical observations. Pioneering approaches based on phenotype had to contend with greater uncertainties, owing to the multiple, interacting forces underlying phenotypic variation. Genetics, on the other hand, appealed for the same fundamental reasons it still appeals today: its signals, estimates, and indices all ultimately depend on the transmission of alleles across generations, something that is universally true for every living organism (Avise, 2004).

Unfortunately, it became apparent that genetics was not the silver bullet that many had hoped it to be, and this fact still represents an awkward obstacle for the implementation of useful genetic information into management strategies (discussed in Waples et al., 2008; Sagarin et al., 2009). In general terms, with the exception of very few straightforward questions, such as the identification of species using DNA “barcodes” (Ward et al., 2005; Antoniou and Magoulas, 2013), genetic methods cannot always provide a single, clear-cut, unambiguous answer (something like a magical threshold number) in response to questions on stock boundaries; this is a fact that is often frowned upon by fisheries managers. Marine fish populations, especially those of high commercial importance, for which biological evidence is most keenly sought, are typically very large, mobile, and distributed over vast stretches of ocean. Thus even the best possible sample collection is inherently less accurate demographically than those normally achievable with terrestrial and freshwater species. Furthermore, rapid advances in biotechnologies and bioinformatics mean that researchers have at their disposal an ever increasing bulk of genomic information and
more sophisticated computational methods to analyze it. This biotechnological “arms race” is driven partly by the need for a more in-depth understanding of biological processes and partly by the need to develop methods that can satisfy the pressing demands of fisheries management. As a result, periodic changes in perspectives and approaches are inevitable, but they make it difficult for managers and policy makers to keep up to date with the latest developments.

The issue of “large population size” is fundamental to understanding the limitations of genetic approaches in stock identification. In population genetics, a key parameter is the effective population size, Ne, which represents the size of an idealized population that exhibits the same rate of random genetic drift and inbreeding as the natural population under consideration (Wright, 1931), and can approximately be seen as the number of successful breeders in a population across generations. Ne interacts with the rate of migration, m, to determine the degree of differentiation between populations, traditionally estimated by the FST index, which is an expression of the amount of allelic variance between populations (Wright, 1965). In the absence of migration and mutation, FST approximately measures the effect of random genetic drift between two populations, which is inversely proportional to Ne and increases with time following the formula $F_{ST} = 1 - \left(1 - \frac{1}{2N_e}\right)^{t}$, where t represents the number of generations. Although the ratio between Ne and Nc, the census size of the population, can be rather low in marine fish (Hauser and Carvalho, 2008), Ne still tends to be large enough (>1000) for genetic drift to be weak. Figure 14.2(a) illustrates how even very different migration rates (between 0.2 and 0.05) do not cause big changes in FST values when Ne is large. Figure 14.2(b) shows instead that, over short timescales, notable differences in FST can be detected only in the absence of migration and at particularly low Ne levels, a situation that is probably rare and restricted to cases of recent colonization and “founder effect” phenomena. The main implication here is that a signal of genetic distinctiveness due to random drift can only be detected between populations that have been largely independent demographically for a considerable number of generations.

[Figure 14.2 image not reproduced. Panels (a) and (b) both plot genetic drift (FST) on the y-axis against Ne on the x-axis.]

FIGURE 14.2 Relationship between effective population size (Ne) and random genetic drift (as measured by FST), under three different rates of migration, m (panel a), and with no migration but after two different time intervals (panel b). In panel a, the solid line is m = 0.2, the dashed line is m = 0.1, and the dotted line is m = 0.05: the horizontal bar at FST = 0.01 shows that the value that can be detected between populations with Ne ≈ 100 and connected by m = 0.2 is comparable to that estimated when Ne ≈ 500 and m = 0.05. In panel b, the vertical bar highlights that in small populations (i.e., Ne ≈ 50), drift alone can produce FST values above 0.02 after just three generations (solid line) and nearly 0.05 after 10 generations (dashed line). (For color version of this figure, the reader is referred to the online version of this book.)

The above issues are applicable to molecular markers that are subjected exclusively to “neutral” evolutionary forces, such as gene flow and random genetic drift, and this has long been the traditional assumption for most applications in fish population biology (Ihssen et al., 1981). However, as highlighted in the first section, the assumption of neutrality is not realistic for some regions of the genome. This was one of the reasons for the progressive demise of allozyme markers and the advent of mitochondrial DNA (Karl and Avise, 1992) and nuclear microsatellite markers (Wright and Bentzen, 1994). Genetic variability at functional enzyme loci was often shown to obfuscate patterns of genetic structuring due to balancing selection, and in marine species, where levels of genetic differentiation are already comparatively low, this was seen as a huge limitation. Mitochondrial DNA (mtDNA) and microsatellites have therefore been the mainstay of fisheries genetics for the past two decades, playing a crucial role in unveiling important stock delineations across the world’s oceans (Hauser and Carvalho, 2008) and generally satisfying the classical population genetics assumptions for neutral markers (Waples, 1987; McKusker and Bentzen, 2010), albeit with notable exceptions (Nielsen et al., 2006; Coscia et al., 2012; and see Section 14.2.1).
Yet, in many species, populations, and stocks for which life history information seems to suggest stock subdivision (e.g., Atlantic herring), the neutral approach appears to have hit a wall, faced with the inevitable time lag between the short-term demographic dynamics relevant to fisheries management (ecological paradigm) and the long-term evolutionary dynamics (evolutionary paradigm) driving the fate of a species or metapopulation (Waples and Gaggiotti, 2006). In such instances, neutral genetic markers are likely to prove ineffective in resolving stock structure; hence a new trend has recently emerged, which emphasizes the identification of “adaptive” markers that are under diversifying selection and may reflect distinctive features of local stocks.
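The drift formula given in this section is easy to explore numerically. The following is a minimal sketch, not the authors' own calculation. The function `fst_drift` implements the formula stated in the text; the function `fst_island` uses the standard island-model equilibrium approximation, FST ≈ 1/(4·Ne·m + 1), which is an assumption introduced here because the text does not state which expression underlies Figure 14.2(a). It does, however, give values close to the 0.01 quoted in the figure caption.

```python
def fst_drift(ne: float, t: int) -> float:
    """F_ST after t generations of drift alone (no migration, no mutation),
    following the formula in the text: 1 - (1 - 1/(2*Ne))**t."""
    return 1.0 - (1.0 - 1.0 / (2.0 * ne)) ** t


def fst_island(ne: float, m: float) -> float:
    """Equilibrium F_ST under the island-model approximation 1/(4*Ne*m + 1).
    Assumption: this expression is not given in the text; it is the
    textbook approximation for migration-drift balance."""
    return 1.0 / (4.0 * ne * m + 1.0)


if __name__ == "__main__":
    # Drift alone in a small population over a few generations.
    print(f"Ne=50,   t=3:  F_ST = {fst_drift(50, 3):.4f}")
    # Drift alone in a large population: the signal is far weaker.
    print(f"Ne=1000, t=10: F_ST = {fst_drift(1000, 10):.4f}")
    # Two very different Ne/m combinations give similar equilibrium values.
    print(f"Ne=100, m=0.20: F_ST = {fst_island(100, 0.20):.4f}")
    print(f"Ne=500, m=0.05: F_ST = {fst_island(500, 0.05):.4f}")
```

The last two lines illustrate why a single FST value cannot, on its own, separate the effect of population size from that of migration: quite different combinations of Ne and m produce nearly the same number.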
14.1.3 Neutral versus Adaptive Variation
Neutral DNA sequence variation is subject solely to stochastic processes, such as mutation and genetic drift. In this case, whether an individual possesses one variant (or allele) or another will have no effect on its survival or the number of its offspring (i.e., fitness). By definition, natural selection thus has no effect on neutral genetic markers. The term adaptive variation refers to variants of a gene of adaptive value, that is, those influencing how the fitness of individuals is affected by natural selection. Adaptive variation therefore underlies the evolutionary processes that determine the phenotypic traits enabling individuals to cope with and adapt to local environments.

Following the heyday of analysis of neutral variation in the 1990s and 2000s, the field of population genetics is undergoing a shift toward the analysis of sequence variation of functional, adaptive significance. There has thus been an increasing interest in studying the genetics of adaptation for fundamental evolutionary studies (see McKay and Latta, 2002; Luikart et al., 2003; Vasemägi and Primmer, 2005), and also with the aim to identify genetic markers under diversifying selection (Beaumont, 2005; Schlötterer and Dieringer, 2005; Storz, 2005; Joost et al., 2007). One reason for the interest in studying adaptive variation directly is that neutral genetic variation is often poorly correlated with variation at ecologically relevant traits (Merilä and Crnokrak, 2001; McKay and Latta, 2002) and hence may fail to provide insights into some parameters of importance for defining conservation and management units. This is, for instance, expected to be particularly prominent in many marine organisms characterized by demographically “open” populations of large size (large Ne), where there is the potential for natural selection to determine adaptive divergence even in the face of high gene flow (Nielsen et al., 2009a). Accordingly, since the strength of directional selection on phenotypic traits can be locally intense (Heino, 2013), the identification of adaptive polymorphisms associated with specific areas and stocks may indirectly help identify population units at temporal scales relevant to fisheries management, which might go undetected by neutral markers (Waples et al., 2008).
Population genomics can be broadly defined as the study of microevolutionary phenomena at the population level through the use of large numbers of molecular markers (generally between hundreds and hundreds of thousands) spanning either the entire genome or at least significant portions of the genome of an organism. Recent advances in ultra-high-throughput DNA sequencing now allow the rapid generation of large amounts of genomic data in nonmodel species (Davey et al., 2011), hence offering researchers nearly unlimited opportunities to choose markers in desired genomic regions in potentially any species of interest. So far, most population genomics studies on fishes have examined correlations between putatively adaptive genomic markers and specific environmental factors. However, examples already exist of studies that were able to invoke causal links between specific genomic regions and important fitness-related traits (Hohenlohe et al., 2010; Miller et al., 2011; Chaoui et al., 2012), albeit so far only in species for which preexisting genomic information is substantial.

Neutral and adaptive changes are both integral components of the microevolutionary process; thus, neutral and adaptive markers are both useful to understand the different mechanisms that underlie population divergence. Although their roles in stock identification have often been perceived as mutually exclusive (Utter and Seeb, 2010), here we attempt to show that greater benefits can be achieved by looking at population structure from both standpoints.
14.2 METHODOLOGY: THE NUCLEAR “TOOL KIT” FOR STOCK IDENTIFICATION
14.2.1 Microsatellites
Short tandem repeats (STR), or simple sequence repeats (SSR), or, more simply, “microsatellites” are small (tens to hundreds of base pairs long) segments of repetitive, noncoding DNA found in every eukaryote genome. They usually contain repeat motifs constituted by two (e.g., ... [AT]n ...) to six (e.g., ... [ATTCTG]n ...) base pairs (bp), where n is the number of times a tandem repeat is found in a given allele; di- (2 bp), tri- (3 bp), or tetranucleotide (4 bp) loci are most commonly employed. Depending on genome size and relative base content (AT-rich genomes seem to have more tandem repeats), there can be between thousands and hundreds of thousands of microsatellites in one genome, which generally leaves ample opportunities for choice. Loci with complex sequence repeats (e.g., ... [AT]n[CATG][CAT]n′ ...) tend to be avoided owing to ambiguity in interpreting observed size variations, as do fragments that are longer than 350 to 400 bp, owing to inefficient amplification and/or scoring.

Methods for identifying microsatellite sequences in the genome used to be labor intensive and time consuming (Zane et al., 2002), but next-generation sequencing (NGS, see next sections) techniques can rapidly generate vast amounts of high-quality sequence data, and thousands of candidate loci can be isolated in a few days or weeks thanks to powerful bioinformatic tools (Gardner et al., 2011). In studies that are severely limited by available funds, it is possible to use marker loci previously characterized for different species (generally of the same genus but often from any member of the family), which will “cross amplify” in the target organism.
For instance, many genetic studies on Atlantic herring have successfully employed microsatellite loci developed for Pacific herring (Clupea pallasii), and studies on several sea breams (family Sparidae) have benefitted from many marker loci isolated in a range of species from different genera.

Microsatellite loci are analyzed by means of PCR amplification, primed by species-specific oligonucleotide sequences (15 to 25 bp) annealed to the nonrepetitive flanking regions of the locus. Subsequently, the fluorescently labeled amplified fragments are subjected to capillary electrophoresis and their size is accurately estimated through laser detection in an automated sequencer. Alleles with a smaller number of tandem repeats will be shorter, migrate faster, and be detected by the laser before the fragments with greater numbers of repeats (Figure 14.3). Both parental alleles at each locus are detectable (codominance): if the individual is homozygous, only one peak will be visible.

FIGURE 14.3 A chromatogram obtained through a capillary-based automated fragment analyzer, which shows the screening of two individuals (1 and 2) at five microsatellite loci (A, B, C, D, E). Values on the y-axis express the amount of PCR product obtained for a given allele. Progressive numbers along the x-axis refer to the length of the fragments expressed in base pairs (bp), with the shortest fragments on the left and the longest on the right. Locus A is homozygous for the same allele in both individuals; loci B and E are both heterozygous, but for different alleles; loci C and D are heterozygous in individual 1 and homozygous in individual 2. The use of different fluorescent dyes allows pooling fragments of overlapping sizes (like C and D) in the same reaction. Image courtesy of Debbi Pedreschi, University College Dublin. (For color version of this figure, the reader is referred to the online version of this book.)

This apparent simplicity can be affected by a number of disturbances that the researcher must be familiar with. Allele stuttering is the production of multiple peaks for the same allele, caused by strand slippage during DNA synthesis in the PCR; owing to polymerase bias, the last of the peaks generally corresponds to the true allele, but heterozygotes with similar-sized alleles can be difficult to score. Allele dropout is the underamplification of one of the two alleles (often the larger of the two), due to low concentration, poor quality of the template, or simply PCR bias when one allele is considerably larger than the other. “Null alleles” are caused by the failed amplification of an allele due to a mutation in the primer region, which prevents the primer from binding to the site. The likely presence of null alleles can be inferred statistically (van Oosterhout et al., 2004) and then verified, if necessary, by sequencing the full fragment of the suspected sample. These phenomena do not generally pose serious problems to population structure inference (though in temporal studies, the DNA from old samples is more degraded than that from recent ones, and this may be a source of bias). A rather more insidious issue is that of homoplasy, which occurs when two alleles are identical not as a result of common descent but due to random mutation. Homoplasy can obscure the signal of population differentiation over longer timescales, in large populations, and at loci with particularly high mutation rates (see Estoup et al., 2002 for an extensive discussion).

Although microsatellites can be found very close to coding regions, and even embedded in introns (a fact that may make these loci “tied” to the selective forces acting on the nearby genes through a process known as “hitchhiking” selection; Nielsen et al., 2006), the vast majority of microsatellites evolve neutrally and accumulate mutations at a higher rate than any other genomic region. This fact, coupled with their biparental inheritance and codominance, quickly made them the markers of choice to investigate gene flow and genetic structuring over shallow timescales.

Given the above, it seems perhaps ironic that, while less than 20 years ago Wright and Bentzen (1994) entitled their landmark paper “Microsatellites: Genetic Markers for the Future,” we are now at a stage where many believe that their time is already up. Seeb et al. (2011b) recently predicted that by 2020 microsatellites will probably only contribute to 10% of the published studies in population genetics of nonmodel organisms. However, when examining the literature relevant to fisheries over the last 15 years (Figure 14.4), it appears that studies based on microsatellites continue to increase steadily every year, although the increasing impact of SNPs, starting from the mid-2000s, is also evident. This persistence in the relative importance of microsatellites may be partly due to the fact that exhaustive stock structure studies generally require several hundred to thousands of individuals, often sampled over subsequent years; therefore a decisive switch to SNPs and more novel techniques will only take place when they effectively outcompete the established methods in terms of costs. Irrespective of “when” and “how fast” things will change, it is expected that microsatellites will continue to play a significant role in several applications over the next decade.

[Figure 14.4 image not reproduced. The plot shows the number of published articles per year (y-axis, 0 to 80) for the years 1994 to 2011 (x-axis), with separate series for microsatellites, AFLPs, and SNPs.]

FIGURE 14.4 Scientific publishing trend since 1994, comparing outputs of studies employing the following three classes of molecular markers: microsatellites, AFLPs, and SNPs, as listed in the ISI Thomson Reuters Web of Science. The search criteria were: “fish* AND gene* AND (population OR stock) AND molecular marker*,” where molecular marker means “Microsatellite*”/“AFLP*”/“SNP*” in three separate analogous searches.
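The scoring logic described in this section (fragment length reflects the number of repeats; one peak means a homozygote, two peaks a heterozygote) can be sketched in a few lines. The example below is illustrative only: the genotypes, flank length, and motif length are made-up values, and the comparison of observed with expected heterozygosity is a simple textbook summary, not the statistical procedure of van Oosterhout et al. (2004) cited in the text. A deficit of heterozygotes is merely one possible hint of null alleles or large-allele dropout, and it can also arise from other causes, such as the mixing of distinct populations in one sample.

```python
from collections import Counter
from typing import List, Tuple

Genotype = Tuple[int, int]  # two allele sizes in bp at one locus


def repeat_count(fragment_bp: int, flank_bp: int, motif_bp: int) -> float:
    """Number of tandem repeats implied by a fragment length, given the
    combined length of the flanking regions and the repeat motif length."""
    return (fragment_bp - flank_bp) / motif_bp


def heterozygosity(genotypes: List[Genotype]) -> Tuple[float, float]:
    """Return (observed, expected) heterozygosity at one locus.
    Expected heterozygosity is 1 minus the sum of squared allele frequencies."""
    allele_counts = Counter(allele for g in genotypes for allele in g)
    total = sum(allele_counts.values())
    expected = 1.0 - sum((c / total) ** 2 for c in allele_counts.values())
    observed = sum(1 for a, b in genotypes if a != b) / len(genotypes)
    return observed, expected


if __name__ == "__main__":
    # Hypothetical tetranucleotide locus scored in six individuals.
    sample = [(150, 150), (150, 154), (154, 158),
              (150, 150), (158, 158), (154, 154)]
    ho, he = heterozygosity(sample)
    print(f"observed = {ho:.3f}, expected = {he:.3f}")
    # A 150 bp fragment with 110 bp of flanking sequence and a 4 bp motif.
    print("repeats:", repeat_count(150, 110, 4))
```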
14.2.2 Restriction-Assisted Methods: From AFLPs to Reduced Genomic Representation
Restriction enzymes (RE) are endonucleases that recognize specific DNA sequences between four and eight bp long and typically cleave the strands at a specific and constant position within or before the recognition site. REs are naturally present in bacteria and are believed to have evolved as a defense mechanism against viral infections. Hundreds of these enzymes have become important tools in DNA technology, specifically for the effective reduction of large portions of DNA, including full genomes, into smaller units that are amenable to genetic variability analysis and biological inference. Assuming no frequency bias among nucleotides along a stretch of DNA, REs that recognize four bp sites are likely to cut the DNA once every 256 bp ($1/4^{4}$), REs with six bp recognition sequences will cut approximately every 4096 bp ($1/4^{6}$), and so on. Thus, subjecting a one billion bp fish genome to restriction by a six bp cutter will result in approximately 250,000 fragments, while an eight bp cutter will produce only about 15,000 segments. Intuitively, more fragments, of a variety of sizes, can be produced by using several RE combinations. This “fragmentation” process is the basis of the Amplified Fragment Length Polymorphism (AFLP) method (Vos et al., 1995), a population genetic technique that allows a relatively broad coverage of a genome without any prior knowledge of the DNA variability of the species considered.
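The fragment counts quoted above follow directly from the assumption of equal nucleotide frequencies. A minimal sketch of that arithmetic is given below; the function names are illustrative, and real genomes depart from the equal-frequency assumption, so these are expectations rather than predictions for any particular species.

```python
def expected_cut_interval(site_len: int) -> int:
    """Average spacing (bp) between recognition sites of a given length,
    assuming the four nucleotides are equally frequent and independent."""
    return 4 ** site_len


def expected_fragments(genome_bp: int, site_len: int) -> float:
    """Expected number of fragments from digesting a genome with one enzyme."""
    return genome_bp / expected_cut_interval(site_len)


if __name__ == "__main__":
    genome = 1_000_000_000  # the one billion bp fish genome used in the text
    for site_len in (4, 6, 8):
        print(
            f"{site_len} bp cutter: one site every "
            f"{expected_cut_interval(site_len)} bp, about "
            f"{expected_fragments(genome, site_len):,.0f} fragments"
        )
```

The six bp and eight bp results (roughly 244,000 and 15,000 fragments) match the rounded figures given in the text.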
The AFLP technique has been described exhaustively by Liu (2005) but can be summarized briefly as follows. First, total genomic DNA is digested with REs; a popular combination is the six bp cutter EcoRI with the four bp cutter MseI, which, for a one billion bp genome, leads to approximately four million fragments with an MseI cut at both ends, half a million fragments with MseI–EcoRI ends, and a significantly smaller number of EcoRI–EcoRI fragments (this happens because the long fragments generated by EcoRI are further cleaved by MseI). The second step is the ligation of “adaptors” to the “sticky” ends of the cut sites. For instance, EcoRI cuts the GAATTC site between the G and the first A, which results in a 5′ AATT “overhang” in both strands: short stretches of known sequence (the adaptors) are ligated to this end and can be later recognized by PCR primers. The subsequent steps are generally two rounds of PCR amplification (a “preselective” PCR and a “selective” PCR), which are aimed at reducing the huge number of fragments (e.g., the half a million MseI–EcoRI fragments) to the few hundred that can be resolved by a gel or capillary system. These PCRs use primers that recognize a given adaptor but contain one, two, or three extra bases at the 3′ end, so that only a subset of the original fragments will be amplifiable. The fragments generated are then sized by means of electrophoresis.

The AFLP approach has two great advantages and two pitfalls compared to microsatellite analysis. First, an AFLP scan can be started from scratch on any species required, as the same enzymes and the same adaptor/primer combinations can in principle be applied to any organism. Perhaps more importantly, AFLPs provide a broad, randomized picture of genetic variation across the full genome, which can be probed using hundreds of commercially available REs and in different combinations, so as to indirectly assess a broad range of sequence stretches. Unfortunately, no information is available on any of the polymorphisms identified, so it is important to at least test for repeatability of the fragments detected (peaks) and discard all the nonrepeatable ones. Moreover, because AFLP results are scored as “presence/absence” of anonymous peaks of different sizes, it is impossible (without making bold assumptions) to detect heterozygotes (marker dominance).

The prospects for the wide application of AFLP technology were thwarted not only by the difficulty of achieving result reproducibility between laboratories but also by the recent development of novel ultra-high-throughput sequencing methods (Schuster, 2008). These approaches essentially entail the array of several hundred thousand short sequencing templates on a solid surface, so that these can be analyzed in parallel. The result is that an NGS platform, operated by a single person, can generate at least half a billion bp worth of sequences in one day (Roche 454 GS-FLX™ System), and other platforms (e.g., Illumina HiSeq™, ABI SOLiD™) can produce up to 10 times this amount (albeit arrayed in shorter fragments).
This means that the full genome of a fish can now be sequenced about three times in one week. This is still far from being sufficient to assemble a new reference genome for a species, but it shows that what was seen up to the mid-2000s as a multiple-year endeavor can now be done in weeks, and at a small fraction of the cost. Whole-genome analyses are still well beyond the needs of fisheries management; rather, the new sequencing methods can be effectively assisted by restriction enzymes, which are employed to digest and “shear” the genomic DNA into fragments of manageable size, whose sticky-end overhangs are ligated with adaptors that allow selective amplification and massively parallel NGS. This process is at the basis of “reduced genomic representation” (RGR) approaches (see Davey et al., 2011 for a discussion), which offer very cost-effective ways to scan genomic variation between individuals and populations, using thousands to hundreds of thousands of polymorphisms. Contrary to AFLPs, RGR approaches allow detection and analysis of codominant polymorphisms within known sequences and through a streamlined, repeatable procedure whose costs continue to decrease rapidly (between the drafting and the publication time of this essay, NGS costs will have probably halved). In the foreseeable future, fully sequenced and annotated genomes will exist for
most commercial species, hence every polymorphism used in stock identification studies, be it neutral or adaptive, will be directly mapped onto a specific genomic region. Given the above, despite the pioneering role of AFLPs in population genetics and some encouraging applications (Garoia et al., 2007), it is difficult to foresee a substantial role for them in fisheries genetics in the years to come.
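The fragment numbers quoted for the EcoRI/MseI digest earlier in this section can be approximated with simple arithmetic. The sketch below is only a back-of-the-envelope model: it assumes (this is not stated in the chapter) that the four bases are equally frequent and independent, and the function names are illustrative.

```python
# Back-of-the-envelope AFLP fragment counts under a random-sequence model.
# Assumption (not from the chapter): all four bases are equally frequent and
# independent, so an n-bp recognition site occurs with probability 4**-n.

GENOME_BP = 1_000_000_000  # one billion bp, as in the example in the text


def expected_fragments(genome_bp, rare_cutter_bp=6, frequent_cutter_bp=4):
    """Return expected fragment counts by end type for a double digest."""
    p_rare = 4.0 ** -rare_cutter_bp        # e.g., EcoRI (GAATTC)
    p_freq = 4.0 ** -frequent_cutter_bp    # e.g., MseI (TTAA)
    total_sites = genome_bp * (p_rare + p_freq)
    f_rare = p_rare / (p_rare + p_freq)    # share of cut sites made by the rare cutter
    f_freq = 1.0 - f_rare
    # A fragment lies between two consecutive cut sites; treat their types as independent.
    return {
        "freq-freq": total_sites * f_freq ** 2,
        "rare-freq": total_sites * 2 * f_rare * f_freq,
        "rare-rare": total_sites * f_rare ** 2,
    }


def after_selective_pcr(n_fragments, selective_bases_per_primer=3):
    """Each selective base on each of the two primers keeps about 1 in 4 fragments."""
    return n_fragments / 4 ** (2 * selective_bases_per_primer)


counts = expected_fragments(GENOME_BP)
for end_type, n in counts.items():
    print(f"{end_type}: {n:,.0f}")
selected = after_selective_pcr(counts["rare-freq"])
print(f"rare-freq after +3/+3 selection: {selected:,.0f}")
```

Under these assumptions the model gives roughly 3.7 million MseI–MseI fragments, 0.46 million MseI–EcoRI fragments, and about 14 thousand EcoRI–EcoRI fragments, with on the order of a hundred MseI–EcoRI fragments left after three selective bases per primer. Real genomes deviate from the equal-base assumption, so these are orders of magnitude only.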
14.2.3 Single Nucleotide Polymorphisms
RGR methods and other NGS-based approaches provide geneticists with millions of genomic fragments of known sequence, which can be assembled into longer contiguous units (contigs) using bioinformatic tools, and from which sites that are variable among individuals and populations can be “cherry-picked” to become markers in stock identification. The most commonly used variable sites are single base pair substitutions, known as single nucleotide polymorphisms (SNPs). SNP markers that exhibit sufficiently high polymorphism can be used, usually in conjunction with many other SNPs, to quantify genetic variation among individuals. SNPs are generally biallelic, and markers are typically filtered to include those with a frequency of the rarer variant greater than 0.05 (Figure 14.5). SNPs can be located in either coding or noncoding regions of the genome, and their location in the genome may be either known (e.g., from mapping and annotation studies) or unknown (“anonymous” SNPs). SNPs are attractive markers for determining stock structure and identifying the genetic origin of individuals (see examples of
FIGURE 14.5 Schematic representation of a chromosomal stretch where four SNPs have been identified in two diploid individuals (ind. A and ind. B). The bars represent the two parental copies for each individual. SNP1 is monomorphic in the two individuals shown, although it could show variation if more individuals were screened, while SNP4 is heterozygous (C/G) in both. SNP2 is heterozygous (A/T) in ind. A and homozygous (T/T) in ind. B. SNP3 is homozygous in both individuals and fixed for alternative alleles; the highlighted area represents a functional gene region, thus SNP3 is a candidate for screening adaptive variation. (For color version of this figure, the reader is referred to the online version of this book.)
application in Seeb et al., 2011a,b) for several reasons. These include the potential for rapid genotyping of tens of collections, each with sample sizes in the hundreds, for hundreds to hundreds of thousands of markers in a single assay and with low scoring error rates. Thus far, SNP screening has relied upon rather expensive chip/array-based platforms; however, new developments in genotyping-by-sequencing (GBS) techniques and associated bioinformatics tools (Peterson et al., 2012; Wang et al., 2012) might soon allow SNP typing at running costs competitive with microsatellite analysis. Since the statistical power to detect population structure and estimate connectivity among units is related to the total number of examined alleles, it is expected that assays of SNP panels of 200–400 neutral markers (which can be routinely run on medium-scale platforms) can yield stock information content exceeding that obtained using about 20–30 microsatellites. SNPs are easily calibrated among laboratories, which allows for combining spatial datasets, and are also relatively robust to degraded, low-copy DNA samples (Morin and McCarthy, 2007; Smith et al., 2011). The latter advantage allows for comparisons among temporal samples, such as historical scale, otolith, and bone collections (Nielsen and Hansen, 2008). A major advantage of SNPs is the ability to examine both neutral variation and regions under diversifying selection. Empirical studies in marine fishes demonstrate that application of gene-associated SNP markers may yield much more detailed information about population subdivision than that attainable with neutral markers (e.g., Nielsen et al., 2009b; Bradbury et al., 2010; Poulsen et al., 2011; Limborg et al., 2012) and that gene-associated SNPs can be exploited to greatly increase statistical power for genetic stock identification (GSI) (see Ackerman et al., 2011; Nielsen et al., 2012). 
Putatively adaptive SNPs could even be applied to study ephemeral genetic differences caused by differential selection in cohorts or larval groups within an ostensibly panmictic population (e.g., Gagnaire et al., 2012) providing cohort/stock tags in feeding and nursery areas. Recent studies demonstrate that combining information from multiple SNPs into haplotypes can further increase resolution in population structure and GSI analyses (e.g., Gattepaille and Jakobsson, 2012). A potential problem may arise when attempting to estimate genetic variation for populations outside the geographical range for which the markers were developed. This is referred to as ascertainment bias (AB) and is a generally acknowledged problem for genetic markers, including SNPs (Helyar et al., 2011). However, it appears that AB tends to decrease the information content of specific SNPs in newly screened populations compared to the populations for which the markers were developed (e.g., Bradbury et al., 2011), rather than posing a general threat to GSI resolution, at least when fairly large numbers of SNPs are applied at medium-to-large spatial scales (see also Seeb et al., 2011b). Another challenge lies in the identification of the most appropriate set of SNP markers for empirical estimates of population structure and GSI. Considering temporal changes, such as varying selection pressures and the
associated potential changes in population distributions, is not trivial and requires careful consideration (Galindo et al., 2010). However, if SNP panels are carefully designed on a case-by-case basis to target specific analysis scenarios (e.g., using the approach in Nielsen et al., 2012), their application is likely to yield unprecedented levels of stock discrimination in years to come.
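Two points from this section lend themselves to a short numerical illustration: the filter on the frequency of the rarer variant (greater than 0.05) and the argument that information content scales with the number of alleles examined. The sketch below uses invented genotypes and SNP names, and the mean of 10 alleles per microsatellite locus is an assumption made only for the comparison.

```python
# Minor-allele-frequency (MAF) filter for biallelic SNPs, plus a crude count of
# independent alleles. All sample data and the microsatellite allele number are
# hypothetical, chosen only to illustrate the arithmetic.

MAF_THRESHOLD = 0.05  # the rarer variant must exceed this frequency


def minor_allele_frequency(genotypes):
    """genotypes: diploid genotypes coded as 0, 1, or 2 copies of one allele."""
    n_alleles = 2 * len(genotypes)
    freq = sum(genotypes) / n_alleles
    return min(freq, 1.0 - freq)


def filter_by_maf(snp_genotypes, threshold=MAF_THRESHOLD):
    """Keep SNPs whose minor allele frequency is greater than the threshold."""
    return {name: g for name, g in snp_genotypes.items()
            if minor_allele_frequency(g) > threshold}


def independent_alleles(alleles_per_locus):
    """Sum of (k - 1) over loci: a locus with k alleles has k - 1 free frequencies."""
    return sum(k - 1 for k in alleles_per_locus)


snps = {
    "snp_a": [0, 1, 2, 1, 0, 1, 2, 0, 1, 1],   # polymorphic, MAF = 0.45
    "snp_b": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],   # monomorphic in this sample
    "snp_c": [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],   # MAF = 0.05, not above the threshold
}
kept = filter_by_maf(snps)
print("retained SNPs:", sorted(kept))

snp_panel = independent_alleles([2] * 300)        # 300 biallelic SNPs
microsat_panel = independent_alleles([10] * 25)   # 25 loci, 10 alleles each (assumed)
print("independent alleles, SNP panel:", snp_panel)
print("independent alleles, microsatellite panel:", microsat_panel)
```

With these illustrative inputs, 300 SNPs carry 300 independent alleles against 225 for 25 microsatellites, which is the kind of comparison behind the 200–400 versus 20–30 figures. The count ignores allele frequency distributions and linkage, so it is a rough guide rather than a power analysis.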
14.2.4 Candidate Gene Approach
A “candidate” gene is a functional locus presumed to be involved in the expression of a phenotypic trait. Candidate gene analysis has been applied for studying population structure and adaptive evolution since the infancy of population genetic marker analysis (see, e.g., Sick, 1961, 1965). Candidate genes may be either structural genes or genes involved in physiological processes, where the assumption is that different variants alone, or more commonly in combination with other genes, directly determine phenotypes under natural selection or hitchhike with genes that do (Guinand et al., 2004). Identification of candidate genes has normally followed a top-down process (Dalziel et al., 2009), where genes of potential interest to specific environmental scenarios are sequenced to explore patterns of polymorphism. Once identified, population-specific information can be obtained either by directly sequencing or by typing markers (e.g., microsatellites, SNPs, restriction-assisted methods) found to be associated or directly linked with gene variants. Classical examples of candidate genes in fishes include hemoglobins, which are oxygen-carrying blood proteins (Andersen et al., 2009; Borza et al., 2009). Pioneering studies in the 1960s revealed clear genetic differentiation among populations inhabiting environmentally divergent areas (Sick, 1961, 1965). Subsequent studies (e.g., Brix et al., 1998; Petersen and Steffensen, 2003) indicated that variants affected physiological performance under different temperature and oxygen regimes, and hence suggested hemoglobin to be under adaptive evolution in Atlantic cod (Gadus morhua). Nonetheless, it was not until recently that Andersen et al. (2009) characterized the differences in oxygen affinities between hemoglobin functional variants, thus providing a direct link between genotype and phenotype. 
The gene for the integral membrane pantophysin, PanI (Pogson, 2001), is another much studied candidate gene in Atlantic cod. PanI exhibits variation suggestive of adaptive evolution associated with a suite of traits and environmental parameters, including growth, behavior, depth, salinity, and temperature. The physiological function of PanI is still largely unknown, and it also remains to be established whether selection acts on the gene itself or on one or more linked genes. Nevertheless PanI exhibits strong divergence among local populations and has been used as a marker for population structure and to resolve stock identity in a suite of Atlantic cod studies, including Pampoulie et al. (2006, 2012), Wennevik et al. (2008), and Glover et al. (2011). A family of well-studied candidate genes in fishes are “heat-shock” genes. These genes code for heat-shock proteins (HSP), which play an essential role in cellular stress response by facilitating interactions among other
proteins (Schlesinger, 1990). In fishes, HSPs have been shown to be activated in response to changes in salinity, temperature, and pollution (Basu et al., 2002). Heat-shock genes appear to be under divergent selection in several fishes, commonly exhibiting differentiation among populations more than an order of magnitude larger than that of neutral markers (e.g., European flounder, Platichthys flesus, Hemmer-Hansen et al., 2007; Atlantic cod, Nielsen et al., 2009b). Apart from illustrating mechanisms of local adaptation to different marine habitats (e.g., high versus low salinity), these genes thus also present a highly valuable tool for genetic stock identification. Other candidate gene families include the major histocompatibility complex (MHC) genes. This gene complex exhibits extreme polymorphism, probably maintained by pathogen-mediated selection (Eizaguirre et al., 2012), and has been proposed as a potential driver of population divergence. Different gene variants may determine, or be linked to, different levels of parasite and disease resistance (Spurgin and Richardson, 2010). The processes of coadaptation between the host (the fish) and its pathogens therefore leave a signature of spatial differentiation, which can be detected and quantified through the analysis of MHC gene variation (see, e.g., Beacham et al., 2001; Cohen, 2002). The approach of simultaneously comparing several neutral and candidate gene markers has been highly successful in a range of fish species (Coscia et al., 2012; Hemmer-Hansen et al., 2007; Pampoulie et al., 2012), illustrating the importance of combining inference from different marker types (Vasemägi and Primmer, 2005). However, information for one or a few candidate genes alone, or contrasted with neutral marker data, may not directly increase resolution in the analysis of population structure per se (Boutet et al., 2008; Larmuseau et al., 2009). 
A main challenge for candidate gene approaches aimed at resolving population structure and local adaptation is thus to identify which genes are most likely to exhibit signals of population divergence, depending on the environmental gradients existing in the system under consideration. Various classes of genes that are expected to be candidates for adaptive variation have been listed (Ford, 2002) to direct the search. However, in years to come, the classical candidate gene approach that entails time-consuming sequencing to locate polymorphisms in and around focal genes is likely to be circumvented by genome-wide sequencing applications, followed by identification of single nucleotide polymorphisms in novel candidate genes (e.g., Renaut et al., 2010; Hemmer-Hansen et al., 2011). This will not only increase the number of genetic polymorphisms for the purpose of stock discrimination but will also offer the opportunity to explore what regions of the genome are primarily involved in determining phenotypic variation in heterogeneous environments.
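The contrast drawn in this section between candidate gene markers and neutral markers (differentiation "more than an order of magnitude larger") can be made concrete with a small calculation of Wright's FST for a single biallelic locus. The allele frequencies below are invented for illustration and are not taken from any of the studies cited; the sketch also assumes two equally weighted populations.

```python
# Wright's F_ST for one biallelic locus in two equally weighted populations,
# computed as (H_T - H_S) / H_T. Allele frequencies below are invented for
# illustration; they are not taken from any study cited in the chapter.


def expected_heterozygosity(p):
    """Hardy-Weinberg expected heterozygosity for allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_two_populations(p1, p2):
    """F_ST from the frequencies of the same allele in two populations."""
    p_mean = (p1 + p2) / 2.0
    h_total = expected_heterozygosity(p_mean)
    if h_total == 0.0:
        return 0.0  # locus is fixed for the same allele in both populations
    h_within = (expected_heterozygosity(p1) + expected_heterozygosity(p2)) / 2.0
    return (h_total - h_within) / h_total


neutral = fst_two_populations(0.50, 0.55)     # weakly differentiated marker
candidate = fst_two_populations(0.20, 0.80)   # strongly differentiated, gene-linked marker
print(f"neutral F_ST   = {neutral:.4f}")
print(f"candidate F_ST = {candidate:.4f}")
print(f"ratio          = {candidate / neutral:.0f}x")
```

With these made-up frequencies the neutral locus gives an FST of about 0.0025 and the candidate locus 0.36, a difference of two orders of magnitude. Estimators used in practice also correct for sample size, which this sketch omits.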
14.2.5 Transcriptomics and Proteomics
As discussed before, genomic resources nowadays offer a vast wealth of tools to address virtually every issue in stock identification, and as we head toward a
not-so-distant future in which a reference genome will be available for every species under study, the tool kit for stock identification projects promises to become ever more powerful, robust, and informative. Nevertheless, even high-density SNP coverage of the variability of DNA sequences contained in the genome cannot fully describe the fundamental functions that allow organisms to adapt to their environment. This can only be obtained by looking into how genomic information is expressed, through the analysis of RNA (Goetz and McKenzie, 2008) and proteins (Karr, 2008): the “transcriptome” and the “proteome,” respectively one and two steps downstream from the original repository of information contained in the genome. Although RNA can be extracted from cells in a similar fashion to DNA, it is a highly unstable molecule and its synthesis (and hence its concentration in the cell) constantly changes according to the functional state of the living tissue. For this reason, tissues for RNA analysis must be sampled far more cautiously and swiftly than is normally practiced for DNA studies and must be preserved in special buffers. Messenger RNA (mRNA), in its final form (Figure 14.1), is characterized by a string of adenosine phosphates (“poly-A”) at the 3′ end of the chain, which can be easily recognized by oligonucleotide primers to kick-start the process of “reverse transcription,” aided by an enzyme (reverse transcriptase) that converts the mRNA sequence into a strand of “complementary DNA” (cDNA). The second strand will then be synthesized by a DNA polymerase, and all the fragments of cDNA produced will represent the “mirror image” of mRNA segments in a physically stable molecular form. The production of cDNA from the mRNA in any given tissue effectively distills the information content of a cell’s DNA to a reduced set of sequences that underlie physiological and ecological functions. 
This creates the opportunity to probe more closely the process of adaptation, by allowing detection of differences in gene expression, in selected organs and tissues, among individuals and populations inhabiting different environments. These differences can essentially be examined in three main ways: first, by screening a number of “expressed sequence tags” (ESTs), which are single-read sequences obtained by partial sequencing of cDNA clones previously obtained from a bulk of mRNA from a certain tissue (Bouck and Vision, 2007). Millions of ESTs are available for many organisms in public databases, though the majority of them are not validated on a reference genome. Second, “microarrays,” or “DNA chips,” can be employed. These are small slides, made of glass or silicon, onto which known sequences of many thousand genes (or selected oligonucleotides of particular functional significance) are spotted. By hybridizing fluorescently labeled cDNA obtained from transcripts of the studied tissue and populations onto the microarray slide, it is possible to reveal which genes are expressed in the samples under consideration and how intensely. Obviously this approach is limited by the number of tissues and species for which the arrays are available, as developing new, reliable microarrays for a given tissue in a species without a
reference genome or EST libraries is a very hefty and time-consuming project in itself. Fortunately, NGS technologies have eased this problem too: as a third option, all the cDNA from a tissue can now be sequenced in full (also referred to as “whole-transcriptome sequencing,” or “RNA-seq”) with greater accuracy and far greater speed and effectiveness than ESTs/microarrays can possibly achieve, including the ability to uncover a considerable amount of transcribed product from genes that were not annotated in the reference genome (Sultan et al., 2008). Of course, in the absence of a reference genome, the assembly of a full transcriptome and the identification of genotypes will remain somewhat problematic (Davey et al., 2011), but as said before, whole-genome reference databases will soon become commonplace for the majority of species with ecological and economic importance. Beyond the analysis of nucleic acids lies the “proteomic approach,” which aims at gaining a complete understanding of variation in gene expression (Greenbaum et al., 2003) through the quantitative analysis of protein production within and among natural populations. The procedure entails the extraction of both cytosolic and membranous/organellar fractions of proteins from the targeted tissue, followed by a two-dimensional electrophoretic gel that separates fluorescently labeled proteins according to both electric charge and molecular weight. Using specific image analysis software, the protein “spots” exhibiting significant differences between samples can be detected; these can be excised from the gel and identified using a mass spectrometer, which can measure mass/charge protein ratios with astounding accuracy, allowing the amino acid composition of the peptide of interest to be determined, and thus making possible an assignment to a unique gene that codes for this peptide (Karr, 2008). 
Much like external phenotypic features and life history traits, protein expression is typically “plastic,” and in fact, protein expression can vary hugely, in response to environmental and physiological factors, over much shorter timescales. Thus, rigorous proteomic (and transcriptomic) studies entail a “common garden” component, where individuals from different areas, habitats, or putative stocks are reared under the same conditions (Rees et al., 2011; Papakostas et al., 2012), so as to disentangle the variance component caused by fixed, adaptive genomic differences from the component that results from short-term phenotypic plasticity. These “post-genomic” approaches are highly informative with respect to the deep understanding of the biology of a species and contribute to provide a more complete picture of the interaction between genomes and the environment; however, they are unlikely to be the primary methods in stock identification, due to the high costs associated with screening a large number of samples and the impracticability of common garden experiments for many species. Nevertheless, transcriptomics and proteomics can help, in conjunction with genomic information, to guide the process of pinpointing, refining, and optimizing a limited number of markers of great discriminatory power, associated with known functions, and suitable for large-scale applications.
Table 14.1 Task/Tool Matrix Summarizing the Applicability of the Main Nuclear Markers to Address Issues Relevant to Fisheries. The Last Row Contains Comments to Highlight the Main Limitation of Each Method

| Task | AFLPs | Microsatellites | Candidate Gene | Genomic SNPs | Transcriptomics & Proteomics |
|---|---|---|---|---|---|
| Identification of stock boundaries (MU) | Yes | Yes | Yes | Yes | No |
| Individual assignment and mixed stock analysis | Yes | Yes | Yes | Yes | No |
| Seascape genetics | Yes | Yes | Yes | Yes | Yes |
| Effective population size estimation | No | Yes | No | Yes | No |
| Local adaptation and conservation units | No | No | Yes | Yes | Yes |
| Main caveat | Poor reproducibility | Low genome coverage | Low genome coverage | High costs | Strongly dependent on environmental variation |
14.3 MATCHING EACH QUESTION WITH THE RIGHT TOOL
Several authors have previously discussed the importance of attaining methodological rigor in order to make solid inference on stock structure, and numerous publications exist that provide important advice as to how to minimize potential sources of bias, from sampling (Waples, 1998) to marker choice (Anderson, 2010; Bradbury et al., 2011) and data analysis (Kalinowski, 2002; Waples and Gaggiotti, 2006; Meirmans and Hedrick, 2011). This body of literature remains the benchmark for anyone needing to apply population genetics approaches to stock identification and conservation biology. Here we briefly list the main tasks in fisheries genetics and outline the basic norms that should be considered when embarking on such projects, also referring to relevant recent literature.
14.3.1 Stock Structure
In his landmark paper, John Waldman (1999, p. 242) simply put it that “to discriminate stocks of fishes, the signal from among-stock variation must exceed the noise of within-stock variation, and the more so, the better.” Although this statement does not specify what type of variation is being
partitioned, and, perhaps unfairly, dismisses aspects of within-stock individual variation as “noise,” it does capture the essence of the stock identification process: to detect a statistically and biologically significant variance in chosen descriptors between putatively (to some degree) independent demographic units. The general null hypothesis in this case is that of “panmixia,” that is, the condition of a single, freely and randomly mating population exhibiting no substructure across the area of study. The rejection of this hypothesis should indicate the existence of two or more independent stocks. With neutral genetic markers, as illustrated in the first section of this essay, even very small effect sizes (e.g., FST estimates) can signify demographic separation of high relevance for fisheries management (Figure 14.2). Bentzen (1998) argued that Ne and m are typically so high in marine species that any detectable signal of allelic differentiation must underlie important biological discontinuities (but see Secor, 2013; Kritzer and Liu, 2013); however, random sampling and short-term repeatability (over at least two subsequent years) are necessary conditions in order to confidently uphold the findings (Waples, 1998; Waples and Gaggiotti, 2006). As discussed previously, SNPs and microsatellites are the two classes of markers that will dominate the field of stock identification over the coming decade, with SNPs likely to overtake microsatellites within 5–10 years. 
To date, the discriminatory power of ~10 highly polymorphic microsatellites still outperforms that of panels of >100 neutral anonymous SNPs (Beacham et al., 2010; Hess et al., 2011); however, for a few species of vast commercial importance that have been the focus of intense genetic research efforts for decades (e.g., Atlantic cod, Atlantic herring), thousands of SNPs have been characterized (Bradbury et al., 2010; Hubert et al., 2010) and several stocks have been investigated using both microsatellites and SNPs (e.g., Limborg et al., 2012; Ruzzante et al., 2006, for Atlantic herring, and Nielsen et al., 2009b; Poulsen et al., 2011, for Atlantic cod). From these, a reduced panel of SNPs can be chosen, some with a likely adaptive value, with maximum power for stock discrimination and affordable costs. For these few species, SNPs are already superseding microsatellites, but the same cannot be assumed for all the several hundred species (thousands if we include invertebrate taxa) for which stock structure information is required to inform management. In several cases, even suites of <10 microsatellites can reveal patterns of significant differentiation and considerable effect sizes, especially in less mobile, coastal species (e.g., Sala-Bozano et al., 2009; Mariani et al., 2012), which still gives fisheries management institutions the opportunity to obtain rapid and reliable stock identity assessment for a small fraction of the costs associated with conducting an RGR scan or developing and screening thousands of SNPs in the same number of samples. Nevertheless, it is reasonable to imagine that the cohorts of today’s students who will become involved in GSI for fisheries management in their careers will become familiar with GBS and/or will routinely use “next generation screening panels” of markers, constituted by a relatively small number of genetic
polymorphisms designed to detect differences between all pairs of known stocks from any major commercial fish. A key requirement for this approach will be to verify that the markers employed in these panels (most of which will be susceptible to rapid changes of selective pressures) provide a temporally stable picture of stock divergence (Nielsen et al., 2012) over timescales germane to management.
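The null hypothesis of panmixia described in this section is often evaluated by asking how frequently a random reshuffling of individuals between samples produces as much differentiation as that observed. The sketch below illustrates the idea for a single biallelic locus; the genotypes, the seed, and the number of permutations are illustrative assumptions, and the simple FST used here lacks the sample-size corrections of the estimators used in practice.

```python
# Permutation test of panmixia at a single biallelic locus.
# Individuals are coded by the number of copies (0, 1, 2) of a reference allele.
# Sample data, the seed, and the number of permutations are illustrative assumptions.
import random


def fst(pop_a, pop_b):
    """F_ST = (H_T - H_S) / H_T from two samples of diploid genotypes."""
    n_a, n_b = len(pop_a), len(pop_b)
    p_a = sum(pop_a) / (2 * n_a)
    p_b = sum(pop_b) / (2 * n_b)
    p_tot = (sum(pop_a) + sum(pop_b)) / (2 * (n_a + n_b))
    h_t = 2 * p_tot * (1 - p_tot)
    if h_t == 0:
        return 0.0  # no variation in the pooled sample
    h_s = (n_a * 2 * p_a * (1 - p_a) + n_b * 2 * p_b * (1 - p_b)) / (n_a + n_b)
    return (h_t - h_s) / h_t


def panmixia_test(pop_a, pop_b, n_perm=999, seed=1):
    """Return observed F_ST and the share of label shuffles that match or exceed it."""
    rng = random.Random(seed)
    observed = fst(pop_a, pop_b)
    pooled = list(pop_a) + list(pop_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign individuals to samples at random
        if fst(pooled[:len(pop_a)], pooled[len(pop_a):]) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical samples of 20 fish each: allele frequency 0.75 versus 0.25.
stock_a = [2] * 12 + [1] * 6 + [0] * 2
stock_b = [2] * 2 + [1] * 6 + [0] * 12
observed_fst, p_value = panmixia_test(stock_a, stock_b)
print(f"observed F_ST = {observed_fst:.3f}, permutation p = {p_value:.3f}")
```

A small p-value argues against panmixia for these two samples. As the text stresses, such a result still needs random sampling and temporal replication before it is taken as evidence of separate stocks.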
14.3.2 Mixed Stock Analysis and Individual Assignment
The performance of GSI methods based on either the assignment of individuals to previously characterized stocks or the estimates of proportional population contributions to mixed stocks (e.g., Pella and Masuda, 2001) is highly dependent on both the actual level of genetic differentiation among populations and the number of genetic markers used (Manel et al., 2005). The inclusion of markers under strong diversifying selection (and even nongenetic markers, in a multidisciplinary discriminant analysis; Cadrin et al., 2013) will significantly improve the success of assignment tests and will resolve many of the problems associated with “high-grading bias” (Anderson, 2010), which occurs when markers are not chosen using rigorous cross-validation of the discriminant algorithm with a data set independent of the one used to develop the algorithm in the first place (Waples, 2010). Irrespective of the number and types of markers used in mixed stock analysis (MSA) and individual assignment (IA), an important step will be the characterization of all the potential “baseline” stocks that can contribute to mixed collections, and from which unknown specimens may originate. It is hoped that the necessity to (1) monitor changes in composition of mixed catches and (2) trace marketed products to combat illegal, unreported, and unregulated (IUU) fisheries will promote coordinated international efforts toward the sharing, reanalysis, and standardization of procedures among research groups, as well as the establishment of “tissue banks” and communal databases. 
Such efforts may also go some way to facilitate the standardization of novel tools for GSI in forensics; while validation and international standardization issues still make microsatellites the tool of choice in human court cases (Butler et al., 2007), fisheries and wildlife research may be in a better position to pioneer and harness the potentially more powerful options offered by SNPs (Hauser et al., 2011; Ogden, 2011). The envisaged heavy use of confirmed and/or putative adaptive markers will leave the interpretation of population structure and mixed stock compositions rather more decoupled from classical hypotheses of genetic drift and gene flow: the selected polymorphisms will reflect a mixed array of different, interacting biological processes, which will also vary across species. The signal from many of these markers may have a very transient value over evolutionary timescales and over the lifetime of a large marine metapopulation (Secor, 2013), but they will be able to cater more clearly and effectively for the needs of fisheries management, while also allowing monitoring of short-term evolutionary responses to fishery exploitation (Jakobsdóttir et al., 2011).
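The logic of individual assignment against characterized baseline stocks can be illustrated with a likelihood calculation. The sketch below is a simplified, hypothetical example: it assumes Hardy-Weinberg proportions within stocks, independent biallelic loci, and baseline allele frequencies strictly between 0 and 1; the stock names, frequencies, and genotype are invented, and it does not reproduce any specific published method.

```python
# Likelihood-based individual assignment to baseline stocks.
# Assumes Hardy-Weinberg proportions within stocks and independent biallelic loci.
# Stock names, allele frequencies, and the genotype are hypothetical.
import math


def genotype_log_likelihood(genotype, allele_freqs):
    """genotype: copies (0, 1, 2) of the reference allele at each locus.
    allele_freqs: reference-allele frequency at each locus in one baseline stock
    (must be strictly between 0 and 1, otherwise the logarithm is undefined)."""
    log_lik = 0.0
    for copies, p in zip(genotype, allele_freqs):
        q = 1.0 - p
        prob = {0: q * q, 1: 2.0 * p * q, 2: p * p}[copies]
        log_lik += math.log(prob)
    return log_lik


def assign(genotype, baselines):
    """Return the best-supported stock and each stock's relative likelihood."""
    log_liks = {stock: genotype_log_likelihood(genotype, freqs)
                for stock, freqs in baselines.items()}
    best = max(log_liks.values())
    # Subtract the maximum before exponentiating to avoid numerical underflow.
    weights = {s: math.exp(ll - best) for s, ll in log_liks.items()}
    total = sum(weights.values())
    scores = {s: w / total for s, w in weights.items()}
    return max(scores, key=scores.get), scores


baselines = {
    "stock_north": [0.80, 0.70, 0.60, 0.90],
    "stock_south": [0.30, 0.40, 0.50, 0.20],
}
unknown_fish = [2, 1, 2, 2]
stock, scores = assign(unknown_fish, baselines)
print(stock, {s: round(v, 4) for s, v in scores.items()})
```

The relative likelihoods only compare the baselines supplied, which is why the text insists on characterizing all stocks that could contribute to a mixture: a fish from an unsampled stock would still be assigned to one of the listed ones.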
14.3.3 Seascape Genetics
The term seascape genetics first appeared in the literature in 2006 (Galindo et al., 2006), borrowed from the previously introduced landscape genetics (Manel et al., 2003), to indicate an approach that employs the physical, chemical, and biological features of the habitat to explain the observed patterns of spatial genetic structure. In the vast majority of cases, the habitat variables have so far involved some form of oceanic circulation modeling (Galindo et al., 2010; White et al., 2010) in order to provide more realistic links between space and pelagic larval dispersal than what is traditionally implied by the “Isolation by Distance” (IbD) expectations (Wright, 1943), especially at finer spatial scales. Seascape genetics is expected to become a very fertile ground of research, which can bridge the gap between ecology and population genetics. Selkoe et al. (2010) have already shown that more inclusive habitat description is possible by taking into account additional variables, such as substrate coverage, which can help explain successful settlement and levels of suitability for different species. Much like in terrestrial studies, it is conceivable to integrate increasingly larger numbers of habitat variables using a GIS approach (Spear et al., 2010) and generate spatially explicit models of habitat suitability across the seascape. Although seascape genetics approaches have so far emphasized the relationship between the physical environment and dispersal, we foresee that with more elaborate suitability maps and a broader array of predictor variables (which will likely also take into account known aspects of the life history of the target species) it will be possible to include adaptive responses among the effects investigated. In this context, it will be important to separate neutral from adaptive markers in order to properly understand the links between specific environmental factors and biological processes. 
This multimarker, multilayer, and, ideally, multispecies approach will likely help identify areas of particular ecological significance and hence play a potentially pivotal role in marine spatial planning (e.g., the design of marine protected areas).
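The “Isolation by Distance” expectation that seascape approaches aim to improve upon can be checked, as a baseline, by correlating pairwise genetic differentiation with geographic distance. The sketch below uses made-up pairwise values; the FST/(1 - FST) transform is one common choice and is an assumption here, not something prescribed by the chapter.

```python
# Isolation-by-distance check: correlation between linearized F_ST and
# geographic distance for pairs of sampling sites. The values are made up, and
# the F_ST / (1 - F_ST) transform is one common choice, assumed here.
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# One entry per pair of sites: distance in km and the pairwise F_ST estimate.
pairwise_km = [120.0, 340.0, 560.0, 800.0, 1500.0, 2100.0]
pairwise_fst = [0.002, 0.004, 0.007, 0.009, 0.018, 0.024]

linearized = [f / (1.0 - f) for f in pairwise_fst]
r = pearson_r(pairwise_km, linearized)
print(f"correlation between distance and linearized F_ST: r = {r:.3f}")
```

Because pairwise values share sampling sites and are not independent, a plain correlation coefficient cannot be tested with standard tables; a permutation procedure such as a Mantel test would be needed. In a seascape analysis, straight-line distance could be replaced by a distance derived from circulation models or habitat suitability.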
14.3.4 Effective Population Size
In the first section, we emphasized the importance of the effective population size, Ne, in influencing estimates of population differentiation (i.e., FST). However, over the past decade, Ne itself has become the focus of intense research efforts, possibly spurred by several reports that, despite their naturally huge census population sizes (Nc), marine fishes may exhibit very low Ne/Nc ratios, between 10⁻³ and 10⁻⁵ (Hauser and Carvalho, 2008), which means that in a population of millions of individuals, there may be as few as a few hundred effective breeders. Thus, even large commercial stocks may potentially be rather susceptible to the negative consequences of genetic drift and inbreeding (Hare et al., 2011). Several methods exist to estimate contemporary Ne from genotypic data, which rely on fundamental population genetic processes, such as genetic
Conclusions
319
drift, individual relatedness, and linkage disequilibrium (Jorde and Ryman, 2007; Tallmon et al., 2008; Wang, 2009; Waples, 1989; Waples and Do, 2008) and all require the use of neutral markers. Empirical and modeling studies to date have shown that sample sizes between w50 and w250 can providedusing commonly available suites of microsatellite loci (10e20)d relatively accurate estimates of Ne if the true value ranges between the few hundreds and the few thousands, respectively (Waples and Do, 2010; Cuveliers et al., 2011). Palstra and Ruzzante (2008) provide a general guideline that the sample used should be at least 10% of the studied population’s effective size, although precision and accuracy will exponentially decay as Ne increases. Ne is a parameter of fundamental importance in conservation management, and more research in this area will have a positive impact in many directions. Ne estimates can be used to monitor temporal demographic responses to exploitation in terms of both the number of individuals sustaining a population and the relative changes in Ne/Nc ratios (Cuveliers et al., 2011; Hauser et al., 2002; Hutchinson et al., 2003; Portnoy et al., 2009; Therkildsen et al., 2010). Significant differences in Ne among stocks can pinpoint weaker components in a metapopulation system, and same-cohort estimates can inform on the fluctuations of the effective number of breeders producing offspring over subsequent years (Nb z (Ne/generation time); Hare et al., 2011). Moreover, knowledge of natural Ne levels can be instrumental to artificial restocking programs and can also be included in seascape genetics approaches as partial predictors of the component of differentiation accounted for by random drift alone. 
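The quantities discussed above can be made concrete with some simple arithmetic. The sketch below is illustrative only: the function names and all input values are hypothetical, and the linkage disequilibrium estimator uses the basic approximation for unlinked loci under random mating, E(r^2) ≈ 1/(3Ne) + 1/S, where S is the number of individuals sampled. It deliberately omits the bias corrections implemented in dedicated software such as LDNE (Waples and Do, 2008).

```python
import math

def ne_from_ratio(census_size, ne_nc_ratio):
    """Effective size implied by a census size and an Ne/Nc ratio."""
    return census_size * ne_nc_ratio

def min_sample_for_ne(ne, percent=10):
    """Smallest sample meeting a 'percent of Ne' guideline
    (10% follows the rule of thumb attributed to Palstra and Ruzzante, 2008)."""
    return math.ceil(ne * percent / 100)

def nb_from_ne(ne, generation_time):
    """Rough effective number of breeders per year: Nb ~ Ne / generation time."""
    return ne / generation_time

def ne_from_ld(mean_r2, sample_size):
    """Uncorrected LD estimate for unlinked loci:
    subtract the sampling contribution 1/S from the mean r^2,
    then solve r^2_drift = 1 / (3 * Ne) for Ne."""
    drift_r2 = mean_r2 - 1.0 / sample_size
    if drift_r2 <= 0:
        # No drift signal beyond sampling noise: Ne is indistinguishable from
        # a very large population.
        return math.inf
    return 1.0 / (3.0 * drift_r2)

# Hypothetical stock of ten million fish at the two ends of the reported range.
census = 10_000_000
print(round(ne_from_ratio(census, 1e-3)))   # upper end of the range
print(round(ne_from_ratio(census, 1e-5)))   # lower end of the range

# Hypothetical Ne of 2000 and a generation time of 4 years.
print(min_sample_for_ne(2000))
print(nb_from_ne(2000, 4))

# Hypothetical mean r^2 of 0.012 among unlinked loci in a sample of 100 fish.
print(round(ne_from_ld(0.012, 100)))
```

The last call shows why precision degrades as Ne grows: the drift signal 1/(3Ne) shrinks toward zero while the sampling term 1/S stays fixed, so small errors in the mean r^2 translate into very large changes in the estimate.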
A naturally forthcoming step will be the performance assessment of panels of thousands of SNPs selected from across the genome in producing Ne estimates, especially when linkage disequilibrium coefficients can be adjusted for the chromosomal location-dependent rate of recombination (Tenesa et al., 2007).
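For linked markers, the expected disequilibrium depends on the recombination rate between the two loci. A commonly used Sved-type relationship is E(r^2) ≈ 1/(α + 4·Ne·c), where c is the recombination rate between the loci (in Morgans) and α is 1 when mutation is ignored. The sketch below inverts that relationship for each locus pair and averages the results. It is a minimal illustration under these assumptions and is not claimed to reproduce the exact adjustments in Tenesa et al. (2007); the function names and the three example locus pairs are hypothetical.

```python
import math

def ne_from_linked_pair(r2, c, n_chromosomes, alpha=1.0):
    """Invert E(r^2) = 1 / (alpha + 4 * Ne * c) for one pair of linked loci.

    r2            -- observed squared correlation between the two loci
    c             -- recombination rate between them, in Morgans (must be > 0)
    n_chromosomes -- number of sampled chromosomes (sampling term 1/n is removed)
    alpha         -- 1.0 if mutation is ignored (assumption of this sketch)
    """
    r2_adjusted = r2 - 1.0 / n_chromosomes
    if r2_adjusted <= 0:
        return math.inf  # no signal beyond sampling noise
    return (1.0 / r2_adjusted - alpha) / (4.0 * c)

def mean_ne(pairs, n_chromosomes):
    """Average the finite per-pair estimates; pairs is a list of (r2, c)."""
    estimates = [ne_from_linked_pair(r2, c, n_chromosomes) for r2, c in pairs]
    finite = [e for e in estimates if math.isfinite(e)]
    return sum(finite) / len(finite) if finite else math.inf

# Hypothetical locus pairs: (observed r^2, recombination rate in Morgans).
pairs = [(0.06, 0.01), (0.03, 0.025), (0.02, 0.05)]
print(round(mean_ne(pairs, n_chromosomes=100)))
```

Because 4·Ne·c appears in the denominator, closely linked pairs (small c) retain a measurable r^2 even when Ne is large, which is why chromosomal position is informative for SNP panels drawn from across the genome.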
14.4 CONCLUSIONS
The vast amount of information contained in the nuclear genome is becoming progressively more available owing to recent developments in DNA sequencing techniques. However, the wealth of information that can be "read" across genomes may have rather different meanings. The most substantial difference is between neutral and adaptive genetic variation: the former is primarily shaped by the interaction between random genetic drift and gene flow, whereas the latter is constrained by the action of natural selection. This fundamental issue is often "lost in translation" when geneticists communicate their findings to fisheries managers, much like the way nuclear and mitochondrial DNA evidence is sometimes lumped into the general term genetics. There is still no general consensus or golden rule as to when it is more appropriate to use adaptive or neutral markers. Initially, an emphasis on migrant exchange had put neutral markers at the forefront of stock identification applications, with the view that markers under selection would essentially "obscure" migration/drift dynamics. Recently, with the aim of resolving structure in cases where neutral tools failed to reject panmixia, adaptive markers have become an attractive alternative. Since population divergence over evolutionary timescales is the result of processes that affect the whole organism and its genome collectively, Funk et al. (2012) have recently proposed that for the identification of evolutionarily significant units (ESUs), all sets of markers should be employed together to assess divergence. On the other hand, for the identification of management units (MUs), which emphasize demographic independence, they suggest that neutral markers would be best suited, while markers under selection should be applied to detect local adaptation in population components requiring special conservation (CUs). Here we argue that, in marine populations, neutral markers may be at times ineffective at describing demographic boundaries between stock units, in which case the employment of selected adaptive markers will be necessary to address management issues, while obviously ensuring that the patterns observed are temporally stable over the timescales of interest. This approach will be strengthened by the rapidly increasing knowledge of genomes, which will allow the functional relevance of the chosen markers to be understood on a case-by-case basis. Every branch of conservation biology operates at the delicate interface between applied and fundamental research, and in order to deliver effective management advice, it is often unnecessary to explain the subtleties of the biological processes underlying the patterns observed. In fact, while ecological and evolutionary research is eventually bound to investigate biological processes, stock identification is primarily interested in detecting patterns that are informative for the purpose of management.
Advances in the field of stock identification and many other applications are attained through the inherently deep-delving and self-correcting process of scientific inquiry, which seeks to resolve the ultimate causes for the patterns observed in nature. However, in order to effectively translate basic knowledge into management, it is invariably necessary to simplify and streamline the information. In this sense, it is important to sustain continued efforts in fundamental research, because it is upon the solid platform of good scientific knowledge that improved methods can be designed, tested, and implemented.
ACKNOWLEDGMENTS
We are indebted to the members of the ICES SIMWG and WGAGFM expert groups for the numerous and stimulating discussions over the years, which are at the core of many of the issues raised in this chapter. We also thank Bill Hutchinson and Dave Weetman for the insightful and constructive comments provided on the manuscript draft.
REFERENCES
Ackerman, M.W., Habicht, C., Seeb, L.W., 2011. Single-nucleotide polymorphisms (SNPs) under diversifying selection provide increased accuracy and precision in mixed-stock analyses of sockeye salmon from the Copper River, Alaska. Trans. Am. Fish. Soc. 140, 865-881.
Andersen, O., Wetten, O.F., De Rosa, M.C., Andre, C., Alinovi, C.C., Colafranceschi, M., Brix, O., Colosimo, A., 2009. Haemoglobin polymorphisms affect the oxygen-binding properties in Atlantic cod populations. Proc. R. Soc. B Biol. Sci. 276, 833-841.
Anderson, E.C., 2010. Assessing the power of informative subsets of loci for population assignment: standard methods are upwardly biased. Mol. Ecol. Resour. 10, 701-710.
Anderson, J., 2003. The International Seafood Trade. Woodhead Publishing Ltd., p. 240.
Antoniou, A., Magoulas, A., 2013. Mitochondrial DNA methods in fisheries research. In: Cadrin, S.X., Kerr, L.A., Mariani, S. (Eds.), Stock Identification Methods. Elsevier Inc., San Diego.
Avise, J.C., 2004. Molecular Markers, Natural History and Evolution, second ed. Sinauer Associates, Sunderland, Massachusetts.
Basu, N., Todgham, A.E., Ackerman, P.A., Bibeau, M.R., Nakano, K., Schulte, P.M., et al., 2002. Heat shock protein genes and their functional significance in fish. Gene 295, 173-183.
Beacham, T.D., Candy, J.R., Supernault, K.J., Ming, T., Deagle, B., Schulze, A., Tuck, D., Kaukinen, K.H., Irvine, J.R., Miller, K.M., Withler, R.E., 2001. Evaluation and application of microsatellite and major histocompatibility complex variation for stock identification of coho salmon in British Columbia. Trans. Am. Fish. Soc. 130, 1116-1149.
Beacham, T.D., McIntosh, B., Wallace, C., 2010. A comparison of stock and individual identification for sockeye salmon (Oncorhynchus nerka) in British Columbia provided by microsatellites and single nucleotide polymorphisms. Can. J. Fish. Aquat. Sci. 67, 1274-1290.
Beaumont, M.A., 2005. Adaptation and speciation: what can FST tell us? Trends Ecol. Evol. 20, 435-440.
Bentzen, P., 1998. Seeking evidence of local stock structure using molecular genetic methods. In: von Herbing, H., Kornfield, I., Tupper, M., Wilson, J. (Eds.), The Implications of Localized Fisheries Stocks. Regional Agricultural Engineering Service, New York, pp. 20-30.
Borza, T., Stone, C., Gamperl, A.K., et al., 2009. Atlantic cod (Gadus morhua) hemoglobin genes: multiplicity and polymorphism. BMC Genet. 10, 51.
Bouck, A., Vision, T., 2007. The molecular ecologist's guide to expressed sequence tags. Mol. Ecol. 16, 907-924.
Boutet, I., Quere, N., Lecomte, F., Agnese, J.F., Guinand, B., 2008. Putative transcription factor binding sites and polymorphisms in the proximal promoter of the PRL-A gene in percomorphs and European sea bass (Dicentrarchus labrax). Mar. Ecol. Evol. Persp. 29, 354-364.
Bradbury, I.R., Hubert, S., Higgins, B., et al., 2010. Parallel adaptive evolution of Atlantic cod on both sides of the Atlantic Ocean in response to temperature. Proc. R. Soc. B Biol. Sci. 277, 3725-3734.
Bradbury, I.R., Hubert, S., Higgins, B., et al., 2011. Evaluating SNP ascertainment bias and its impact on population assignment in Atlantic cod, Gadus morhua. Mol. Ecol. Resour. 11, 218-225.
Brix, O., Forås, E., Strand, I., 1998. Genetic variation and functional properties of Atlantic cod hemoglobins: introducing a modified tonometric method for studying fragile hemoglobins. Comp. Biochem. Physiol. A 119, 575-583.
Butler, J.M., Coble, M.D., Vallone, P.M., 2007. STRs vs. SNPs: thoughts on the future of forensic DNA testing. Forensic Sci. Med. Pathol. 3 (3), 200-205. http://dx.doi.org/10.1007/s12024-007-0018-1.
Cadrin, S.X., Kerr, L., Mariani, S., 2013. Interdisciplinary stock identification for fishery management and conservation biology. In: Cadrin, S.X., Kerr, L.A., Mariani, S. (Eds.), Stock Identification Methods. Elsevier Inc., San Diego.
Chaoui, L., Gagnaire, P.A., Guinand, B., Quignard, J.P., Tsigenopoulos, C., Kara, M.H., Bonhomme, F., 2012. Microsatellite length variation in candidate genes correlates with habitat in the gilthead sea bream Sparus aurata. Mol. Ecol. http://dx.doi.org/10.1111/mec.12062.
Cohen, S., 2002. Strong positive selection and habitat-specific amino acid substitution patterns in Mhc from an estuarine fish under intense pollution stress. Mol. Biol. Evol. 19, 1870-1880.
Coscia, I., Vogiatzi, E., Kotoulas, G., Tsigenopoulos, C., Mariani, S., 2012. Exploring neutral and adaptive genetic variation in expanding populations of gilthead sea bream, Sparus aurata, in the North East Atlantic. Heredity 108, 537-546.
Cuveliers, E.L., Volckaert, F.A.M., Rijnsdorp, A.D., Larmuseau, M.H.D., Maes, G.E., 2011. Temporal genetic stability and high effective population size despite fisheries-induced life-history trait evolution in the North Sea sole. Mol. Ecol. 20, 3555-3568.
Dalziel, A.C., Rogers, S.M., Schulte, P.M., 2009. Linking genotypes to phenotypes and fitness: how mechanistic biology can inform molecular ecology. Mol. Ecol. 18, 4997-5017.
Davey, J.W., Hohenlohe, P.A., Etter, P.D., Boone, J.Q., Catchen, J.M., Blaxter, M.L., 2011. Genome-wide genetic marker discovery and genotyping using next-generation sequencing. Nat. Rev. Genet. 12, 499-510.
Eizaguirre, C., Lenz, T., Kalbe, M., Milinski, M., 2012. Rapid and adaptive evolution of MHC genes under parasite selection in experimental vertebrate populations. Nat. Commun. 3, 621.
Estoup, A., Angers, B., 1998. Microsatellites and minisatellites for molecular ecology: theoretical and empirical considerations. In: Carvalho, G.R. (Ed.), Advances in Molecular Ecology, NATO Science Series. IOS Press, Amsterdam, pp. 55-86.
Estoup, A., Jarne, P., Cornuet, J.-M., 2002. Homoplasy and mutation model at microsatellite loci and their consequences for population genetics analysis. Mol. Ecol. 11, 1591-1604.
Ford, M.J., 2002. Applications of selective neutrality tests to molecular ecology. Mol. Ecol. 11, 1245-1262.
Funk, W.C., McKay, J.K., Hohenlohe, P.A., Allendorf, F.W., 2012. Harnessing genomics for delineating conservation units. Trends Ecol. Evol. 27, 489-496.
Gagnaire, P.A., Normandeau, E., Cote, C., Hansen, M.M., Bernatchez, L., 2012. The genetic consequences of spatially varying selection in the panmictic American eel (Anguilla rostrata). Genetics 190, 725.
Galindo, H., Olson, D., Palumbi, S., 2006. Seascape genetics: a coupled oceanographic-genetic model predicts population structure of Caribbean corals. Curr. Biol. 16, 1622-1626.
Galindo, H.M., Pfeiffer-Herbert, A.S., McManus, M.A., Chao, Y., Chai, F., Palumbi, S.R., 2010. Seascape genetics along a steep cline: using genetic patterns to test predictions of marine larval dispersal. Mol. Ecol. 19, 3692-3707.
Gardner, M.J., Fitch, A., Bertozzi, T., Lowe, A.J., 2011. Rise of the machines: recommendations for ecologists when using next generation sequencing for microsatellite development. Mol. Ecol. Resour. 11, 1093-1101.
Garoia, F., Guarnieri, I., Grifoni, D., Marzola, S., Tinti, F., 2007. Comparative analysis of AFLPs and SSRs efficiency in resolving population genetic structure of Mediterranean Solea vulgaris. Mol. Ecol. 16, 1377-1387.
Gattepaille, L.M., Jakobsson, M., 2012. Combining markers into haplotypes can improve population structure inference. Genetics 190, 159-174.
Glover, K.A., Dahle, G., Jorstad, K.E., 2011. Genetic identification of farmed and wild Atlantic cod, Gadus morhua, in coastal Norway. ICES J. Mar. Sci. 68, 901-910.
Goetz, F.W., McKenzie, S., 2008. Functional genomics with microarrays in fish biology and fisheries. Fish Fish. 9, 378-395.
Greenbaum, D., Colangelo, C., Williams, K., Gerstein, M., 2003. Comparing protein abundance and mRNA expression levels on a genomic scale. Genome Biol. 4, 117.
Gregory, T.R., Nicol, J.A., Tamm, H., Kullman, B., Kullman, K., Leitch, I.J., Murray, B.G., Kapraun, D.F., Greilhuber, J., Bennett, M.D., 2007. Eukaryotic genome size databases. Nucleic Acids Res. 35, D332-D338.
Guinand, B., Lemaire, C., Bonhomme, F., 2004. How to detect polymorphisms undergoing selection in marine fishes? A review of methods and case studies, including flatfishes. J. Sea Res. 51, 167-182.
Hare, M.P., Nunney, L., Schwartz, M.K., et al., 2011. Understanding and estimating effective population size for practical application in marine species management. Conserv. Biol. 25, 438-449.
Hauser, L., Adcock, G.J., Smith, P.J., Ramirez, J.H.B., Carvalho, G.R., 2002. Loss of microsatellite diversity and low effective population size in an overexploited population of New Zealand snapper (Pagrus auratus). Proc. Natl. Acad. Sci. USA 99, 11742-11747.
Hauser, L., Baird, M., Hilborn, R., Seeb, L.W., Seeb, J.E., 2011. An empirical comparison of SNPs and microsatellites for parentage and kinship assignment in a wild sockeye salmon (Oncorhynchus nerka) population. Mol. Ecol. Resour. 11, 150-161.
Hauser, L., Carvalho, G.R., 2008. Paradigm shifts in marine fisheries genetics: ugly hypotheses slain by beautiful facts. Fish Fish. 9, 333-362.
Heino, M., 2013. Quantitative traits. In: Cadrin, S.X., Kerr, L.A., Mariani, S. (Eds.), Stock Identification Methods. Elsevier Inc., San Diego.
Helyar, S.J., Hemmer-Hansen, J., Bekkevold, D., Taylor, M.I., Ogden, R., Limborg, M.T., Cariani, A., Maes, G.E., Diopere, E., Carvalho, G.R., Nielsen, E.E., 2011. Application of SNPs for population genetics of nonmodel organisms: new opportunities and challenges. Mol. Ecol. Resour. 11 (Suppl. 1), 123-136.
Hemmer-Hansen, J., Nielsen, E., Meldrup, D., Mittelholzer, C., 2011. Identification of single nucleotide polymorphisms in candidate genes for growth and reproduction in a nonmodel organism; the Atlantic cod, Gadus morhua. Mol. Ecol. Resour. 11 (Suppl. 1), 71-80.
Hemmer-Hansen, J., Nielsen, E.E., Frydenberg, J., Loeschcke, V., 2007. Adaptive divergence in a high gene flow environment: Hsc70 variation in the European flounder (Platichthys flesus L.). Heredity 99, 592-600.
Hess, J., Matala, A.P., Narum, S.R., 2011. Comparison of SNPs and microsatellites for fine-scale application of genetic stock identification of Chinook salmon in the Columbia River Basin. Mol. Ecol. Resour. 11, 137-149.
Hohenlohe, P.A., Bassham, S., Etter, P.D., Stiffler, N., Johnson, E.A., Cresko, W.A., 2010. Population genomics of parallel adaptation in threespine stickleback using sequenced RAD tags. PLoS Genet. 6, e1000862.
Hubert, S., Higgins, B., Borza, T., Bowman, S., 2010. Development of a SNP resource and a genetic linkage map for Atlantic cod (Gadus morhua). BMC Genomics 11, 191.
Hutchinson, W.F., van Oosterhout, C., Rogers, S.I., Carvalho, G.R., 2003. Temporal analysis of archived samples indicates marked genetic changes in declining North Sea cod (Gadus morhua). Proc. R. Soc. B 270, 2125-2132.
Ihssen, P., Booke, H., Casselman, J., McGlade, J., Payne, N., Utter, F., 1981. Stock identification: materials and methods. Can. J. Fish. Aquat. Sci. 38, 1838-1855.
Jakobsdottir, K.B., Pardoe, H., Magnússon, A., et al., 2011. Historical changes in genotypic frequencies at the Pantophysin locus in Atlantic cod (Gadus morhua) in Icelandic waters: evidence of fisheries-induced selection? Evol. Appl. 4, 562-573.
Joost, S., Bonin, A., Bruford, W., et al., 2007. A spatial analysis method (SAM) to detect candidate loci for selection: towards a landscape genomics approach to adaptation. Mol. Ecol. 16, 3955-3969.
Jorde, P.E., Ryman, N., 2007. Unbiased estimator for genetic drift and effective population size. Genetics 177, 927-935.
Kalinowski, S.T., 2002. Evolutionary and statistical properties of genetic distances. Mol. Ecol. 11, 1263-1273.
Karl, S.A., Avise, J.C., 1992. Balancing selection at allozyme loci in oysters: implications from nuclear RFLPs. Science 256, 100-102.
Karr, T.L., 2008. Application of proteomics to ecology and population biology. Heredity 100, 200-206.
Kritzer, J.P., Liu, O., 2013. Metapopulation ecology and stock identification. In: Cadrin, S.X., Kerr, L.A., Mariani, S. (Eds.), Stock Identification Methods. Elsevier Inc., San Diego.
Larmuseau, M.H.D., Raeymaekers, J.A.M., Ruddick, K.G., Van Houdt, J.K.J., Volckaert, F.A.M., 2009. To see in different seas: spatial variation in the rhodopsin gene of the sand goby (Pomatoschistus minutus). Mol. Ecol. 18, 4227-4239.
Limborg, M.T., Helyar, S.J., de Bruyn, M., Taylor, M.I., Nielsen, E.E., Ogden, R., Carvalho, G.R., Consortium, F.P.T., Bekkevold, D., 2012. Environmental selection on transcriptome-derived SNPs in a high gene flow marine fish, the Atlantic herring (Clupea harengus). Mol. Ecol. 21, 3686-3703.
Liu, Z.J., 2005. Amplified fragment length polymorphism (AFLP). In: Cadrin, S.X., Friedland, K.D., Waldman, J.R. (Eds.), Stock Identification Methods. Elsevier Inc., San Diego, pp. 389-411.
Luikart, G., England, P.R., Tallmon, D., Jordan, S., Taberlet, P., 2003. The power and promise of population genomics: from genotyping to genome typing. Nat. Rev. Genet. 4, 981-994.
Manel, S., Schwartz, M.K., Luikart, G., Taberlet, P., 2003. Landscape genetics: combining landscape ecology and population genetics. Trends Ecol. Evol. 18, 189-197.
Manel, S., Gaggiotti, O., Waples, R., 2005. Assignment methods: which approaches best address which biological questions? Trends Ecol. Evol. 20, 136-142.
Mariani, S., Peijnenburg, K.T.C.A., Weetman, D., 2012. Independence of neutral and adaptive divergence in a low dispersal marine mollusc. Mar. Ecol. Prog. Ser. 446, 173-187.
McCusker, M.R., Bentzen, P., 2010. Positive relationships between genetic diversity and abundance in fishes. Mol. Ecol. 19, 4852-4862.
McKay, J.K., Latta, R.G., 2002. Adaptive population divergence: markers, QTL and traits. Trends Ecol. Evol. 17, 285-291.
Meirmans, P.G., Hedrick, P.W., 2011. Assessing population structure: FST and related measures. Mol. Ecol. Resour. 11, 5-18.
Merilä, J., Crnokrak, P., 2001. Comparison of genetic differentiation at marker loci and quantitative traits. J. Evol. Biol. 14, 892-903.
Miller, K.M., Li, S., Kaukinen, K.H., Ginther, N., Hammill, E., Curtis, J.M.R., Patterson, D.A., Sierocinski, T., Donnison, L., Pavlidis, P., Hinch, S.G., Hruska, K.A., Cooke, S.J., English, K.K., Farrell, A.P., 2011. Genomic signatures predict migration and spawning failure in wild Canadian salmon. Science 331, 214-217.
Morin, P.A., McCarthy, M., 2007. Highly accurate SNP genotyping from historical and low-quality samples. Mol. Ecol. Notes 7, 937-946.
Nielsen, E.E., Cariani, A., Mac Aoidh, E., Maes, G.E., Milano, I., Ogden, R., et al., 2012. Gene-associated markers provide tools for tackling IUU fishing and false eco-certification. Nat. Commun. 3, 851.
Nielsen, E.E., Hansen, M.M., 2008. Waking the dead: the value of population genetic analyses of historical samples. Fish Fish. 9, 450-461.
Nielsen, E.E., Hansen, M.M., Meldrup, D., 2006. Evidence of microsatellite hitch-hiking selection in Atlantic cod (Gadus morhua L.): implications for inferring population structure in nonmodel organisms. Mol. Ecol. 15, 3219-3229.
Nielsen, E.E., Hemmer-Hansen, J., Larsen, P.F., Bekkevold, D., 2009a. Population genomics of marine fishes: identifying adaptive variation in space and time. Mol. Ecol. 18, 3128-3150.
Nielsen, E.E., Hemmer-Hansen, J., Poulsen, N.A., et al., 2009b. Genomic signatures of local directional selection in a high gene flow marine organism, the Atlantic cod (Gadus morhua). BMC Evol. Biol. 9, 276.
Ogden, R., 2011. Unlocking the potential of genomic technologies for wildlife forensics. Mol. Ecol. Resour. 11, 109-116.
Palstra, F.P., Ruzzante, D.E., 2008. Genetic estimates of contemporary effective population size: what can they tell us about the importance of genetic stochasticity for wild population persistence? Mol. Ecol. 17, 3428-3447.
Pampoulie, C., Ruzzante, D.E., Chosson, V., et al., 2006. The genetic structure of Atlantic cod (Gadus morhua) around Iceland: insight from microsatellites, the Pan I locus, and tagging experiments. Can. J. Fish. Aquat. Sci. 63, 2660-2674.
Pampoulie, C., Danielsdottir, A.K., Thorsteinsson, V., Hjorleifsson, E., Marteinsdottir, G., Ruzzante, D.E., 2012. The composition of adult overwintering and juvenile aggregations of Atlantic cod (Gadus morhua) around Iceland using neutral and functional markers: a statistical challenge. Can. J. Fish. Aquat. Sci. 69, 307-320.
Papakostas, S., Vasemägi, A., Vähä, J.P., Himberg, M., 2012. A proteomics approach reveals divergent molecular responses to salinity in populations of European whitefish (Coregonus lavaretus). Mol. Ecol. http://dx.doi.org/10.1111/j.1365-294X.2012.05553.x.
Pella, J., Masuda, M., 2001. Bayesian methods for analysis of stock mixtures from genetic characters. Fish. Bull. 99, 151-167.
Pennisi, E., 2012. ENCODE project writes eulogy for junk DNA. Science 337, 1159-1161.
Petersen, M.F., Steffensen, J.F., 2003. Preferred temperature of juvenile Atlantic cod Gadus morhua with different haemoglobin genotypes at normoxia and moderate hypoxia. J. Exp. Biol. 206, 359-364.
Peterson, B.K., Weber, J.N., Kay, E.H., Fisher, H.S., Hoekstra, H.E., 2012. Double digest RADseq: an inexpensive method for de novo SNP discovery and genotyping in model and non-model species. PLoS One 7 (5), e37135.
Pogson, G.H., 2001. Nucleotide polymorphism and natural selection at the pantophysin (Pan I) locus in the Atlantic cod, Gadus morhua (L.). Genetics 157, 317-330.
Portnoy, D.S., McDowell, J.R., McCandless, C.T., Musick, J.A., Graves, J.E., 2009. Effective size closely approximates the census size in the heavily exploited western Atlantic population of sandbar shark, Carcharhinus plumbeus. Conserv. Genet. 10, 1697-1705.
Poulsen, N.A., Hemmer-Hansen, J., Loeschcke, V., Carvalho, G.R., Nielsen, E.E., 2011. Microgeographical population structure and adaptation in Atlantic cod Gadus morhua: spatio-temporal insights from gene-associated DNA markers. Mar. Ecol. Prog. Ser. 436, 231-243.
Rees, B.B., Andacht, T., Skripnikova, E., Crawford, D.L., 2011. Population proteomics: quantitative variation within and among populations in cardiac protein expression. Mol. Biol. Evol. 28, 1271-1279.
Renaut, S., Nolte, A.W., Bernatchez, L., 2010. Mining transcriptome sequences towards identifying adaptive single nucleotide polymorphisms in lake whitefish species pairs (Coregonus spp. Salmonidae). Mol. Ecol. 19 (Suppl. 1), 115-131.
Roach, J.C., Glusman, G., Smit, A.F.A., Huff, C.D., Hubley, R., Shannon, P.T., Rowen, L., Pant, K.P., Goodman, N., Bamshad, M., Shendure, J., Drmanac, R., Jorde, L.B., Hood, L., Galas, D.J., 2010. Analysis of genetic inheritance in a family quartet by whole-genome sequencing. Science 328, 636-639.
Ruzzante, D.E., Mariani, S., Bekkevold, D., Andre, C., et al., 2006. Biocomplexity in a highly migratory pelagic marine fish, Atlantic herring. Proc. R. Soc. Lond. B Biol. Sci. 273, 1459-1464.
Sagarin, R., Carlsson, J., Duval, M., Freshwater, W., Godfrey, M.H., et al., 2009. Bringing molecular tools into environmental resource management: untangling the molecules to policy pathway. PLoS Biol. 7, 426-430.
Sala-Bozano, M., Ketmaier, V., Mariani, S., 2009. Contrasting signals for multiple markers illuminate population connectivity in a marine fish. Mol. Ecol. 18, 4811-4826.
Schlesinger, M.J., 1990. Heat-shock proteins. J. Biol. Chem. 265, 12111-12114.
Schlötterer, C., Dieringer, D., 2005. A novel test statistics for the identification of local selective sweeps based on microsatellite gene diversity. In: Nurminski, D. (Ed.), Selective Sweep. Eurekah.com and Kluwer Academic/Plenum Publishers, Georgetown, Texas, pp. 55-64.
Schuster, S.C., 2008. Next-generation sequencing transforms today's biology. Nat. Methods 5, 16-18.
Secor, D.H., 2013. The stock concept. In: Cadrin, S.X., Kerr, L.A., Mariani, S. (Eds.), Stock Identification Methods. Elsevier Inc., San Diego.
Seeb, L.W., Seeb, J.E., Habicht, C., Farley, E.V., Utter, F.M., 2011a. Single-nucleotide polymorphic genotypes reveal patterns of early juvenile migration of sockeye salmon in the Eastern Bering Sea. Trans. Am. Fish. Soc. 140, 734-748.
Seeb, L.W., Templin, W.D., Sato, S., Abe, S., Warheit, K., Park, J.Y., Seeb, J.E., 2011b. Single nucleotide polymorphisms across a species' range: implications for conservation studies of Pacific salmon. Mol. Ecol. Resour. 11, 195-217.
Selkoe, K.A., Watson, J.R., White, C., Ben Horin, T., 2010. Taking the chaos out of genetic patchiness: seascape genetics reveals ecological and oceanographic drivers of genetic patterns in three temperate reef species. Mol. Ecol. 19, 3708-3726.
Sick, K., 1961. Haemoglobin polymorphism in fishes. Nature 192, 894-896.
Sick, K., 1965. Haemoglobin polymorphism of cod in Baltic and Danish Belt Sea. Hereditas 54, 19-20.
Smith, M.J., Pascal, C.E., Grauvogel, Z., Habicht, C., Seeb, J.E., Seeb, L.W., 2011. Multiplex preamplification PCR and microsatellite validation enables accurate single nucleotide polymorphism genotyping of historical fish scales. Mol. Ecol. Resour. 11, 268-277.
Spear, S., Balkenhol, N., Fortin, M., McRae, B., Scribner, K., 2010. Use of resistance surfaces for landscape genetic studies: consideration for parameterization and analysis. Mol. Ecol. 19, 3576-3591.
Spurgin, L.G., Richardson, D.S., 2010. How pathogens drive genetic diversity: MHC, mechanisms and misunderstandings. Proc. R. Soc. B Biol. Sci. 277, 979-988.
Storz, J.F., 2005. Using genome scans of DNA polymorphism to infer adaptive population divergence. Mol. Ecol. 14, 671-688.
Sultan, M., Schulz, M.H., Richard, H., Magen, A., Klingenhoff, A., Scherf, M., Seifert, M., Borodina, T., Soldatov, A., Parkhomchuk, D., Schmidt, D., O'Keeffe, S., Haas, S., Vingron, M., Lehrach, H., Yaspo, M.L., 2008. A global view of gene activity and alternative splicing by deep sequencing of the human transcriptome. Science 321, 956-960.
Tallmon, D.A., Koyuk, A., Luikart, G., Beaumont, M.A., 2008. ONeSAMP: a program to estimate effective population size using approximate Bayesian computation. Mol. Ecol. Resour. 8, 299-301.
Tenesa, A., Navarro, P., Hayes, B.J., Duffy, D.L., Clarke, G.M., Goddard, M.E., Visscher, P.M., 2007. Recent human effective population size estimated from linkage disequilibrium. Genome Res. 17, 520-526.
Therkildsen, N.O., Nielsen, E.E., Swain, D.P., Pedersen, J.S., 2010. Large effective population size and temporal genetic stability in Atlantic cod (Gadus morhua) in the southern Gulf of St. Lawrence. Can. J. Fish. Aquat. Sci. 67, 1585-1595.
Utter, F., Hodgins, H., Allendorf, F., 1974. Biochemical genetic studies of fishes: potentialities and limitations. In: Malins, D.C., Sargent, J.R. (Eds.), Biochemical and Biophysical Perspectives in Marine Biology, vol. 1. Academic Press, New York, pp. 213-238.
Utter, F., Seeb, J., 2010. A perspective on positive relationships between genetic diversity and abundance in fishes. Mol. Ecol. 19, 4831-4833.
van Oosterhout, C., Hutchinson, W.F., Wills, D.P.M., Shipley, P., 2004. MICRO-CHECKER: software for identifying and correcting genotyping errors in microsatellite data. Mol. Ecol. Notes 4, 535-538.
Vasemägi, A., Primmer, C.R., 2005. Challenges for identifying functionally important genetic variation: the promise of combining complementary research strategies. Mol. Ecol. 14, 3623-3642.
Vos, P., Hogers, R., Bleeker, M., Reijans, M., van de Lee, T., Hornes, M., Frijters, A., Pot, J., Peleman, J., Kuiper, M., 1995. AFLP: a new technique for DNA fingerprinting. Nucleic Acids Res. 23, 4407-4414.
Waldman, J.R., 1999. The importance of comparative studies in stock analysis. Fish. Res. 43, 237-246.
Wang, J., 2009. A new method for estimating effective population sizes from a single sample of multilocus genotypes. Mol. Ecol. 18, 2148-2164.
Wang, S., Meyer, E., McKay, J.K., Matz, M.V., 2012. 2b-RAD: a simple and flexible method for genome-wide genotyping. Nat. Methods 9 (8), 808-812.
Waples, R.S., 1987. A multispecies approach to the analysis of gene flow in marine shore fishes. Evolution 41, 385-400.
Waples, R.S., 1989. A generalized approach for estimating effective population size from temporal changes in allele frequency. Genetics 121, 379-391.
Waples, R.S., 1998. Separating the wheat from the chaff: patterns of genetic differentiation in high gene flow species. J. Hered. 89, 438-450.
Waples, R.S., 2010. High-grading bias: subtle problems with assessing power of selected subsets of loci for population assignment. Mol. Ecol. 19, 2599-2601.
Waples, R.S., Do, C., 2008. LDNE: a program for estimating effective population size from data on linkage disequilibrium. Mol. Ecol. Resour. 8, 753-756.
Waples, R.S., Do, C., 2010. Linkage disequilibrium estimates of contemporary Ne using highly variable genetic markers: a largely untapped resource for applied conservation and evolution. Evol. Appl. 3, 244-262.
Waples, R.S., Gaggiotti, O., 2006. What is a population? An empirical evaluation of some genetic methods for identifying the number of gene pools and their degree of connectivity. Mol. Ecol. 15, 1419-1439.
Waples, R.S., Punt, A.E., Cope, J.M., 2008. Integrating genetic data into management of marine resources: how can we do it better? Fish Fish. 9, 423-449.
Ward, R.D., Zemlak, T.S., Innes, B.H., et al., 2005. DNA barcoding Australia's fish species. Philos. Trans. R. Soc. B 360, 1847-1857.
Wennevik, V., Jørstad, K.E., Dahle, G., Fevolden, S.-E., 2008. Mixed stock analysis and the power of different classes of molecular markers in discriminating coastal and oceanic Atlantic cod (Gadus morhua L.) on the Lofoten spawning grounds, Northern Norway. Hydrobiologia 606, 7-25.
White, C., Watson, J., Siegel, D.A., Selkoe, K.A., Zacherl, D.C., Toonen, R.J., 2010. Ocean currents help explain population genetic structure. Proc. R. Soc. B 277, 1685-1694.
Wirgin, I., Waldman, J.R., 2005. Use of nuclear DNA in stock identification: single-copy and repetitive sequence markers. In: Cadrin, S.X., Friedland, K.D., Waldman, J.R. (Eds.), Stock Identification Methods. Elsevier Inc., San Diego, pp. 331-370.
Wright, J.M., Bentzen, P., 1994. Microsatellites: genetic markers for the future. Rev. Fish Biol. Fish. 4, 384-388.
Wright, S., 1931. Evolution in Mendelian populations. Genetics 16, 97-159.
Wright, S., 1943. Isolation by distance. Genetics 28, 114-138.
Wright, S., 1965. The interpretation of population structure by F-statistics with special regard to systems of mating. Evolution 19, 395-420.
Zane, L., Bargelloni, L., Patarnello, T., 2002. Strategies for microsatellite isolation: a review. Mol. Ecol. 11, 1-16.
Zhang, D.X., Hewitt, G.M., 2003. Nuclear DNA analyses in genetic studies of populations: practice, problems and prospects. Mol. Ecol. 12, 563-584.