A DNA resequencing array for genes involved in Parkinson’s disease


Parkinsonism and Related Disorders 18 (2012) 386–390


Parkinsonism and Related Disorders journal homepage: www.elsevier.com/locate/parkreldis

E.J. Wilkins a,b,1, J.P. Rubio a,1, K.E. Kotschet a,g, T.F. Cowie d, W.C. Boon a,b, M. O’Hely c, R. Burfoot a, W. Wang f, C.M. Sue h,i, T.P. Speed c, J. Stankovitch e, M.K. Horne a,b,g,*

a Florey Neuroscience Institutes, Melbourne, Australia
b Centre for Neuroscience, The University of Melbourne, Australia
c Walter and Eliza Hall Institute, Melbourne, Australia
d Department of Pathology, The University of Melbourne, Australia
e Menzies Research Institute, University of Tasmania, Hobart, Australia
f Stanford University, CA, USA
g Department of Neurology, St Vincent’s Hospital, Fitzroy, Australia
h Department of Neurogenetics, The University of Sydney, Australia
i Department of Neurogenetics, Kolling Institute of Medical Research, Royal North Shore Hospital, Sydney, Australia

Article info

Article history: Received 3 August 2011; received in revised form 25 November 2011; accepted 20 December 2011

Keywords: Parkinson’s disease; Affymetrix resequencing array; DNA resequencing

Abstract

Parkinson’s disease (PD) is aetiologically complex with both familial and sporadic forms. Familial PD results from rare, highly penetrant pathogenic mutations whereas multiple variants of low penetrance may contribute to the risk of sporadic PD. Common variants implicated in PD risk appear to explain only a minor proportion of the familial clustering observed in sporadic PD. It is therefore plausible that combinations of rare and/or common variants in genes already implicated in disease pathogenesis may help to explain the genetic basis of PD. We have developed a CustomSeq Affymetrix resequencing array to enable high-throughput sequencing of 13 genes (44 kb) implicated in the pathogenesis of PD. Using the array we sequenced 269 individuals, including 186 PD patients and 75 controls, achieving overall call rates of 96.3% and 93.6% for the two respective versions of the array, and >99.9% accuracy for five samples sequenced by capillary sequencing in parallel. We identified modest associations with common variants in SNCA and LRRK2 and a trend suggestive of an overrepresentation of rare variants in cases compared to controls for several genes. We propose that this technology offers a robust and cost-effective alternative to targeted sequencing using traditional sequencing methods, and here we demonstrate the potential of this approach for either routine clinical investigation or for research studies aimed at understanding the genetic aetiology of PD. Crown Copyright © 2011 Published by Elsevier Ltd. All rights reserved.

* Corresponding author. Florey Neurosciences Institute, 2nd Floor, Alan Gilbert Building, 161 Barry St, Carlton South, 3053 Victoria, Australia. Tel.: +613 8344 1800. E-mail address: malcolm.horne@florey.edu.au (M.K. Horne).
1 Wilkins and Rubio are equal first authors.

1353-8020/$ – see front matter. Crown Copyright © 2011 Published by Elsevier Ltd. All rights reserved. doi:10.1016/j.parkreldis.2011.12.012

1. Introduction

Although most cases of Parkinson’s disease (PD) present as sporadic disease, between 5 and 10% of these cases have first- or second-degree relatives with PD, suggesting a Mendelian inheritance pattern and a broader role for genetics in the heritability of PD [1]. Pathological sequence variants have been identified in genes underlying familial PD [2–7]. Candidate gene association studies and genome-wide association studies demonstrate that risk of PD [7–17] is conferred by common variants in genes implicated in familial disease as well as other genes. However, the small effect sizes observed in these association studies fail to explain much of the familial clustering of PD, leading to the hypothesis that rare variants in known and unknown genes may account for the missing genetic component.

Public-access genomic databases of human sequence variants are heavily biased towards common variants, so deep and comprehensive resequencing of genes already implicated in PD is likely to provide further insights into the genetic basis of PD. The high cost and low throughput of conventional sequencing methods have inhibited large-scale targeted sequencing of genes implicated in PD, often limiting studies to a small number of genes or to genotyping of known variants. Therefore, a higher-throughput, less expensive alternative to conventional sequencing would be invaluable for both researcher and clinician.

To this end, we developed the PD GeneChip®, a custom-designed resequencing array that produces 44 kilobases (kb) of accurate and reproducible sequence data from genes implicated in PD, with significantly lower labour and reagent costs than conventional techniques [18]. We have screened 269 samples (186 PD cases, 75 controls, one asymptomatic sibling of a PD case, and 7 individuals with other neurological diseases) using the PD GeneChip® to evaluate the accuracy and reproducibility of the array and to provide insights into the genetic basis of PD. We demonstrate that even with a relatively small sample, it is possible to detect pathogenic alterations and common sequence variants associated with disease.

2. Materials and methods

The study was approved by the ethics committees of the Royal Melbourne, St Vincent’s, Austin and Royal North Shore Hospitals and adhered to the National Health and Medical Research Council code of practice. All subjects provided written informed consent.

2.1. Samples

Samples came from 186 unrelated PD cases, seven individuals with other neurological diagnoses (multiple system atrophy (3), dementia with Lewy bodies, progressive supranuclear palsy, normal pressure hydrocephalus and PD, or essential tremor), one asymptomatic sibling of a PD case, and 75 (predominantly spousal) unaffected controls. Most individuals self-reported northern European ancestry, although 40 cases and 10 controls reported mixed but predominantly European ancestry. In addition, three control subjects self-reported an affected relative. All cases attended specialist movement disorder clinics and met the United Kingdom Parkinson’s Disease Society Brain Bank Clinical Diagnostic Criteria [19]. The average age of onset of PD in the cohort was 55 ± 11 years (mean ± standard deviation). Forty-eight of the cohort had at least one first-degree relative with PD and 58 had at least one second-degree relative with PD. Age of onset was known in 27 cases with first-degree relatives with PD and in 22 with second-degree relatives with PD; the average age of PD onset was 54 ± 8 and 52 ± 12 years, respectively.

Ten of the 186 samples carried pathogenic sequence alterations previously characterised by capillary sequencing or multiplex ligation-dependent probe amplification. Six samples carried single nucleotide substitutions: three carried the LRRK2 c.6055G>A (p.Gly2019Ser) variant in the heterozygous state, two carried the LRRK2 c.4324G>C (p.Ala1442Pro) variant in the heterozygous state [20] and one carried a homozygous point mutation in the PINK1 gene (C. Sue, unpublished data). Four samples carried heterozygous duplication or deletion variants in the PARK2 gene. Samples from these individuals were included as positive controls.

2.2. Resequencing array design

Two 50 kb versions of the PD GeneChip® were designed (Version 1 and 2) using GeneChip® CustomSeq® Custom Resequencing Array technology (Affymetrix, Santa Clara, CA [18]). The initial design covered all exons, flanking intronic sequence and 5′ and 3′ sequence for the genes ATP13A2 (OMIM *610513), PARK7 (OMIM *602533), GCH1 (OMIM *600225), LRRK2 (OMIM *609007), NR4A2 (OMIM *601828), PARK2 (OMIM *600116), PINK1 (OMIM *608309), SNCA (OMIM *163890), SNCAIP (OMIM *603779), TH (OMIM *191290) and UCHL1 (OMIM +191342), as well as selected sequence from APP (OMIM +104760), MAPT (OMIM +157140), PSEN1 (OMIM +104311) and PSEN2 (OMIM +600759). Six variants from the APOE (OMIM +107741; rs429358, rs7412) and COMT (OMIM +116790; rs4633, rs4680, rs165599, rs737865) genes were also included on the array. The second design also included the GBA (OMIM *606463) gene. Gene sequences were obtained from the UCSC Genome Browser March 2006 assembly [21] (http://genome.ucsc.edu/) and were checked with RepeatMasker [22] (http://www.repeatmasker.org) for repetitive elements and internal duplications before final array design.

2.3. DNA extraction and polymerase chain reaction (PCR)

Genomic DNA was extracted from saliva (102 samples; DNA Genotek, Kanata, Ontario), whole blood (160 samples; QIAGEN, Valencia, CA or as described in [23]) or primary cell cultures (7 samples; Promega, Madison, WI). For Version 1, specific regions of the genome were PCR amplified using GoTaq Flexi DNA Polymerase (Promega, Madison, WI) and 130 short-range reactions. For Version 2, specific regions of the genome in each sample (~4 μg genomic DNA) were PCR amplified using a combination of 24 long-range (TaKaRa LA Taq; Takara Bio Inc., Japan) and 48 short-range reactions (GoTaq Flexi DNA Polymerase; Promega). Primer sequences and reaction conditions are available upon request. Primers were designed using the Primer3 program [24] (http://fokker.wi.mit.edu/primer3/input.htm), except the primers for the GBA gene, which have been previously described [25].

2.4. Quantitation, pooling, fragmentation, and labelling of PCR products

For short-range PCR reactions, amplicon concentrations were measured using BioAnalyzer DNA 7500 LabChips (Agilent, Santa Clara, CA). Long-range PCR reactions were quantitated by eye using a 1% agarose gel. Equimolar amounts of each PCR product were pooled, purified (QIAquick PCR Purification Kit; QIAGEN), and processed (GeneChip® Resequencing Assay Kit; Affymetrix). Briefly, pooled DNA was fragmented with 2.5 U/μl Fragmentation Reagent (0.2 U/μg DNA) at 37 °C for 35 min and inactivated at 95 °C for 15 min. DNA was biotinylated using terminal deoxynucleotidyl transferase.

2.5. Hybridisation and analysis

Resequencing array hybridisation and analysis were performed at the Australian Genome Research Facility (Melbourne, Australia) according to the manufacturer’s protocol. Briefly, after arrays were pre-hybridised for 15 min, denatured biotinylated DNA samples were applied to the array for hybridisation (16 h at 49 °C with 60 rotations per minute). This was followed by washing and staining on automated fluidic stations, and scanning on a GeneChip® Scanner 3000 (Affymetrix). The intensity data (.CEL files), generated from raw image files (.DAT), were analysed by the GeneChip® Sequence Analysis Software, Version 4.0 (GSEQ v4.0, Affymetrix). Analysis values were set to default for a diploid genome, except that the no signal threshold was set to 1 and the quality score threshold was set to 3.

2.6. Capillary sequencing

DNA amplicons were sequenced (BigDye v3.1 Terminator DNA Sequencing Kit; Applied Biosystems, Foster City, CA) at Applied Genetic Diagnostics (University of Melbourne, Australia). Sequencing products were purified with DyeEx 2.0 Spin Kits (QIAGEN) and separated by capillary electrophoresis on an ABI3130xl Genetic Analyzer (Applied Biosystems), and sequence data were analysed with the SeqMan program (DNASTAR, Madison, WI).

2.7. Accuracy and reproducibility analysis

The sensitivity of the array technology was assessed using the six positive control samples carrying PD-associated single nucleotide substitutions. The accuracy of the array relative to capillary sequencing was defined in terms of false positive and false negative rates for five additional samples. Additionally, the reproducibility was defined as the proportion of identical calls made across 11 samples processed in duplicate.

2.8. Genotype–phenotype analysis

Data cleaning (PLINK Version 1.07 [26,27]) was performed prior to genotype–phenotype analysis. For the 11 samples processed in duplicate, discordant calls within a pair were excluded and then one dataset from each pair was removed entirely.
Tri- and tetra-allelic variants, those with a genotyping rate less than 85% (after accounting for PCR failure), and those identified as false positives in the accuracy analysis were all excluded. Additionally, variants within fragments with a reference call rate less than 99% (i.e. more than one in 100 nucleotides called non-reference), and variant calls that lay within nine nucleotides of a known insertion or deletion variant, were removed. The asymptomatic sibling of a PD case was excluded from association tests. Tests for Hardy–Weinberg equilibrium were performed for each variant in controls; a p-value < 0.001 led to exclusion from further analysis. Variants with minor allele frequency (MAF) > 0.02 were tested for allelic association using the Cochran–Mantel–Haenszel test, stratifying by array version. The Breslow–Day test was used to test for heterogeneity between array versions. Where a sample had more than two rare variants (MAF < 0.02) within a fragment, those variant calls were set to no-call. Rare variants were then pooled by gene and tested for allelic association using Fisher’s Exact Test. One-sided tests were used for each gene to test the hypothesis that rare variants occur more frequently in cases than controls.
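
The two association tests described above can be illustrated with a minimal Python sketch. It is not the PLINK implementation used in the study: the function names, the 2×2 table layout (minor and major allele counts in cases, then in controls) and the omission of a continuity correction are assumptions made for illustration, and the allele counts in the usage lines are invented.

```python
import math


def cmh_test(strata):
    """Cochran-Mantel-Haenszel test over a list of 2x2 allele-count tables.

    Each stratum (here: one array version) is a tuple (a, b, c, d) where
    a, b are minor/major allele counts in cases and c, d in controls.
    Returns (chi-square with 1 df, p-value); no continuity correction is applied.
    """
    deviation = 0.0  # sum over strata of (observed a - expected a)
    variance = 0.0   # sum over strata of the hypergeometric variance of a
    for a, b, c, d in strata:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum with fewer than two alleles carries no information
        deviation += a - (a + b) * (a + c) / n
        variance += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    if variance == 0.0:
        return 0.0, 1.0
    chi2 = deviation * deviation / variance
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher's exact test: P(cases carry >= a rare alleles).

    a, b are rare/other allele counts in cases; c, d the same in controls.
    The p-value is the upper hypergeometric tail with all margins fixed.
    """
    case_alleles = a + b
    rare_alleles = a + c
    n = a + b + c + d
    tail = sum(
        math.comb(case_alleles, k) * math.comb(n - case_alleles, rare_alleles - k)
        for k in range(a, min(case_alleles, rare_alleles) + 1)
    )
    return tail / math.comb(n, rare_alleles)


# Hypothetical counts for one common variant on array Version 1 and Version 2.
chi2, p = cmh_test([(30, 170, 10, 90), (25, 147, 8, 42)])
print(f"CMH chi-square = {chi2:.3f}, p = {p:.3f}")

# Hypothetical pooled rare-allele counts for one gene (cases vs controls).
print(f"One-sided Fisher p = {fisher_one_sided(12, 360, 1, 149):.3f}")
```

Stratifying by array version in this way keeps a difference in call behaviour between Version 1 and Version 2 from being mistaken for a case–control difference, which is the stated reason for using a stratified test in this design.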

3. Results

Two versions of the PD GeneChip® were designed and specific sequences such as repetitive elements and internal duplications were excluded where necessary. The two designs resequenced 44,020 base pairs (bp) and 44,471 bp of human genomic sequence, respectively. The resequencing data for 269 individuals, or 280 arrays (11 samples processed in duplicate), were analysed using GSEQ, and data from the two versions of the array were analysed separately (Table 1). After accounting for PCR failures (no visible product using the Bioanalyzer), overall GSEQ call rates were 96.3% (Version 1) and 93.6% (Version 2).
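
As a worked check of these figures, the short Python sketch below recomputes the call rates from the nucleotide counts reported in Table 1. The dictionary keys and the helper name `call_rate` are illustrative choices, not part of GSEQ; the numbers themselves are taken from Table 1.

```python
def call_rate(called, analysed):
    """Call rate in percent: nucleotides called / nucleotides analysed."""
    return 100.0 * called / analysed


# Counts from Table 1 (nucleotides analysed already account for PCR failure).
table1 = {
    "Version 1": {"per_array": 44020, "arrays": 148, "analysed": 6308557,
                  "called": 6076000, "called_after_qc": 6055050},
    "Version 2": {"per_array": 44471, "arrays": 132, "analysed": 5870172,
                  "called": 5495395, "called_after_qc": 5240240},
}

for version, row in table1.items():
    total_possible = row["per_array"] * row["arrays"]
    before_qc = round(call_rate(row["called"], row["analysed"]), 1)
    # "after QC" = fragments with a call rate below 85% set entirely to no-call.
    after_qc = round(call_rate(row["called_after_qc"], row["analysed"]), 1)
    print(version, total_possible, before_qc, after_qc)
```

The products of nucleotides per array and number of arrays reproduce the "total possible" row of Table 1, and the two ratios reproduce the overall call rates before and after the fragment-level quality filter.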


Table 1
Summary results

Measure | Version 1 | Version 2
Total possible number of nucleotides analysed per array | 44,020 | 44,471
Number of arrays analysed | 148a | 132
Total possible number of nucleotides analysed | 6,514,960 | 5,870,172
Total number of nucleotides analysed (accounting for PCR failure) | 6,308,557 | 5,870,172
Number of nucleotides called by GSEQ (accounting for PCR failure) | 6,076,000 | 5,495,395
Overall call rate (relative to number of nucleotides analysed) | 96.3% | 93.6%
Number of fragments with call rate less than 85% (short-range/long-range PCR) | 372 (372/-) | 1577 (387/1190)
Number of nucleotides called by GSEQ (accounting for PCR failure and fragments with call rate less than 85%) | 6,055,050 | 5,240,240
Overall call rate (relative to number of nucleotides analysed) | 96.0% | 89.3%
Minimum call rate per array | 86.2% | 32.8%
Maximum call rate per array | 97.5% | 96.5%
Median call rate per array | 96.5% | 93.1%

a Eleven samples processed in duplicate, 137 unique samples processed.

As previously suggested [28], individual fragments with a call rate below 85% were set entirely to no-call. This affected 372 and 1577 fragments from the Version 1 and Version 2 datasets, respectively. Following this quality control step, the overall GSEQ call rates were 96.0% (Version 1) and 89.3% (Version 2). The overall call rate for Version 2 was affected by 15 arrays with particularly low call rates (<80%), as illustrated by the relatively high median call rate per array (93.1%).

3.1. Validation of the PD GeneChip®

Ten samples carrying pathogenic sequence variants previously characterised by capillary sequencing or multiplex ligation-dependent probe amplification (MLPA) were processed (Version 1) as a test of the array’s sensitivity with respect to known pathogenic variants. Six of the variants were single nucleotide substitutions and four were large duplication or deletion variants. Three (LRRK2: G2019S, A1442P; PINK1: W437G) of the six single nucleotide substitution variants were correctly identified by GSEQ, with the remaining three (LRRK2: 2× G2019S, A1442P) assigned no-calls (Table S1). The actual genotypes were confirmed by visualising the raw data files. None of the duplication or deletion variants were correctly identified by GSEQ, and none showed evidence of hybridisation disruption. Note, however, that the breakpoints of the three whole-exon duplication/deletion variants have not been determined and were likely not sequenced by the GeneChip®, and all four variants were present in the heterozygous state, making it unlikely that GSEQ could detect disrupted hybridisation.

Prior to fragmentation and labelling of PCR products, 11 samples were halved and each sub-sample was hybridised to its own array (Version 1). The reproducibility of the data was examined by calculating how often the pairs of arrays made identical calls. Identical calls were made at 96.4–98.9% of nucleotides sequenced on the 11 pairs of arrays (Table S2). Most non-identical calls resulted from one sample of the pair being a no-call, with no more than 25 calls (0.06%) being truly non-identical within a pair.

Accuracy was assessed by comparing nucleotide calls obtained from five samples (Version 1) using GSEQ with those obtained by capillary sequencing. The overall accuracy of the array ranged from 99.93% to 99.95% (Table S3). The maximum number of false positive calls for a given sample was 18 (0.04%) and the maximum number of false negative calls for a given sample was 12 (0.03%).

3.2. Genotype–phenotype analyses

Following data cleaning, 746 single nucleotide variants (SNVs) were analysed using PLINK. Sixteen variants were excluded because Hardy–Weinberg equilibrium p-values were less than 0.001. The remaining 730 variants were tested for allelic association using the Cochran–Mantel–Haenszel test, tested for heterogeneity between array versions using the Breslow–Day test, and then separated based on the MAF. Sixty-three variants were classed as common (MAF > 0.02). Of these, the difference in call rate between Version 1 and Version 2 was significant for two variants (Fisher’s Exact Test), and five variants were polymorphic in only one version of the array. Of the remaining 56 variants, weak allelic association with PD risk was demonstrated for seven, with uncorrected p-values of 0.1 or less. Two of the variants showing weak association (rs2619361 (p = 0.01) and rs1372519 (p = 0.06)) are located upstream of the SNCA translation start site, and two others (rs11556273 (p = 0.09) and rs13129604 (p = 0.10)) are located upstream of the UCHL1 translation start site and within intron two of the UCHL1 gene, respectively. The remaining three modestly associated variants (rs10878405 (p = 0.04, p.E2108E), rs33995883 (p = 0.07, p.N2081D) and rs33962975 (p = 0.08, p.G2385G)) are located within protein coding regions of the LRRK2 gene. Of the three SNVs in LRRK2, p.N2081D and p.E2108E lie within the kinase domain and p.G2385G is located in the WD40 domain of the predicted protein. Six hundred and sixty-seven variants were classed as rare (MAF < 0.02). Where samples had more than two rare variants within a fragment, those variant calls were set to no-call. This eliminated 95 variants, leaving 572 rare variants for analysis. These variants were pooled by gene and tested for allelic association with PD risk using Fisher’s Exact Test.
There was a trend towards more rare variants in cases than controls in the genes DJ1 (p = 0.01), SNCA (p = 0.04) and LRRK2 (p = 0.06) (Table S4), but these differences were not significant after correction for multiple testing of 16 genes.

The presence of confirmed and potentially pathogenic sequence alterations in PD-associated genes was demonstrated (Table 2). A heterozygous LRRK2 p.Tyr1699Cys missense variant was identified in a case with a strong family history of PD and an early disease onset. Three carriers (two cases and a control) of LRRK2 p.Ile1371Val were also identified. In addition, a heterozygous PINK1 p.Gln456X nonsense variant was identified in a control. There were also six non-synonymous variants with unclear pathogenicity in ten cases and two controls. We did not identify any carriers of LRRK2 M1646T, which was recently shown to be associated with PD in whites of European origin, although this variant was not particularly common in that study (MAF ~1.6%) [14]. Further, we did not identify any of the variants associated with PD in Asians (A419V, G2385R) or Arab-Berbers (Y2189C) in the same study; however, these variants appear to be ethnically restricted. We identified two of the three variants (N551K, frequency 5.2%; K1423K, frequency 3.1%) comprising the N551K-R1398H-K1423K protective haplotype, and both were in strong linkage disequilibrium in our sample: 14 of the 15 carriers of the minor A allele at K1423K also carried the minor G allele at N551K.

Table 2
Pathogenic sequence alterations in PD-associated genes

Gene | DNA | Protein | dbSNP | Carriers | Zygosity
Parkin | 823T>C | Arg275Trp | rs34424986 | 3 cases, 1 control | Heterozygous
Parkin | 930C>G | Glu310Asp | rs72480423 | 2 cases | Heterozygous
Parkin | 1310T>C | Pro437Leu | – | 1 case | Heterozygous
PINK1 | 952T>A | Met318Leu | – | 1 case | Heterozygous
PINK1 | 1366T>C | Gln456X | rs45539432 | 1 control | Heterozygous
LRRK2 | 2769C>G | Gln923His | rs58559150 | 1 case | Heterozygous
LRRK2 | 5096G>A | Tyr1699Cys | rs35801418 | 1 case | Heterozygous
LRRK2 | 4111G>A | Ile1371Val | rs17466213 | 2 cases, 1 control | Heterozygous

4. Discussion

We developed the PD GeneChip®, a custom-designed resequencing array that delivers 44 kb of accurate and reproducible sequence data with significantly lower labour and reagent costs than conventional techniques [18]. In addition to validating the technology for use in high-throughput resequencing, we have demonstrated that even with relatively small sample numbers, full resequencing of genes implicated in PD can identify pathogenic sequence alterations and provide for genotype–phenotype association studies.

By sequencing 11 samples in duplicate and comparing the PD GeneChip® data from five other samples to capillary sequencing data, we have demonstrated that array resequencing is both accurate and reproducible. Indeed, we may have underestimated the accuracy of the array, as visual inspection of the capillary sequencing data indicated that several of the apparent false negative array calls could have been false positive calls in the capillary sequencing data. Nevertheless, GSEQ did not identify every known single nucleotide substitution variant in the positive control samples, and visual inspection of no-calls at sites of known disease-associated variation is therefore required to reduce false negative rates, as is often required when using conventional sequencing methods. Furthermore, this technology is unlikely to accurately identify insertions or deletions unless they are specifically incorporated into the array design by way of alternate probes. The presence of insertions or deletions may on occasion be inferred from disruptions in normal patterns of hybridisation, and this was observed in our dataset (data not shown).
Copy number variations in known PD genes may have pathogenic consequences, and we acknowledge the inability to detect them as a limitation of this technology and, indeed, of conventional sequencing methods. We would therefore recommend that sequence-based analyses using this approach be complemented with analysis of exon-level dosage, using MLPA, for example.

Next-generation sequencing (NGS) provides the opportunity to characterise the genetic architecture of the human genome at high resolution. Off-the-shelf kits for whole-genome and exome sequencing are now commercially available and large consortia have been formed to interrogate the full spectrum of human genomic variation [29]. While NGS is an attractive method for targeted sequencing, the set-up costs for custom projects, the analytical know-how required to manage and QC data, and the scale of project required to maximise its benefits may preclude its use in some DNA sequencing experiments (e.g. for mutation screening of a single case at the request of the treating physician). It is thus likely that, for some time to come, Sanger-based sequencing methods will remain useful and that array-based sequencing such as CustomSeq will also have a place, providing an “off-the-shelf” option for modestly-sized sequencing projects or sample-by-sample analyses with turnaround within a week. Furthermore, data analysis and QC for this technology are supported by tailor-made software, and the requisite hardware for chip processing and scanning is ubiquitous. A recent report examining the association between LRRK2 exonic variants and susceptibility to Parkinson’s disease exemplifies the effects that rare and common variation at gene loci can have on risk of disease [14].

Short-range PCR produced better sequencing data than long-range PCR. Data from long-range amplicons were of poorer quality, as evidenced by numerous fragments with low call rates or low reference call rates, as well as many more variant calls (data not shown). This inferior performance has been discussed previously [28], and is likely due to under-fragmentation of the longer amplicons, resulting in inefficient hybridisation to the arrays. In addition, the greater variability in quantities of amplified DNA produced by long-range reactions may not have been adequately accounted for in our quantification process. This outcome is unfortunate because long-range PCR reactions require less labour, and their use would improve the throughput of this technology.

Despite the advances in NGS methods, alternative techniques that support targeted sequencing in small to modestly-sized projects are still required. We propose that array-based resequencing complements conventional methods for modestly-sized sequencing projects where fast turnaround is necessary for only a few, or hundreds of, samples at a time. Specifically, the PD GeneChip® represents both a clinical and academic research tool that could be used to understand the genetic aetiology of PD.

Acknowledgements

We thank Laura Johnson for sample handling. For funding of this research we are grateful to The Australian Research Council (Linkage Project: LPO776735), The Australian Brain Foundation and the Rebecca L. Cooper Medical Research Foundation.

Appendix. Supplementary material

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.parkreldis.2011.12.012.

References

[1] Sveinbjornsdottir S, Hicks AA, Jonsson T, Petursson H, Gugmundsson G, Frigge ML, et al. Familial aggregation of Parkinson’s disease in Iceland. N Engl J Med 2000 Dec 14;343(24):1765–70.
[2] Polymeropoulos MH, Lavedan C, Leroy E, Ide SE, Dehejia A, Dutra A, et al. Mutation in the alpha-synuclein gene identified in families with Parkinson’s disease. Science 1997 Jun 27;276(5321):2045–7.
[3] Kitada T, Asakawa S, Hattori N, Matsumine H, Yamamura Y, Minoshima S, et al. Mutations in the parkin gene cause autosomal recessive juvenile parkinsonism. Nature 1998 Apr 9;392(6676):605–8.
[4] Bonifati V, Rizzu P, van Baren MJ, Schaap O, Breedveld GJ, Krieger E, et al. Mutations in the DJ-1 gene associated with autosomal recessive early-onset parkinsonism. Science 2003 Jan 10;299(5604):256–9.
[5] Valente EM, Abou-Sleiman PM, Caputo V, Muqit MM, Harvey K, Gispert S, et al. Hereditary early-onset Parkinson’s disease caused by mutations in PINK1. Science 2004 May 21;304(5674):1158–60.
[6] Paisan-Ruiz C, Jain S, Evans EW, Gilks WP, Simon J, van der Brug M, et al. Cloning of the gene containing mutations that cause PARK8-linked Parkinson’s disease. Neuron 2004 Nov 18;44(4):595–600.
[7] Mizuta I, Satake W, Nakabayashi Y, Ito C, Suzuki S, Momose Y, et al. Multiple candidate gene analysis identifies alpha-synuclein as a susceptibility gene for sporadic Parkinson’s disease. Hum Mol Genet 2006 Apr 1;15(7):1151–8.
[8] Pankratz N, Wilk JB, Latourelle JC, DeStefano AL, Halter C, Pugh EW, et al. Genomewide association study for susceptibility genes contributing to familial Parkinson disease. Hum Genet 2009 Jan;124(6):593–605.
[9] Simon-Sanchez J, Schulte C, Bras JM, Sharma M, Gibbs JR, Berg D, et al. Genome-wide association study reveals genetic risk underlying Parkinson’s disease. Nat Genet 2009 Dec;41(12):1308–12.
[10] Fung HC, Scholz S, Matarin M, Simon-Sanchez J, Hernandez D, Britton A, et al. Genome-wide genotyping in Parkinson’s disease and neurologically normal controls: first stage analysis and public release of data. Lancet Neurol 2006 Nov;5(11):911–6.
[11] Satake W, Nakabayashi Y, Mizuta I, Hirota Y, Ito C, Kubo M, et al. Genomewide association study identifies common variants at four loci as genetic risk factors for Parkinson’s disease. Nat Genet 2009 Dec;41(12):1303–7.
[12] Maraganore DM, de Andrade M, Elbaz A, Farrer MJ, Ioannidis JP, Kruger R, et al. Collaborative analysis of alpha-synuclein gene promoter variability and Parkinson disease. JAMA 2006 Aug 9;296(6):661–70.
[13] Di Fonzo A, Wu-Chou YH, Lu CS, van Doeselaar M, Simons EJ, Rohe CF, et al. A common missense variant in the LRRK2 gene, Gly2385Arg, associated with Parkinson’s disease risk in Taiwan. Neurogenetics 2006 Jul;7(3):133–8.
[14] Ross OA, Soto-Ortolaza AI, Heckman MG, Aasly JO, Abahuni N, Annesi G, et al. Association of LRRK2 exonic variants with susceptibility to Parkinson’s disease: a case-control study. Lancet Neurol 2011 Oct;10(10):898–908.
[15] Lwin A, Orvisky E, Goker-Alpan O, LaMarca ME, Sidransky E. Glucocerebrosidase mutations in subjects with parkinsonism. Mol Genet Metab 2004 Jan;81(1):70–3.
[16] Srinivasan BS, Doostzadeh J, Absalan F, Mohandessi S, Jalili R, Bigdeli S, et al. Whole genome survey of coding SNPs reveals a reproducible pathway determinant of Parkinson disease. Hum Mutat 2009 Feb;30(2):228–38.
[17] Maraganore DM, de Andrade M, Lesnick TG, Strain KJ, Farrer MJ, Rocca WA, et al. High-resolution whole-genome association study of Parkinson disease. Am J Hum Genet 2005 Nov;77(5):685–93.
[18] Affymetrix. GeneChip® CustomSeq® resequencing array base calling algorithm version 2.0: performance in homozygous and heterozygous SNP detection; 2006.
[19] Hughes AJ, Daniel SE, Kilford L, Lees AJ. Accuracy of clinical diagnosis of idiopathic Parkinson’s disease: a clinico-pathological study of 100 cases. J Neurol Neurosurg Psychiatry 1992 Mar;55(3):181–4.
[20] Huang Y, Halliday GM, Vandebona H, Mellick GD, Mastaglia F, Stevens J, et al. Prevalence and clinical features of common LRRK2 mutations in Australians with Parkinson’s disease. Mov Disord 2007 May 15;22(7):982–9.
[21] Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM, et al. The human genome browser at UCSC. Genome Res 2002 Jun;12(6):996–1006.
[22] Smit A, Hubley R, Green P. RepeatMasker Open-3.0. 1996–2004.
[23] Lahiri DK, Nurnberger Jr JI. A rapid non-enzymatic method for the preparation of HMW DNA from blood for RFLP studies. Nucleic Acids Res 1991 Oct 11;19(19):5444.
[24] Rozen S, Skaletsky H. Primer3 on the WWW for general users and for biologist programmers. Methods Mol Biol 2000;132:365–86.
[25] Finckh U, Seeman P, von Widdern OC, Rolfs A. Simple PCR amplification of the entire glucocerebrosidase gene (GBA) coding region for diagnostic sequence analysis. DNA Seq 1998;8(6):349–56.
[26] Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 2007 Sep;81(3):559–75.
[27] Purcell S. PLINK v1.07.
[28] Kothiyal P, Cox S, Ebert J, Aronow BJ, Greinwald JH, Rehm HL. An overview of custom array sequencing. Curr Protoc Hum Genet 2009 Apr;Chapter 7:Unit 7.17.
[29] 1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. Nature 2010 Oct 28;467(7319):1061–73.