doi:10.1006/geno.2001.6651, available online at http://www.idealibrary.com on IDEAL
Short Communication
Identification of the Human Cortactin-Binding Protein-2 Gene from the Autism Candidate Region at 7q31
Joseph Cheung,1 Erwin Petek,2 Kazuhiko Nakabayashi,1 Lap-Chee Tsui,1 John B. Vincent,1 and Stephen W. Scherer1,*
1 Department of Genetics, The Hospital for Sick Children, and The Department of Molecular and Medical Genetics, University of Toronto, Toronto, Ontario, M5G 1X8, Canada
2 Institute of Medical Biology and Human Genetics, University of Graz, Harrachgasse 21/8, A-8010 Graz, Austria
* To whom correspondence and reprint requests should be addressed. Fax: (416) 813-8319. E-mail: [email protected].
Human chromosome 7q31 contains putative susceptibility loci for autism (AUTS1) and speech and language disorder (SPCH1). We report here the identification and characterization of a novel gene encoding cortactin-binding protein-2 (CORTBP2), which is located 45 kb telomeric to the cystic fibrosis transmembrane conductance regulator gene (CFTR) at 7q31.3. The full-length (5975-bp) gene was isolated and found to be composed of 23 exons encompassing 170 kb of DNA. In addition to being a positional candidate for AUTS1, CORTBP2 was expressed at highest levels in the brain, as shown by northern blot analysis. Subsequent mutation analysis of CORTBP2 in 90 autistic patients identified two polymorphisms, including a leucine to valine change caused by a T to G substitution in exon 15. However, comparison of allele frequencies between autistic and control populations (n = 96) showed no significant difference, suggesting that this variant is not a susceptibility factor for autism.
Autism, or autistic disorder (AD, MIM 209850), is a severe and debilitating developmental condition with deficits in reciprocal social interaction and communication skills, with onset in childhood. The q31 band of human chromosome 7 has been implicated in a number of genome-wide linkage studies for autism as a potential site harboring a disease locus (called AUTS1) [1–4]. Moreover, studies of a family with a specific speech and language disorder (MIM 602081) featuring some symptoms in common with AD have also identified a disease locus, SPCH1, at 7q31 [5]. A number of autism patients with cytogenetic abnormalities mapping to this region have been reported, adding further support for involvement of 7q31 in autism or speech and language disorder. We recently reported an autism patient (HSC1) with a balanced translocation, t(7;13)(q31.2;q21), for which the 7q31 breakpoint interrupted the gene ST7 (FAM4A1 and RAY1), situated just centromeric to CFTR and WNT2 [6] (Fig. 1). We have also identified another autistic individual (HSC2; t(5;7)(q15;q31.3)) with a translocation breakpoint positioned between CFTR and KCND2 (Fig. 1, and unpublished data). Two additional
individuals, one autistic (inv(7)(p12.2q31.3)) [7] and the other having speech and language disorder (named BRD, t(2;7)(p23;q31.3)) [8], also have breakpoints in the CFTR–KCND2 region (Fig. 1). Therefore, we have constructed a transcription map of this genomic interval to identify candidate genes for mutational analysis in autism patients.
Initially, the gene prediction program Genscan [9] was used to deduce gene models from genomic sequence contained in BAC clones AC004240 and AC007568, located between CFTR and KCND2. Three separate models, transcribed in the same direction with respect to the DNA sequence, were predicted (Fig. 1), and Unigene cluster Hs.65748 was found in the 3′ region of model 1 upon BLAST analysis [10]. A CpG island was detected in the 5′ end of one of the gene models by a CpG frequency analysis program (http://www.ebi.ac.uk/cpg/). BLASTX against the nonredundant (nr) database indicated that Genscan models 1, 2, and 3 shared 40% amino acid identity with a predicted pufferfish gene (GenBank acc. no. AJ271361).
To generate the full-length cDNA sequence of this putative new gene in human, we designed PCR primers from the predicted exons (Fig. 1) and carried out reverse transcription (RT)-PCR on brain cDNA. PCR products were subsequently purified by Qiagen columns and directly sequenced in both orientations using standard protocols. A continuous cDNA sequence of 5975 bp (CORTBP2; GenBank acc. no. AF377960) was assembled using BLAST2 [11]. Alignment of the assembled cDNA sequence against the genomic DNA sequence indicated that CORTBP2 contains 23 exons encompassing 170 kb. An intergenic distance of 45 kb between the 3′ ends of CFTR and CORTBP2 was also established (Fig. 1). A putative open reading frame (nt 93–5084) of 1663 amino acids was translated from the assembled cDNA sequence. An in-frame stop codon was identified 9 bp upstream of the translation start site (nt 93–95). A polyadenylation signal (AATAAA) was found at nt 5948–5953.
In addition, our gene sequence was found to overlap with an anonymous cDNA clone, KIAA1758 [12]. KIAA1758, however, does not contain the 5′ end of the gene (it spans nt 95–5967 of CORTBP2). A BLASTP search with the deduced protein sequence against the GenBank protein database (nr) showed no significant similarity to any other human protein.
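The coordinates quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original analysis: the function names are hypothetical, coordinates are taken as 1-based and inclusive on the 5975-bp cDNA, and "9 bp upstream" is assumed to refer to the first base of the upstream stop codon.

```python
from typing import Tuple

# Coordinates reported in the text (1-based, inclusive, on the assembled cDNA).
CDNA_LENGTH = 5975
ORF_START, ORF_END = 93, 5084
POLYA_START, POLYA_END = 5948, 5953


def orf_codons(start: int, end: int) -> int:
    """Number of codons in an ORF, including the terminal stop codon."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    return length // 3


def codon_of(nt: int, orf_start: int = ORF_START) -> Tuple[int, int]:
    """Map a cDNA coordinate to (codon number, position within codon 1-3)."""
    offset = nt - orf_start
    if offset < 0:
        raise ValueError("coordinate lies upstream of the ORF")
    return offset // 3 + 1, offset % 3 + 1


codons = orf_codons(ORF_START, ORF_END)   # coding codons plus one stop codon
residues = codons - 1                     # amino acids in the predicted protein

# Assumption: the upstream stop codon starts 9 bp before the start codon.
upstream_stop = (ORF_START - 9, ORF_START - 7)
in_frame = (ORF_START - upstream_stop[0]) % 3 == 0

# Bases between the end of the AATAAA signal and the end of the cDNA.
bases_after_polya = CDNA_LENGTH - POLYA_END

print(codons, residues, upstream_stop, in_frame, bases_after_polya)
print(codon_of(3729))  # exon 15 variant position reported later in the paper
```

Under these assumptions the ORF spans 1664 codons (1663 residues plus the stop codon), and nt 3729 falls on the first base of codon 1213, which agrees with the L1213V change described for the exon 15 variant.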
GENOMICS Vol. 78, Numbers 1-2, November 2001 Copyright © 2001 by Academic Press. All rights of reproduction in any form reserved. 0888-7543/01 $35.00
FIG. 1. The gene CORTBP2 at 7q31.3. (A) The location and orientation of transcription of CORTBP2 and the other known genes in the region. Translocation breakpoints in two autistic individuals from our work (HSC1, HSC2) and in one individual with speech and language disorder (BRD) from another study [8] are indicated by vertical black arrows. The inversion breakpoint from another autistic individual (inv(7)(p12.2q31.3)) is located between CFTR and D7S643 [7], a region that encompasses HSC2 and BRD. (B) The gene structure of CORTBP2. (C) Three partial models of CORTBP2 (Genscan 1, 2, and 3). Oligonucleotide primers (horizontal arrows) were designed based on the predicted exons of the gene models and used for RT-PCR experiments: (P1) 5′-GGACAGCAGCGGGTTAAGT-3′, (P2) 5′-CGTTTCTCAGCGGAGAGTTC-3′, (P3) 5′-AGACGTAATGGCCAAACTGG-3′, (P4) 5′-CAGGCACCCACCTGAGAC-3′, (P5) 5′-GCCACCAAAACCATCCATAG-3′, (P6) 5′-CGATGGTTCCCATCAGAAAG-3′, (P7) 5′-GGAAATGGACTGTCCGAATG-3′, (P8) 5′-TGGAGCACATGCTCTGAAGT-3′, (P9) 5′-TGTGCTCCAGCAAGTCTGAG-3′, (P10) 5′-CTGGGAACACTGGGTGACTT-3′, (P11) 5′-GGAGGTCAGTCCTCTCAGCA-3′, (P12) 5′-TGCCCCATTCTAGTCATTCC-3′.
However, an 85% sequence similarity between amino acids 1–637 of CORTBP2 and a rat protein, cortactin-binding protein (CBP90; GenBank acc. no. AAC35911) [13], was found (providing the basis for nomenclature assignment). Based on our comparison (Fig. 2), it is likely that CBP90 is the orthologue of CORTBP2, but that CBP90 is still incomplete because its coding sequence does not contain a stop codon. Moreover, an incomplete mouse gene (GenBank acc. no. AF229844) also shared 85% sequence identity with amino acids 1180 to 1663 of CORTBP2. Sequence analysis by Pfam [14] and SMART [15] revealed that CORTBP2 contains six 33-amino-acid ankyrin repeats in the middle of its sequence (Fig. 2). These repeats are likely to
function as protein–protein interaction domains [16,17]. Furthermore, several proline-rich regions were observed amino terminal to the ankyrin repeats and they are likely to interact with the SH3 domain of cortactin, as has been demonstrated for rat CBP90 [13]. No other unique motifs or domains were identified. Also, no transmembrane domains were detected using the hydropathy analysis program HMMTOP [18]. To survey the expression pattern of CORTBP2 in human tissues, a multiple-tissue mRNA blot (MTN) and a brain-specific MTN (Clontech, CA) were used in northern blot analyses. The 450-bp probe, covering the 3′ UTR of CORTBP2, gave a strong signal of 6.0 kb in brain, kidney, and pancreas, while showing lower expression levels in lung, heart, liver, skeletal
FIG. 2. Protein sequence alignment of human CORTBP2 and the partial rat gene product CBP90. The six 33-amino-acid ankyrin repeats are shown in boxes (amino acids 709–739, 743–772, 776–805, 809–838, 842–876, 912–942) and proline-rich regions are bolded to indicate sequence features of CORTBP2.
muscle, and placenta (Fig. 3). A brain-specific MTN indicated that there was expression in all subsections examined (Fig. 3). Besides having the ability to bind cortactin, the function of CORTBP2 is currently unknown. Because cortactin is an actin-binding protein that has been suggested to mediate aspects of cell signaling associated with the cortical cytoskeleton [19], it is possible that CORTBP2 is involved in the same cellular process. A different cortactin-binding protein, called CORTBP1, was previously identified from rat in a yeast two-hybrid screen [20]. Although CORTBP1 does not share significant homology with the CORTBP2 described here, they both contain proline-rich motifs, which have been shown to interact with the SH3 domain of cortactin [13,20]. In addition,
CORTBP1 is highly expressed in brain. It has been shown that both cortactin and CORTBP1 localize in the growth cones in differentiating hippocampal neurons [20], suggesting that the cortactin–CORTBP1 interaction might be involved in signaling pathways associated with the formation and migration of growth cones during neurite outgrowth in rat. Moreover, Ohoka and Takai [13] showed that the subcellular distribution and developmental expression patterns were similar for both CBP90 and cortactin in the rat brain. Taking all the mapping and functional data together, we hypothesized that CORTBP2 may have a role in the etiology of autism. CORTBP2 was screened for mutations in genomic DNA from 90 unrelated individuals with autism from multiplex
FIG. 3. Expression pattern analyses using human multiple tissue and brain-specific northern blots. Northern blot hybridization was completed according to the manufacturer’s recommendations. The same 450-bp probe from the 3′ UTR of CORTBP2 (nt 5191–5641) was used in both hybridizations. A 2.4-kb band was observed in kidney, but subsequent RT-PCR analysis indicated it may not be part of the CORTBP2 transcription unit.
TABLE 1: Oligonucleotide primers used in mutation analyses

Exon     Forward primer sequence      Reverse primer sequence     Product size^a (bp)
1        GGACAGCAGCGGGTTAAGT          GCCCGCGTCTACACTAGC          220
2        TTGCATACCCTTAGGATGTGTG       AAGGTTTGCCAATGGCTCTT        253
3        CCATTTGGAAGTATAATCACAAAACC   AGGTGATCTCCTCATTGGACA       379
4a       AATCAGGGTTTTCCTGTGAA         GAGAGGCTTGGTTTGCTGTC        455
4b       TGAAAAGGGGAAGTGACAGC         TCTGAGGAGCTATGCCTGGT        450
4c       CGCTCAAACACCAGGCATA          GTGGTGGAGAAGGAGTTTGG        452
4d       AGGGCTCTCCCAAACTCCT          CAAGGCAAGTGAGAAGAGAGG       481
5        CTGTCTCTTGTCCTTGCTCCT        TCCATCCTTTTGCAAGTGTG        301
6        TGGGGAATTTACATGGCTTT         CTCCTTTGATGGTGGGAAAA        177
7        GGTGCTTCAAAGGTCAAACC         TGCAAATTATATATACAGGGGGAAA   251
8        CTTTCCCCCTGGTCATTTTA         ATGATGAGGCCAGCTGCTAC        312
9        TTCCTTTGGGTTTTGCATTC         CAGCAGCATAGGGGAAAGAT        233
10       TCCAGTGTAAGAACGTTCAATGTAA    CTCCGTCATAGGACTGCTGA        457
11       CCACTCATAAGCTTCCCTTGA        CAGCCATGAAGCAAACACTC        196
12       GGAACACCCATTTTTCTTCG         CAGGGATGGAAAACACAGGT        161
13 & 14  AGCATCTTCACCCTAGCTCTCT       ATCACAGGCAATTTGCTGAA        338
15^b     TGTCAAAAGCTTGAATCACTGAA      CTTCTCAGCATGGTCCCTTT        225
16       TCTTTCTGTTTCCCAGAGTACCA      ACAAAAACCCCTGCTCTCCT        332
17       TTTGGGCTTCTAAAAACAAAATG      TCAGCATTCTCACCCTCAAA        322
18       TGCTCCTAAGGGATGAATCG         TTCACAAGGGTGTGAATTTCC       289
19       TGCATTATTTTGAACAGTCCGTA      TCACGTATGTCCACCTCCTG        335
20       CTGCAAAGTCCACAGTCTGTTT       CATTCAAATGAGAACATGGAACA     248
21       TGGGAATTGTTCTCACTTTGAA       CCAAAAGTCTCCCAGCTGTT        240
22       GCTGTGCCTTGAGTATGCAC         TCCACAGCAATCATCCTTTG        215
23       CAGGGGTAGGGCTCATAACA         TGTCCTTGGTTTCTGTGTGAA       368

^a All PCR was completed using the following conditions: initial denaturation at 95°C for 30 s, followed by 35 cycles of denaturation at 95°C for 30 s, annealing at 58°C for 30 s, and extension at 72°C for 30 s.
^b Primers used in the PCR amplification of exon 15 (before analysis using the Pyrosequencing machine) were biotinylated, and sequencing primer 5′-CGGAGTTATTGAGGGACT-3′ was used in the reaction. Conditions used in PCR and genotyping are described in Technical note #101 and #102 by Pyrosequencing AB and can be found at http://www.pyrosequencing.com.
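Basic properties of the primers in Table 1 can be tabulated with a few lines of code. The sketch below is illustrative only: the helper names are hypothetical, and the Wallace rule (2°C per A/T, 4°C per G/C) is a rough rule of thumb for short oligonucleotides, not the method the authors used to design these primers or to choose the 58°C annealing temperature.

```python
# Two primer pairs copied from Table 1 (exon label -> (forward, reverse)).
PRIMERS = {
    "1": ("GGACAGCAGCGGGTTAAGT", "GCCCGCGTCTACACTAGC"),
    "15": ("TGTCAAAAGCTTGAATCACTGAA", "CTTCTCAGCATGGTCCCTTT"),
}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Approximate melting temperature in degrees C by the Wallace rule."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for exon, pair in PRIMERS.items():
    for label, primer in zip(("forward", "reverse"), pair):
        print(
            f"exon {exon} {label}: {len(primer)} nt, "
            f"GC {gc_fraction(primer):.2f}, Tm ~{wallace_tm(primer)} C"
        )
```

The same helpers can be applied to the remaining rows of the table; `reverse_complement` is useful when locating a reverse primer on the plus strand of a reference sequence.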
families (primer sequences are shown in Table 1). Ascertainment and diagnostic criteria for the families were those described by the Autism Genetic Resource Exchange (http://www.agre.org). The WAVE DNA Fragment Analysis System, which uses denaturing high performance liquid chromatography (DHPLC), was used to detect changes in heteroduplexes of the 23 exons from 90 autism patients using standard protocols [21]. When variants were found they were sequenced using the Thermo Sequenase Radiolabeled Terminator Cycle Sequencing kit (USB Corporation, OH). No patient-specific variants were found. However, two polymorphisms were observed, a T to G substitution in exon 15 (nt 3729 T>G, L1213V) and a G to T substitution in intron 2, located 29 bp downstream from the exon 2 donor-splice site. Using the PSQ 96 Pyrosequencing Instrument (Pyrosequencing AB, Uppsala, Sweden), the allele frequencies
for the exon 15 variant were determined in the probands with autism and a population of unrelated, unaffected Caucasian individuals (n = 96). No allelic or genotype association with autism was observed (allele frequency, χ² = 0.008, 1 df, P = 0.928; genotype frequency, χ² = 0.033, 1 df, P = 0.855). We conclude that the newly identified human gene CORTBP2 is unlikely to be a susceptibility factor for autism.

ACKNOWLEDGMENTS
We gratefully acknowledge the resources provided by the Autism Genetic Resource Exchange (AGRE) consortium. This research was supported by a grant from the Cure Autism Now Foundation to S.W.S. E.P. is a scholar of the Fonds zur Foerderung der wissenschaftlichen Forschung. J.B.V. is a National Alliance for Research into Schizophrenia and Depression (NARSAD) young investigator. S.W.S. is a Scholar of the Canadian Institutes of Health Research (CIHR).

RECEIVED FOR PUBLICATION JUNE 5; ACCEPTED SEPTEMBER 18, 2001.
REFERENCES
1. International Molecular Genetic Study of Autism Consortium. (1998). A full genome screen for autism with evidence for linkage to a region on chromosome 7q. Hum. Mol. Genet. 7: 571–578.
2. Ashley-Koch, A., et al. (1999). Genetic studies of autistic disorder and chromosome 7. Genomics 61: 227–236, doi:10.1006/geno.1999.5968.
3. Barrett, S., et al. (1999). An autosomal genomic screen for autism: collaborative Linkage Study of Autism. Am. J. Med. Genet. 88: 609–615, doi:10.1002/ajmg.1999.1215.
4. International Molecular Genetic Study of Autism Consortium (IMGSAC). (2001). Further characterization of the autism susceptibility locus AUTS1 on chromosome 7q. Hum. Mol. Genet. 10: 973–982.
5. Fisher, S. E., Vargha-Khadem, F., Watkins, K. E., Monaco, A. P., and Pembrey, M. E. (1998). Localisation of a gene implicated in a severe speech and language disorder. Nat. Genet. 18: 168–170.
6. Vincent, J. B., et al. (2000). Identification of a novel gene on chromosome 7q31 that is interrupted by a translocation breakpoint in an autistic individual. Am. J. Hum. Genet. 67: 510–514.
7. Warburton, P., et al. (2000). Support for linkage of autism and specific language impairment to 7q3 from two chromosome rearrangements involving band 7q31. Am. J. Med. Genet. 96: 228–234, doi:10.1002/ajmg.2000.0403.
8. Lai, C. S. L., et al. (2000). The SPCH1 region on human 7q31: genomic characterization of the critical interval and localization of translocations associated with speech and language disorder. Am. J. Hum. Genet. 67: 357–368.
9. Burge, C., and Karlin, S. (1997). Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268: 78–94, doi:10.1006/jmbi.1997.0951.
10. Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990). Basic local alignment search tool. J. Mol. Biol. 215: 403–410, doi:10.1006/jmbi.1990.9999.
11. Tatusova, T. A., and Madden, T. L. (1999). Blast 2 sequences—a new tool for comparing protein and nucleotide sequences. FEMS Microbiol. Lett. 174: 247–250.
12. Nagase, T., et al. (2000). Prediction of the coding sequences of unidentified human genes. XIX. The complete sequences of 100 new cDNA clones from brain which code for large proteins in vitro. DNA Res. 7: 347–355.
13. Ohoka, Y., and Takai, Y. (1998). Isolation and characterization of cortactin isoforms and a novel cortactin-binding protein, CBP90. Genes Cells 3: 603–612.
14. Bateman, A., et al. (2000). The Pfam protein families database. Nucleic Acids Res. 28: 263–266.
15. Schultz, J., Milpetz, F., Bork, P., and Ponting, C. P. (1998). SMART, a simple modular architecture research tool: identification of signaling domains. Proc. Natl. Acad. Sci. USA 95: 5857–5864.
16. Bork, P. (1993). Hundreds of ankyrin-like repeats in functionally diverse proteins: mobile modules that cross phyla horizontally? Proteins 17: 363–374.
17. Gorina, S., and Pavletich, N. P. (1996). Structure of the p53 tumor suppressor bound to the ankyrin and SH3 domains of 53BP2. Science 274: 1001–1005.
18. Tusnády, G. E., and Simon, I. (1998). Principles governing amino acid composition of integral membrane proteins: applications to topology prediction. J. Mol. Biol. 283: 489–506, doi:10.1006/jmbi.1998.2107.
19. Wu, H., and Parsons, J. T. (1993). Cortactin, an 80/85-kilodalton pp60src substrate, is a filamentous actin-binding protein enriched in the cell cortex. J. Cell Biol. 120: 1417–1426.
20. Du, Y., Weed, S. A., Xiong, W., Marshall, T. D., and Parsons, J. T. (1998). Identification of a novel cortactin SH3 domain-binding protein and its localization to growth cones of cultured neurons. Mol. Cell. Biol. 18: 5838–5851.
21. Kuklin, A., Munson, K., Gjerde, D., Haefele, R., and Taylor, P. (1997/8). Detection of single-nucleotide polymorphisms with the WAVE(TM) DNA fragment analysis system. Genet. Test. 3: 201–206.