The Triosephosphate Isomerase Gene from Maize: Introns Antedate the Plant-Animal Divergence


Cell, Vol. 46, 133-141, July 4, 1986, Copyright © 1986 by Cell Press



Mark Marchionni and Walter Gilbert
Department of Cellular and Developmental Biology, Harvard University Biological Laboratories, 16 Divinity Avenue, Cambridge, Massachusetts 02138

Summary
We have cloned and characterized a cDNA and genomic DNA for the triosephosphate isomerase expressed in maize roots. The gene is interrupted by eight introns. If we compare this gene with that for the protein in chicken, which has six introns, we see that five of the introns are at identical places, one has shifted by three codons, and two are totally new. This close matching leads us to conclude that the introns were in place before the plant-animal divergence, and that the parental gene had at least eight introns, two of which were lost in the line that leads to animals.

Introduction
Genes of higher eukaryotes are discontinuous: regions of noncoding DNA (introns) separate the coding sequence into discrete segments (exons). Gilbert (1978) proposed that introns are vestigial DNA sequences, remnants of a recombination process that accelerated molecular evolution by assembling new linkage groups from the exons, creating novel gene products. That hypothesis of exon shuffling predicts that positions of introns should portray the evolutionary history of a gene; the exons themselves would encode distinct functional elements (Gilbert, 1978, 1979), stably folding peptides (Blake, 1978), or compact modules (Go, 1981, 1983). Evidence in support of these ideas derives from molecular studies on genes and proteins as diverse as immunoglobulins (reviewed by Tonegawa, 1983, and Honjo, 1983), collagen (Yamada et al., 1980), serum albumin-α-fetoprotein (Sargent et al., 1981; Eiferman et al., 1981), ovomucoid (Stein et al., 1980), globins (Go, 1981), intermediate filaments (Marchuk et al., 1984; Balcarek and Cowan, 1985), lysozyme (Jung et al., 1980), and crystallins (Moormann et al., 1983). Moreover, recent studies (Südhof et al., 1985a, 1985b) comparing the gene structure of low density lipoprotein receptor with those of epidermal growth factor and blood coagulation factors 9 and 10 provide a dramatic example of exon shuffling in recent evolutionary history.

Although there are a few exceptions (Kaine et al., 1983; Chu et al., 1984; Nellen et al., 1981; Miller, 1984), the conspicuous absence of introns in bacteria and yeast poses questions as to their age and origins. Were introns present as the first genes formed? Or have they more recently taken up residence in genomic DNA by invading continuous gene sequences? Doolittle (1978) suggested that the genomes of primitive cells contained introns but that rapidly dividing organisms lost them by streamlining in response to the selection pressure of rapid reproduction.

The highly conserved, ubiquitous glycolytic enzymes most likely evolved completely before the archaebacteria-prokaryotic-eukaryotic division and thus represent extremely ancient genes. In vertebrates, introns punctuate nonrandomly the sequences encoding chicken pyruvate kinase (Lonberg and Gilbert, 1985), chicken glyceraldehyde phosphate dehydrogenase (Stone et al., 1985), chicken triosephosphate isomerase (Straus and Gilbert, 1985b), as well as human phosphoglycerate kinase (Michelson et al., 1985), suggesting that the introns were not inserted into preexisting genes. To push our view of one of these glycolytic enzyme genes back in time, we have characterized a gene encoding triosephosphate isomerase (EC 5.3.1.1, TIM) in maize in order to compare its structure to the chicken gene. A remarkable conservation of intron positions between these distant relatives indicates that the introns were in place before the plant-animal divergence, more than one billion years ago.

Results

Cloning and Structure of Maize TIM cDNA
During glycolysis, triosephosphate isomerase (TIM) catalyzes the interconversion of dihydroxyacetone phosphate and glyceraldehyde 3-phosphate. In addition to a cytosolic TIM, plant cells active in photosynthesis express a plastid enzyme encoded in the nucleus (Pichersky and Gottlieb, 1982). That isozyme functions in the dark reactions that fix carbon dioxide into sugars. As is true for other photosynthetic proteins, the chloroplast-specific TIM is light inducible (e.g., Nelson et al., 1984; Berry-Lowe et al., 1982). Hence, mRNA from roots grown in the dark should be enriched in the cytosolic form, which ought to be cognate to the chicken enzyme. Because TIM is less than 60% diverged across the most distant species (Pichersky et al., 1984; Straus and Gilbert, 1985b), cross-hybridization from chicken to maize root cDNA was a plausible approach to cloning the maize gene.

Using a chicken probe (Straus et al., 1985a), we screened 80,000 phage cDNA recombinants that represented mRNA extracted from maize root seedlings grown in the dark for 7 days. We chose hybridization and washing conditions of medium stringency (see Experimental Procedures) and detected 18 potential TIM clones. After partially mapping all of the inserts, we focused on two large (>800 bp) overlapping clones, subcloned each of these into the EcoRI site of pUC8 (Vieira and Messing, 1982), and sequenced (Maxam and Gilbert, 1980) both strands of the DNA. Figure 1 displays a partial restriction map and the complete nucleotide sequence of a full-length maize root TIM cDNA clone, designated pMRT1. Within the 123 bases of 5’ untranslated sequence is a 37 base pyrimidine tract, which is flanked by two overlapping, imperfect, direct repeats of 20 bases each. A similar alternating pyrimidine


Figure 1. Structure of a Maize Triosephosphate Isomerase cDNA Expressed in Root

We subcloned into plasmid pUC8 the insert of a λgt10 phage, which we had isolated from a root cDNA library using a chicken TIM probe. We determined the nucleotide sequence of the full-length clone by the chemical cleavage method (Maxam and Gilbert, 1980). Boxed in the 5’ untranslated region is a 37 base pyrimidine stretch, and the polyadenylation signal is underlined 12 bases before the site of poly(A) addition. Indicated in the upper panel are the restriction sites we used for sequencing: R=EcoRI, B=BamHI, D=DdeI, Hf=HinfI, and N=NcoI.

tract found upstream of the alcohol dehydrogenase 1 gene has been implicated in maize roots as important for the induction of transcription in response to anaerobic stress (Hake et al., 1985; R. Ferl, personal communication). A polypeptide chain of 253 residues begins at nucleotide 124 and is followed by a 3’ untranslated stretch of 159 bases. Located 15 bases upstream of the site of polyadenylation is AATAA, which resembles the consensus sequence (Fitzgerald and Shenk, 1981) but lacks the final A.

TIM Is Encoded by a Small Multigene Family in Maize
In chicken, TIM is encoded by a unique gene (Straus and Gilbert, 1985b). Human DNA, however, has two processed pseudogenes (Brown et al., 1985) in addition to a single-copy functional TIM locus. Some wild flowers of the genus Clarkia have two unlinked TIM loci (Pichersky and Gottlieb, 1983). To enumerate the TIM genes in maize, we analyzed Southern blots of maize DNA with a specific nick-translated fragment (bases 564-661 in pMRT1; see Figure 1), which we chose with the following properties in mind. First, such a short probe, 78 bp, is unlikely to harbor sites for restriction enzymes that have 6 bp recognition sequences. Second, this segment of TIM cDNA sequence surrounds the active site residue Glu-165 and is the most highly conserved (83%) in nucleotide sequence between maize and chicken. Finally, this sequence corresponds to a specific exon (number 6, see Figure 5). Thus each fragment detected should represent a distinct maize TIM gene. Since maize DNA is highly methylated, we used enzymes that lacked CXG in their recognition sequences. Figure 2 shows the hybridization of NcoI-digested maize DNA; nine fragments appear. The band at 1.35 kb corresponds to the 1374 bp NcoI fragment in the cloned gene described below. The pattern suggests that maize contains a total of at least nine genes and pseudogenes encoding TIM.

Figure 2. Gene Counting of Triosephosphate Isomerase in Maize

We digested to completion 10 μg of maize nuclear DNA using NcoI, electrophoresed the fragments in a 1% agarose gel, and blotted (Southern, 1975) and immobilized them onto an uncharged nylon membrane. We probed the filter with a nick-translated fragment comprising sequences of maize TIM exon 6, as described in the text. Autoradiography was for 3 days at -80°C (with an intensifying screen).

Cloning and Structure of an Active Maize TIM Gene
We have characterized a functional structural gene corresponding to our cDNA clone using overlapping clones derived from two different genomic libraries. Initially, we labeled the insert from pMRT1 (Figure 1) and then probed an incomplete maize genomic library (10^6 EcoRI partial maize DNA fragments in Charon 4A), which was the kind gift of J. Sorenson (Upjohn Co., Kalamazoo, MI). We detected and recovered 28 hybridizing phage, then mapped and sorted them into groups according to their hybridization to discrete cDNA subprobes. Though no individual phage contained sequences complementary to the entire cDNA, we chose one, MT11, that reacted with a probe from the 3’ untranslated region and several probes containing coding sequences near the C terminus. As summarized in Figure 3, we subcloned overlapping NcoI and BamHI fragments of MT11 into plasmid vectors pKK233-2 (J. Brosius, unpublished) and pUC8 (Vieira and Messing, 1982) and determined their nucleotide sequences. Maize TIM sequences in MT11 adjoin the left arm of Charon 4A and encode the C-terminal 100 amino acid residues. Introns divide these coding sequences into four exons. After mapping the remaining TIM clones of this collection, we concluded that none overlapped with MT11. From that group, however, we sequenced another phage that bore a portion of a second maize TIM gene differing by a few amino acid substitutions (17 replacements out of 102 residues compared) and by at least 50% in the intron sequences. However, the five intron positions in this partially characterized gene are identical to those of MT11. To characterize the remaining portion of the expressed

maize TIM gene, we screened a second genomic library (gift of J. Shen, Harvard University) constructed of MboII partials. To find our way through the forest of multiple genes, we identified specific restriction fragments and synthetic oligonucleotides (Table 1) that detected unique sequences on Southern blots of maize genomic DNA. In particular, both an N-terminal HaeIII-PvuII fragment (bases 128-287 in Figure 1) and a synthetic 20-mer (number 2) hybridized under stringent conditions solely to a 1.4 kb HinfI fragment. Therefore, we screened 4 × 10^6 recombinants with that specific nick-translated fragment (bases 128-287 in Figure 1) and detected 20 positive phage. Subsequently, we diagnosed them with a set of synthetic oligomers (Table 1) derived from sequences of 5’ untranslated DNA, the coding segments of the N-terminal 150 residues, and an intron of MT11. One phage, MT46, hybridized with each probe tested. We concluded that MT46 overlapped MT11.

We analyzed MT46 using a combination of genomic sequencing (Church and Gilbert, 1984) and direct chemical sequencing of the recombinant phage to determine all of the remaining exons and 90% of the remaining intron sequence (see Figure 3). Most often we exploited genomic sequencing as adapted for use with synthetic oligonucleotide probes (Tizard and Nick, unpublished data) to map seven of the remaining nine intron/exon boundaries. In Figure 4, we display an example of our use of this technique. When we analyzed a complete HaeIII digest of recombinant phage MT46, we could discern more than 50 fragments in the ethidium bromide stained gel. We chemically sequenced the mixture of fragments, then electrophoresed, transferred, and cross-linked the DNA to a solid support, as described in Experimental Procedures. We visualized the sequences of individual HaeIII fragments by probing with oligonucleotides that border the enzyme cleavage site. After we determined the sequence using the first probe (Figure 4, left panel), the membrane was stripped and reprobed with another oligonucleotide (center panel). The complete removal of the first probe is documented well in this second autoradiograph by the absence of the unreacted band seen in the initial sequence ladder. Furthermore, we used that technique to confirm the overlap of those two genomic clones by sequencing from their shared NdeI site. In three cases (restriction sites for StuI, NcoI, and XbaI), however, we resorted to conventional chemical sequencing.

We designate this active gene as maize triosephosphate isomerase 1; Figure 5 summarizes its nucleotide sequence. The 3.8 kb gene is divided into nine exons by eight introns. All of the splice junctions conform to the GT/AG rule (Mount, 1982). Prominent among the features of this gene is the nonrandom interruption of coding sequence by introns. Whereas these introns range broadly in length from 92 to 630 bp, the exons fall into three discrete size classes. The first and last exons are relatively short, encoding only 13 and 15 residues, respectively. Exons 3 and 5 are the longest at 42 and 44 amino acids. Tightly clustered around a length of 26-32 residues is the preponderant class composed of exons 2, 4, 6, 7, and 8. Models that postulate the insertion of introns into previously uninterrupted genes (Orgel and Crick, 1980; Cavalier-Smith, 1980) are not supported by these data. Rather, our findings on the structure of maize TIM 1 suggest that exon shuffling played an important role in the assembly of ancient genes.

Figure 3. Cloning and Sequencing of the Structural Gene for Triosephosphate Isomerase of Maize

Using the insert of cDNA clone pMRT1 as a probe, we isolated the genomic clone MT11. The 5 kb scale applies solely to that phage clone, which contained sequences composing exons 6-9 and the 3’ untranslated region on two overlapping restriction fragments: a 1.2 kb NcoI piece and a 300 bp BamHI fragment. We subcloned those fragments into plasmids, then sequenced both strands by conventional chemical cleavage methods, as indicated by the arrows. We found an overlapping genomic clone, MT46, and as described in the text, we determined sequences of the 5’ untranslated region, exons 1-5, and most of the introns between them. Numbers above the arrows designate the probes (Table 1) that we used to visualize genomic sequences; elsewhere we sequenced labeled DNA fragments. Restriction sites are abbreviated as follows: Nd=NdeI, N=NcoI, Hf=HinfI, P=PvuII, X=XbaI, H=HaeIII, F=Fnu4H, S=StuI, Bs=BstNI, R=EcoRI, and B=BamHI.

Conservation of Intron/Exon Patterns in TIM
How old are the introns in TIM? Figure 6 shows that all six chicken introns are shared with maize. Five are located at exactly the same positions, while one has shifted over three codons. Moreover, there are two extra introns in maize, located near each of the termini. The conservation of intron positions between plant and animal TIM genes spreads across much of the molecule. This pattern cannot be explained easily by the separate insertion of introns into a continuous, preexisting gene. Rather, the identity of the positions of the five introns leads us to the conclusion that the ancestral gene was broken up at these positions before the time of plant-animal divergence. Thus we interpret intron six as a case of sliding and introns one and eight as lost in the chicken. Measurements of the rate of divergence of 5S ribosomal RNA genes (Ohama et al., 1984; Huysmans et al., 1983) suggest that plants and animals have evolved separately for approximately one billion years. Hence, the common forebear of plants and animals, a unicellular organism living at least one billion years ago, contained a TIM gene with a structure resembling the one present in corn.
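The comparison described above can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure. The codon numbers are taken from the labels of Figure 6 and from positions named in the text (Glu-38, Glu/Asp-152, Lys-237/Pro-238). Recording an intron between codons N and N+1 as N is an assumption made here, as is the chicken intron at codon 38, which is inferred from the statement that five positions are identical. The function name and the three-codon sliding window are hypothetical.

```python
# Intron positions as codon numbers in chicken numbering (assumed encoding:
# an intron between codons N and N+1 is recorded as N).
MAIZE_INTRONS = [13, 38, 78, 107, 152, 183, 210, 237]
CHICKEN_INTRONS = [38, 78, 107, 152, 180, 210]  # position 38 is inferred, see lead-in


def classify_introns(plant, animal, max_slide=3):
    """Sort intron positions into shared, slid (within max_slide codons), and unique."""
    shared = sorted(set(plant) & set(animal))
    plant_rest = [p for p in plant if p not in shared]
    animal_rest = [a for a in animal if a not in shared]
    slid = []
    for a in list(animal_rest):
        for p in plant_rest:
            if abs(a - p) <= max_slide:
                # Treat a nearby pair as one ancestral intron that has moved.
                slid.append((p, a))
                plant_rest.remove(p)
                animal_rest.remove(a)
                break
    return {
        "shared": shared,
        "slid": slid,
        "plant_only": plant_rest,
        "animal_only": animal_rest,
    }


result = classify_introns(MAIZE_INTRONS, CHICKEN_INTRONS)
for key, value in result.items():
    print(key, value)
```

Under these assumptions the sketch reproduces the tally given in the text: five shared positions, one pair displaced by three codons, and two introns found only in maize.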

Figure 4. Genomic Sequencing of Maize TIM Clone MT46

We digested to completion 10 μg of the maize genomic clone MT46, then analyzed 1 μg of the DNA fragments on a 1% agarose gel (ethidium bromide stain). We treated the remaining 9 μg by chemical cleavage reagents, resolved the fragments by electrophoresis, and transferred them to uncharged nylon membranes, as described in Experimental Procedures. Initially, using probe 2 (see Table 1), we visualized (left panel) sequences of the lower strand of exon 1, extending for 115 bases up the autoradiogram to a HaeIII site in intron 1. We stripped probe 2 off the filter and restained with probe 7 (center panel), this time determining 155 bases of upper strand sequence within exon 4 and crossing into intron 3. Samples are loaded in the following order: G, A + G (missing in this gel), A > C, C + T, C, and T.


Figure 6. Phylogenetic Comparison of Intron Positions in Triosephosphate Isomerase

[Figure: linear intron/exon map aligned to the chicken amino acid sequence. Labeled intron positions include 13/14 (Cys, Asn), 38 (Glu), 78 (Ser), 107/108 (Glu, Phe in maize; Glu, Leu in chicken), 152 (Glu in maize; Asp in chicken), 183/184 (Glu, Val; maize), 180/181 (Gln, Ala; chicken), 210 (Gly), and 237/238 (Lys, Pro).]

We have aligned the positions of the introns in maize (Figure 5) and chicken (Straus and Gilbert, 1985b) TIM genes onto the amino acid sequence of the chicken enzyme. We illustrate in this figure examples from two different genes in maize and from the single gene in chicken, above and below the linear intron/exon map, respectively. β-Strands (arrows a-h) and α-helices (cylinders A-H) are represented in the colors appearing in the schematic diagram above. There we have divided the TIM barrel according to the maize gene structure. The figure is redrawn from Jane Richardson.

Correlations of Gene and Protein Structure in TIM
TIM is a homodimer, with subunits ranging, in different species, from 248 to 253 amino acids in length. Three-dimensional structures of the yeast and chicken enzymes exhibit an indistinguishable conformation in their active sites and display virtual identity elsewhere in the protein (Alber et al., 1981a, 1981b). Each subunit folds into a pseudosymmetrical barrel (Banner et al., 1975), which is composed of an alternating α-helix/β-strand motif that is repeated 8 times. A core of parallel β-strands is surrounded by a concentric sheath of α-helices. Figure 6 shows that seven of nine introns fall between, and not within, the helices and strands constituting the TIM barrel. The terminal exons of chicken are divided by introns in maize. Intron 1 (Figure 6) divides β-strand a from helix A (three turns), coded by exon 2. The 8th intron in maize (between Lys-237 and Pro-238) occurs at a bend in the last helix (H).

Does our knowledge of the evolutionary history and anatomy of TIM permit an understanding of how that ancient gene might have been assembled? Branden (unpublished data) has proposed that proteins with an α/β structure were constructed by fusing exons bearing α/β motifs. His model accounts for a high correlation of intron positions in loops located on the catalytic face (at the carboxyl end of β structures) of those enzymes. Six introns (1, 3, 4, 6, 7, and 8) in TIM obey those rules, but two do not: the intron located at the beginning of β-strand b (Glu-38) and the one near the end of helix E (Glu/Asp-152) are positioned on the amino side of the barrel.

Discussion
A fundamental puzzle in molecular biology today is that the genes of prokaryotes and eukaryotes are patently so different in structure. To explain the presence of introns in eukaryotic DNA (and consequently their absence in prokaryotes and some simpler eukaryotes) requires one of three alternative hypotheses: (1) the original genes were discontinuous, and prokaryotes and yeast lost their introns by streamlining their genomes (Doolittle, 1978); (2) eukaryotes have inserted introns continuously throughout evolution as selfish DNA established residence in their genomes (Orgel and Crick, 1980; Cavalier-Smith, 1980);


Table 1. Synthetic Oligonucleotide Probes Used for Genomic Sequencing

Probe  Sequence               -Mer  Site    TIM Phage Position  Reads
1      CTAGAAGTTCCCCTCTCCCT   20    XbaI    36                  3’ Lower
2      CCGCAAGTTCTTCGTCGGTG   20    HaeIII  127                 3’ Lower
3      CTGTGGTTCCATTGCATTTCC  21    Sau3A   173                 5’ Upper
4      GGGTTTTGACAATCTTCTCG   20    TaqI    181                 3’ Upper
5      CTGCGCCAAGAGTTCCATGT   20    PvuII   288                 3’ Lower
6      CCTGTGGTCAAGAGCCAGCTG  21    HhaI    292                 5’ Lower
7      AGAGTGTCCAAGAATGACCC   20    BstNI   392                 3’ Upper
8      GGAGAGCTCTGCTGGGAGAA   20    HinfI   435                 5’ Lower
9      AGCAACAACATCCATGGTAG   20    AccI    538                 3’ Upper
10     GGGAGGCTGGGTCTACCATG   20    NcoI    541                 5’ Lower
11     GGTAAATCCAAAGCAGGGCAC  21    NdeI    179                 5’ Upper

We designed probes 1-10 from the nucleotide sequence of pMRT1 (Figure 1), and probe 11 from intron sequence in genomic clone MT11. For analyses of immobilized phage plaques, Southern blots, and genomic sequence electrotransfers, we labeled probes at their 5’ ends and hybridized them to the appropriate filters. Commencing from the positions and restriction sites shown, we used these probes to visualize sequences of the complementary strand in the directions indicated.
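As a consistency check on Table 1, the following Python sketch confirms that each probe sequence has the length listed in the -Mer column and reports its G+C content. It also estimates a melting temperature with the Wallace rule of thumb, Tm ≈ 2(A+T) + 4(G+C) °C. The paper does not say that this rule was used to choose hybridization conditions, so the estimate is shown purely for illustration, and all function and variable names are hypothetical.

```python
# Probe number -> (sequence, listed length), transcribed from Table 1.
PROBES = {
    1: ("CTAGAAGTTCCCCTCTCCCT", 20),
    2: ("CCGCAAGTTCTTCGTCGGTG", 20),
    3: ("CTGTGGTTCCATTGCATTTCC", 21),
    4: ("GGGTTTTGACAATCTTCTCG", 20),
    5: ("CTGCGCCAAGAGTTCCATGT", 20),
    6: ("CCTGTGGTCAAGAGCCAGCTG", 21),
    7: ("AGAGTGTCCAAGAATGACCC", 20),
    8: ("GGAGAGCTCTGCTGGGAGAA", 20),
    9: ("AGCAACAACATCCATGGTAG", 20),
    10: ("GGGAGGCTGGGTCTACCATG", 20),
    11: ("GGTAAATCCAAAGCAGGGCAC", 21),
}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the sequence of the complementary strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rule-of-thumb melting temperature in degrees C for a short oligonucleotide."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for number, (seq, listed_length) in PROBES.items():
    status = "ok" if len(seq) == listed_length else "LENGTH MISMATCH"
    print(number, len(seq), f"{gc_fraction(seq):.2f}", wallace_tm(seq), status)
```

Because each probe reveals the complementary strand, `reverse_complement` gives the 5’ to 3’ sequence of the strand that the probe anneals to.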

or (3) introns were added at some time shortly after nucleated cells split from the bacteria, and thus precluded yeast from their invasion.

To test those models we have focused on a gene that was created early in evolution, before the divergence of prokaryotes and eukaryotes. Triosephosphate isomerase is a ubiquitous and highly conserved glycolytic enzyme, which has radiated to all organisms as a perfect catalyst. Because nearly 25% of the residues are invariant across vastly different lineages (see Straus and Gilbert, 1985b), the activity present in all surviving species descended from a common ancestral gene.

The TIM gene of maize has eight introns; chicken has six. Five of these are in identical positions. Parsimony prescribes that it is preferable to assume a small number of losses rather than a large number of independent insertions in identical locations. In the one other published comparison of plant and animal genes (Shah et al., 1983), in actin, the position of one intron is preserved, while all others are different. The pattern in that case could have been explained equally well by intron gain, loss, or sliding (Fornwald et al., 1982; Davidson et al., 1982; Craik et al., 1983). Our results do not support the model that introns have been added continuously during the last billion years of evolution, because we would not have expected such strong identity between the two species. We conclude that the ancestral gene had eight introns, two were lost in the line that leads to chicken, while one slid over three codons. Precedent exists for the excision of an intron from functional genes in preproinsulin (Perler et al., 1980) and myosin heavy chain (Strehler et al., 1985). That the exons in TIM tend to be of regular size and correlate with recognized elements of protein structure is not in accord with either of the second two hypotheses. Thus, all these arguments give strong support to the idea that introns are as old as the genes themselves.
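The parsimony argument can be made concrete with a rough calculation. The sketch below asks how often five or more of six chicken introns would land on the eight maize positions if they had been inserted independently and at random. It is not an analysis from the paper: the uniform-insertion null model, the choice of about 250 candidate sites (one per codon), and the function name are all assumptions made here for illustration.

```python
from math import comb


def prob_at_least_k_matches(n_sites, marked, drawn, k_min):
    """Hypergeometric tail: chance that at least k_min of `drawn` randomly
    placed introns fall on one of `marked` pre-existing positions among
    `n_sites` equally likely sites."""
    total = comb(n_sites, drawn)
    favorable = sum(
        comb(marked, k) * comb(n_sites - marked, drawn - k)
        for k in range(k_min, min(marked, drawn) + 1)
    )
    return favorable / total


# Assumed model: ~250 codon positions, 8 maize introns, 6 chicken introns.
p = prob_at_least_k_matches(n_sites=250, marked=8, drawn=6, k_min=5)
print(f"P(at least 5 coincident positions) = {p:.2e}")
```

Under this deliberately simple model the probability is far below one in a million, which is why a few losses and one slide are a more economical explanation than repeated independent insertion at identical sites. A model in which insertion favors particular sequences would give a larger value.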

Experimental Procedures

Isolation of Clones
We obtained a λgt10 root cDNA library (gift of T. Theugh, Stanford University) constructed from the mRNA of maize seedlings grown in the dark for 7 days. Using a cloned chicken cDNA (Straus and Gilbert, 1985a) as a probe, we screened the library according to the plaque hybridization procedure of Benton and Davis (1977). Following a 1 hr prehybridization, we denatured the nick-translated (Rigby et al., 1977) insert, and hybridized (10^6 cpm/ml; 1 × 10^8 to 2 × 10^8 cpm/μg) to duplicate nitrocellulose filters in 5x SSC, 10x Denhardt’s, 50 mM phosphate buffer (pH 7.2), 1 mM EDTA, 0.1% SDS, 10% dextran sulfate, and 250 μg/ml of denatured salmon testes DNA. After 12 hr at 65°C, we washed (65°C) filters once for 1 hr in 3x SSC, 0.2% SDS and thrice for 30 min in 2x SSC, 0.2% SDS. We exposed XAR-5 film (with intensifying screens) for three days at -80°C and subsequently recovered and plaque-purified the positive phage.

A W64A maize genomic library containing 15-20 kb EcoRI fragments cloned in Charon 4A was kindly provided by J. Sorenson (Upjohn Co., Kalamazoo, MI), and a maize genomic library of MboII partial fragments cloned in EMBL IIIB was the generous gift of J. Shen (Harvard University). We grew phage of both genomic libraries in E. coli host strain LE392 and probed them with homologous nick-translated fragments derived from maize cDNA clones. For those homologous screenings, we did our final washes at 65°C in 0.5x SSC, 0.1% SDS.

To obtain the subclones used in DNA sequence determination, we ligated (T4 ligase; New England Biolabs) overlapping fragments containing maize TIM exons into a complementary site of linearized, dephosphorylated (calf intestinal phosphatase, Boehringer Mannheim) vectors and transformed the ligation products into competent E. coli MC1061 (Casadaban and Cohen, 1980) or JM83 (Vieira and Messing, 1982). Subsequently, we selected clones on the basis of antibiotic resistance and colony hybridization (Grunstein and Hogness, 1975).

Nucleic Acids
We extracted maize nuclear DNA from the shoots of 7 day old seedlings as described (Rivin et al., 1982). We isolated DNA from CsCl-purified recombinant phage according to the method of Thomas and Davis (1974). Plasmid DNA was prepared by alkaline lysis and CsCl-EtBr centrifugation. Restriction enzymes, purchased from New England Biolabs and Boehringer Mannheim, were used as recommended by the suppliers. We mapped the restriction sites used for subcloning and sequence determination by electrophoresis in 6% acrylamide gels or 1% agarose gels, which we blotted (Southern, 1975) onto nitrocellulose


or nylon membranes. We probed those filters with nick-translated restriction fragments or with synthetic oligonucleotide probes, which we labeled with crude [γ-32P]ATP (7000 Ci/mmol; New England Nuclear and ICN), using T4 polynucleotide kinase (Boehringer Mannheim).

DNA Sequence Determination
All sequencing was done by the Maxam-Gilbert chemical cleavage method (1980) and included modifications suggested by Rubin and Schmid (1980) and by Bencini et al. (1984). To determine the DNA sequences of maize genomic clones directly, we took advantage of genomic sequencing (Church and Gilbert, 1984), but used synthetic oligonucleotide probes (Tizard and Nick, unpublished data) to analyze the electroblotted DNA. For each restriction site used in genomic sequencing, we analyzed 10 μg of phage DNA. After confirming that the digests were complete, we precipitated the fragments once with spermidine, then twice with ethanol. We dissolved the fragments in 25 μl of H2O and dispensed 3 μl for each of the six reactions. We electrophoresed 2 ng of the unlabeled sequence fragments in each lane of a 6% polyacrylamide-7 M urea sequencing gel. We cast such gels in molds of 60 x 50 x 0.04 cm as discontinuous gradients (Biggin et al., 1983) of TBE buffer (1 M TBE = 1 M Tris-HCl, 1 M boric acid, and 30 mM EDTA, pH 8.3), according to a recipe developed by R. Tizard (Biogen Research Corp., Cambridge, MA). We poured these gels in four stages (12.5% 300 mM TBE with 5% sucrose, 12.5% 217 mM TBE, 12.5% 133 mM TBE, and 63% 50 mM TBE) and ran them for 4 hr at 90 W. Then, we followed the procedures developed by Church and Gilbert (1984) for electrotransfer, ultraviolet illumination, and hybridization of DNA immobilized on uncharged nylon membranes (Biodyne A, Pall and Zetabind, AMF-Cuno). However, as prescribed by Tizard and Nick (unpublished data), we supplemented the solutions used for hybridization and washing with 0.25 M and 0.125 M NaCl, respectively. We included 20 pmol of 5’-end-labeled 20-mer in a 10 ml hybridization for 4 hr at 45°C. In most cases, following 2 days of film exposure (without an intensifying screen) we could read the sequence beginning 25 bases past the restriction cut to beyond 250 bases. To visualize sequences from other restriction fragments, we eluted the hybridized probe in 0.05 N NaOH and restained with a different probe.

Computer Analysis
Sequence data were assembled using the programs of Staden (1980) and analyzed with programs from the University of Wisconsin Genetics Computer Group on VAX 780 and Microvax 2 computers.

Acknowledgments
Foremost we thank Donald Straus for numerous contributions throughout this investigation. For maize libraries we acknowledge Tanya Theugh, John Sorenson, and Jen Shen; for expertise in genomic DNA sequencing we thank George Church, Richard Tizard, and Harry Nick; for plasmid preparations we acknowledge Neil Malone; for synthesis of oligonucleotides we thank K. L. Ramachandran; and, for assistance in preparation of the manuscript, we thank Nancie Thurston. This work has been supported in part by Biogen Research Corp., Cambridge, Massachusetts. M. M. was also supported by PHS grant 5 F32 CA07048-03 from the National Cancer Institute.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Received March 31, 1986.

References

Alber, T., Banner, D. W., Bloomer, A. C., Petsko, G. A., Phillips, D., Rivers, P. S., and Wilson, I. A. (1981a). On the three-dimensional structure and catalytic mechanism of triosephosphate isomerase. Phil. Trans. Roy. Soc. (Lond.) B 293, 159-171.

Alber, T., Hartman, F. C., Johnson, R. M., Petsko, G. A., and Tsernoglou, D. (1981b). Crystallization of yeast triose phosphate isomerase from polyethylene glycol. Protein crystal formation following phase separation. J. Biol. Chem. 256, 1356-1361.

Balcarek, J. M., and Cowan, N. J. (1985). Structure of the mouse glial fibrillary acidic protein gene: implications for the evolution of the intermediate filament multigene family. Nucl. Acids Res. 13, 5527-5543.

Banner, D. W., Bloomer, A. C., Petsko, G. A., Phillips, D. C., Pogson, C. I., Wilson, I. A., Corran, P. H., Furth, A. J., Milman, J. D., Offord, R. E., Priddle, J. D., and Waley, S. G. (1975). Structure of chicken muscle triose phosphate isomerase determined crystallographically at 2.5 Å resolution using amino acid sequence data. Nature 255, 609-614.

Bencini, D. A., O’Donovan, G. A., and Wild, J. R. (1984). Rapid chemical degradation sequencing. Biotechniques 2, 4-5.

Benton, W. D., and Davis, R. W. (1977). Screening lambda gt recombinant clones by hybridization to single plaques in situ. Science 196, 159-176.

Berry-Lowe, S. L., McKnight, T. D., Shah, D. M., and Meagher, R. B. (1982). The nucleotide sequence, expression, and evolution of one member of a multigene family encoding the small subunit of ribulose-1,5-bisphosphate carboxylase in soybean. J. Mol. Appl. Genet. 1, 483-498.

Biggin, M. D., Gibson, T. J., and Hong, G. F. (1983). Buffer gradient gels and 35S label as an aid to rapid DNA sequence determination. Proc. Natl. Acad. Sci. USA 80, 3963-3965.

Blake, C. C. F. (1978). Do genes-in-pieces imply proteins-in-pieces? Nature 273, 267.

Brown, J. R., Daar, I. O., Krug, J. R., and Maquat, L. E. (1985). Characterization of the functional gene and several processed pseudogenes in the human triosephosphate isomerase gene family. Mol. Cell. Biol. 5, 1694-1706.

Casadaban, M. J., and Cohen, S. N. (1980). Analysis of gene control signals by DNA fusion and cloning in Escherichia coli. J. Mol. Biol. 138, 179-207.

Cavalier-Smith, T. (1980). How selfish is DNA? Nature 285, 617-618.

Chu, F. K., Maley, G. F., Maley, F., and Belfort, M. (1984). Intervening sequence in the thymidylate synthase gene of bacteriophage T4. Proc. Natl. Acad. Sci. USA 81, 3049-3053.

Church, G. M., and Gilbert, W. (1984). Genomic sequencing. Proc. Natl. Acad. Sci. USA 81, 1991-1995.

Crabtree, G. R., Comeau, C. M., Fowlkes, D. M., Fornace, A. J., Jr., Malley, J. D., and Kant, J. A. (1985). Evolution and structure of the fibrinogen genes. Random insertion of introns or selective loss? J. Mol. Biol. 185, 1-19.

Craik, C. S., Rutter, W. J., and Fletterick, R. (1983). Splice junctions: association with variation in protein structure. Science 220, 1125-1129.

Davidson, E. H., Thomas, T. L., Scheller, R. H., and Britten, R. J. (1982). The sea urchin actin genes, and a speculation on the evolutionary significance of small gene families. In Genome Evolution, G. A. Dover and R. B. Flavell, eds. (London: Academic Press), pp. 117-192.

Doolittle, W. F. (1978). Genes in pieces: were they ever together? Nature 272, 581-582.

Eiferman, F. A., Young, P. R., Scott, R. W., and Tilghman, S. M. (1981). Intragenic amplification and divergence in the mouse alpha-fetoprotein gene. Nature 294, 713-718.

Fitzgerald, M., and Shenk, T. (1981). The sequence 5’-AAUAAA-3’ forms part of the recognition site for polyadenylation of late SV40 mRNAs. Cell 24, 251-260.

Fornwald, J. A., Kuncio, G., Peng, I., and Ordahl, C. P. (1982). The complete nucleotide sequence of the chick alpha actin gene and its evolutionary relationship to the actin gene family. Nucl. Acids Res. 10, 3861-3876.

Gilbert, W. (1978). Why genes in pieces? Nature 271, 501.

Gilbert, W. (1979). Introns and exons: playgrounds of evolution. In Eucaryotic Gene Regulation: ICN-UCLA Symposia on Molecular and Cellular Biology, R. Axel, T. Maniatis, and C. F. Fox, eds. (New York: Academic Press), pp. 1-10.

Go, M. (1981). Correlation of DNA exonic regions with protein structural units in haemoglobin. Nature 291, 90-92.

Go, M. (1983). Modular structural units, exons, and function in chicken lysozyme. Proc. Natl. Acad. Sci. USA 80, 1964-1968.

Grunstein, M., and Hogness, D. (1975). Colony hybridization: a method for the isolation of cloned DNAs that contain a specific gene. Proc. Natl. Acad. Sci. USA 72, 3961-3965.


Hake, S., Kelley, P. M., Taylor, W. C., and Freeling, M. (1985). Coordinate induction of alcohol dehydrogenase 1, aldolase, and other anaerobic RNAs in maize. J. Biol. Chem. 260, 5050-5054.

Honjo, T. (1983). Immunoglobulin genes. Ann. Rev. Immunol. 1, 499-528.

Huysmans, E., Dams, E., Vandenberghe, A., and DeWachter, R. (1983). The nucleotide sequences of the 5S rRNAs of four mushrooms and their use in studying the phylogenetic position of basidiomycetes among the eukaryotes. Nucl. Acids Res. 11, 2871-2880.

Jung, A., Sippel, A., Grez, M., and Schutz, G. (1980). Exons encode functional and structural units of chicken lysozyme. Proc. Natl. Acad. Sci. USA 77, 5759-5763.

Kaine, B. P., Gupta, R., and Woese, C. R. (1983). Putative introns in tRNA genes of procaryotes. Proc. Natl. Acad. Sci. USA 80, 3309-3312.

Lonberg, N., and Gilbert, W. (1985). Intron/exon structure of the chicken pyruvate kinase gene. Cell 40, 81-90.

Marchuk, D., McCrohon, S., and Fuchs, E. (1984). Remarkable conservation of structure among intermediate filament genes. Cell 39, 491-498.

Maxam, A. M., and Gilbert, W. (1980). Sequencing end-labeled DNA with base-specific chemical cleavages. Meth. Enzymol. 65, 499-560.

Michelson, A. M., Blake, C. C., Evans, S. T., and Orkin, S. H. (1985). Structure of the human phosphoglycerate kinase gene and the intron-mediated evolution and dispersal of the nucleotide-binding domain. Proc. Natl. Acad. Sci. USA 82, 6965-6969.

Miller, A. M. (1984). The yeast MAT alpha 1 gene contains two introns. EMBO J. 3, 1061-1065.

Moorman, R. J., den Dunnen, J. T., Mulleners, L., Andreoli, P., Bloemendal, H., and Schoenmakers, J. G. (1983). Strict co-linearity of genetic and protein folding domains in an intragenically duplicated rat lens gamma-crystallin gene. J. Mol. Biol. 171, 353-368.

Mount, S. M. (1982). A catalogue of splice junction sequences. Nucl. Acids Res. 10, 461-472.

Nellen, W., Donath, C., Moos, M., and Gallwitz, D. (1981). The nucleotide sequences of the actin genes from Saccharomyces carlsbergensis and Saccharomyces cerevisiae are identical except for their introns. J. Mol. Appl. Genet. 1, 239-244.

Nelson, T., Harpster, M. H., Mayfield, S. P., and Taylor, W. C. (1984). Light-regulated gene expression during maize leaf development. J. Cell Biol. 98, 558-564.

Ohama, T., Kumazaki, T., Hori, H., and Osawa, S. (1984). Evolution of multicellular animals as deduced from 5S rRNA sequences: a possible early emergence of the Mesozoa. Nucl. Acids Res. 12, 5101-5108.

Orgel, L. E., and Crick, F. H. C. (1980). Selfish DNA: the ultimate parasite. Nature 284, 604-607.

Perler, F., Efstratiadis, A., Lomedico, P., Gilbert, W., Kolodner, R., and Dodgson, J. (1980). The evolution of genes: the chicken preproinsulin gene. Cell 20, 555-566.

Pichersky, E., and Gottlieb, L. D. (1983). Evidence for the duplication of the structural genes coding plastid and cytosolic isozymes of triose phosphate isomerase in diploid species of Clarkia. Genetics 105, 421-436.

Pichersky, E., Gottlieb, L. D., and Hess, J. F. (1984). Nucleotide sequence of the triose phosphate isomerase gene of Escherichia coli. Mol. Gen. Genet. 195, 314-320.

Rigby, W. F., Diekmann, M., Rhodes, C., and Berg, P. (1977). Labelling DNA to high specific activity in vitro by nick translation with DNA polymerase I. J. Mol. Biol. 113, 237-251.

Rivin, C. J., Zimmer, E. A., and Walbot, V. (1982). Isolation of DNA and DNA recombinants from maize. In Maize for Biological Research, W. F. Sheridan, ed. (Grand Forks, North Dakota: University Press), pp. 161-164.

Rubin, C. M., and Schmid, C. W. (1980). Pyrimidine-specific chemical reactions useful for DNA sequencing. Nucl. Acids Res. 8, 4616-4619.

Sargent, T. D., Jagodzinski, L. L., Yang, M., and Bonner, J. (1981). Fine structure and evolution of the rat serum albumin gene. Mol. Cell. Biol. 1, 871-883.

Shah, D. M., Hightower, R. C., and Meagher, R. B. (1983). Genes encoding actin in higher plants: intron positions are highly conserved but the coding sequences are not. J. Mol. Appl. Genet. 2, 111-126.

Southern, E. M. (1975). Detection of specific sequences among DNA fragments separated by gel electrophoresis. J. Mol. Biol. 98, 503-517.

Staden, R. (1980). A new computer method for the storage and manipulation of DNA gel reading data. Nucl. Acids Res. 8, 3673-3694.

Stein, J. P., Catterall, J. F., Kristo, P., Means, A. R., and O’Malley, B. W. (1980). Ovomucoid intervening sequences specify functional domains and generate protein polymorphism. Cell 21, 681-687.

Stone, E. M., Rothblum, K. N., and Schwartz, R. J. (1985). Intron-dependent evolution of chicken glyceraldehyde phosphate dehydrogenase gene. Nature 313, 498-500.

Straus, D., and Gilbert, W. (1985a). Chicken triosephosphate isomerase complements an Escherichia coli deficiency. Proc. Natl. Acad. Sci. USA 82, 2014-2018.

Straus, D., and Gilbert, W. (1985b). Genetic engineering in the Precambrian: Structure of the chicken triosephosphate isomerase gene. Mol. Cell. Biol. 5, 3497-3506.

Strehler, E. E., Mahdavi, V., Periasamy, M., and Nadal-Ginard, B. (1985). Intron positions are conserved in the 5’ end region of myosin heavy-chain genes. J. Biol. Chem. 260, 468-471.

Südhof, T. C., Goldstein, J. L., Brown, M. S., and Russell, D. W. (1985a). The LDL receptor gene: a mosaic of exons shared with different proteins. Science 228, 815-822.

Südhof, T. C., Russell, D. W., Goldstein, J. L., Brown, M. S., Sanchez-Pescador, R., and Bell, G. I. (1985b). Cassette of eight exons shared by genes for LDL receptor and EGF precursor. Science 228, 893-895.

Thomas, M., and Davis, R. W. (1975). Studies on the cleavage of bacteriophage lambda DNA with EcoRI restriction endonuclease. J. Mol. Biol. 91, 315-328.

Tonegawa, S. (1983). Somatic generation of antibody diversity. Nature 302, 591-596.

Vieira, J., and Messing, J. (1982). The pUC plasmids, an M13mp7-derived system for insertion mutagenesis and sequencing with synthetic universal primers. Gene 19, 259-268.

Yamada, Y., Avvedimento, V. E., Mudryj, M., Ohkubo, H., Vogeli, G., Irani, M., Pastan, I., and de Crombrugghe, B. (1980). The collagen gene: evidence for its evolutionary assembly by amplification of a DNA segment containing an exon of 54 bp. Cell 22, 887-892.