Gene, 129 (1993) 59-68 © 1993 Elsevier Science Publishers B.V. All rights reserved. 0378-1119/93/$06.00
GENE 07172
Two histone H1-encoding genes of the green alga Volvox carteri with features intermediate between plant and animal genes
(Nucleotide sequence; DNA-binding domains; S1 mapping; conserved promoter elements; 3' palindrome; nonpolyadenylated mRNA; stage-regulated gene expression)
Andreas Lindauer, Kurt Müller and Rüdiger Schmitt
Lehrstuhl für Genetik, Universität Regensburg, D-8400 Regensburg, Germany
Received by H.G. Zachau: 23 December 1992; Revised/Accepted: 8 March/8 March 1993; Received at publishers: 22 March 1993
SUMMARY
Southern hybridization indicated the presence of at least two and possibly four histone H1-encoding genes occurring as singlets in the Volvox carteri genome. Two of these genes, H1-I and H1-II, have been cloned and characterized. Their coding sequences are each interrupted by three introns, but only the position of the second intron is identically conserved in both H1-I and H1-II. The encoded 260-amino-acid (aa) (H1-I) and 240-aa (H1-II) polypeptides possess the typical tripartite organization of animal H1 histones, with variable N- and C-terminal domains flanking a conserved 'globular' DNA-binding domain. Extensive differences in their variable regions suggest that H1-I and H1-II (62% identity) represent two isotypes with different functions. A prominent KAPKAP-KAA motif in the H1-I N-terminal region, similarly seen in single H1 variants of a midge and a nematode, has a putative function in packing condensed subtypes of chromatin. Unlike higher plants, but like animals, the H1 genes of V. carteri possess a typical 3' palindrome for mRNA processing, resulting in non-polyadenylated mRNAs. Transcription initiates 33 nucleotides (nt) (H1-I) and 26 nt (H1-II) downstream of typical TATA boxes. A putative 20-bp conserved enhancer element upstream of each TATA box closely resembles the consensus sequence associated with the nucleosomal histone-encoding genes in V. carteri [Müller et al., Gene 93 (1990) 167-175] and suggests stringent regulation. Accordingly, transcription of H1 was shown to be restricted to late embryogenesis, when new flagella are produced. We discuss the inferred accessory role of histone H1 proteins in stabilizing axonemal microtubules, as has been recently observed in sea urchin flagella [Multigner et al., Nature 360 (1992) 33-39].
INTRODUCTION
The lysine-rich histones H1 are essentially responsible for the folding of the nucleosome chain into higher orders of chromatin structure (Thoma et al., 1979). Among the histones, H1 exhibits the highest degree of heterogeneity, and indications are that a family of H1 isotypes exists that differ in their ability to condense chromatin both in vitro and in vivo (Cole, 1984). Moreover, it has been suggested that the different isotypes may bind specifically to different chromatin structures (Mohr et al., 1989; Schulze et al., 1992). In vitro studies have also revealed the involvement of H1 histone proteins in transcriptional regulation (Grunstein, 1990; Zlatanova, 1991). A new role for histone H1 in stabilizing flagellar microtubules has been recently established in sea urchins and apparently also pertains to ciliates and to the green alga Chlamydomonas (Multigner et al., 1992).

Our knowledge of H1 histones comes almost entirely from animal systems; more recently, some protein and sequence data on higher plant (Gantt and Key, 1987; Gantt and Lenvik, 1991; Razafimahatratra et al., 1991; Yang et al., 1991) and on a ciliate (Wu et al., 1986) H1 species have been reported, but nothing is known about the structure and functioning of H1 genes in eukaryotes of an intermediate phylogenetic position, such as the simple green alga Volvox (Rausch et al., 1989; Larson et al., 1992; Schmitt et al., 1992). This communication reports the isolation and characterization of two H1 genes from the multicellular green alga Volvox carteri. The life cycle of this simple alga exhibits striking similarities to stages of animal embryogenesis, including differentiation into specialized cell types and a gastrulation-like inversion (reviewed in Kirk and Harper, 1986; Kirk, 1988). In previous analyses of the conserved nucleosomal histone-encoding genes of V. carteri (Müller and Schmitt, 1988; Müller et al., 1990), we described the pairwise organization of approx. 15 copies of H2A-H2B and H3-H4 gene doublets, their possible transcriptional regulation by 20-bp enhancer elements, and the processing of histone pre-mRNA at a 3' palindromic sequence yielding non-polyadenylated mRNA as in animal systems. Because in other organisms the H1 histones differ substantially from each other and from the other histones in their DNA sequences, function, and stoichiometry (Osley, 1991), it was not clear whether the H1 histone genes in Volvox would have different controlling elements or follow a transcription pattern similar to that of their nucleosomal counterparts.

We have therefore characterized the first algal H1 genes to (i) analyze their genetic organization, at the levels of both the gene and derived peptide structures, (ii) define potential cis-regulatory control elements of transcription, (iii) investigate the role of 3' palindromes in pre-mRNA processing and (iv) monitor gene-specific transcription during the life cycle of V. carteri.

Correspondence to: Dr. R. Schmitt, Lehrstuhl für Genetik, Universität Regensburg, D-8400 Regensburg, Germany. Tel. (49-941)943-3162; Fax (49-941)943-3163; e-mail: [email protected]

Abbreviations: A., Arabidopsis; aa, amino acid(s); bp, base pair(s); C., Chironomus; DIG, digoxigenin; GCG, Genetics Computer Group (Madison WI, USA); H1, histone H1 or the gene encoding it (assignment of roman numerals to genes and deduced polypeptides refers to H1 isotypes, e.g., H1-I); kb, kilobase or 1000 bp; nt, nucleotide(s); oligo, oligodeoxyribonucleotide; SDS, sodium dodecyl sulfate; ss, single strand(ed); tsp, transcription start point(s); UTR, untranslated region(s); V., Volvox.
RESULTS AND DISCUSSION
(a) Copy number and gene organization at two H1 loci
Clones λVH1-821 and λVH1-1112 containing two non-allelic H1 gene loci (Fig. 1) were isolated by differential screening of a λEMBL3-based genomic library of V. carteri HK10 (Mages et al., 1988) using sea urchin H1 DNA as the probe (Levy et al., 1982). The organization of the two H1 genes, H1-I and H1-II, as inferred from the genomic and cDNA sequences (the latter retrieved by reverse transcription of mRNA and PCR using ON3 and ON1 or ON2, respectively, as primers), is illustrated in Fig. 1. It summarizes the insular location of each H1 gene, their physical maps, relative locations of four exons and three introns, promoter elements, and transcription termination signals, as determined in this study. Southern hybridization of total DNA digested with BamHI, NcoI and PstI, respectively, using a conserved 234-bp HaeIII subfragment (derived from H1-I cDNA) as a probe revealed the presence of at least four H1 gene copies in the V. carteri genome (Fig. 2). Since the cDNA probe and the corresponding genomic sequence did not contain BamHI and NcoI restriction sites, each separate band in these genomic digests should represent a separate gene. The signals at 5.1 kb (BamHI), 3.5 kb (NcoI) and 1.0 kb (PstI) in Fig. 2 correspond to H1-I and may represent double bands, whereas the less intense signals at 4.2 kb (BamHI), 4.3 kb (NcoI) and 0.8 kb (PstI) correspond to H1-II. Two weaker signals (in each lane) are believed to represent additional members of the H1 gene family. This brings the number of H1 gene copies to at least four per haploid Volvox genome. Each H1 gene occurs as a singlet, and none is physically linked to gene clusters that encode the four nucleosomal histones (Müller and Schmitt, 1988; Müller et al., 1990), suggesting that they may occupy quite separate map positions.

(b) Nucleotide sequence analysis
Appropriate fragments from λVH1-821 and λVH1-1112 (Fig. 1) were used to determine the complete 1191-nt and 1113-nt sequences of the H1-I and H1-II genes, plus their upstream and downstream UTRs, along with the deduced aa sequences of the encoded H1 polypeptides, as presented in Figs. 3A and 3B. All regions were sequenced in both directions from overlapping subfragments. These Volvox H1 genes possess distinctive structural features, as each coding region is interrupted by three introns, 93 to 175 bp in length. The vast majority of H1 genes are intronless and we are aware of only three organisms whose H1 coding regions are interrupted by single introns, namely Tetrahymena thermophila (Wu et al., 1986), Caenorhabditis elegans (Sanicola et al., 1990) and Arabidopsis thaliana (Gantt and Lenvik, 1991). We note that A. thaliana H1-I shares its intron position with the first intron of Volvox H1-I, but intron positions in the T. thermophila and C. elegans H1 genes are different. A comparison of the V. carteri H1 gene structures assigns different positions to the first and the third intron, but reveals that the second intron position is precisely conserved. Its location near the transition between the globular and C-terminal domains (see section c) may be interpreted in terms of the 'domain shuffling' hypothesis (Doolittle, 1978; Blake, 1983). Alternatively, this may simply be another example of the curious phenomenon that introns are generally located in nt sequences determining residues on the surface of the encoded protein (Rogers, 1985; 1989). The splice sites for all six introns interrupting the two V. carteri H1 genes are conserved with the consensus splice sites of other plant species (Brown, 1986). However, the elevated AT content that typically precedes the 3' splice site of higher plant introns and that is considered essential for their correct splicing (Wiebauer et al., 1988) does not appear to exist in V. carteri. Rather, a polypyrimidine tract and a prevalent 3' consensus sequence, PuCAGG (Gruber et al., 1992), appear to guide splicing at the acceptor site.

Fig. 1. Physical maps of two genomic clones (λVH1-821 and λVH1-1112), each containing a single H1 gene. Both λEMBL3 derivatives were isolated from a V. carteri genomic library (Mages et al., 1988) using a sea urchin H1 probe (Levy et al., 1982) and were mapped by restriction analysis. The relative location of the H1 genes (black boxes) was determined by Southern hybridization using the gene-specific sea urchin DNA probe. 1711-bp and 2140-bp subclones containing the genes, H1-I and H1-II, the sequence-derived location of exons (shaded boxes E1-E4) and introns (heavy lines) and deduced transcription signals in the 5' and 3' UTRs (thin lines) are shown on expanded scales. Numbering at exon boundaries refers to codons; intron sizes are indicated in bp. Promoter (E, 20-bp element; T, TATA box) and transcription termination signals (two-headed arrows: *, 3' palindrome) as derived from S1 mapping and sequence comparisons (Fig. 6; Müller et al., 1990) are shown. Arrows marked ON1, ON2 and ON3, respectively, indicate location and polarity of oligos used for amplifying the corresponding cDNAs; oligos ONS1-I and ONS1-II and hatched boxes representing restriction fragments 0.46RP (0.46-kb RsaI-PstI) and 0.47SP (0.47-kb SalI-PstI) were used for S1 mapping (Fig. 5). A, AccI; B, BamHI; E, EcoRI; P, PstI; R, RsaI; S, SalI; T, TaqI; X, XhoI.
[Graphic not reproduced. Legible labels: scale bars of 1 kb (phage inserts) and 0.1 kb (subclones); subclone sizes 1711 bp (VH1-I) and 2140 bp (VH1-II); codon numbers at exon boundaries 1, 64, 65, 129, 130, 168, 169, 262 (VH1-I) and 1, 60, 61, 95, 96, 127, 128, 242 (VH1-II).]

Fig. 2. Southern hybridization of V. carteri HK10 genomic DNA. DNA samples (5 µg) were digested with BamHI (lane 1), NcoI (lane 2), and PstI (lane 3), electrophoresed on a 0.8% agarose gel and transferred to a nylon membrane. The blot was probed at 57°C with H1-I GD-[α-32P]DNA (a 234-bp HaeIII fragment from the conserved central portion of H1-I cDNA) and washed in 0.2 × SSC/0.1% SDS at 50°C (2 × 30 min). Size markers are shown in the right margin. SSC is 0.15 M NaCl/0.015 M Na3·citrate, pH 7.6.
[Autoradiograph not reproduced. Sizes marked in the margin (kb): 5.1, 4.2, 3.5, 1.0, 0.8.]
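The codon numbers that Fig. 1 prints at the exon boundaries can be turned into exon sizes with a few lines of arithmetic. The sketch below is illustrative only: it assumes that the numbers pair up into inclusive first/last codon ranges for exons E1-E4 (an interpretation of the figure labels, not something the legend spells out), and the names `EXONS`, `MATURE_AA` and `summarize` are hypothetical.

```python
# Codon numbers at the exon boundaries of Fig. 1, read here as inclusive
# (first, last) codon ranges for exons E1-E4. This pairing is an assumption.
EXONS = {
    "H1-I": [(1, 64), (65, 129), (130, 168), (169, 262)],
    "H1-II": [(1, 60), (61, 95), (96, 127), (128, 242)],
}

# Lengths of the deduced polypeptides as given in the text (aa residues).
MATURE_AA = {"H1-I": 260, "H1-II": 240}


def summarize(gene):
    """Return exon sizes (codons), total codons, coding nt and surplus codons."""
    per_exon = [last - first + 1 for first, last in EXONS[gene]]
    total = sum(per_exon)
    return {
        "per_exon": per_exon,
        "codons": total,
        "coding_nt": 3 * total,
        "extra_codons": total - MATURE_AA[gene],
    }


if __name__ == "__main__":
    for gene in EXONS:
        print(gene, summarize(gene))
```

Under this reading both genes are two codons longer than the deduced polypeptides, which would fit an initiator Met that is absent from the processed protein (Fig. 3 legend) plus a stop codon. The nt totals also equal the coding-region lengths quoted in section (e).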
(c) Coding regions
Of all histones, the lysine-rich H1 proteins exhibit the highest degree of heterogeneity, and indications are that a family of H1 isotypes exists that differ in their functional specificities (Cole, 1984; Mohr et al., 1989). This variability is clearly reflected in the derived H1 polypeptide sequences of V. carteri that contain 260 (H1-I) and 240 (H1-II) aa residues, respectively (Fig. 3). Both exhibit the typical three-domain pattern of H1 proteins (Doenecke, 1988) with a highly conserved central globular core flanked by variable N- and C-terminal domains, as illustrated in Fig. 4 (top). Substantial differences between the variable regions in both their lengths (N: 54 vs. 20 aa residues; C: 129 vs. 143 aa) and composition (N: 60% identity; C: 54% identity) contrast with the uniform size and high conservation (79% identity) of the central 77-aa domain. These are the first algal H1 sequences to be characterized. We will, therefore, briefly describe salient features of the three Volvox H1 domains, and compare them to known H1 proteins from animals and higher plants (Fig. 4B).

Although the N-terminal domains of the two Volvox H1 proteins differ considerably, they share a common sequence arrangement: the N-terminal sequence (H1-I: aa 1-22; H1-II: aa 1-7) invariably lacks basic (but not acidic) residues, whereas the adjacent portion (H1-I: aa 23-54; H1-II: aa 8-20) contains from 6 to 13 lysine residues (Fig. 4A). This bipartite organization of the N-terminal domain is considered essential for the correct positioning of the central globular domain at the DNA entry and exit of the nucleosome (Allan et al., 1986). A sequence found in H1-I, PKQPKAPKAPKEPKAPK, distinguishes the lysine-rich portions of H1-I and H1-II N-terminal domains. A quite similar sequence has been identified in a variant (named H1 I-1) of the midge Chironomus thummi thummi by Schulze et al. (1993), as illustrated in Fig. 4A. It has been inferred that the bipartite aa-sequence motif KAPKAP-KVA and the spacing between these two sequence elements may be responsible for an observed binding specificity for highly condensed polytene chromosomal bands (Schulze et al., 1993). Of all known H1 sequences inspected, only those of the midge Chironomus, the green alga Volvox and the nematode Caenorhabditis (Vanfleteren et al., 1990) have a single variant that contains this conspicuous motif (Fig. 4A). Its conservation in three organisms which are phylogenetically widely separated suggests a general functional role for this sequence, possibly in generating a condensed chromatin subtype (Schulze et al., 1993).
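The domain lengths quoted above can be laid end to end to recover the residue ranges of each domain. The following is a minimal sketch under the assumption that the three domains abut without overlap or gaps; `DOMAIN_LENGTHS` and `domain_bounds` are hypothetical names used only for illustration.

```python
# Domain lengths (aa) as quoted in the text for the two V. carteri proteins.
DOMAIN_LENGTHS = {
    "H1-I": (("N-terminal", 54), ("globular", 77), ("C-terminal", 129)),
    "H1-II": (("N-terminal", 20), ("globular", 77), ("C-terminal", 143)),
}


def domain_bounds(domains):
    """Map each domain name to its inclusive (first, last) residue numbers."""
    bounds = {}
    start = 1
    for name, length in domains:
        bounds[name] = (start, start + length - 1)
        start += length
    return bounds


if __name__ == "__main__":
    for protein, domains in DOMAIN_LENGTHS.items():
        print(protein, domain_bounds(domains))
```

With these inputs the last residues come out at 260 (H1-I) and 240 (H1-II), matching the stated polypeptide lengths, and the C-terminal domains start at residues 132 and 98.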
Fig. 3. The nt sequences of (A) the 1711-bp (H1-I) and (B) the 2140-bp fragment (H1-II) from Fig. 1 and the deduced aa sequences. Overlapping subfragments of recombinant phage DNA were cloned into pUC8 (Vieira and Messing, 1982) and both strands sequenced with Sequenase (US Biochemical, Cleveland, OH, USA) using the dideoxy chain-termination method (Sanger et al., 1977). The location of introns and coding sequences has been verified by cDNA sequencing of reverse-transcribed mRNAs using ON3 for priming ss cDNAs (reverse transcriptase from Boehringer Mannheim, Germany) and ON1/ON3 (H1-I) and ON2/ON3 (H1-II), respectively, as primers for PCR amplification (Taq polymerase, Cetus Corp., Norwalk, CT, USA; 30 cycles). Introns are displayed in lower case letters. Numbering in the margin refers to nt (upper lines) and aa residues (lower lines), respectively; aa are aligned with the first nt of each codon. Accounting for the lack of an N-terminal Met in processed H1 proteins (Wells and McBride, 1989), numbering of aa residues starts at Ser1 and Ala1, respectively. Domain boundaries (Fig. 4) are indicated by upward arrowheads. The 20-bp elements (E), TATA boxes (T) and 3' palindromes (two-headed arrows) and the tsp (+1) derived from S1 mapping (Fig. 5) are shown. Asterisks mark stop codons. Synthetic oligos (Fig. 1) are represented by 5'-to-3' arrows with their symbols above the nt sequence. Nucleotide sequences were processed by the GCG software package (Devereux et al., 1984). The sequences have been deposited in the GenBank Nucleotide Sequence Database under accession Nos. L07946 (H1-I) and L07947 (H1-II).

The central globular domain of H1, which seals the two ends of a DNA loop wrapped around the nucleosomal core, is the one most conserved in all H1 proteins. It consists largely of non-polar hydrophobic residues interspersed with charged (mainly basic) aa residues. The two sequenced V. carteri central domains exhibit 79% identity to each other, 47%-57% with higher plant species and about 40% with animal globular domain sequences. This sequence conservation is demonstrated by an alignment of nine globular H1 domains from V. carteri, higher plants, and animals in Fig. 4B. Algal and higher plant H1 proteins share 42 conserved residues in the globular region; among these aa, 22 residues are also conserved in the animal consensus sequence. Moreover, this comparison reveals several interesting properties of the central domain (Fig. 4B). First, superimposition of the tertiary structure model of the chicken H5 histone central domain (Clore et al., 1987) on the two V. carteri H1 proteins reveals three potential α-helical regions termed A, B and C in the Volvox proteins. These regions create one principal and two secondary DNA binding sites in the chicken protein, so conceivably the Volvox motifs serve analogous functions (Crane-Robinson and Ptitsyn, 1989). Second, invariant and conservative surface residues, which are considered responsible for internucleosomal DNA interactions, are also found in the Volvox H1 globular domain; these are marked in Fig. 4B. Of 20 conserved aa from animal histones H5 and H1 (Crane-Robinson and Ptitsyn, 1989), 13 are found in V. carteri and higher plants, with six basic aa that are likely to interact with negatively charged DNA. Finally, all animal H1 central domains have a highly conserved pentapeptide motif, GTGAS, near the C terminus of the globular domain. This motif is not found in the V. carteri and higher plant sequences, thus distinguishing the plant and animal H1 species.

The variable C-terminal domains of H1 proteins are rich in basic residues. The predominant aa residues are Ala, Lys and Pro. This portion of the protein has been associated with the formation of higher order chromatin structures in other organisms (Doenecke, 1988). These
Fig. 4. Three-domain structure of histone H1 proteins (top diagram: numbering refers to V. carteri H1-I and H1-II proteins) and special features of the N-terminal (A), central (B) and C-terminal (C) peptide sequences. (A) Partial aa sequences of N-terminal domains from V. carteri (V.c.), Chironomus thummi thummi (C.t.t.) and Caenorhabditis elegans (C.e.) with H1 proteins aligned relative to the conserved KAPKAP-KV/AA motif (Schulze et al., 1993, modified). Numbering refers to aa residues between the N-terminal Ser (or Ala), the KAPKAP motif, and the KV/AA motif. The open box marked 'CD' represents the start of the globular domain. (B) Comparison of V. carteri, higher plant and animal H1 central domain aa sequences. The V. carteri H1 peptide sequences (vH1-I; vH1-II) were aligned with those of pea (pH1; Gantt and Key, 1987), Arabidopsis (aH1-1 and aH1-2; Gantt and Lenvik, 1991), maize (mH1; Razafimahatratra et al., 1991), wheat (wH1; Yang et al., 1991), a sea urchin early embryogenesis variant (sH1e; Levy et al., 1982) and an animal consensus sequence averaged from 30 H1 peptide sequences (con; Wells and McBride, 1989). Dots represent identical aa. Numbering of aa residues includes the gaps (-) that have been introduced for optimizing the alignments. Note that the algal and higher plant sequences lack a conserved pentapeptide, GTGAS, typical for animal H1 sequences. The location and extent of three helical domains (Clore et al., 1987), termed A, B and C, are pointed out by shadowed boxes; residues conserved among V. carteri and plants, and among V. carteri, plants and animals, are marked by distinct symbols. Open arrowheads mark invariant and conservative surface residues deduced from the chicken H5 globular domain, and black arrowheads also those conserved in plant H1s. The positions of acidic (-) and basic (+) residues in the animal consensus are pointed out as well. Basic residues thought to interact with linker DNA in condensed chicken chromatin structure are indicated (+) (Crane-Robinson and Ptitsyn, 1989). (C) Primary structures of the histone H1-I and H1-II C-terminal domains from V. carteri. Numbering as in Fig. 3. Repetitious motifs KKS/ATP (H1-I) and PKKAA/KA (H1-II) are underlined; DNA-binding units S/TPKK (Suzuki, 1989) are in outlined letters. Asterisks (*) represent stop codons.
[Graphic not reproduced. Legible rows of the panel B alignment (82 columns including gaps):
vH1-I   APTHPPYIEMVKDAITTLKERNGSSLPALKKFIENKYGKDIHDKNFAKTLSQVVKTFVKGGKLVKVK-----GSFKLSEALK
vH1-II  ........Q..T...LS....D............A.........K.P....LAL.....N.......-----N.Y...D.Q.
pH1     PAS..T.E..I....VS...K....QY.IA....E.Q-.QLPA-..K.L.L.NL.KN.AS...I...-----......A.A.
aH1-1   VSS..T.E..I....V.....T...QY.IQ....E.R-.ELPP-T.R.L.LLNL.RL.AS.......-----A....PS.-S
aH1-2   TSS..T.E..I....V.....T...QY.IQ....E.H-.SLPP-T.R.L.LVNL.RL.ASE......-----A...IPS.RS
The rows for mH1, wH1, sH1e and con, and the sequences of panels A and C, are illegible in this copy. Panel C labels place the start of the C-terminal domains at aa 132 (H1-I) and aa 98 (H1-II).]
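The 79% identity quoted for the two V. carteri globular domains can be recomputed from the first two rows of the Fig. 4B alignment. The sketch below is a hedged illustration, not the authors' procedure: it assumes the dot notation means "identical to vH1-I", counts only columns in which neither sequence has a gap, and relies on the sequences as they survive in this copy. The names `expand` and `identity` are hypothetical.

```python
# Rows vH1-I and vH1-II of the Fig. 4B alignment, split into 10-column
# chunks so the 82-column length is easy to check by eye.
VH1_I = (
    "APTHPPYIEM" "VKDAITTLKE" "RNGSSLPALK" "KFIENKYGKD" "IHDKNFAKTL"
    "SQVVKTFVKG" "GKLVKVK---" "--GSFKLSEA" "LK"
)
VH1_II_DOTTED = (
    "........Q." ".T...LS..." ".D........" "....A....." "....K.P..."
    ".LAL.....N" ".......---" "--N.Y...D." "Q."
)


def expand(reference, dotted):
    """Replace each '.' by the reference residue in the same column."""
    if len(reference) != len(dotted):
        raise ValueError("rows must have the same number of columns")
    return "".join(r if d == "." else d for r, d in zip(reference, dotted))


def identity(seq_a, seq_b):
    """Return (identical, compared, percent) over columns without gaps."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    same = sum(a == b for a, b in pairs)
    return same, len(pairs), 100.0 * same / len(pairs)


if __name__ == "__main__":
    vh1_ii = expand(VH1_I, VH1_II_DOTTED)
    same, compared, percent = identity(VH1_I, vh1_ii)
    print(f"{same}/{compared} identical residues = {percent:.1f}%")
```

On these two rows the count is 61 identical residues out of 77 compared, about 79%, in line with the value given in the text for the central 77-aa domain.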
characteristics are also true of the C-terminal portions of the two derived H1 polypeptides from V. carteri (Fig. 4C). Both species contain copies (H1-I: 1; H1-II: 3) of the tetrapeptide TPKK (Fig. 4C, shadowed), which closely resembles the SPKK motif implicated in DNA binding (Suzuki, 1989; Schulze et al., 1993). Distinguishing the H1-I and H1-II C-terminal regions are two repetitive sequence elements, namely KKATP (six repeats in H1-I) and PKKAA/KA (eight repeats in H1-II). As has been postulated for H1 species from other organisms (Mezquita et al., 1985), such repetitive clusters may have been generated by duplication, thus creating new specific variants during the evolution of H1 subspecies.

A comparison of the animal and plant H1 proteins studied thus far reveals two striking differences in the proteins. (i) Animal H1 proteins typically consist of about 220 aa, whereas the algal and plant H1 are larger, varying from 240 aa (H1-II) to 274 aa (A. thaliana H1-I). The globular domains of plant and vertebrate H1 proteins are nearly identical in size, as the large differences result from variations in the sizes of N- and C-terminal regions. (ii) The overall sequence identity of H1 subtypes within a single species is about 90% in most animal species, e.g., among the chicken H1 proteins (Coles et al., 1987), whereas the V. carteri H1 proteins exhibit an overall sequence identity of only 62%. Whether these differences between the two algal H1 proteins reflect functional differences for the Volvox proteins, such as differential binding to condensed chromatin (Schulze et al., 1993) or to axonemal microtubules (Multigner et al., 1992), will be studied by immunolocalization.

(d) S1 mapping and promoter signals
S1 mapping was used to determine the tsp of the H1-I and H1-II genes. Gene-specific ssDNA probes were generated by T4 polymerase-catalyzed extension of labeled primers, ONS1-I and ONS1-II (Fig. 3), using suitable fragments as unique templates (0.46RP and 0.47SP; Fig. 1). As shown in Fig. 5, mRNA synthesis starts at a G residue 63 nt upstream of the start codon in H1-I and at an A residue 85 nt upstream of the start codon in H1-II.

Two conserved, potential promoter signals are present in the 5' UTRs of both V. carteri H1 genes (Fig. 3): a TATA box, 90 nt (H1-I) and 105 nt (H1-II) upstream of the respective translational start points, and a 20-bp putative enhancer element at 36 nt (H1-I) and 29 nt (H1-II) upstream of the tsp, respectively. The latter resembles the consensus, 5'-GGNTCGGCTCACCGGGNCAA (Fig. 6A), previously identified as an apparent candidate for histone-specific transcriptional control in V. carteri (Müller and Schmitt, 1988; Müller et al., 1990). Of the 20-nt consensus, 12 nt (H1-I) or 13 nt (H1-II) are conserved in the two elements preceding the H1 genes. Unlike the
Fig. 5. S1 nuclease mapping of tsp for H1-I (A) and H1-II (B). Autoradiographs of signals after S1 digestion (S1), of controls (Co) using 50 µg tRNA, and of reference sequences (ACGT). The 5' ends of mRNAs were mapped with specific ssDNA probes generated on cloned 0.46RP (H1-I) and 0.47SP (H1-II) templates (Fig. 1) by priming with ONS1-I and ONS1-II (Fig. 3), respectively. RNA was isolated as described by Kirk and Kirk (1985) from synchronously growing V. carteri HK10 (female strain) during embryogenesis (Kirk and Harper, 1986). 5 × 10[exponent illegible] cpm of DNA probe were hybridized with 50 µg of total RNA and digested with 250 units of S1 nuclease according to Ausubel et al. (1987). Relevant portions of nt sequences (numbered as in Fig. 3) with the deduced tsp (+1) and presumptive TATA boxes are shown.
[Autoradiographs not reproduced. Legible panel labels: 0.46RP and 33 (panel A); 0.47SP and 26 (panel B); +1; lanes A, C, G, T, Co, S1.]
metazoan H1 genes, no other typical promoter signals such as an H1-CCAAT box, a G-rich element or an AC box (Eick, 1989; Osley, 1991) have been found upstream of the V. carteri genes (Fig. 3). However, starting about 250-300 nt upstream of the tsp, clustered A+T stretches, interrupted by short GC islands, in the UTRs of both H1 genes may have a specific role in regulating their transcription. The exact function of these upstream elements, including enhancer and A+T-rich sequences, will now be testable with the advent of genetic transformation in Volvox (Gruber et al., 1992; B. Schiedlmeier, pers. comm.), facilitating the in vivo expression of recombinant H1 genes preceded by serial deletions.

(e) 3' UTRs, pre-mRNA processing and gene expression
Like genes coding for the nucleosomal histones, each H1 gene possesses a typical 3' palindrome as a signal for pre-mRNA processing (Müller et al., 1990). An alignment with the previously established consensus (Fig. 6B) highlights the 27-bp conserved sequence (with central, 6-bp
Fig. 6. Comparison of conserved 5' UTR and 3' UTR elements. (A) Alignment of the conserved 20-bp element from the 5' UTRs of H1-I and H1-II compared to a consensus (N-con) sequence derived from the nucleosomal histone genes of V. carteri (Müller et al., 1990). Identical positions are highlighted by bold-face lettering. The distances to the corresponding TATA boxes are given in bp. (B) Conserved sequences from 3' UTRs from H1-I and H1-II genes, aligned with respect to the highly conserved 27-bp element (3' palindrome) and a moderately conserved 9-bp spacer element (SE), are compared to the corresponding V. carteri-derived consensus (N-con) from nucleosomal histone genes (Müller et al., 1990). Distances to translation stops (in bp), a 6-bp dyad symmetry (arrows), 3' termini of transcripts (downward arrows) defined by S1 mapping (Müller et al., 1990), and distance (2-7 bp) between 3' palindrome and SE for N-con are specified. N = A, C, G or T; W = A or T.

(A) 20-bp element and TATA box
Gene    20-bp element           TATA box
H1-I    TACGGGGATCAGCGGGCCAA    TATATA
H1-II   TCAGCGGGTCACCGGGCCAC    TAAAAC
N-con   GGNTCGGCTCACCGGGNCAA    TATA box (8-62 bp downstream)

(B) 3' palindrome and spacer element (SE)
Gene    Stop-to-palindrome (bp)   3' palindrome                  Intervening nt   SE           Following nt
H1-I    60                        CAAMTCGGTGTTTTTCAACACCACCA     TGTCCG           TTCGACCTT    GTGGAT
H1-II   55                        TCTAACCGGTGTTTTTCAACACCACCT    GCCTTG           TGCGAATTC    CGTCCC
N-con   22-50                     AAAATCCGGTGTTTTTCAACACCACCA    2-7 bp           TTGNAWCTT
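Two statements tied to Fig. 6 lend themselves to a quick check: the number of positions at which each 20-bp element matches the nucleosomal consensus, and the 6-bp dyad symmetry inside the 3' palindrome. The sketch below is a hedged illustration under explicit assumptions: the element sequences are read from Fig. 6A with the letters printed as "Q" in this copy taken to be bold-face G, consensus positions marked N are skipped when counting, and only the H1-II palindrome (which is intact here) is used. The function names are hypothetical.

```python
# 20-bp elements as read from Fig. 6A. Assumption: 'Q' in this copy is a
# bold-face 'G' that was misread during text extraction.
CONSENSUS = "GGNTCGGCTCACCGGGNCAA"
ELEMENTS = {
    "H1-I": "TACGGGGATCAGCGGGCCAA",
    "H1-II": "TCAGCGGGTCACCGGGCCAC",
}

# 27-bp 3' palindrome of H1-II as printed in Fig. 6B.
PALINDROME_H1_II = "TCTAACCGGTGTTTTTCAACACCACCT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def consensus_matches(element, consensus=CONSENSUS):
    """Count identical positions, ignoring consensus positions given as N."""
    return sum(e == c for e, c in zip(element, consensus) if c != "N")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_dyad(seq, arm=6):
    """Return (left arm, right arm, loop length) of the first inverted repeat."""
    for i in range(len(seq) - arm + 1):
        left = seq[i:i + arm]
        j = seq.find(reverse_complement(left), i + arm)
        if j != -1:
            return left, seq[j:j + arm], j - (i + arm)
    return None


if __name__ == "__main__":
    for gene, element in ELEMENTS.items():
        print(gene, consensus_matches(element), "of 18 defined positions")
    print(find_dyad(PALINDROME_H1_II))
```

Under these assumptions the counts are 12 (H1-I) and 13 (H1-II), the figures quoted in section (d), and the H1-II palindrome contains the arms GGTGTT and AACACC separated by four nt.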
inverted repeats) located 60 nt (H1-I) or 55 nt (H1-II) downstream of the respective translational stop codons. A less-conserved 9-nt spacer element is located 6 nt further downstream. Like the animal histone genes (Birnstiel et al., 1985), but unlike protozoan (Bannon et al., 1983), fungal (Fahrner et al., 1980) and higher plant (Chaboute et al., 1988; Chaubet et al., 1988) analogues that encode poly(A)+ mRNAs, the 3' palindrome signals the endonucleolytic cleavage of histone pre-mRNA to produce non-polyadenylated mRNA species in V. carteri (Müller et al., 1990). There are two lines of experimental evidence consistent with this animal-like mode of histone H1 pre-mRNA processing in V. carteri. (i) H1-I and H1-II cDNAs have been reverse transcribed from the poly(A)- RNA fraction isolated from Volvox embryos, using oligos ON3 (complementary to the 3' palindrome) and the gene-specific ON1 and ON2 (Fig. 3), respectively, as primers (see section b). Conversely, no H1 cDNAs have been obtained from the poly(A)+ RNA fraction (data not shown). (ii) Northern blots probed with gene-specific fragments, 0.46RP (H1-I) and 0.47SP (H1-II) (Fig. 1), revealed 0.9-kb hybridizing bands with each probe (data not shown). This mRNA size is in accord with the calculated values for non-polyadenylated mRNAs of 929 nt (H1-I: 63-nt 5' leader + 786-nt coding region + 87-nt 3' trailer) and 893 nt (H1-II: 85-nt 5' leader + 726-nt coding region + 82-nt 3' trailer), respectively. Taken together, these data suggest that histone H1 mRNAs, like their nucleosomal counterparts, lack a poly(A) tail.

Histone-encoding genes are among the best-studied examples of cell-cycle regulated genes (Osley, 1991). With regard to expression, they have been divided into replication-dependent histone genes, whose expression is regulated in the cell cycle, and replacement histone genes, whose expression occurs at a basal level throughout the cell cycle (Old and Woodland, 1984; Osley, 1991). To analyze the mode of H1 gene expression in Volvox, we harvested synchronized cultures at different stages throughout the 48-h cell cycle of V. carteri, with hour 0 defining the start point of embryogenesis. Total RNA was extracted from isolated cells, separated by gel electrophoresis, blotted to a nylon membrane, and probed with a cDNA fragment encoding the conserved H1 globular domain (Fig. 4). As shown in Fig. 7, expression was only detected at 6 h, corresponding to the termination of cell division. Thus, peak transcription of the H1 genes is restricted to late embryogenesis, classifying these genes as 'replication-dependent', but seems to occur subsequent to replication. At this point, the condensation of chromosomes accompanying mitosis and believed to require H1 has been completed. Short-interval sampling of RNA during embryogenesis and Northern analysis at increased sensitivity (using PCR) may tell whether H1 message is
Fig. 7. Stage-specific transcription of the H1 genes probed by Northern hybridization. RNA samples (10 µg) probed with DIG-labelled (Boehringer Mannheim, Germany) H1-I GD-DNA (a 234-bp HaeIII fragment from the conserved central portion of H1-I cDNA; overnight annealing at 37°C in 50% formamide/5 × SSPE/10 × Denhardt's solution/1% SDS/300 µg per ml herring sperm DNA; two 30-min washes in 0.3 × SSC/0.1% SDS at room temperature). Numbering of lanes corresponds to time of sampling through the 48-h Volvox life cycle (0 h = beginning of cell divisions). Size marker (left) refers to hybridizing band. For SSC see Fig. 2 legend.
[Blot not reproduced. Lane labels (h): 0, 6, 12, 18, 24, 30, 42, 46; size marker: 0.9 kb.]
also synthesized at low levels during the mitotic divisions. At 6 h (Fig. 7), two flagella are produced by each somatic cell (Schmitt and Kirk, 1992), and it may be speculated that axonemal microtubules require the bulk of either specific or all members of the H1 family for stabilization. This contention, kindled by the recent demonstration of histone H1-mediated stabilization of flagellar microtubules in sea urchin (Multigner et al., 1992), has to be tested in the Volvox system by immunolocalization using specific antibodies raised against the various H1 subtypes.
(f) Conclusions
(1) The first algal histone H1 genes have been isolated and characterized. There are at least four different H1 genes in the V. carteri genome, two of which have been sequenced and characterized. Each H1 gene occurs as a singlet and maps separately from the nucleosomal histone-encoding gene clusters (Müller et al., 1990).
(2) The H1-I and H1-II coding regions are interrupted by three introns, a feature distinguishing them from other known H1 genes. The first and the third intron occupy different positions in the two analyzed H1 genes, whereas the location of the second intron is conserved.
(3) The derived 260-aa (H1-I) and 240-aa (H1-II) polypeptides exhibit the typical three-domain pattern of H1 proteins, with a highly conserved central globular core flanked by variable C-terminal and N-terminal domains. The N-terminal domain of H1-I contains a repetitive LysAlaPro motif at a discrete distance from a LysAlaAla motif, a configuration reported to specifically bind to highly condensed bands in the polytene chromosomes of the midge Chironomus (Schulze et al., 1993). The C-terminal domains are distinguished by different lysine-rich repetitive elements thought to direct higher-order chromosome structure.
(4) Transcription of the two H1 genes initiates 33 nt (H1-I) and 26 nt (H1-II) downstream of a TATA box motif. An upstream 20-bp putative enhancer element partially resembles the enhancer sequence associated with the V. carteri nucleosomal histone genes and is thought to regulate and coordinate the levels of H1 transcripts. The significance of these elements, and of 35-nt A+T-rich regions preceding the tsp by 250-300 bp, remains to be established by recombinant techniques.
(5) The 3’ UTR of each V. carteri H1 gene contains a 27-nt 3’ palindrome and a 9-nt spacer element as signals for the endonucleolytic cleavage of nonpolyadenylated pre-mRNA. Non-polyadenylation of H1 mRNA was substantiated by Northern analysis, suggesting a mechanism of H1 mRNA processing like that in animals, but different from that in higher plants.
(6) H1 mRNA accumulated 6 h after the initiation of embryogenesis, when cell division terminates, but was not observed at any other stage of the Volvox life cycle. The production of new flagella concurrent with H1 gene expression fits with the proposed role of H1 in stabilizing the flagellar microtubules (Multigner et al., 1992).
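The 3’ palindrome named in conclusion (5) is an inverted repeat, that is, a stretch whose second half is the reverse complement of its first half, so that the transcript can fold into a stem-loop. As a minimal sketch of how such a signal might be located in a 3’ UTR, the Python snippet below scans a DNA string for perfect stem-loops. The function names, the default 6-bp stem and 4-nt loop, and the test sequence are illustrative assumptions only; they are not the V. carteri sequences or the palindrome dimensions reported here, and this is not the procedure used in this study.

```python
# Hypothetical helper names and parameters, chosen for illustration only.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_stem_loops(seq: str, stem_len: int = 6, loop_len: int = 4):
    """Find perfect stem-loops: a left arm of stem_len nt, a loop of
    loop_len nt, and a right arm that is the reverse complement of the
    left arm. Returns a list of (start, left_arm, loop, right_arm),
    with start as a 0-based index into seq."""
    seq = seq.upper()
    span = 2 * stem_len + loop_len
    hits = []
    for start in range(len(seq) - span + 1):
        left = seq[start:start + stem_len]
        loop = seq[start + stem_len:start + stem_len + loop_len]
        right = seq[start + stem_len + loop_len:start + span]
        if right == reverse_complement(left):
            hits.append((start, left, loop, right))
    return hits


# Made-up test sequence (assumption): a 16-nt stem-loop with short flanks.
toy_utr = "AAACGGCTCTTTTCAGAGCCACCCA"
print(find_stem_loops(toy_utr))
# -> [(4, 'GGCTCT', 'TTTC', 'AGAGCC')]
```

The sketch demands perfect base pairing; a real search would probably need to tolerate mismatches and G-U pairs in the stem, and would also check for the spacer element downstream of the palindrome.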
ACKNOWLEDGEMENTS
We thank David L. Kirk and Lai-Wa Tam for the gift of the stage-specific Volvox RNAs, Ekkehard Schulze and Ulrich Grossbach for sharing unpublished data, and Steve Miller for critical review of the manuscript. This investigation was supported by the Deutsche Forschungsgemeinschaft (SFB43).
REFERENCES
Allan, J., Mitchell, T., Harborne, N., Bohm, L. and Crane-Robinson, C.: Roles of H1 domains in determining higher order chromatin structure and H1 location. J. Mol. Biol. 187 (1986) 591-601.
Ausubel, F.M., Brent, K., Kingston, R.E., Moore, D.D., Smith, J.A., Seidmann, J.G. and Struhl, K. (Eds.), Current Protocols in Molecular Biology 1987-1988. Wiley, New York, 1987.
Bannon, G.A., Calzone, F.J., Bowen, J.K., Allis, C.D. and Gorovsky, M.A.: Multiple, independently regulated, polyadenylated messages for histone H3 and H4 in Tetrahymena. Nucleic Acids Res. 11 (1983) 3903-3917.
Birnstiel, M.L., Busslinger, M. and Strub, K.: Transcription termination and 3’ processing: the end is in site! Cell 41 (1985) 349-359.
Blake, C.: Exons, present from the beginning. Nature 306 (1983) 535.
Brown, J.W.S.: A catalogue of splice junction and branch point sequences from plant introns. Nucleic Acids Res. 14 (1986) 9549-9559.
Chaboute, M.E., Chaubet, N., Clement, B., Gigot, C. and Philipps, G.: Polyadenylation of histone H3 and H4 mRNAs in dycotyledonous plants. Gene 71 (1988) 217-223.
Chaubet, N., Chaboute, M.E., Clement, B., Ehling, M., Philipps, G. and Gigot, C.: The histone H3 and H4 mRNAs are polyadenylated in maize. Nucleic Acids Res. 16 (1988) 1295-1304.
Clore, G.M., Gronenborn, A.M., Nilges, M., Sukumaran, D.K. and Zarbock, J.: The polypeptide fold of the globular domain of histone H5 in solution. A study using nuclear magnetic resonance, distance geometry and restrained molecular dynamics. EMBO J. 6 (1987) 1833-1842.
Cole, R.D.: A minireview of microheterogenity in H1 histone and its possible significance. Anal. Biochem. 136 (1984) 24-30.
Coles, L.S., Robins, A.J., Madley, L.K. and Wells, J.R.E.: Characterization of the chicken histone H1 gene complement. J. Biol. Chem. 262 (1987) 9656-9663.
Crane-Robinson, C. and Ptitsyn, O.B.: Binding of the globular domain of linker histone H5/H1 to the nucleosome: a hypothesis. Prot. Eng. 2 (1989) 577-582.
Devereux, J., Haeberli, P. and Smithies, O.: A comprehensive set of sequence analysis programs for the VAX. Nucleic Acids Res. 12 (1984) 387-395.
Doenecke, D.: Histone, histone variants and postsynthetic histone modifications. In: Kahl, G. (Ed.), Architecture of Eukaryotic Genes. VCH-Verlagsgesellschaft, Weinheim, 1988, pp. 123-141.
Doolittle, W.F.: Genes in pieces: were they ever together? Nature 272 (1978) 581-582.
Eick, S., Nicolai, M., Mumberg, D. and Doenecke, D.: Human H1 histones: conserved and varied sequence elements in two H1 subtype genes. Eur. J. Cell Biol. 49 (1989) 110-115.
Fahrner, K., Yarger, G. and Herford, L.M.: Yeast histone mRNA is polyadenylated. Nucleic Acids Res. 8 (1980) 5725-5737.
Gantt, J.S. and Key, J.L.: Molecular cloning of a pea H1 histone cDNA. Eur. J. Biochem. 166 (1987) 119-125.
Gantt, J.S. and Lenvik, T.R.: Arabidopsis thaliana H1 histones. Analysis of two members of a small gene family. Eur. J. Biochem. 202 (1991) 1029-1039.
Gruber, H., Goetinck, S.D., Kirk, D.L. and Schmitt, R.: The nitrate reductase-encoding gene of Volvox carteri: map location, sequence and induction kinetics. Gene 120 (1992) 75-83.
Grunstein, M.: Histone function in transcription. Annu. Rev. Cell Biol. 6 (1990) 643-678.
Kirk, D.L.: The ontogeny and phylogeny of cellular differentiation in Volvox. Trends Genet. 4 (1988) 32-36.
Kirk, D.L. and Harper, J.F.: Genetical, biochemical and molecular approaches to Volvox development and evolution. Int. Rev. Cytol. 99 (1986) 217-293.
Kirk, M.M. and Kirk, D.L.: Translational regulation of protein synthesis, in response to light, at a critical stage of Volvox development. Cell 41 (1985) 419-428.
Larson, A., Kirk, M.M. and Kirk, D.L.: Molecular phylogeny of the volvocine flagellates. Mol. Biol. Evol. 9 (1992) 85-102.
Levy, S., Sures, I. and Kedes, L.: The nucleotide and amino acid sequence of a gene for H1 histone that interacts with euchromatin. J. Biol. Chem. 257 (1982) 9438-9443.
Mages, W., Salbaum, L.M., Harper, J.F. and Schmitt, R.: Organization and structure of Volvox α-tubulin genes. Mol. Gen. Genet. 213 (1988) 449-458.
Mezquita, J., Connor, W., Winkfein, R.J. and Dixon, G.H.: An H1 histone gene from rainbow trout (Salmo gairdnerii). J. Mol. Evol. 21 (1985) 209-219.
Mohr, E., Trieschmann, L. and Grossbach, U.: Histone H1 in two subspecies of Chironomus thummi with different genome sizes: homologous chromosome sites differ largely in their content of a specific H1 variant. Proc. Natl. Acad. Sci. USA 86 (1989) 9308-9312.
Müller, K. and Schmitt, R.: Histone genes of Volvox carteri: DNA sequence and organization of two H3-H4 gene loci. Nucleic Acids Res. 16 (1988) 4121-4136.
Müller, K., Lindauer, A., Brüderlein, M. and Schmitt, R.: Organization and transcription of Volvox histone-encoding genes: similarities between algal and animal genes. Gene 93 (1990) 167-175.
Multigner, L., Gagnon, J., Van Dorsselaer, A. and Job, D.: Stabilization of sea urchin flagellar microtubules by histone H1. Nature 360 (1992) 33-39.
Old, R.W. and Woodland, H.R.: Histone genes: not so simple after all. Cell 38 (1984) 624-626.
Osley, M.A.: The regulation of histone synthesis in the cell cycle. Annu. Rev. Biochem. 60 (1991) 827-861.
Rausch, H., Larson, N. and Schmitt, R.: Phylogenetic relationships of the green algae Volvox carteri deduced from small-subunit ribosomal RNA comparisons. J. Mol. Evol. (1989) 255-265.
Razafimahatratra, P., Chaubet, N., Philipps, G. and Gigot, C.: Nucleotide sequence and expression of a maize H1 histone cDNA. Nucleic Acids Res. 19 (1991) 1491-1496.
Rogers, J.: Exon shuffling and intron insertion in serine protease genes. Nature 315 (1985) 458-459.
Rogers, J.: How were introns inserted into nuclear genes? Trends Genet. 5 (1989) 213-216.
Sanger, F., Nicklen, S. and Coulson, A.R.: DNA sequencing with chain-termination inhibitors. Proc. Natl. Acad. Sci. USA 74 (1977) 5463-5467.
Sanicola, M., Ward, S., Childs, G. and Emmons, S.W.: Identification of a Caenorhabditis elegans histone H1 gene family. J. Mol. Biol. 212 (1990) 259-268.
Schmitt, R., Fabry, S. and Kirk, D.L.: In search of the molecular origins of cellular differentiation in Volvox and its relatives. Int. Rev. Cytol. 139 (1992) 189-265.
Schmitt, R. and Kirk, D.L.: Tubulins in the Volvocales: from genes to organized microtubule assemblages. In: Menzel, D. (Ed.), The Cytoskeleton of the Algae. CRC Press, Boca Raton, 1992, pp. 369-392.
Schulze, E., Trieschmann, L., Schulze, B., Schmidt, E.R., Pitzel, S., Zechel, K. and Grossbach, U.: Structural and functional differences between histone H1 sequence variants with differential intranuclear distribution. Proc. Natl. Acad. Sci. USA (1993) in press.
Suzuki, M.: SPKK, a new nucleic acid-binding unit of protein found in histone. EMBO J. 8 (1989) 797-804.
Thoma, F., Koller, T. and Klug, A.: Involvement of the histone H1 in the organization of the nucleosome and the salt dependent superstructures of chromatin. J. Cell. Biol. 83 (1979) 403-427.
Vanfleteren, J.R., Van Bun, S.M., De Baere, I. and Van Beeumen, J.J.: The primary structure of a minor isoform (H1.2) of histone H1 from the nematode Caenorhabditis elegans. Biochem. J. 265 (1990) 739-746.
Vieira, J. and Messing, J.: The pUC plasmids, an M13mp7-derived system for insertion mutagenesis and sequencing with synthetic universal primers. Gene 19 (1982) 259-268.
Wells, D. and McBride, C.: A comprehensive compilation and alignment of histones and histone genes. Nucleic Acids Res. 17 (1989) r311-r346.
Wiebauer, K., Herrero, J.-J. and Filipowicz, W.: Nuclear pre-mRNA processing in plants: distinctive modes of 3’-splice-site selection in plants and animals. Mol. Cell. Biol. 8 (1988) 2042-2051.
Wu, M., Allis, C.D., Richman, R., Cook, R.G. and Gorovsky, M.A.: An intervening sequence in an unusual histone H1 gene of Tetrahymena thermophila. Proc. Natl. Acad. Sci. USA 83 (1986) 8674-8678.
Yang, P., Katsura, M., Nakayama, T., Mikami, K. and Iwabuchi, M.: Molecular cloning and nucleotide sequences of cDNAs for histone H1 and H2B variants from wheat. Nucleic Acids Res. 19 (1991) 5077.
Zlatanova, J.: Histone H1 and the regulation of transcription of eucaryotic genes. Trends Biochem. Sci. 15 (1990) 273-276.