MOLECULAR PHYLOGENETICS AND EVOLUTION
Vol. 7, No. 3, June, pp. 377–393, 1997 ARTICLE NO. FY960395
Phylogenetic Relationships of the Liverworts (Hepaticae), a Basal Embryophyte Lineage, Inferred from Nucleotide Sequence Data of the Chloroplast Gene rbcL

Louise A. Lewis,1 Brent D. Mishler,2 and Rytas Vilgalys

Department of Botany, Duke University, Durham, North Carolina 27708

Received November 26, 1996
Sequence data from the chloroplast-encoded gene rbcL were obtained for 24 liverworts, a basal group of embryophytes. Maximum likelihood and parsimony analyses of these data, along with data from other major green plant lineages, confirm hypotheses based on morphological data, such as the paraphyly of bryophytes, and the basal position of liverworts. Molecular data corroborate the deep separation between the complex thalloid and leafy/simple thalloid liverworts implied by morphological data, but the monophyly of liverworts could not be rejected. The effects of accounting for site-to-site rate heterogeneity in these data were examined using maximum likelihood methods. Comparison of trees obtained with and without rate heterogeneity showed that simply allowing for heterogeneity had a greater improvement on likelihood score than optimization of transition/transversion bias. Incorporation of site-to-site rate heterogeneity in the larger analysis, however, did not necessarily change which topology was favored. Properties of rbcL sequences from the two liverwort groups were compared. Significantly different substitution rates were found between leafy/simple thalloid and complex thalloid liverwort taxa, with rates of rbcL sequence evolution in leafy/simple thalloid taxa being higher and more indicative of those of vascular plants, and with those of complex thalloid taxa (such as Marchantia) being slower. Codon usage in rbcL in complex thalloid liverworts was biased toward NNU and NNA, compared to the leafy/simple thalloid liverworts. Although base composition and relative substitution rates differed between the two groups, no significant differences were detected within each of the two groups of liverworts. The signal present in first and second codon sites versus third codon sites was compared. While the third codon positions in rbcL across this
taxon sampling are highly variable (with only 15 constant sites of 439), the trees obtained were in general agreement with trees from the entire data set and with trees obtained from independent sources of data. The presence of signal in third codon positions across greater than 400 MY of plant evolution means that definitions of saturation based on pair-wise comparisons of sequences inadequately assess phylogenetic signal. © 1997 Academic Press

1 Current address: University of New Mexico, Department of Biology, Albuquerque, NM 87131.
2 Current address: University Herbarium, Jepson Herbarium and Department of Integrative Biology, University of California, Berkeley, CA 94720.
INTRODUCTION

Several major life history changes occurred during the evolution of vascular plants from their green algal ancestors (Graham, 1993). One of the most dramatic changes was the formation of a multicellular embryo, involving both the shift from zygotic to sporic meiosis and the establishment of cellular interactions between cells of the different generations (reviewed in Graham, 1993). As apparently the earliest extant embryo-containing plants (embryophytes), the bryophytes (mosses, liverworts, hornworts) occupy a critical position in the evolution of green plants. A consideration of the phylogenetic placement of bryophytes is therefore important for the selection of outgroup taxa for evolutionary studies of vascular plants and in characterization of chloroplast genes among green plants. The liverwort Marchantia, for example, is commonly used as an outgroup to vascular plants (Gaut et al., 1992; Morton, 1994), but whether this taxon adequately represents close sister groups is not known.

Resolution of the branching order among the mosses, liverworts, and hornworts is controversial. Information from the fossil record is sparse, but the earliest records are of liverworts in Devonian-aged sediments (Schuster, 1984; Stewart and Rothwell, 1993). Morphological data provide evidence that the bryophytes are paraphyletic (Crandall-Stotler, 1980; Mishler and Churchill, 1984, 1985; Sluiman, 1985; Kendrick and Crane, 1991; Mishler et al., 1994; but see Garbary et al., 1993) and support the liverworts as the most basal
1055-7903/97 $25.00 Copyright © 1997 by Academic Press. All rights of reproduction in any form reserved.
LEWIS, MISHLER, AND VILGALYS
lineage of the three (Mishler and Churchill, 1984, 1985; Graham et al., 1992; Mishler et al., 1994). Molecular data have been equivocal, supporting a number of conflicting branching orders, in part due to poor taxon sampling or limited sequence lengths (Mishler et al., 1992, 1994; Waters et al., 1992; Manhart, 1994; Bopp and Capesius, 1995; Kranz et al., 1995). Results using molecular data have also varied depending on the selection of taxa and methods utilized (Kranz et al., 1995). However, the lack of extant intermediate taxa and the long time since divergence (more than 400 MYA, Stewart and Rothwell, 1993) probably contribute more to the poor resolution, as short internal branches compared to long external branches on a tree are known to cause spurious attraction of branches (Felsenstein, 1978).

Like other bryophytes, the liverworts are haploid-dominant plants with morphologically simple thalli (gametophytes), composed of a few to many cell layers. In some taxa, differentiation of cell layers has occurred to produce air chambers and pores, as is found in Marchantia, a complex thalloid taxon. Sexual reproduction in liverworts (and other bryophytes) gives rise to a relatively short-lived sporophyte generation that produces haploid spores following meiosis. Based on gametophytic characters, the liverworts have been divided into two subclasses: the Jungermanniidae, with leafy or simple thalloid gametophytes, and the Marchantiidae, with complex thalloid gametophytes (Schuster, 1984). Although the two subclasses are morphologically distinct, the liverworts are usually hypothesized to be a monophyletic group because of three synapomorphies: the occurrence of elaters in the sporophyte, and the presence of both oil bodies and lunularic acid in the gametophyte (Mishler and Churchill, 1984). The root of the liverwort clade has been the subject of debate.
Based on morphological characters (Schuster, 1984) and ultrastructural characters of male gametogenesis (Garbary et al., 1993), rooting was reported to be in the Calobryales (Jungermanniidae). On the other hand, cladistic analyses of morphological characters of the gametophyte and sporophyte have rooted the liverwort clade in the Marchantiidae (Mishler and Churchill, 1985). Mishler et al. (1994) used a data set combining ribosomal DNA sequence data and morphological data, and found the liverworts to be split into leafy (Porella) and complex thalloid taxa (Conocephalum, Asterella, Riccia), rooted between the two groups.

Several previous studies have used sequences from the chloroplast gene encoding the large subunit of ribulose-1,5-bisphosphate carboxylase/oxygenase (rbcL) in analyses of green plant phylogeny (e.g., Chase et al., 1993; Manhart, 1994; Hasebe et al., 1995), and its substitution rate appears appropriate for questions at this level. However, Manhart (1994) found rbcL too substitution-rich at third codon positions to be useful across chlorophyll a + b-containing photosynthetic
organisms, and analysis at the amino acid level produced unorthodox relationships. Recently, Goremykin et al. (1996) have criticized the use of rbcL for questions directed toward the level of basal tracheophytes because of the high amount of saturation (defined using pairwise comparisons) in the molecule.

Because the liverworts occupy an important position in the evolution of the land plants, we obtained rbcL sequence data from 24 liverwort taxa, representing a broad taxonomic sample. These sequences were combined with representatives of all extant major basal green plant groups and analyzed using maximum parsimony (MP) and maximum likelihood (ML). The results are compared with phylogenetic relationships inferred from 18S rRNA data and morphological data. In addition, we examine previous hypotheses concerning relationships among liverworts, including the possible rootings for the liverworts. Last, characteristics of rbcL sequences are examined for the two liverwort groups. A comparison of what has been found for the liverworts and vascular plants is necessary, as it is not known whether the rbcL gene of Marchantia typifies basal green plant lineages.

MATERIALS AND METHODS

Specimens were obtained from the field or from the living bryophyte collection of Malcolm Sargent, University of Illinois (see Table 1 for voucher information and GenBank Accession Nos.). All specimens received from M. Sargent were pure cultures, free of epiphytes. Field-collected material was washed in a dilute solution of Tween 20, followed by rinses in running water. Surfaces of the thalli were inspected for epiphytes under a dissecting microscope. All material was dried in a lyophilizer overnight.

DNA Extraction

A small amount of dried material (approximately 50 mg) was ground using liquid nitrogen in 1.5-ml Eppendorf tubes to yield a fine powder. Extractions followed a modification of the CTAB protocol listed in Hillis et al. (1996).
This entailed the addition of 2× CTAB + mercaptoethanol, extraction in a 60°C water bath for 45 min, and addition of an equal volume of chloroform:isoamyl alcohol. After removal of the aqueous layer and an ethanol precipitation, the DNA was cleaned by running through a 0.8% SeaPlaque low melting agarose gel. The high molecular weight DNA was quantified, cut from the gel, and diluted to 3–5 ng/µl.

PCR Amplification

Partial and entire rbcL gene segments were amplified by PCR, using the primers listed in Table 2. Approximately 30 ng of total DNA was added to a 50 µl PCR reaction containing 5 µl of a 10× buffer (30 mM MgCl2, 670 mM Tris, pH 8.8), 8 µl of deoxynucleoside
LIVERWORT rbcL PHYLOGENY
TABLE 1
List of DNA Study Reference Numbers (in Parentheses), Population, and Voucher Information, and GenBank Accession Numbers for the Taxa Used in the rbcL Study

Taxon                            GenBank accession number  Voucher information
Liverworts
  Asterella tenella (91)         U87064                    MLS from M. Cole; Monroe Co. MD; 30 July 1982
  Calypogeja muelleriana (66)    U87065                    MLS from B. Crandall-Stotler; IL; 5 Nov. 1977
  Conocephalum conicum (2)       U87066                    DUKE; New Hope Creek, Orange Co. NC; Mishler, DeLuna, and Hopple 3774; 8 May 1989
  Conocephalum conicum (92)      U87067                    MLS #392; Little Grand Canyon IL; 31 Mar. 1977
  Dumortiera hirsuta (100)       U87068                    DUKE; Raven Rock State Park, Harnett Co. NC; Mishler and Lewis 3777; 4 Feb. 1992
  Fossombronia foveolata (23)    U87069                    MLS from H. Crum and N. Miller; Vincent Lake, Cheboygan Co. MI; 16 Aug. 1982
  Geothallus tuberosa (68)       U87070                    MLS from W. T. Doyle; Kearney Mesa CA; 10 Dec. 1980
  Haplomitrium mnioides (29)     U87071                    MLS from B. Crandall-Stotler; Japan; Spring 1989
  Haplomitrium hookeri (93)      U87072                    MLS #H56 from W. T. Doyle; Prague; Aug. 1977
  Herbertus pensilis (98)        U87073                    DUKE; El Yunque, Puerto Rico; Mishler 3780; 25 Feb. 1992
  Jubula pennsylvanica (38)      U87074                    MLS from B. Crandall-Stotler; IL; 5 Nov. 1977
  Lepidozia reptans (70)         U87075                    MLS from K. Renzaglia; TN; 30 May 1983
  Lophocholea heterophylla (69)  U87076                    MLS #481; Portland Arch IN; 23 Mar. 1980
  Lunularia cruciata (94)        U87077                    MLS from D. Basile, New York Botanical Garden, NY; Jan. 1982
  Makinoa crispata (40)          U87078                    MLS from M. Mizutani; Kyushu, Japan; 17 Jan. 1980
  Marchantia polymorpha (5)      U87079                    DUKE; Durham Co., NC; Duke University Greenhouses; Mishler 3783; 9 May 1989
  Metzgeria furcata (3)          U87081                    DUKE; New Hope Creek, Orange Co., NC; Mishler, DeLuna, and Newton 3772; 9 May 1989
  Monoclea gottschei (97)        U87083                    DUKE; El Yunque, Puerto Rico; Mishler 3781; 25 Feb. 1992
  Pallavicinia lyellii (43)      U87084                    MLS #10/20/79, Pinhook Bog, IN; 20 Oct. 1979
  Pellia epiphylla (57)          U87085                    MLS #1 from Proskauer; York, England; 4 Aug. 1961
  Petalophyllum ralfsii (84)     U87086                    MLS from S. Ehlers; Huntsville State Park, Walker Co., TX; 12 Dec 1982
  Porella pinnata (4)            U87088                    DUKE; Orange Co. NC; Mishler, DeLuna, and Hopple 3773; 8 May 1989
  Ricciocarpos natens (26)       U87089                    MLS #9/81 from K. Spencer; Busey Woods, Urbana, IL; Sept. 1981
  Sphaerocarpos texanus (1)      U87090                    DUKE; Durham Co. NC; Schwartz; 5 May 1989
Mosses
  Mnium cuspidatum (82)          U87082                    MLS #295; Portland Arch, IN; May 1980
  Polytrichum commune (10)       U87087                    DUKE; Orange Co. NC; Mishler, DeLuna, and Newton 3775; 8 May 1989
  Tetraphis pellucida (9)        U87091                    DUKE; Buncombe Co. NC; Pittillo 9764; 8 July 1988
Hornworts
  Anthoceros punctatus (6)       U87063                    DUKE; Durham Co. NC; Mishler, DeLuna, Newton 3776; 9 May 1989
  Megaceros vincentianus (96)    U87080                    DUKE; El Yunque, Puerto Rico, Mishler 3782; 25 Feb. 1992

Note. MLS, specimens provided by M. L. Sargent, Univ. of Illinois. DUKE, voucher specimens deposited in the Cryptogamic Herbarium, Duke University, Durham, NC.
triphosphate mixture (1.25 mM each), 10 mM of each primer, and 5 U of Taq DNA polymerase (Cetus). The PCR reaction conditions were 30 cycles of the following: 94°C for 1 min, 47°C for 30 s, 72°C for 1 min, plus a 2 s per cycle extension. Products were visualized by running on a 0.8% ethidium bromide-stained agarose gel. PCR amplification products were prepared for sequencing by cleaning with MagicPreps (Promega) following the manufacturer’s instructions. The cleaned products were quantified by comparison with a standard on an ethidium bromide-stained agarose gel.
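The stock concentrations and volumes given for the PCR reaction imply final working concentrations that are not stated explicitly. The following is a minimal sketch of that dilution arithmetic, using only the buffer, dNTP, and template figures quoted in the text; the function name `final_concentration` is a hypothetical helper introduced here for illustration.

```python
def final_concentration(stock, volume_ul, total_ul):
    """Concentration after diluting volume_ul of a stock into total_ul."""
    return stock * volume_ul / total_ul


TOTAL_UL = 50.0  # reaction volume stated in the text

# 5 µl of 10x buffer (30 mM MgCl2, 670 mM Tris) in a 50 µl reaction
mgcl2_mM = final_concentration(30.0, 5.0, TOTAL_UL)
tris_mM = final_concentration(670.0, 5.0, TOTAL_UL)

# 8 µl of a dNTP mixture at 1.25 mM each
dntp_each_mM = final_concentration(1.25, 8.0, TOTAL_UL)

# roughly 30 ng of total DNA in the 50 µl reaction
template_ng_per_ul = 30.0 / TOTAL_UL

print(f"MgCl2: {mgcl2_mM} mM, Tris: {tris_mM} mM")
print(f"each dNTP: {dntp_each_mM} mM, template: {template_ng_per_ul} ng/µl")
```

Under these figures the buffer is used at 1× (3 mM MgCl2, 67 mM Tris) and each dNTP is at 0.2 mM.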
DNA Sequencing

One picomole of PCR-amplified template was used in each cycle sequencing reaction (BRL kit and instructions). The primers were end-labeled with 33P. Eight percent denaturing polyacrylamide gels were used to separate the sequencing products. Approximately 98% of the bases were sequenced more than once (some up to six times), with 70% of the sequence data confirmed using both forward and reverse primers. Sequences were read and entered manually into ESEE (Cabot and Beckenbach, 1989). Alignment of the rbcL sequence data was straightforward due to the presence of codon structure. All data were independently checked against the original gels in order to minimize error. Nucleotide data were converted to protein data for purposes of checking the alignment and to check for the presence of stop codons.

Phylogenetic Analyses

Taxon sampling consisted of those newly sequenced accessions listed in Table 1, plus previously published sequences from two liverworts (Bazzania trilobata GenBank No. L11056, and Marchantia polymorpha X04465), one hornwort (Megaceros enigmaticus L13481), one moss (Brotherella recurvens L13475), and six tracheophytes (Angiopteris evecta L11052, Cycas circinalis
TABLE 2
Primer Sequences Used for PCR Amplification and Sequencing of Liverwort rbcL

Primer     Sequence
Forward primers
  M34      GGATTTAAAGCTGGTGT
  M286     CAATATATYGCTTATGTWGC
  M288     ATATATTGCTTATGTAG
  M313     TTAGATYTATTYGAAGAAGG
  M636     GCGTTGGAGAGATCGTTTCT
  M955     CGTATGTCTGGTGGAGATC
  M985     GGTACTGTWGTAGGTAAAC
  M1150    GTTTGGCATATGCCTGC
Reverse primers
  M305r    GTTATATAACGAATACATCG
  M596r    GCGCCACCTGAACTAAA
  M740r    CGATGACGTCCATGTAC
  M1010r   CATCCATTTGAACTTCC
  M1390r   GACGACGAACACTTYAWACCTTTC

Note. The primer numbers correspond to positions in the Marchantia rbcL gene (Ohyama et al., 1980). Primers are numbered from the 5′ end. Built-in redundancies are symbolized by their IUB ambiguity codes.
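Several primers in Table 2 contain IUB ambiguity codes (Y = C or T, W = A or T), so each degenerate primer stands for a small pool of concrete oligonucleotides. The sketch below expands such a primer into that pool using the standard nucleotide ambiguity table; the function names `expand_primer` and `degeneracy` are hypothetical helpers, not part of any software used in the study.

```python
from itertools import product

# Standard IUB/IUPAC nucleotide ambiguity codes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer):
    """Return every unambiguous sequence represented by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


def degeneracy(primer):
    """Number of distinct sequences in the primer pool."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


m286 = "CAATATATYGCTTATGTWGC"  # forward primer M286 from Table 2
for seq in expand_primer(m286):
    print(seq)
print("degeneracy:", degeneracy(m286))
```

M286 carries one Y and one W, so it expands to four sequences; primers without ambiguity codes, such as M34, expand to themselves.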
L12674, Ephedra tweediana L12677, Ophioglossum engelmannii L11058, Welwitschia mirabilis D10735, Zamia inermis L12683). We limited presentation of results to analyses using six tracheophyte taxa, since the inclusion of additional vascular plant sequences (Equisetum arvense L11053, Isoetes melanopoda L11054, Lycopodium digitatum L11055, Psilotum nudum L11059) did not change our overall conclusions. Because rooting is critical to this study, both Coleochaete orbicularis (L13477) and Nitella translucens (L13482) were included as outgroup taxa, representing two orders that span the divergence of the charophycean green algae (Kranz et al., 1995). Inclusion of additional charophyte rbcL sequences (not shown) did not influence our results.

The sequence from Jubula was omitted from the analyses because topologies obtained using Jubula, while well-resolved, were completely different from those obtained excluding Jubula and were inconsistent with any independent evidence (e.g., mosses nested within the liverworts). The Jubula sequence is not overly divergent, nor does it contain unusual substitution patterns. However, in no other case did a particular sequence have such an effect on the topology. The problem may arise because Jubula is the only included representative of one of the largest and most rapidly speciating clades of liverworts, a group that is highly diversified in lowland tropical rain forests. Further sampling in the two related, highly diverse families Jubulaceae and Lejeuneaceae (Schuster, 1984) will be necessary to resolve this problem.

For a comparison of the divergence between and within lineages, pairwise distance estimates were obtained using DNADIST in PHYLIP 3.5c (Kimura (1980) correction; Felsenstein, 1993).

Prior to the likelihood searches, site-to-site rate
heterogeneity and transition/transversion bias (ts/tv) were estimated for these data. Because ts/tv and rate heterogeneity are linked (Yang, 1994), ideally they should be estimated simultaneously. For a specified user tree of 40 taxa (corresponding to Fig. 3A), combinations of parameters were systematically chosen to determine the optimal ts/tv and rate heterogeneity parameters. Likelihood scores were determined using DNAML in PHYLIP 3.5c for this range of ts/tv and for relative rate category settings whose frequency and distribution approximated the gamma distribution (Yang, 1994), under the substitution model of Felsenstein (1984). The same comparison was made using PAUP* 4.0 d33 (Swofford, in press), except that the rate heterogeneity parameters could be estimated directly. PAUP* allows site-to-site rate heterogeneity to be modeled in several ways; the discrete gamma distribution method was used here, involving a single shape parameter (alpha) that was estimated from the data (see Swofford et al., 1996, for a discussion of optimizing over a tree). The shape parameter and ts/tv were then fixed at these optimum values during subsequent searches.

In order to examine rates of synonymous and nonsynonymous substitution among lineages, maximum likelihood relative rate tests were performed for selected pairs of green plant rbcL sequences using the program "Codrates" (Muse and Gaut, 1994) compiled for Unix. Relative codon usage was obtained using MEGA (Kumar et al., 1993) for the complex thalloid and leafy/simple thalloid liverworts.

Data Set 1: Land Plants

All parsimony searches of the rbcL data matrix were performed with the heuristic option of PAUP 3.1.1 (Swofford, 1993). Data were analyzed using three weighting schemes. The first method used equal weights, which is a strong weighting for gene sequences in which a ts/tv bias exists (Swofford et al., 1996).
Because transition and transversion substitutions are known to have different rates, a priori differential weighting in the form of a stepmatrix can assign greater weight to transversions. The assignment of values to the weights for parsimony from an a priori tree for rbcL data has been explored by Albert and Mishler (1992), and a weight ratio of 2:1 was used. In addition, special consideration for the codon structure and differences in ts/tv rate bias of rbcL has resulted in the "codons" stepmatrix of Albert et al. (1993). In all searches, a random addition of taxa (25 replicates) was used to generate initial trees for tree bisection and reconnection branch-swapping. All maximally parsimonious trees were held. Bootstrap support (Felsenstein, 1985) was determined for each topology using 100 sample replicates, with random addition of taxa, and heuristic searching. Decay values (Donoghue et al., 1992; Bremer,
1994) were also determined for the equal weights parsimony analysis.

ML searches were done using DNAML in PHYLIP 3.5c, with 10 jumbled additions of taxa, under the F84 model, with ts/tv and rate categories specified using values determined to be optimal (see above). PAUP* (Swofford, in press) was also used because, for our data, heuristic searches done in PAUP* examine approximately 20 times more trees than global searches in DNAML. In this case, the ts/tv was set to the optimum determined for the 40-taxon user tree (see above), and alpha (the shape parameter of gamma rate heterogeneity) was estimated.

Base compositional bias or nonstationarity of base composition across the tree can lead to artifactual relationships in phylogeny reconstruction (Lake, 1994; Lockhart et al., 1994; Steel, 1994). The LogDet correction, implemented in PAUP* under the minimum evolution criterion, allowed examination of whether these biases affected the topology.

In order to evaluate the difference between competing hypotheses, user trees representing these topologies were evaluated using the likelihood ratio test of Kishino and Hasegawa (1989) implemented in PHYLIP 3.5c (Felsenstein, 1993), under the F84 model, with values of ts/tv and gamma set to those found earlier.

Signal contained in the first and second codon positions was compared with that of third positions, in order to address whether rbcL is saturated at this level of divergence. Comparisons were done using equal-weighted parsimony so that results could be compared with those for the entire data set. Trees for first + second codon positions and for third codon positions were obtained using heuristic searches with TBR branch swapping (with 10 random additions of taxa) and MULPARS. Bootstrap estimates were obtained as for the searches described above.

Data Set 2: Liverworts Only

Because divergent outgroup sequences can influence the topology of ingroup taxa, we restricted some searches to the liverwort taxa only.
MP analyses were done as for Data Set 1, except that more thorough searching was possible for this smaller data set. ML analyses were performed in PAUP*, estimating substitution parameters, rate heterogeneity parameters, and proportion of invariant sites simultaneously with the tree.

Data Sets 3 and 4: Complex Thalloid Liverworts Alone and Leafy and Simple Thalloid Taxa Alone

In our analyses of liverwort taxa, it became apparent that each of the two major groups of liverworts may also be influencing the branching order of the other group. For each of these data sets, MP analyses were done as for Data Set 1. ML searches involved estimation of ts/tv and alpha directly during the searches.
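The comparison of first + second versus third codon positions described in the Methods relies on partitioning an in-frame alignment by codon position and counting constant sites in each partition. A minimal sketch of that bookkeeping follows; the three short sequences and taxon labels are hypothetical toy data, and the helper names are introduced here only for illustration.

```python
def split_codon_positions(alignment):
    """Split an in-frame alignment into (first + second, third) position sets.

    `alignment` maps taxon name to an aligned sequence that starts at codon
    position 1; all sequences must have the same length.
    """
    pos12 = {}
    pos3 = {}
    for name, seq in alignment.items():
        pos12[name] = "".join(b for i, b in enumerate(seq) if i % 3 != 2)
        pos3[name] = seq[2::3]
    return pos12, pos3


def count_constant_sites(alignment):
    """Count alignment columns in which every taxon has the same base."""
    columns = zip(*alignment.values())
    return sum(1 for column in columns if len(set(column)) == 1)


# Hypothetical toy alignment (three codons), not data from this study
toy = {
    "taxonA": "ATGGCTCCA",
    "taxonB": "ATGGCACCG",
    "taxonC": "ATGGCGCCT",
}

first_second, third = split_codon_positions(toy)
print("constant first + second sites:", count_constant_sites(first_second))
print("constant third sites:", count_constant_sites(third))
```

This sketch treats every character as a base; an alignment with missing data would need those columns handled explicitly before counting.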
RESULTS

A total of 1315 sites was obtained for the study taxa, excluding primer sequences (alignment available in Nexus format from http://biology.unm.edu/~lewisl/livrbcL.html). Sites not resolved were coded as missing. There were no confirmed insertions or deletions of codons detected, and no introns were found.

Pairwise distance estimates between groups of bryophytes (Table 3) indicate that the rbcL genes of these taxa are quite divergent. The expected number of substitutions per site (divergence) between congeneric taxa is 0.005 for Conocephalum, 0.019 for Marchantia, and 0.073 for Haplomitrium. The average pairwise divergence within the Jungermanniidae is 0.143, while within the Marchantiidae it is 0.070. Between the two subclasses, the pairwise distance estimate is 0.144, and between the liverworts and mosses, 0.141.

Collectively, the hornwort rbcL sequences contain stop codons at three amino acid positions. All three hornwort sequences share UGA at amino acid position (aa) 41, instead of the codon for arginine (CGA). Both Megaceros rbcL sequences share UGA, instead of CGA, at aa 83, and Anthoceros has one additional stop codon (UAA) at aa 45, instead of glutamate (CAA). In addition to stop codons, there are two changes in amino acid positions corresponding to active sites (aa 177 and 205; Knight et al., 1990) of rbcL in Megaceros-96.

Fitting a Model to the Data

A user tree corresponding to the topology in Fig. 3A was input with several combinations of ts/tv and relative rate categories to determine which of the two parameters had the most effect on likelihood score (Fig. 1). Note that ts/tv set to 0.5 corresponds to equal rates of transitional and transversional substitution (the Felsenstein 1981 model).
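The correspondence between ts/tv = 0.5 and equal substitution rates follows from simple counting: of the 12 possible changes between different bases, 4 are transitions and 8 are transversions. The sketch below enumerates them, assuming for illustration that all 12 changes are equally likely and that base frequencies are equal; with unequal base frequencies the expected ratio would differ somewhat.

```python
from itertools import permutations

PURINES = {"A", "G"}


def is_transition(x, y):
    """True for purine-purine or pyrimidine-pyrimidine changes."""
    return (x in PURINES) == (y in PURINES)


# All 12 ordered changes between different bases
changes = list(permutations("ACGT", 2))
n_transitions = sum(1 for x, y in changes if is_transition(x, y))
n_transversions = len(changes) - n_transitions

ratio = n_transitions / n_transversions
print(n_transitions, n_transversions, ratio)
```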
Our results show that: (1) the shape of the curve is similar across all combinations of rate heterogeneity parameters used, with the maximum at approximately the same value for ts/tv (in this case, 2); (2) allowing for site-to-site rate heterogeneity in the model greatly improved the fit of model to data, with a jump from −16388 for one rate category to −15945 for two categories (r = 1, 2). Using a grid search method we found that the best score occurred for the two-rate category (lnL = −14799 for r = 1, 20) model. Scores were improved with increasing heterogeneity, up to a point, after which the ML score worsened (for example, r = 1, 40 was slightly worse than r = 1, 20). Three categories did not greatly improve the score (at ts/tv = 2.0 the lnL = −14941); (3) the optimum ts/tv was not influenced by topology. The optimum ts/tv was the same for trees that gave a significantly different likelihood score. All of these conclusions point to rate heterogeneity having the most influence on likelihood score, with ts/tv having less of an effect. The same conclusions were found for simulated data and small
Coleochaete Nitella Asterella Conocephalum2 Conocephalum92 Dumortiera Geothallus Lunularia Marchantia5 MarchantiaOH Monoclea Ricciocarpos Sphaerocarpos Bazzania Calypogeia Fossombronia Haplomitrium29 Haplomitrium93 Herbertus Jubula Lepidozia Lophocholea Metzgeria Makinoa Pallavicinia Pellia Petalophyllum Porella Anthoceros Megaceros96 MegacerosJM Brotherella Mnium Polytrichum Tetraphis Cycas Zamia Ephedra Welwitschia Angiopteris Ophioglossum
— 0.17307 0.16466 0.14386 0.14386 0.16279 0.16846 0.15131 0.16382 0.16246 0.16453 0.15803 0.15238 0.19643 0.19043 0.19618 0.20752 0.17855 0.15923 0.17262 0.17118 0.17883 0.20327 0.19014 0.20561 0.18929 0.21927 0.17861 0.19681 0.19390 0.20992 0.16931 0.17017 0.15686 0.16082 0.22135 0.22342 0.21071 0.22258 0.17833 0.22930
1
— 0.18022 0.17518 0.17425 0.18413 0.17928 0.18004 0.18625 0.18381 0.18986 0.18413 0.17431 0.19895 0.19558 0.20844 0.22444 0.20401 0.16611 0.17292 0.18325 0.18807 0.18235 0.20962 0.21670 0.20840 0.21481 0.18992 0.21973 0.21874 0.23049 0.17833 0.17423 0.18594 0.17661 0.23674 0.24725 0.22007 0.21997 0.20403 0.25624
2
— 0.05403 0.05322 0.05469 0.07401 0.06549 0.06635 0.05537 0.06195 0.04887 0.07716 0.16065 0.14231 0.15382 0.17145 0.15976 0.10665 0.09892 0.13306 0.14784 0.15057 0.13675 0.15153 0.15763 0.17150 0.12647 0.19366 0.12467 0.19364 0.14282 0.13181 0.12956 0.11746 0.19483 0.19524 0.18055 0.19074 0.17058 0.20467
3
— 0.00535 0.05726 0.06207 0.05961 0.06298 0.05957 0.06537 0.04650 0.06868 0.15199 0.13856 0.15510 0.16793 0.15025 0.09866 0.09449 0.12191 0.14402 0.13436 0.13701 0.15173 0.16213 0.17268 0.11812 0.17354 0.12384 0.18281 0.12790 0.11893 0.11176 0.11196 0.19194 0.19098 0.16988 0.17904 0.16073 0.21211
4
— 0.05726 0.06295 0.05710 0.06214 0.05957 0.06453 0.04650 0.07037 0.14907 0.13470 0.15410 0.16491 0.14637 0.09427 0.09449 0.11822 0.14211 0.13352 0.13605 0.14962 0.16443 0.17062 0.11804 0.17455 0.12384 0.18179 0.12513 0.11801 0.11268 0.11107 0.18883 0.18994 0.16988 0.17703 0.15790 0.21103
5
— 0.06877 0.06452 0.06207 0.05533 0.04807 0.04890 0.07463 0.15128 0.13664 0.15406 0.16121 0.14929 0.10587 0.10763 0.12566 0.14034 0.14485 0.13617 0.14382 0.15983 0.16872 0.12821 0.17950 0.12109 0.18282 0.13994 0.13090 0.12489 0.12214 0.19091 0.19428 0.17570 0.18098 0.15029 0.20302
6
— 0.05955 0.07813 0.07040 0.08127 0.06954 0.05293 0.16264 0.14609 0.16373 0.16871 0.15019 0.11388 0.11298 0.12287 0.15543 0.15142 0.14839 0.15711 0.16354 0.17903 0.13742 0.18845 0.12922 0.18376 0.13979 0.14113 0.12580 0.12110 0.20946 0.21063 0.17642 0.18854 0.17067 0.22628
7
— 0.06212 0.05869 0.07108 0.05950 0.06041 0.15455 0.14500 0.15637 0.16477 0.14626 0.10387 0.10131 0.13181 0.14656 0.15500 0.14025 0.15097 0.17464 0.17499 0.12981 0.18015 0.11274 0.17569 0.13414 0.12800 0.11539 0.11093 0.19178 0.20095 0.16679 0.17972 0.15288 0.20227
8
— 0.01852 0.07203 0.06460 0.08151 0.16178 0.14403 0.16006 0.18076 0.15882 0.10862 0.11392 0.13375 0.15169 0.15581 0.14389 0.14758 0.16666 0.17808 0.14188 0.18012 0.13009 0.18393 0.14265 0.13002 0.12402 0.12869 0.19574 0.20301 0.17475 0.19581 0.15998 0.21722
9
— 0.06769 0.06112 0.07460 0.16342 0.14310 0.16315 0.17267 0.15403 0.10842 0.10836 0.13291 0.15239 0.15966 0.14308 0.14450 0.16616 0.17583 0.13615 0.17855 0.12003 0.18352 0.13877 0.12887 0.12290 0.12191 0.19249 0.19779 0.17063 0.19137 0.15777 0.21152
10
— 0.05703 0.07871 0.15565 0.14511 0.15668 0.16165 0.15587 0.11008 0.10746 0.13375 0.14378 0.15768 0.13283 0.15067 0.16616 0.17850 0.12900 0.18535 0.12089 0.19133 0.14258 0.13354 0.11924 0.11734 0.18676 0.19780 0.17745 0.18207 0.16063 0.21083
11
— 0.07537 0.15475 0.14707 0.14957 0.16972 0.15483 0.10385 0.08238 0.13284 0.14023 0.14823 0.12832 0.14796 0.15641 0.17122 0.13442 0.18870 0.12740 0.18745 0.14348 0.13528 0.11997 0.11637 0.18820 0.20033 0.16696 0.17424 0.15984 0.20613
12
— 0.15664 0.14310 0.15294 0.16437 0.14867 0.10658 0.11016 0.12549 0.14584 0.15051 0.14244 0.14586 0.16017 0.16550 0.12343 0.18471 0.12559 0.17786 0.13427 0.13555 0.11930 0.10561 0.20227 0.20431 0.17374 0.18162 0.15714 0.21124
13
— 0.11660 0.15353 0.18394 0.16855 0.09554 0.14824 0.10314 0.07433 0.16093 0.13899 0.14620 0.15200 0.15251 0.12979 0.19616 0.17688 0.19199 0.16209 0.15621 0.15379 0.15835 0.19037 0.19411 0.21021 0.19851 0.16672 0.19860
14
— 0.15437 0.17812 0.16549 0.09398 0.14080 0.08889 0.11238 0.16020 0.14052 0.15402 0.16798 0.16810 0.12741 0.20650 0.18023 0.19995 0.15078 0.14798 0.14632 0.14092 0.19840 0.19831 0.19344 0.18783 0.18116 0.21700
15
TABLE 3
Kimura Two-parameter Distance Matrix for Green Plant rbcL Sequences
— 0.16655 0.16434 0.13505 0.14824 0.14941 0.14928 0.15401 0.12204 0.13872 0.15738 0.10016 0.13966 0.21072 0.18959 0.21335 0.16477 0.16817 0.14908 0.15403 0.18789 0.19838 0.19871 0.18592 0.17383 0.21169
16
— 0.07349 0.16629 0.18306 0.16902 0.18244 0.18282 0.17490 0.19668 0.18662 0.18352 0.16369 0.19323 0.18494 0.19514 0.17973 0.17230 0.18032 0.16469 0.18820 0.20188 0.21774 0.19857 0.17931 0.20565
17
— 0.14425 0.16620 0.14232 0.16735 0.17510 0.15255 0.18442 0.17545 0.17058 0.14937 0.18673 0.17445 0.19946 0.16128 0.14995 0.16636 0.15241 0.18078 0.19627 0.19752 0.18638 0.15698 0.18560
18
— 0.09794 0.07542 0.08455 0.12697 0.11516 0.13585 0.13857 0.15244 0.09696 0.15977 0.14269 0.17275 0.11461 0.11556 0.10656 0.10654 0.17549 0.17934 0.17523 0.16581 0.14916 0.20202
19
— 0.12113 0.12575 0.14422 0.12659 0.14681 0.15713 0.15853 0.11571 0.18633 0.16367 0.19559 0.14759 0.13739 0.13148 0.12126 0.18749 0.19036 0.17292 0.18755 0.16518 0.20886
20
— 0.16034 — 0.14683 0.11590 — 0.17556 0.18638 0.18115 — 0.18691 0.21627 0.20179 0.16517
— 0.04482 0.16094 0.13681 0.16682 0.18618 — 0.17842 0.17929 0.16706 0.17651 0.14771 0.20410 — 0.07447 0.16960 0.16760 0.16561 0.15991 0.15278 0.20852 — 0.17177 0.16803 0.17020 0.16821 0.19766 0.20622 0.21061 0.20715 0.17174 0.21714 — 0.14992 0.12275 0.15120 0.14739 0.16187 0.12677 0.18572 0.16859 0.19157 0.15749 0.14949 0.13699 0.14279 0.19631 0.20026 0.19880 0.20251 0.16371 0.20601
— 0.14933 0.16891 0.16401 0.16643 0.14534 0.20077 0.18540 0.20831 0.16106 0.16055 0.15801 0.15225 0.20431 0.20760 0.20701 0.20298 0.19666 0.22497
— 0.14479 0.14094 0.12554 0.12763 0.18818 0.16089 0.18819 0.15522 0.14697 0.13062 0.12759 0.18824 0.18624 0.18719 0.19515 0.16228 0.19546
— 0.15040 0.15298 0.16391 0.20395 0.18090 0.19991 0.17352 0.16441 0.16851 0.16092 0.19956 0.20429 0.20254 0.19992 0.18977 0.21207
— 0.17272 0.15441 0.19962 0.18773 0.19272 0.17356 0.15731 0.16246 0.13798 0.17024 0.17908 0.19956 0.18876 0.16537 0.19446
— 0.14272 0.22633 0.19692 0.22172 0.19461 0.18918 0.17871 0.17884 0.19664 0.20080 0.21514 0.19454 0.18345 0.21216
— 0.18370 0.16077 0.19008 0.14200 0.14344 0.14048 0.12886 0.17933 0.18771 0.18696 0.18071 0.16808 0.22526
— 0.14511 0.13823 0.17589 0.17770 0.17208 0.16894 0.19228 0.19754 0.20650 0.20313 0.18623 0.21977
— 0.09622 0.15778 0.14946 0.14110 0.14272 0.19180 0.19937 0.18830 0.19614 0.16606 0.20317
— 0.06301 0.10885 0.09915 0.17755 0.18401 0.18465 0.17932 0.15730 0.21007
— 0.10122 0.09625 0.17341 0.17452 0.16879 0.17042 0.15134 0.20722
35 31 30 29 28 27 26 25 24 23 22 21
— 0.09204 0.14939 0.12662 0.13201 0.15797 0.15926 0.11295 0.18310 0.15882 0.18514 0.13220 0.14175 0.13584 0.13084 0.19214 0.19012 0.18624 0.18803 0.17084 0.21084 Lepidozia Lophocholea Metzgeria Makinoa Pallavicinia Pellia Petalaphyllum Porella Anthoceros Megaceros96 MegacerosJM Brotherella Mnium Polytrichum Tetraphis Cycas Zamia Ephedra Welwitschia Angiopteris Ophioglossum
TABLE 3—Continued
32
33
34
36
37
38
39
40
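The distances in Table 3 are Kimura two-parameter estimates. As a minimal sketch of how one such pairwise value can be computed from two aligned sequences, the following uses the standard formula d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites that differ by a transition and by a transversion. The function name and the rule of skipping gaps and ambiguity codes are assumptions made for illustration, not a description of the software used in this study.

```python
import math

PURINES = {"A", "G"}

def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: ignore gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine is a transition.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    # math.log raises ValueError if the pair is too divergent for the formula.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For very divergent pairs the argument of the logarithm can become non-positive, in which case the distance is undefined and the sketch raises an error rather than returning a value.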
FIG. 1. The effect of incorporating rate heterogeneity on likelihood scores for a given tree. Likelihood scores were determined for rbcL data and two user trees over a range of ts/tv estimates. The two bottom-most curves correspond to one rate category, for the ‘‘best’’ (closed triangles, Fig. 3A) and a ‘‘significantly worse’’ (open triangles) topology (see Table 5). The curve with closed boxes corresponds to two rate categories (with relative rates, r = 1, 2), while the curve with closed diamonds corresponds to more heterogeneity (r = 1, 20). The open box plots the likelihood scores for two rate categories with more severe rate heterogeneity (r = 1, 40) at ts/tv = 2.
data sets (Yang, 1994). Whether this impacts which topology is ‘‘favored’’ in a search was not investigated.

Relative Rate Tests

The relative rate test implemented in CODRATES (Muse and Gaut, 1994) is based on a maximum likelihood model of codon structure and has the advantage of being able to account for dependency of nucleotides within codons. Our results (Fig. 2) show significant differences in tests for synonymous rates in 11 of 24 comparisons involving complex thalloid taxa and 9 of 15 involving mosses. For nonsynonymous substitutions, 6 of 24 comparisons involving complex thalloid taxa and 4 of 15 involving mosses were significant. In all but one of the comparisons (nonsynonymous for Tetraphis-Mnium), the within group rates were not significantly different. A general trend in rates (from slowest to fastest) of complex thalloid liverworts < mosses < leafy liverworts < tracheophytes was observed. A comparison of the relative codon usage (Table 4) for the two liverwort groups illustrates that the complex thalloid liverworts more frequently utilize NNA and NNT codons compared to their sister clade, the leafy/simple thalloid liverworts. Estimates of the synonymous rates (Ks) obtained using CODRATES ranged from around 0.1 to just over 1.0. Of the 55 comparisons examined, in only one case was Ks greater than 1.0, with the majority (35) below 0.2. Although the CODRATES estimates of Ks are calculated differently from the estimates obtained from the method by Nei and Gojobori (1986), the resulting estimates have been shown, through simulation and
FIG. 2. Summary of pairwise relative rate tests on 1315 nucleotides of rbcL using CODRATES (Muse and Gaut, 1994). In all cases, the rbcL sequence from Coleochaete (see Table 1) was used as the outgroup taxon. Comparisons above and below the diagonal represent tests for synonymous and nonsynonymous differences, respectively. Arrows occur in cells with significant differences in rate at the 0.05 level and point to the taxon with the faster rate.
TABLE 4
Examples of the Relative Synonymous Codon Usage for Leafy/Simple Thalloid (L/S, First Number) and Complex Thalloid (C, Second Number) Liverworts

Amino acid   Codon   L/S    C
Leu          UUA     2.61   3.90
             UUG     0.89   0.21
             CUU     1.37   1.58
             CUC     0.22   0.00
             CUA     0.67   0.16
             CUG     0.24   0.15
Val          GUU     1.49   2.02
             GUC     0.22   0.06
             GUA     1.93   1.90
             GUG     0.36   0.03
Ser          UCU     2.80   4.34
             UCC     1.23   0.08
             UCA     0.47   0.28
             UCG     0.13   0.08
Pro          CCU     1.80   2.62
             CCC     0.54   0.00
             CCA     1.31   1.13
             CCG     0.35   0.25
Thr          ACU     2.11   2.86
             ACC     0.69   0.14
             ACA     0.79   0.89
             ACG     0.41   0.11
Ala          GCU     2.15   2.18
             GCC     0.28   0.06
             GCA     1.35   1.56
             GCG     0.22   0.20
Note. Only selected codons are shown. Note that under equal usage, the relative frequencies for each codon in a group should equal 1.
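The values in Table 4 are relative synonymous codon usage (RSCU) values: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below shows that calculation for a single codon family. The function names, the reading-frame offset parameter, and the short test sequence are hypothetical and serve only as illustration; they are not the procedure or data of this study.

```python
from collections import Counter

def count_codons(seq, offset=0):
    """Count codons in a coding sequence, reading in frame from `offset`."""
    seq = seq.upper().replace("T", "U")  # use the RNA alphabet, as in Table 4
    return Counter(seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))

def rscu(codon_counts, family):
    """RSCU for each codon in one synonymous family.

    Observed count divided by the mean count over the family, so equal
    usage gives 1.0 for every codon and the values sum to len(family).
    """
    total = sum(codon_counts.get(c, 0) for c in family)
    if total == 0:
        raise ValueError("amino acid not present in the sequence")
    expected = total / len(family)
    return {c: codon_counts.get(c, 0) / expected for c in family}

# Hypothetical example: four valine codons, two of them GUU.
VAL = ["GUU", "GUC", "GUA", "GUG"]
example = rscu(count_codons("GTTGTTGTAGTC"), VAL)
```

Because RSCU values within a complete family sum to the number of synonymous codons, the Leu entries in Table 4 sum to 6 and the Val, Pro, Thr and Ala entries to 4; the Ser entries sum to less than 6 because only four of its six codons are listed.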
empirical study, to be virtually identical for cases of low to moderate levels of sequence divergence (see Muse, 1996).

Data Set 1: Land Plants

Parsimony produced three basic topologies, depending on the weighting applied (Figs. 3A–3C). In all trees, the same groups of taxa were found, but the branching order varied among groups (and somewhat within groups). ‘‘Codons’’ weighting produced one tree (Fig. 3A), with length = 1377192. With ‘‘ts/tv’’ weighting, two trees were found, length = 3840 (one tree shown in Fig. 3B). Equal weighting searches produced one tree, length = 2880, CI = 0.3042 (Fig. 3C). The ‘‘codons’’ tree had bryophytes as paraphyletic, and liverworts monophyletic, at the base of the embryophytes. One ts/tv tree placed the Calobryales at the base of embryophytes (Fig. 3B) and the second was similar to Fig. 3A, except for an alternative rooting of the complex thalloid taxa (rooted at Dumortiera plus Monoclea). In the tree obtained with equal weighting, both the bryophytes and liverworts were paraphyletic, with the complex thalloid taxa at the base. The ML topology was identical to the
FIG. 3. Phylogenetic trees obtained using parsimony and maximum likelihood analysis of rbcL data across embryophytes. (A) Single tree from ML and ‘‘codons’’ weighted parsimony. (B) One of two ‘‘ts/tv’’ parsimony trees. The second ‘‘ts/tv’’ tree had the same arrangement of basal lineages as in A. (C) Single tree from equal weighted parsimony. Bootstrap support (above) is shown on all nodes with a score of 50% or greater, and decay indices are shown below nodes only in the equal weighted analysis. Note that the leafy liverworts are separate from the Calobryales in this treatment.
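Tree lengths and consistency indices such as those reported for Fig. 3 come from counting character-state changes on a tree. As a minimal sketch of equal-weighted parsimony scoring, the following applies the Fitch algorithm to a rooted binary tree written as nested tuples. The taxon names, the toy alignment, and the CI definition used here (minimum possible steps divided by observed steps, over all characters) are illustrative assumptions; the weighted analyses in the text and the exact CI convention of the software are not reproduced.

```python
def fitch_site(tree, states):
    """Return (state set, steps) for one character on a rooted binary tree."""
    if isinstance(tree, str):          # a leaf: its observed state, no steps
        return {states[tree]}, 0
    left, right = tree
    lset, lsteps = fitch_site(left, states)
    rset, rsteps = fitch_site(right, states)
    common = lset & rset
    if common:                         # children agree: no extra change
        return common, lsteps + rsteps
    return lset | rset, lsteps + rsteps + 1   # disagreement costs one step

def tree_length(tree, alignment):
    """Equal-weighted parsimony length summed over all sites."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_site(tree, {taxon: seq[i] for taxon, seq in alignment.items()})[1]
        for i in range(n_sites)
    )

def consistency_index(tree, alignment):
    """CI = minimum conceivable steps / steps required on this tree."""
    n_sites = len(next(iter(alignment.values())))
    min_steps = sum(
        len({seq[i] for seq in alignment.values()}) - 1 for i in range(n_sites)
    )
    return min_steps / tree_length(tree, alignment)

# Hypothetical four-taxon alignment and two competing topologies.
toy = {"t1": "AAG", "t2": "AAG", "t3": "CAT", "t4": "CTT"}
tree_a = (("t1", "t2"), ("t3", "t4"))
tree_b = (("t1", "t3"), ("t2", "t4"))
```

In this toy case `tree_a` requires fewer steps than `tree_b`, so it would be preferred under equal weighting.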
TABLE 5
Negative ln Likelihood Scores for Four User Trees

User tree   Taxa (in printed order)
1           TR HO MO CT LE CA OG
2           TR HO MO CT LE CA OG
3           TR HO MO CA LE CT OG
4           TR HO MO CT LE CA OG

Model                              Tree 1    Tree 2    Tree 3    Tree 4
One rate category                  16,387    16,406    16,402    16,472*
Two rate categories (r = 1, 2)     15,945    15,961    15,957    16,018*
Two rate categories (r = 1, 20)    14,799    14,810    14,809    14,853*
Note. Likelihood scores were obtained using PHYLIP 3.5c (Felsenstein, 1993), arranged with better scores on the left. Asterisks indicate scores that are significantly worse than the best score using the Kishino–Hasegawa test (Kishino and Hasegawa, 1989). See text for a discussion of the topologies. TR, tracheophytes; HO, hornworts; MO, mosses; CT, complex thalloid liverworts; LE, leafy liverworts; CA, calobryalean taxa; OG, outgroup taxa (in this case both Coleochaete and Nitella).
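The scores in Table 5 allow a simple comparison of how much the likelihood changes with the model versus with the topology. The sketch below does that arithmetic, assuming each group of three scores in the table belongs to one user tree (Trees 1 to 4, left to right) under the three models. The variable and function names are illustrative only.

```python
# Negative ln likelihood scores from Table 5; list order is Tree 1..Tree 4.
neg_ln_l = {
    "one rate category": [16387, 16406, 16402, 16472],
    "two rate categories (r = 1, 2)": [15945, 15961, 15957, 16018],
    "two rate categories (r = 1, 20)": [14799, 14810, 14809, 14853],
}

def topology_spread(scores):
    """Difference between the worst and best tree under one model."""
    return max(scores) - min(scores)

def ranking(scores):
    """Tree indices (0-based) ordered from best (lowest -ln L) to worst."""
    return sorted(range(len(scores)), key=lambda i: scores[i])

base = neg_ln_l["one rate category"]
# Improvement for the best tree when rate heterogeneity is allowed.
gain_r2 = base[0] - neg_ln_l["two rate categories (r = 1, 2)"][0]
gain_r20 = base[0] - neg_ln_l["two rate categories (r = 1, 20)"][0]
spreads = {model: topology_spread(s) for model, s in neg_ln_l.items()}
rankings = {model: ranking(s) for model, s in neg_ln_l.items()}
```

Under this reading, allowing two rate categories improves the score of the best tree by several hundred log-likelihood units, whereas the four topologies differ by fewer than 100 units under any one model, and the ordering of the trees is the same under all three models.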
codons topology (Fig. 3A), except for the placement of Metzgeria at the base of the leafy taxa, instead of with Porella. When standardized for equal weights, tree lengths for Figs. 3A–3C were 2885 (CI = 0.3036), 2887 (CI = 0.3034), and 2880 steps (CI = 0.3042), respectively. Because alternative hypotheses are plausible, a likelihood ratio test (Kishino and Hasegawa, 1989) was used to determine if the resulting topologies were significantly different (Table 5). We found no significant differences among the topologies shown in Fig. 3, whereas the ‘‘bryophytes monophyletic’’ topology, determined using a constraint tree, was significantly worse than the best tree.

There is an increase in rbcL GC content from basal groups to tracheophytes, from 37.5% GC in charophytes to 44% in tracheophytes (measured for all sites) and from 20 to 44% (measured for variable sites). Despite the differences in base composition observed between groups of taxa, the LogDet topology (not shown) is similar to those from the parsimony and ML searches, demonstrating that taxa are not being grouped artificially because of base composition. Although base compositional differences exist between groups, the bias is either not strong enough to ‘‘mislead’’ the methods or base compositional changes are correlated with the phylogeny.

Parsimony trees obtained from either first and second codon positions or from third positions for the forty taxon data set are shown in Fig. 4. With first and second positions, a total of 876 characters were used, of which 433 were constant and 127 were parsimony-informative. Searches using this data set produced 308 MP trees (length = 628 steps, CI = 0.3279, and RI = 0.5311). The strict consensus (Fig. 4A) demonstrates that the signal present is contained mostly among the more closely related taxa. This tree also shows unusual placement of the charophycean alga Nitella and the hornwort Anthoceros, which probably reflects nucleotide bias, because trees obtained using LogDet (not shown) on this data set do not contain such relationships. With the third codon position data set, five MP trees were obtained (one is shown in Fig. 4B, L = 2402, CI = 0.3016, RI = 0.4607). Of the 439 characters used, only 15 were constant and 394 were parsimony-informative. The resulting trees were nearly identical to those found for all of the data (see Fig. 3).

FIG. 4. Comparison of signal in different codon positions of rbcL. (A) Strict consensus of 308 MP trees (L = 628, CI = 0.3279, RI = 0.5311) obtained from first and second codon positions. (B) One of five MP trees (L = 2402, CI = 0.3016, RI = 0.4607) obtained from third codon position data. In both A and B, bootstrap support of 100 replicates is indicated for branches with greater than 50% support.

Data Set 2: Liverworts Only

The ML tree and the single maximally parsimonious ‘‘codons’’ tree (L = 136894) were exactly the same (Fig. 5). Two major groups, corresponding to the Marchantiidae (Sphaerocarpales, Monocleales, Marchantiales) and Jungermanniidae (Calobryales, Metzgeriales, Jungermanniales), were present in 100% of the bootstrap trees. The Jungermanniidae and Marchantiidae have very different branch lengths, corresponding to different substitution rates (see CODRATES results). The Jungermanniidae consisted of Calobryales, a clade composed of jungermannialean taxa, and a clade of metzgerialean taxa. Metzgeria, however, was at the base of the Jungermanniales and Metzgeriales. The Marchantiidae consisted of the Monocleales nested within the Marchantiales, plus the Sphaerocarpales. Porella and Metzgeria were two sequences whose position changed depending upon the composition of ingroup taxa. In all cases, congeners grouped together. Groups well supported by bootstrap estimates were the Sphaerocarpales (92%) and Monoclea plus Dumortiera (94% bootstrap and shared amino acid substitutions). The Jungermanniales and Metzgeriales are both weakly supported as monophyletic. Within the Jungermanniales, the clade that includes Lophocholea and Bazzania is strongly supported. Also included are Calypogeia and Lepidozia (Schuster’s suborder Lepidoziineae), Herbertus, and Porella. The branching order of taxa in the Metzgeriales is uncertain; only Fossombronia and Petalophyllum are well supported. With equal weighted MP, two tree islands (length = 1446, CI = 0.4130) were found. Both trees had the same topology for complex thalloid liverworts, but differed drastically in the branching order of the leafy liverworts. The strict consensus of the two trees (Fig. 6) has a polytomy of leafy/simple thalloid taxa, except for the Jungermanniales minus Porella, the two species of Haplomitrium, and two taxa of the Metzgeriales (Fossombronia and Petalophyllum).

Data Sets 3 and 4: Complex Thalloid and Leafy/Simple Thalloid Liverworts Alone

For the complex thalloid taxa, MP searches involving the different weighting schemes produced one topology,
length = 261, CI = 0.590 (Fig. 7), which was different from the topologies found in analyses that included leafy liverworts and other green plants. In this subset of taxa, the two Conocephalum species moved to a position between the two Marchantia taxa and Lunularia. The same tree was found using ‘‘codons’’ weighting (not shown, L = 138213). When the leafy/simple thalloid taxa were analyzed alone, using equal weighted parsimony, two tree islands were found (not shown, L = 990 steps, CI = 0.4840), which were the same two topologies found in the liverworts only analysis (data set 2). As in the liverworts only analysis, ‘‘codons’’ weighting produced one MP tree (not shown, L = 465860) that differed in arrangement from the equal weighted trees.

DISCUSSION

Resolution of the branching order of basal green plants has proven to be problematic. With morphological data, it is difficult to polarize character states since
many of the characters are present in the bryophytes but not present in green algal ancestors (see Mishler et al., 1994). With molecular data, the difference in branch lengths between internal and external branches (probably reflecting the rapid radiation of the major groups) can make phylogenetic reconstruction problematic. Distinguishing between two hypotheses, either a brief shared history of the two major liverwort groups followed by a long period of divergence, or two independent lineages, is the dilemma. Morphological studies favor the former hypothesis (Mishler et al., 1992, 1994), while some previous molecular studies have favored the latter (Bopp and Capesius, 1995; Kranz et al., 1995). Several studies have utilized rbcL data to address phylogenetic relationships of green plants (Doebley et al., 1990; Chase et al., 1993; Manhart, 1994; Wolf et al., 1994; Hasebe et al., 1995). Manhart (1994) questioned the utility of rbcL for studies directed at the larger picture of green plant evolution, because a majority of the substitutions occur at third codon positions and multiple hits would likely obliterate historical signal. In his study unorthodox relationships were obtained for some of the basal taxa. However, taxonomic sampling of the bryophytes was sparse, and it is predicted that adding more taxa may break up the long branches and make the branch attraction problem less serious.
FIG. 6. Strict consensus of two equally parsimonious networks of liverworts only, showing the well-resolved complex thalloid taxa and the poorly resolved leafy/simple thalloid taxa. Bootstrap support of 50% or greater (above) and decay indices (below) are given for each node.
Long branch attraction places groups not historically related nearby in a topology because of accumulated parallel changes (Felsenstein, 1978). Factors leading to branch length heterogeneity include evolutionary rate heterogeneity among lineages, time differences, and inadequate taxon sampling (of extant taxa, or because of extinctions) such that nodes separating major lineages have a large number of substitutions supporting them. While our study has shown that with an increased sampling it is possible to get more reasonable topologies, the rapid radiation of these groups makes obtaining strong support for deeper branches unlikely.

Sequence Divergence, Nucleotide Composition, and Rates among Groups
FIG. 5. Unrooted network of liverwort taxa only. The topology was obtained with both ‘‘codons’’ weighting in MP, and using ML, but is displayed with ML estimates of branch lengths. Parsimony bootstrap support of 50% or greater is indicated on the branches.
FIG. 7. Single MP network of complex thalloid taxa showing different relative branching of taxa as compared to the topology obtained with the ‘‘all taxa’’ and ‘‘liverworts only’’ data sets. Bootstrap support of 50% or greater (above) and decay indices (below) are given for each node.

Pairwise distance estimates for congeners of liverworts were on the order of what has been found for two genera of grasses, Zea mays–Triticum aestivum (0.07; Doebley et al., 1990). Strikingly, differences between liverwort groups were comparable to the liverwort–moss comparisons, and the complex thalloid taxa had about half the divergence of the leafy/simple thalloid taxa. Branch lengths for the complex thalloid taxa are shorter than for the other groups of green plants. Branch lengths are estimates of the product of divergence time and substitution rates, so short branches can be due to shorter time or slower rates of substitution. We know from independent evidence (ribosomal RNA and morphological data) that the leafy/simple thalloid and complex thalloid liverworts are sister clades. However, we cannot rule out that the terminal splits in the complex thalloid liverwort tree are more recently derived, since the first definitive occurrence of a Marchantian liverwort is in the Triassic, much later than the simple thalloids recognizable at least back to the Devonian (Krassilov and Schuster, 1984). Examination of substitution rates shows that complex thalloid liverworts have rates of transitions that are half that of other lineages (while transversions remain about the same in all groups). Of the possible rate tests available, the maximum likelihood codons model developed for testing differences in relative rates (Muse and Gaut, 1994) is most appropriate for rbcL, since it is able to account for both differences in rate across the topology and dependencies among nucleotides within a codon. Between pairs of sequences, significant differences in the rate of both synonymous and nonsynonymous substitution were demonstrated (especially in those tests that included the complex thalloid liverworts). However, these differences probably do not represent independent events, but an
inherited difference between complex thalloid taxa and the others. Differences in rates of rbcL sequence substitution have been demonstrated for many green plant lineages. Bousquet et al. (1992) found accelerated rates in many groups of angiosperms (especially annuals in the Poaceae, Asteridae) compared to Marchantia. We found that the slower rate observed for Marchantia is consistent for all presently included complex thalloid liverworts. This also appears to be true of other genes in the chloroplast, such as the 16S rDNA (unpublished results) but has not been examined for nuclear genes. Gaut et al. (1992) found rate differences between monocot lineages and suggested that the rates may reflect generation time and phylogeny. In our case, the observed differences also appear to be lineage specific. Differences in generation time of the complex thalloid liverworts compared to the leafy liverworts may be difficult to assess because these groups are often highly clonal. The leafy/simple thalloid clade has a much greater diversity of habitats, with a much greater biotic component (i.e., epiphytism of many types), while the complex thalloids remain in a relatively ancient, xerophytic soil environment. It has also been hypothesized that leafy liverworts, which are epiphytes on angiosperms, underwent an adaptive radiation correlated with the angiosperm radiation, while the complex thalloids remained ecologically relatively static on soil (see, for example, Schuster, 1966, p. 285). This explanation is intriguing, but cannot be tested at present because doing so would require a more thorough understanding of speciation rates. The choice of synonymous codons in E. coli and yeasts is constrained by factors related to translational efficiency (Li and Graur, 1991), which could result in selection that slows down the rate of synonymous substitutions (Ikemura, 1981; Kimura, 1983).
This is the pattern observed for the rbcL gene in the complex thalloid liverworts, which have about half the rate of transitions but approximately the same rate of transversions as other green plant lineages, and a lower rate of synonymous substitutions. Codon usage in rbcL in complex thalloid liverworts is more severely biased toward NNA and NNT codons than in leafy/simple thalloid liverworts, possibly due to a mutational bias associated with an increase in A + T content in the chloroplast genome. In a comparison of many chloroplast genes of Marchantia and the vascular plant Nicotiana, Morton (1994) found differing amounts of selection acting on the Marchantia genes depending on translation rates. Codon usage bias in the highly expressed plastid gene psbA differed from what would be expected from the base composition of the genome, and this was also found, although to a lesser extent, for rbcL. More work needs to be done in order to get better estimates of translation rates, since most studies assume that the relative levels of translation are comparable between vascular plants and nonvascular plants.
It is apparent from our study that substitution parameters differ greatly across the liverwort tree, but in a lineage-specific manner. All commonly used methods of phylogeny inference (except LogDet) assume base frequencies are in equilibrium across the tree, but this is not the case for rbcL. How such nonstationarity influences phylogeny inference needs further investigation.

Stop Codons in the rbcL Gene of Hornworts

Manhart (1994) reported and discussed the presence of the two stop codons in Megaceros. Additional stop codons were found in Anthoceros. There are two possible explanations that would account for the presence of stop codons in the hornwort sequences. First, the gene is a pseudogene and the operational gene is elsewhere. Chloroplast genes have been identified in other compartments of the cell (Baldauf and Palmer, 1990) in a few green plants, and Pike et al. (1992) reported the presence of chloroplast fragments outside of the chloroplast DNA fraction (i.e., in the nucleus or mitochondria) for two liverworts. However, this explanation is not entirely satisfactory because one would expect to see polymorphism in sequences if using PCR-sequencing of total DNA. If the hornwort rbcL sequences represent pseudogenes, the ts/tv ratio would be expected to approach 0.5 given the amounts of time available for substitution. Instead, at least for the three sequences currently in hand, the estimated ts/tv ratio is 2.7, a value more typical of functional protein-coding genes. A more plausible explanation is RNA editing (discussed by Manhart, 1992), which has been demonstrated in Anthoceros by examination of cDNAs (Ueda and Yoshinaga, 1995). If this is the case, it appears that among the bryophytes only the hornworts have RNA editing, since stop codons are absent from the rbcL genes of other bryophyte lineages.
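The expectation of 0.5 for an unconstrained sequence follows from counting the possible changes: each nucleotide can change to one other base by a transition and to two by a transversion. The sketch below verifies that count and tallies observed transitions and transversions between two aligned sequences. It is a simple uncorrected count with hypothetical function names and a made-up example pair, not the estimation procedure that produced the value of 2.7 reported above.

```python
import math
from itertools import permutations

PURINES = frozenset("AG")

def is_transition(a, b):
    """True for A<->G or C<->T; False for identical bases and transversions."""
    return a != b and ((a in PURINES) == (b in PURINES))

def random_expectation():
    """ts/tv ratio if all 12 possible base changes were equally likely."""
    changes = list(permutations("ACGT", 2))        # 12 ordered changes
    ts = sum(is_transition(a, b) for a, b in changes)
    tv = len(changes) - ts
    return ts / tv

def ts_tv_ratio(seq1, seq2):
    """Observed (uncorrected) ts/tv ratio between two aligned sequences."""
    ts = tv = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT" or a == b:
            continue  # assumption: skip gaps, ambiguity codes, identical sites
        if is_transition(a, b):
            ts += 1
        else:
            tv += 1
    return ts / tv if tv else math.inf

# Hypothetical pair with three transitions and one transversion.
observed = ts_tv_ratio("AACCGGTT", "GACTGATA")
```

A ratio well above 0.5, as in the hypothetical pair here, indicates that transitions are accumulating faster than expected from unconstrained, saturated change.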
Monophyly of the Liverworts and Placement of Liverworts in Relation to Other Basal Green Plants

We have attempted to incorporate the most appropriate sampling of both outgroup taxa and representatives of sister lineages to the liverworts. In addition, the methods used during the analysis have demonstrated a difference in the trees obtained using different underlying ‘‘models.’’ Specifically, trees obtained using equal weighted parsimony and weighted parsimony had different resolutions of the basal-most part of the tree. Both simple transition/transversion and ‘‘codons’’ weighting produced trees that were identical or very similar to the tree obtained using maximum likelihood. From the present study, however, it is impossible to determine unequivocally (as demonstrated using the likelihood ratio test) whether the liverworts are monophyletic or paraphyletic. Molecular data may not be able to reconstruct a common ancestor shared by the different subclasses because branch lengths may be too
short compared to the subsequent time since divergence. However, in no case did an amino acid change found in rbcL contradict a monophyletic liverwort clade. Although the liverworts are the most weakly supported group of bryophytes based on morphological synapomorphies, these data do suggest that the diverse morphological types shared a common ancestor, albeit for a very short time (Mishler and Churchill, 1984; Mishler et al., 1994). The three characters used to support this are lunularic acid, presence of elaters, and oil bodies. Some studies of rDNA data of liverworts have indicated they are monophyletic (chloroplast small-subunit; Mishler et al., 1992), while others have shown them to be paraphyletic (nuclear small-subunit; Bopp and Capesius, 1995; Kranz et al., 1995). The problem with these studies is that very few taxa were used, the data sets were small and composed of partial sequences, or adequate sister groups were omitted from the analysis. In one study using 18S rDNA data, differences in substitution rate and/or base composition among taxa may have caused misleading results, and little exploration concerning the robustness of the results was made (Bopp and Capesius, 1995). That study used only liverwort and moss sequences as ingroup taxa, did not include any tracheophyte sequences, and the analysis was restricted to equal weighted parsimony. While the rbcL sequence data cannot fully resolve the question of liverwort monophyly, two results were common to all of the trees and deal with the branching order of bryophyte groups in the broad picture of green plant evolution. Our analyses favor topologies that include liverworts as the basal embryophyte lineage and hornworts as the closest lineage to vascular plants.
These results agree with several independent lines of evidence (both morphological and molecular), suggesting that the ‘‘bryophytes’’ are not monophyletic and support the placement of the liverworts as the earliest embryophyte lineage (Mishler and Churchill, 1984; Mishler et al., 1992, 1994). Conversely, Garbary et al. (1993) found that the bryophytes were monophyletic, with the liverworts as a sister group to the mosses. While using a large number of characters of male spermatogenesis, that study did not incorporate published morphological data from other stages in the life cycle or examine congruence with molecular data and included only seven taxa of liverworts. While certain ultrastructural characters of the motile cells of green algae have been shown to be congruent with molecular data (Lewis et al., 1992), given the discordance of the bryophyte sperm ultrastructure data set with other lines of evidence, it is possible that (at least) some of these characters in bryophytes are the result of convergence. However, within a subset of their tree, those characters may be useful at resolving relationships. The placement of hornworts contradicts what has been found by previous studies using morphological data (Mishler and Churchill, 1984; Mishler et al., 1994),
which showed (albeit rather weakly) a sister-group relationship between mosses and tracheophytes. Support for the placement of hornworts with tracheophytes in our trees did not involve positions related to the stop codons, so RNA editing, if present, may not be causing misleading signal. One major difficulty in the use of morphological characters has been the problem of character polarization. Many characters used within and among the bryophyte lineages are absent from the charophycean green algae. Further study on hornworts, especially of other genes, is needed to resolve the incongruence between rbcL data and the morphology. There is more information in rbcL than previously demonstrated (Manhart, 1994) for the ordering of basal groups, especially with increased sampling of bryophyte taxa. Manhart (1994) compared analyses of rbcL data across chlorophyll a- and b-containing photosynthetic organisms for nucleotides versus amino acids, and using different codon positions, and found the most support for internal nodes came from the use of all nucleotide positions. Although rbcL is considered saturated at the deeper tracheophyte branches (Goremykin et al., 1996), many of our estimates of the synonymous substitution rates in bryophytes would be considered low-to-moderate (less than 0.6), especially when compared to the values found within ferns (>1.0; see Hasebe et al., 1995). In addition, trees obtained using data from third codon positions are nearly identical to those obtained using all sites in rbcL and are generally congruent with independent evidence. Therefore, we must conclude that the third codon positions of rbcL, while quite variable, have retained phylogenetic signal across green plants. Unfortunately, most definitions of saturation are based on pairwise comparisons involving evolutionary distances that in some cases span the entire tree.
It is, however, the distances between adjacent nodes of the tree that are the relevant units for measuring saturation when using discrete character tree building methods such as parsimony or likelihood. It has been shown through simulation studies that individual internodal distances can be quite large (i.e., well beyond the biologically reasonable zone sensu Nei) before parsimony becomes inconsistent (Huelsenbeck and Hillis, 1993) so long as taxon sampling has been sufficient to homogenize branch lengths. Additional support comes from the recent finding of Hillis (1996) concerning the improvement of phylogeny reconstruction methods, at least for simulated data, with an increased taxon-sampling. The general agreement of rbcL data with morphological data suggests that in this case, parallel substitutions do not pose an especially difficult problem. Rather, it appears that the lack of resolution stems mostly from a rapid radiation of taxa. Further support for this interpretation comes from 18S rRNA studies, since 18S data have also failed to unequivocally resolve this issue (Kranz et al., 1995; Hedderson et al., 1996). Because this divergence is
probably best characterized as a rapid radiation, morphological characters or molecular characters derived from the presence/absence of introns or structural rearrangements of the chloroplast genome (see Baldauf and Palmer, 1990; Manhart and Palmer, 1990) may be better suited to resolving the deepest branching patterns of green plants. These characters are likely to be rare and have not been investigated to a great extent in bryophytes, and further study is warranted.

Relationships at the Subclass Level

Although rbcL data do not unequivocally resolve the basal-most branching order in green plants, they do seem to provide suitable information within the groups. The subclass distinctions based on morphology are supported by the rbcL data. In all analyses, the two groups that include leafy/simple thalloids and complex thalloids are well supported. It appears that the leafy and complex thalloid liverworts have different rates of substitution. Maximum likelihood analyses showed strikingly different branch lengths (Fig. 5), and relative rate tests found these differences significant. Lineage-specific rate asymmetry and unequal divergence times are potential contributors to long branch attraction (Hendy and Penny, 1989). Albert et al. (1993) have argued that uneven sampling or extinction of lineages may present even greater problems than rate differences. While there are no other extant subclasses from which to sample, a more thorough sampling across the diversity of liverwort taxa could potentially yield more resolution.

Relationships within Subclasses

Our analyses of the smaller data sets provide topologies within subclasses that differ from the global analysis. Within the Marchantiidae, the Monocleales diverges from within the Marchantiales. The Sphaerocarpales are a sister group to the Marchantiales/Monocleales. Within the Jungermanniidae, there appear to be three clades, corresponding to the Calobryales, Jungermanniales, and Metzgeriales.
It is clear from the comparison of analyses of all liverworts and of only the subclasses that taxa with the largest divergence influence taxa with the smallest divergence. In this case, the leafy/simple thalloid taxa influenced the topology of the complex thalloid taxa, because relative branching order among complex thalloid taxa changed depending on inclusion of leafy/simple thalloid liverworts.

CONCLUSIONS

In summary, our study has clearly shown that bryophytes are not monophyletic, and that the liverworts are the basal-most lineage of embryophytes. In our rbcL data, the hornworts are the bryophyte lineage closest to the tracheophytes, contrary to morphological evidence. The rbcL data also favor topologies with the liverworts as a monophyletic group, although trees with paraphyletic liverworts were not significantly different. A statistically significant difference in substitution rates between the two major groups of liverworts may be contributing to this result. Trees obtained from third codon positions in rbcL were nearly identical to trees from all sites and were in general agreement with independent evidence. Therefore, data from the rbcL gene are not necessarily worse than data from nuclear rRNA genes at resolving relationships among the groups. Extensive variation in base composition, substitution patterns, and rates in rbcL has now been demonstrated for angiosperms (Smith and Doyle, 1986; Wilson et al., 1990; Bousquet et al., 1992; Gaut et al., 1992), ferns (Hasebe et al., 1995), and ‘‘bryophytes.’’ Our results demonstrate a significant difference in substitution rates and patterns between the two major lineages of liverworts. Because of this, more attention needs to be given to ‘‘bryophytes’’ and other basal green plant lineages. No single sample from the ‘‘bryophytes’’ necessarily represents the diversity of the basal green plants (just as no one angiosperm captures the diversity present in angiosperms). Studies of chloroplast gene evolution among green plants may need to sample heavily from among bryophytes and not rest on comparisons with Marchantia alone.

ACKNOWLEDGMENTS

The authors thank Malcolm Sargent for providing plant material from his bryophyte collection and P. Lewis, J. Manhart, and one anonymous reviewer for many insightful suggestions concerning the manuscript. D. Swofford kindly provided and allowed the use of prerelease versions of PAUP*. L.A.L. gratefully acknowledges the Department of Botany, Duke University, and the A. W. Mellon Foundation for financial support, E. Zimmer at the Smithsonian Institution, Laboratory of Molecular Systematics for computing facilities and space, R.
Alston and the Duke Cancer Center Molecular Biology Computing Facility for computing time and generous assistance, D. Penny, P. Lockhart, and P. Waddell for helpful discussion concerning LogDet, and the LMS journal club members for helpful discussion. This research was supported by NSF Grant BSR-9107484 to B.M. and R.V.

REFERENCES

Albert, V. A., and Mishler, B. D. (1992). On the rationale and utility of weighting nucleotide sequence data. Cladistics 8: 73–83. Albert, V. A., Chase, M. W., and Mishler, B. D. (1993). Character-state weighting for cladistic analysis of protein-coding DNA sequences. Ann. Missouri Bot. Gard. 80: 752–766. Baldauf, S. L., and Palmer, J. D. (1990). Evolutionary transfer of the chloroplast tufA gene to the nucleus. Nature 344: 262–265. Bopp, S. L., and Capesius, I. (1995). New aspects of the systematics of bryophytes. Naturwissenschaften 82: 193–194. Bousquet, J., Strauss, S. H., Doerksen, A. H., and Price, R. A. (1992). Extensive variation in evolutionary rate of rbcL gene sequences among seed plants. Proc. Natl. Acad. Sci. USA 89: 7844–7848.
Bremer, K. (1994). Branch support and tree stability. Cladistics 10: 295–304. Cabot, E. L., and Beckenbach, A. T. (1989). Simultaneous editing of multiple nucleic acid and protein sequences with ESEE. Comput. Appl. Biosci. 5: 233–234. Chase, M. W., Soltis, D. E., Olmstead, R. G., et al. (1993). Phylogenetics of seed plants: an analysis of nucleotide sequences from the plastid gene rbcL. Ann. Missouri Bot. Gard. 80: 528–580. Clegg, M. T., Gaut, B. S., Learn, G. H., Jr., and Morton, B. R. (1994). Rates and patterns of chloroplast DNA evolution. Proc. Natl. Acad. Sci. USA 91: 6795–6801. Crandall-Stotler, B. (1980). Morphogenetic designs and a theory of bryophyte origins and divergence. BioScience 30: 580–585. Doebley, J., Durbin, M., Golenberg, E. M., Clegg, M. T., and Ma, D.-P. (1990). Evolutionary analysis of the large subunit of carboxylase (rbcL) nucleotide sequence among the Grasses (Gramineae). Evolution 44: 1097–1108. Donoghue, M. J., Olmstead, R. G., Smith, J. F., and Palmer, J. D. (1992). Phylogenetic relationships of Dipsacales based on rbcL sequences. Ann. Missouri Bot. Gard. 79: 333–345. Felsenstein, J. (1978). Cases in which parsimony and compatibility methods will be positively misleading. Syst. Zool. 27: 401–410. Felsenstein, J. (1984). Distance methods for inferring phylogenies: a justification. Evolution 38: 16–24. Felsenstein, J. (1985). Confidence limits on phylogenies: An approach using the bootstrap. Evolution 39: 783–791. Felsenstein, J. (1993). PHYLIP (Phylogeny Inference Package) version 3.5c. Distributed by author. Department of Genetics, University of Washington, Seattle. Garbary, D. J., Renzaglia, K. S., and Duckett, J. G. (1993). The phylogeny of land plants: A cladistic analysis based on male gametogenesis. Plant Syst. Evol. 188: 237–269. Gaut, B. S., Muse, S. V., Clark, W. D., and Clegg, M. T. (1992). Relative rates of nucleotide substitution at the rbcL locus of monocotyledonous plants. J. Mol. Evol. 35: 292–303.
Goremykin, V., Bobrova, V., Pahnke, J., Troitsky, A., Antonov, A., and Martin, W. (1996). Noncoding sequences from the slowly evolving chloroplast inverted repeat in addition to rbcL data do not support gnetalean affinities of angiosperms. Mol. Biol. Evol. 13: 383–396. Graham, L. E. (1993). ‘‘The Origin of Land Plants,’’ Wiley, New York. Hasebe, M., Wolf, P. G., Pryer, K. M., et al. (10 co-authors). (1995). Fern phylogeny based on rbcL nucleotide sequences. Am. Fern J. 85: 134–181. Heckman, J. E., Sarnoff, J., Alzner-DeWeerd, B., Yin, S., and Raj Bhandary, U. L. (1980). Novel features in the genetic code and codon reading patterns in Neurospora crassa mitochondria based on sequences of six mitochondrial tRNAs. Proc. Natl. Acad. Sci. USA 77: 3159–3163. Hedderson, T. A., Chapman, R. L., and Rootes, W. L. (1996). Phylogenetic relationships of bryophytes inferred from nuclear-encoded rRNA gene sequences. Plant Syst. Evol. 200: 213–224. Hendy, M. D., and Penny, D. (1989). A framework for the quantitative study of evolutionary trees. Syst. Zool. 38: 297–309. Hillis, D. M. (1996). Inferring complex phylogenies. Nature 383: 130–131. Hillis, D. M., Mable, B. K., Larson, A., Davis, S. K., and Zimmer, E. A. (1996). Nucleic Acids IV: Sequencing and cloning. In ‘‘Molecular Systematics’’ (D. M. Hillis, C. Moritz, and B. K. Mable, Eds.), 2nd ed., Sinauer, Sunderland, MA. Huelsenbeck, J. P., and Hillis, D. M. (1993). Success of phylogenetic methods in the four-taxon case. Syst. Biol. 42: 247–264. Ikemura, T. (1981). Correlation between the abundance of Escherichia coli transfer RNAs and the occurrence of the respective codons in its protein genes: a proposal for a synonymous codon
choice that is optimal for the E. coli translational system. J. Mol. Biol. 151: 389–409. Kenrick, P., and Crane, P. R. (1991). Water-conducting cells in early fossil land plants: implications for the early evolution of tracheophytes. Bot. Gaz. 152: 335–356. Kimura, M. (1980). A simple method for estimating evolutionary rates of base substitution through comparative studies of nucleotide sequences. J. Mol. Evol. 16: 111–120. Kimura, M. (1983). ‘‘The Neutral Theory of Molecular Evolution,’’ Cambridge Univ. Press, Cambridge. Kishino, H., and Hasegawa, M. (1989). Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order of the Hominoidea. J. Mol. Evol. 29: 170–179. Knight, S., Andersson, I., and Branden, C.-I. (1990). Crystallographic analysis of ribulose 1,5-bisphosphate carboxylase from spinach at 2.4 Å resolution. J. Mol. Biol. 215: 113–160. Kranz, H. D., Miks, D., Siegler, M.-L., Capesius, I., Sensen, C. W., and Huss, V. A. R. (1995). The origin of land plants: phylogenetic relationships among Charophytes, Bryophytes, and Vascular plants inferred from complete small-subunit ribosomal RNA gene sequences. J. Mol. Evol. 41: 74–84. Krassilov, V. A., and Schuster, R. M. (1984). Paleozoic and mesozoic fossils. In ‘‘New Manual of Bryology’’ Volume 2 (R. M. Schuster, Ed.), pp. 1172–1193, Hattori Bot. Lab., Nichinan, Japan. Kumar, S., Tamura, K., and Nei, M. (1993). MEGA: Molecular Evolutionary Genetics Analysis. Version 1.0. Pennsylvania State University, University Park, Pennsylvania. Lake, J. A. (1994). Reconstructing evolutionary trees from DNA and protein sequences: Paralinear distances. Proc. Natl. Acad. Sci. USA 91: 1455–1459. Lewis, L. A., Wilcox, L. W., Fuerst, P. A., and Floyd, G. L. (1992). Concordance of molecular and ultrastructural data in the study of zoosporic chlorococcalean green algae. J. Phycol. 28: 375–380. Li, W.-H., and Graur, D. (1991).
‘‘Fundamentals of Molecular Evolution,’’ Sinauer Associates, Inc., Sunderland, Massachusetts. Lockhart, P. J., Steel, M. A., Hendy, M. D., and Penny, D. (1994). Recovering evolutionary trees under a more realistic model of sequence evolution. Mol. Biol. Evol. 11: 605–612. Manhart, J. (1994). Phylogenetic analysis of green plant rbcL sequences. Mol. Phyl. Evol. 3: 114–127. Manhart, J., and Palmer, J. D. (1990). The gain of two chloroplast tRNA introns marks the green algal ancestors of land plants. Nature 345: 268–270. Mishler, B. D., and Churchill, S. P. (1984). A cladistic approach to the phylogeny of the ‘‘bryophytes.’’ Brittonia 36: 406–424. Mishler, B. D., and Churchill, S. P. (1985). Transition to a land flora: phylogenetic relationships of the green algae and bryophytes. Cladistics 1: 305–328. Mishler, B. D., Lewis, L. A., Buchheim, M. A., Renzaglia, K. S., Garbary, D. J., Delwiche, C. F., Zechman, F. W., Kantz, T. S., and Chapman, R. L. (1994). Phylogenetic relationships of the ‘‘green algae’’ and ‘‘bryophytes.’’ Ann. Missouri Bot. Gard. 81: 451–483. Mishler, B. D., Thrall, P. H., Hopple, J. S., Jr., De Luna, E., and Vilgalys, R. (1992). A molecular approach to the phylogeny of the bryophytes: Cladistic analysis of chloroplast-encoded 16S and 23S ribosomal RNA genes. The Bryologist 95: 172–180.
Morton, B. R. (1994). Codon use and the rate of divergence of land plant chloroplast genes. Mol. Biol. Evol. 11: 231–238. Muse, S. V. (1996). Estimating synonymous and nonsynonymous substitution rates. Mol. Biol. Evol. 13: 105–114. Muse, S. V., and Gaut, B. S. (1994). A likelihood approach for comparing synonymous and nonsynonymous substitution rates, with application to the chloroplast genome. Mol. Biol. Evol. 11: 715–724. Nei, M., and Gojobori, T. (1986). Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 3: 418–426. Pike, L. M., Hu, A., Renzaglia, K. S., and Musich, P. R. (1992). Liverwort genomes display extensive structural variations. Bot. J. Linn. Soc. 109: 1–14. Schuster, R. M. (1966). ‘‘The Hepaticae and Anthocerotae of North America,’’ Vol. 1, Columbia Univ. Press. Schuster, R. M. (1984). Evolution, phylogeny and classification of the Hepaticae. In ‘‘New Manual of Bryology’’ (R. M. Schuster, Ed.), Vol. 2, pp. 892–1070, Hattori Bot. Lab., Nichinan, Japan. Sluiman, H. J. (1985). A cladistic evaluation of the lower and higher green plants (Viridiplantae). Plant Syst. Evol. 149: 217–232. Smith, J. F., and Doyle, J. J. (1986). Chloroplast DNA variation and evolution in the Juglandaceae. Am. J. Bot. 78: 730. Steel, M. (1994). Recovering a tree from the leaf colouration it generates under a Markov model. Appl. Math. Lett. 7: 19–23. Stewart, W. N., and Rothwell, G. W. (1993). Paleobotany and the evolution of plants, 2nd ed., Cambridge Univ. Press. Swofford, D. L. (1993). PAUP 3.1.1. Illinois Natural History Survey. Swofford, D. L. PAUP*, in press. version 4.0 d. 38–42. Sinauer, Sunderland, MA. Swofford, D. L., Olsen, G. J., Waddell, P. J., and Hillis, D. M. (1996). Phylogenetic Inference. In ‘‘Molecular Systematics’’ (D. M. Hillis, C. Moritz, and B. K. Mable, Eds.), 2nd ed., Sinauer, Sunderland, MA. Ueda, K., and Yoshinaga, K. (1995). Can rbcL tell us the phylogeny of green plants?
In ‘‘Current topics on molecular evolution’’ (M. Nei and N. Takahata, Eds.), pp. 97–103, The Institute of Molecular Evolutionary Genetics, Pennsylvania State University, and Graduate School for Advanced Studies, Hayama, Japan. Waters, D. A., Buchheim, M. A., Dewey, R. A., and Chapman, R. L. (1992). Preliminary inferences of the phylogeny of bryophytes from nuclear-encoded ribosomal RNA sequences. Amer. J. Bot. 79: 459–466. Wilson, M. A., Gaut, B., and Clegg, M. T. (1990). Chloroplast DNA evolves slowly in the Palm family (Arecaceae). Mol. Biol. Evol. 7: 303–314. Wolf, P. G., Soltis, P. S., and Soltis, D. E. (1994). Phylogenetic relationships of Dennstaedtioid ferns: evidence from rbcL sequence variation. Mol. Phyl. Evol. 3: 383–392. Wolfe, K. H., Morden, C. W., Ems, S. C., and Palmer, J. D. (1992). Rapid evolution of the plastid translational apparatus in a nonphotosynthetic plant: Loss or accelerated sequence evolution of tRNA and ribosomal protein genes. J. Mol. Evol. 35: 304–317. Yang, Z. (1994). Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: Approximate methods. J. Mol. Evol. 39: 306–314.