Fungal Family Trees – finding relationships from molecular data


Mycologist, Volume 16, Part 2 May 2002. ©Cambridge University Press Printed in the United Kingdom. DOI: 10.1017/S0269915X02002045 Fungal Family Trees...



GRAEME DOWN
HRI-East Malling, Kent, ME19 6BJ

The study of the evolutionary history of organisms by tracing them to common ancestors is known as phylogenetics. Over the last 20 years or so, the increasingly widespread availability of molecular data has seen the blossoming of methodologies for converting DNA information to evolutionary trees and networks. This article will attempt to give an overview of some of the methods that are available, covering data types, considerations for their use, and methods by which evolutionary ancestry can be reconstructed. Where relevant, the text will include examples of the use of these methodologies for the obligately biotrophic plasmodiophorids.

Keywords: Fungi, evolution, techniques, trees, phylogeny

Introduction

No method of tracing the ancestry and relatedness of species can be guaranteed to give results that are 100% accurate; after all, the process involves dealing with organisms that no longer exist. What we can do to make our findings as reliable as possible is to decide on: a) the most appropriate type of data to use, and b) the most informative analysis method(s) for the data selected.

In the past, fungi and other microbes have been assigned to taxonomic groupings using a range of morphological and physiological properties, such as growth on certain media, pigmentation (e.g. Lardner et al., 1999, on Colletotrichum acutatum), and resting spore structures (e.g. the plasmodiophorids; see Braselton, 1995). Biochemical properties and cellular ultrastructure (e.g. Braselton, 1992) have also been utilised. In many cases, such properties have proven extremely reliable in classifying organisms. However, potentially the most powerful raw information at our disposal is DNA sequence data, where similarities and differences in sequences can be correlated with different taxonomic groupings, or even with individual isolates.

Data types

Many approaches can be followed to obtain molecular data so that variation can be seen between individuals, populations, and species, or at whatever level of relationship one wishes to investigate. For examining relationships at around the population and species level, four electrophoresis band pattern methods have been commonly used.

1. RAPDs (Randomly Amplified Polymorphic DNA) involve a random amplification from the total genome of organisms, resulting in a series of PCR amplified products that are visualised as band patterns in electrophoresis gels.
2. RFLPs (Restriction Fragment Length Polymorphisms) generate band patterns in gels based on the digestion of regions of DNA with particular restriction enzymes.
3. AFLPs (Amplified Fragment Length Polymorphisms) provide complex band patterns from a restriction digest followed by selective amplification of the resulting fragments.
4. Simple Sequence Repeats (microsatellites) are based on the amplification of repetitive DNA and can also generate many PCR products from a single organism.

A good review of the use of the above approaches can be found in Karp et al. (1996), which focuses on botanical diversity and also covers sequence data (see later).

All of these techniques and related approaches generate discrete bands on whatever medium the results are visualised. Generally, these are scored by a binary method, with a score of 1 given for the presence of a band and 0 for absence. In Fig 1, which shows a hypothetical banding pattern, it can be seen that there is a discrete banding pattern for each organism. A score can be assigned for each band, and in this way a binary profile can be built up for each organism in question. The number of bands in common between organisms can be represented numerically, and these presence/absence tables can be used to derive quantitative measures of similarity or difference, such as the simple matching score or a genetic distance.

Fig 1 An imaginary banding profile for three organisms, A, B, and C (left). The position of each band has been numbered. This presence or absence of bands can be converted to a binary chart (1 for presence, 0 for absence) as indicated (right).

Band  1 2 3 4 5
A     1 1 1 0 0
B     0 0 1 1 1
C     1 0 0 1 1

A more complete understanding of relationships between organisms may be obtained from the analysis and comparison of DNA sequences. Firstly, one needs to decide which part of the genome to investigate. Examples in the literature include analysis of mitochondrial genes (e.g. Hibbett, 2001; Yamagishi et al., 1997) and β-tubulin (e.g. Keeling et al., 2000; Thon & Royse, 1999). However, by far the most widely utilised region for fungal studies is the nuclear ribosomal RNA gene cluster, the properties of which have been well reviewed by Hillis & Dixon (1991). This region is a highly conserved unit in all eukaryotes, and for evolutionary studies has the advantage that it comprises regions of DNA that are evolving at different rates. For instance, the internal transcribed spacers (ITS) evolve quite rapidly, which means that they are useful in studies of closely related species, or even for comparing populations within species. The 18S ribosomal DNA (or small subunit) evolves much more slowly, and so sequence variation in this region may be more informative for studies on groups that diverged much earlier. Analysis of ITS sequences may be inappropriate in such cases, as there may be too much change in sequence, and this can obscure patterns of relationships.

In addition to its variable evolutionary rate, the ribosomal DNA exists in many copies per genome. As such, it is a good target for PCR based studies, particularly where analytical samples may be limited, such as in studies from soil samples or historic material (White et al., 1990).
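The binary scoring of band patterns illustrated in Fig 1 can be sketched in a few lines of Python. This is a minimal illustration only: the profiles are the three rows of the Fig 1 chart (assumed here to be organisms A, B and C in that order), and the function names `shared_bands` and `simple_matching` are hypothetical, not taken from any package.

```python
from itertools import combinations

# Binary band profiles as read from the chart in Fig 1
# (assumed row order: organisms A, B, C; band positions 1 to 5).
profiles = {
    "A": "11100",
    "B": "00111",
    "C": "10011",
}


def shared_bands(p, q):
    """Number of band positions at which both organisms show a band."""
    return sum(a == "1" and b == "1" for a, b in zip(p, q))


def simple_matching(p, q):
    """Simple matching score: the proportion of positions with the same
    state (band present in both, or band absent in both)."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same band positions")
    return sum(a == b for a, b in zip(p, q)) / len(p)


for x, y in combinations(profiles, 2):
    s = simple_matching(profiles[x], profiles[y])
    print(x, y,
          "shared bands:", shared_bands(profiles[x], profiles[y]),
          "simple matching:", s,
          "difference:", 1 - s)
```

Note that the simple matching score counts shared absences as well as shared presences; whether that is appropriate depends on the marker system, and other coefficients treat shared absences differently.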

Methods for defining relationships

Binary data

Data with two possible states, generally resulting from presence or absence of a character, can be analysed using many different simple coefficients for determining relationships, and results from these can be represented as trees by using the same general methods available for sequence data. This is something of an oversimplification, as some restrictions and requirements for discrete presence/absence data differ from those for sequences, but this article will concentrate on the basics of handling sequence data.

Sequence data

Given a set of DNA sequences, how does one go about comparing them and identifying relationships between them? All the sequences of interest can be compared by multiple sequence alignment. Packages for performing multiple sequence alignments include GCG and ClustalW (Thompson et al., 1994), and these can be accessed via numerous websites, a selection of which is listed later. Sequences should be trimmed to equal starting and ending points (see Fig 2). Otherwise, the longest sequence will effectively be matched against blanks or missing bases in the other sequences.

Blank spaces, or gaps, within sequences when aligned may represent insertions or deletions, and so may convey significant information. The particular weight, or importance, given to such gaps can be varied when developing alignments, and this is a usual option in DNA alignment packages. However, gaps due to incomplete sequencing do not convey any information, as the sequence data are not known, and so the overhangs between sequences must be removed to prevent the missing data being considered as significant gaps. Within the aligned region, a gap due to a single missing base is likely to indicate a simple point deletion or insertion, but a gap spanning several bases may suggest either one large deletion event or several smaller deletions (or insertions in the other sequences).
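The trimming of overhangs described above (and shown in Fig 2) can be sketched as follows. This is an illustrative sketch under stated assumptions: the sequences are already aligned to equal length, the character `-` marks both unsequenced ends and internal gaps, and the three example sequences are invented rather than taken from Fig 2. The function name `trim_overhangs` is hypothetical.

```python
def trim_overhangs(aligned, gap="-"):
    """Trim an alignment to the columns covered by every sequence.

    `aligned` maps names to equal-length aligned strings in which `gap`
    pads both unsequenced ends and internal insertions/deletions.
    Internal gaps are kept; only leading and trailing overhangs go.
    """
    if len({len(seq) for seq in aligned.values()}) != 1:
        raise ValueError("aligned sequences must have equal length")
    starts, ends = [], []
    for seq in aligned.values():
        starts.append(len(seq) - len(seq.lstrip(gap)))  # first real base
        ends.append(len(seq.rstrip(gap)))               # one past last base
    start, end = max(starts), min(ends)
    if start >= end:
        raise ValueError("sequences share no common region")
    return {name: seq[start:end] for name, seq in aligned.items()}


# Hypothetical aligned sequences of differing sequenced length.
example = {
    "seq1": "ACGTAGGCTA-CGTTAGCA",
    "seq2": "-----GGCTAACGTTA---",
    "seq3": "--GTAGGCTAACG------",
}
trimmed = trim_overhangs(example)
for name, seq in trimmed.items():
    print(name, seq)
```

The internal gap in `seq1` survives trimming, since it may represent a real insertion or deletion, whereas the end padding is discarded as missing data.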
Insertion and deletion events are weighted during alignment; that is, their likelihood of occurrence can be estimated and given a value. If an event is considered unlikely, it is given a high level of significance, usually expressed as a penalty. For instance, if it is considered that opening a gap in a sequence is far more important than the length of that gap, then a high penalty would be given to “gap opening” and a much lower one to “gap extension”. Although there are rough guidelines for assigning these values, it is very much down to user judgement in most cases, and if in doubt, one is often best advised to use whatever default values are supplied. It is also possible to weight individual character positions so that changes in one place are given more importance than changes in others, but there is always a certain amount of estimation involved, and such weighting can give misleading results. Figure 3 illustrates a short segment of sequences before and after a ClustalW alignment with varying parameters.

Fig 2 Three aligned sequences of varying length are indicated. For accurate evolutionary reconstruction, the 5 bases at each end of the first sequence should be discarded as these are missing from one or other of the remaining sequences. Only the bases highlighted provide a meaningful comparison.

Fig 3 A. A set of three partial 18S rDNA sequences prior to ClustalW alignment. Pm = Phytophthora megasperma (Accession No. X54265), PBr = Plasmodiophora brassicae (U18981), Ssn = Spongospora subterranea f. sp. nasturtii (AF245217). B. After a ClustalW alignment with penalties of 10 for gap opening and 0.05 for gap extension. C. After a ClustalW alignment with penalties of 1 for gap opening and 1 for gap extension. The differences caused in the alignment by changing the parameters in B and C are indicated by the highlights.

Once the alignment has been performed, it is possible to look at the sequences and make any minor adjustments as necessary. The multiple sequence alignment will now give an idea of what might be expected when building a tree diagram. Some sequences will be very similar, and should group close to each other in subsequent analysis, whilst some may be very divergent, and should separate from the rest. If they do not, it does not necessarily mean that an error has been made, but it implies that a thorough check of the procedures should be made. Methods for optimising alignments are still subjects of current research, and recent developments in this area include work on direct optimisation, where the alignment is optimised and the most parsimonious cladogram is produced simultaneously (for examples, see Wheeler, 2001).

Whatever analysis method is now chosen, there are some further considerations to bear in mind. A very simple one is the length of the sequences being analysed. It has been suggested that 1000 bases is the minimum length on which one should base an analysis (Wainwright et al., 1993). In practice, this is often not possible, but it stands to reason that the shorter the sequences are, the more significant any slight differences will become, and in a short sequence these differences may be unrepresentative of the true variation. The more DNA regions that can be used, and the longer the sequences available from them, the more confidence is generated in the results, and the more complete the evolutionary history that will be built. However, there is an argument that the constraints on variation in one gene may be different to those in another, and conflicting results can arise. Hence care should be taken when deciding whether to combine data sets. This is particularly true when considering sequences from disparate regions such as mitochondrial and nuclear DNA.
Another point to note is that if all sequences have the same base at a particular point (a constant character), then this will yield no useful phylogenetic information. For example, if all sequences in an alignment contain a motif of CATA at the same position, then any method that is used to look at these 4 bases can only assume that there is no difference between all the organisms. However, it is generally simpler to include such regions in analyses, and if one is interested in true evolutionary distance, then conserved regions such as this will provide an idea of how much or how little divergence has taken place between sequences.

Other considerations that may need to be taken into account when building a complete picture of molecular evolution include the ratio of transitions to transversions (see Kimura, 1980; Jukes & Cantor, 1969), secondary structure (see Kumar & Rzhetsky, 1996), site to site variation (Kumar & Rzhetsky, 1996), and multiple evolutionary events.

Transitions/transversions

This is the phenomenon by which base changes occur more frequently from purine to purine and pyrimidine to pyrimidine (transitions), rather than purine to pyrimidine and vice-versa (transversions). Methods for tree-building can be set to account for this; one example is the Kimura-2-parameter model (Kimura, 1980), where transitions are scored with a lower value than the rarer transversions (making the latter a more significant event). Other models, such as Jukes-Cantor (Jukes & Cantor, 1969), assume equal probability of all base changes.

In this context, it is worth mentioning N’s, or unknown bases. These are characters that are included in a DNA sequence when the sequencing process has given an ambiguous result, and the correct identity of a base cannot be determined. An N in a sequence alignment could therefore equally signify identity, a transversion, a transition, or even a missing base, depending on the quality of the sequence data. Most scoring systems use 0 (zero) for identity and a positive score, such as 1, for a base change, with perhaps 2 for transversions in some models, such as the Kimura-2-parameter. In many cases, N will in reality signify identity, as base changes are relatively rare, and so if it is possible to set values for N in a program, it may be advisable to set these at the lower end of the scale, perhaps as 0.1 or similar. If data contain too many N’s, however, then the alignments made will be unreliable.

Secondary structure, site-to-site variation and independence

Regions within a single DNA molecule may interact to form stable secondary structures. Ribosomal RNA, for example, forms complex ‘stem and loop’ structures, which may affect the possibility of change of state for some nucleotide positions (Kumar & Rzhetsky, 1996). In effect, the change of one base may influence the structure, so that a change elsewhere becomes necessary, and so there is loss of independence amongst characters. Structures such as stems and loops may also lead to site to site variation in rate of change, i.e. there are constraints on some sites that prevent the bases from changing as easily as at other sites. It is possible to estimate the distribution of rate of change over a sequence (Kumar & Rzhetsky, 1996), and incorporate this into phylogenetic calculations.
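The counting of constant characters, transitions and transversions discussed in this section, and the way the Jukes-Cantor and Kimura-2-parameter models turn such counts into distances, can be sketched as below. The distance formulae are the standard published ones; everything else is an assumption of this sketch: the two sequences are invented, the function names are hypothetical, and sites containing an N or a gap are simply skipped rather than given a small score as suggested in the text.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_changes(seq1, seq2):
    """Return (compared sites, transitions, transversions) for two aligned
    sequences. Sites where either sequence has a gap, an N or any other
    non-ACGT character are skipped (an assumption of this sketch)."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1          # purine<->purine or pyr<->pyr
        else:
            transversions += 1        # purine<->pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    return sites, transitions, transversions


def jukes_cantor(seq1, seq2):
    """Jukes-Cantor distance: all base changes equally probable."""
    n, ts, tv = count_changes(seq1, seq2)
    p = (ts + tv) / n
    return -0.75 * math.log(1 - 4 * p / 3)


def kimura_2p(seq1, seq2):
    """Kimura 2-parameter distance: transitions and transversions are
    allowed to occur at different rates."""
    n, ts, tv = count_changes(seq1, seq2)
    P, Q = ts / n, tv / n
    return -0.5 * math.log(1 - 2 * P - Q) - 0.25 * math.log(1 - 2 * Q)


def constant_columns(alignment):
    """Indices of columns at which every sequence has the same base."""
    return [i for i, col in enumerate(zip(*alignment)) if len(set(col)) == 1]


# Two hypothetical aligned sequences sharing the motif CATA at the start.
s1 = "CATAGGCTAACGTTAGCATC"
s2 = "CATAGACTAACGTCAGCTTC"
print(count_changes(s1, s2))
print(jukes_cantor(s1, s2), kimura_2p(s1, s2))
print(constant_columns([s1, s2]))
```

Both formulae fail (the logarithm is undefined) when sequences are so divergent that the changes have saturated, which is one practical reason why fast-evolving regions are unsuitable for distantly related groups.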


Nucleotide ratios

A final consideration is that not all organisms or DNA sequences have an equal ratio of A, T, C, and G. This will clearly lead to distorted results if the likelihood of base change is incorrectly assumed. Models such as Felsenstein’s (Felsenstein, 1981) seek to address this phenomenon.

The rest of the article will consider ways of representing evolutionary history. Trees are the most common way of representing phylogeny, and may be either rooted, with all species present descended from an ancestor at the base of the tree, or unrooted, with no fixed starting point. Rooted trees have a direction, which corresponds to evolutionary time (Page & Holmes, 1998). However, if one is unclear as to where the root of the tree should lie, an unrooted tree will give a more accurate representation of evolution. To root trees, it is common practice to use an outgroup, which is generally a sequence from an organism considered to be external to the group being examined. Outgroups are comparison taxa, and the organism selected as an outgroup should share a common ancestor with the organisms being considered, but should not be so far diverged from them as to provide a meaningless comparison (e.g. rooting a group of whale species with a green alga).

Tree-building methods

There are four major terms that are routinely used when deciding which tree-building method to use. These are discrete, continuous, clustering and optimality (see Swofford & Olsen, 1990; Page & Holmes, 1998). In the case of sequence information, ‘discrete data’ implies that each nucleotide position is treated individually across all sequences considered. ‘Parsimony’ is the most widely used example of this, and is also based on optimality criteria. One of the guiding principles of phylogenetic reconstruction is the assumption that evolutionary pathways have occurred through the minimum number of steps.
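The minimum-steps principle can be made concrete by counting the changes that a given tree requires. The sketch below uses Fitch's algorithm for this count; that choice is an assumption of the illustration, as the article does not name a particular algorithm, and the four sequences and the function names are invented. Trees are written as nested pairs of taxon names.

```python
def fitch_site(tree, states):
    """Return (possible ancestral bases, changes) for one alignment column
    on a rooted binary tree given as nested 2-tuples of taxon names."""
    if isinstance(tree, str):                 # a tip: its observed base
        return {states[tree]}, 0
    left, right = tree
    lset, lcost = fitch_site(left, states)
    rset, rcost = fitch_site(right, states)
    common = lset & rset
    if common:                                # children agree: no change
        return common, lcost + rcost
    return lset | rset, lcost + rcost + 1     # disagreement: one change


def parsimony_length(tree, alignment):
    """Total number of changes the tree requires over all columns."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        column = {name: seq[i] for name, seq in alignment.items()}
        total += fitch_site(tree, column)[1]
    return total


# Hypothetical aligned sequences for four taxa.
alignment = {"A": "AAGCT", "B": "AAGCC", "C": "ATGTC", "D": "ATGTT"}

# The three possible groupings of four taxa.
candidates = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    (("A", "D"), ("B", "C")),
]
scores = {tree: parsimony_length(tree, alignment) for tree in candidates}
best = min(scores, key=scores.get)
print(scores)
print("most parsimonious:", best)
```

With only four taxa all three groupings can be scored exhaustively; for larger data sets the number of possible trees grows very rapidly and programs rely on search strategies instead.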
On this principle, a tree that depicts lineages involving the fewest changes in characters will be the most parsimonious (e.g. Camin & Sokal, 1965; Edwards & Cavalli-Sforza, 1964). Parsimony analysis produces a tree involving the least amount of evolutionary change based on the input alignment. Optimality requires a defined relationship between tree and data, such as a model of evolution. Each tree built is given a score based on this function, and the tree with the best score is chosen as the optimal tree. The parsimony score of any tree may be calculated, and many trees may be constructed from a data set. The final number of trees will depend on the type of tree constructed, and whether or not both the tips of the tree (the living organisms) and the branch points (hypothetical ancestors) are included. In parsimony methods there may be more than one tree because several results are equally parsimonious, and so a definite answer may not be reached. Likelihood analyses also treat characters individually, but require a hypothesis of evolution to be formulated first, and then the methods attempt to fit the original data to this (optimal tree construction).

Continuous data are analysed by distance methods. These treat sequences as a whole to build evolutionary pictures, rather than building a picture from each base in turn and then choosing the best. A number of clustering methods are based on pairwise distance matrices (Fig 4). In simple terms, two sequences are chosen as a starting point and the overall difference between them is calculated. Each is then compared with sequence three, the differences scored, and so on. These scores can be built to account for factors such as transition/transversion ratios, as previously discussed. A distance matrix is thus created, and a tree can be built.

Fig 4 Three sequences (left) are converted to distance values (right). For example, out of the six positions, sequences 1 and 2 differ at 2 of them. Assuming a score of 1 for each difference and 0 for identity, then the distance between sequence 1 and sequence 2 is 2. This is indicated in the distance matrix (right).

Clustering methods take three distances to start with, and then add a fourth sequence based on its distance score compared to those three. The process is repeated iteratively until the tree is complete (Fig 5). It is possible to randomise the order in which sequences are added, as this can affect the pattern of branching in the tree. It is also possible to set constraints on branch lengths, since the calculations can generate negative values. Clearly, these cannot be correct in reality as they would imply that a taxon had evolved before its ancestor, but they may need to be ‘allowed’ to obtain the best tree, and hence they are often set to zero. Optimal distance tree methods do exist, and include minimum evolution, a method similar to parsimony in that it searches for the tree with the minimum total length of differences between the sequences (total branch lengths).
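The conversion of sequences to a distance matrix (Fig 4) and the building of a tree by repeatedly joining the closest groups and averaging their distances (Fig 5) can be sketched as follows. This is an illustration only: it uses size-weighted average linkage (UPGMA-style), which is one of several clustering schemes and is assumed here rather than taken from the article, and both the three short sequences and the four-taxon distance values are invented, since the figures' actual data are not reproduced in the text.

```python
from itertools import combinations


def distance_matrix(seqs):
    """Pairwise distances: 1 for each differing position, 0 for identity."""
    return {
        frozenset((a, b)): sum(x != y for x, y in zip(seqs[a], seqs[b]))
        for a, b in combinations(seqs, 2)
    }


def cluster(dist):
    """Average-linkage clustering; returns the tree as nested tuples.
    Needs distances for at least two taxa."""
    dist = dict(dist)
    sizes = {name: 1 for pair in dist for name in pair}
    while len(sizes) > 1:
        pair = min(dist, key=dist.get)            # closest two clusters
        a, b = sorted(pair, key=str)              # fixed order for output
        merged = (a, b)
        new_size = sizes[a] + sizes[b]
        for other in sizes:
            if other in (a, b):
                continue
            # Size-weighted average of the distances from the merged pair.
            d = (dist[frozenset((a, other))] * sizes[a]
                 + dist[frozenset((b, other))] * sizes[b]) / new_size
            dist[frozenset((merged, other))] = d
        # Drop every distance involving the two clusters just merged.
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        del sizes[a], sizes[b]
        sizes[merged] = new_size
    return next(iter(sizes))


# Hypothetical six-base sequences; 1 and 2 differ at two positions.
seqs = {"1": "ACGTAC", "2": "ACGTTT", "3": "GCGAAC"}
matrix = distance_matrix(seqs)
print({tuple(sorted(k)): v for k, v in matrix.items()})

# Hypothetical distances in which A and B are closest, then D, then C.
example = {
    frozenset("AB"): 2, frozenset("AC"): 8, frozenset("AD"): 4,
    frozenset("BC"): 8, frozenset("BD"): 6, frozenset("CD"): 9,
}
tree = cluster(example)
print(tree)
```

Because distances are averaged, this scheme never produces negative branch lengths, but it does assume that all lineages change at the same rate; methods that relax that assumption are the ones for which negative values can arise.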
Fig 5 Tree-building by clustering. A and B have the shortest distance in the top distance matrix, and so these are chosen as the starting point for building the tree. The distances between A and B from C or D are now averaged, and A-B is seen to be closest to D, which is placed on the tree. In this case C is the only remaining species and so the tree is completed.

There are several variants on the themes of parsimony and distance, which allow slightly different variations and constraints to be placed on analyses. All these methods and their basic principles have individual strengths and weaknesses, many of which will be reduced by using the most appropriate method for the data.

How does one check that the analysis is reliable?

By far the most widely used method for estimating branch reliability is ‘bootstrapping’. This can give a value that is taken as a measure of support for individual branches within a tree, but clearly this will also be dependent on how reliable the results are, based on the data and analysis method chosen. If either of these is inappropriate, then bootstrapping can give apparently good values from a flawed analytical design. The principle of bootstrapping is well-established in statistics (see Fig 6), although its use in molecular systematics can be controversial and other measures of support have been proposed (see Lee, 1999).

Essentially, bootstrapping involves the random re-sampling of data sets from the original data, and the determination of the tree that would result from each data set. Alignment positions are drawn at random with replacement, so any nucleotide position may be chosen more than once or ignored in a given sample, and the procedure is repeated many (often 1000-10,000) times. At the end of this, there will be many trees available, and the support for each branch is determined as the proportion of trees in which it occurs. If there are few bootstraps, one can look at each individually to ascertain how consistent the results are. However, it is common practice to build what is known as a majority-rule consensus tree from these, in which the groups that occur in the majority of the bootstrap results are combined into a single tree. In these trees, a number at each branch point indicates the proportion of individual trees that contained this branch. Clearly, as the individual trees generated during a bootstrap analysis will vary to some extent (although if the data and analysis are very good, this variation may be minimal), there may be branches which cannot fit on to the consensus tree without causing other branches to misalign. Bootstrapping may however be inappropriate for resolving phylogenetic ambiguity, and other approaches are being developed (see Sharkey & Leathers, 2001).

Fig 6 The principle of bootstrapping. Nucleotides are sampled from within the sequence data (same data as in Fig 3). If the highlighted nucleotides were chosen, then P. brassicae and S. subterranea will be found as most closely related, which is correct. If position 2 is chosen in place of position 1, then P. brassicae would be closer to P. megasperma (which is not correct). If enough bootstraps are performed then the former conclusion will be reached in the majority of cases. The proportion of cases in which it is found will be the bootstrap or confidence value.

In practice, branch values often tend to decrease as branch length increases, and there is no single value that can be taken as a reliable cut-off for providing a good confidence level, although in practice values of around 90% or more may be expected. When interpreting bootstrap values, it should be borne in mind that the figure indicates the confidence in whatever is to the right of the value (on a tree rooted at the left margin) having grouped away from whatever is to the left of that particular branch point. It is therefore a measure of how reliable the occurrence of that branch point is, and does not necessarily imply any information about relationships among the organisms grouped to the right of that branch point (see Fig 7, for example). It should also be remembered that bootstrapping is essentially a one-tailed method, and although groups occurring in a majority-rule tree can be inferred to have support, groups not occurring cannot be assumed to be unsupported.

Fig 7 An imaginary tree constructed for species A1, A2 and A3 within family A, B1 and B2 within family B and species C as an outgroup. Bootstrap values are shown for 100 replicates. The figure of 99 indicates that on 99 out of 100 occasions, A1, A2 and A3 were recovered as a single group. It does not give any information on the closeness of A3 to A1 and A2. This is indicated by the figure of 67, which shows that the separation of A1 and A2 from A3 was recovered only 67 times out of 100. The results show that A forms a separate grouping which has split off from B and C, but that relationships within A are less certain.

Showing evolution without trees

In addition to trees, evolution can be depicted in other ways. One example is known as spectral analysis, and this is well described in Page & Holmes (1998). It is perhaps particularly useful for binary data, as it works best with only two character states (i.e. 0 and 1). Rather than treating each character position independently as in parsimony, the method seeks to combine positions that give the same type of information, and hence pick up patterns that might otherwise be missed.

A second example is principal co-ordinate analysis, where calculations are performed to place organisms in n-dimensional space, where n is the number of characters used. Vectors are then constructed from combinations of characters, or parts of characters, that maximise the variation in the data. In this way, multiple characters that show the same variations between organisms combine together in single vectors, and each vector could be considered as a different trait in variation. One limitation of this is that it is not possible to present an n-dimensional representation on paper or screen, and such analyses are usually confined to the 2 or 3 most significant vectors (for example, see Maurer et al., 1997). One possible benefit of principal co-ordinate analysis is that truly random variation is not correlated, and so is effectively screened out from the most significant vectors, and thus slight correlations occurring in otherwise random variation may be easier to detect (see Bridge, 1998).

Websites, programs and literature

Further information and detailed explanation on many of the above techniques can be found in two very useful references. The first is a chapter by Swofford & Olsen (1990), and the second a book by Page & Holmes (1998). Websites that the author has found useful, and which provide further information and links to some of the techniques mentioned above, include:


National Center for Biotechnology Information: http://www.ncbi.nlm.nih.gov
European Bioinformatics Institute: http://www.ebi.ac.uk
The Pasteur Institute: http://bioweb.pasteur.fr
WWW-server of Felsenstein lab, Department of Genome Sciences, University of Washington: http://evolution.genetics.washington.edu/
Taxonomy Centre, Glasgow University: http://taxonomy.zoology.gla.ac.uk/

These sites should provide access or links to most of the programs and packages mentioned in the text.
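For readers who wish to experiment before turning to those packages, the bootstrap procedure described earlier (Fig 6) can be sketched in a few lines. This is a deliberately simplified illustration under several assumptions: the three sequences are invented stand-ins (they are not the real 18S rDNA data of Fig 3), "tree building" is reduced to finding the closest pair among three sequences, ties are broken by name order, and the function names are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations


def resample_columns(alignment, rng):
    """One bootstrap replicate: draw columns at random with replacement,
    so a column may appear several times or not at all."""
    names = list(alignment)
    n = len(alignment[names[0]])
    picks = [rng.randrange(n) for _ in range(n)]
    return {name: "".join(alignment[name][i] for i in picks)
            for name in names}


def closest_pair(alignment):
    """Stand-in for tree building: the pair with the fewest differences.
    With three sequences this pair defines the only grouping."""
    def differences(pair):
        a, b = pair
        return sum(x != y for x, y in zip(alignment[a], alignment[b]))
    return min(combinations(sorted(alignment), 2), key=differences)


def bootstrap_support(alignment, replicates=1000, seed=0):
    """Proportion of replicates in which each pair is recovered."""
    rng = random.Random(seed)
    counts = Counter(
        closest_pair(resample_columns(alignment, rng))
        for _ in range(replicates)
    )
    return {pair: n / replicates for pair, n in counts.items()}


# Hypothetical aligned sequences, named after the taxa of Fig 6.
alignment = {
    "Pbr": "ACGTACGTACGTACGTACGT",
    "Ssn": "ACGTACGTACGAACGTACGT",
    "Pm":  "ACCTAGGTTCGTACATACGA",
}
support = bootstrap_support(alignment)
for pair, value in sorted(support.items()):
    print(pair, value)
```

A real analysis would build a full tree from each replicate with the chosen method and summarise the replicates as a majority-rule consensus tree; the resampling step, however, is exactly as shown.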

References

Braselton, J.P. (1992). Ultrastructural karyology of Spongospora subterranea (Plasmodiophoromycetes). Canadian Journal of Botany 70: 1228-1233.
Braselton, J.P. (1995). Current status of the plasmodiophorids. Critical Reviews in Microbiology 21: 263-275.
Bridge, P.D. (1998). Numerical analysis of molecular variability, a comparison of hierarchic and nonhierarchic approaches. In Molecular Variability of Fungal Pathogens (eds P.D. Bridge, Y. Couteaudier & J.M. Clarkson), pp 291-308. CAB International, Wallingford.
Camin, J.H. & Sokal, R.R. (1965). A method for deducing branching sequences in phylogeny. Evolution 19: 311-326.
Edwards, A.W.F. & Cavalli-Sforza, L.L. (1964). Reconstruction of evolutionary trees. In Phenetic and Phylogenetic Classification (eds V.H. Heywood & J. McNeill), pp 67-76. Systematics Association.
Felsenstein, J. (1981). Evolutionary trees from DNA sequences: a maximum likelihood approach. Journal of Molecular Evolution 17: 368-376.
Hibbett, D.S. (2001). Shiitake mushrooms and molecular clocks: Historical biogeography of Lentinula. Journal of Biogeography 28: 231-241.
Hillis, D.M. & Dixon, M.T. (1991). Ribosomal DNA: Molecular evolution and phylogenetic inference. The Quarterly Review of Biology 66: 411-453.
Jukes, T.H. & Cantor, C.R. (1969). Evolution of protein molecules. In Mammalian Protein Metabolism III (ed. H.N. Munro), pp 21-132. Academic Press, New York.
Karp, A., Seberg, O. & Buiatti, M. (1996). Molecular techniques in the assessment of botanical diversity. Annals of Botany 78: 143-149.
Keeling, P.J., Luker, M.A. & Palmer, J.D. (2000). Evidence from β-tubulin phylogeny that microsporidia evolved from within the fungi. Molecular Biology and Evolution 17: 23-31.
Kimura, M. (1980). A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. Journal of Molecular Evolution 16: 111-120.
Kumar, S. & Rzhetsky, A. (1996). Evolutionary relationships of eukaryotic kingdoms. Journal of Molecular Evolution 42: 183-193.
Lake, J.A. & Moore, J.E. (1998). Phylogenetic analysis & comparative genomics. In Trends Guide to Bioinformatics, pp 22-23. Elsevier Science.
Lardner, R., Johnston, P.R., Plummer, K.M. & Pearson, M.N. (1999). Morphological and molecular analysis of Colletotrichum acutatum sensu lato. Mycological Research 103: 275-285.
Lee, M.S.Y. (1999). Measuring support for phylogenies: the “Proportional Support Index”. Cladistics 15: 173-176.
Maurer, P., Couteaudier, Y., Girard, P.A., Bridge, P.D. & Riba, G. (1997). Genetic diversity of Beauveria bassiana and relatedness to host insect range. Mycological Research 101: 159-164.
Page, R.D.M. & Holmes, E.C. (1998). Molecular Evolution: A Phylogenetic Approach, 1st ed. Blackwell Science Ltd, Oxford, U.K.
Sharkey, M.J. & Leathers, W. (2001). Majority does not rule: the trouble with majority-rule consensus trees. Cladistics 17: 282-284.
Swofford, D.L. & Olsen, G.J. (1990). Phylogeny reconstruction. In Molecular Systematics (eds D.M. Hillis & C. Moritz), pp 411-501. Sinauer Press, Sunderland, MA, U.S.A.
Thompson, J.D., Higgins, D.G. & Gibson, T.J. (1994). CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research 22: 4673-4680.
Thon, M.R. & Royse, D.J. (1999). Partial β-tubulin gene sequences for evolutionary studies in the Basidiomycotina. Mycologia 91: 468-474.
Wainwright, P.O., Hinkle, G., Sogin, M.L. & Stickel, S.K. (1993). Monophyletic origins of the Metazoa: An evolutionary link with the fungi. Science 260: 340-342.
Wheeler, W. (2001). Homology and the Optimization of DNA Sequence Data. Cladistics 17: S3-S11.
White, T.J., Bruns, T., Lee, S. & Taylor, J. (1990). Amplification and direct sequencing of fungal ribosomal RNA genes for phylogenetics. In PCR Protocols. A Guide to Methods and Applications (eds M.A. Innis, D.H. Gelfand, J.J. Sninsky & T.J. White), pp 315-322. Academic Press, San Diego, U.S.A.
Yamagishi, Y., Kawasaki, K. & Ishizaki, H. (1997). Mitochondrial DNA analysis of Phialophora verrucosa. Mycoses 40: 329-334.