News & Comment
TRENDS in Genetics Vol.17 No.2 February 2001

basic number of protein types required for multicellularity. Comparisons with other plant species and examination of the overall genome structure suggest that at least one whole-genome duplication event probably occurred ~12 million years ago, yielding a tetraploid [1]. Subsequent genome reorganization and deletion produced the diploid system now seen in Arabidopsis. This putative genome duplication, in addition to various tandem gene duplications, has led to a significant increase in gene-family size. In Arabidopsis, 37.4% of the gene families have more than five members, compared with 12.1% for D. melanogaster and 24.0% for C. elegans. Gene duplications have probably established functional redundancy in many cases, explaining why many plant mutants do not have an obvious phenotype. Horizontal transfer of genes from the plastid and mitochondrial genomes to the nuclear genome has also been prevalent. The seemingly greater genome malleability in Arabidopsis might be necessary to allow for new functions in an evolving environment, because alternative promoters and alternative splicing are less common in plants.

On the basis of homology, putative gene functions have been assigned to 69% of the genes. Only 9% of the Arabidopsis genes had been identified through traditional experimentation. Comparing the Arabidopsis gene set to those of the other sequenced multicellular organisms suggests that basic intracellular processes, such as translation, are conserved, whereas intercellular processes, such as development, often use different proteins. This makes sense because the common ancestor of plants and animals was a single-celled eukaryote, with multicellularity arising independently in the two lineages [1]. Although plants and animals started with the same general nuclear gene cohort, different gene families have been differentially expanded and consolidated to fulfill various functions. For example, plants use MADS-box transcription factors for regulating spatial patterning, whereas homeobox genes fill this role in animals. Differences in intracellular processes generally seem to result from the presence of the plant cell wall.

With the completion of the Arabidopsis genome project, the research emphasis is shifting from the identification of genes to the delineation of gene functions. Of particular interest will be discovering the roles of genes with unclassified functions, and of those genes belonging to the approximately 150 plant-specific protein families. In addition, although putative functions have been assigned to many genes on the basis of homology, it is possible that they actually have different roles. Functional redundancy within the genome will need to be considered when conducting these studies. Increasing our understanding of Arabidopsis gene function will inevitably expand our ability to improve crop plants.
1 The Arabidopsis Genome Initiative (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408, 796–815
2 Theologis, A. et al. (2000) Sequence and analysis of chromosome 1 of the plant Arabidopsis thaliana. Nature 408, 816–820
3 Salanoubat, M. et al. (2000) Sequence and analysis of chromosome 3 of the plant Arabidopsis thaliana. Nature 408, 820–822
4 Tabata, S. et al. (2000) Sequence and analysis of chromosome 5 of the plant Arabidopsis thaliana. Nature 408, 823–826

Matthew R. Willmann
[email protected]

In Brief

One gene, many metabolites – does it matter?
Opponents of the genetic manipulation of plants claim that the insertion of a single characterized gene might have unpredictable consequences. The standard riposte is that breeders have been combining many unknown genes from different plants for years, so where is the problem? At last, the technology to resolve this sterile debate is at hand. In a tour de force, Lothar Willmitzer and his colleagues used gas chromatography with mass spectrometry (GC/MS) to screen hundreds of compounds in two ecotypes of Arabidopsis and in single-gene mutants of each of the ecotypes. Although the metabolic profiles of the ecotypes were quite distinct, the mutants were close to their parental ecotypes. This demonstrates that, although an alteration in a single gene can have a pleiotropic effect on a number of compounds, the perturbations introduced are smaller than the differences between two varieties of the same plant. But how many gene differences are there between ecotypes? Thanks to the Arabidopsis genome project, this is now becoming clear. A comparison of the Columbia ecotype (sequenced by a consortium of laboratories) with the Landsberg erecta ecotype [sequenced by Cereon (http://www.arabidopsis.org/cereon)] showed that ecotypes differ in thousands of genes. So, although it is true that alterations in a single gene can have more widespread effects than one might suppose from the ‘one gene, one enzyme’ dogma, breeders have been manipulating thousands of genes, producing large changes in metabolism, but with little untoward effect. It is good to see that the UK Food Standards Agency (http://www.foodstandards.gov.uk/research.htm) intends to sponsor research in genomics and metabolic profiling to assess the safety of GM plants, so maybe at last the debate about GM plants can move on.
Fiehn, O. et al. (2000) Nat. Biotechnol. 18, 1157–1161. RS

A framework for functionality
The problem with the various genome sequencing projects is the overwhelming mass of data they produce – how can the vast number of uncharacterized gene products begin to be understood? Recently, in an attempt to solve this problem, Stanley Fields and colleagues developed a framework to assign broad functions to proteins. Using published data on proteins that interact physically with each other in the yeast Saccharomyces cerevisiae, they built up a vast network of 1548 proteins linked by 2358 interactions. The network reveals global patterns – the proteins tend to cluster in groups with particular functions or subcellular localizations – and highlights some intriguing links; for instance, as well as being associated with processes of RNA splicing and RNA turnover, RNA processing proteins are also unexpectedly linked with mitosis and chromatin synthesis. The network can also be used to place a protein in a biochemical pathway or subcellular structure and predict its function on the basis of the properties of its interaction partners. Using this approach, the authors propose functions for more than 300
http://tig.trends.com 0168-9525/01/$ – see front matter © 2001 Elsevier Science Ltd. All rights reserved.