B B6 See: C57BL/6
BAC (Bacterial Artificial Chromosome) G M Weinstock Copyright © 2001 Academic Press doi: 10.1006/rwgn.2001.0098
Bacterial artificial chromosomes (BACs) are plasmids used for cloning and stably maintaining large segments of foreign DNA in Escherichia coli. This is important in various types of analyses of mammalian and other genomes. A problem in some recombinant DNA experiments is the stable maintenance of large (>100 kilobase pairs) inserts in E. coli. Typically, most plasmid vectors used for cloning foreign DNA into E. coli can stably carry DNA fragments of 10 kb or less. When the DNA inserted into a standard cloning vector exceeds this size, several problems may result:
1. The plasmid clone replicates poorly (due, for instance, to foreign sequences that adopt structures that are difficult for the E. coli apparatus to replicate, or that do not segregate evenly at cell division).
2. The cell grows slowly (due, for example, to toxic products being produced by gene expression from the foreign DNA).
3. The inserted DNA adopts structures, such as cruciforms, that are readily deleted in E. coli.
In the first two cases, clones carrying rare deletions in the insert that reduce its size and eliminate the growth problems have a growth advantage over the parental clone and eventually overgrow the culture. Thus in all these instances the large insert is unstable and hard to maintain. In 1992, Shizuya and Simon at Caltech developed vectors capable of maintaining large inserts without these problems. These vectors were plasmids that contained the origin of replication from the F factor of E. coli. The F factor is a large plasmid that is normally capable of replicating DNA molecules greater than 100 kb in length. It is a low copy number plasmid, being present in only one to two copies per cell. These two features aid in preventing problems 1 and 2 above. The mechanism of deletion of structures such as cruciforms (problem 3) requires that certain enzymes, such as nucleases, load onto DNA and may also be influenced by DNA superhelicity. Both may differ at a replication fork formed from the F origin of replication compared with forks formed from other replicons, such as those used in higher copy number vectors. BAC vectors have been engineered to have other features such as selectable antibiotic resistance genes and restriction enzyme cleavage sites for inserting or removing foreign DNA. BAC vectors have been extremely successful for cloning and maintaining mammalian DNA in E. coli. In the Human Genome Project, BAC clones from large libraries (10^5 to 10^6 clones) were first carefully mapped along each chromosome, and then the DNA sequences of a subset of these BAC clones were individually determined and assembled to provide the complete human genome sequence.
See also: DNA Cloning; Plasmids; Vectors
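As a rough illustration of why BAC libraries of the size mentioned in this entry are adequate for genome mapping, the sketch below estimates the fold coverage of a library from its clone count and mean insert size. The mean insert size (150 kb) and the haploid genome size (3.2 x 10^9 bp) are illustrative assumptions, not figures taken from this entry, and the estimate of the represented fraction assumes that clones sample the genome at random (a Poisson approximation).

```python
import math

def library_coverage(n_clones: int, insert_bp: float, genome_bp: float) -> float:
    """Fold coverage: total cloned DNA divided by the genome size."""
    return n_clones * insert_bp / genome_bp

def fraction_represented(coverage: float) -> float:
    """Probability that a given locus lies in at least one clone,
    assuming clones are drawn at random (Poisson approximation)."""
    return 1.0 - math.exp(-coverage)

GENOME_BP = 3.2e9   # assumed haploid genome size, for illustration only
INSERT_BP = 150e3   # assumed mean BAC insert size, for illustration only

for n in (10**5, 10**6):
    c = library_coverage(n, INSERT_BP, GENOME_BP)
    print(f"{n:>8} clones: {c:.1f}-fold coverage, "
          f"{fraction_represented(c):.4f} of loci represented")
```

Under these assumptions even the smaller library covers the genome several times over, which is what makes it possible to order overlapping clones along a chromosome.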
Bacillus subtilis A Danchin Copyright © 2001 Academic Press doi: 10.1006/rwgn.2001.0099
Dozens of genomes have been sequenced and many more will soon be added to the list. Unfortunately, most genomes have long runs of nucleotides that encode genes with unknown functions. Achieving an understanding of model bacteria as a means of organizing our biological knowledge (and more generally our knowledge about what constitutes life) is therefore of particular importance. Only two bacterial models are available: Escherichia coli, which is the model for gram-negative bacteria and is the best-known living organism; and Bacillus subtilis, which is the model for gram-positive bacteria. The recent controversial proposal of Gupta to classify bacteria into monoderm and diderm cell types places these two
model systems in key positions in investigations of what constitutes life. In the case of B. subtilis, most of the studies have been devoted to specific processes such as sporulation, competence/transformation/recombination, or secretion. Until recently, not much was known about the intermediary metabolism of B. subtilis. After the elucidation of its genome sequence, facts and concepts in this area increased dramatically. Apart from its importance as a model organism, B. subtilis is also widely employed in biotechnology, for example in fermentation processes (secretion of enzymes and processing of plants such as soybean). Because the sequence of its genome is now known, it is fast becoming one of the few universal models for the understanding of the requirements for life in unicellular organisms.
Bacillus subtilis and its Biotope
The objective of any living organism is to occupy a part of the earth's crust. This means, among other ancillary functions, the exploration, colonization, maintenance, and exploitation of local resources, dealing with congeners and with other organisms, etc. As a consequence, one cannot understand an organism if one does not have knowledge of its habitat. Bacillus subtilis was first identified in 1872. It is a bacterium that can be routinely obtained in pure culture by soaking hay in water for a few hours at 37 °C, then filtering and boiling for 1 h at neutral pH. Bacillus subtilis has also been isolated directly from soil-inoculated nutrient agar, where B. subtilis predominates among the outgrowing cultures. Spores are more readily obtained in solid media than in liquid media, and they require the presence of manganese ions. The bacteria produce a complex lipopeptide, surfactin, that permits them to glide very efficiently over the surface of certain types of media. This property is likely to be related to colonization of the surfaces of leaves (the 'phylloplane'), fruits or sometimes roots. Indeed B. subtilis makes up the major population of bacteria on flax stems during the retting process. Vegetative cells of B. subtilis are responsible for the early stages of breakdown of plants, and sometimes products of animal origin; some variants (e.g., B. amyloliquefaciens) cause potato tubers to rot. When conditions become unfavorable, the onset of a differentiation process, sporulation, permits the cells to generate resistant spores that can be easily dispersed throughout the environment, where they will germinate if conditions are appropriate. Unlike most other bacterial species, endospore-forming bacteria are highly resistant to the lethal effects of heat, drying, many chemicals, and radiation. In fact, one fashionable hypothesis, the origin of life on Earth by panspermia, proposed by Svante Arrhenius and more
recently Francis Crick, relies on the notion that bacterial spores such as those of B. subtilis could travel through space and survive for millions of years. Despite its appeal to a wild imagination, this hypothesis essentially puts the investigation of the origin of life out of our reach, because exploring the whole Universe is not possible.
Compartmentalization: Bacillus subtilis and its Envelopes
Envelope of the Vegetative Cell
Gram-positive bacteria and, in general, monoderms have complex envelopes comprising one bilayer lipid membrane separating the cytoplasm from the exterior of the cell. The membrane is part of a very complex structure that comprises many layers (up to 40 in the case of B. subtilis) of murein, or peptidoglycan, a complex of peptides containing D-amino acids (in particular meso-diaminopimelic acid), and amino sugars. The cell envelope also has several layers of teichoic acid (Figure 1). The possible existence of a periplasm in B. subtilis, as a distinct cell compartment bounded by the cytoplasmic membrane and the cell wall, is a controversial issue. In one approach to the question, cytoplasm, membrane, and protoplast supernatant fractions were prepared from protoplasts generated from phosphate-limited cells. The protoplast supernatant fraction was found to include cell wall-bound proteins, exoproteins in transit, and contaminating cytoplasmic proteins arising through leakage from a fraction of protoplasts. By this operational definition, 10% of the proteins of B. subtilis can be considered periplasmic.
Sporulation
Upon starvation, B. subtilis stops growing and initiates sporulation. This developmental process involves differentiation into two cell types (Figure 1). The process begins with a reorganization of the cell cycle that leads to the production of cells whose size and chromosome content are appropriate for the developmental process. The formation of the two cell types, a forespore and a mother cell twice as large as the forespore, with differing developmental fates is the first morphological indication of the early stages of sporulation in B. subtilis. Endospore formation is a multistep process that is common among bacilli. The seemingly simple endospore is the product of a very complex network of interconnected regulatory pathways that become activated during late growth in response to unbalanced nutritional shifts and cell cycle-related signals. Sporulation starts with stage 0 (vegetative growth). Symmetrical cell division, characteristic of vegetative growth, is blocked. Instead, the cell divides asymmetrically to produce a small polar prespore cell and a much larger mother cell. During stage I, asymmetrical preseptation starts. The cellular DNA takes the shape of an axial filament. At stage II, septation proceeds and the daughter chromosomes are separated. Spore development follows at stage III (engulfment of the forespore and complete separation of the spore membrane from that of the mother cell). Stage IV involves formation of the spore cortex. In stage V spore coat proteins are synthesized and assembled. At stage VI the spore becomes highly refractile under the microscope, and it acquires heat and stress resistance. Finally, the programmed death of the mother cell occurs, leading to lysis and release of the mature spore (stage VII). Pigments are produced that stain the spores from reddish brown to blackish brown (black in the presence of tyrosine).
Figure 1 (See Plate 2) Electron micrograph of Bacillus subtilis in the process of sporulation.
Our understanding of sporulation control in B. subtilis is extensive. The process combines phosphorylation cascades mediated by kinases and phosphatases with a network of transcription controls by sigma factors, together with membrane-bound effector molecules that control compartmentalization (Figure 2). Despite the intensive work of hundreds of scientists on many of the signals involved in the onset and control of sporulation, some still remain unknown. The spore coat is a complex envelope comprising several layers of spore coat proteins that protect the almost entirely desiccated interior of the spore, where DNA is compacted and protected from the harmful influence of the environment. Under conditions of appropriate moisture, in media that contain alanine, glucose, and minerals, spores are able to germinate. This process involves swelling and a complex lytic process that opens and sometimes degrades the coating envelope, during which time metabolism is initiated. Cells then resume normal vegetative growth.
Figure 2 The stages of sporulation. The various sigma transcription factors that control gene expression processes during sporulation are indicated within the compartments where they operate. [Panel labels: Stage 0; Stage I; Stage II, septation; Stage III, engulfment; Stage IV, cortex; Stage V, coat; Stages VI-VII, maturation and lysis; free spore; germination; outgrowth. Sigma factors shown: σA, σH, σF, pro-σE, σE, σG, pro-σK, σK.]
Quorum-Sensing and Chemotaxis
It has long been known that bacteria form colonies on agar plates. If the medium is appropriate these colonies give rise to bacterial swarming. In the late 1960s, it was observed that cultures of Vibrio fischeri (a luminescent gram-negative bacterium that colonizes squid) remained nonluminescent during the first
hours of growth, during which time the number of cells increased. Luminescence appeared when the population reached a significant density, at a moment when the bacteria ran out of nutrients. This collective behavior meant that a bacterial function was expressed at a certain cell density: the organisms in the population were sensing each other. This was termed 'quorum sensing.' A variety of processes are regulated in a cell density or growth phase-dependent manner in gram-positive bacteria. In the early 1990s quorum sensing was discovered in B. subtilis and was certainly linked to sporulation (to swarm or to sporulate, that is the question), but the functional reason(s) for the existence of the process are not yet known. Most bacteria that use quorum sensing systems inhabit an animal or plant. The microorganisms benefit from the process, but the host organism may or may not. Each bacterium produces small diffusible molecules that allow cell-to-cell communication. As the population of bacteria increases, so does the concentration of the signaling molecules. Sensors recognize these molecules. Once the local concentration in the medium has reached a threshold value, the sensor proteins transmit a signal to a transcriptional regulator. Examples of such quorum-sensing modes are the development of genetic competence in B. subtilis and Streptococcus pneumoniae, the virulence response in Staphylococcus aureus, and the production of antimicrobial peptides by several species of gram-positive bacteria, including lactic acid bacteria. A variety of ways for bacterial populations to coordinate their activities have been discovered. Cell density-dependent regulation in these systems appears to follow a common theme. First, the signal molecule (a posttranslationally processed peptide pheromone) is secreted by a dedicated ATP-binding cassette (ABC) exporter.
The role of the secreted peptide pheromone is to function as the input signal for a specific sensor component of a two-component signal-transduction system. Coexpression of the elements involved in this process results in self-regulation of peptide pheromone production. Peptides are secreted and processed under various conditions that are further recognized by the cell. Next, in response to pheromone, cells swim in a coordinated fashion, thereby forming a kind of wall surrounding rings of bacteria having the same exploration behavior (Figure 3). Bacillus subtilis is a highly motile bacterium, endowed with a complex flagellar machinery. This permits cells to swim toward nutrients or away from repellents. Many genes similar to those known in motile bacteria are found in B. subtilis, making it likely that the tumbling and swimming processes function similarly to those of E. coli. One can expect that this behavior
permits cells to invade and colonize the surface of leaves, where they can find nutrients (especially carbon and nitrogen sources as well as micronutrients) secreted by the plant or by decaying leaves. The bacteria secrete antibiotics that permit them to outcompete other organisms; for example, the products of the pks genes act against Agrobacterium species. This establishes a cooperation between the plant and the bacteria that is commensalism rather than symbiosis.
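The threshold logic of quorum sensing described in this section can be caricatured in a few lines. The toy model below assumes that the signal concentration is simply proportional to cell number, that the population doubles at each step, and that induction is an all-or-none switch at a fixed threshold. All function names and numerical values are hypothetical and chosen only for illustration; real systems involve signal processing, transport, and degradation that are ignored here.

```python
def quorum_time_course(n0, doubling_steps, signal_per_cell, threshold):
    """Return a list of (step, cells, signal, induced) tuples for a
    population that doubles at every step. Signal is assumed to be
    proportional to cell number (a deliberate simplification)."""
    rows = []
    cells = n0
    for step in range(doubling_steps + 1):
        signal = cells * signal_per_cell
        rows.append((step, cells, signal, signal >= threshold))
        cells *= 2
    return rows

def first_induced_step(rows):
    """Return the first step at which the threshold is reached, or None."""
    for step, _cells, _signal, induced in rows:
        if induced:
            return step
    return None

# Hypothetical numbers: 1000 starting cells, arbitrary signal units.
rows = quorum_time_course(n0=1_000, doubling_steps=12,
                          signal_per_cell=1e-6, threshold=1.0)
for step, cells, signal, induced in rows:
    print(f"step {step:2d}  cells {cells:8d}  signal {signal:6.3f}  "
          f"{'ON' if induced else 'off'}")
print("first induced at step", first_induced_step(rows))
```

The point of the sketch is only that the response is silent for many generations and then switches on once the population, and hence the signal, crosses the threshold.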
Protein Secretion
Bacillus subtilis is one of the organisms of choice in the study of protein secretion. At the time of this review, many fundamental aspects of this process are not yet understood. Several systems enable proteins to be inserted into the membrane and/or to be located outside of the membrane or secreted into the surrounding medium. In B. subtilis, the Sec-dependent pathway (one that recognizes signal peptides) has at least five different signal peptide peptidases. Proteins that are periplasmic in gram-negative bacteria are also found in B. subtilis, presumably as lipoproteins (i.e., possessing a specific signal peptide, cleaved upstream of a cysteine residue that is covalently coupled to the outer lipid layer of the cell membrane upon cleavage). The signal recognition particle (SRP) system is an oligomeric complex that mediates targeting and insertion of proteins into the cytoplasmic membrane. SRP consists of a 4.5S RNA and several protein subunits. One of these subunits, Ffh, interacts with the signal sequence of nascent polypeptides. The N-terminal residues of Ffh include a GTP-binding site (G-domain) and are evolutionarily related to similar domains in other proteins. A second protein, the counterpart of the E. coli FtsY protein, is believed to play a role similar to that of the docking protein in eukaryotes. Finally, it appears that some B. subtilis secreted proteins are made of two parts. The first part remains inserted in the membrane, presumably as a permease, and the second part is liberated in the surrounding medium after cleavage by an unknown protease.
Metabolism
In addition to the need for compartmentalization, living cells must chemically transform some molecules into others. Metabolism is the hallmark of life. Cells can be in a dormant state, as is the case with spores, for example, but one cannot be sure that they are living organisms unless, at some point, they initiate metabolism. In general, one distinguishes between primary metabolism (the transformation of molecules that support cell growth and energy production) and secondary metabolism (transactions involving molecules that are not necessary for survival and multiplication, but assist in the exploration and occupation of biotopes, e.g., antibiotic synthesis).
Figure 3 Competence is triggered when Bacillus subtilis encounters a signal from the environment and when an appropriate quorum is reached, monitored by pheromones synthesized by the bacteria. The chain of events is depicted. A sensor controls a regulator which, through a phosphorylation cascade, controls transcription. The onset of sporulation negatively controls competence under appropriate conditions through the action of the protein Spo0K. [Diagram labels: outside, cell membrane, inside; processing; PEP19; oligopeptide permease; modification (?); Phr; PEP6; RapA phosphatase; KinA, KinA-P; ATP, ADP; Spo0F, Spo0F-P; Spo0B, Spo0B-P; Spo0A, Spo0A-P; Spo0E; Pi.]
Transport of Basal Cell Atoms
Carbon, oxygen, nitrogen, hydrogen, sulfur, and phosphorus are the core atoms of life. Electron transfers and catalytic processes, as well as the generation of electrochemical gradients, require many other atoms in the form of ions. Metabolic processes allow the cell to concentrate, modify, and excrete ions and molecules that are necessary in energy management, growth and cell division. Nutrients and ions are transported into cells by a number of more or less specific permeases, most of which belong to the ABC permease category. In B. subtilis these permeases generally comprise a binding lipoprotein responsible for part of the specificity, located at the external surface of the membrane, an integral membrane channel made of proteins of two different types, and a dimeric, membrane-bound cytoplasmic complex, which binds and hydrolyzes ATP as the energy source. For positively charged ions, selectivity is the most important feature of the permease, because the electrochemical gradient is oriented toward the interior of the cell (negative inside). Ions must be concentrated from the environment until they reach the concentration required for proper activity,
but must not reach inhibitory levels. Apart from iron, which is scavenged from the environment with highly selective siderophores synthesized in response to iron limitation, manganese is the most important transition metal ion for B. subtilis. It is required for many enzyme activities, such as superoxide dismutase, agmatinase, phosphoglycerate mutase, pyrophosphatase, etc. Copper is important for electron transfer and cobalt is required by the important recycling protein methionine aminopeptidase. Nickel is required by urease, zinc is needed as a cofactor of polymerases and dehydrogenases, and magnesium is involved in catalytic complexes with substrate in about one-third of enzyme reactions. Potassium is needed to construct the electrochemical gradient of the cell's cytoplasm, and is a likely cofactor in many reactions. Calcium is probably needed in major reactions during the division cycle, but the importance of this ion still remains a mystery. Anions are also important, and they need to be imported against a strong electrochemical gradient. Phosphate in particular requires a set of highly involved transport systems. For B. subtilis a main source of phosphate is probably phytic acid, a slowly degraded phosphate-rich molecule. Sulfate is the precursor of many important coenzymes in addition to cysteine and methionine, but not much is yet known about its transport and metabolism, except
by comparison with the counterparts known to be present in E. coli.
Intermediary Metabolism
Carbon and nitrogen metabolism in B. subtilis follow the general rules of intermediary metabolism in aerobic bacteria, with a complete glycolytic pathway and a tricarboxylic acid cycle. Electron transfer to oxygen is mediated by a set of cytochromes and cytochrome oxidases, allowing efficient respiration in B. subtilis. This organism is generally said to be a strict aerobe, and indeed it respires very efficiently. However, it can grow in the absence of molecular oxygen, provided that appropriate electron acceptors such as nitrate are present in the environment. Coupled to electron transfer, a proton addition between NAD(P) and NAD(P)H occurs. Bacillus subtilis does not possess a transhydrogenase that could equilibrate the pools of NADH and NADPH. Therefore, because the enzymes using NAD and NADP often differ, there must be a means of equilibrating the corresponding pools of reduced molecules. As expected from its vegetal biotope, B. subtilis can grow on many of the carbohydrates synthesized by plants. In particular, sucrose can function as a major carbon source in this organism, via a very complicated set of highly regulated pathways. As in many other eubacteria, the phosphoenolpyruvate-dependent phosphotransferase system (PTS) plays a major role in carbohydrate transport and regulation. Catabolite repression control, mediated by a unique system involving specific factors (and no cyclic AMP), exists in this organism. Some knowledge about nitrogen metabolism in B. subtilis has accumulated, but significantly less than for its E. coli counterpart. Many nitrogenous compounds, such as arginine or histidine, can be transported and used by B. subtilis. A specific transcription factor controls nitrogen availability. Amino acid biosynthesis is not yet well documented, but purine and pyrimidine metabolism is well understood. In B. subtilis, in contrast to E. coli, there are two carbamoylphosphate synthases: one specific for arginine synthesis and the other for pyrimidine synthesis.
As in other living organisms, the ubiquitous polyamines putrescine and spermidine play a fundamental, yet enigmatic, role. They arise via the decarboxylation of arginine to agmatine, coupled to a manganese-containing agmatinase, and not from decarboxylation of ornithine, as in higher eukaryotes.
Special Environmental Conditions
Another aspect of the B. subtilis life cycle is that it can grow over a wide range of different temperatures up to 54 to 55 °C. This indicates that its biosynthetic machinery comprises control elements and molecular chaperones
that permit this versatility. Specific transcription control processes allow the cell to adapt to changes of temperature by transiently synthesizing heat shock or cold shock proteins, according to the environmental conditions. In addition, gene duplication may permit adaptation to high temperature, with isozymes having low and high temperature optima. As a case in point, B. subtilis has two thymidylate synthases. The one coded by the thyA gene is thermostable (and more related to the archaebacterial type) and the other, ThyB, is thermosensitive. Bacillus subtilis is also able to adapt to strong osmotic stresses, such as the one that occurs during dehydration, and can adapt to high oxygen concentrations and changes in pH. Not much is yet known about the corresponding genes and regulation. Because the ecological niche of B. subtilis is linked to the plant kingdom, it is subjected to rapidly alternating drying and wetting. Accordingly, this organism is very resistant to osmotic stress and can grow well in media containing 1 M NaCl; indeed, B. subtilis has been recovered from sea water.
Secondary Metabolism
In Bacillus species, starvation leads to the activation of a number of processes that affect the ability to survive during periods of nutritional stress. Capabilities that are induced include competence and sporulation, the synthesis of degradative enzymes, motility, and antibiotic production. Some genes in these systems are activated during the transition from exponential to stationary growth. They are controlled by mechanisms that operate primarily at the level of transcription initiation. One class of genes functions in the synthesis of special metabolites such as peptide antibiotics, as well as the cyclic lipopeptide surfactin. These genes include the srfA operon, which codes for the enzymes of the surfactin synthetase complex, and the pks operon, presumably controlling synthesis of polyketides. Several antifungal antibiotics, some of which are used in agriculture, are produced by B. subtilis strains, indicating that competition with fungi is probably a major feature of the B. subtilis biotope. Peptide or polyketide antibiotic biosynthesis genes are regulated by factors as diverse as the early sporulation gene product Spo0A, the transition-state regulator AbrB, and gene products such as ComA, ComP, and ComQ, required for the initiation of the competence developmental pathway.
Information Transfer: B. subtilis Genome and its Organization
The complete sequence (4 214 820 bp) of the B. subtilis genome (strain 168) was published in November 1997, and further corrected after several rounds of sequence verification. The reference specialized database, SubtiList, updates the genome sequence and annotation as work on B. subtilis progresses throughout the world. Of the more than 4100 protein-coding genes, 53% are represented once. A quarter of the genome corresponds to several gene families that have been greatly expanded by gene duplication, the largest of which is a family containing 77 putative ATP-binding cassette permeases.
Features of Genome Sequence
Analysis of repeated sequences in the genome demonstrated that strain 168 does not contain insertion sequences. A strict constraint on the spatial distribution of repeats longer than 25 bp was found in the genome, in contrast to the situation in E. coli. This was interpreted as a hallmark of selective processes leading to the insertion of new genetic information into the genome. Such insertion appears to rest on the uptake of nonspecific DNA by the competent cell and its subsequent integration in the chromosome in a circular form through a Campbell-like mechanism. Similar patterns are found in other competent genomes of gram-negative bacteria as well as Archaea, suggesting a similar evolutionary mechanism. The correlation of the spatial distribution of repeats and the absence of insertion sequences in the genome suggests that mechanisms aiming at their avoidance and/or elimination have been developed. Knowledge of whole genome sequences allows one to investigate the relationships between genes and gene products at a global level. Although there is generally no predictable link between the structure and function of biological objects, the pressure of natural selection has created some fitness among genes, gene products, and survival. Biases in features of processes that would be expected to be unbiased are evidence for prior selective pressure. In the case of B. subtilis one observes a strong bias in the polarity of transcription with respect to replication: 70% of the genes are transcribed in the direction of the replicating fork movement. Global analysis of oligonucleotides in the genome demonstrated that there is a significant bias, not only in the base or codon composition of one DNA strand with respect to the other, but, quite surprisingly, also at the level of the amino acid content of the proteins. The proteins coded by the leading strand are valine-rich, and those coded by the lagging strand are threonine- and isoleucine-rich.
This first law of genomics seems to extend to most bacterial genomes. It must result from a strong selection pressure of a yet unknown nature.
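The transcription polarity bias described above can be measured from a gene list once each gene's position and strand are known. The sketch below is a minimal illustration under stated assumptions: a circular chromosome with the origin at coordinate 0, a terminus placed by default at half the genome length, and a convention that '+' genes are transcribed in the direction of increasing coordinate. The gene list is invented toy data, not B. subtilis annotation.

```python
from typing import Iterable, Optional, Tuple

Gene = Tuple[int, str]  # (start coordinate in bp, strand '+' or '-')

def is_codirectional(position: int, strand: str,
                     genome_len: int, terminus: int) -> bool:
    """True if the gene is transcribed in the direction of fork movement.
    Assumes the origin is at coordinate 0: one fork travels through
    increasing coordinates up to the terminus, the other travels the
    opposite way around the circle."""
    on_increasing_arm = (position % genome_len) < terminus
    return (strand == '+') == on_increasing_arm

def codirectional_fraction(genes: Iterable[Gene], genome_len: int,
                           terminus: Optional[int] = None) -> float:
    """Fraction of genes transcribed co-directionally with replication."""
    genes = list(genes)
    if not genes:
        raise ValueError("gene list is empty")
    if terminus is None:
        terminus = genome_len // 2  # assumed symmetrical terminus
    hits = sum(is_codirectional(p, s, genome_len, terminus) for p, s in genes)
    return hits / len(genes)

# Invented toy data on a 1000 bp circle, for illustration only.
TOY_GENES = [(50, '+'), (100, '+'), (200, '+'), (300, '-'), (450, '+'),
             (550, '-'), (600, '-'), (700, '-'), (800, '+'), (900, '+')]
print(codirectional_fraction(TOY_GENES, genome_len=1000))
```

Applied to a real annotation, the same calculation would require the experimentally determined origin and terminus coordinates rather than the symmetrical default assumed here.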
Codon Usage and Organization of the Cell's Cytoplasm
Because the genetic code is redundant, coding sequences exhibit highly variable patterns of codon
usage. If there were no bias, all codons for a given amino acid should be used more or less equally. The genes of B. subtilis have been split into three classes on the basis of their codon usage bias. One class comprises the bulk of the proteins, another is made up of genes that are expressed at a high level during exponential growth, and a third class, with AT-rich codons, corresponds to portions of the genome that have been horizontally exchanged. What is the source of such biases? Random mutations would be expected to have smoothed out any differences, but this is not the case. There are also systematic effects of context, with some DNA sequences being favored or selected against. The cytoplasm of a cell is not a tiny test tube. One of the most puzzling features of the organization of the cytoplasm is that it accommodates the presence of a very long thread-like molecule, DNA, which is transcribed to generate a multitude of RNA threads that usually are as long as the whole cell. If mRNA molecules were left free in the cytoplasm, all kinds of knotted structures would arise. There must exist, therefore, some organizational principles that prevent mRNA molecules and DNA from becoming entangled. Several models, supported by experiments, postulate an arrangement where transcribed regions are present at the surface of a chromoid, in such a way that RNA polymerase does not have to circumscribe the double helix during transcription. Compartmentalization is important even for small molecules, despite the fact that they can diffuse quickly. In a B. subtilis cell growing exponentially in rich medium, the ribosomes occupy more than 15% of the cell's volume. The cytoplasm is therefore a ribosome lattice, in which the local diffusion rates of small molecules, as well as macromolecules, are relatively slow. Along the same lines, the calculated protein concentration of the cell is ca. 100 to 200 mg/ml, a very high concentration.
The translational machinery requires an appropriate pool of elongation factors, aminoacyl-tRNA synthetases, and tRNAs. Counting the number of tRNA molecules adjacent to a given ribosome, one conceptualizes a small, finite number of molecules. As a consequence, a translating ribosome is an attractor that acts upon a limited pool of tRNA molecules. This situation provides a form of selective pressure, whose outcome would be adaptation of the codon usage bias of the translated message as a function of its position within the cytoplasm. If codon usage bias were to change from mRNA to mRNA, these different molecules would not see the same ribosomes during the life cycle. In particular, if two genes had very different codon usage patterns, this would predict that the corresponding mRNAs are not formed within the same sector of the cytoplasm.
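Codon usage bias of the kind discussed in this section is commonly quantified as relative synonymous codon usage (RSCU): the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below is one way to compute it, assuming the standard genetic code; it is not necessarily the measure that was used to define the three gene classes described in this entry, and the example sequence is invented.

```python
from collections import Counter
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def codon_counts(cds: str) -> Counter:
    """Count in-frame codons; a trailing partial codon is ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))

def rscu(cds: str) -> dict:
    """Relative synonymous codon usage for every codon of each amino acid
    that occurs in the sequence. A value of 1.0 means no bias."""
    counts = codon_counts(cds)
    synonyms = {}
    for codon, aa in CODON_TABLE.items():
        synonyms.setdefault(aa, []).append(codon)
    values = {}
    for aa, codons in synonyms.items():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from this sequence
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values

# Invented example: four valine codons, three of them GTT.
example = rscu("GTTGTTGTTGTA")
print({codon: example[codon] for codon in ("GTT", "GTC", "GTA", "GTG")})
```

Comparing RSCU profiles between genes gives a simple way to ask whether two messages draw on similar tRNA pools, which is the question raised in the paragraph above.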
When mRNA threads are emerging from DNA they become engaged by the lattice of ribosomes, and they ratchet from one ribosome to the next, like a thread in a wiredrawing machine (note that this is exactly opposite to the view of translation presented in textbooks, where ribosomes are supposed to travel along fixed mRNA molecules). In this process, nascent proteins are synthesized on each ribosome, and spread throughout the cytoplasm by the linear diffusion of the mRNA molecule from one ribosome to the next. However, when mRNA disengages from DNA, the transcription complex must sometimes break up. Broken mRNA is likely to be a dangerous molecule because, if translated, it would produce a truncated protein. Such protein fragments are often toxic, because they can disrupt the architecture of multisubunit complexes (this explains why many nonsense mutants are dominant negative, rather than recessive). There exists a process that copes with this kind of accident in B. subtilis. When a ribosome reaches the end of a prematurely terminated mRNA molecule, it stops translating, does not dissociate, and waits. A specialized RNA, tmRNA, which is folded and processed at its 3′ end like a tRNA and charged with alanine, comes in, inserts its alanine at the C-terminus of the nascent polypeptide, then replaces the mRNA within the ribosome, where it is translated as ASFNQNVALAA. This tail is a protein tag that is then used to direct the truncated protein to a proteolytic complex (ClpA, ClpX), where it is degraded. The organization of the ribosome lattice, coupled to the organization of the transcribing surface of the chromoid, ensures that mRNA molecules are translated parallel to each other, in such a way that they do not make knots. Polycistronic operons ensure that proteins having related functions are coexpressed locally, permitting channeling of the corresponding pathway intermediates.
In this way, the structure of mRNA molecules is coupled to their fate in the cell, and to their function in compartmentalization. Genes translated sequentially in operons are physiologically and structurally connected. This is also true for mRNAs that are translated parallel to each other, suggesting that several RNA polymerases are engaged in the transcription process simultaneously, yoked as draft animals. Indeed, if there is correlation of function and/or localization in one dimension, there exists a similar constraint in the orthogonal directions. Because ribosomes attract tRNA molecules, they bring about a local coupling between these molecules and the codons being translated. This predicts that a given ribosome would preferentially translate mRNAs having similar patterns of codon usage. As a consequence, as one moves away from a strongly biased ribosome, there would be less and less availability of the most biased tRNAs. This creates a selection pressure
for a gradient of codon usage as one goes away from the most biased messages and ribosomes, nesting transcripts around central core(s), formed of transcripts for highly biased genes. Finally, ribosome synthesis creates a repulsive force that pushes DNA strands away from each other, in particular from regions near the origin of replication. Together these processes result in a gene gradient along the chromosome, which is an important element of the architecture of the cell.
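The notion of two messages having "similar patterns of codon usage" can be made concrete with a small calculation. The following is a minimal sketch, not a method taken from this article: the function names `codon_frequencies` and `usage_distance` are hypothetical, and the distance used (half the sum of absolute frequency differences) is only one illustrative choice among many possible measures of codon usage divergence.

```python
from collections import Counter


def codon_frequencies(cds):
    """Return the relative frequency of each codon in a coding sequence.

    Trailing bases that do not fill a complete codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: n / total for codon, n in counts.items()}


def usage_distance(cds_a, cds_b):
    """Illustrative distance between two codon usage patterns.

    Assumption: half the sum of absolute frequency differences,
    giving 0.0 for identical usage and 1.0 for disjoint codon sets.
    """
    fa = codon_frequencies(cds_a)
    fb = codon_frequencies(cds_b)
    all_codons = set(fa) | set(fb)
    return 0.5 * sum(abs(fa.get(c, 0.0) - fb.get(c, 0.0)) for c in all_codons)
```

Under the hypothesis described above, genes with a small distance to the most highly biased transcripts would be expected to lie closer to the central core(s), although testing that prediction requires positional data that this sketch does not model.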
Information Transfer
The DNA polymerase complex of B. subtilis is attached to the membrane. During replication, the DNA template moves through the polymerase. This might be caused in part by the formation of planar hexagonal layers of DnaC, the homolog of E. coli DnaB helicase. The B. subtilis chromosome starts replicating at a well-defined Ori site, and terminates in a symmetrical region, probably using a recombinational process to resolve the knotted structure at the terminus. This may account for the presence of horizontally exchanged genetic material (prophages in particular) near the terminus. Transcription in B. subtilis is similar to that in other eubacteria. The major RNA polymerase is a holoenzyme made up of four subunits (two α, β, β′) and a sigma factor. Eighteen sigma factors have been identified within the genomic sequence. Apart from σ54, which is specialized for the control of nitrogen metabolism, the other σ factors specifically control specialized processes such as sporulation, stress response, or chemotaxis and motility. Translation in B. subtilis is typical of eubacterial translation. A new type of control of the synthesis of aminoacyl-tRNA synthetases was discovered in B. subtilis. Most aminoacyl-tRNA synthetase genes belong to the so-called T-box family of genes. They are regulated by a common mechanism of transcriptional antitermination. Each gene is induced by limitation for a specific amino acid; the uncharged cognate tRNA is the effector that induces transcription of the full-length message. The mRNA leader regions of the genes in this family share a number of conserved primary sequence and secondary structural elements, some of which are involved in binding the uncharged tRNA molecule.
Horizontal Gene Transfer and Phylogeny
Three principal modes of transfer of genetic material, namely transformation, conjugation, and transduction, occur naturally in prokaryotes. In B. subtilis there is not much evidence for conjugation processes (although DNA can be conjugated into the organism), but transformation is an efficient process (at least in some B. subtilis strains such as the Marburg strain 168) and transduction with the appropriate carrier phages is well understood.
Bacillus subtilis Phages
An unexpected result that emerged from an analysis of the B. subtilis genome sequence was that it harbors at least 10 prophages or prophage-like elements. While the lysogenic SPbeta phage, as well as the defective PBSX and skin elements, was known to be present, no other phage had been identified. Many phages, however, can utilize B. subtilis as a host, in particular phi-29, phi-105, SPO1, SPP1, beta 22, or SF6, but the details of their biology are generally not well documented. Bacteriophage PBS1, or the phages IG1, IG3, and IG4, can perform specialized transduction. Among the remarkable features of the phage genomes is the presence of introns or inteins, especially in genes involved in modulating DNA synthesis by the host. A three-dimensional reconstruction of phage phi-29 and its empty prohead precursor has been performed using cryoelectron microscopy. The head-tail connector, which is the central component of the DNA packaging machine, has been visualized in situ. The connector, with 12- or 13-fold symmetry, appears to fit loosely into a pentameric vertex of the head, a symmetry mismatch that may be required to rotate the connector to package DNA. An RNA molecule, pRNA, is required in the form of a hexamer to package DNA in the capsid.
Competence and Transformation
In addition to sporulation, B. subtilis has another developmental process, competence, which can lead to genetic transformation. Interconnected regulatory networks control the initiation of sporulation and the development of genetic competence. These two developmental pathways have both common and unique features and make use of similar regulatory strategies. This explains why, before the genome of B. subtilis was sequenced, the vast majority of experiments using this organism dealt with these processes. Quorum sensing, used by cells to monitor local cell density, controls the transformation competence of B. subtilis. This control system is one of the 11 phosphorylation cascades comprising a regulatory aspartate phosphatase that have been discovered in the strain 168 genome.
Recombination
The presence in the B. subtilis genome of local repeats, suggesting Campbell-like integration of foreign DNA, is consistent with a strong involvement of recombination processes in its evolution. In addition, recombination must be involved in mutation
correction. It is therefore interesting to analyze the proofreading systems at the level of replication. In B. subtilis, MutS and MutL homologs exist, presumably for the purpose of recognizing mismatched base pairs. But no MutH activity could be identified that would allow the daughter strand to be distinguished from its parent. It is therefore not yet known how long-patch mismatch repair corrects mutations in the proper strand. Excision of misincorporated uracil instead of thymine might be a general process that would not require extra information.
Restriction-Modification Systems
Bacillus subtilis strains contain many restriction-modification systems, mostly of type II, many of which were probably transferred from phages. The sequence specificities of several restriction-modification systems are known: BsuM (CTCGAG); BsuE (CGCG); BsuF (CCGG); BsuRI (GGCC); and BsuBI, which is similar to the PstI system. BsuC is a type I system, which is very similar to the ones found in enterobacteria.
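The recognition sequences listed above can be located in a DNA sequence with a simple string scan. The following is a minimal sketch under stated assumptions: the dictionary `SITES` and the function `find_sites` are hypothetical names introduced for illustration, only the four explicitly listed sequences are included, and the scan reports where a motif occurs, not where or whether the enzyme cuts. Because all four motifs are palindromic (each equals its own reverse complement), scanning one strand is sufficient.

```python
# Recognition sequences as given in the text; names are used as plain labels.
SITES = {
    "BsuM": "CTCGAG",
    "BsuE": "CGCG",
    "BsuF": "CCGG",
    "BsuRI": "GGCC",
}


def find_sites(seq, sites=SITES):
    """Return {system name: [0-based start positions]} for each motif.

    Overlapping occurrences are reported, since the search resumes
    one base after each hit.
    """
    seq = seq.upper()
    hits = {}
    for name, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)
        hits[name] = positions
    return hits
```

For example, `find_sites("AACTCGAGTTGGCCGG")` reports one BsuM motif, one BsuF motif, one BsuRI motif, and no BsuE motif.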
Phylogeny
Bacillus subtilis is a typical gram-positive eubacterium. As such it is significantly more similar to Archaea than is E. coli. Many metabolic genes have a distinct archaeal flavour, in particular genes involved in the synthesis of polyamines, but it is rare to find genes in B. subtilis that are similar to eukaryotic genes. This led Gupta to propose that ancestral bacteria comprised a monoderm organism that diverged into gram-positive bacteria and Archaea, and that gram-positive bacteria further led to gram-negative bacteria with their typical double membrane (diderms). This hypothesis stirred a very heated, but interesting, debate about the origin of the first cell(s). Bacilli form a heterogeneous family of bacteria that can be split into at least five distinct groups. Bacillus subtilis is part of group 1 and is strongly linked to B. licheniformis (which is often found on the cuticle of insects), and to the group of animal pathogens formed by B. thuringiensis, B. cereus, and B. anthracis. In this classification B. sphaericus is typical of group 2, B. polymyxa of group 3, and B. stearothermophilus of group 5. The pathogen Listeria monocytogenes (in between groups 2 and 5) is related to B. subtilis, and, indeed, its genome has many features in common with that of B. subtilis. Accordingly, B. subtilis is an excellent model for these groups of bacteria.
Industrial Processes
As a model organism, B. subtilis possesses most of the functions that one would expect to find in bacteria.
It is an organism Generally Recognized As Safe (GRAS). This explains why it is a source of many products synthesized by the agro-food industry. Bacillus subtilis has often been thought to be a desirable host for foreign gene expression or fermentation, and it is commonly used at the industrial level for both enzyme production (amylase, proteases, etc.) and food supply fermentation (Bacillus natto, a close relative of B. subtilis, is used in Japan to ferment soybean, producing the popular 'natto'). Riboflavin is derived from genetically modified B. subtilis using fermentation techniques. For some time, high levels of heterologous gene expression in B. subtilis were difficult to achieve. Knowledge of the genome allowed identification of one of the major bottlenecks in this process: although it has a counterpart of the rpsA gene, this organism lacks the function of the corresponding ribosomal S1 protein that permits recognition of the ribosome binding site upstream of the translation start codons. In general, gram-positive bacteria have transcription and translation signals that must comply with much more stringent rules than those of gram-negative bacteria. Traditional techniques (e.g., random mutagenesis followed by screening; ad hoc optimization of poorly defined culture media) are important and will continue to be utilized in the food industry. But modern biotechnology now includes genomics, which adds the possibility of targeting genes constructed in vitro to precise positions, as well as of modifying intermediary metabolism. As a complement to standard genetic engineering and transgenic technology, this has opened up a whole new range of possibilities in food product development, in particular allowing 'humanization' (i.e., adaptation to the human metabolism and even adaptation to sickness or health) of the content of food products.
These techniques provide an attractive means of producing healthier food ingredients and products that are presently not available or are very expensive. Bacillus subtilis will remain a tool of choice in this respect.
Conclusion: Open Questions
The complete genome sequence of B. subtilis contains information that remains underutilized in the current prediction methods applied to gene functions, most of which are based on similarity searches of individual genes. Methods that utilize higher level information on molecular pathways to reconstruct a complete functional unit from a set of genes have been developed. The reconstruction of selected portions of the metabolic pathways using the existing biochemical knowledge of similar gene products has been undertaken. But it often remains necessary to validate such in silico (using computers) reconstruction by in vivo and in vitro experiments. The completeness of a reconstructed pathway (i.e., no missing reaction) is an indicator of the correctness of the initial functional assignment. The core biosynthetic pathways of all 20 amino acids have been completely reconstructed in B. subtilis. However, many satellite or recycling pathways have not been identified yet. Finally, there remain at least 800 completely unknown genes in the genome of strain 168. Functional genomics is aimed at identifying their role.
Further Reading
Bron S, Bolhuis A, Tjalsma H et al. (1998) Protein secretion and possible roles for multiple signal peptidases for precursor processing in bacilli. Journal of Biotechnology 64: 3–13.
Gupta RS (1998) Protein phylogenies and signature sequences: a reappraisal of evolutionary relationships among archaebacteria, eubacteria, and eukaryotes. Microbiology and Molecular Biology Reviews 62: 1435–1491.
Kunst F, Ogasawara N, Moszer I et al. (1997) The complete genome sequence of the gram-positive bacterium Bacillus subtilis. Nature 390: 249–256.
Perego M (1998) Kinase–phosphatase competition regulates Bacillus subtilis development. Trends in Microbiology 6: 366–370.
Sonenshein AL, Hoch JA and Losick R (eds) (1993) Bacillus subtilis and Other Gram-Positive Bacteria: Biochemistry, Physiology and Molecular Genetics. Washington, DC: ASM Press.
Reference
SubtiList Database. http://genolist.pasteur.fr/SubtiList/
See also: Archaea, Genetics of; Bacterial Genetics; Bacterial Transcription Factors; Codon Usage Bias; Escherichia coli
Backcross L Silver Copyright © 2001 Academic Press doi: 10.1006/rwgn.2001.0100
Backcross is the term for a cross between a class of organisms that is heterozygous for alternative alleles at a particular locus under investigation and a second class that is homozygous for one of these alleles. The term is often used by itself to describe a two-generation breeding protocol that begins with a cross between two inbred strains to produce F1 hybrid offspring (see F1 Hybrid). These F1 hybrid offspring are heterozygous at numerous loci throughout the genome. The F1 organisms are 'backcrossed' to organisms from one of the original parental strains to obtain a second-generation population of organisms in which segregation and assortment of alleles occurs independently