Movable Finite Automata (MFA) models for biological systems I: Bacteriophage assembly and operation


J. theor. Biol. (1988) 131, 351-385

Movable Finite Automata (MFA) Models for Biological Systems I: Bacteriophage Assembly and Operation

RICHARD L. THOMPSON‡ and NARENDRA S. GOEL†

Department of Systems Science, State University of New York, Binghamton, New York 13901, U.S.A.

(Received 10 June 1987, and in revised form 25 November 1987)

A new class of models, called Movable Finite Automata (MFA) models, is introduced. MFA models are physically realistic, but still share some of the features of cellular automata that make the latter easy to handle mathematically and computationally. They are found to be quite versatile in modeling the self-organization of biological systems. Their use in simulating the interaction of protein molecules in the self-assembly and operation of the T4 bacteriophage is described. The results of these simulations, carried out on a microcomputer, are given.

1. Introduction

Hierarchical organization is prevalent in biological systems. An organ is made up of cells, where each cell is an organized form of matter including DNA, RNA, and proteins. These macromolecules are organized combinations of small molecules such as nucleotides, amino acids and sugars, which in turn are composed of atoms. In many cases, a biological entity can be broken apart into pieces which can be put together to obtain the original organized form. The process whereby some biological systems spontaneously develop a high level of organization is called self-organization.

Understanding self-organization of biological systems has challenged many investigators with diverse scientific interests. A set of basic principles which seem to be relevant to biological organization has been recognized (Goel & Thompson, 1986, 1988). These include the principles of sub-assembly, optimization (e.g., minimization of free energy), and conformational programming (in which the information defining the final structure is built into the geometry and chemical affinities of the interacting units). Even with these principles given, it has been difficult to understand organization at the lowest level of the structural hierarchy, presumably because of the inherent complexity of biological systems. A well-known unsolved problem is the reversible folding of globular proteins. At this stage of our understanding it is not clear whether we need a new approach for implementing the already known principles of self-organization, or new principles.

† To whom requests for offprints should be addressed.
‡ Permanent Address: La Jolla Institute, P.O. Box 1434, La Jolla, CA 92038, U.S.A.

0022-5193/88/070351 + 35 $03.00/0

© 1988 Academic Press Limited


In this paper we introduce a new approach leading to a new class of models which we call Movable Finite Automata (MFA) models. These models are biophysically realistic and can be easily implemented on a microcomputer. In section 2, we describe various types of quantitative models which have been used to study self-organization and biomolecular interactions. This description is used as a backdrop to introduce the MFA models. These models were conceived by us while attempting to simulate the interaction of protein molecules in the self-assembly and operation of the T4 bacteriophage, which infects E. coli. In this initial study we have made some simplifications in the architecture of the phage: our model phage possesses neither a head capsule nor the neck which attaches the head to the tail.

In section 3, we describe the basic model, which includes a description of cell wall and phage molecules (subunits), their mutual interactions, and their rules of movement (simulation dynamics). These molecular subunits are the finite automata of our model. In section 4, we describe a specific bacteriophage design, by which we mean a list of parameter values specifying the properties (dimensions, bond site locations, and interactive properties of these sites) of particular components making up the phage. We show that for this design, molecules will spontaneously self-assemble to form a complete structure which is like a phage contractile tail, and which incorporates a test molecule (representing the phage DNA), attaches itself to the cell wall, penetrates the wall, and places the test molecule on the other side of this wall. In section 5, we make a few concluding remarks and suggestions for future work involving the use of MFA models to simulate bacteriophages more realistically, and also other biological systems and processes.

2. Movable Finite Automata (MFA) Models

The quantitative models for self-organization and biomolecular processes that have been studied by various investigators can be roughly divided into three categories as follows.

(a) Physically Realistic Models. From the viewpoint of physics, the most realistic model possible would be one based on the analytical or numerical solution of the Schrödinger equation. Unfortunately, inherent mathematical difficulties make this approach impractical for all but the simplest of molecular systems. A somewhat more tractable type of model can be constructed by making semiclassical approximations of the potential function of a molecular system, and then using various numerical techniques to study the behavior of the system under the influence of this potential. For example, many attempts have been made to approximate the potential function of a polypeptide chain with a particular amino acid sequence and then use the numerical minimization of this function as a means for calculating the three-dimensional folded structure that a protein with that sequence will assume in nature (see Schulz & Schirmer, 1979; Rossman & Argos, 1981; Gō, 1983).


Although the strategy behind such calculations has much to recommend it from a physical point of view, there are some serious practical problems in implementing it. First, there is no direct way to measure the strengths of interactions between various atoms, and the functional dependence of the potential function on the locations of various atoms is hard to define. Second, the potential functions are nonlinear and involve a very large number of variables defining the locations of various atoms, so it is hard to find their global minimum. Third, these approaches require excessive computer time even for a small protein (Levitt, 1983), and this time increases sharply as the number of amino acids increases.

Attempts have also been made to numerically simulate the dynamics of the interaction of complex molecules by constructing physically realistic potential functions and numerically integrating the corresponding equations of motion (Karplus & McCammon, 1983, 1986). In general, however, simulations of this kind are not very practical for complex systems because of the enormous amounts of computation they would require.

(b) Models Based on Reaction-Diffusion Equations. In these models the concentrations of various chemicals are represented by continuous variables which in general are functions of position in three-dimensional space. The changes in the concentrations as a function of time are modeled by first-order differential equations containing reaction terms specifying the rates of various chemical reactions and diffusion terms representing the migration of molecules through random thermal motion. In situations where the number of molecules of a given kind is so small that it cannot be adequately represented by a continuous variable, it is possible to introduce stochastic differential equations in which reactions are modeled as discrete events occurring with particular probabilities.
The work of Nicolis & Prigogine (1977) and of Eigen and his coworkers (Eigen et al., 1980; Eigen & Winkler-Oswatitsch, 1983) provides many examples of biological models based on reaction-diffusion equations. This type of model has been actively studied to explain cellular rearrangements. Here it is assumed that such rearrangement is determined by the spatial and temporal concentration of a certain chemical which is continuously produced and/or destroyed and which also diffuses in space (Gierer, 1981; Meinhardt, 1982). These models have been used in representing cellular rearrangements of slime molds (Segel, 1984), the spiral patterns of the sunflower head (Berding et al., 1983), and the wing formation of the common fruit fly (Kauffman, 1981).

Although such models are often reasonably amenable to mathematical analysis and computer simulation, they suffer from the drawback that they cannot directly represent the three-dimensional structures generated by molecules. Since molecular structures are represented in such models simply by real (or integer) variables, each molecular structure which is formed must be explicitly represented by a variable. This makes such models difficult to set up in cases where very large numbers of intermediate structures are possible (as is true, for example, of the protein folding problem). In such cases, a model representing structural relationships between intermediates may be needed to generate the reaction rate coefficients required by the approach. Thus, models based on reaction-diffusion equations are incomplete in general and may need to be supplemented by models that can explicitly deal with structure.

(c) Cellular Automata Models. These models differ from the ones discussed in (a) in that they tend to lack direct physical realism. In general, a cellular automata model consists of an array of simple automata situated at the sites in a one-, two-, or three-dimensional lattice of integers. The automata all obey the same rules; each one can be in a finite number of different states, and each one changes its state in accordance with information obtained from the automata which are its immediate neighbors in the lattice. Some examples of this kind of model are Conway's "game of life" (Berlekamp & Conway, 1982), models of aggregation of cells into tissues (Goel, 1978; Goel & Rogers, 1978; Rogers & Goel, 1978), and cellular automata as models of complexity (Wolfram, 1984a, b). The published proceedings of a recent interdisciplinary workshop provide a spectrum of applications of cellular automata (Farmer et al., 1984).

Cellular automata models have the advantage that they can be readily analyzed and simulated on computers. As a result, they make it possible to raise and answer many questions that can, at best, be considered only vaguely in the absence of definite models. For example, by introducing his cellular automaton model, von Neumann was able to raise and settle (in the context of that model) the question of whether or not self-reproducing machines capable of arbitrarily complex functions can be constructed. Models of this type also make it possible to discover many unexpected phenomena through numerical experiments, as has been seen with Conway's game of life.

The Movable Finite Automata (MFA) models are similar to cellular automata models, but they are endowed with rules of operation that mimic as closely as possible some of the key biophysical principles governing the interaction of biological macromolecules, cells, and other natural subunits. These models are based on finite state automata that undergo discrete changes in a step-by-step fashion with the passage of time, and thus they share some of the features of cellular automata that make the latter easy to handle mathematically and computationally. The key feature allowing for greater biophysical realism in MFA models is that in these models the automata are allowed to move about and interact with one another. The nature of these models will become clearer in the next section where we describe their use for simulating bacteriophage assembly and operation.

3. Basic Model for Simulating Bacteriophage Assembly and Operation

The T4 bacteriophage is a complex virus (Fig. 1) that consists of an elongated icosahedral head, made of protein, and filled with DNA. It is attached by a neck to a tail consisting of a hollow core surrounded by a contractile sheath and based on a spiked end plate to which six tail fibers are attached. The spikes and fibers affix the virus to a bacterial cell wall (Fig. 2). The sheath contracts, driving the core through the wall, and viral DNA enters the cell. The biogenesis of this phage is one of the most thoroughly studied biological assembly processes. In fact, this thorough study led us to choose the T4 phage as the vehicle to develop MFA models, and the inspiration for our model came from the biophysical description of phage assembly and operation given below.

FIG. 1. Structure of a T4 bacteriophage. Labeled parts: head, collar, tail, tail fiber, end plate. (Adapted from Eiserling, 1983.)

(1) Various major components of the phage are subassembled and then combined to form complete viral particles. Further, the formation of these components involves a hierarchy of assembly operations, each of which proceeds in a particular order, and must be completed before subsequent operations can take place. Figure 3 shows this sequential subassembly.

(2) The driving force for the aggregation of phage components is the minimization of free energy.

(3) The principles of quasi-equivalence and conformational switching (Caspar, 1980) seem to play important roles. These principles allow for the design of a structure which will carry out specific operations according to a built-in algorithm, and thus they can be thought of as representing elementary instructions in a kind of molecular programming language (conformational programming). Briefly, the principle of quasi-equivalence states that biomolecular subunits will often be constructed so that they can bond together in a number of different orientations (including variable bond lengths). Conformational switching is a related phenomenon in which the capacity of a subunit to form bonds at one site depends on whether or not it has formed bonds at other sites.

FIG. 2. Schematic representation of tail functions in T4 infection. The bar represents 10 nanometers. (Reproduced from Simon & Anderson, 1967.)

In our initial work, we have made some simplifications in the architecture of the phage. Most noticeably, our model phages possess neither the head capsule nor the neck which attaches the head to the tail. The main function of the head is to store the DNA molecule, and to channel it into the tail tube after the tube has penetrated the bacterial cell wall. To simulate this function, a standardized test molecule is introduced, which is to be injected through the cell wall. Thus in our model we address neither the assembly of the head capsule nor the structure, physical properties, and packing of the DNA molecule. Instead, we focus on the assembly of the tail and the operation of the phage (penetration of the bacterial wall and the transfer of phage DNA into the bacterium).

Specifically, the bacterium to be infected by the "phage" is simulated by the following arrangement. An artificial "cell wall" made up of specialized molecular subunits is set up. These rectangular subunits are able to slide relative to one another, but they are endowed with sticky surfaces which enable them to adhere together to form a stable wall. This wall divides a large box into two regions. The test molecule representing phage DNA is placed in one region, along with a number of "molecules" representing the proteins specified by the phage genes. These molecules are placed in the region in random locations.
If they are able to assemble together to form a structure incorporating the test molecule, then this structure is allowed to interact with the cell wall, and possibly penetrate it in the manner of an actual T4 phage.

FIG. 3. A representation of the pathway for T4 assembly. Panel labels: head, tail, tail fiber. (Adapted from Wood, 1980.)

Basically our model consists of three parts:

(a) General specifications for the subunits of the phage and bacterial cell wall. This includes assumptions about the shapes and sizes of subunits, and rules governing their internal conformational changes and mutual interactions.

(b) Rules for movement of subunits under the influence of mutual interactions and thermal agitation, consistent with biophysical laws.

(c) A specific design for the phage. Such a design is defined by a list of parameter values specifying the properties of the component molecules making up the phage. This takes the form of a sequence of numbers comparable to the genetic coding of an actual phage.

Our motivation for the development of a specific model came from the biophysical description of tail assembly, and the process of adsorption and penetration, obtained through careful experiments by a large number of investigators. This description is given in Thompson & Goel (1985) and briefly in Appendix 1, where we describe a preliminary version of the model by assuming the molecules to be two-dimensional.

We note that a suspension of pure phage can be maintained in the laboratory for long periods of time; consequently, phages must be stable structures. A necessary physical condition for the stability of any structure is that it be in a state of minimum free energy. However, this minimum is not necessarily a global minimum; the phage may be situated at a local minimum of the free energy surface, and changes in conditions resulting from adsorption to the bacterial cell wall may change this surface, opening new pathways for the further release of stored energy.

We will now describe the first two parts of our model, and in the following section we will describe a specific design (part (c) of the model) for the phage and give some results of simulations.

DESCRIPTION OF THE SUBUNITS

The molecules (or subunits) making up the phage, the test molecule representing DNA, and the cell wall are represented by rectangular boxes. The number of molecules and their shapes are kept fixed. Each molecule has a set of bond sites on its outer surface, and it is at these sites that one molecule binds with another. (In general, this set may be empty, i.e., a molecule may have no bond sites.) Corners and edges are not allowed to be bond sites. These rectangular box-shaped "molecules" correspond to the finite state automata of the model.

We assume that the coordinates of the bond sites and the vertices of the molecules are always integers. Molecules are allowed to move only by unit steps along one of the coordinate axes (x, y, and z). We assume that all activity takes place within a box with 141 units on each side and centered on the origin. (That is, we require $-70 \leq x, y, z \leq 70$.) The sizes and numbers of molecules and the locations of bond sites on them are chosen so that, if they are assembled appropriately, one will obtain the model phage structure as well as the bacterial wall.

In the model we have chosen a fixed form for the bacterial wall molecules as well as for the test molecule representing the phage DNA. The specifications of the phage molecules can be varied, however, and therefore we can address the following question: Given the bacterial wall, the test molecule, and the model's laws of molecular interaction, what phage designs are possible which will incorporate the test molecule into their structure during self-assembly, and later penetrate the cell wall, and place the test molecule on the other side?

How the various molecules interact with each other is determined by a set of rules chosen to emulate the biophysical interactions which are believed to occur between molecules (proteins). These "quasi-physical" rules of interaction are described below for both phage and cell wall molecules.

(a) Cell Wall Molecules

The cell wall is represented by a horizontal plate of cubes, each having three units per side. These tend to stick to their neighbors in the x and y directions to form an unbroken horizontal barrier (which is in the form of an 11 by 11 square of cubes, and is circumscribed on all four sides by fixed boundary molecules). Further, each cube has two bond sites on its upper surface (in the +z direction) which enable the phage to recognize and interact with the cell wall. (The rules for bonding are described in detail below.)

The movable molecules are assumed to have anisotropic adhesive properties with respect to each other. The vertical surfaces of these molecules tend to adhere to one another with a force which increases as one molecule is displaced vertically from its equilibrium position of full contact with the other. They similarly tend to adhere to the fixed boundary molecules. The interaction potential between two barrier molecules with vertical surfaces in contact is assumed to be as follows. Let their area of contact be defined by a rectangle extending H units vertically (in the z direction) and W units horizontally. The potential is then defined to be equal to $W f(H)$, where $f(H)$ is -4 for H = 3 and -3.667, -2.333, 0 as H is reduced to 2, 1, and 0. These values are chosen to ensure that it is energetically increasingly difficult to slide one molecule against another as they are moved farther and farther from the equilibrium position. (This is akin to a simple harmonic potential.) The motive behind these potentials is that the barrier should elastically resist deformation but should not be so strong that the phage could not possibly break through it.

When the horizontal sides of the barrier molecules come in direct contact with one another, they repel one another with a potential of 4 units per unit of shared surface area. Also, when a barrier molecule is in direct contact with a phage molecule, or a phage molecule is in direct contact with another phage molecule, the two molecules repel one another with a potential of 4 units per unit of shared surface area. The physical justification of this rule is as follows.
We can imagine that flexible molecular groups extend a short distance out from the surfaces of the molecules, and that these groups become compressed when two molecules are brought into contact. This compression results in a force tending to push the molecules apart. In the model, this law of repulsion serves two purposes: (1) it allows one molecule to push another by coming in contact with it and then generating a force of repulsion that the other molecule tries to relieve by moving away; and (2) it also allows molecules to store up energy for later release through a mechanism comparable to the compression of a spring.

As noted in Appendix 1, energy for T4 tail contraction most likely comes from conversion of the 144 ATP molecules, built into the tail sheath, into ADP molecules. This energy might directly power tail sheath contraction through the formation of bonds, and it might also power it indirectly by causing the breaking of bonds and thereby releasing stored tension. In the phage design discussed in the next section, we will, in fact, simulate cell wall penetration by means of a mechanism in which the formation of a series of bonds draws subunits together, thus causing the tail sheath to contract.
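
The cell wall potentials described above can be tabulated in a few lines. The following is a minimal sketch, not the authors' program: the function and constant names are hypothetical, and only the values of f(H) and the repulsion of 4 units per unit area are taken from the text.

```python
# Adhesion values f(H) for the vertical faces of wall cubes, as quoted in the text.
# H is the vertical extent of the contact rectangle (0 to 3 units).
F_ADHESION = {0: 0.0, 1: -2.333, 2: -3.667, 3: -4.0}

# Repulsion between surfaces in direct contact: 4 units per unit of shared area.
REPULSION_PER_UNIT_AREA = 4.0


def wall_adhesion(width, height):
    """Return W * f(H) for a vertical contact rectangle W units wide and H units tall."""
    if height not in F_ADHESION:
        raise ValueError("contact height must be 0, 1, 2, or 3 units")
    return width * F_ADHESION[height]


def contact_repulsion(shared_area):
    """Return the repulsive potential for a given shared surface area."""
    return REPULSION_PER_UNIT_AREA * shared_area


def slide_costs(width=3):
    """Energy cost of each successive unit slide away from full contact (H = 3, 2, 1)."""
    return [
        round(wall_adhesion(width, h - 1) - wall_adhesion(width, h), 3)
        for h in (3, 2, 1)
    ]


if __name__ == "__main__":
    print(wall_adhesion(3, 3))
    print(slide_costs())
```

Calling `slide_costs()` shows that each further unit of vertical displacement costs more energy than the previous one, which is the elastic resistance the text asks of the barrier.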

(b) Phage Subunits

As we have already indicated, each molecule may have a number of bond sites on its surface. Any one of these sites x is described by a state vector b(x) which gives the bond length, site strength, configuration of the site, and its "rotational" orientation. A state vector is denoted as

$$b(x) = L : VV : II : QQ \qquad (1)$$

and this vector determines how a molecule will interact with another one if the interaction occurs at site x.

Here L denotes the bond length, which is allowed to take two values, 0 and 1. This is done to simulate the concept of quasi-equivalence (i.e., to allow stretching or compression of a bond between two molecules if it leads to energetically favorable bonding elsewhere).

VV is an integer which can take on values 0, 1, 2, ..., 99. It denotes an arbitrary and relative scale of "strength" of the site. A higher value of VV for a site indicates that a bond formed at that site will be stronger.

II is an integer which takes on values 1, 2, ..., 198. It provides a label for the configuration of the site, and is used to determine whether or not two sites have "complementary" configurations, thus allowing them to form a bond. This label for a bond site may increase or decrease in steps of one due to changes in the configuration of other sites on the molecule (caused by the formation or dissolution of bonds). This change in label is introduced to emulate the conformational changes occurring in one part of a protein molecule as a result of changes made in another part, and to simulate Caspar's (1980) conformational switching concept.

The fourth parameter, QQ = 1, ..., 8, is an orientation number. The molecules are allowed to rotate (in 90° steps) about any axis. As a result of such rotations, a particular pattern on one surface of a molecule can be mapped into as many as eight distinct patterns. (These can be produced by mirror-imaging due to rotation about an axis parallel to the surface, plus rotation in steps of 90° about the axis perpendicular to the surface.) As the molecules are rotated, the QQ values are updated to represent the current orientation of the site. The idea here is that two complementary sites will be able to form a bond only if they are properly oriented with respect to one another.
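
One way to hold the state vector of equation (1) in a program is as a small validated record. The sketch below is illustrative only: the class name, field names, and the unpadded colon-separated text encoding are assumptions, while the four fields and their ranges are those given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BondSite:
    """State vector b(x) = L : VV : II : QQ of one bond site (names are illustrative)."""

    length: int       # L, bond length: 0 or 1
    strength: int     # VV, relative site strength: 0..99
    label: int        # II, configuration label: 1..198
    orientation: int  # QQ, orientation number: 1..8

    def __post_init__(self):
        if self.length not in (0, 1):
            raise ValueError("L must be 0 or 1")
        if not 0 <= self.strength <= 99:
            raise ValueError("VV must lie in 0..99")
        if not 1 <= self.label <= 198:
            raise ValueError("II must lie in 1..198")
        if not 1 <= self.orientation <= 8:
            raise ValueError("QQ must lie in 1..8")

    def encode(self):
        """Format the site in the colon-separated notation of equation (1)."""
        return f"{self.length}:{self.strength}:{self.label}:{self.orientation}"


def parse_site(text):
    """Parse a string such as '1:20:57:3' into a BondSite."""
    fields = [int(part) for part in text.split(":")]
    if len(fields) != 4:
        raise ValueError("expected four fields, L:VV:II:QQ")
    return BondSite(*fields)
```

Making the record immutable means that a conformational change or a rotation would produce a new `BondSite` rather than editing one in place, which keeps earlier configurations available for comparison.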
In order to simulate reality as closely as possible, conformational coupling is allowed only between certain pairs of sites in each molecule. We assume that when the label of one site goes up, the label of the other site in such a pair must go down, and vice versa. Further, we assume that each site can take at most two label values, a basal value, II, and a modified value which must be either II + 1 or II - 1.

We make the following assumptions about the interactions between two subunits.

(1) Two subunits cannot overlap one another.

(2) As noted under the description of cell wall molecules, subunits generally repel each other when they come in direct contact. The potential associated with this repulsion is four times the size of the shared surface (contact area). One exception to this rule is that vertical surfaces of barrier molecules attract according to a rule akin to Hooke's law.

(3) Two bond sites on different subunits can form a bond only when their respective surfaces are parallel to one another, the two sites are directly opposite one another, and the orientation numbers, QQ, of the sites are also equal.


(4) For two sites x and y, if $b(x) = b(y)$, a strong bond of strength 4VV will be formed if they meet criterion (3) above. A bond can also be formed if $|b(x) - b(y)| = 1$, but it will be a weaker bond, of strength 2VV. On the other hand, if $|b(x) - b(y)| \geq 2$, no bond will be formed even if the sites meet criterion (3). With these rules, two sites can bond only if their bond lengths (L) and site strengths (VV) are identical, their orientations (QQ) are equal, and their configuration labels (II) do not differ by more than 1. The idea behind these bonding rules is that the affinity for bond formation between two protein surfaces depends on how well their shapes match (the lock-and-key concept).

(5) A total configurational energy, U, is defined for the complex of all molecules. This consists of the sum of all repulsive potential terms, minus the sum of the strengths of all bonds which have formed in accordance with (3) and (4). Thus, the formation of bonds is energetically favored. Changes in the total complex of molecules are allowed only if these changes do not result in an increase of U.

(6) One type of allowed change affects the bond site labels within a given molecule. This consists of coordinated changes of bond site labels for sites of the molecule that are conformationally coupled. We assume that these coordinated changes will occur if two criteria are satisfied: (a) the total change in bond strength as defined in (4) above must be positive, thus decreasing U, and (b) the total change in bond strength for bond site pairs satisfying (3) and having $|b(x) - b(y)| = 1$ must be positive. The rationale behind this two-part rule for conformational change is that a bond of full or zero strength should not be subjected to a force tending to change its strength, whereas a half-strength bond should be subjected to such a force.

It should be noted that conformational coupling (which can decrease or increase the value of bond configuration labels), in conjunction with the above rules for bond formation, allows a strong bond to become a weaker bond and vice versa, and a weaker bond to break or to be formed between two unbonded subunits. Such reversible bonding between two proteins is a well-known and prevalent phenomenon.
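
Rules (4) and (5) reduce to a short calculation once two sites are known to face each other. The sketch below assumes that the geometric part of rule (3) (parallel surfaces, sites directly opposite) has already been checked elsewhere; the `Site` tuple and the function names are hypothetical, and the comparison of b(x) with b(y) is read, as the text explains, as a comparison of the II labels with L, VV, and QQ required to be equal.

```python
from typing import NamedTuple


class Site(NamedTuple):
    """Bond site state vector (L, VV, II, QQ); the field names are illustrative."""
    length: int
    strength: int
    label: int
    orientation: int


def bond_strength(a, b):
    """Strength of the bond between two sites already facing each other as in rule (3)."""
    if (a.length, a.strength, a.orientation) != (b.length, b.strength, b.orientation):
        return 0
    mismatch = abs(a.label - b.label)
    if mismatch == 0:
        return 4 * a.strength  # strong bond, 4VV
    if mismatch == 1:
        return 2 * a.strength  # weaker bond, 2VV
    return 0                   # labels differ by 2 or more: no bond


def total_energy(repulsions, bond_strengths):
    """U = sum of repulsive terms minus sum of bond strengths, as in rule (5)."""
    return sum(repulsions) - sum(bond_strengths)


def change_allowed(u_before, u_after):
    """Rule (5): a change is allowed only if it does not increase U."""
    return u_after <= u_before
```

For example, two sites with VV = 20 and identical labels give a bond of strength 80, while a label mismatch of one halves this to 40. Because `change_allowed` accepts moves that leave U unchanged, free random motion of unbonded molecules remains possible under this reading of the rule.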

(c) Simulation Dynamics

The dynamics of the model involves three processes:

(1) Conformational changes in bond site labels can occur in accordance with rule (6) above. We assume that these changes occur immediately, as soon as they are energetically allowed.

(2) A maximal group of mutually bonded molecules is allowed to translate in randomly chosen unit steps in the x, y, or z direction if this does not increase the total energy of interaction, U. Here, by a maximal group, we mean a group formed by collecting together all of the molecules which can be reached from a given molecule by moving from one molecule to another along bonds. Such a connected group may consist of one or more molecules, and it is a natural unit for translational movements caused by molecular interactions and thermal agitation.

(3) A maximally connected group can perform a randomly chosen clockwise or counterclockwise rotation by 90° about one of the coordinate axes.


(4) A partial group of mutually bonded molecules is similarly allowed to translate by randomly chosen unit steps in the x, y, or z directions, if this does not increase the total energy of interaction. By a partial group of bonded molecules, we mean a set of molecules which is completely connected together by bonds, but which is also bonded to other molecules outside the set. The purpose of this rule is to allow for free energy driven changes in shape within maximally bonded complexes of molecules.

Clearly, there are many ways to subdivide a large maximally connected group of molecules into partially connected groups. If the computer were required to continuously check all of them for the possibility that a random translation would be allowed, the computer time required would be prohibitive even for relatively small groups. To alleviate this problem, we restrict the class of partial groups which are allowed to translate. We only allow those partial groups which can be produced by splitting all of the bonds of a maximal group which lie on a chosen plane. In addition, we allow certain other designated bonds to be split during the production of a partial group. (We note that these bonds are not actually broken in the simulated molecular interactions within the model; we speak here of breaking bonds simply to define the groups of molecules that are allowed to translate.)

The computer simulation of phage dynamics falls naturally into two parts: self-assembly and operation of the phage. The principle behind the self-assembly process is that the phage molecules have been designed so that they can combine together only into certain configurations, and in a certain order. Given this design, which is based on patterns of bond site values and rules of conformational coupling, the phage should automatically self-assemble as a result of random collisions between molecules and maximal bonded groups of molecules that have already formed.
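
The maximal groups of rule (2) are the connected components of the graph whose edges are bonds, and the restricted partial groups of rule (4) can be found by ignoring the bonds on one chosen plane and taking components again. The following sketch assumes a hypothetical bookkeeping in which each bond is a pair of molecule names and a dictionary records the plane on which each bond lies; neither data structure is specified in the paper.

```python
from collections import defaultdict, deque


def maximal_groups(molecules, bonds):
    """Return the connected components of the bond graph as frozensets."""
    neighbours = defaultdict(set)
    for a, b in bonds:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen = set()
    groups = []
    for start in molecules:
        if start in seen:
            continue
        seen.add(start)
        group = {start}
        queue = deque([start])
        while queue:  # breadth-first walk along bonds
            current = queue.popleft()
            for other in neighbours[current]:
                if other not in seen:
                    seen.add(other)
                    group.add(other)
                    queue.append(other)
        groups.append(frozenset(group))
    return groups


def partial_groups(molecules, bonds, bond_plane, plane):
    """Candidate partial groups obtained by notionally splitting every bond on `plane`.

    `bond_plane` maps each bond (a, b) to an (axis, coordinate) pair naming the
    plane it lies on; this bookkeeping is an assumption of the sketch.
    """
    kept = [bond for bond in bonds if bond_plane[bond] != plane]
    return maximal_groups(molecules, kept)
```

Groups that the cut leaves unchanged are still maximal groups rather than partial ones, so a caller would keep only the components that lost at least one bond. As in the text, no bond is actually broken here; the cut only defines which sets of molecules are tried as units of translation.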
However, to simulate this process by actually allowing these groups of molecules to execute random walks within the 141 × 141 × 141 simulation space would require excessive amounts of computer time. To alleviate this situation and make the simulation more efficient, we use the following procedure.

(1) Initially, we place the cell wall molecules in their natural positions so as to form a horizontal barrier dividing the finite simulation space into two parts. In the upper part of the space, enough subunits to form one phage are randomly placed, with no overlapping of two subunits allowed.

(2) The molecules in the cell wall are not allowed to move or form bonds until the phage has been assembled and a part of it has made contact with the wall.

(3) We begin with a list of properties of all molecules. The set of properties includes the sizes, locations of bonding sites, and bond state vectors (which determine how a molecule will link with others to form a complex).

(4) We randomly select two molecules from the list and systematically match the bond sites on one molecule with the bond sites on the other in an effort to determine whether a bond can be formed that does not lead to an increase in the value of U. As soon as such a possible bond is found, the two molecules are deleted from the list, and the bonded complex that they have formed is added to it. If no such bond is possible, the molecules are returned to the list. (We note that in evaluating the change in U resulting from the formation of a bond, one must consider not only that bond, but also other bonds that may simultaneously form as well as areas where surfaces may be brought into mutual contact.)

(5) We again randomly select two molecules or molecular complexes and check them for possible bonding as described in (4) above. This process continues until all the pairs of molecules are checked. This constitutes one iteration.

(6) Initially and every 200 iterations, we place the molecular complexes in the list in random locations in the upper part of the simulation space. This is done by randomly selecting molecular complexes from the list and randomly translating them so that they lie within the boundaries of this region. For each complex, this process of random translation is repeated until the complex lands in a position in which it does not overlap previously placed complexes. This is done so that a picture depicting the current state of self-assembly of the phage molecules can be produced.

Once the complete phage is formed, this process of trying to combine randomly chosen complexes is terminated. The completed phage is placed in proximity to the barrier, and normal molecular interactions involving single step translations of groups of molecules are allowed to take place. If the phage is properly formed, it will be able to reach the barrier by random walk, and then interact with the barrier in such a way as to place the test molecule on the other side.
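The list-based pairing procedure described above can be illustrated with a minimal Python sketch. It is not the authors' program: the names `ORDER`, `can_bond` and `assemble` are hypothetical, the toy rule in `can_bond` merely stands in for the bond-site matching and the check on U, and, as a simplification, complexes formed during an iteration only become available for bonding in the next iteration.

```python
import random
from itertools import combinations

# Obligatory assembly order of a toy five-subunit "phage" (illustrative only).
ORDER = ["test", "EC", "TCN", "TC", "TM"]


def can_bond(a, b):
    """Hypothetical stand-in for the bond-site matching and energy check.

    A complex holding the first k subunits of ORDER may bond only with the
    single subunit that comes next in ORDER.
    """
    for grown, single in ((a, b), (b, a)):
        k = len(grown)
        if (k < len(ORDER) and grown == frozenset(ORDER[:k])
                and single == frozenset([ORDER[k]])):
            return True
    return False


def assemble(subunits, bond_test, rng, max_iterations=1000):
    """Merge complexes pairwise until a single complex remains.

    Returns (final_complex, iterations), or (None, max_iterations) if the
    iteration limit is reached first.
    """
    pool = [frozenset([s]) for s in subunits]
    for iteration in range(1, max_iterations + 1):
        pairs = list(combinations(range(len(pool)), 2))
        rng.shuffle(pairs)              # check the pairs in random order
        used, merged = set(), []
        for i, j in pairs:
            if i in used or j in used:
                continue                # already consumed in this iteration
            if bond_test(pool[i], pool[j]):
                used.update((i, j))
                merged.append(pool[i] | pool[j])
        # Bonded pairs leave the list; the complexes they formed join it.
        pool = [c for k, c in enumerate(pool) if k not in used] + merged
        if len(pool) == 1:
            return pool[0], iteration
    return None, max_iterations


phage, iterations = assemble(ORDER, can_bond, random.Random(0))
print(sorted(phage), iterations)
```

Because the toy rule admits only one bond at a time, the structure grows by one subunit per iteration whatever the random order of the pairs; a realistic bond test would let several sub-assemblies grow in parallel.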

4. A Phage Design and Results of Simulations

We posed the following question: can the dimensions, bond site locations, and bond site codes of the phage molecules be specified in such a way that these molecules will spontaneously self-assemble to form a structure (a completed "bacteriophage") which incorporates a test molecule, and which will attach itself to the cell wall, penetrate the wall, and place the test molecule on the other side? The answer is yes. In fact, it is possible to design many different operational phage structures. Some of these designs closely mimic the T4 phage, and others are quite dissimilar. One of our designs is shown in Figs 4-11. The molecular parameter values defining this design are listed in Table 1.

Our simplified architecture differs from the T4 phage in the following two respects: (1) As noted in Section 3, the phage does not possess a head capsule (or the neck which attaches the head to the tail). However, to simulate the basic function of DNA injection, we have introduced in each model a standardized test molecule which is to be injected through our simulated cell wall. This molecule is taken to be a fixed chain of nine 4 × 3 × 3 boxes having a number of free bond sites. (2) They also do not possess flexible tail fibers; thus, we do not simulate the initial attachment of the fibers to the cell wall and their bending to allow base plate attachment to the cell wall. Instead, the base plate is allowed to make contact directly with the wall.

Other features of these two phage designs were dictated by the overall idea of simulating a T4 bacteriophage as closely as possible and by the constraints imposed by the general laws of interaction chosen for the molecular subunits. The phage is

composed of 40 different types of subunits, with from 1 to as many as 44 copies of each subunit type appearing in the complete phage structure. The bond sites and conformational change linkages of the subunit types are defined in Table 1 and are designated by code letters such as CL and CM. In Figs 4-6 the corresponding labels indicate the locations of some of the more important subunit types in the phage structure. In Figs 7-11 the relationships between different types of subunits are indicated in greater detail. (In these figures we have sacrificed three-dimensionality and distorted some of the subunit proportions for the sake of clearly depicting bond site relationships.)

FIG. 4. A three-dimensional bacteriophage design. The labels indicate types for selected subunits. (For specification of types of all subunits, see Table 1.)

FIG. 5. Exploded view of the design in Fig. 4, showing the central coil and some of the contractile rings. The coil is composed of subunits CU and CL at its upper and lower ends, and 31 subunits of type CM. The rings are composed of 4 subunits/ring of types SG1, SG2, and SG3.

FIG. 6. Selected parts of the phage design of Fig. 4, including the top ring (including TCN), the test molecule (including DE), the last 4 subunits of the coil (including CL), and the base (including SB1a, ..., SB3d and LR1, ..., LR12).

In Table 1, each subunit label is followed by a number in parentheses indicating how many identical copies of the given subunit appear in a completed phage. The second line of each group indicates the x, y, and z dimensions of the subunit. In the remaining lines, each 7-tuple in parentheses specifies a bond site of the subunit. These 7-tuples are of the form (L, W, II; QQ, S, X, Y), where L, W, II, and QQ define the bond state vector, S = 1, ..., 6 indicates on which side of the subunit the bond site is situated, and (X, Y) specifies the site's location on this side. Conformational change linkages are indicated by connecting together bond site specifications with equals signs. For example, -(A) = (B) = (C) and +(D) = (E) mean that sites A, B, and C are connected by one conformational linkage and sites D and E are connected by another linkage; the signs indicate the initial direction of change allowed for the first site in the linkage.

Figures 7-9 define the major pathways of self-assembly in this phage design. The assembly process builds the phage around the rod-shaped test molecule (which is

shown in perspective in Fig. 6 and in end-on view in Figs 7 and 8). One of two initial steps in phage assembly is the bonding of subunit EC to one end of the test molecule. This causes a conformational change in which the linkage +(1, 1, 21; 1, 4, 1, 2) = (0, 8, 17; 1, 1, 2, 1) of EC is activated, and the bond site labels 21 and 17 of these two bond sites change to 22 and 16, respectively. This linkage is indicated in Fig. 7 in abbreviated form by the labels +1:1:21 and -0:8:17, plus a line connecting them. As a result of the conformational change from 1:1:21 to 1:1:22, it becomes possible for a bond to form between sites 1:1:22 on subunit EC and 1:1:23 on subunit TCN. This links these two subunits together, and allows for a further conformational change in which 1:1:23 changes to 1:1:22 and 0:50:24 changes to 0:50:25. By examining Fig. 7, one can see how a series of similar bond formations and conformational changes causes subunits to bond to the growing phage structure in the obligatory order: EC, TCN, TC, TM, TC, TM, TC, CU, CM, CM, ...

The growth of the "coil" formed from subunits CM could continue indefinitely, but it is blocked by the bonding of subunit CL to subunit DE at the end of the test molecule. Thus, the test molecule serves as a measuring rod which determines the length of the coil, and the length of the phage as a whole. We note that one of the standing problems in T4 self-assembly is to explain how the length of the tail tube is precisely determined. One hypothesis is that this length is determined by a measuring rod in the form of a possibly alpha-helical phage protein (Hendrix, 1985).

FIG. 7. Schematic diagram showing the conformational change linkages involved in the self-assembly of the top ring and the coil. Here, and in Fig. 8, the view is from the top of the phage (which includes subunits TCN, TC, and TM) looking toward the base (which includes BA1, ..., BA4; see Fig. 4). With this orientation, bond sites on the near side, on the top side, and on the other four sides of a subunit are denoted by three different symbols. (In Figs 7-11 the shapes of the subunits are somewhat distorted to accommodate bond site labels.)

FIG. 8. Schematic diagram showing the key conformational change linkages involved in the self-assembly of the phage base. This view is from the same top-down orientation as in Fig. 7. (Thus the view of the base is from the side opposite to that shown in Fig. 6.)

Figure 8 indicates the conformational change linkages governing the formation of the base of the phage. Once CL bonds to DE, a conformational change enables

CL to bond to the end of the growing CM coil, thus blocking its growth. This change also allows subunit SB2a to bond to CL. Starting with SB2a, the base subunits (with labels SB**) link together in counterclockwise order. Once SB3a has joined to the base, the change from 1:5:27 in SB2a to 1:5:28 is induced. This initiates a series of bond formations and conformational changes which result in the formation of the contractile sheath from 132 copies of subunits of types SG1, SG2, and SG3. The conformational change linkages governing sheath formation are as shown in Fig. 9.

FIG. 9. Schematic diagram showing the conformational change linkages involved in the self-assembly of the contractile sheath. Here, and in Figs 10 and 11, the phage is seen from the side, with its top (including TCN) at the top of the diagram.

The conformational changes involved in formation of the small base ring of Fig. 8 also modify the bond sites shown on the periphery of this ring. These changes initiate the addition of the large base ring subunits (LR*) and the base appendages (BA*).

It is important in this phage design to make sure that both the base small ring and the top ring (TCN, TC, and TM) are in place before sheath assembly begins. Figure 12 shows an aberrant, "teratological" phage structure that can develop if the sheath is allowed to grow beyond the base of the phage. In general, models of this type can be used to study the types of defects in phage assembly and operation that result from various modifications in the phage's "genetic code".

One defect of this phage design is that if the concentration of base subunits is too high, the sheath can begin to enclose the region of the coil before the coil is complete. This is illustrated in Fig. 13. It would be interesting to know to what extent the assembly of actual phages is similarly dependent on subunit concentrations within the infected bacterium.
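The linkage notation of Table 1 and the label changes traced in this section can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: `BondSite`, `parse_linkage` and `activate` are hypothetical names, and the rule that successive sites in a linkage change in alternating directions is an assumption inferred from the two worked examples in the text (EC: 21 and 17 become 22 and 16; TCN: 23 and 24 become 22 and 25), not a rule stated by the authors.

```python
import re
from dataclasses import dataclass


@dataclass
class BondSite:
    state: tuple   # (L, W, II, QQ): entries of the bond state vector
    side: int      # S = 1..6, the side of the subunit carrying the site
    x: int         # location of the site on that side
    y: int


def parse_linkage(text):
    """Parse e.g. '+(1,1,21;1,4,1,2)=(0,8,17;1,1,2,1)' into (sign, sites)."""
    text = text.replace(" ", "")
    sign = 0                               # 0 marks a free site with no sign
    if text and text[0] in "+-":
        sign = 1 if text[0] == "+" else -1
        text = text[1:]
    sites = []
    for group in re.findall(r"\(([^)]*)\)", text):
        values = [int(v) for v in re.split(r"[;,]", group)]
        if len(values) != 7:
            raise ValueError(f"expected 7 entries, got {len(values)}: {group}")
        sites.append(BondSite(tuple(values[:4]), *values[4:]))
    return sign, sites


def activate(sign, sites):
    """Return the 'L:W:II' labels after one conformational change.

    Assumption: the first site moves in the direction given by the sign and
    each following site in the linkage moves in the opposite direction to
    its predecessor.
    """
    labels, step = [], sign
    for site in sites:
        length, width, code = site.state[:3]
        labels.append(f"{length}:{width}:{code + step}")
        step = -step
    return labels


ec = parse_linkage("+(1, 1, 21; 1, 4, 1, 2) = (0, 8, 17; 1, 1, 2, 1)")
tcn = parse_linkage("-(1, 1, 23; 1, 2, 1, 3) = (0, 50, 24; 5, 5, 1, 2)")
print(activate(*ec), activate(*tcn))
```

For the two linkages quoted from Table 1, the sketch reproduces the label changes described in the text; whether the alternation holds for linkages with three or more sites is not confirmed by the source.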

FIG. 10. Schematic diagram showing the conformational change linkages involved in the contraction of the sheath.

We note that this particular defect can be corrected by making the following changes in the "genetic code" in Table 1.

CU and CM: Add -(1, 4, 9; 1, 3, 3, 1) = (1, 4, 7; 1, 1, 3, 1)
CL: Add -(1, 4, 9; 1, 3, 3, 1) = (1, 1, 180; 5, 4, 2, 3)
SB2a: Change +(1, 5, 27; 1, 3, 1, 2) = (0, 49, 148; 1, 4, 1, 1) to -(1, 1, 182; 1, 5, 1, 2) = (1, 5, 26; 1, 3, 1, 2) = (0, 49, 148; 1, 4, 1, 1)
TCN: Change (1, 5, 29; 2, 1, 1, 1) to -(1, 5, 29; 2, 1, 1, 1) = (1, 4, 7; 1, 1, 1, 4)

If these changes are made, a conformational signal will be propagated up the coil after both the small base ring and the coil have fully formed. This signal will initiate sheath formation by changing 1:5:29 in TCN to 1:5:28. Figure 14 shows the stage in this assembly process in which the sheath has just begun to form, starting at

TCN. We note that by studying how improvements can be made in the phage by various genetic alterations, it may be possible to gain insights regarding the origin and evolution of phages and other similar biological systems.

Even in this design it is possible, though not at all probable, that there will be an error in assembly. Thus, the growth of the coil is still terminated by the CL subunit. If this subunit is not in place by the time the coil reaches the end of the test molecule, then the coil will continue to grow without limit. The probability that this will happen is less than the probability that 31 CM subunits will join the developing structure before 1 CL subunit does (and thus it is also concentration dependent).

FIG. 11. Schematic diagram showing the conformational change linkages involved in the release of the test molecule.

Figure 15 gives a series of stages in the self-assembly of the phage specified in Table 1. The first frame of this figure depicts the initial condition prevailing at the beginning of the simulation. In this initial state, the subunits defining the phage and the test molecule have been placed at random locations above the barrier in the 141 × 141 × 141 simulation space. The bond state vectors of all molecules are assumed to have the values indicated in Table 1. However, the movable molecules of the barrier are

TABLE 1
The molecular specifications for the bacteriophage design of Figs 4-11

A. Coil.

1. Lower end.
CL (1)
3 6 2
-(1, 1, 2; 1, 2, 1, 4) = (1, 1, 39; 5, 4, 2, 2) = (1, 30, 7; 1, 5, 1, 1)

2. Main body.
CM (31)
3 6 2
+(1, 30, 4; 1, 2, 2, 1) = (1, 30, 6; 1, 5, 1, 1)

3. Upper end.
CU (1)
3 6 2
+(1, 30, 4; 1, 2, 2, 1) = (0, 30, 6; 1, 5, 1, 1) = (1, 30, 10; 1, 3, 5, 1)

B. Large ring.

1. LR1 (1)
2 3 6
(1, 3, 64; 1, 1, 1, 1) (1, 3, 68; 1, 1, 2, 1) (1, 3, 64; 1, 1, 1, 4)
(1, 3, 68; 1, 1, 2, 4) (0, 97, 50; 1, 5, 1, 2) (0, 10, 116; 1, 2, 1, 1)
(0, 10, 110; 1, 6, 1, 4)

2. LR2 (1)
2 3 6
(1, 3, 64; 1, 1, 1, 1) -(1, 3, 69; 1, 1, 2, 1) = (0, 4, 70; 1, 5, 1, 2)
(1, 3, 64; 1, 1, 1, 4) (1, 3, 68; 1, 1, 2, 4) (0, 97, 44; 1, 5, 1, 4)
(0, 10, 110; 1, 6, 1, 4)

3. LR3 (1)
2 3 6
(1, 3, 64; 1, 1, 1, 1) (1, 3, 68; 1, 1, 2, 1) (1, 3, 64; 1, 1, 1, 4)
(1, 3, 68; 1, 1, 2, 4) (0, 97, 62; 1, 5, 1, 5) (0, 10, 110; 1, 6, 1, 4)

4. LR4 (1)
2 6 3
(1, 3, 64; 1, 1, 1, 1) (1, 3, 68; 1, 1, 2, 1) (1, 3, 64; 1, 1, 4, 1)
(1, 3, 68; 1, 1, 5, 1) (0, 96, 50; 5, 2, 1, 4) (0, 10, 112; 1, 4, 1, 1)
(0, 10, 110; 1, 6, 1, 1)

5. LR5 (1)
2 6 3
(1, 3, 64; 1, 1, 1, 1) (1, 3, 68; 1, 1, 2, 1)
-(1, 3, 65; 1, 1, 4, 1) = (0, 4, 70; 5, 2, 1, 4)
(1, 3, 68; 1, 1, 5, 1) (0, 96, 44; 5, 2, 1, 2) (0, 10, 112; 1, 4, 1, 1)

6. LR6 (1)
2 6 3
(1, 3, 64; 1, 1, 1, 1) (1, 3, 68; 1, 1, 2, 1) (1, 3, 64; 1, 1, 4, 1)
(1, 3, 68; 1, 1, 5, 1) (0, 96, 62; 5, 2, 1, 1) (0, 1, 104; 1, 2, 1, 2)
(0, 10, 112; 1, 4, 1, 1)

7. LR7 (1)
2 3 6
(1, 3, 64; 1, 1, 1, 1) (1, 3, 68; 1, 1, 2, 1) (1, 3, 64; 1, 1, 1, 4)
(1, 3, 68; 1, 1, 2, 4) (0, 95, 50; 5, 6, 1, 4) (0, 10, 112; 1, 4, 1, 1)
(0, 10, 114; 1, 5, 1, 1)

8. LR8 (1)
2 3 6
(1, 3, 64; 1, 1, 1, 1) (1, 3, 68; 1, 1, 2, 1)
-(1, 3, 65; 1, 1, 1, 4) = (0, 4, 70; 5, 6, 1, 4)
(1, 3, 68; 1, 1, 2, 4) (0, 95, 44; 5, 6, 1, 2) (0, 10, 114; 1, 5, 1, 1)

9. LR9 (1)
2 3 6
(1, 3, 64; 1, 1, 1, 1) (1, 3, 68; 1, 1, 2, 1) (1, 3, 64; 1, 1, 1, 4)
(1, 3, 68; 1, 1, 2, 4) (0, 10, 114; 1, 5, 1, 1) (0, 95, 62; 5, 6, 1, 1)

10. LR10 (1)
2 6 3
(1, 3, 64; 1, 1, 1, 1) (1, 3, 68; 1, 1, 2, 1) (1, 3, 64; 1, 1, 4, 1)
(1, 3, 68; 1, 1, 5, 1) (0, 94, 50; 1, 4, 1, 2) (0, 10, 116; 1, 2, 1, 4)
(0, 10, 114; 1, 5, 1, 1)

11. LR11 (1)
2 6 3
(1, 3, 64; 1, 1, 1, 1) -(1, 3, 69; 1, 1, 2, 1) = (0, 4, 70; 1, 4, 1, 2)
(1, 3, 64; 1, 1, 4, 1) (1, 3, 68; 1, 1, 5, 1) (0, 94, 44; 1, 4, 1, 4)
(0, 10, 116; 1, 2, 1, 4)

12. LR12 (1)
2 6 3
(1, 3, 64; 1, 1, 1, 1) (1, 3, 68; 1, 1, 2, 1) (1, 3, 64; 1, 1, 4, 1)
(1, 3, 68; 1, 1, 5, 1) (0, 10, 116; 1, 2, 1, 4) (0, 94, 62; 1, 4, 1, 5)

C. Small rings.

1. Basal ring corners.

a. SB1a (1)
2 2 3
(0, 3, 118; 1, 1, 1, 2) (0, 3, 118; 1, 3, 1, 2)
-(0, 49, 156; 1, 4, 1, 1) = (0, 52, 150; 1, 5, 1, 2) = (0, 97, 52; 1, 6, 1, 2)

b. SB1b (1)
2 2 3
(0, 3, 118; 1, 1, 1, 2) (0, 3, 118; 1, 3, 1, 2)
-(0, 50, 156; 1, 4, 1, 1) = (0, 49, 150; 1, 5, 1, 2) = (0, 96, 52; 1, 6, 1, 2)

c. SB1c (1)
2 2 3
(0, 3, 118; 1, 1, 1, 2) (0, 3, 118; 1, 3, 1, 2)
-(0, 51, 156; 1, 4, 1, 1) = (0, 50, 150; 1, 5, 1, 2) = (0, 95, 52; 1, 6, 1, 2)

d. SB1d (1)
2 2 3
(0, 3, 118; 1, 1, 1, 2) (0, 3, 118; 1, 3, 1, 2)
-(0, 52, 156; 1, 4, 1, 1) = (0, 51, 150; 1, 5, 1, 2) = (0, 94, 52; 1, 6, 1, 2)

2. General ring corners.
SG1 (44)
2 2 3
(0, 3, 118; 1, 1, 1, 2) (0, 3, 118; 1, 3, 1, 2)
-(0, 9, 144; 1, 4, 1, 1) = (0, 9, 138; 1, 5, 1, 2)

3. Basal ring, section 2.

a. SB2a (1)
2 2 5
(0, 99, 108; 1, 1, 1, 4)
-(0, 4, 72; 1, 6, 1, 1) = (1, 80, 30; 1, 3, 1, 1) = (1, 80, 32; 1, 1, 1, 1)
  = (0, 99, 106; 1, 3, 1, 4)
-(0, 97, 46; 1, 6, 1, 3) = (0, 49, 154; 1, 2, 1, 1) = (1, 1, 41; 1, 5, 1, 3)
+(1, 5, 27; 1, 3, 1, 2) = (0, 49, 148; 1, 4, 1, 1)

b. SB2b (1)
2 2 5
(0, 99, 108; 1, 1, 1, 4)
-(0, 4, 72; 1, 6, 1, 1) = (1, 80, 30; 1, 3, 1, 1) = (1, 80, 32; 1, 1, 1, 1)
  = (0, 99, 106; 1, 3, 1, 4)
-(0, 96, 46; 1, 6, 1, 3) = (0, 50, 154; 1, 2, 1, 1) = (0, 1, 41; 1, 5, 1, 3)
  = (1, 5, 26; 1, 3, 1, 2) = (0, 50, 148; 1, 4, 1, 1)

c. SB2c (1)
2 2 5
(0, 99, 108; 1, 1, 1, 4)
-(0, 4, 72; 1, 6, 1, 1) = (1, 80, 30; 1, 3, 1, 1) = (1, 80, 32; 1, 1, 1, 1)
  = (0, 99, 106; 1, 3, 1, 4)
-(0, 95, 46; 1, 6, 1, 3) = (0, 51, 154; 1, 2, 1, 1) = (0, 1, 41; 1, 5, 1, 3)
  = (1, 5, 26; 1, 3, 1, 2) = (0, 51, 148; 1, 4, 1, 1)

d. SB2d (1)
2 2 5
(0, 99, 108; 1, 1, 1, 4)
-(0, 4, 72; 1, 6, 1, 1) = (1, 80, 30; 1, 3, 1, 1) = (1, 80, 32; 1, 1, 1, 1)
  = (0, 99, 106; 1, 3, 1, 4)
-(0, 94, 46; 1, 6, 1, 3) = (0, 52, 154; 1, 2, 1, 1) = (0, 1, 41; 1, 5, 1, 3)
  = (1, 5, 26; 1, 3, 1, 2) = (0, 52, 148; 1, 4, 1, 1)

4. General ring, section 2.
SG2 (44)
2 2 5
(0, 99, 108; 1, 1, 1, 4)
-(0, 4, 72; 1, 6, 1, 1) = (1, 80, 30; 1, 3, 1, 1) = (1, 80, 32; 1, 1, 1, 1)
  = (0, 99, 106; 1, 3, 1, 4)
+(0, 9, 142; 1, 2, 1, 1) = (1, 5, 29; 1, 1, 1, 2) = (1, 5, 27; 1, 3, 1, 2)
  = (0, 9, 136; 1, 4, 1, 1)

5. Basal ring, section 3.

a. SB3a (1)
2 2 5
(0, 4, 120; 1, 1, 1, 1) (0, 4, 120; 1, 1, 1, 1)
+(0, 49, 146; 1, 2, 1, 1) = (0, 49, 152; 5, 4, 1, 1) = (0, 97, 60; 1, 6, 1, 3)
-(0, 1, 105; 1, 6, 1, 2) = (0, 1, 101; 1, 3, 1, 4) = (0, 1, 103; 1, 1, 1, 4)

b. SB3b (1)
2 2 5
(0, 4, 120; 1, 1, 1, 1) (0, 4, 120; 1, 3, 1, 1)
+(0, 50, 146; 1, 2, 1, 1) = (0, 50, 152; 5, 4, 1, 1) = (0, 96, 60; 1, 6, 1, 3)
-(0, 1, 105; 1, 6, 1, 2) = (0, 1, 101; 1, 3, 1, 4) = (0, 1, 103; 1, 1, 1, 4)

c. SB3c (1)
2 2 5
(0, 4, 120; 1, 1, 1, 1) (0, 4, 120; 1, 3, 1, 1)
+(0, 51, 146; 1, 2, 1, 1) = (0, 51, 152; 5, 4, 1, 1) = (0, 95, 60; 1, 6, 1, 3)
-(0, 1, 105; 1, 6, 1, 2) = (0, 1, 101; 1, 3, 1, 4) = (0, 1, 103; 1, 1, 1, 4)

d. SB3d (1)
2 2 5
(0, 4, 120; 1, 1, 1, 1) (0, 4, 120; 1, 3, 1, 1)
+(0, 52, 146; 1, 2, 1, 1) = (0, 52, 152; 5, 4, 1, 1) = (0, 94, 60; 1, 6, 1, 3)
-(0, 1, 105; 1, 6, 1, 2) = (0, 1, 101; 1, 3, 1, 4) = (0, 1, 103; 1, 1, 1, 4)

6. General ring, section 3.
SG3 (44)
2 2 5
(0, 4, 120; 1, 1, 1, 1) (0, 4, 120; 1, 3, 1, 1)
+(0, 9, 134; 1, 2, 1, 1) = (0, 9, 140; 5, 4, 1, 1)
-(0, 1, 105; 1, 6, 1, 2) = (0, 1, 101; 1, 3, 1, 4) = (0, 1, 103; 1, 1, 1, 4)

D. Top ring.

1. Corner segments.
TC (4)
2 5 5
(0, 99, 108; 1, 1, 1, 1) (0, 3, 118; 1, 1, 1, 4) (0, 4, 120; 4, 1, 3, 4)
+(0, 50, 33; 1, 4, 1, 2) = (0, 50, 26; 1, 5, 1, 3)

2. Control segment.
TCN (1)
2 5 5
(1, 80, 32; 2, 1, 2, 1) (1, 5, 29; 2, 1, 1, 1)
-(0, 1, 103; 2, 1, 4, 1) = (1, 12, 18; 1, 2, 1, 2)
-(1, 1, 23; 1, 2, 1, 3) = (0, 50, 24; 5, 5, 1, 2)
-(0, 50, 35; 1, 6, 1, 2) = (0, 1, 199; 1, 6, 1, 3) = (1, 30, 12; 1, 1, 3, 4)

3. Middle segments.
TM (3)
2 5 5
(1, 80, 32; 3, 1, 4, 2)
+(0, 50, 24; 1, 2, 1, 3) = (0, 50, 35; 5, 4, 1, 3)

E. End cap.
EC (1)
2 3 3
-(1, 12, 20; 1, 4, 1, 1) = (0, 11, 14; 1, 1, 1, 1)
+(1, 1, 21; 1, 4, 1, 2) = (0, 8, 17; 1, 1, 2, 1)

F. DNA molecule.

1. Main chain.
DM (8)
4 3 3
(0, 11, 13; 1, 1, 1, 1) (0, 8, 16; 1, 3, 2, 1) (0, 11, 13; 1, 3, 1, 1)

2. Chain end.
DE (1)
4 3 3
(0, 11, 13; 1, 3, 1, 1) (1, 1, 1; 1, 4, 2, 1)

G. Base appendages.

1. Group 1.
BA1 (4)
2 6 3
(1, 3, 64; 1, 1, 1, 1) (1, 3, 68; 1, 1, 2, 1) (0, 10, 110; 1, 5, 1, 1)

2. Group 2.
BA2 (4)
2 3 6
(1, 3, 64; 1, 1, 1, 1) (1, 3, 68; 1, 1, 2, 1) (0, 10, 112; 1, 2, 1, 1)

3. Group 3.
BA3 (4)
2 6 3
(1, 3, 64; 1, 1, 4, 1) (1, 3, 68; 1, 1, 5, 1) (0, 10, 114; 1, 6, 1, 1)

4. Group 4.
BA4 (4)
2 3 6
(1, 3, 64; 1, 1, 1, 4) (1, 3, 68; 1, 1, 2, 4) (0, 10, 116; 1, 4, 1, 1)

The capitalized code letters (such as BA4) designate types of phage subunits, and they correspond to the labels in Figs 4-11. Each label is followed by a number in parentheses indicating how many identical copies of the given subunit appear in a completed phage. The second line of each group indicates the x, y, and z dimensions of the subunit. The remaining lines specify bond sites and bond state vectors, and they are explained in Section 4.

assumed to be in a non-reactive state. (This is done to prevent phage molecules from bonding to the barrier prematurely.) In the subsequent frames of Fig. 15 the assembly process nears the final stage shown in Fig. 4.

Once the phage is fully assembled, it is allowed to make contact with the cell wall, which is now activated. The attachment of the phage's large ring (LR*) to the cell wall triggers conformational changes which propagate through the body of the phage. As a result of these changes, strong bonds become possible between the 12 rings making up the contractile sheath. These rings then draw together, forcing the central coil to penetrate the cell wall. Figure 5 gives an exploded view of the tail,

showing the rings and central coil, and Fig. 10 illustrates the conformational change linkages involved in sheath contraction. Here, the formation of a bond between 1:3:68 in the cell wall and 1:3:69 in LR2 initiates contraction. The power for contraction is supplied by the formation of the 0-length bonds between sites having 0:99:108 and sites having 0:99:107 as a result of conformational change. Several stages in this contraction process are shown in Fig. 16.

Once contraction has fully taken place, the linkages shown in Fig. 11 result in the release of the test molecule, which is now free to diffuse to the other side of the barrier. The key event here is a conformational change converting 0:11:14 in EC to 0:11:15, and thereby breaking its bond with 0:11:13 in the test molecule. Once this bond is broken, surface repulsion between EC and the test molecule overcomes the remaining bonds connecting the test molecule to the phage, and it is free to drift away. (This phase is not shown in Fig. 16.)

Thus, as can be seen from Figs 15 and 16, this model does have the ingredients for simulating self-assembly and operation of a phage from its components. We have developed software which allows animation in color of the whole process of self-assembly and operation on an IBM PC/XT or AT compatible system, with an EGA graphics board and a high resolution monitor. Also, the 3-D phage design

work was carried out with the aid of a display program that could make three-dimensional drawings showing the relationships between groups of subunits. (This program was used to generate Figs 4-6.) In contrast, the earlier 2-D designs (Thompson & Goel, 1985) could be easily sketched on a piece of graph paper.

FIG. 12. An aberrant stage in phage self-assembly resulting from an error in the phage's "genetic code".

FIG. 13. A concentration-dependent error in phage self-assembly which can occur using the design of Table 1. Here the sheath begins to form before the inner coil is completed.

We conclude our discussion by making a few observations regarding design of the phage. The design of the phage, which includes specification of the subunits (shape, size, bonding sites, and state vectors b(x) of various sites), is the most difficult and biophysically most rewarding aspect of these simulations. For certain designs, the phage will simply not assemble; still, there are several designs for which assembly and wall penetration will occur. There are many interesting aspects of the designs, summarized below, which should be addressed further through systematic experimentation with many designs.

(1) In designing the phage an effort was made to achieve as much realism as possible. This was done by using compact subunits that are more like globular proteins than the elongated rectangles used in our earlier studies on two-dimensional phages (Thompson & Goel, 1985). Also, an attempt was made to use arrays of identical subunits rather than requiring each subunit to have its own unique bond site configurations. Thus the 11 contractile rings visible in Figs 4 and 5 are all made of identical subunits, and each ring is made up of only 3 different subunit types. (See Fig. 9.)

FIG. 14. A stage in self-assembly of the phage using a slightly modified design which avoids the error shown in Fig. 13. (See Section 4 for details.)

(2) The design of the 3-D phage is much more complex than any of the 2-D designs which were considered. This increased complexity may be partially due to the limitations imposed by our decision to use arrays of identical compact subunits. However, this complexity seemed to emerge naturally in the effort to construct a working 3-D model that would solve the predefined problem of self-assembly and barrier penetration. One general question that arises in this context is: what is the minimal complexity required for a natural biological system, given the problems it must solve to survive and the components (such as proteins) that are available to solve them?

(3) The data in Table 1 defining the phage can be compared to the genetic coding defining the protein components of an actual bacteriophage. By using arrays of identical subunits, one can reduce the total number of subunits that must be individually defined and thus reduce the volume of this "genetic" data. However, for a structure with arrays of identical subunits to self-assemble properly, provisions must be made to start and terminate the growth of the arrays at the appropriate points. This can be done by using "conformational programming" to build control algorithms into the growing structure. Thus, to reduce the amount of gross genetic coding, one must increase the logical complexity of the conformational interconnections in the phage.

FIG. 15. Computer-generated graphics representing the simulated self-assembly of the phage specified in Table 1. Nine successive stages are shown, leading up to the final stage shown in Fig. 4.

(4) The need for complex conformational programming led to the development of phage subunits that have as many as 10 bond sites, nine of which are involved in conformational switching. This seems to be rather high for a realistic protein, and the number could be reduced by further subdividing the subunit or distributing its functions among several other subunits. It would be interesting to know, however, how many bond sites and conformational linkages are involved in the proteins of actual phages, such as T4.

(5) There is a great deal of interdependence among the components of the 3-D phage design, and this makes it quite difficult to make significant design modifications. Generally, if one component is changed, corresponding changes must be made in many other components.

(6) Although an effort was made to reduce the number of distinct subunit types to a minimum, it was found that specialized subunits were often needed. For example, although 11 of the contractile rings in the phage are of the type shown in Fig. 9, a twelfth specialized ring was required at the phage's basal end. (This is shown in Figs 8 and 9.) We found that it was natural to respond to these design requirements by using a process akin to the gene duplication found in nature. We would simply

make a copy of the specifications for one component, and modify them to make a new, specialized component. The result of this process can be seen in several places in Table 1.

FIG. 16. Computer-generated graphics representing the simulated operation of the phage specified in Table 1.

(7) We have observed that the assembly process will be impeded if the contractile sheath is allowed to form before completion of the coil. Here is another interesting example of the need to carefully control the assembly process. Initially it was thought that since a ring is a naturally self-limiting form, the various ring subunits should be allowed to assemble together freely; completed rings would then form, and these could be added to the structure one at a time. Unfortunately, it turned out that after all the ring components were bound in complexes, a number of just over half-completed rings remained. These partially completed rings could not combine with one another to form complete rings, and they tended to be incorporated into growing phage structures. Analysis shows that ring formation should therefore be required to proceed one subunit at a time, starting with an initial point of nucleation.

(8) The operation of the 3-D phage involves a push directed against the barrier by the central coil, and a corresponding pull exerted by the base plate. We found that the base plate could easily be uprooted by this pull, thus defeating the efforts of the phage to penetrate the barrier. This tendency was overcome by the addition of the 16 base appendages, which increase the phage's grip on the surface of the cell wall. Several stages of the operation of the phage are given in Fig. 16.

Our final observation concerns the amount of computer time needed to carry out the 3-D phage simulation. We found that this simulation required about 10 times as much computer time (typically 5 to 6 h on an IBM PC/XT compatible computer equipped with a math coprocessor) as the 2-D phage models. To reduce the amount of computer time required, we introduced the following shortcuts into the simulation

process. First, no attempt was made to move partially bonded complexes that theoretically should not move, given the nature of the phage design. Second, since the barrier molecules directly underneath the central tube were likely to move much more frequently than others during phage operation, these were polled more frequently to see if a move was required. We note that one should be careful when imposing restrictions of the first type; the phage design often behaved in unexpected ways, and sections which were supposedly firmly anchored in place would sometimes break loose under strain.

5. Concluding Remarks and Future Work

In general, the value of simulating bacteriophage assembly and operation with MFA models lies in the insights this exercise can give into the molecular logic of these biological structures. Although models of this kind cannot represent molecular interactions on a detailed biophysical level, they can faithfully represent the logical steps in the conformational programming that governs the behavior of macromolecular complexes in biological systems. Although in this study we have simply experimented with plausible algorithms that simulate the broad features of bacteriophage assembly and operation, one could hope to devise models that fully duplicate these processes on the logical level. Of course, to do this, more detailed observational data are required. In addition, the model needs to be modified further to make it more realistic along the lines indicated below.

(1) Software for interactive model development. In the present version of the model, one specifies the design of the phage and the rules for movement, and the state of the phage is displayed at predetermined intervals. An alternative and more attractive version is one in which a user can provide input to the model from the keyboard, see the results, and modify the model (again from the keyboard) if the results differ from what was expected. Such interactive software will obviously accelerate the development of an MFA model. It could also be used to generate a mapping between specifications of the model and the corresponding outcomes, and as a diagnostic tool. In addition, we expect this to be an invaluable tool for developing MFA models of not only bacteriophages, but also other biological systems.

(2) Generalization to non-rectangular subunits. Present MFA models use rectangular subunits.
To make MFA models more widely applicable (e.g., for simulating assembly of the phage head) it will be desirable to generalize the models to allow units to deform and take on other shapes (e.g., ellipsoidal).

(3) Incorporating generalized rotational movement. Present MFA models allow only for translational motion and limited rotation of the subunits (by 90° increments). The model should be generalized so as to allow for arbitrary rotational as well as translational movement. In general, it is difficult to represent rotations within the framework of a 3-dimensional rectangular lattice of integers. However, the following approach can be used to define MFA models with general rotation on a finite 5-dimensional lattice. Each point in a subunit is assigned coordinates (x, y, z, u, v), where (x, y, z) specifies position, and (u, v) specifies angular orientation. All of the

points defining a subunit must have the same (u, v) value. Translation of a subunit is achieved by changing its (x, y, z) coordinates in the usual way. Rotation is achieved by changing (u, v) to (u', v') for each point in the subunit, where we assume a finite set of possible (u, v) values giving a more or less uniform selection of possible orientations. This translation in the (u, v) dimensions has the effect of rotation when interactions between subunits are calculated: when two subunits are to interact, they are first rotated through their (u, v) angles (using stored rotation matrices for efficiency) and properly positioned in 3-dimensional Euclidean space. This is also done when structures are displayed on the computer monitor. Here, the objective is to represent 3-dimensional movement as fully as possible, while satisfying the requirement that all transformations must occur by discrete steps. If rotational movement is introduced into the model, presumably it will be appropriate to introduce non-rectangular molecular shapes.

(4) Step size of the movement. If the protein unit is allowed to make a long-range jump (as is presently the case), the rate of self-organization is accelerated. However, this may introduce some systematic distortion in the course of events. For example, if a molecule moving in the step-by-step mode can reach a site in a given complex only by negotiating a narrow channel, then it may be much less likely to do so in this mode than it will in the long-range jumping mode. It is important to consider this aspect to avoid local trapping and to optimize the efficiency of self-organization.

(5) Assembly of capsule or head. The formation of the head involves an assembly process that is even more complex than that of the tail structure. Caspar (1980, pp. 103-104) points out that the Tn virus capsules possess the same type of highly symmetrical geometrical structure that is utilized in Buckminster Fuller's geodesic domes.
These structures, which are called icosadeltahedrons, can be constructed from a combination of 60n identical subunits, each of which can appear on the surface in n different geometrical situations. According to Caspar, although the facets of a T4 (or, in general, Tn) bacteriophage are identical proteins or protein complexes, they are able to form stable contacts with other subunits in n different ways due to quasi-equivalence. Since the identical subunits of the T4 capsule come together in topologically different configurations in the course of capsule formation, it is hard to see how the capsule could be formed simply by the mutual recognition between randomly moving capsule facets. Indeed, Casjens & King (1975, pp. 567-574) cite evidence indicating that the capsules of T4 and other bacteriophages are constructed on the basis of a temporary system of internal scaffolding. In the case of T4, they also point out that covalent modification of the capsule proteins is carried out by viral enzymes at certain stages in the assembly process. Although one can see in principle how all of these transformations can be regulated by conformational programming, the specific steps involved are presently unknown. Even less is known about the process whereby viral DNA is packaged (or "encapsidated") within the capsules (Casjens, 1985).

(6) Designs for self-assembly and operations. The phage designs we have considered have several interesting aspects, summarized below, which need to be addressed further through systematic experimentation with many designs.

(a) What are the trade-offs between using identical subunits and allowing each subunit to have its own unique bond site configuration? Our experience suggests that for identical subunits, the functioning design is more complex. Here, provisions must be made to start and terminate the growth of the complexes of subunits at the appropriate points by using "configurational programming" to build control algorithms into the growing structure. Since reducing the total number of subunits that must be individually defined could be taken to be equivalent to reducing the volume of "genetic" data, it seems that this reduction must be accompanied by increasing the logical complexity of the conformational interconnections in the phage.

(b) What is the extent of necessary interdependence among the components of the phage? We found that if one component is changed, corresponding changes must be made in many other components, and sometimes an entirely new design strategy had to be developed.

(c) Is it necessary that certain parts of the phage be formed before others for a successful assembly, and could failure to do so lead to errors in assembly? The evidence we have suggests this to be the case. For example, if the 11 small rings (around the central tube) are allowed to form before the central tube, then the formation of the structure of tube surrounded by rings is blocked.

(d) What are the relationships between various components of the phage and its operation? Here we should point out that the simulation of phage operation involves a push directed against the barrier by the central tube, and a corresponding pull exerted by the base plate. As noted in the preceding section, we found that the base plate can easily be uprooted by this pull, thus defeating the efforts of the phage to penetrate the barrier. This tendency was overcome by the addition of the 16 base appendages, which increase the phage's grip on the surface of the barrier.
(7) Implementation on distributed array processors. Though the present model is implemented on a readily available microcomputer, as the model becomes more realistic, it will no doubt require the use of mainframe computers or possibly even computers with parallel processing capability (i.e., a distributed array processor, in which large numbers of microprocessors execute operations in parallel). The basic form of MFA models makes it possible to readily implement them on such parallel processors. If each processor in the array keeps track of the state of a particular "molecule" and obtains information from processors representing that molecule's current neighbors, then the interactions of all the molecules in the model during one time step can be carried out during one computational cycle of the computer. This should allow for rapid simulation of phage formation and functioning.

MFA models have a relatively simple basic structure which is intended to represent biological systems on the level of logic and control (or software) rather than on the level of detailed physical interaction (hardware). Thus it is practical to study quite complex MFA models through computer simulation and mathematical analysis. Since MFA models must actually work when simulated on a computer, the exercise of constructing such models to illustrate biological hypotheses can provide a useful check on the viability of those hypotheses. Also, if it is found that certain programming steps seem to be necessary to construct a working MFA simulation of a

biological process, then one can argue that such steps actually occur in nature and might be observed by biologists. Because of their characteristics and flexibility, we strongly believe that MFA models will have wide applicability in studying self-organization and evolution of biological systems (e.g., folding of globular proteins, biosynthesis of proteins, assembly of proteins into cell membranes, origin of macromolecular machinery, etc.). In future papers in this series and elsewhere (Goel & Thompson, 1988), we hope to present such applications of MFA models.

This research was, in part, supported by a Biomedical Research Support Grant SO7RR07149-12, awarded by the Biomedical Research Support Grant Program, Division of Research Resources, National Institute of Health, and a grant from NASA.
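As an illustration only, the 5-dimensional lattice scheme proposed in item (3) above might be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation used for the simulations reported here: the lattice resolution (N_U, N_V), the reading of u and v as rotation steps about the z and x axes, and all function and variable names are hypothetical. A regular grid of two angles is also a simplification, since it does not give the "more or less uniform" selection of orientations that the proposal calls for.

```python
import math

# Assumed resolution of the orientation lattice; the text gives no numbers.
N_U = 8  # discrete steps of rotation about the z axis (index u)
N_V = 8  # discrete steps of rotation about the x axis (index v)


def _rot_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def _rot_x(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def _matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


# Rotation matrices are computed once and stored, one per (u, v) value.
ROTATIONS = {
    (u, v): _matmul(_rot_z(2.0 * math.pi * u / N_U),
                    _rot_x(2.0 * math.pi * v / N_V))
    for u in range(N_U)
    for v in range(N_V)
}


def make_subunit(shape, pos=(0, 0, 0), orient=(0, 0)):
    """A subunit: integer point offsets sharing one position and one (u, v)."""
    return {"shape": list(shape), "pos": tuple(pos), "orient": tuple(orient)}


def translate(subunit, dx, dy, dz):
    """Move the subunit by integer steps in (x, y, z)."""
    x, y, z = subunit["pos"]
    return dict(subunit, pos=(x + dx, y + dy, z + dz))


def rotate(subunit, du, dv):
    """Change (u, v) to (u', v'), wrapping around the finite lattice."""
    u, v = subunit["orient"]
    return dict(subunit, orient=((u + du) % N_U, (v + dv) % N_V))


def embed(subunit):
    """Place the subunit's points in 3-D Euclidean space for interaction tests."""
    r = ROTATIONS[subunit["orient"]]
    px, py, pz = subunit["pos"]
    points = []
    for offset in subunit["shape"]:
        rx, ry, rz = (sum(r[i][k] * offset[k] for k in range(3))
                      for i in range(3))
        points.append((px + rx, py + ry, pz + rz))
    return points
```

In this sketch a move changes only the integer coordinates (x, y, z, u, v), so every transformation remains a discrete step. Real-valued positions are produced by embed only when two subunits are compared or a structure is displayed.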

REFERENCES

BERDING, C., HARBICH, T. & HAKEN, H. (1983). J. theor. Biol. 104, 53.
BERLEKAMP, E. R. & CONWAY, J. H. (1982). Winning Ways for Your Mathematical Plays. London: Academic Press.
CASJENS, S. (1985). Virus Structure and Assembly. Boston: Jones and Bartlett.
CASJENS, S. & KING, J. (1975). Annual Rev. Biochem. 44, 555.
CASPAR, D. L. (1980). Biophys. J. 32, 103.
EIGEN, M., SCHUSTER, P., SIGMUND, K. & WOLFF, R. (1980). BioSystems 13, 1.
EIGEN, M. & WINKLER-OSWATITSCH, R. (1983). In: Structure, Dynamics, Interactions and Evolution of Biological Macromolecules (Helene, C. ed.), p. 353. Boston: D. Reidel.
EISERLING, F. A. (1983). In: Bacteriophage T4. Washington: Amer. Soc. for Microbiology.
FARMER, D., TOFFOLI, T. & WOLFRAM, S. (eds.) (1984). Cellular Automata. New York: Elsevier Science.
GIERER, A. (1981). Phil. Trans. Roy. Soc. Lond. B295, 429.
GO, N. (1983). Annual Rev. Biophys. Bioeng. 12, 183.
GOEL, N. S. (1978). Proc. Int. Symp. on Math. Topics in Biology, Kyoto, Japan, Sept. 11-12, p. 146.
GOEL, N. S. & ROGERS, G. (1978). J. theor. Biol. 71, 103.
GOEL, N. S. & THOMPSON, R. L. (1986). Int. Rev. Cytol. 103, 1.
GOEL, N. S. & THOMPSON, R. L. (1988). Computer Simulations of Self-organization in Biological Systems. London: Croom Helm.
HENDRIX, R. W. (1985). In: Virus Structure and Assembly (Casjens, S. ed.). Boston: Jones and Bartlett.
KARPLUS, M. & MCCAMMON, A. (1983). Annual Rev. Biochem. 52, 263.
KARPLUS, M. & MCCAMMON, J. A. (1986). Sci. Amer. 254, no. 4, 42.
KAUFFMAN, S. A. (1981). Phil. Trans. Roy. Soc. Lond. B295, 567.
LEVITT, M. (1983). J. molec. Biol. 170, 723.
MATHEWS, C. K., KUTTER, E. M., MOSIG, G. & BERGOT, P. B. (eds.) (1983). Bacteriophage T4. Washington: Amer. Soc. for Microbiology.
MEINHARDT, H. (1982). Models of Biological Pattern Formation. New York: Academic Press.
NICOLIS, G. & PRIGOGINE, I. (1977). Self-Organization in Nonequilibrium Systems. New York: John Wiley.
PRIMROSE, S. B. & DIMMOCK, N. J. (1980). Introduction to Modern Virology. New York: John Wiley.
ROGERS, G. & GOEL, N. S. (1978). J. theor. Biol. 71, 141.
ROSSMAN, M. G. & ARGOS, P. (1981). Annual Rev. Biochem. 50, 497.
SCHULZ, G. E. & SCHIRMER, R. H. (1979). Principles of Protein Structure. New York: Springer-Verlag.
SEGEL, L. A. (1984). Molecular Dynamic Phenomena in Molecular and Cellular Biology. Cambridge: Cambridge Univ. Press.
SIMON, L. D. & ANDERSON, T. F. (1967). Virology 32, 279.
THOMPSON, R. & GOEL, N. S. (1985). BioSystems 18, 23.
WOLFRAM, S. (1984a). Nature 311, 419.
WOLFRAM, S. (1984b). Physica 10D, 1.
WOOD, W. B. (1980). Quart. Rev. Biol. 55, 353.

APPENDIX 1

Biophysical Description of Bacteriophage Tail Assembly and its Operation

In this appendix, we will provide an overview of the biophysical aspects of T4 bacteriophage tail assembly and operation. (The latter includes adsorption on the cellular wall and penetration of the wall to inject the phage DNA into the bacterium.) This overview is adapted from one given in Thompson & Goel (1985). More detailed reviews can be found in Casjens (1985), Mathews et al. (1983), and Primrose & Dimmock (1980).

(A) Tail assembly. The tail includes the head-tail connector, the core to which it is attached, the surrounding sheath, and the hexagonal base plate. Tail formation first involves assembly of the hexagonal base plate, then polymerization of the tube and sheath subunits on the base plate, followed by termination of the completed tube and sheath. The tail assembly processes are sequential; the proteins interact with each other only in a particular order. If a protein is missing due to mutation, the precursor structure accumulates, and the proteins involved in subsequent steps remain functional and unaggregated. Adding the missing protein to an extract of mutant-infected cells often results in the formation of complete viral particles.

Sections of the base plate are formed by two different subassembly processes. In one of these processes, the five major base plate structural proteins assemble sequentially into a complex that morphologically represents a 1/6th segment of the base plate. In another process, several proteins assemble into a complex representing the base plate's central section (plug). Six segments then combine with this section to yield a completed base plate. After the assembly of the plate, four more proteins interact with the base plate. One protein provides the site for tail fiber attachment, and a second one forms the short fibers that interact with bacteria.
The other two proteins are then added sequentially to the base plate to prepare it for tail tube polymerization. The tail tube is produced by the successive addition of exactly 24 annuli. Each annulus is composed of six protein subunits of one size or six subunits of a larger size. Once the formation of the tail tube has begun, the polymerization of the tail sheath also commences. This polymerization is completed by the action of a protein which forms a stable bond between the ends of the tube and the sheath and prepares the completed tail for the addition of the viral capsule. Casjens & King (1975, p. 578) note that tail proteins appear to be synthesized in a state in which they do not spontaneously assemble; rather, they are activated during the assembly process itself by incorporation into a substrate complex. The regulation of the entire pathway takes place by limiting reactive sites to growing structures during the assembly process. Caspar (1980) proposes that this orderly process of self-organization can be explained in terms of conformational switching of the protein subunits.

(B) The process of operation: adsorption and penetration. With bacteria in liquid culture, the interaction between phage and bacteria most likely occurs by simple

diffusion. If this is what actually happens, the rate of adsorption would be proportional to the product of the free phage concentration and the bacterial concentration. Experiments measuring the adsorption rate as a function of these concentrations support the idea of such a proportional dependence. Because the head of a T4 phage is much larger than the tail, diffusion will tend to cause the tail to oscillate more than the head, making it more likely that the tail will collide first with the cell and attach to the cell wall. The initial attachment, or adsorption, is generally reversible. It may be favored or inhibited by varying the concentration of certain ions in the medium, in particular, Mg2+ and Ca2+. Eventually the phages become irreversibly adsorbed so that their removal becomes impossible. Once attached, the long tail fibers which make the first attachment bend at their center, and the phage particle is apparently brought closer to the cell surface. When the base plate of the phage is about 100 Å from the cell wall, contact is made between the cell wall and the short pins extending from the base plate, and the phage becomes fastened to the bacterium. The way in which a phage penetrates the cell wall is a rather complex and fascinating story. Briefly, as noted above, the sheath of the phage tail is contractile and in the extended form consists of 24 rings of subunits surrounding a core. Each ring consists of six subunits of one size or six subunits of a larger size. Following adsorption, the tail contracts, resulting in a merging of small and large subunits to give 12 rings of 12 subunits each. The tail core, which is not contractile, is pushed through the outer layers of the bacterium with a twisting motion, resulting in the injection of the DNA into the cell. This mechanism has been likened to injection by a hypodermic syringe.
It should be noted that there are 144 ATP molecules built into the sheath, and the energy for tail contraction most likely comes from their conversion to ADP. Once the viral DNA enters the bacterium, the cellular machinery of the bacterium proceeds to construct viral proteins on the basis of the information encoded in this DNA. Then a self-assembly process results in the formation of viral structures within the bacterium. Viral DNA is replicated (using viral enzymes) and packaged within the freshly constructed viral capsules, and these are joined to completed tail sections. Finally, a viral enzyme dissolves the bacterial cell wall, and the new viral particles are released into the external medium.
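Two quantitative statements in part (B) of this appendix can be checked with a short sketch. The first is the proportionality of the adsorption rate to the product of the phage and bacterial concentrations. The second is the subunit count of the sheath before and after contraction. The rate constant k, the assumption of a constant bacterial concentration in the closed-form decay, and all function and constant names are hypothetical and introduced only for illustration; the ring, subunit and ATP counts are those given in the text.

```python
import math


def adsorption_rate(k, free_phage, bacteria):
    """Adsorption rate proportional to the product of the two concentrations."""
    return k * free_phage * bacteria


def free_phage_remaining(p0, k, bacteria, t):
    """Free phage left at time t, assuming the bacterial concentration
    stays constant (an added assumption, not a statement from the text)."""
    return p0 * math.exp(-k * bacteria * t)


# Sheath bookkeeping taken from the text of this appendix.
EXTENDED_RINGS, SUBUNITS_PER_EXTENDED_RING = 24, 6
CONTRACTED_RINGS, SUBUNITS_PER_CONTRACTED_RING = 12, 12
ATP_IN_SHEATH = 144


def sheath_subunits(rings, per_ring):
    """Total number of sheath subunits for a given ring arrangement."""
    return rings * per_ring
```

Doubling either concentration in adsorption_rate doubles the rate, which is the proportional dependence described above. Both sheath arrangements give 24 x 6 = 12 x 12 = 144 subunits, so contraction rearranges the subunits without changing their number. This total equals the 144 ATP molecules, which would be consistent with one ATP per sheath subunit, although the text does not state this.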