Structure and assembly of turnip crinkle virus


J. Mol. Biol. (1986) 191, 625-638

Structure and Assembly of Turnip Crinkle Virus
I. X-ray Crystallographic Structure Analysis at 3.2 Å Resolution

J. M. Hogle†, A. Maeda and S. C. Harrison‡

Department of Biochemistry and Molecular Biology
Harvard University, 7 Divinity Avenue
Cambridge, MA 02138, U.S.A.

(Received 4 September 1985, and in revised form 21 April 1986)

The structure of turnip crinkle virus has been determined at 3.2 Å resolution, using the electron density of tomato bushy stunt virus as a starting point for phase refinement by non-crystallographic symmetry. The structures are very closely related, especially in the subunit arm and S domain, where only small insertions and deletions and small co-ordinate shifts relate one chain to another. The P domains, although quite similar in fold, are oriented somewhat differently with respect to the S domains. Understanding of the structure of turnip crinkle virus has been important for analyzing its assembly, as described in an accompanying paper.

1. Introduction

The three crystalline RNA plant viruses for which high-resolution X-ray structures now exist, tomato bushy stunt virus (TBSV§; Harrison et al., 1978), southern bean mosaic virus (SBMV; Abad-Zapatero et al., 1980), and satellite of tobacco necrosis virus (STNV; Liljas et al., 1982), have a number of striking features in common. The coat protein subunits of all three have similarly folded structures; TBSV and SBMV coat proteins have significantly related primary sequences; all three proteins have a flexibly linked, positively charged amino-terminal segment to interact with RNA. We report here the determination of the structure of a fourth crystalline plant virus, turnip crinkle virus (TCV), at 3.2 Å resolution. The study was undertaken in order to examine a structure anticipated in advance to be very closely related to TBSV and in order to visualize a particle for which in vitro reassembly experiments could be carried out in parallel with the crystallographic analysis.

The homology of TBSV and TCV was evident in early work using X-ray diffraction at low resolution and electron microscopy (Longley, 1963; Finch et al., 1970). The coat proteins are of similar size, and proteolytic cleavage of expanded virions indicates that they have a similar domain organization (Golden & Harrison, 1982). Comparison of 6° precession photographs of TBSV and TCV convinced us that the two transforms were sufficiently related for us to exploit the similarity by determining an initial phase set for TCV with a molecular replacement method and using the high non-crystallographic symmetry of the TCV crystals for phase refinement. This strategy has been successful, and our work illustrates its limitations as well as its merits.

The TBSV structure, with which the present analysis begins, contains 180 copies of a coat subunit with 387 amino acid residues. We show in an accompanying letter (Stockley et al., 1985) that in TBSV and in TCV, two of these subunits are covalently linked to form a unique dimer, termed p80. The polypeptide chain of each TBSV subunit folds into three domains (R, S and P), with an extended arm connecting R and S, and a short hinge peptide between S and P. The R domain (about 66 residues), spatially disordered because of a flexible connection to the rest of the subunit, forms an inward-projecting RNA binding site. The 35 residues that connect R and S are folded in a regular way on 60 of the subunits (denoted C), but they are disordered on the remaining 120 subunits (A and B). The structure formed by the 60 C-subunit connecting arms is important in regulating assembly, as shown in the accompanying paper (Sorger et al., 1986). The S domains (170 residues) have an all β-sheet framework with a "jelly roll" chain topology. The structure of this domain has been described in detail, and the nature of intersubunit contacts has been analyzed. The P domains (110 residues) are pairwise clustered, forming the 90 projections that are strongly contrasted by negative staining for electron microscopy. P domains of C subunits project at a different angle from those of A and B. The hinge between S and P domains consists simply of alternative conformations of three or four residues where the polypeptide chain passes from one domain to the other.

The structure of TCV, determined as described in this paper, has all the features of TBSV outlined above. We have obtained partial protein sequence information (Stockley, unpublished results), and we recently learned that Carrington et al. (unpublished results) have derived a complete primary structure for the coat protein from the nucleotide sequence of a cDNA clone. The TBSV and TCV sequences are clearly related, but they agree at only about 23% of the positions in a preliminary alignment that includes arm, S and P domains. The best agreement is in the S domain (30%), where an unambiguous correspondence can be obtained. The TCV structure as seen at the present stage of analysis enables us to interpret the assembly experiments of Sorger et al. (1985) in terms of structural details now well understood from analysis of TBSV and SBMV. Because the TCV structure has been solved entirely by molecular replacement, the determination itself has a number of novel aspects that we present below in some detail.

† Present address: Department of Molecular Biology, Research Institute of Scripps Clinic, La Jolla, CA 92037, U.S.A.
‡ Author to whom correspondence should be sent.
§ Abbreviations used: TBSV, tomato bushy stunt virus; SBMV, southern bean mosaic virus; STNV, satellite of tobacco necrosis virus; TCV, turnip crinkle virus.

2. Materials and Methods

(a) Virus propagation

TCV was propagated in Chinese cabbage (Brassica chinensis var. crispy choy). Seedlings (4 to 6 weeks old) were infected by rubbing the leaves with a slurry of Carborundum in a solution containing 3 mg TCV in 100 ml of 10 mM-sodium phosphate buffer (pH 7.0). Leaves were harvested after 5 to 6 weeks growth, and the virus was isolated as described by Leberman (1966). Typical yields were 0.5 g TCV/kg leaves, or about 1.5 g TCV/50 plants. Virus concentrations were monitored spectrophotometrically assuming A(1%) = 50. The virus was stored as a 3 to 4% stock solution in distilled water containing 0.01% sodium azide to prevent microbial contamination.

(b) Crystallization

The crystal form studied here was first obtained by R. Leberman (unpublished results). Well-ordered crystals could be grown only as the methyl mercury adduct; the corresponding native crystals have a complex packing disorder. The methyl mercury adduct was obtained by bringing a stock solution of virus (3.5% TCV (w/v) in 0.01% NaN3) to 6 equivalents of methyl mercury per protein subunit by addition of 15 mM-methyl mercury nitrate and incubating for 1 h. Crystallization was then initiated by addition of an approximately equal volume of saturated sodium citrate (pH 7.0) and allowed to proceed undisturbed for 2 to 4 months. The optimum concentration of sodium citrate required to produce large crystals varied from experiment to experiment, but was generally in the range 42 to 46% saturated. Crystals grew as truncated square prisms, the major form being the [100] zone. The truncation was caused by initiation of growth on the 110 or 101 face; termination of growth was irregular, causing crystals to have a rough appearance at one end.

The crystals belong to space group I222 with unit cell constants a = 348.8 Å, b = 379.1 Å, c = 397.4 Å. Packing considerations dictate that the virus (point group 532) be located at the 222 positions (the origins of the body-centered lattice). Thus the crystallographic asymmetric unit contains 1/4 of the virus particle.

(c) Data collection

Data were collected by oscillation (Arndt & Wonacott, 1977) using CuKα radiation from an Elliott GX-6 rotating-anode generator equipped with a 100 μm × 1 mm focusing cup and operated at 40 kV, 20 mA. The beam was focused at the film (100 mm crystal-to-film distance) using Franks mirror optics (Harrison, 1968), giving a beam diameter at the crystal of approximately 100 μm. Data were recorded on Kodak No-screen X-ray film. Although the useful exposure lifetime of the crystal (24 h) was only sufficient to give one photograph per position of the crystal, the small beam diameter enabled several photographs to be obtained from a single crystal by translation of the crystal between exposures. The final data set consisted of 88 films taken from approximately 50 crystals. Each photograph covered a 1/2° oscillation range and required a 12 to 18-h exposure. Of the 88 photographs, 72 were taken with the crystal mounted such that [011] was along the oscillation axis. These photographs sampled a range of 0 to 45° rotation, where 0° was defined as the position in which the [011] zone axis was along the X-ray beam (see Fig. 1). In addition, 12 films and 4 films were taken with the crystal mounted such that [101] and [110], respectively, were along the oscillation axis. This strategy gave a data set containing 310,000 of the 420,000 unique reflections to 3.2 Å. The gaps, due primarily to poor films discarded but not retaken, were randomly distributed in reciprocal space; the 15-fold non-crystallographic redundancy more than compensated for this slight incompleteness.

Optical densities from the photographs were measured with an Optronics P1000 rotating-drum scanner under the control of a Digital PDP11/34 computer and the SCAN12 program (Crawford, 1977). Films were scanned with a 50 μm raster and processed on-line. Prior to the scan, the crystal orientation for each photograph was refined using 12 manually indexed partially recorded reflections, as described elsewhere (Schutt, 1976). The refined parameters were used to calculate the film co-ordinates of reflections expected to occur in that film. Peak centroids were refined during the scan; they were allowed to shift by up to 2 raster units in each direction from their calculated positions. Reflections with centroids shifted by more than 2 rasters in either direction (typically less than 5% of all reflections in a film) were flagged as poorly measured. Spot and background areas for each reflection were determined by masks as described elsewhere (Winkler et al., 1979), and measurements were corrected for film and scanner non-linearity. On a typical photograph 12,000 intensities with a 3 Å cutoff were measured, of which approximately 75% were partially recorded.
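As a rough check on the figures quoted above, the number of unique reflections to 3.2 Å can be estimated from the cell constants alone. The sketch below is an editorial illustration in Python, not part of the original data processing. It assumes the standard factors of 2 for body centring and 8 for the mmm Laue symmetry of I222, ignores the small corrections for reflections lying on special planes, and uses purely illustrative variable names.

```python
import math

# Cell constants quoted in the crystallization section (Å); space group I222.
a, b, c = 348.8, 379.1, 397.4
d_min = 3.2                                   # resolution limit (Å)

cell_volume = a * b * c                       # Å^3

# Reciprocal-lattice points inside a sphere of radius 1/d_min:
# sphere volume (4/3)·π/d_min^3 divided by the reciprocal-cell volume 1/V.
n_lattice_points = (4.0 / 3.0) * math.pi / d_min**3 * cell_volume

# Body centring removes reflections with h+k+l odd (factor 2);
# Laue group mmm relates 8 general reflections (factor 8).
n_unique = n_lattice_points / (2 * 8)

# Fraction of the unique data actually measured (figures from the text).
completeness = 310_000 / 420_000

# Separation of nearest lattice points in a body-centred cell:
# half the body diagonal (assumes particles sit on the lattice points).
half_diagonal = 0.5 * math.sqrt(a * a + b * b + c * c)

print(round(n_unique), round(completeness, 2), round(half_diagonal, 1))
```

Under these assumptions the estimate comes out close to the 420,000 unique reflections cited in the text, so the 310,000 measured correspond to roughly three quarters of the sphere. The half body diagonal, about 325 Å, is the centre-to-centre distance of neighbouring particles placed on the body-centred lattice points.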


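Table 1
Data collection statistics

Films                                            88
[row label lost in extraction]                   680,000
Unique reflections possible to 3.2 Å             420,000
Unique reflections in final data set             310,000
R_sym (all reflections)                          0.18
R_sym (fully recorded reflections only)†         0.12

† $R_{sym} = \sum |I_i - \bar{I}| / \sum I_i$.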

The data from each film were processed by established techniques (Winkler et al., 1979). For each film, fully recorded reflections were used to construct a plot of log(I) versus sin²θ/λ². Plots from several representative films were used to derive a mean radial transform, and relative scale and exponential factors (the latter reflecting differential fall-off of intensity with resolution) were determined by fitting the transform of each film to the average. Partially recorded reflections greater than 50% were corrected to their fully recorded equivalent using the post-refinement procedure of Winkler et al. (1979). All data with d ≥ 3.2 Å from the 88 films were combined into the final data set of 310,000 unique reflections. Pertinent statistics are presented in Table 1.
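The scale and exponential factors described above amount to a straight-line fit in the log(I) versus sin²θ/λ² plot. The following Python sketch illustrates that idea only; it is not the procedure of Winkler et al. (1979). It assumes the model I_film = K · exp(−2Bs) · I_ref with s = sin²θ/λ², works on already binned mean intensities, and the function name `fit_scale_and_falloff` and the synthetic numbers are hypothetical.

```python
import numpy as np

def fit_scale_and_falloff(s, i_film, i_ref):
    """Fit ln(i_film / i_ref) = ln(K) - 2*B*s by linear least squares.

    s      : sin^2(theta)/lambda^2 for each resolution bin (Å^-2)
    i_film : mean intensity of one film in each bin
    i_ref  : mean radial transform (reference) in each bin
    Returns (K, B): relative scale and exponential fall-off factor.
    """
    s = np.asarray(s, dtype=float)
    y = np.log(np.asarray(i_film, dtype=float) / np.asarray(i_ref, dtype=float))
    slope, intercept = np.polyfit(s, y, 1)    # y = slope*s + intercept
    return float(np.exp(intercept)), float(-slope / 2.0)

# Synthetic check: bins out to s = 1/(4 d^2) with d = 3.2 Å (about 0.024 Å^-2).
s = np.linspace(0.002, 0.024, 12)
i_ref = 1000.0 * np.exp(-40.0 * s)
i_film = 2.5 * np.exp(-2.0 * 10.0 * s) * i_ref   # known K = 2.5, B = 10 Å^2

K, B = fit_scale_and_falloff(s, i_film, i_ref)
print(round(K, 3), round(B, 3))
```

On noise-free synthetic data the fit returns the scale and fall-off used to generate it; with real films the same fit would be applied bin by bin to fully recorded reflections only.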


3. Structure Determination

(a) Similarity of TBSV and TCV

The similarity of TBSV and TCV at low resolution was evident from analysis of electron micrographs and from three-dimensional image reconstruction of TBSV and TCV small particles (Crowther & Amos, 1971). Although the space groups are different, qualitative features of the X-ray diffraction patterns enabled Longley (1963) to conclude that the continuous transforms were extremely similar to at least 20 Å resolution. We compared 6° precession photographs of the 2-fold projection of TBSV (space group I23, a = 383.2 Å) and of two of the three 2-fold projections of TCV (data not shown). The variation of intensity along "spikes" and the distribution of regions of strong and weak transform corresponded well, and we concluded that the two structures were sufficiently similar at 8 Å resolution (the limit of a 6° photograph) for computation at this resolution of initial phases for TCV from TBSV. We could not anticipate at that stage whether TBSV would be adequate for phase extension at higher resolution, or whether a small translation or reorientation of the subunit with respect to the icosahedral axes might be indicated by the maps obtained from low-resolution phase refinement.

The packing diameter in TCV crystals is 325.5 Å; in TBSV it is 331.5 Å. The contacts are along similar but not identical directions. Packing in TBSV is directly along 3-fold axes (the body diagonals of the cubic unit cell), whereas in TCV it is along a direction somewhat displaced from a 3-fold (the body diagonal of the orthorhombic but nearly isodimensional unit cell). The smaller packing radius of TCV could be due entirely to differences in P-domain contacts, to differences in the orientation of the P domains, or to an overall expansion. Our results show a combination of the first two differences, but no relative contraction of the S-domain shell of TCV with respect to TBSV.

(b) Outline of the structure determination

The similarity of TCV and TBSV was exploited to derive initial phases for TCV from the phases for TBSV. Alignment of the two structures, the initial step in this "molecular replacement" approach, was fixed by the space group symmetry. In both cases, the orientation of the virus in the unit cell is determined by the coincidence of particle and crystallographic 2-fold axes (Fig. 1), with a single ambiguity resolved by locating intensity spikes in the diffraction pattern along directions of 3-fold and 5-fold axes. Molecular replacement could thus be achieved by simple reconstitution of an averaged TBSV electron density map on a TCV grid. Because TCV and TBSV are not identical, success

Figure 1. (a) Packing of TCV particles (sphere and partial spheres) in the I222 unit cell. Directions of some icosahedral axes (5 and 3) are shown. (b) Part of the TCV reciprocal lattice, showing directions important in oscillation data collection strategy.

[Figure 2 flow chart; stages: molecular replacement, phase extension, phase refinement.]

Figure 2. Outline of the phase determination computations used in TCV structure determination. Programs named in capital letters in Figs 2 to 5 are described by Bricogne (1976). Definitions and notations for Figs 2 to 5: F_o, observed amplitude; F_MR, α_MR, calculated amplitude and phase from molecular replacement; α_ref, d_ref, phases refined by non-crystallographic symmetry averaging to a spacing d_ref; α_ext, d_ext, phases for extension, calculated from the TBSV model to a spacing d_ext; U532, envelope of the icosahedral asymmetric unit; UI222, envelope for the crystallographic asymmetric unit (not "folded"; see Olson et al., 1983); T_m, mth symmetry transformation; I, X(I, T_m), associated indices: where to put density in the coarse grid map (at integer grid point I) and where to fetch density interpolated from the fine grid map (X); in both cases, the point is folded back into the space-group asymmetric unit; there are 15 such records, corresponding to T_m, m = 1, 15, with the same I; a superscript s on X means sorted on y (the fine grid map is sectioned on y). FFT, fast Fourier transform.

of the approach depends heavily on the use of non-crystallographic symmetry to refine the initial phases (Bricogne, 1974, 1976). Non-crystallographic symmetry has been shown to be effective in deriving refined phases from an initial set determined by multiple isomorphous replacement (Bloomer et al., 1976; Harrison et al., 1978) or by single isomorphous replacement (Wilson et al., 1981). It has also recently been used to extend as well as to refine isomorphous replacement phases (Rossmann et al., 1985; Hogle et al., 1985). Here we report the use of the 15-fold non-crystallographic redundancy in TCV crystals (Fig. 2) to derive phases from a starting set that depends on the similarity of two structures. In order to ensure the validity of the initial phasing model, the phase determination was begun at low resolution. TBSV phases were applied to TCV data out to 10 Å and refined. The phase determination was then extended in steps from 10 Å to 8 Å to 6.6 Å to 5.5 Å to 4 Å and finally to 3.2 Å by alternating cycles of phase extension and of refinement to convergence. Phase extension consisted simply of appending molecular replacement phases at high resolution to previously refined phases at lower resolution, with steps chosen to result in an approximate doubling of the number of terms phased. Examination of the map at each stage showed that TBSV continued to be a reasonable model for phase extension, even at 3.2 Å.

The structure determination is shown in outline in Figure 2, and individual segments are outlined in more detail in Figures 3 to 6 and discussed below. The programs used for all steps in the calculation were written by Bricogne (1976) and modified for use on a Digital VAX 11/780 computer with 3 Mbyte core memory and 300 Mbyte disks. The terminology used in the text and flow chart follows that of Bricogne (1976). The major difference in these computations from previous applications of non-crystallographic symmetry refinement has been in the use of 2F_o − F_c coefficients to generate maps for averaging. With phases, α_c, based on partial structures, Fourier synthesis with F_o, α_c coefficients contains those elements of the structure not included in the phase calculation at approximately 1/2 weight. Fourier synthesis with coefficients of the type nF_o − (n−1)F_c, where n ≥ 2, can show all features at full weight, including density corresponding to parts of the structure omitted from the phase calculation. The utility of maps based on such coefficients has been established for real-space phase refinement (Deisenhofer & Steigemann, 1975).

(c) Generation of envelopes

In TCV crystals, one crystallographic asymmetric unit contains 15 icosahedral asymmetric units, or one quarter of a particle. Proper averaging of an electron density map about local (non-crystallographic) symmetry elements requires truncation of the map by an envelope that contains those portions obeying the local symmetry. In the present case, the envelope must cover the 1/4 particle described above; it will be called UI222.
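The count of 15 icosahedral asymmetric units per crystallographic asymmetric unit follows from the orders of the two point groups: 60 rotations in 532, of which 4 coincide with the crystallographic 222 operations. The Python sketch below, an editorial illustration rather than any part of the Bricogne programs, builds both groups by closure from generators. It assumes the conventional orientation with 2-fold axes along x, y and z, a 3-fold along (1, 1, 1) and a 5-fold along (0, 1, φ); the helper names `rotation` and `close_group` are hypothetical.

```python
import numpy as np

def rotation(axis, angle_deg):
    """Rotation matrix about `axis` by `angle_deg` (Rodrigues' formula)."""
    x, y, z = np.asarray(axis, dtype=float) / np.linalg.norm(axis)
    k = np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])
    t = np.radians(angle_deg)
    return np.eye(3) + np.sin(t) * k + (1.0 - np.cos(t)) * (k @ k)

def close_group(generators, decimals=6):
    """Return every distinct product of the generators (finite group closure)."""
    def key(m):
        # Round so that numerically equal matrices compare equal.
        return tuple(np.round(m, decimals).ravel() + 0.0)
    found = {key(np.eye(3)): np.eye(3)}
    frontier = [np.eye(3)]
    while frontier:
        new = []
        for m in frontier:
            for g in generators:
                p = g @ m
                if key(p) not in found:
                    found[key(p)] = p
                    new.append(p)
        frontier = new
    return list(found.values())

phi = (1.0 + 5.0 ** 0.5) / 2.0            # golden ratio
two_fold_z = rotation([0, 0, 1], 180)
two_fold_x = rotation([1, 0, 0], 180)
three_fold = rotation([1, 1, 1], 120)
five_fold = rotation([0, 1, phi], 72)

icosahedral = close_group([two_fold_z, three_fold, five_fold])
crystal_222 = close_group([two_fold_z, two_fold_x])
ncs_redundancy = len(icosahedral) // len(crystal_222)

print(len(icosahedral), len(crystal_222), ncs_redundancy)   # 60 4 15
```

The 15 operations that remain after the crystallographic 2-folds are accounted for are the transformations T_m over which the map is averaged.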

This envelope, as defined by the Bricogne programs, is most conveniently generated from an envelope describing a single icosahedral asymmetric unit, denoted U532. Within a radius of approximately 140 Å, U532 can be geometrically defined by the triangular wedge contained between two 3-fold axes and a 5-fold axis, the geometric icosahedral asymmetric unit (see Fig. 2 of Olson et al., 1983). At larger radii, use of the complete icosahedral asymmetric unit about one lattice point as the envelope would result in inclusion of density from neighboring particles, and it is important to describe correctly the interlacing P-domain projections. A procedure for so doing was described by Olson et al. (1983) for TBSV (see Fig. 2 of that paper). In all phase extension steps, and in early stages of the phase refinement, U532 was the TBSV envelope (Olson et al., 1983). After the fourth cycle of phase refinement at 4 Å (see below), a new U532 for TCV was generated. Within a radius of 140 Å, this envelope was still defined by the limits of the geometric icosahedral asymmetric unit. Beyond 140 Å, envelope sections were traced on a position-sensitive magnetic tablet from a 15-fold averaged (2F_o − F_c) electron density map at 4 Å resolution (in which F_c and phases were the refined values from the fourth cycle at 4 Å resolution).

UI222 was generated from U532 as summarized in Figure 3. Each point within U532 was expanded by the 5-fold and 3-fold symmetry operations to generate a list of points to be included within UI222. These points were then sorted on their y co-ordinates and UI222 constructed as y sections of a byte map on a TCV coarse grid. The TCV coarse grid was chosen to be d/2.5, where d is the resolution limit of the stage of refinement in question (see below).

(d) Generation of phases from TBSV

Because the position and orientation of the virus in the TCV unit cell are uniquely determined, the problem of molecular replacement is reduced to the construction of an icosahedrally averaged TBSV electron density map on a TCV grid. This calculation is outlined in Figure 4. A TBSV electron density map was first calculated to a resolution, d, using the observed TBSV amplitudes and refined phases (Harrison et al., 1978; Olson et al., 1983). The methods and programs described by Bricogne (1976) were then used to reconstitute an icosahedrally averaged version of this map on a grid appropriate to the TCV unit cell. A fast Fourier transform (FFT) structure factor calculation in the TCV space group then yielded calculated molecular replacement amplitudes (F_TCV,MR) and phases (α_TCV,MR).

In order to avoid an impossibly large number of records for sorting at high resolution (there are, for example, some 26 million associated indices at 3.2 Å resolution), we modified the program GENERATE so that in one pass only those points I for which the section index (y) of I falls within specific limits are output and processed through map reconstitution. For example, a calculation in which the coarse grid asymmetric unit (I) covers sections y = 0 to y = 30 may be done in two passes. In the first pass, the output of GENERATE is restricted to those indices, I, for which y ranges from 0 to 15. Densities can then be fetched at I, X(I, T_m) and a partial map reconstituted containing all densities in sections 0 to 15. The procedure is then repeated, restricting associated indices to I, X(I, T_m) such that y of I ranges from 16 to 30. This pass gives a partial map containing all the densities for sections 16 to 30. Appending the second map to the first then yields a complete ("coarse-grid") map for subsequent back transformation.
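The sectioned, multi-pass scheme can be illustrated with a toy calculation. The Python sketch below is a deliberately simplified stand-in, not the GENERATE program: it assumes a small periodic cubic map, uses four integer quarter-turn operators about z in place of the 15 icosahedral transformations, fetches the nearest fine-grid point instead of interpolating, and applies no envelope. The function name `reconstitute` and all of its parameters are hypothetical.

```python
import numpy as np

def reconstitute(fine, ops, step, y_lo, y_hi):
    """Average `fine` over `ops` on coarse-grid sections y_lo..y_hi (inclusive).

    fine : cubic, periodic fine-grid map indexed [x, y, z]
    ops  : integer 3x3 matrices acting on fine-grid indices (toy symmetry set)
    step : number of fine-grid intervals per coarse-grid interval
    """
    n = fine.shape[0]
    nc = n // step
    out = np.zeros((nc, y_hi - y_lo + 1, nc))
    for ix in range(nc):
        for iy in range(y_lo, y_hi + 1):          # only the requested sections
            for iz in range(nc):
                point = step * np.array([ix, iy, iz])
                # Fetch density at each symmetry mate, folded back into the cell.
                fetched = [fine[tuple((t @ point) % n)] for t in ops]
                out[ix, iy - y_lo, iz] = np.mean(fetched)
    return out

rng = np.random.default_rng(0)
fine = rng.random((16, 16, 16))
quarter_turn = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]])
ops = [np.linalg.matrix_power(quarter_turn, k) for k in range(4)]

# One pass over all eight coarse sections ...
one_pass = reconstitute(fine, ops, step=2, y_lo=0, y_hi=7)
# ... versus two passes over half the sections each, appended afterwards.
first = reconstitute(fine, ops, step=2, y_lo=0, y_hi=3)
second = reconstitute(fine, ops, step=2, y_lo=4, y_hi=7)
two_pass = np.concatenate([first, second], axis=1)

print(np.allclose(one_pass, two_pass))
```

Because each coarse-grid point is processed independently, restricting a pass to a range of y sections changes only how many records must be held and sorted at once, not the resulting map.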

Figure 3. Envelope generation. See Fig. 2 of Olson et al. (1983) for illustrations of these envelopes.

Figure 4. Molecular replacement. Flow diagram for generation of starting TCV phases by reconstitution of TBSV map on the grid of the TCV unit cell.


(e) Phase extension

At the initial stage of the calculation (performed at 10 Å resolution) the molecular replacement phases were applied directly to the observed TCV amplitudes. At each subsequent molecular replacement stage, a set of refined phases (α_ref) has been determined for all reflections with a spacing greater than d_ref (15 Å ≥ d ≥ d_ref). To extend phases to a higher resolution, d_ext, the phases α_ref were applied to all F_o values with 15 Å ≥ d ≥ d_ref and the molecular replacement phases, α_MR, were applied to all reflections with d_ref > d ≥ d_ext. These "hybrid" phases were combined with observed amplitudes (F_o), and a 2F_o − F_c map carrying the hybrid phases was calculated on a "fine" (d_ext/5) grid covering the TCV asymmetric unit. Subsequent refinement by icosahedral averaging then proceeded as outlined in Figure 5. The process was iterated until the mean phase change per cycle had converged; usually four cycles were sufficient to drop the mean phase change per cycle to less than 10°. The calculations at high resolution were subdivided as described above, by making multiple passes with a restricted range for y in I in each pass.

(f) Results of phase refinement

The course of refinement was followed by monitoring the crystallographic R value, $R = \langle |F_o - F_c| \rangle / \langle F_o \rangle$, the standard correlation coefficient:

$$ r = \frac{\langle F_o F_c \rangle - \langle F_o \rangle \langle F_c \rangle}{\left[ \left( \langle F_o^2 \rangle - \langle F_o \rangle^2 \right) \left( \langle F_c^2 \rangle - \langle F_c \rangle^2 \right) \right]^{1/2}} $$

and the mean phase change per cycle. These are summarized in Table 2 and Figure 6, and the phase statistics after the final cycle at 3.2 Å are summarized in Figure 7. In addition, icosahedrally averaged electron density maps of several types were calculated and inspected at the end of refinement at 5.5 Å and after cycle 4 at 4 Å resolution. These included maps based on F_o, α_ref coefficients, on 2F_o − F_ref, α_ref coefficients, and on F_o − F_MR, α_ref coefficients. The latter syntheses were used to look for major differences between the TCV and TBSV structures, as an indication of whether structural features specific for TCV had indeed been generated by the refinement. One such indicator was whether density corresponding to the methyl mercury ions could be seen. The F_o and 2F_o − F_ref maps at 5.5 Å resolution

Table 2
Non-crystallographic symmetry phase refinement

                                            After final cycle
Resolution   Terms     No. of    ------------------------------------------------
(Å)          phased    cycles    R        r        Δφ (final cycle)  Δφ (overall)
                                                   (°)               (°)

First round (starting from TBSV model)
10           6502      3         0.364    0.600    11                43
8            15,692    4         0.335    0.665    8                 47
6.6          30,569    4         0.346    0.653    9                 54
5.5          54,651    4         0.342    0.669    7                 56
4.0          143,951   6†        0.347    0.657    9                 56
3.2          262,052   4         0.355    0.669    10                60

Second round (starting from TCV model in first-round map)
6.6          30,569    4         0.314    0.709    7                 53
5.5          54,651    3         0.327    0.700    10                45
4.0          143,951   3         0.330    0.684    9                 42
3.2          262,052   3         0.358    0.689    11                45

Definitions: R (crystallographic R factor) and r (correlation), see legend to Fig. 6. Δφ (final cycle), phase change on the last cycle of non-crystallographic symmetry refinement at the resolution step indicated. Δφ (overall), difference between initial and final phases for the resolution step indicated; initial phases were a combination of phases from the model (for the range of phase extension only) and the final phases from the previous resolution step.
† New envelope for last 2 cycles.

Figure 6. Phase statistics as a function of cycle number for non-crystallographic symmetry refinement at 3.2 Å resolution. The crystallographic R factor (×), correlation (○) and mean phase change per cycle are shown. $R = \langle |F_o - F_c| \rangle / \langle F_o \rangle$, where F_c is derived from the symmetrized, enveloped map at each stage;

$$ r = \frac{\langle F_o F_c \rangle - \langle F_o \rangle \langle F_c \rangle}{\left[ \left( \langle F_o^2 \rangle - \langle F_o \rangle^2 \right) \left( \langle F_c^2 \rangle - \langle F_c \rangle^2 \right) \right]^{1/2}} $$

looked quite similar to 5.5 Å resolution maps of TBSV. In particular, we could easily recognize features such as the "extra" density near the 2-fold interface of C subunits at 120 Å radius (Winkler et al., 1977) and the "saddle-like" features (Winkler et al., 1977) now known to represent, respectively, the arms of the C subunits and the inner surfaces of A, B and C-subunit S domains. In addition, the quasi-3-fold relationship of the density remained strong, with the exception of a large peak at approximately 125 Å radius, just above the "arm" of the C subunit, which was not present at corresponding sites on the A and B subunits. This feature was the highest peak in the F_o − F_MR map, and it was tentatively identified as a mercury site. The next strongest features of the F_o − F_MR map, a set of maxima on the outside surfaces of the P domains of

Figure 7. Statistics for final phase-refinement cycle at 3.2 Å as a function of resolution. The crystallographic R factor (×) and correlation (○) are shown. These quantities are defined in the legend to Fig. 6.

all three subunits, at a radius of approximately 150 Å, were also tentatively identified as mercury sites.

The maps at 4 Å resolution revealed strong features corresponding to the tentatively identified mercury sites; indeed the similarity among A, B and C subunits of the density at the putative P-domain site was significantly improved. These maps also indicated a strong similarity in the overall chain topology of TCV and TBSV, although differences in detail, especially in external loops, were apparent. Close inspection of the 4 Å map revealed points where side-chains clearly differed between the TBSV and TCV structures. For example, density in the TCV 4 Å map near the position of TBSV Arg109 (C subunit), which bridges to residue 89, indicated a smaller side-chain in TCV; a bridge of density appeared instead between side-chain positions of TBSV residues 89 and 163. In the same region, density at the position of TBSV Tyr111 was inconsistent with an aromatic side-chain. These results clearly indicated that the refinement was yielding useful information about the TCV structure that had not been included in the molecular replacement model.

Inspection of the 2F_o − F_c map after cycle 4 also indicated the presence of significant density at the edge of, and even outside, the envelope used in averaging. This density could easily be encompassed by a more generous envelope without including portions of neighboring particles. A new, more generous envelope was therefore generated as described above and summarized in Figure 3. This envelope was used in the final two cycles of refinement at 4 Å resolution (cycles 5 and 6) and in all cycles of the refinement at 3.2 Å resolution.

The final R factor of 0.355 at 3.2 Å is substantially poorer than the R = 0.20 achieved with TBSV (Olson et al., 1983). This difference probably reflects the poorer quality of the TCV data (R_sym = 0.18 as compared to 0.13 for TBSV), since phasing with the TCV model did not substantially improve the R value, although it did improve the map.
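The two agreement statistics used throughout this section are simple averages over reflections. As a minimal Python sketch (an editorial illustration, assuming the observed and calculated amplitudes are already on a common scale and listed for the same reflections; the function names and the four sample amplitudes are hypothetical):

```python
import numpy as np

def r_factor(f_obs, f_calc):
    """Crystallographic R value: <|Fo - Fc|> / <Fo>."""
    fo = np.asarray(f_obs, dtype=float)
    fc = np.asarray(f_calc, dtype=float)
    return float(np.mean(np.abs(fo - fc)) / np.mean(fo))

def correlation(f_obs, f_calc):
    """Standard linear correlation coefficient between Fo and Fc."""
    fo = np.asarray(f_obs, dtype=float)
    fc = np.asarray(f_calc, dtype=float)
    numerator = np.mean(fo * fc) - fo.mean() * fc.mean()
    denominator = np.sqrt((np.mean(fo ** 2) - fo.mean() ** 2)
                          * (np.mean(fc ** 2) - fc.mean() ** 2))
    return float(numerator / denominator)

# Made-up amplitudes, for illustration only.
f_obs = np.array([100.0, 80.0, 60.0, 40.0])
f_calc = np.array([90.0, 85.0, 50.0, 45.0])

R = r_factor(f_obs, f_calc)        # mean |difference| 7.5 over mean Fo 70
r = correlation(f_obs, f_calc)
print(round(R, 3), round(r, 3))
```

R is sensitive to the relative scale of the two sets of amplitudes, whereas r is not, which is one reason for following both during refinement.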

(g) Initial refined electron density map at 3.2 Å resolution

A portion of the 3.2 ir resolution map obtained as just outlined is shown in Figure 8(a) and (b), with the C, traces of TBSV and TCV superimposed. This region has been chosen to illustrate how the TCV map shows significant differences from the course of the TBSV polypeptide backbone; that is, to illustrate the extent to which phase refinement reveals a new structure. With the TBSV backbone as an initial guide, a model for TCV was constructed using the program BILDER and an Evans and Sutherland Multi-picture system (Diamond, 1980). It was possible to obtain a continuous fit for the S domain backbone, although in several locations it was necessary to compare maps at homologous positions in A. B and C

J. M. Hogle et al.

632

subunits in order to place all C, atoms with The P domain presented greater confidence. problems. By assuming the topology of TBSV-P, we obtained an excellent fit to all clear density, but there were two regions of the map sufficiently poor for it to be necessary to leave discontinuities in the chain. One of these zones was clearly due to strong Fourier artifact around a mercury site, resulting from absence of mercury atoms in the TBSV model used for initial phases. In the C subunit, a further discontinuity in backbone density occurred between the folded arm and the S domain. Since the quality of the map was not as good as the 2.9 A TBSV map (Harrison et al., 1978; Olson et aE., 1983), we could not make a reliable guess at the TCV sequence, for which no information was then available. We therefore used the TBSV sequence as a guide. making changes only where clearly indicated by the presence or absence of prominent side-chains. This procedure gave a good fit to calculated density. (h) Further

phase rejinement

from

the initial

model

The model described above was built in C-subunit density! and transformations required to superpose S and P domains of the model onto A and B as follows. TBSV subunits were obtained transformations (Olson et aZ., 1983) were used as a first approximation, and the transformed model for each domain was compared with appropriate density using BILDER’ (Diamond, 1980) and an Evans and Sutherland Multi-picture system. Positions were determined for those C, atoms whose locations could be clearly recognized on the map. The t,ransformation required for the best agreement, of model C, atoms and targeted positions in density was then calculated using the method of Kabsch (1976). Co-ordinates for a model of the crystallographic asymmetric unit were generated from A, B and C subunits by appropriate 5fold and 3-fold rotations. Mercury atoms were included. Structure-factor calculations were performed with an FFT routine in 1222 (Ten Eyck, 1973; Olson d al.. 1983): using a model density map created from the co-ordinates by t,he program RHOGEPr’ (courtesy of S. J. Remington, adapted from EDCALC. see Ten Eyck, 1977). Phase refinement’ proceeded in resolution stages, precisely as described in the preceding section, using phases from the model as a starting point, rather than phases from TBSV. We began directly at 6.6 A resolution, rather than at 10 A. At, each point where the resolution was extended, refined phases were used out to the resolution limit of the previous set of cycles and model phases were used for extension. A new envelope was drawn for these calculations based on the model and on t,he first 3.2 A map. (i) I*mproved electron density map at 3.2 A resolution

A portion of the final map is shown in Figure 8(c). Comparison with the corresponding part of the initial refined map shows that definite improvement has been introduced by the second round of calculations. In addition, regions around mercury atoms are much clearer, since these atoms were explicitly included in the model co-ordinate set. Shortly after this map was completed, a partial amino acid sequence of the arm and S domain became available (P. G. Stockley, unpublished results). This sequence corresponded to 20 residues at the N terminus of the 30 × 10³ Mr fragment, generated by chymotryptic cleavage of dissociated TCV (Golden & Harrison, 1982), plus 40 further residues derived from cDNA sequences. The 60 residues were clearly identified in the map, and a full model for some of the region of Figure 8 is shown in density in Figure 9.
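The superposition step of section (h), in which model Cα atoms are brought into best agreement with target positions picked in density, can be illustrated with a minimal sketch. It uses the singular-value-decomposition form of the least-squares fit, a common modern variant and not necessarily the formulation used in the original calculation. The function names `kabsch_superpose` and `rmsd` and the synthetic co-ordinates are assumptions made for illustration only.

```python
import numpy as np


def kabsch_superpose(mobile, target):
    """Return (R, t) minimising sum |R @ mobile_i + t - target_i|^2.

    mobile, target: (N, 3) arrays of paired C-alpha positions.
    Hypothetical helper; SVD variant of the least-squares superposition.
    """
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)
    mob_centroid = mobile.mean(axis=0)
    tgt_centroid = target.mean(axis=0)
    # 3x3 covariance matrix of the centred co-ordinate sets.
    H = (mobile - mob_centroid).T @ (target - tgt_centroid)
    U, _, Vt = np.linalg.svd(H)
    # Force a proper rotation (determinant +1), not a reflection.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_centroid - R @ mob_centroid
    return R, t


def rmsd(a, b):
    """Root-mean-square distance between paired (N, 3) co-ordinate sets."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))


# Synthetic test data (not from the paper): 25 random "C-alpha" positions,
# and the same points after a known rotation about z plus a translation.
rng = np.random.default_rng(0)
model_ca = rng.normal(scale=10.0, size=(25, 3))
angle = np.radians(30.0)
R_true = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle), np.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
])
t_true = np.array([5.0, -3.0, 2.0])
target_ca = model_ca @ R_true.T + t_true

R_fit, t_fit = kabsch_superpose(model_ca, target_ca)
fitted = model_ca @ R_fit.T + t_fit
print("r.m.s. before fit:", rmsd(model_ca, target_ca))
print("r.m.s. after fit: ", rmsd(fitted, target_ca))
```

The `rmsd` helper can also be applied to two co-ordinate sets without any fitting, which corresponds to comparing structures in a common frame.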

4. Discussion

(a) TCV structure: Cα backbone

The quality of the final TCV electron density map permits a complete tracing of the polypeptide chain in the arm and S domain, and it leaves little ambiguity about residue positions, even in the turns and loops between β strands. The map is not as distinct in all regions of the P domain, but with the TBSV domain as a guide, it permits an almost complete tracing of the polypeptide backbone. The general features of the TCV subunit, including the β annulus, extended arm, S and P domains, are extremely similar to those of TBSV. In particular, the Cα backbones in the arms and S domains of the two viruses are nearly identical, except for certain loops between β strands. The P domains are also closely related, but with some difference in overall orientation and some relative distortion of the β sandwich. A comparison of C-subunit arms and S domains is shown in Figure 10. Both structures in Figure 10(b) are referred to a co-ordinate frame with its origin at the center of the virus particle and its axes along 2-fold symmetry axes; that is, no transformation has been applied to superimpose one domain on the other. The root-mean-square (r.m.s.) difference for positions of corresponding Cα atoms in the framework of the β roll is 1.2 Å. The agreement is not significantly improved by translation or reorientation of the subunits. This result shows that not only is the TCV S domain identical in fold to the TBSV S domain, but that it is also packed in precisely the same way relative to the icosahedral axes. The framework of the SBMV shell domain can likewise be superimposed on the S domains of TBSV and TCV, without significant displacement or reorientation. The β annulus of TCV consists of three symmetrically related β segments of two antiparallel strands each. That is, each chain contributes to only two of the three segments, as in SBMV, in contrast to the more extensively wound configuration in TBSV.
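The co-ordinate frame used for this comparison, with its origin at the particle center and its axes along icosahedral 2-fold axes, can be made concrete with a short sketch. It builds the 60 icosahedral rotations from one 5-fold and one 3-fold generator and then counts how many remain distinct once the three 2-folds along x, y and z are factored out. The axis directions are standard geometry for this orientation. The helper names, and the identification of those three 2-folds with crystallographic 222 symmetry, are assumptions made for illustration and are not taken from the authors' programs.

```python
import numpy as np


def rotation(axis, angle_deg):
    """Rotation matrix about `axis` by `angle_deg` (Rodrigues' formula)."""
    x, y, z = np.asarray(axis, dtype=float) / np.linalg.norm(axis)
    K = np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])
    a = np.radians(angle_deg)
    return np.eye(3) + np.sin(a) * K + (1.0 - np.cos(a)) * (K @ K)


def key(R):
    """Hashable fingerprint of a rotation matrix, tolerant of round-off."""
    return tuple(np.round(R, 6).ravel() + 0.0)


def close_group(generators):
    """Multiply by the generators repeatedly until no new rotation appears."""
    group = {key(np.eye(3)): np.eye(3)}
    frontier = [np.eye(3)]
    while frontier:
        new = []
        for R in frontier:
            for G in generators:
                P = G @ R
                k = key(P)
                if k not in group:
                    group[k] = P
                    new.append(P)
        frontier = new
    return list(group.values())


# Frame with 2-fold axes along x, y and z: a 5-fold axis then lies along
# (0, 1, phi) and a 3-fold axis along (1, 1, 1), phi being the golden ratio.
PHI = (1.0 + np.sqrt(5.0)) / 2.0
five_fold = rotation([0.0, 1.0, PHI], 72.0)
three_fold = rotation([1.0, 1.0, 1.0], 120.0)
icosahedral = close_group([five_fold, three_fold])

# The identity plus the three 2-folds along the co-ordinate axes.
d2 = [np.eye(3)] + [rotation(ax, 180.0) for ax in np.eye(3)]

# Group the 60 rotations into classes related by those 2-folds, keeping
# one representative operator per class.
cosets = {}
for R in icosahedral:
    representative = min(key(S @ R) for S in d2)
    cosets.setdefault(representative, R)
ncs_operators = list(cosets.values())

print(len(icosahedral), len(ncs_operators))  # 60 15
```

Under these assumptions, 15 operators remain, which matches the count obtained by combining 5-fold and 3-fold rotations to fill a crystallographic asymmetric unit.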
Figure 8. (a) A region of the initial refined electron density map of TCV, with the TBSV Cα trace superimposed. The region shown corresponds to residues 90 to 110 of the TCV C subunit, viewed from outside the S domain, looking toward the particle center (see Fig. 10). Arrow shows TBSV loop not present in TCV: its density has been "erased" by phase refinement. (b) The same map region as in (a), but with TCV chain trace superimposed. (c) Corresponding portion of the final refined electron density map, with TCV chain trace. Arrow shows loop where TCV has 2 more residues than TBSV. Note also the better definition of a prominent side-chain near the upper left of the Figure (Tyr105, see Fig. 9).

Figure 9. Model for TCV residues 95 to 105, showing fit of side-chains to electron density. The phase refinement, starting originally from a TBSV model and at an intermediate stage from an approximate TCV model with guesses for side-chains, has produced a 3.2 Å map that shows the actual sequence very well. The residues shown here were determined by amino acid sequencing from the 30 × 10³ Mr chymotryptic fragment (P. G. Stockley, unpublished results) and confirmed by the full sequence derived from cDNA (Carrington et al., unpublished results).

An important similarity among all three structures, relevant to a discussion of assembly mechanism in the succeeding paper of this series, is the way in which the extended arms of C subunits interact as they fold back into the S domain (Fig. 11). The arms of 2-fold-related subunits pass by each other with several close side-chain interactions, reverse direction, and enter the S domain. The region of tight interaction, just across the molecular dyad, protrudes somewhat inwards. A comparison of the P domains of TCV and TBSV is shown in Figure 12. The folded topologies are identical, and the Cα positions of residues in the β-sheet framework can be superimposed with an

r.m.s. difference of 1.2 Å. Significant reorientation of the TBSV P domain is required, however, in order to get the best agreement with the TCV model. This reorientation has the effect of altering the interface between P domains in a dimer, in the manner indicated in Figure 12. Note that the greater tilt of the P domain in TCV places the tip at a smaller radius, largely accounting for the smaller packing diameter. A subsequent paper in this series (Carrington et al., unpublished results) will present a comparison of the TCV and TBSV coat protein sequences and an analysis of subunit interactions in the two particles.

(b) Determining a structure from that of a homolog

TBSV and TCV are remarkably similar in three-dimensional structure, considering the difference in amino acid sequence, but there are a number of places in which the structures do not superimpose.

It is therefore particularly noteworthy that TBSV served as an adequate initial approximation for determining phases of TCV. The 15-fold non-crystallographic redundancy in TCV is a significant factor. Initial phase approximations need only be slightly non-random for the computation to converge. It is significant that attempts to start directly at 6 Å resolution were unsuccessful. Phase refinement produced only small shifts away from the TBSV phase start. In retrospect, the failure to converge may have been due to the considerable shift in P domain position with respect to the icosahedral axes (Fig. 12). Indeed, at points of maximum relative shift of TBSV and TCV P domains, one strand of a β sheet in TBSV superimposes on the adjacent strand in the P domain from TCV. It appears that only by starting at a resolution where strands in a sheet were not at all resolved (10 Å rather than 6 Å) could the computation find the correct solution with TBSV as the starting point.

Figure 10. (a) C-subunit arms and S domains of TBSV (left) and TCV (right) in a “Richardson” representation. Arrows indicate insertions in one subunit with respect to the other. (b) Stereo view of TBSV (darker line) and TCV (lighter line) Cα backbones (arm and S domain of C subunit), superimposed to show similarity.

Figure 11. S domains of the TCV C/C dimer, viewed normal to the dyad. The particle center would be at the bottom of the Figure and the outside of the particle at the top. Compare Fig. 1 of Sorger et al. (1986). (a) Cα backbone. Arrows indicate where arms fold back, indicating the interaction between them. This interaction causes the A/B to C/C transition to be co-operative, since folding back of one arm is not independent of folding back of the other. (b) Similar view with all side-chains, showing that the folded-back part of the arm indicated in (a) protrudes significantly into the internal cavity of the TCV particle.

Successful phase extension from 5.5 to 3 Å without model phases has been accomplished since this work was completed in two cases of virus structures with high non-crystallographic redundancy (Rossmann et al., 1985; Hogle et al., 1985). Success depended on use of very small resolution increments, with extensive non-crystallographic symmetry refinement at each step. This experience suggests that we might have been able to restrict use of the TBSV structure to computation of starting phases at 10 Å. Such "pure" phase extension would demand substantially more computational effort than we needed here, because the resolution intervals we were able to use in the present work were much larger than they would have been for extension without TBSV model phases.

Other instances of using one structure to determine another involve proteins more similar in sequence than the coat proteins of TBSV and TCV. In the case of human fetal deoxyhemoglobin, for example, Frier & Perutz (1977) used phases from deoxyhemoglobin A to resolve the phase ambiguity

in a single isomorphous replacement map, but only about 14% of the total residues are different. Our second round of computations, in which we used a TCV-like model (with "guessed" side-chains) for initiating non-crystallographic-symmetry phase refinement, certainly improved the clarity of the map. Nonetheless, we believe that if the correct sequence had been available, we could have built, at the end of the first round, a fully satisfactory model of the S domain and over half of the P domain.

Figure 12. Comparative views of the TBSV (left) and TCV (right) P domain dimer. (a) Viewed along x; (b) viewed along y; (c) viewed at 45° to x and y (i.e. along x = y). The small drawings at the bottom of (a) and (b) indicate these directions both for TBSV and TCV. The fold of the polypeptide chain is essentially identical in the 2 viruses, but the pairing of the domains is somewhat different. Arrows in (a) indicate how the TBSV P domains must be moved in order to superimpose them in correct alignment on TCV P domains.

This work was supported by NIH grant CA13202 (to S.C.H.) and by a Charles A. King Trust Research Fellowship (to J.M.H.). We thank C. Steele for advice in program conversion and computation, R. Ladner for computer graphics, and A. Aruffo for assistance in the second round of phase refinement.

References

Abad-Zapatero, C., Abdel-Meguid, S. S., Johnson, J. E., Leslie, A. G. W., Rayment, I., Rossmann, M. G., Suck, D. & Tsukihara, T. (1980). Nature (London), 286, 33-39.
Arndt, U. W. & Wonacott, A. J. (1977). The Rotation Method in Crystallography, North-Holland, Amsterdam.
Bloomer, A., Champness, J. N., Bricogne, G., Staden, R. & Klug, A. (1976). Nature (London), 267, 362-368.
Bricogne, G. (1974). Acta Crystallogr. sect. A, 30, 395-405.
Bricogne, G. (1976). Acta Crystallogr. sect. A, 32, 832-846.
Crawford, J. (1977). Ph.D. thesis, Harvard University.
Crowther, R. A. & Amos, L. (1971). Cold Spring Harbor Symp. Quant. Biol. 36, 489-494.
Diamond, R. (1980). In Biomolecular Structure, Conformation, Function and Evolution (Srinivasan, R., ed.), vol. 1, pp. 567-588, Pergamon Press, Oxford.
Deisenhofer, H. & Steigemann, W. (1975). Acta Crystallogr. sect. B, 31, 238-250.
Dorne, B. & Pinck, L. (1971). FEBS Letters, 12, 241-243.
Finch, J. T., Klug, A. & Leberman, R. (1970). J. Mol. Biol. 50, 215-222.
Frier, J. A. & Perutz, M. F. (1977). J. Mol. Biol. 122, 97-122.
Golden, J. S. & Harrison, S. C. (1982). Biochemistry, 21, 3862-3866.
Harrison, S. C. (1968). J. Appl. Crystallogr. 1, 84-90.
Harrison, S. C., Olson, A. J., Schutt, C. W., Winkler, F. K. & Bricogne, G. (1978). Nature (London), 276, 368-373.
Hogle, J. M., Chow, M. & Filman, D. J. (1985). Science, 229, 1359.
Kabsch, W. (1976). Acta Crystallogr. sect. A, 33, 922.
Leberman, R. (1966). Virology, 30, 341-347.
Liljas, L., Unge, T., Jones, T. A., Fridborg, K., Lovgren, S., Skoglund, U. & Strandberg, B. (1982). J. Mol. Biol. 159, 93-108.
Longley, W. (1963). Ph.D. thesis, Cambridge University.
Olson, A. J., Bricogne, G. & Harrison, S. C. (1983). J. Mol. Biol. 171, 61-93.
Rossmann, M. G., Arnold, E., Erickson, J. W., Frankenberger, E. A., Griffith, J. P., Hecht, J. J., Johnson,
Edited by A. Klug