Structure of tomato bushy stunt virus


J. Mol. Biol. (1984) 177, 701-713

Structure of Tomato Bushy Stunt Virus
V.† Coat Protein Sequence Determination and its Structural Implications

P. HOPPER1‡, S. C. HARRISON2 AND R. T. SAUER1

1Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, U.S.A.

2Department of Biochemistry and Molecular Biology, Harvard University, 7 Divinity Avenue, Cambridge, MA 02138, U.S.A.

(Received 1 February 1984)

We report the chemically determined sequence of most of the polypeptide chain of the coat protein of tomato bushy stunt virus. Peptide locations have been determined by comparison with the high-resolution electron density map from X-ray crystallographic analysis as well as by conventional chemical overlaps. Three small gaps remain in the 387-residue sequence. Positively charged side-chains are concentrated in the N-terminal part of the polypeptide (the R domain) as well as on inward-facing surfaces of the S domain. There is homology of S-domain sequences with structurally corresponding residues in southern bean mosaic virus.

1. Introduction

The three-dimensional structure of tomato bushy stunt virus, determined crystallographically to a resolution of 2.9 Å, has been described in some detail (Harrison et al., 1978; Harrison, 1980; Olson et al., 1983). The coat protein subunit folds into three domains (R, S and P), with a connecting arm between R and S. The N-terminal, inward-projecting R domain is flexibly tethered to the rest of the subunit; it therefore cannot be seen in the electron density map. The S domains of the 180 subunits form a tightly bonded shell, from which the P domains project in pairwise clusters. The connection from S to P is a hinge, with one orientation in 60 of the subunits and another in the remaining 120. Figure 1 is a diagram that shows schematically how the subunits pack in the virus. A, B and C denote three similar but distinct packing environments. For a more detailed account, see Olson et al. (1983). The initial X-ray analysis relied on an amino acid sequence for the coat subunit,

† Paper IV in this series is by Olson et al. (1983).

‡ Present address: Department of Microbiology and Molecular Biology, Tufts Medical School, Boston, MA 02111, U.S.A.

0022-2836/84/240701-13 $03.00/0 © 1984 Academic Press Inc. (London) Ltd.


FIG. 1. Architecture of the TBSV particle. (a) Order of domains in the polypeptide chain, N terminus to C terminus. The number of residues in each domain, indicated by numbers below the line, is derived in part from some of the sequence results reported here. (b) Schematic view of folded polypeptide chain. (c) Arrangement of subunits in the particle. A, B and C denote the 3 packing environments for the subunit; outer surfaces of C subunits are shaded.

that was determined by inspection, from the electron density map. We report here the chemically determined sequence of most of the polypeptide chain. The relative positions of many peptides have been found by conventional overlaps, but several non-overlapping blocks have been placed by comparison with the X-ray sequence. Earlier conclusions concerning the character of inter-subunit bonds (Harrison, 1980), drawn from partial sequence data and from the X-ray structure, are upheld. Positively charged side-chains are concentrated in the R domain and on inward-facing parts of the S domain. There is significant homology of S-domain sequences with structurally corresponding residues in southern bean mosaic virus (Hermodson et al., 1982).


2. Materials and Methods

(a) Purification and chemical modification of the TBSV† coat protein

Tomato bushy stunt virus, grown in Datura stramonium, was purified by differential centrifugation using standard procedures (Harrison & Jack, 1975). For alkylation of cysteine residues, 0.4 mg of [3H]iodoacetic acid was added to 180 mg of virus in 8 ml of 0.1 M-sodium bicarbonate (pH 8.2), 1 mM-EDTA, 6 M-guanidinium·HCl. After 20 min at room temperature, an additional 13 mg of unlabeled iodoacetic acid was added, and the reaction was continued for 10 min. Upon dialysis of the reaction mixture against 0.2 M-ammonium bicarbonate, the protein precipitated; it was separated from soluble viral RNA by centrifugation. The protein was eventually solubilized after 3 treatments with maleic anhydride. For each treatment, the protein was dissolved in 0.2 M-sodium carbonate (pH 9), 6 M-guanidinium·HCl, and 30-mg portions of maleic anhydride were added to a total of 300 mg. During the maleylation, the pH was maintained at 9 by the addition of NaOH. After the maleylation, the reaction mixture was dialyzed into 0.2 M-ammonium bicarbonate.

(b) Proteolytic and chemical cleavage

S-carboxymethylated and maleylated viral protein was cleaved with trypsin, staphylococcal protease V8, or chymotrypsin using the conditions described (Sauer et al., 1981). Tryptic digestions were also performed on unmodified, denatured TBSV coat protein. For digestion with cyanogen bromide, whole virus was first lyophilized and dissolved in 70% (v/v) formic acid. Cleavage was carried out by adding a 40-fold molar excess of cyanogen bromide and incubating for 17 h at room temperature. Cleavage with hydroxylamine was performed on whole virus dissolved in 4 M-guanidinium·HCl, 0.1 M-sodium carbonate (pH 9), 6 M-hydroxylamine for 3 h at 37°C.

(c) Peptide purification

Following enzymatic or chemical cleavage, peptides were generally separated by a combination of gel filtration, ion-exchange chromatography, and high-pressure liquid chromatography. For gel filtration, columns of Sephadex G-75 or G-50 (superfine) were run in either 0.2 M-ammonium bicarbonate or in 20% (v/v) acetic acid. Ion-exchange chromatography on SP-Sephadex was performed in the presence of 7 M-urea, 0.1 M-acetic acid using a linear gradient of NaCl to elute bound peptides. Reverse-phase chromatography was performed using a 0.39 cm x 30 cm Waters μBondapak C18 column. Peptides were eluted using a linear gradient from 0 to 60% (v/v) acetonitrile, in 0.1% (v/v) trifluoroacetic acid, over a period of 1 h, at a flow-rate of 1.5 ml/min. During the purification steps, peptides were detected by u.v. absorbance or by reaction with ninhydrin after partial acid hydrolysis in 6 M-HCl. [3H]carboxymethyl cysteine-containing peptides were detected by scintillation counting.
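The reverse-phase gradient described above (0 to 60% acetonitrile over 1 h at 1.5 ml/min) can be turned into a small calculator for the nominal solvent composition and eluted volume at a given time. This is only a sketch: the function names are hypothetical, and it assumes an ideal linear gradient with no dead volume or column delay, which the text does not specify.

```python
def acetonitrile_percent(t_min, start=0.0, end=60.0, duration_min=60.0):
    """Nominal % (v/v) acetonitrile at time t_min for an ideal linear gradient.

    Defaults follow the conditions quoted in the text (0 to 60% over 1 h).
    Dead volume and gradient delay are ignored (an assumption).
    """
    if t_min <= 0:
        return start
    if t_min >= duration_min:
        return end
    return start + (end - start) * t_min / duration_min


def eluted_volume_ml(t_min, flow_ml_per_min=1.5):
    """Volume of eluent delivered after t_min minutes at a constant flow-rate."""
    return flow_ml_per_min * t_min


if __name__ == "__main__":
    for t in (0, 15, 30, 60):
        print(t, "min:", acetonitrile_percent(t), "% acetonitrile,",
              eluted_volume_ml(t), "ml delivered")
```

A peptide emerging at 30 min would therefore elute at a nominal 30% acetonitrile, after 45 ml of eluent, under these idealized assumptions.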

(d) Sequencing and analysis

During purification, the homogeneity of peptide pools was determined by manual Edman endgroup analysis (Edman, 1960; Sauer et al., 1974). Samples greater than 85% pure by endgroup analysis, or samples containing approximately equal mixtures of 2 peptides, were used for further sequence studies. Automated Edman degradations were performed using the Beckman 890C Sequencer and the 0.1 M Quadrol program as described (Brauer et al., 1975). Phenylthiohydantoin (PTH) amino acid derivatives were identified by gas-liquid and high-pressure liquid chromatography as described (Sauer et al., 1981). PTH-arginine and PTH-histidine were identified by amino acid analysis after back hydrolysis in HI for 6 h at 150°C. Peptide samples for amino acid analysis were hydrolyzed under vacuum at 105°C in 6 M-HCl, 1% (w/v) phenol for 24 h. Analyses were performed using a Durrum D500 analyzer.

† Abbreviations used: TBSV, tomato bushy stunt virus; SBMV, southern bean mosaic virus; TCV, turnip crinkle virus; STNV, satellite of tobacco necrosis virus; u.v., ultraviolet light; TPCK-trypsin, L-1-tosylamido-2-phenylethyl chloromethyl ketone-treated trypsin.

3. Results

The unmodified or S-carboxymethylated TBSV coat protein was found to be exceedingly insoluble after disruption of the virus with denaturing agents. Solubilization of the protein was achieved only after repeated treatments with maleic anhydride. The alkylated, maleylated protein was digested in different experiments with trypsin, staphylococcal protease V8, or chymotrypsin. The unmodified protein was also cleaved with cyanogen bromide, and with trypsin. Several of the peptides resulting from these cleavages were also poorly soluble, but many of the soluble peptides were purified (see Materials and Methods) and subjected to sequence analysis by Edman degradation. The amino terminus of the intact protein is blocked, and thus hydroxylamine treatment, which cleaves predominantly at Asn138-Gly139, generated only a single unblocked fragment, which was sequenced without purification.

The sequence information obtained for each peptide is listed in Figure 2. In cases where mixtures of two peptides were sequenced, independent knowledge of one of the sequences allowed deduction of the second sequence. Amino acid compositions for selected peptides are listed in Table 1. Accurate compositions were not obtained for several peptides, because of the presence of contaminating peptides at the 10 to 15% level. However, the heterogeneity in these cases was not severe enough to interfere with the sequence analysis.

The sequence data can be aligned into four blocks, including residues 1 to 245, 259 to 268, 275 to 363 and 380 to 387. The relative positions of these non-contiguous blocks were determined by fitting the peptide sequences to the 2.9 Å electron density map, which includes residues 67 to 387. The X-ray structure was also required for several alignments within the blocks. For example, chemical sequence overlaps were not obtained for the junctions between peptides MT6 and MT7 (residues 98-99) or between peptides MT7 and MT8 (residues 109-110), but these overlaps were clear from the electron density map.

The sequence of tryptic peptide MT1 was deduced from the following considerations. The composition of the peptide indicates single residues of Ala, Met and Arg and two residues of Thr (Table 1), but this peptide, like the intact protein, is refractory to Edman degradation. Since a number of plant viruses are known to have acetylated N-termini (Harris & Knight, 1955; Harris & Hindley, 1961; Hermodson et al., 1982), it is likely that intact TBSV protein and MT1 are also acetylated. The composition of the intact TBSV protein contains three methionine residues, and two of these, Met79 and Met211, are positioned in blocks of continuous sequence (Fig. 2). A third methionine residue, Met2, was positioned on the basis of CNBr specificity prior to the Thr-Thr-Arg sequence that begins the CN1 sequence (Fig. 2). Addition of ac-Ala to Met-Thr-Thr-Arg to give

[Figure 2: aligned peptide sequences covering residues 1 to 387; the alignment itself is not recoverable from this copy.]

FIG. 2. Sequences of peptides from TBSV and their alignment in the polypeptide chain of the coat protein. Peptide names are based on cleavage methods. CN, cyanogen bromide; MT, maleylated tryptic; SP, staphylococcal protease; C, chymotrypsin; HA, hydroxylamine. Peptides sequenced as mixtures are identified by the names of both peptides in the mixture (e.g. MT2/MT3). The sizes of gaps were derived from the crystallographically determined electron density map. Arg60 and Arg259 were placed on the basis of the specificity of trypsin for cleaving after arginine in the maleylated protein. Arg59 and His70 were assigned on the basis of the CN1 composition (Table 1), the likelihood that peptide MT5 would not have a second arginine, and the X-ray map. Residues 107 to 109 were assigned using the MT7 composition and the X-ray map.
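The cleavage specificities that underlie the peptide names in Figure 2 can be illustrated with a minimal in-silico digest. The sketch below assumes idealized rules only: trypsin cutting after Arg in the maleylated protein (lysines being blocked) and cyanogen bromide cutting after Met. The function name and the toy sequence are hypothetical and are not taken from the TBSV sequence; real digests also show partial cleavage and other exceptions that are ignored here.

```python
def cleave_after(sequence, residues):
    """Split a one-letter-code sequence after every residue in `residues`."""
    fragments, current = [], []
    for aa in sequence:
        current.append(aa)
        if aa in residues:
            fragments.append("".join(current))
            current = []
    if current:  # C-terminal fragment, which need not end in a cleavage residue
        fragments.append("".join(current))
    return fragments


# Hypothetical toy sequence, for illustration only.
toy = "AMTTRGKLMSRVNLL"

maleylated_tryptic = cleave_after(toy, {"R"})  # Lys is blocked, so only Arg counts
cyanogen_bromide = cleave_after(toy, {"M"})

print(maleylated_tryptic)  # ['AMTTR', 'GKLMSR', 'VNLL']
print(cyanogen_bromide)    # ['AM', 'TTRGKLM', 'SRVNLL']
```

Note that the last maleylated tryptic fragment of the toy sequence contains neither Arg nor Lys, which is the property used in the text to recognize a C-terminal peptide.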

TABLE 1

Amino acid compositions of selected peptides

[Table body not reproduced: the column layout of the composition values is not recoverable from this copy. For each amino acid category (Asx, Thr, Ser, Glx, Pro, Gly, Ala, Val, Met, Ile, Leu, Tyr, Phe, His, Lys, Arg, Trp, Cys) the table gives the measured composition and the expected number of residues for peptides MT1, CN1, MT6, MT7, MT8, SP6, MT9, SP8, MT11, T80, MT12, C9, SP11, MT14, MT16, MT17 and MT18, and for the intact protein (residues 1-387).]

Composition values less than 0.1 mol/mol peptide are not reported. Following each composition, the number of residues expected from the chemically determined sequence is given for the peptide. Values listed in parentheses are for peptides whose sequences are incomplete; these values are therefore expected to be lower than the experimentally determined compositions.

† The N-terminal sequence determined for CN1 begins at residue 3 (Fig. 2), but this sequence accounts for only about 10% of the total peptide expected. It is likely that the remaining peptide comprises the N-terminally blocked peptide 1-79, which arises as a partial CNBr digestion product.

‡ The chemically determined sequences of these peptides are incomplete, and thus the C-terminal endpoints are approximations that may be incorrect.

§ Renormalized data of Michelin-Lausarot et al. (1970).


ac-Ala-Met-Thr-Thr-Arg would account for the composition of MT1 and for its failure to react in the Edman degradation.

The assignment of peptide MT18 as the C-terminal peptide of the protein was based on two considerations. First, the peptide was generated by digestion with TPCK-trypsin but lacks arginine or lysine, as expected for the C-terminal peptide. Second, carboxypeptidase Y digestion of intact virus releases Leu, Asn and Val, which again would be expected if MT18 were the C-terminal peptide and ...Val-Asn-Leu-Leu were the C-terminal sequence. The arginine residue at 380 corresponds to an extended side-chain in the electron density map.

Placement of peptides by comparison with the electron density map was required in several places as described above, using the “guess” sequence as an initial guide and details of the density features to confirm the assignment. The quality of the map has been documented elsewhere (Olson et al., 1983). The sequence originally built from the appearance of features in the map was precisely correct at 42% of the residue positions in the S domain and “nearly correct” (in terms of character and size of the residue) at a further 15%. Density in the P domain is not defined as well, and the corresponding statistics are 37% and 17%.

Assignments from the electron density map for three gaps in the chemically determined sequence are listed in Table 2. The first two gaps (246 to 258 and 269 to 275) correspond to well-resolved parts of the structure, and the X-ray assignments are probably more reliable than in the third gap (365 to 379). Recent careful rebuilding of the P domain has shown one more residue in this gap than indicated in Figure 12 of Olson et al. (1983). The change is in an external loop, and it does not affect any structural conclusions.
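The two composition arguments used in the Results (the MT1 assignment and the C-terminal assignment of MT18) amount to simple bookkeeping, sketched below. The residue counts are those quoted in the text; the variable names are illustrative, and the acetyl group is left out because it is not an amino acid and does not appear in a composition analysis.

```python
from collections import Counter

# Composition of MT1 as stated in the text: one Ala, Met and Arg, two Thr.
observed_mt1 = Counter({"Ala": 1, "Met": 1, "Thr": 2, "Arg": 1})

# Proposed sequence of MT1 (N-terminal acetyl group omitted).
proposed_mt1 = ["Ala", "Met", "Thr", "Thr", "Arg"]

mt1_consistent = Counter(proposed_mt1) == observed_mt1

# Residues released by carboxypeptidase Y, and the proposed C-terminal sequence.
released_by_cpy = {"Leu", "Asn", "Val"}
proposed_c_terminus = ["Val", "Asn", "Leu", "Leu"]

c_terminus_consistent = set(proposed_c_terminus) == released_by_cpy

print(mt1_consistent, c_terminus_consistent)
```

A composition match of this kind shows only that the proposal is consistent with the data; it does not fix the order of residues, which in the text comes from the CNBr specificity and the CN1 sequence.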

TABLE 2

Electron-density map assignments for gaps in the chemically determined sequence of TBSV coat protein

Residue numbers    Sequence
246-258            Gly-Ala-Gly-Ala-Asp-Ala-Val-Gly-Glu-Leu-Phe-Leu-Ala
269-276            Thr-Asn-Thr-Leu-Leu-Ser-Ser-Lys
365-379            Thr-Val-Ser-Gly-Val-Ala-Ala-Gly-Ile-Leu-Leu-Val-Gly-Arg-Ala

No attempt has been made to adjust these assignments to conform to peptide compositional data, due to uncertainties in the latter as described in the notes to Table 1.

4. Discussion

(a) Positively charged amino acid residues

In the 387 residues of the TBSV coat protein, the distribution of charged amino acids is decidedly non-uniform. The R domain and arm, comprising the first 102

residues of the sequence, contain 15 basic residues (7 Arg; 8 Lys) but only one acidic residue (Glu), whereas the S and P domains have a preponderance of acidic residues. The folded S domain also has a special distribution of charged residues (Fig. 3). Its inward-facing surface is strongly positive (2 Arg; 5 Lys; 1 Glu). There is, however, no apparent pattern to the spatial distribution of these residues, and many of these side-chains show significant disorder in the electron density map (Olson et al., 1983). Figure 4 illustrates the interior of the S-domain shell, with the basic groups highlighted.

The complete RNA molecule of TBSV can be expected to fold into an irregular series of stem and loop structures, which pack tightly against the inner S-domain surface. About 75% of the approximately 4800 RNA phosphate groups could be neutralized by the basic residues in the R domain, in the arm, and on the inner surface of the S domain. The remainder are presumably neutralized by small cations, since TBSV contains no polyamines (Cohen & McCormick, 1979).

The sequence of the arm forms a folded scaffold structure in the C subunits only. In the A and B subunits, this sequence is spatially disordered, indicating that no particular conformation is favored. The arm is indeed randomly coiled on dissociated subunits of the related turnip crinkle virus, as judged by their extreme proteolytic sensitivity (Golden & Harrison, 1982). By contrast, the R domain of TCV is relatively resistant to proteolysis, and it can be isolated readily as a stable fragment (Maeda & Harrison, unpublished results). It is therefore possible that R has a relatively stable folded structure within the virus.

(b) Comparison with southern bean mosaic virus
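Before the comparison itself, the charge balance quoted in section (a) can be checked with simple arithmetic. The sketch below shows one way the figure of about 75% can be reproduced: it assumes that each acidic residue cancels one basic residue, so that net positive charge is what counts, and that all 180 subunits contribute equally. That reading is an assumption made here for illustration; the text does not spell out how the estimate was obtained.

```python
SUBUNITS_PER_PARTICLE = 180   # from the text
RNA_PHOSPHATES = 4800         # approximate value quoted in the text

# Charged residues per subunit, as quoted in the text.
R_AND_ARM = {"basic": 15, "acidic": 1}        # 7 Arg + 8 Lys; 1 Glu
S_INNER_SURFACE = {"basic": 7, "acidic": 1}   # 2 Arg + 5 Lys; 1 Glu


def net_positive_charge(region):
    """Net positive charge, assuming each acidic residue offsets one basic residue."""
    return region["basic"] - region["acidic"]


net_per_subunit = net_positive_charge(R_AND_ARM) + net_positive_charge(S_INNER_SURFACE)
total_positive = SUBUNITS_PER_PARTICLE * net_per_subunit
fraction_neutralized = total_positive / RNA_PHOSPHATES

print(net_per_subunit, total_positive, f"{fraction_neutralized:.0%}")
```

Under these assumptions the protein supplies 20 net positive charges per subunit, or 3600 per particle, which corresponds to 75% of 4800 phosphate groups.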

There is a striking structural similarity between the SBMV subunit and the TBSV subunit. SBMV has no P domain, but its R domain, arm and S domain are similar to those of TBSV, and the packing of the subunits in the two viruses is essentially identical. In Figure 5, parts of the amino acid sequences of TBSV and SBMV are aligned according to the pattern of main-chain hydrogen bonds in their beta-annuli and S domains. This method of alignment ensures functional homology, even if distortions in the tertiary structures lead to ambiguous superposition of alpha carbons. In the present case, the two alignments are reported to be completely consistent (Rossmann et al., 1983a). Positions of sequence identity are indicated in Figure 5, and to indicate tertiary positions the side-chains are entered onto the TBSV backbone diagrams shown in Figure 6. Note that the homology begins just where the ordered structures begin to overlap at the -Met-Ala-Pro- sequence in the beta-annulus (Met79 in TBSV, Met44 in SBMV). The R-domain sequences show no evident similarity, except for their strongly basic character.

Figure 6 shows that there is little correlation between the positions of sequence identity and positions of important inter-subunit contacts. One exception is the S-domain dimer contact, where the “tangential helices” (denoted helix A by Hermodson et al., 1982) contain the sequences:

Thr-Trp-Leu-Arg-Gly-Val-Ala-Gln-Asn (SBMV, 98 to 106)
Ser-Trp-Leu-Pro-Ala-Leu-Ala-Ser-Asn (TBSV, 143 to 151)
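The degree of identity between the two tangential-helix sequences quoted above can be counted directly. The following sketch compares the two nine-residue segments position by position, using the residue numbering given in the text; the function name is illustrative.

```python
sbmv_helix = "Thr-Trp-Leu-Arg-Gly-Val-Ala-Gln-Asn".split("-")   # SBMV, 98 to 106
tbsv_helix = "Ser-Trp-Leu-Pro-Ala-Leu-Ala-Ser-Asn".split("-")   # TBSV, 143 to 151


def identical_positions(a, b):
    """Return the 0-based indices at which two equal-length segments match."""
    if len(a) != len(b):
        raise ValueError("segments must be pre-aligned and of equal length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x == y]


matches = identical_positions(sbmv_helix, tbsv_helix)

# Convert indices to TBSV residue numbers (segment starts at residue 143).
tbsv_identities = [f"{tbsv_helix[i]}{143 + i}" for i in matches]

print(len(matches), "of", len(tbsv_helix), "identical:", tbsv_identities)
# 4 of 9 identical: ['Trp144', 'Leu145', 'Ala149', 'Asn151']
```

The identical positions include Trp144 and Asn151, the two residues whose contacts are discussed in the text that follows.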

FIG. 3. Diagrams of (a) P and (b) S domains, indicating main-chain hydrogen bonds involved in secondary structural elements and the amino acid sequence in single-letter code. These diagrams are related to the folded structure as shown in the inset Figures. See Figs 10 to 12 of Olson et al. (1983).

FIG. 4. Stereo view from the inside of the virus particle (“RNA-eye view”) of the inner surface of 15 subunits. The polypeptide backbone is represented by lines joining Cα positions. A 5-fold axis is at the center of the field. The folded arms of C subunits form the rim of the segment illustrated. Surfaces of basic residues (lysine and arginine) are shown as dotted contacts, superimposed on the Cα backbones. X-ray crystallographic results show that many of these side-chains can in fact adopt multiple conformations, since the corresponding electron density is relatively diffuse. The illustration was made by A. J. Olson with the program GRAMPS (O’Donnell & Olson, 1981).

In the C subunit of TBSV, the side-chain of Trp144 extends inward to contact Pro93 in the ordered arm, and its ring nitrogen hydrogen-bonds with Asn151 on the 2-fold related subunit. Both Pro93 and Asn151 are also present in SBMV, but the reported orientation of the Trp side-chain in the SBMV model precludes the hydrogen bond with Asn (Rossmann et al., 1983b). Another intersubunit bond that is conserved in the two viruses is His108 (His71 in SBMV) to Asn197 (Asn152 in SBMV) at the 5-fold (A to A) and 6-fold (C to B; B to C) contacts; this bond is spatially adjacent to the Trp144 interaction just described. At other contacts, the general character of non-covalent bonding tends to be conserved, but with considerable “reshuffling” of specific participating groups.

(1) In the network of polar contacts shown in Figure 17 of Olson et al. (1983), none of the residues is identical in SBMV, but a salt-bridge still appears to be present, through the TBSV → SBMV substitutions Arg175 → Ala130, Tyr244 → Glu229, Glu254 → Arg241. (Glu254 is an X-ray map assignment; it lies in one of the small gaps in the chemical sequence; see Table 2.)

(2) At the homolog in SBMV of the TBSV calcium-binding site (Fig. 18 of Olson et al., 1983), only two of the five aspartate residues are present (Asp183 and Asp186 of TBSV are Asp138 and Asp141 in SBMV), and one of the others (Asp225) is a lysine (Lys200). Thus, a calcium-mediated ionic bond in TBSV appears to have become a simple Asp-Lys salt-bridge in SBMV. This salt-bridge has indeed been assigned, on the basis of side-chain distances, by Rossmann et al. (1983b). Their analysis also reports that an ion may be chelated by the conserved aspartic acid residues (both on the same subunit surface). The principal calcium site in SBMV, determined by difference Fourier analysis, is located on the local 3-fold axis, with three glutamic acid residues liganding the ion (Abdel-Meguid et al., 1981). Thus, the exact position of the divalent cation site appears not to matter, as long as it determines stability of the trimer contacts.

The large number of identical residues at structurally homologous positions in TBSV and SBMV, and the considerable similarity of their folded chains, argue that these proteins have a common evolutionary origin. This result is particularly striking in view of the absence of a P domain in SBMV.

The two other spherical viruses currently known at high resolution are TCV and satellite of tobacco necrosis virus. TCV is structurally related to TBSV. Both its S and P domains are extremely similar in folding and packing interactions to their homologs in TBSV

[Figure 5: residue-by-residue alignment of the TBSV and SBMV sequences; the alignment itself is not recoverable from this copy.]

FIG. 5. Comparison of the sequences of SBMV (Hermodson et al., 1982) and of TBSV (R, arm, S). Identical residues are indicated by asterisks between the lines.

FIG. 6. Cα stereo diagram of TBSV S domain (C subunit, with folded arm); the residues that are identical to their homologs in SBMV are all shown as complete side-chains. There is no strong correlation of conserved residues with 3-dimensional position or with apparent function (see the text).


(Hogle & Harrison, unpublished results). The amino acid sequence of TCV is not known. The subunit of STNV also has the same folded topology as TBSV, TCV and SBMV, but its packing in the virus is significantly different (Liljas et al., 1982). The 5-fold contacts in STNV (a T = 1 particle) are roughly similar to the 5-fold or local 6-fold contacts in TBSV; the others are scrambled (see Rossmann et al., 1983a). There is no evident sequence similarity between STNV and TBSV or between STNV and SBMV.

This work was supported by NIH grant AI-15706 to R.T.S. and NIH grant CA-13202 to S.C.H. We thank Kathy Hehir for help with the protein sequencing and A. J. Olson for preparing Figs 4 and 6.

REFERENCES

Abdel-Meguid, S. S., Yamane, T., Fukuyama, K. & Rossmann, M. G. (1981). Virology, 144, 81-85.
Brauer, A. W., Margolies, M. N. & Haber, E. (1975). Biochemistry, 14, 3029-3035.
Cohen, S. S. & McCormick, F. P. (1979). Advan. Virus Res. 24, 331-387.
Edman, P. (1960). Ann. N.Y. Acad. Sci. 88, 602-610.
Golden, J. S. & Harrison, S. C. (1982). Biochemistry, 21, 3862-3866.
Harris, J. I. & Hindley, J. (1961). J. Mol. Biol. 3, 117-120.
Harris, J. I. & Knight, C. A. (1955). J. Biol. Chem. 214, 215-230.
Harrison, S. C. (1980). Biophys. J. 32, 139.
Harrison, S. C. & Jack, A. (1975). J. Mol. Biol. 97, 173-191.
Harrison, S. C., Olson, A. J., Schutt, C. E., Winkler, F. K. & Bricogne, G. (1978). Nature (London), 276, 386.
Hermodson, M., Abad-Zapatero, C., Abdel-Meguid, S. S., Pundak, S. & Rossmann, M. G. (1982). Virology, 119, 133-149.
Liljas, L., Unge, T., Fridborg, K., Jones, T. A., Lovgren, S., Skoglund, O. & Strandberg, B. (1982). J. Mol. Biol. 159, 93-108.
Michelin-Lausarot, P., Ambrosino, C., Steere, R. L. & Reichmann, M. E. (1970). Virology, 41, 160-165.
O’Donnell, T. J. & Olson, A. J. (1981). Comp. Graph. 15, 133-142.
Olson, A. J., Bricogne, G. & Harrison, S. C. (1983). J. Mol. Biol. 171, 61-93.
Rossmann, M. G., Abad-Zapatero, C., Murthy, M. R. N., Liljas, L., Jones, T. A. & Strandberg, B. (1983a). J. Mol. Biol. 165, 711-736.
Rossmann, M. G., Abad-Zapatero, C., Hermodson, M. A. & Erickson, J. W. (1983b). J. Mol. Biol. 166, 37-83.
Sauer, R. T., Niall, H. D., Hogan, M. L., Keutmann, H. T., O’Riordan, J. L. H. & Potts, J. T. Jr (1974). Biochemistry, 13, 1994-1999.
Sauer, R. T., Pan, J., Hopper, P., Hehir, K., Brown, J. & Poteete, A. R. (1981). Biochemistry, 20, 3591-3598.

Edited by A. Klug