trends in analytical chemistry, vol. 11, no. 3, 1992
0165-9936/92/$05.00 © Elsevier Science Publishers B.V. All rights reserved
Mapping post-translational modifications of viral proteins by mass spectrometry
Jeffrey J. Gorman
Parkville, Australia
The impact of fast atom bombardment ionisation on the capacity of mass spectrometry (MS) to produce structural data on biopolymers is discussed. Particular emphasis is placed on viral proteins with specific reference to proteins isolated on a microscale from Newcastle disease virus (NDV) and complementarity with other protein analysis techniques. This is elaborated by reviewing strategies for analysis of post-translational modifications of these proteins in relation to variation in virulence of different strains of NDV. Definition of proteolytic activation, glycosylation, phosphorylation and amino-terminal acylation of NDV proteins forms the substance of the discussion together with detection of strain specific sequence variations between the proteins. The importance of such analytical capability is presented in the contexts of molecular virology and biotechnology applications. The potential for more recent advances in ionisation techniques to enhance the role of MS in viral protein analysis is also discussed.
Evolution of mass spectrometry of biopolymers and biotechnology
Mass spectrometry (MS) has assumed an increasingly important role in the analysis of biological macromolecules since the advent of soft ionization techniques such as fast atom bombardment (FAB) [1-3]. This was largely due to elimination
of the need for derivatization steps required to achieve ionization of polar peptides and oligosaccharides by pre-existing ionization techniques such as electron impact and chemical ionization. Mass analysers with extended mass range capabilities have emerged as a consequence of the development of FAB [4] and their impact has also been an important factor in the use of FAB-MS for biopolymer analysis. The initial euphoria which emanated from being able to analyse relatively simple and abundant peptides by FAB-MS was subsequently vindicated by demonstrations that much smaller quantities of more complex biopolymers were also amenable to this technique [5-8]. These developments occurred around the same time as developments of increasingly sensitive methods and equipment for stepwise Edman degradation based sequence analysis [9] and amino acid analysis [10] of peptides and proteins. Thus a complementarity has evolved between MS and these techniques which provides an extremely powerful capability for analysis of proteins on a microscale. The biotechnology revolution has been partly underpinned by these analytical tools together with techniques which enable gene sequences to be determined and their corresponding amino acid sequences to be deduced [11]. Manipulation of isolated genes into artificial expression systems, enabling production of proteins for research and pharmaceutical purposes, has been a fundamental element in the growth of the biotechnology industry [12]. Protein analytical techniques are important for facilitating gene sequence analysis by providing complementary and corroborative data on
corresponding isolated proteins and for analysing the products of expression systems [8,13-15]. Analytical data on isolated proteins are also used for producing synthetic oligonucleotide primers to initiate the process of gene sequence analysis [16]. Protein analytical techniques are essential for detecting and defining post-translational modifications of proteins so that the complete structure of a protein can be obtained [17,18]. Nucleotide sequences of genes embody the information required to deduce the primary structure (amino acid sequence) of proteins but they do not reveal whether the side chains of amino acids are chemically modified in vivo after translation of genes, or the extents of any modifications. In some instances consensus amino acid sequences indicate a potential for post-translational modification. For example, addition of oligosaccharides may occur on the side chain of asparagine in Asn-Xaa-Thr/Ser sequences, where Xaa can be any amino acid although proline and aspartic acid are unfavoured [19,20]. However, these sequences are not always glycosylated [21] and protein analysis is required to be certain of the status of these consensus sequences. Other post-translational modifications are currently more difficult or impossible to predict. Defining post-translational modifications is essential in the case of expressed proteins that are destined for use as pharmaceuticals. This is required to ensure that they have the same chemical structure as their natural counterparts or an acceptable alternate structure [13-15,22]. Unacceptable structures could be detrimental to recipients due to unwanted pharmacological or immunological activities. This is also important for expressed proteins used in diagnostic assays to ensure predictable functional behaviour, for example correct immunochemical reactivity.
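The Asn-Xaa-Thr/Ser consensus rule described above can be applied to a deduced amino acid sequence as a simple scan. The Python sketch below is illustrative only: the function name `find_sequons`, the example sequence and the choice to flag (rather than exclude) sites with proline or aspartic acid at Xaa are assumptions made here, not a published procedure. A match indicates only a potential site; occupancy still has to be established by protein analysis.

```python
import re

# Residues described in the text as unfavoured at the Xaa position.
UNFAVOURED_XAA = {"P", "D"}


def find_sequons(sequence):
    """Return (position, tripeptide, favoured) for each Asn-Xaa-Thr/Ser match.

    position is the 1-based residue number of the asparagine.
    """
    hits = []
    # The lookahead allows overlapping motifs to be reported individually.
    for match in re.finditer(r"(?=(N[A-Z][ST]))", sequence):
        tripeptide = match.group(1)
        favoured = tripeptide[1] not in UNFAVOURED_XAA
        hits.append((match.start() + 1, tripeptide, favoured))
    return hits


# Hypothetical sequence, invented purely for illustration.
example_sequence = "MKNRTALNPSGGNDTQNGSK"

for position, tripeptide, favoured in find_sequons(example_sequence):
    status = "potential site" if favoured else "unfavoured Xaa"
    print(f"Asn{position}: {tripeptide} ({status})")
```

Flagging rather than discarding the unfavoured sites keeps the decision with the analyst, which matches the point that prediction alone is not conclusive.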
The fundamental role played by MS in defining proteins from complex biological systems on a microscale, in conjunction with other protein analysis techniques, will be exemplified in this review. Particular emphasis will be placed on integration of MS into the process of defining post-translational modifications of viral proteins.
Post-translational modification of viral proteins
Consideration of post-translational modifications is essential for determining structure-function relationships of viral proteins and for developing alternate therapeutic agents, diagnostic reagents and vaccines for viral infections. As a consequence, considerable effort has gone into developing strategies for analysis of viral proteins for post-translational modifications. This will be exemplified herein with the avian pathogen, avian paramyxovirus-1, which is commonly known as Newcastle disease virus (NDV). Newcastle disease can cause devastating mortalities of poultry in the event of an outbreak of a virulent strain of NDV [23]. One of the difficulties in diagnosis of Newcastle disease is that the virus exists in a variety of strains which vary from highly virulent to asymptomatic [23]. Diagnosis of an outbreak of disease in poultry can be confused by the presence of an avirulent form of NDV which could be mistaken for the cause of disease. Thus it is essential to be able to differentiate between different pathotypes of NDV in order to perform accurate disease diagnosis and take appropriate disease control measures (e.g. eradication by slaughter). Effective disease control is reliant upon a precise knowledge of the rate and degree of spread of the virulent pathogen. Differential diagnostic capability is extremely important in the face of spread of a virulent strain of NDV through regions with endemic avirulent strains. Currently, this is a relatively slow process since it is dependent on protracted in vivo assays which either replicate or mimic the disease. Different strains of NDV all belong to the same serogroup so it is difficult to use classical serological assays to effect differential diagnosis. Different pathotypes do, however, differ in the post-translational proteolytic cleavage of their membrane glycoproteins [24], especially their fusion proteins (see below). MS has been a vital analytical tool for determining the structural consequences of the differences in susceptibilities of these fusion proteins to proteolytic activation [25,26]. However, amino acid analysis and Edman degradation sequencing were essential companion techniques for this task.
The ultimate intention was to exploit defined structural differences for developing immunochemical reagents for rapid pathotype differentiation. Other post-translational modifications or sequence variations of NDV proteins may also be of significance to, or reflect, the pathogenicity of different strains of NDV. These post-translational modifications may include glycosylation, amino-terminal blockage and phosphorylation. Hence strategies for detecting these modifications will be discussed. The potential for electrospray and matrix-assisted laser desorption ionization based MS to provide data not forthcoming from FAB based experimentation will also be considered.
Proteolytic activation of Newcastle disease virus fusion proteins
Newcastle disease virus belongs to the Paramyxoviridae family of viruses. All members of this family bind to target cell membranes via a receptor interaction and there is a subsequent fusion between the lipid bilayer membranes of the virus and target cell. It is this fusion process which enables transfer of the viral genome into the target or host cell and thus the infectious process to proceed. A virally encoded protein on the surface of the virus, known as the fusion protein, causes this fusion process. This protein is synthesised by infected cells as an inactive single chain biosynthetic precursor and participation of cellular proteases is required in order to activate the precursor by cleavage into a disulphide-linked two chain macromolecule [27]. Proteolytic activation occurs in infected cells prior to assembly of newly synthesised viral proteins into mature virions with full infectivity. The susceptibilities of fusion protein precursors to cleavage-activation have a marked effect on virulence of paramyxoviruses [28,29]. This apparently accounts for the diversity of virulence between different strains of NDV (see above). Analysis of the genes which encode NDV fusion protein precursors has provided an explanation for the variation in their susceptibilities to activation. The regions of the proteins at which cleavage takes place have two types of structural motifs: readily cleaved precursors of virulent strains have two pairs of basic amino acids, separated by a single glutamine residue, preceding the site of cleavage, whereas the cleavage motifs of low virulence or avirulent strains have the first basic amino acid in each pair replaced by another type of amino acid (Fig. 1).
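The two motif types described above can be distinguished by inspecting the five residues that precede the cleavage site. The Python sketch below is a minimal illustration under stated assumptions: the function name `classify_cleavage_motif` and the category labels are invented here, the one-letter motif RRQKR is taken from the -Arg-Arg-Gln-Lys-Arg- sequence quoted for the virulent AV strain, and GKQGR is read from the V4 entry of the Fig. 5 alignment. It classifies sequence patterns only and says nothing about measured virulence.

```python
BASIC = {"R", "K"}  # arginine and lysine


def classify_cleavage_motif(motif):
    """Classify the five residues preceding the cleavage site.

    Expected layout: pair 1 (positions 1-2), Gln (position 3),
    pair 2 (positions 4-5).
    """
    if len(motif) != 5:
        raise ValueError("expected the five residues preceding the cleavage site")
    motif = motif.upper()
    # Both motif types keep Gln in the middle and a basic residue
    # at the second position of each pair.
    if motif[2] != "Q" or motif[1] not in BASIC or motif[4] not in BASIC:
        return "unrecognised"
    first_positions_basic = [motif[0] in BASIC, motif[3] in BASIC]
    if all(first_positions_basic):
        return "paired basic residues"
    if not any(first_positions_basic):
        return "single basic residues"
    return "mixed"


for strain, motif in (("AV", "RRQKR"), ("V4", "GKQGR")):
    print(strain, motif, classify_cleavage_motif(motif))
```

The "mixed" category is included so that a motif with only one of the two pairs intact is reported rather than forced into either class.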
These findings led to the speculation that the proteases responsible for activation of precursors of virulent strains are similar to, if not the same as, those which cleave at clustered basic amino acids in protein hormone precursors (e.g. proinsulin). On the other hand, proteases with specificity for single basic amino acids are required for the activation of the precursors of low-virulence and avirulent strains [25,26,30-32]. Thus it follows that the availability of specific proteases in different cell types would account for the ability of a strain of NDV with a particular cleavage motif to establish a complete infectious cycle in various cells. Presumably the protease specific for clustered basic amino acids has a broader cellular distribution than the single basic residue
[Figure 1: schematic of the fusion protein precursor with the cleavage-site sequences of virulent and avirulent strains.]
Fig. 1. Comparison of the cleavage sites of fusion protein precursors of Newcastle disease viruses of varying virulence. The shaded bar represents the entire coding capacity of the fusion protein precursor gene. Maturation of this primary translation product involves removal of the signal peptide sequence of residues 1-31, addition of oligosaccharides and cleavage-activation at the carboxyl-terminus of the F2-polypeptide region. Cleavage activation results in liberation of the F1-polypeptide amino-terminus which is believed to participate directly in the virus fusion activity. The F2- and F1-polypeptides are held together by disulphide bonds after cleavage activation. (Reprinted with permission from ref. 26.)
specific protease(s). Hence the virulent viruses can spread through a larger range of tissues, to devastating effect. These findings from gene sequence analysis presented the opportunity to test the theory that cleavage-activation only occurred at the cleavage motifs, that cleavage was at the same sites within the different types of motifs and to establish the structures of the fully processed motifs. The answers to these questions hold both academic and practical value as the possibility is raised of exploiting the structures of the fully processed motifs to tailor antipeptide antibodies capable of differentiating between the fully processed motifs. Such antibody specificity would be required in a diagnostic capacity because it is the fully processed forms of these viruses that would be the replicative forms present in any given diagnostic specimen. Viruses grown in the allantoic cavity of embryonating chicken eggs were used for this study since viruses propagated in this way grow in the endodermal cells of the chorioallantoic membrane which have the capacity to fully activate all NDV pathotypes. Parenthetically, this organ illustrates the cell tropism differences between pathotypes which probably reflect the protease distribution phenomenon. Low-virulence and avirulent strains of NDV are only activated in the endodermal cell layer and hence cannot spread to other cell layers of the membrane. On the other hand, virulent strains are activated by cells of all three layers of the membrane and cells of the embryo, hence,
they can successfully infect and kill the embryo. Fully activated fusion proteins were isolated from NDV strains of various levels of virulence and subjected to protein structure analysis. This process involved virus purification and disruption and labelling of the viral proteins with a fluorescent reagent to facilitate protein purification [25,26,33,34]. Purification involved SDS-polyacrylamide gel electrophoretic separation of the labelled proteins and recovery of proteins from gel slices by a process of electroelution. Isolated proteins were subjected to proteolytic cleavage with an enzyme able to produce fragments (peptides) amenable to analysis but free from effect on the natural cleavage motif. Resultant fragments were isolated by high-performance liquid chromatography (HPLC) and analysed by amino acid analysis, FAB-MS and stepwise Edman degradation based amino acid sequencing. Cleavage-activation motifs are at the carboxyl-terminal ends of the F2-polypeptide regions of fusion protein precursors. Thus documentation of the cleavage-activation process involved analysis of the carboxyl-termini of F2-polypeptides of activated fusion proteins. Appropriate fragments were produced for the F2-polypeptides of four
[Figure 2: HPLC chromatogram, panel title "AV F2-polypeptide"; x-axis: time (minutes).]
Fig. 2. High-performance liquid chromatogram of an AspN-protease digest of the F2-polypeptide of the virulent Australia-Victoria (AV) strain of NDV. The activation cleavage motif of the fusion protein precursor of this strain contains the sequence -Arg-Arg-Gln-Lys-Arg-; the specificity of AspN-protease for the amino-terminus of aspartic and glutamic acids ensured that the cleavage motif was unaffected during enzymatic digestion in vitro. The fraction labelled carboxyl terminus was the only fraction which had an amino acid composition consistent with derivation from the carboxyl-terminal end of the F2-polypeptide. (Reprinted with permission from ref. 26.)
[Figure 3: sequence analysis of the AV carboxyl-terminus fraction. Inset: amino acid composition. Main plot: yield (pmoles) of PTH-amino acid against Edman cycle, cycles 1 to 18, with the residues DSIRRIPESVTTSGGRRQ.]
Fig. 3. Amino acid composition and stepwise Edman degradation of the fraction designated carboxyl terminus in Fig. 2. Ratios of amino acids are presented for each amino acid detected and theoretical values are presented in parentheses. Theoretical values for glutamic acid (E) and arginine (R) were calculated allowing for removal or retention of the -Arg-Arg-Gln- residues 16, 17 and 18 of the sequence presented in the plot of yields of phenylthiohydantoin-amino acids at each cycle of sequencing of the carboxyl terminus fraction in Fig. 2.
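The effect of a repetitive inefficiency on yields such as those plotted in Fig. 3 can be estimated with a simple geometric model. The Python sketch below is an assumption-laden illustration, not the author's calculation: it treats the loss as a constant fraction per cycle, takes cycle 1 as the reference yield, and uses the 5-8% range quoted in the text. The function name `expected_yield` and the 100 pmol starting value are hypothetical.

```python
def expected_yield(initial_pmol, repetitive_loss, cycle):
    """Expected PTH-amino acid yield at a given Edman cycle.

    Assumes a constant fractional loss per cycle; cycle 1 returns
    the initial yield unchanged.
    """
    if cycle < 1:
        raise ValueError("cycle numbering starts at 1")
    if not 0.0 <= repetitive_loss < 1.0:
        raise ValueError("repetitive_loss must be a fraction between 0 and 1")
    return initial_pmol * (1.0 - repetitive_loss) ** (cycle - 1)


# Hypothetical starting yield of 100 pmol, followed to cycle 18.
for loss in (0.05, 0.08):
    remaining = expected_yield(100.0, loss, 18)
    print(f"loss per cycle {loss:.0%}: {remaining:.1f} pmol expected at cycle 18")
```

Under these assumptions roughly 42 pmol (5% loss) or 24 pmol (8% loss) would remain by cycle 18, before allowing for the residue-specific losses of serine and arginine, which the model ignores.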
strains of NDV using AspN-protease. The chromatogram of one F2-polypeptide fragment mixture is presented in Fig. 2. Amino acid analysis indicated that the contents of only one fraction could have been derived from the carboxyl-terminus of the F2-polypeptide. However, the data were inconclusive as to the exact position(s) of cleavage-activation due to the finding of non-integer values for arginine and glutamic acid (Fig. 3). The reasons for these discrepancies were not forthcoming from Edman degradation due to the occurrence of difficult residues within the sequence (Fig. 3). Serine forms an unstable phenylthiohydantoin (PTH) derivative which partially degrades to PTH-dehydroalanine and thus was not quantitated. PTH-arginine is difficult to extract quantitatively from the reaction flask and conversion flask of the sequenator; several arginines occur in the region of greatest interest (i.e. the carboxyl-terminus). Furthermore, stepwise Edman degradation has inherent limitations due to a repetitive inefficiency of 5-8% and the tendency of short sequences (i.e. those degraded down toward the carboxyl-terminal end of a fragment) and the ultimate PTH-amino acid to be washed to waste during extraction and washing steps in the sequenator. MS of the HPLC fraction representing the carboxyl-terminus (Fig. 4) did, however, provide data which resolved the nature
[Figure 4: FAB mass spectrum; x-axis: mass, 1300 to 2300; two labelled MH+ parent ions.]
Fig. 4. FAB-mass spectrum of the carboxyl terminus fraction isolated by HPLC. Ions corresponding to intact protonated carboxyl-terminal segments of the F2-polypeptide are indicated together with the amino acid residue numbers which are spanned by these segments. Asterisks mark phosphoric acid adducts of these parent ions. (Reprinted with permission from ref. 26.)
of the carboxyl-terminus of this F2-polypeptide. Two fragments were detected as parent ions which indicated the existence of two discrete carboxyl-termini, one terminating before the first pair of basic residues and the other before the second pair of basic residues. This accounted for the discrepancy in the amino acid compositional analysis. This process was repeated for the other strains of interest. Occurrence of two different carboxyl-termini was not restricted to the virulent strain for which data are shown (Fig. 5). The reason for this appears to involve two discrete cleavages at the cleavage motifs by arginine-specific endoproteinases followed by removal of the exposed basic amino acids by an exoproteinase. Review of the amino acid analysis data (Fig. 3) indicated that where two cleavages were apparent, neither position was favoured over the other. This was not apparent from the intensities of the ions produced by FAB-MS, presumably due to the different masses and compositions of the respective sequences. These findings have now been exploited to design synthetic templates for production of antipeptide antibodies capable of selectively reacting with the fully processed forms of the various cleavage-activation motifs. These antibodies have been shown to be applicable to rapid in vitro differentiation between the various pathotypes of NDV.
[Figure 5: alignment of the F2/F1 junction sequences, F2-polypeptide region on the left and F1-polypeptide region on the right:
AV  Ser-Gly-Gly-Arg-Arg  Gln-Lys-Arg  Phe-Ile-Gly-Ala
V4  Ser-Gly-Gly-Gly-Lys  Gln-Gly-Arg  Leu-Ile-Gly-Ala
EG  Ser-Gly-Gly-Glu-Arg  Gln-Gly-Arg  Xaa-Xaa-Xaa-Xaa
WA  Ser-Gly-Gly-Gly-Arg  Gln-Glu-Arg  Leu-Val-Gly-Ala]
Fig. 5. Alignment of sequences at the junctions of the F2- and F1-polypeptide regions of the fusion protein precursors of the virulent Australia-Victoria (AV), the low virulence (EG) and avirulent (V4 and WA) strains of NDV. Except for the EG strain, the F1-polypeptides have been isolated and subjected to stepwise Edman degradation; thus, Xaa denotes undefined amino acids of the EG protein. Bold arrows denote actual sites of cleavage of the precursors as defined by the approach described herein. Basic amino acids (Arg and Lys) to the left of the bold arrows are trimmed from the cleavage sites subsequent to the primary cleavage. (Reprinted with permission from ref. 26.)
Glycosylation of NDV proteins
Another post-translational modification that is relevant in the context of virulence is glycosylation. Introduction of a glycosyl moiety (i.e. an oligosaccharide attached to the side chain of an amino acid) adjacent to a cleavage motif typical of virulent viruses has been shown to impede cleavage-activation of virus fusion activity [35]. Hence, it is essential to determine glycosylation states of fusion proteins in order to assess whether or not this post-translational modification has any bearing on virulence of a particular virus strain. In addition, glycosylation could conceivably influence the ability of virus receptor proteins to interact with ligands on target cell membranes or even the formation of infectious virions [21]. Proteins of viruses with restricted cellular tropism may have distinctly different glycosyl structures compared with proteins of virulent viruses. Because virulent strains can spread to a larger range of cell types, their proteins will be subjected to a greater variety of glycosylation compartments and may have a relatively varied glycosyl profile compared to the corresponding proteins of avirulent strains. These postulates have been verified for some viruses [21] and are worthy of investigation for all virus types. MS is a powerful tool in the process of mapping glycosylation sites of proteins [8,13]. An example of this is the F2-polypeptides of the fusion proteins of various strains of NDV.
All of the F2-polypeptides examined above were shown to have a consensus sequence for glycosylation. Thus, the possibility existed for variable occupancy of this site to influence cleavage-activation of their fusion protein precursors [35]. MS analysis of an unfractionated trypsin digest (FAB-mapping) of an F2-polypeptide isolated from one strain of NDV failed to reveal an ion with the mass predicted on the basis of the amino acid composition of the tryptic peptide bearing the glycosylation consensus sequence (Fig. 6). On the other hand, treatment of the isolated F2-polypeptide with the deglycosylating enzyme, peptideN:glycanase, and reisolation of the F2-polypeptide by HPLC (Fig. 7) prior to FAB-mapping revealed an ion of weak intensity with a mass (934 dalton) of the relevant peptide (Fig. 6B). HPLC mapping was also applied to the digest in order to gather more convincing data. This revealed that the HPLC retention of a single tryptic peptide was influenced by deglycosylation (Fig. 8). The putative glycosylated peptide was isolated by HPLC but it failed to produce ions by MS; however, the peptide with shifted HPLC retention, as a consequence of deglycosylation, produced ions of the appropriate mass. This process was performed with all of the NDV F2-polypeptides of interest and each was found to be glycosylated at the consensus sequence site. It is unlikely, therefore, that differential glycosylation of the F2-polypeptide has any significant influence on the susceptibilities of the fusion protein precursors to cleavage-activation. For the strains of NDV studied in this laboratory the distribution of proteases with specificities for the various cleavage-activation motifs appears to be the major factor influencing virulence. Some heterogeneity was evident in the glycosyl moieties of these peptides and this property deserves further investigation for potential exploitation in a diagnostic sense.
[Figure 6: FAB mass spectra of V4 F2 tryptic digests. Panel A: intact V4 F2. Panel B: deglycosylated V4 F2. x-axis: mass.]
Fig. 6. FAB-mass spectra of unfractionated digests of the NDV V4 strain F2-polypeptide. The digest in the top spectrum was obtained on an F2-polypeptide sample not treated with peptideN:glycanase, whereas tryptic digestion of the sample depicted in the lower spectrum was performed after removal of the oligosaccharides. Ions marked M correspond to polymers of the glycerol-thioglycerol (1:1) matrix; numbers above the ions correspond to masses of tryptic peptides expected from the F2-polypeptide. The ion at 934 dalton in the lower spectrum corresponds to residues 48-55 without oligosaccharide attached at the glycosylation consensus sequence -Asn-Arg-Thr- at residues 54 to 56.
[Figure 7: HPLC elution profiles of the V4 F2-polypeptide, panels A and B (B labelled "deglycosylated"); x-axis: time (min).]
Fig. 7. Characterisation of glycosylation of the NDV V4 strain F2-polypeptide by HPLC. (A) Elution profile of the isolated reduced and alkylated F2-polypeptide prior to removal of N-linked oligosaccharides. (B) Effect of removal of the oligosaccharides with the enzyme, peptideN:glycanase, on HPLC retention of the F2-polypeptide.
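The FAB-mapping logic used for the F2-polypeptide can be mimicked in outline: digest the sequence in silico, then flag the tryptic peptide that carries the asparagine of a consensus sequence as the one whose unmodified mass is expected to be missing when the site is occupied. The Python sketch below makes several assumptions that should be read as such: trypsin is modelled as cleaving after every lysine and arginine with no missed cleavages, any residue is accepted at Xaa, the function names are invented, and the sequence is hypothetical (the V4 F2 sequence is not given in the text). The example is built so that the consensus sequence spans a cleavage site, as -Asn-Arg-Thr- does in Fig. 6.

```python
import re


def tryptic_peptides(sequence):
    """Split after every Lys/Arg; return (start, end, peptide), 1-based."""
    peptides = []
    start = 0
    for index, residue in enumerate(sequence):
        if residue in "KR":
            peptides.append((start + 1, index + 1, sequence[start:index + 1]))
            start = index + 1
    if start < len(sequence):
        peptides.append((start + 1, len(sequence), sequence[start:]))
    return peptides


def sequon_asparagines(sequence):
    """1-based positions of Asn residues in Asn-Xaa-Thr/Ser."""
    return [m.start() + 1 for m in re.finditer(r"N(?=[A-Z][ST])", sequence)]


def flag_candidate_glycopeptides(sequence):
    """Mark each tryptic peptide that contains a consensus-sequence Asn."""
    asn_positions = sequon_asparagines(sequence)
    return [
        (start, end, peptide, any(start <= n <= end for n in asn_positions))
        for start, end, peptide in tryptic_peptides(sequence)
    ]


# Hypothetical sequence; Asn9-Arg10-Thr11 straddles a tryptic cleavage site.
for start, end, peptide, flagged in flag_candidate_glycopeptides("GASVKLTDNRTWEAK"):
    note = "candidate glycopeptide" if flagged else ""
    print(f"{start}-{end} {peptide} {note}")
```

Locating the asparagine on the whole sequence before digestion, rather than searching each peptide separately, is what allows a consensus sequence split by cleavage to be caught.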
[Figure 8: HPLC of V4 F2 tryptic digests. Panel A: untreated. Panel B: deglycosylated. x-axis: time (min), 0 to 75.]
Fig. 8. HPLC of tryptic digests obtained on NDV V4 strain F2-polypeptide preparations prior to (A) or after (B) treatment with peptideN:glycanase. The peptide which changed elution time from 30 min (A) to 34.5 min (B), as a consequence of oligosaccharide removal (as indicated by arrows), only ionised as the deglycosylated form (MH+ = 934 dalton).
Other post-translational modifications of Newcastle disease proteins
Proteins obtained by insertion of viral genes into artificial expression systems hold great potential for use as diagnostic agents, vaccines or therapeutic agents. However, it is essential to establish that these expression systems reproduce exactly the same structure produced as a consequence of viral infection. Thus it is essential to define the structures of proteins isolated from intact virions or virally infected cells. There are a variety of modifications pertinent to this argument. Thus, general screening strategies have been developed for detecting post-translational modifications [13,17,18,36]. One particular strategy [36] employs MS in complementary association with amino acid analysis and stepwise Edman degradation [37] based amino acid sequencing. Addition of blocking groups to the amino-termini and phosphate groups to the side chains of serine and threonine residues of NDV nucleocapsid proteins will be used below to illustrate this strategy. The process is essentially a peptide mapping procedure in which the protein(s) of interest are fragmented with proteolytic enzyme(s) and their constituent fragments (peptides) separated by HPLC for amino acid analysis and MS. Any peptide(s) which deviate in mass compared to theoretical masses predicted by gene sequence analysis or calculated from amino acid compositions are subjected to Edman degradation based sequence analysis. This is done in order to differentiate between strain-specific sequence variations and post-translational modifications. In rare instances peptides do not ionise under FAB conditions; if this does occur Edman degradation must be used to examine such peptides for unusual amino acid derivatives. The presence of amino-terminal blocking substituents, such as acyl derivatives, precludes Edman degradation sequencing of proteins and peptides. If amino-terminal blockage is involved then mass spectrometry is employed for sequence determination and to characterise the
[Figure 9: HPLC traces A, B and C; x-axis: time (min).]
Fig. 9. HPLC of tryptic (A and B) and AspN-protease (C) digests of isolated NDV V4 strain nucleocapsid protein. Absorbance due to peptide bonds was monitored at 210 nm (A and C) and absorbance due to the fluorescent AEDANS adduct of cysteinyl residues was monitored at 350 nm (B). All peaks of absorbance were collected as discrete fractions and subjected to amino acid analysis and MS. Stepwise Edman degradation based sequence analysis was used for fractions which failed to yield corroborative data by both amino acid analysis and MS or which produced data different from the sequence deduced for the D26 nucleocapsid protein. No evidence was obtained for the presence of phosphorylated residues although the sequence spanning residues 460-489 was not characterised by MS. (Reprinted with permission from ref. 36.)
[Figure 10: (A) daughter-ion spectrum, x-axis m/z; (B) structure of the sequenced peptide.]
Fig. 10. Spectrum of an AspN-protease peptide of the NDV V4 strain nucleocapsid protein subjected to collisionally induced decomposition (A). These data were consistent with the sequence depicted (B) which corresponds to the amino terminus of the strain D26 nucleocapsid protein except for the lack of an initiating methionine and the presence of an acetyl group. (Reprinted with permission from ref. 36.)
[Figure 11: residue-by-residue alignment of the D26 and V4 nucleocapsid protein sequences, residues 1 to 489.]
Fig. 11. Comparison of the amino acid sequence deduced for the strain D26 nucleocapsid protein with the sequence determined directly on the strain V4 nucleocapsid protein. (Reprinted with permission from ref. 36.)
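The mass reasoning behind the amino-terminal assignment can be sketched as a comparison of an observed MH+ value against a small set of processing hypotheses. The Python example below is illustrative and rests on assumptions: the peptide MSSVF is taken here as the opening residues of the deduced D26 sequence preceding the first aspartic acid, monoisotopic masses are used, the "observed" value of 481.2 is hypothetical rather than a reading from Fig. 10, and the function names and the 0.5 dalton tolerance are invented for the example.

```python
# Monoisotopic residue masses (Da) for the residues used in this example.
RESIDUE_MASS = {"M": 131.04049, "S": 87.03203, "V": 99.06841, "F": 147.06841}
WATER = 18.01056
PROTON = 1.00728
ACETYL = 42.01056  # net addition of C2H2O on amino-terminal acetylation


def mh_plus(peptide, acetylated=False):
    """Monoisotopic MH+ of a peptide, optionally amino-terminally acetylated."""
    mass = sum(RESIDUE_MASS[residue] for residue in peptide) + WATER + PROTON
    return mass + ACETYL if acetylated else mass


def n_terminal_hypotheses(peptide):
    """Predicted MH+ for each amino-terminal processing hypothesis."""
    hypotheses = {
        "intact, free amino-terminus": mh_plus(peptide),
        "intact, acetylated": mh_plus(peptide, acetylated=True),
    }
    if peptide.startswith("M"):
        trimmed = peptide[1:]
        hypotheses["Met removed, free amino-terminus"] = mh_plus(trimmed)
        hypotheses["Met removed, acetylated"] = mh_plus(trimmed, acetylated=True)
    return hypotheses


def matching_hypotheses(peptide, observed_mh, tolerance=0.5):
    """Names of hypotheses whose predicted MH+ lies within the tolerance."""
    return [
        name
        for name, predicted in n_terminal_hypotheses(peptide).items()
        if abs(predicted - observed_mh) <= tolerance
    ]


# Hypothetical observed value, chosen only to exercise the comparison.
print(matching_hypotheses("MSSVF", 481.2))
```

A mass match of this kind only narrows the possibilities; in the work described, the assignment was confirmed by daughter-ion scanning.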
blocking group. Unfortunately, many post-translationally modified PTH-amino acids are not stable to the conditions of Edman degradation or are not readily extracted from the sequenator or are difficult to identify. This is an area of weakness in this strategy which needs to be addressed. The nucleocapsid protein isolated from the V4 strain of NDV was found by Edman degradation sequencing to have a blocked amino-terminus. In addition, studies using radiolabelled phosphate indicate that this paramyxovirus protein is phosphorylated [38-40]. These findings made this protein an ideal candidate for the mapping (screening) strategy outlined above [34]. The isolated protein was subjected to trypsin or AspN-protease cleavage and the resultant fragments (peptides) were isolated by HPLC (Fig. 9). The blocking group was inferred to be an acetyl group by MS of the HPLC fraction that yielded an amino acid composition consistent with the amino-terminus of the nucleocapsid protein. This was confirmed by scanning daughter ions generated by collisionally induced decomposition of the parent ion of the blocked peptide (Fig. 10). Extensive analysis of these peptides enabled the complete amino acid sequence to be aligned with a gene analysis based deduced sequence for the nucleocapsid protein of another strain (D26) [41]. No mass discrepancies were revealed that could be accounted for by phosphorylation although two amino acid variations, relative to the reference D26 sequence, were detected (Fig. 11). However, peptides derived from the carboxyl-terminal end of the protein (residues 460-489), by either enzyme, failed to produce ions under FAB-MS conditions, even when negative ion detection was employed. Serine and threonine and their phosphorylated forms produce unmanageable PTH derivatives, due to both degradation and extraction problems (see above). Thus, failure to detect phospho-threonine or phospho-serine by direct Edman degradation sequencing of this section of the protein was not surprising, but is not conclusive evidence for the presence or absence of phosphorylated residues in this section of the protein. This process has now been repeated with four other strains of NDV. Whilst no phosphorylation has been detected, this work has revealed substantial information on strain-specific variation of the nucleocapsid proteins. All of the nucleocapsid proteins that have been examined have had blocked amino-termini. In each case the nature of the blocking group was defined by both tryptic and
AspN-protease peptides. The amino acid compositions of these peptides were consistent with the amino-termini predicted from gene sequence analysis, allowing for loss of one methionine (M). This was confirmed by MS, which showed addition of an acetyl group at the amino-terminus. This demonstrated that during biosynthesis of the nucleocapsid protein the first methionine is trimmed off prior to addition of an acetyl group at the amino-terminus of the trimmed protein.

Future role of MS in analysis of viral protein structure, processing and function

Even this short review on application of MS to a particular virus readily reveals some limitations in FAB as an ionisation system for some polar biological macromolecules. Glycopeptides have often failed to produce ions and other intractable peptide sequences are sometimes encountered (e.g. the carboxyl-terminus of the nucleocapsid protein of NDV). It is possible, however, that these limitations may be overcome by the use of electrospray-MS [42] or matrix-assisted laser desorption-time of flight-MS [43] for analysis of these difficult peptides. It will be of interest to evaluate the carboxyl-terminus of the NDV nucleocapsid proteins for phosphorylation using electrospray ionisation.

These techniques will possibly have an even greater impact due to their utility in analysis of the molecular weights of intact proteins [42,43]. This could be used as a screening tool to gain an indication of post-translational modifications prior to exhaustive dissection of a protein to pinpoint and comprehensively define suspected modifications. They may also prove valuable for defining the stoichiometry and dynamics of formation of stable macromolecular aggregates between viral proteins.

It is apparent that some post-translational modifications of viral proteins are transient events during cellular infection and are reversed or altered prior to final assembly of intact virions.
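The mass-based screening discussed in this section, in which a measured mass is compared with the mass predicted from a gene-derived sequence, can be illustrated numerically. The following is only a minimal sketch, not the procedure used in the work reviewed here. It assumes neutral monoisotopic masses (a FAB spectrum would normally show the protonated species), standard residue masses and mass shifts for acetylation, phosphorylation and removal of the initiating methionine, and a hypothetical six-residue peptide that is not taken from any NDV sequence. The function names and the tolerance are likewise illustrative assumptions.

```python
from itertools import product

# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056

# Mass shifts (Da) for the modifications considered in the text.
MET_LOSS = -RESIDUE_MASS["M"]   # removal of the initiating methionine
ACETYL = 42.01057               # addition of C2H2O
PHOSPHO = 79.96633              # addition of HPO3


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def explain_shift(observed, predicted, max_phospho=2, tol=0.05):
    """List modification combinations whose summed shift matches the
    observed-minus-predicted mass difference within tol (Da)."""
    delta = observed - predicted
    hits = []
    for met, ac, n_phos in product((0, 1), (0, 1), range(max_phospho + 1)):
        shift = met * MET_LOSS + ac * ACETYL + n_phos * PHOSPHO
        if abs(shift - delta) <= tol:
            hits.append({"met_removed": bool(met),
                         "acetyl": bool(ac),
                         "phospho": n_phos})
    return hits


# Hypothetical amino-terminal peptide predicted from a gene sequence.
predicted = peptide_mass("MSTAGK")
# Simulated measurement: methionine trimmed off, then acetylated.
observed = peptide_mass("STAGK") + ACETYL

print(round(observed - predicted, 3))
print(explain_shift(observed, predicted))
```

In this simulated case the only combination that matches the mass difference is methionine removal plus one acetyl group with no phosphate. In practice the tolerance would have to reflect the mass accuracy of the instrument, and a mass match alone does not locate the modified residue.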
The possibility exists that phosphorylation of the NDV nucleocapsid protein is a transient event and that the phosphorylated forms seen in virions are only a minor fraction of the nucleocapsid protein. A definite example of a dynamic post-translational modification is the disulphide bond rearrangements which occur within the haemagglutinin-neuraminidase and fusion proteins of NDV during intracellular biosynthesis. These proteins form disulphide patterns at an early stage of biosynthesis
which are subsequently rearranged for formation of the active proteins of mature virions [44,45]. Clearly, it will be essential to examine very small quantities of intermediates extracted from infected cells in order to define these modifications and events at a molecular level. The high sensitivity of electrospray and laser desorption ionisation techniques, coupled with their capacity for molecular weight determination of intact proteins, may allow these possibilities to be assessed and defined. This may be pivotal to development of antiviral drugs targeted at interfering with virus maturation since transient protein states may be vital intermediates for formation of infectious virions. Design of therapeutic agents aimed at interfering with the functions or formation of these proteins may need to be based on these intermediates rather than the final stable structures of proteins found in mature virions. The same arguments apply to the design of therapeutic agents to interfere with the functions of viral regulatory or non-structural proteins which are vital to formation of mature virus particles. These proteins can be obtained from artificial protein expression systems in order to study their molecular structures and functions; however, as argued above, it will be necessary to compare expressed proteins with their counterparts from infected cells or cells transfected with appropriate gene coding sequences in order to be certain that drug design will be rationally based on protein structures within infected cells. Molecular mass determination may also form the basis for introduction of MS into diagnostic protocols for virus characterisation/diagnosis. It may be possible to extract specific polypeptides from diagnostic specimens by immunochemical methods for mass analysis. 
For example, extraction and mass analysis of the F2-polypeptides from specimens suspected to contain NDV could form the basis of an NDV pathotyping assay, by assessing the extracted polypeptides for mass differences arising from processing at paired or single basic amino acids.

Acknowledgements

The data discussed in this article were obtained whilst I was head of the Protein Chemistry Project at the Australian Animal Health Laboratory in Geelong. I wish to thank my co-workers who participated in acquisition of these data, particularly Gary L. Corino for performance of the mass spectrometric analyses.
References

1 M. Barber, R.S. Bordoli, R.D. Sedgwick and A.N. Tyler, J. Chem. Soc. Chem. Commun., 7 (1981) 325.
2 D.J. Surman and J.C. Vickerman, J. Chem. Soc. Chem. Commun., 7 (1981) 324.
3 M. Barber, R.S. Bordoli, G.J. Elliot, R.D. Sedgwick and A.N. Tyler, Anal. Chem., 54 (1982) 645A.
4 B.N. Green and R.S. Bordoli, in S.J. Gaskell (Editor), Mass Spectrometry in Biomedical Research, Wiley, New York, 1986, Ch. 13, p. 233.
5 B.W. Gibson and K. Biemann, Proc. Natl. Acad. Sci. USA, 81 (1984) 1956.
6 K. Biemann and H.A. Scoble, Science (Washington, D.C.), 237 (1987) 992.
7 H.R. Morris, M. Panico and G.W. Taylor, Biochem. Biophys. Res. Commun., 117 (1983) 299.
8 H.R. Morris and F.M. Greer, Trends Biotechnol., 6 (1988) 140.
9 R.M. Hewick, M.W. Hunkapiller, L.E. Hood and W.J. Dreyer, J. Biol. Chem., 256 (1981) 7990.
10 P. Bohlen, Methods Enzymol., 91 (1983) 17.
11 F. Sanger, S. Nicklen and A.R. Coulson, Proc. Natl. Acad. Sci. USA, 74 (1977) 5463.
12 D.V. Goeddel, D.G. Kleid, F. Bolivar, H.L. Heyneker, D.G. Yansura, R. Crea, T. Hirose, A. Kraszewski, K. Itakura and A.D. Riggs, Proc. Natl. Acad. Sci. USA, 76 (1979) 106.
13 S.A. Carr, M.E. Hemling and G.D. Roberts, in D.H. Schlesinger (Editor), Macromolecular Sequencing and Synthesis, Selected Methods and Applications, A.R. Liss, New York, 1988, Ch. 9, p. 83.
14 S.A. Carr, M.E. Hemling, M.F. Bean and G.D. Roberts, Anal. Chem., 63 (1991) 2802.
15 V.R. Anicetti, B.A. Keyt and W.S. Hancock, Trends Biotechnol., 7 (1989) 342.
16 P. Hudson, J. Haley, M. Cronk, J. Shine and H. Niall, Nature, 291 (1981) 127.
17 S.A. Carr, G.D. Roberts and M.E. Hemling, in C.N. McEwen and B.S. Larsen (Editors), Mass Spectrometry of Biological Materials, Marcel Dekker, New York, 1990, Ch. 3, p. 87.
18 S.C.B. Yan, B.W. Grinnell and F. Wold, Trends Biochem. Sci., 14 (1989) 264.
19 E. Bause and H. Hettkamp, FEBS Lett., 108 (1979) 341.
20 D.K. Struck and W.J. Lennarz, in W.J. Lennarz (Editor), The Biochemistry of Glycoproteins and Proteoglycans, Plenum Press, New York, 1980, Ch. 2, p. 35.
21 T.W. Rademacher, R.B. Parekh and R.A. Dwek, Ann. Rev. Biochem., 57 (1988) 785.
22 F.A. Robey, in K.A. Walsh (Editor), Methods in Protein Sequence Analysis, Humana Press, Clifton, NJ, 1987, p. 67.
23 A.P. Waterson, T.H. Pennington and W.H. Allan, Br. Med. Bull., 23 (1967) 138.
24 Y. Nagai, H.-D. Klenk and R. Rott, Virology, 72 (1976) 494.
25 J.J. Gorman, A. Nestorowicz, S.J. Mitchell, G.L. Corino and P.W. Selleck, J. Biol. Chem., 263 (1988) 12522.
26 J.J. Gorman, G.L. Corino and P.W. Selleck, Virology, 177 (1990) 339.
27 A. Scheid and P.W. Choppin, Virology, 80 (1977) 54.
28 H.-D. Klenk, F.X. Bosch, W. Garten, T. Kohama, Y. Nagai and R. Rott, Coloq. Ges. Biol. Chem., 30 (1979) 139.
29 P.W. Choppin and A. Scheid, Trans. Am. Clin. Climatol. Assoc., 90 (1979) 56.
30 T. Toyoda, T. Sakaguchi, K. Imai, N.M. Inocencio, B. Gotoh, M. Hamaguchi and Y. Nagai, Virology, 158 (1987) 242.
31 R.L. Glickman, R.J. Syddall, R.M. Iorio, J.P. Sheehan and M.A. Bratt, J. Virol., 62 (1988) 354.
32 L.W. McGinnes and T.G. Morrison, Virus Res., 5 (1986) 343.
33 J.J. Gorman, Anal. Biochem., 160 (1987) 376.
34 J.J. Gorman, G.L. Corino and S.J. Mitchell, Eur. J. Biochem., 168 (1987) 169.
35 K.L. Deshpande, V.A. Fried, M. Ando and R.G. Webster, Proc. Natl. Acad. Sci. USA, 84 (1987) 36.
36 J.J. Gorman, G.L. Corino and B.J. Shiell, Biomed. Environ. Mass Spectrom., 19 (1990) 646.
37 P. Edman and G. Begg, Eur. J. Biochem., 1 (1967) 80.
38 R.A. Lamb and P.W. Choppin, Virology, 81 (1977) 382.
39 G.W. Smith and L.E. Hightower, J. Virol., 37 (1981) 256.
40 C.-H. Hsu and D.W. Kingsbury, Virology, 120 (1982) 225.
41 N. Ishida, H. Taira, T. Omata, K. Mizumoto, S. Hattori, K. Iwasaki and M. Kawakita, Nucleic Acids Res., 14 (1986) 6551.
42 J.B. Fenn, M. Mann, C.K. Meng, S.F. Wong and C.M. Whitehouse, Science (Washington, D.C.), 246 (1989) 64.
43 M. Karas, D. Bachmann, U. Bahr and F. Hillenkamp, Int. J. Mass Spectrom. Ion Processes, 78 (1987) 53.
44 L.W. McGinnes, A. Semerjian and T. Morrison, J. Virol., 56 (1985) 341.
45 T.G. Morrison, M.E. Peeples and L.W. McGinnes, Proc. Natl. Acad. Sci. USA, 84 (1987) 1020.

Dr. J.J. Gorman is at the Biomolecular Research Institute, 343 Royal Parade, Parkville, Vic. 3502, Australia.