Crystal Structure of the Maturation Protein from Bacteriophage Qβ

Janis Rumnieks, Kaspars Tars

PII: S0022-2836(17)30040-2
DOI: 10.1016/j.jmb.2017.01.012
Reference: YJMBI 65320

To appear in: Journal of Molecular Biology

Received date: 14 December 2016
Revised date: 15 January 2017
Accepted date: 16 January 2017

Please cite this article as: Rumnieks, J. & Tars, K., Crystal structure of the maturation protein from bacteriophage Qβ, Journal of Molecular Biology (2017), doi:10.1016/j.jmb.2017.01.012

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

ACCEPTED MANUSCRIPT

Crystal structure of the maturation protein from bacteriophage Qβ

Janis Rumnieks1 and Kaspars Tars1,2

1Biomedical Research and Study Center, Ratsupites 1, LV1067, Riga, Latvia

2Faculty of Biology, University of Latvia, Jelgavas 1, LV1004, Riga, Latvia

Abstract

Virions of the single-stranded RNA bacteriophages contain a single copy of the maturation protein, which is bound to the phage genome and is required for infectivity of the particles. The maturation protein mediates the adsorption of the virion to bacterial pili and the subsequent release and penetration of the genome into the host cell. Here, we report a crystal structure of the maturation protein from bacteriophage Qβ. The protein has a bent, highly asymmetric shape and spans 110 Å in length. Apart from small local substructures, the overall fold of the maturation protein does not resemble that of other known proteins. The protein is organized in two distinct regions, an α-helical part with a four-helix core, and a β-stranded part that contains a seven-stranded sheet in the central part and a five-stranded sheet at the tip of the protein. The Qβ maturation protein has two distinctive positively charged areas at opposite sides of the α-helical part which are involved in genomic RNA binding. The maturation protein binds to each of the surrounding coat protein dimers in the capsid differently, and the interaction is considerably weaker compared to coat protein inter-dimer contacts. The coat protein- or RNA-binding residues are not preserved among different ssRNA phage maturation proteins; instead, the distal end of the α-helical part is evolutionarily the most conserved, suggesting the importance of this region for maintaining the functionality of the protein.

Keywords: RNA phages, Qβ, virus structure, maturation protein

Introduction

The bacteriophages of the Leviviridae family represent the smallest and simplest of the known phages with short, positive-sense, single-stranded RNA (ssRNA) genomes that encode only a few proteins. The F-pili specific ssRNA phages MS2 and Qβ have become the model Leviviridae phages, and much of what is known about ssRNA phage biology comes from studies of these two viruses. The MS2 and Qβ phages represent the two currently recognized genera of the Leviviridae family, the leviviruses and alloleviviruses, respectively, and are fairly distinct from each other.

The ssRNA phages have icosahedral capsids about 28 nm in diameter that are built up from 89 coat protein dimers, 60 in the quasi-equivalent “AB” and 29 in the “CC” conformation. In alloleviviruses, some of the coat protein molecules contain an additional C-terminal domain which is generated via read-through of the leaky termination codon of the coat gene [1]. The read-through domain is required for infectivity of the particles [2], although its exact function is still unclear. In addition, the ssRNA phage virions contain a single copy of the “maturation” protein, often referred to as the “A” protein in leviviruses or “A2” protein in alloleviviruses. During assembly of nascent virions in the infected cell, the maturation protein binds to the genome and gets packaged into the capsid along with it, where it remains partially exposed on the surface of the particle. The maturation protein replaces a single “CC” coat protein dimer in the particle, thus disrupting the perfect icosahedral symmetry of the capsid [3]. As a structural component of the virion, the maturation protein serves to adsorb the particle to bacterial pili, which the ssRNA phages use as cellular receptors [4], and mediates genome ejection [5] and penetration into the host cell [6]. In addition, the allolevivirus maturation or “A2” protein mediates cell lysis [7, 8] by blocking the bacterial MurA enzyme of the murein biosynthesis pathway [9] while the leviviruses have a dedicated small lysis protein [10, 11].

Molecular details of many of the maturation protein-mediated interactions are largely unknown, and for decades, virtually nothing has been known about the structure of these proteins. Recent advances in cryo-electron microscopy have provided the first glimpses of these proteins [12, 13] and their interaction with the other components of the virion, but an atomic-resolution structure of an ssRNA phage maturation protein is still not available. Here, we report the crystal structure of the Qβ A2 protein to 3.3 Å resolution.

Results

Crystallization of the A2 protein

The ssRNA phage maturation proteins are distinctively insoluble in an isolated form [14], which has made studies on these proteins, including their structural characterization, a difficult task. However, during studies of the maturation protein from bacteriophage Qβ, Dr. Ry Young’s group discovered that the protein remains soluble when fused to the maltose binding protein (MBP), which is known to often act as a molecular chaperone [15]. We set out to determine the structure of the MBP-A2 fusion protein; however, it turned out to be highly aggregated in solution and not suitable for crystallographic studies. The situation was improved by adding glycerol and salt to the sample buffer, after which a fraction of the protein appeared in a monomeric form and remained non-aggregated upon concentration. When crystallization of the MBP-A2 protein was attempted, the only crystals obtained were small and grew under a condition containing sodium polyacrylate. We assumed that the carboxyl groups in the acrylate polymer might mimic the phosphates of the RNA backbone and stabilize the protein in a crystallizable state. It was then discovered that when added to the lysis and sample buffers, sodium polyacrylate is very effective in maintaining the A2 protein in a monomeric form without the requirement for glycerol or salt. Furthermore, in the presence of sodium polyacrylate, the A2 protein remained soluble when cleaved and separated from the MBP. These advances allowed us to obtain A2 crystals that diffracted to 3.3 Å resolution. The structure was subsequently solved using selenomethionine-labeled protein with engineered additional methionine residues.

Overall structure

The asymmetric unit of the crystal contains four A2 molecules which are represented in the final model as chains A to D. All chains could be modeled without breaks and contain all residues of the wild-type A2 protein. The A2 protein has a highly elongated shape with a maximum width of about 50 Å and length of about 110 Å (Fig. 1). The overall shape of the protein roughly resembles a bow with the two ends of the molecule bent towards each other at an approximately 120° angle. About half of the protein is built up from α helices (“the α-part”) while the rest almost entirely consists of β strands (“the β-part”). While the α- and β-parts are located at the opposite ends of the molecule, they cannot be regarded as separate domains as they have a common hydrophobic core and multiple connections between them.

The α-part is roughly globular with approximate dimensions of 50 by 20 Å and consists of an antiparallel four-helix core with three shorter adjacent helices. One of the helices is notably longer than the others and after a sharp kink approaches the β-part. The β-part is more irregular and spans 80 Å in length, 30 Å in width and 15 Å in thickness. The β-part consists of 14 strands in total and is built up from a short β hairpin, a short α helix and two antiparallel β sheets. The larger of the sheets consists of seven strands and is located at the central part of the protein where it spans the entire width of the β-part, and another five-stranded sheet forms the tip of the A2 protein. The three outermost strands of the seven-stranded sheet are considerably longer than the others and also extend to the tip of the β-part, but remain separate from the other sheet. The main chain traverses the α- and β-parts five times, and the two regions further interact via packing of two long loops from both parts and the β-hairpin.

There are some differences in the main chain conformation between the four monomers in the crystallographic asymmetric unit (Fig. S1). The A chain deviates the most from the others with an rmsd for Cα atoms of 1.8-2 Å to the other chains while the C and D chains are the most similar with a Cα atom rmsd of 0.7 Å. The main differences are located at the distal ends of the molecule, particularly in the loop connecting helices α3 and α4 and around helix α6 in the α-part, and at the five-stranded sheet in the β-part. The different conformations are induced by crystal contacts between the molecules, and reflect the apparent natural flexibility of the A2 protein.
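For readers unfamiliar with the rmsd values quoted for the chain comparison, the following is a minimal sketch of how a Cα rmsd can be computed. It assumes that the two chains have already been superposed and that the coordinate lists are matched residue by residue; the superposition step itself is not shown. The function name `ca_rmsd` and the toy coordinates are hypothetical and are not taken from the deposited model.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z) points.

    Assumes the two coordinate sets are already superposed and listed in the
    same residue order; no fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the paired C-alpha positions.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical toy coordinates in angstroms, for illustration only.
chain_c = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
chain_d = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
print(ca_rmsd(chain_c, chain_d))  # every pair is 1 Å apart, so the rmsd is 1.0
```

In practice the superposition and the rmsd are usually obtained together from a structure-alignment program, so this sketch only illustrates the final averaging step.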

Structural similarities to other proteins

Comparison of the A2 structure to other proteins currently in the Protein Data Bank does not reveal any matches for the whole length of the protein. Four of the helices in A2 (α2, α3, α6 and α8) adopt a structure similar to the four-helix bundle found in many functionally diverse helical proteins such as cytochrome c′ (PDB ID: 1CGN), mannose-6-phosphate receptor binding protein (PDB ID: 1SZI) or lincosamide nucleotidyltransferase (PDB ID: 3JZ0), while the C-terminal α9 is not found in any of them. The fold of the β-part of the A2 protein is not similar to other known proteins; however, the structure formed by strands β8, β9, β10, β11, β12 and β13 resembles an incomplete β barrel and is distantly similar to several β barrel membrane proteins such as the Escherichia coli OmpX (PDB ID: 2MNH) or an attachment invasion locus protein from Yersinia pestis (PDB ID: 3QRA). While the functional similarities to these proteins might appear suggestive, it seems unlikely that during the infection stage the respective part of the A2 protein might insert into the cell membrane in a similar manner, as there are no characteristic hydrophobic residues outside of the A2 “barrel”.

Interactions with coat protein

When fitted into the recent medium-resolution cryo-EM map of the Qβ virion [13], the A2 crystal structure offers additional information about how the protein is integrated into the capsid and how it interacts with the genome. The crystal structure fits into the map very well (Fig. S2), demonstrating that despite the apparent flexibility, the A2 protein in the crystal adopts essentially the same structure as in the assembled virus particle. The A2 protein replaces a single coat protein dimer in the “CC” quasi-equivalent conformation, and interacts with the surrounding four “AB” dimers. Contact with the capsid is mediated by the central region of the A2 protein while the α-part extends some 60 Å into the particle and the β-part points away from the capsid at a shallow angle (Fig. 2). Although the central β sheet of the A2 protein distantly resembles the capsid interior-facing β sheet of the coat protein, the interactions between the A2 and coat proteins are completely different from those between coat protein dimers. The A2 protein lies askew across the hole formed by the missing dimer, with one corner of the central β sheet slightly below the capsid surface and the opposite corner above it. The contacts are different with each of the coat protein subunits, with the interface area ranging from 150 to 470 Å² per dimer and approximately 1100 Å² in total, which is about a third of the area (~3100 Å²) that a coat protein dimer buries in interactions with neighboring subunits in the capsid. The relatively weaker interaction is apparently necessary for the A2 protein to be able to leave the capsid upon infection. The cryo-EM reconstruction of the Qβ virion also showed that the coat protein dimers surrounding the A2 protein are slightly pushed apart from their icosahedrally symmetrical positions such that the inter-subunit disulfide bonds between them can no longer form. The local weakening of protein-protein interactions apparently further contributes to efficient release of the genome through the capsid.
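The "about a third" comparison of buried interface areas can be checked with simple arithmetic. The sketch below uses only the two totals quoted in the text (approximately 1100 Å² for A2 and approximately 3100 Å² for a coat protein dimer); the function and constant names are hypothetical, and the individual per-dimer areas are not reproduced because only their range is given.

```python
def interface_fraction(a2_total_area, dimer_total_area):
    """Ratio of the area buried by A2 to the area buried by a coat protein dimer."""
    if dimer_total_area <= 0:
        raise ValueError("reference area must be positive")
    return a2_total_area / dimer_total_area


# Approximate totals quoted in the text, in square angstroms.
A2_INTERFACE_TOTAL = 1100.0
COAT_DIMER_INTERFACE_TOTAL = 3100.0

fraction = interface_fraction(A2_INTERFACE_TOTAL, COAT_DIMER_INTERFACE_TOTAL)
print(f"{fraction:.2f}")  # 0.35, i.e. roughly one third
```

Buried area is only a rough proxy for interaction strength, so the ratio should be read as a qualitative indication rather than a binding-energy estimate.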

Unsurprisingly, the coat protein regions interfacing the A2 protein are the same ones that are involved in inter-subunit contacts. The interaction between the two proteins is stronger at one side of the portal where the N-terminal part of the α2 helix, the turn following strand β11, and the loop between β14 and α9 of the A2 protein interact with the region around the quasi-three-fold symmetry axis of the two adjacent dimers. Due to the resolution of the map it is not possible to reliably identify the contacts, but residues Asn129, Gln137, Arg280, Glu281, Asp342 and Asp395 of the A2 protein are located close to the interface and appear as obvious candidates for the interaction. At the opposite side of the portal, the C-terminus and the loop connecting strand βG and helix αA of one of the coat protein dimers appear to make contact with Lys2 at the N-terminus and Asp12 and Asn13 in the β hairpin of the A2 protein. The last of the four A2-surrounding dimers is approached by an overhanging edge of the five-stranded β sheet of the A2 protein, but the interaction appears to be much weaker and to involve only van der Waals interactions.

Interactions with RNA

The maturation protein makes extensive contacts with the genome, but virtually nothing is known about the details of the interaction. The surface electrostatic potential of the A2 protein reveals two distinct positively charged regions, the first spanning the capsid-interior facing central β sheet and an adjacent region of the α-part, and the other located on the opposite side of the helix bundle down to the bottom of the α-part (Fig. 3). These areas are apparently involved in binding to the RNA backbone, and RNA density is indeed visible in close proximity to these regions in the cryo-EM map. The two positively charged regions each make extensive contacts with a separate RNA hairpin in the genome (Fig. 3), and there are another two less pronounced areas where the RNA interacts with the A2 protein (Fig. S3). The first major RNA-binding region interacts with a rather long RNA hairpin and is formed by the side chains of Asn226, Lys230 and Arg233 in helix α5 that bind to the RNA stem, Val294, Ser295 and Lys298 in α6 that interact with a feature resembling a bulged nucleotide, and Pro71, Ser75, Gly76 and Arg78 in the loop connecting strands β6 and β7 and Arg236 in β10 that make contact with the hairpin loop. In addition, the hairpin loop is positioned close to a nearby coat protein dimer and might make contact with Arg57 in its EF loop. The other pronounced RNA-binding region of the A2 protein includes residues Lys158, Arg165, Arg168, Arg172 and Arg176 in helix α2, Ser409 in α9 and Arg416 near the C-terminus, which bind to the backbone of a long RNA helix running parallel to the helices. Another smaller RNA-binding region is located nearby at the distal end of the α-part and involves residues Arg180 and Arg184 in helix α3 and probably some residues from the nearby loop connecting α3 and α4, which likely adopts a different conformation in the virion than in the crystal. Finally, the side chains of Tyr208 in helix α4 and His308 in the loop connecting α6 and α7 appear to interact with yet another part of the genome, but the RNA structure they bind to is hard to interpret at the given resolution of the map.

Comparison to other ssRNA phage maturation proteins

The maturation proteins of the ssRNA phages are highly divergent with sequence identity below 15% for phages that infect different bacterial genera, and even among the alloleviviruses the sequence identity is less than 50% for the more distantly related phages. This makes sequence alignment for most parts of the protein unreliable, but evidently the α-part is much better conserved than the β-part. The few universally conserved residues that are present even among the most distinct ssRNA phages (Fig. S4) are located in the α-part where they appear to mediate the packing of the helices and loops at the distal end of the protein (Fig. 4). Interestingly, the totally conserved Ser318 in the Qβ A2 protein faces away from the helix interface and does not seem to have an obvious function. Although the residue is exposed to the interior of the particle, the cryo-EM structure suggests that it is not involved in RNA binding; thus it might be speculated that it has a role in some other part of the phage life cycle such as the genome penetration stage.
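As an illustration of what a pairwise sequence identity figure means, the sketch below computes percent identity from two pre-aligned sequences. It is written under stated assumptions: columns with a gap in either sequence are excluded from the denominator, which is one common convention among several, and the function name `percent_identity` and the toy sequences are hypothetical rather than taken from the alignments in Figs. S4 and S5.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns (assumed convention)
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Hypothetical toy alignment, not real maturation protein sequences.
seq_1 = "MKT-AYIAK"
seq_2 = "MRTQAY-AR"
print(f"{percent_identity(seq_1, seq_2):.1f}")  # 5 identical of 7 compared: 71.4
```

Because the value depends on both the alignment and the choice of denominator, identities computed with different tools are only approximately comparable.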

Although both MS2 and Qβ use the same bacterial receptors for adsorption, their maturation proteins share only about 20% sequence identity, mostly in the helical part plus a few residues in the central seven-stranded β-sheet. The MS2 phage is actually evolutionarily closer to leviviruses specific for other conjugative pili than to the F-pili specific alloleviviruses, which suggests that the F-pili specificity in MS2 and Qβ has evolved independently. When only the allolevivirus maturation proteins are aligned (Fig. S5), in addition to extensive sequence conservation in the α-part, the residues that form the central seven-stranded β sheet are also well conserved, while there is little similarity among those that form the five-stranded β sheet at the tip of the β-part. In the virion, the distal end of the β-part is the most exterior-facing region of the A2 protein that presumably interacts both with pili and with the MurA protein, and it is intriguing, and currently unknown, how the allolevivirus maturation proteins achieve this despite the considerable sequence variability.

The phages within the allolevivirus genus cluster into two genogroups, historically denoted III and IV, with Qβ as the type species from group III and phage SP as the classic representative of group IV. A clear distinction between the two genogroups is that group IV phages have a 22 to 31 residue long insertion in the A2 protein. From the Qβ A2 structure, it can now be seen that the insertion maps to the loop between helices α3 and α4 at the far end of the α-part. According to secondary structure prediction of the SP A2 protein, the insertion might form a short additional α helix. The insertion contains several positively charged residues and is located close to the RNA-binding regions of the Qβ A2 protein, and might contribute to genomic RNA binding in group IV phages.

Discussion

After decades of mystery, the crystal structure of the Qβ A2 protein and the cryo-EM reconstructions of the MS2 and Qβ phages have finally started to shed some light on the molecular machinery that the ssRNA phages use for delivering their genome into the host cell. Still, mostly due to the difficulties of working with them in an isolated form, the maturation proteins of the ssRNA phages remain poorly characterized. The A protein from bacteriophage MS2 is the only one that is somewhat characterized functionally, but given the vast sequence variation of the maturation proteins, it is not clear to what extent the functional similarities extend to other ssRNA phages. Biochemical studies have shown that the MS2 A protein binds to the genome at two regions, one in the maturation gene close to the 5′ end of the RNA molecule, and the other near its 3′ terminus [16]. The secondary structure of the MS2 and Qβ genomes is sufficiently distinct to prevent the prediction of the RNA structures that the Qβ A2 protein binds to. Also, the MS2 A protein has been reported to be cleaved into a large and a small fragment during the penetration stage [6], but it is not known whether this applies to other ssRNA phages as well. If, however, an analogous cleavage event is considered for the Qβ A2 protein, from the crystal structure it appears that approximately the first 120 N-terminal amino acids could be removed without disturbing the structure of the rest of the protein too much, while removal of a similarly sized fragment from the C-terminus would require the disruption of the helix bundle, which does not appear very likely. Thus, if the two fragments were to separate from each other, it is more likely that the supposed cleavage site of the A2 protein is located closer to the N-terminus.

One of the hallmark features of the alloleviviruses is the presence of a minor coat protein species A1 in the capsid which is required for infection. The other distinct feature of the genus is the lack of a dedicated lysis protein; instead, cell lysis is mediated by the A2 protein. Thus, it appears that while the lysis function has been transferred to the A2 protein, some functions of the A2 protein have in turn been forwarded to the A1 protein. Preliminary studies in our laboratory have indicated that the A1 protein is involved in binding to F-pili, and experiments are underway to find the exact determinants of the F-pili specificity of the A1 and A2 proteins.

It has been shown that the enzymatic activity of the MurA protein is inhibited by intact Qβ virions [9]. This indicates that the MurA-binding A2 region is located on the surface-exposed β-part of the protein, but it is currently not possible to pinpoint any specific residues which are involved in the A2-MurA interaction. Nonetheless, the crystal structure now makes rational analysis of the surface-exposed A2 residues possible to determine the MurA-binding region experimentally.

In conclusion, we have shown in this study that soluble Qβ A2 protein can be obtained in a homogeneous monomeric state, which has not only allowed us to determine its crystal structure, but also provides much better means for studying other properties of this protein in the future. The crystal structure has revealed the previously unknown fold of an ssRNA phage maturation protein and allowed us to pinpoint residues involved in coat protein and RNA binding. Together, this has advanced our understanding of the structure of the ssRNA bacteriophages, and provides a rich ground for further studies of their maturation proteins.

Materials and methods

Cloning, expression and purification

The coding sequence of the A2 protein was amplified from pBRT7Qβ [17] and cloned into a modified pRSFDuet-1 (Novagen) vector containing the coding sequence for the maltose binding protein followed by a TEV protease cleavage site. Expression of the MBP-A2 protein resulted in efficient cell lysis, a problem that was solved by co-expressing mutant MurA where Leu138 is substituted with a glutamine (L138Q), which confers resistance to A2-mediated lysis [9]. For this, the coding sequence of the MurA protein was amplified from E. coli JM109 genomic DNA and cloned into a bacterial expression vector. The resulting plasmid was used as a template to introduce the L138Q amino acid substitution via PCR mutagenesis, and using overlap extension PCR, a DNA fragment was constructed by fusing the Pbla promoter from pET22b with the MurA L138Q coding sequence. The fragment was inserted into the previously constructed MBP-A2 expression plasmid to ensure constitutive expression of MurA L138Q in the cells.

To produce the MBP-A2 protein, E. coli BL21-AI cells containing the expression plasmid were grown in 2xTY medium at 37 °C until OD590 reached 0.5, then cooled to 16 °C, induced by adding IPTG and arabinose to final concentrations of 1 mM and 0.2%, respectively, and further incubated at 16 °C for 18 h. The cells were harvested by centrifugation, resuspended in a small volume of buffer PAB1 (20 mM Tris-HCl pH 7.5, 0.5% w/v sodium polyacrylate 2100) supplemented with 1 mM PMSF and lysed by sonication. The lysate was cleared by centrifugation and the supernatant loaded on a 1 ml MBPTrap HP (GE Healthcare) column. The column was washed with 15 ml of PAB1 and the bound protein eluted with 3 ml PAB1 supplemented with 10 mM maltose. The eluate was loaded on a HiLoad 16/600 Superdex 200 column (GE Healthcare) equilibrated with PAB1. The second major peak corresponding to monomeric MBP-A2 was collected and treated with recombinant TEV protease for 48 h. After cleavage, the mixture was passed through 1 ml MBPTrap HP and HisTrap HP columns (GE Healthcare), the flow-through concentrated using an Amicon 10 kDa MWCO unit (Millipore) and again purified on a HiLoad 16/600 Superdex 200 column equilibrated with PAB1. Fractions containing the A2 protein were pooled and used for crystallization.

To obtain selenomethionine-labeled protein for phasing, three additional methionine residues were introduced to increase the anomalous signal. The substitutions, L49M, L354M and L407M, were sequentially introduced into the MBP-A2 expression plasmid using PCR mutagenesis. To produce SeMet-labeled protein, the final plasmid was transformed into BL21-AI cells which were grown in 2xTY medium at 37 °C until OD590 reached 0.8. The culture was then centrifuged and the cell pellet resuspended in SelenoMet medium base supplemented with a glucose-free nutrient mix (Molecular Dimensions), 50 mg/l L-isoleucine, 100 mg/l L-lysine, 100 mg/l L-threonine and 0.4% glycerol. The culture was incubated for two hours at 16 °C, induced by adding IPTG and arabinose to final concentrations of 1 mM and 0.2%, respectively, and further incubated at 16 °C for 18 h. The SeMet-labeled protein was purified using the same protocol as for the wild-type protein, with the only exception that 1 mM DTT was added to all buffers.

Crystallization and structure determination

Purified A2 protein in PAB1 was concentrated to 10 mg/ml using Amicon 10 kDa MWCO filters and crystallized using the sitting-drop vapor diffusion technique. The best crystals were obtained by mixing 0.4 μl of the concentrated protein solution with 0.4 μl of a solution containing 40 mM potassium dihydrogen phosphate, 17% PEG 8000 and 20% glycerol. The SeMet protein crystallized in slightly different conditions; the one from which the phasing data were collected was grown in 70 mM potassium dihydrogen phosphate, 21% PEG 8000 and 23% glycerol. Notably, the drop size and mixing conditions were very important for crystal growth, and the crystals could only be obtained when the drops were set up with a Tecan EVO crystallization robot, but never when mixing the drops by hand using a micropipette. Prior to data collection, the crystals were transferred to a mother liquor containing 30% glycerol, and flash-frozen in liquid nitrogen. The data for the native crystals were collected at MAX-lab (Lund, Sweden) beamline I911-3 and for the SeMet-labeled crystals at the BESSY II (Berlin, Germany) beamline 14.1.

Diffraction data were processed using XDS [18] via the XDSAPP 2.0 graphical user interface [19]. The positions of the selenium atoms were located and initial phasing done using SHELX [20] followed by density modification with DM [21] from the CCP4 software package [22]. The resulting map clearly revealed the α helices which were used as the starting point for model building in COOT [23]. For initial stages of the model building, four-fold NCS averaging in DM was used to improve map quality, and the known positions of the selenium atoms were used to guide chain tracing. Using the SeMet data, a partial model containing most of the α-part and some of the central β sheet was built which was then used with the higher resolution native data to complete the model. Refinement was done using REFMAC [24] and the model was validated using COOT and the MolProbity server [25]. Crystallographic data collection, scaling and refinement statistics are given in Table 1.

Structure and sequence analysis

The model of the Qβ A2 protein integrated into the capsid was generated by rigid-body fitting Qβ coat protein (PDB ID: 5KIP) and maturation protein coordinates into the Qβ virion cryo-EM map (EMDB accession number EMD-8255) [13] using COOT. The protein interface areas were calculated with PISA [26]. 3D comparison of the A2 structure to other proteins in the Protein Data Bank was done using the Dali server [27]. Sequence alignments were prepared using Muscle [28] with minor manual adjustments. Secondary structure of phage SP A2 protein was predicted with Jpred [29]. All figures were generated using PyMol [30].

Accession numbers

The atomic coordinates and structure factors of the Qβ maturation protein have been deposited in the Protein Data Bank with the accession code 5MNT.

Acknowledgements

The study was supported by grant 12.094 from the Latvian Council of Sciences, grant 2010/0314/2DP/2.1.1.1.0/10/APIA/VIAA/052 from the European Regional Development Fund and grant 7869 from Biostruct-X. We are grateful to Ināra Akopjana for skillful technical assistance, the staff at MAX-lab and BESSY II for their help during our synchrotron visits, and Dr. Manfred Weiss in particular for his helpful suggestions during data collection and excellent technical support.

References

MA

[1] Weiner AM, Weber K. Natural read-through at the UGA termination signal of Q-beta coat protein cistron. Nature New Biol. 1971;234:206-9.

D

[2] Hofstetter H, Monstein H, Weissmann C. The readthrough protein A1 is essential for

TE

the formation of viable Qb particles. Biochim Biophys Acta. 1974;374:238-51.

AC CE P

[3] Dent KC, Thompson R, Barker AM, Hiscox JA, Barr JN, Stockley PG, et al. The asymmetric structure of an icosahedral virus bound to its receptor suggests a mechanism for genome release. Structure. 2013;21:1225-34. [4] Crawford EM, Gesteland RF. The adsorption of bacteriophage R17. Virology. 1964;22:165-7.

[5] Paranchych W, Ainsworth SK, Dick AJ, Krahn PM. Stages in phage R17 infection. V. Phage eclipse and the role of F pili. Virology. 1971;45:615-28. [6] Krahn PM, O´Callaghan RJ, Paranchych W. Stages in phage R17 infection. VI. Injection of A protein and RNA into the host cell. Virology. 1972;47:628-37. [7] Karnik S, Billeter M. The lysis function of RNA bacteriophage Qb is mediated by the maturation (A2) protein. EMBO J. 1983;2:1521-6.

[8] Winter RB, Gold L. Overproduction of bacteriophage Qb maturation (A2) protein leads to cell lysis. Cell. 1983;33:877-85.

[9] Bernhardt TG, Wang IN, Struck DK, Young R. A protein antibiotic in the phage Qbeta virion: diversity in lysis targets. Science. 2001;292:2326-9.

[10] Atkins JF, Steitz JA, Anderson CW, Model P. Binding of mammalian ribosomes to MS2 phage RNA reveals an overlapping gene encoding a lysis function. Cell. 1979;18:247-56.

[11] Beremand MN, Blumenthal T. Overlapping genes in RNA phage: a new protein implicated in lysis. Cell. 1979;18:257-66.

[12] Koning RI, Gomez-Blanco J, Akopjana I, Vargas J, Kazaks A, Tars K, et al. Asymmetric cryo-EM reconstruction of phage MS2 reveals genome structure in situ. Nat Commun. 2016;7:12524.

[13] Gorzelnik KV, Cui Z, Reed CA, Jakana J, Young R, Zhang J. Asymmetric cryo-EM structure of the canonical Allolevivirus Qbeta reveals a single maturation protein and the genomic ssRNA in situ. Proc Natl Acad Sci U S A. 2016;113:11519-24.

[14] Steitz JA. Isolation of the A protein from bacteriophage R17. J Mol Biol. 1968;33:937-45.

[15] Reed CA, Langlais C, Kuznetsov V, Young R. Inhibitory mechanism of the Qbeta lysis protein A(2). Molecular microbiology. 2012.

[16] Shiba T, Suzuki Y. Localization of A protein in the RNA-A protein complex of RNA phage MS2. Biochim Biophys Acta. 1981;654:249-55.

[17] Barrera I, Schuppli D, Sogo JM, Weber H. Different mechanisms of recognition of bacteriophage Q beta plus and minus strand RNAs by Q beta replicase. J Mol Biol. 1993;232:512-21.

[18] Kabsch W. Xds. Acta Crystallogr D Biol Crystallogr. 2010;66:125-32.

[19] Krug M, Weiss MS, Heinemann U, Mueller U. XDSAPP: a graphical user interface for the convenient processing of diffraction data using XDS. Journal of applied crystallography. 2012;45:568-72.

[20] Sheldrick GM. A short history of SHELX. Acta Crystallogr A. 2008;64:112-22.

[21] Cowtan K. 'dm': An automated procedure for phase improvement by density modification. Joint CCP4 and ESF-EACBM Newsletter on Protein Crystallography. 1994;31:34-8.

[22] Winn MD, Ballard CC, Cowtan KD, Dodson EJ, Emsley P, Evans PR, et al. Overview of the CCP4 suite and current developments. Acta Crystallogr D Biol Crystallogr. 2011;67:235-42.

[23] Emsley P, Cowtan K. Coot: model-building tools for molecular graphics. Acta Crystallogr D Biol Crystallogr. 2004;60:2126-32.

[24] Murshudov GN, Vagin AA, Dodson EJ. Refinement of macromolecular structures by the maximum-likelihood method. Acta Crystallogr D Biol Crystallogr. 1997;53:240-55.

[25] Chen VB, Arendall WB, 3rd, Headd JJ, Keedy DA, Immormino RM, Kapral GJ, et al. MolProbity: all-atom structure validation for macromolecular crystallography. Acta Crystallogr D Biol Crystallogr. 2010;66:12-21.

[26] Krissinel E, Henrick K. Inference of macromolecular assemblies from crystalline state. J Mol Biol. 2007;372:774-97.

[27] Holm L, Rosenstrom P. Dali server: conservation mapping in 3D. Nucleic acids research. 2010;38:W545-9.

[28] Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic acids research. 2004;32:1792-7.

[29] Drozdetskiy A, Cole C, Procter J, Barton GJ. JPred4: a protein secondary structure prediction server. Nucleic acids research. 2015;43:W389-94.

[30] The PyMOL Molecular Graphics System, Version 1.8, Schrödinger, LLC.

Tables

Table 1. Crystallographic data collection, scaling, refinement and model validation statistics. Values in parentheses are given for the highest resolution bin.

                                      Native            SeMet
Data collection and scaling
  Beamline                            MAX-lab I911-3    BESSY II 14.1
  Space group                         C222₁             C222₁
  Cell parameters (Å)                 a = 112.79        a = 113.37
                                      b = 232.52        b = 231.57
                                      c = 353.47        c = 354.95
  Wavelength (Å)                      1.000000          0.979646
  Resolution (Å)                      176.74 - 3.32     48.94 - 4.16
  Highest resolution bin (Å)          3.52 - 3.32       4.42 - 4.16
  Rmerge                              0.176 (0.913)     0.197 (1.006)
  Total number of observations        256850            461930
  Number of unique reflections        67077             66993
  I/σI                                8.74 (1.42)       9.41 (1.88)
  CC(1/2) (%)                         98.3 (51.8)       99.5 (65.1)
  Completeness (%)                    97.1 (95.1)       99.1 (96.6)
  Multiplicity                        3.8 (3.8)         6.9 (6.2)
Refinement
  Rwork                               0.228
  Rfree                               0.279
  Average B factor (Å²)               96.203
  Number of atoms                     13771
  rms deviations from ideal
    bond lengths (Å)                  0.009
    bond angles (°)                   1.368
  Ramachandran plot
    residues in favored regions (%)   95.8
    residues in allowed regions (%)   98.8
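As a quick consistency check on Table 1, the reported multiplicity can be compared with the ratio of total observations to unique reflections. The short Python sketch below does this arithmetic with the values copied from the table; the dictionary layout and the function name `multiplicity` are illustrative choices and are not part of the original data processing.

```python
# Consistency check on Table 1: multiplicity should be close to
# (total observations) / (unique reflections).
# Numbers are copied from Table 1; all names here are illustrative only.

table1 = {
    "Native": {"observations": 256850, "unique": 67077, "reported": 3.8},
    "SeMet": {"observations": 461930, "unique": 66993, "reported": 6.9},
}


def multiplicity(observations: int, unique: int) -> float:
    """Average number of measurements per unique reflection."""
    return observations / unique


for name, row in table1.items():
    value = multiplicity(row["observations"], row["unique"])
    print(f"{name}: computed {value:.2f}, reported {row['reported']}")
```

For both data sets the computed ratio rounds to the overall multiplicity listed in the table.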

Figure legends

Figure 1. The overall structure of the Qβ A2 protein. The protein is shown in two orientations rotated by 90° and rainbow-colored blue to red from the N- to the C-terminus.

Figure 2. Interactions between the Qβ A2 protein and coat protein dimers in the capsid. The A2 protein is shown in green and the surrounding coat protein dimers in alternating orange and brown colors. A side view (left) and a top view (right) of the protein complex are shown. The values next to coat protein dimers indicate their interface areas with the A2 protein.

Figure 3. Genomic RNA binding of the Qβ A2 protein. The surface of the A2 protein is colored according to its electrostatic potential. Two positively-charged regions of the protein are involved in interactions with genomic RNA hairpins, represented as red surfaces. The surrounding coat protein molecules are shown in gray as a ribbon model. The genomic RNA hairpins that make the most pronounced contacts with the A2 protein are represented as red surfaces cut from the cryo-EM volume (EMDB accession number EMD-8255) [13] contoured at 2.5σ.

Figure 4. Conserved regions of ssRNA phage maturation proteins. On top, sequence alignment of maturation proteins from representative F-pili specific phages is presented with completely conserved residues colored orange and similar residues in yellow. The secondary structure elements of the Qβ A2 protein are shown above the sequence. Below, the conserved residues are represented in the context of the three-dimensional structure of the Qβ A2 protein. The conserved and similar residues are colored as in the sequence alignment. Sequence alignments of allolevivirus and representative distinct Leviviridae phage maturation proteins are given in Fig. S4 and Fig. S5, respectively.

Figure 1

Figure 2

Figure 3

Figure 4

Graphical abstract

Highlights

• Crystal structure of the maturation protein from bacteriophage Qβ solved at 3.3 Å resolution
• The maturation protein consists of a conserved helical and a variable beta sheet region
• The obtained structure fitted into a recently published low resolution asymmetric cryo-EM map of bacteriophage Qβ
• Regions of the maturation protein involved in coat protein and RNA binding identified