The Crystal Structure of Mouse Nup35 Reveals Atypical RNP Motifs and Novel Homodimerization of the RRM Domain

J. Mol. Biol. (2006) 363, 114–124 doi:10.1016/j.jmb.2006.07.089


Noriko Handa 1, Mutsuko Kukimoto-Niino 1, Ryogo Akasaka 1, Seiichiro Kishishita 1, Kazutaka Murayama 1,2, Takaho Terada 1, Makoto Inoue 1, Takanori Kigawa 1,3, Shingo Kose 4, Naoko Imamoto 4, Akiko Tanaka 1, Yoshihide Hayashizaki 1, Mikako Shirouzu 1 and Shigeyuki Yokoyama 1,5 ⁎

1 RIKEN Genomic Sciences Center, Tsurumi, Yokohama 230-0045, Japan

2 Tohoku University Biomedical Engineering Research Organization, 2-1 Seiryo-machi, Aoba, Sendai 980-8575, Japan

3 Department of Computational Intelligence and Systems Science, Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology, 4259 Nagatsuta-cho, Midori-ku, Yokohama, 226-8502, Japan

4 Cellular Dynamics Laboratory, RIKEN Discovery Research Institute, Wako-shi, Saitama, 351-0198, Japan

5 Department of Biophysics and Biochemistry, Graduate School of Science, The University of Tokyo, Bunkyo-ku, Tokyo 113-0033, Japan

⁎ Corresponding author

The nuclear pore complex mediates the transport of macromolecules across the nuclear envelope (NE). The vertebrate nuclear pore protein Nup35, the ortholog of Saccharomyces cerevisiae Nup53p, is suggested to interact with the NE membrane and to be required for nuclear morphology. The highly conserved region between vertebrate Nup35 and yeast Nup53p is predicted to contain an RNA-recognition motif (RRM) domain. Due to its low level of sequence homology with other RRM domains, the RNP1 and RNP2 motifs have not been identified in its primary structure. In the present study, we solved the crystal structure of the RRM domain of mouse Nup35 at 2.7 Å resolution. The Nup35 RRM domain monomer adopts the characteristic βαββαβ topology, as in other reported RRM domains. The structure allowed us to locate the atypical RNP1 and RNP2 motifs. Among the RNP motif residues, those on the β-sheet surface are different from those of the canonical RRM domains, while those buried in the hydrophobic core are highly conserved. The RRM domain forms a homodimer in the crystal, in accordance with analytical ultracentrifugation experiments. The β-sheet surface of the RRM domain, with its atypical RNP motifs, contributes to homodimerization mainly by hydrophobic interactions: the side-chain of Met236 in the β4 strand of one Nup35 molecule is sandwiched by the aromatic side-chains of Phe178 in the β1 strand and Trp209 in the β3 strand of the other Nup35 molecule in the dimer. This structure reveals a new homodimerization mode of the RRM domain. © 2006 Elsevier Ltd. All rights reserved.

Keywords: nuclear pore protein; NPC; Nup35; RRM domain; homodimer

Abbreviations used: RRM, RNA-recognition motif; RNP, ribonucleoprotein; Nup, nucleoporin; NPC, nuclear pore complex; NE, nuclear envelope; kaps, karyopherins; FG, Phe-Gly; RBD, RNA-binding domain; PIE, polyadenylation inhibition element; PABP, polyadenylate-binding protein; CBP, CREB-binding protein; UPF, up-frameshift; U2AF, U2 auxiliary factor; UHM, U2AF homology motif; SF1, splicing factor 1; SF3b, splicing factor 3b; MAD, multiple-wavelength anomalous dispersion; r.m.s.d., root-mean-square deviation; SeMet, selenomethionine; TEV, tobacco etch virus. E-mail address of the corresponding author: [email protected] 0022-2836/$ - see front matter © 2006 Elsevier Ltd. All rights reserved.

Structure of the Nup35 RRM Domain

Introduction

In eukaryotic cells, the nuclear synthesis of DNA and RNA is separated from the cytoplasmic protein synthesis by the double membrane of the nuclear envelope (NE).1 The transport of molecules across the NE is controlled by the nuclear pore complexes (NPCs), which are large proteinaceous structures spanning the NE.2,3 The NPC is composed of a relatively small number of proteins (∼30), termed nucleoporins or Nups.4,5 The calculated mass of the NPC is 44 × 10⁶ Da in yeast and 60 × 10⁶ Da in vertebrates,4,5 although the measured mass is larger. The cargo molecules contain short sequence elements, nuclear localization sequences (NLSs) or nuclear export sequences (NESs), which are recognized by shuttling transport factors called karyopherins (kaps; also referred to as importins, exportins or transportins).6 The transfer of cargo–kap complexes through the NPCs requires interactions between kaps and FG repeats in FG-nups.7

NPCs are assembled throughout the cell cycle, and thus the protein–protein interactions of nups and associated proteins are potentially regulated. In yeast, the interactions between Nup53p and other nups are cell cycle-dependent.8 During interphase, Nup53p interacts with Nup170p. In mitosis, Nup53p is released from Nup170p and interacts with Nic96p. This event leads to the exposure of the high-affinity Kap121p-binding domain of Nup53p. Nup53p thus binds Kap121p and functions as a nuclear import inhibitor.8,9 Nup53p also interacts with Mad1p, and may regulate the duration of the spindle assembly checkpoint machinery.10

The vertebrate nuclear pore protein Nup35, the ortholog of yeast Nup53p, interacts with Nup93 (the ortholog of Nic96p) and Nup155 (the ortholog of yeast Nup170p and Nup157p).11 Depletion of Nup35 by RNA interference (RNAi) inhibits the assembly into the NPC of these interacting nups and the spindle checkpoint protein Mad1, and leads to aberrant nuclear morphology.11 Unlike yeast Nup53p, vertebrate Nup35 lacks the Kap121p-binding domain, so it is unlikely that vertebrate Nup35 regulates the cell cycle-dependent import mediated by the vertebrate counterpart of Kap121p (Figure 1). Both vertebrate Nup35 and yeast Nup53p are predicted to contain an RNA-recognition motif (RRM) domain that lacks clear consensus motifs, a C-terminal amphipathic helix and several FG repeats (Figure 1).12,13 A recent in vitro study showed that Nup35 interacts with the transmembrane nucleoporin NDC1 through its conserved C-terminal segment, which contains a potential amphipathic α-helix.14

The RRM domain, also called the ribonucleoprotein (RNP) domain or the RNA-binding domain (RBD), is a ubiquitous protein domain in eukaryotes and is thought to mediate RNA recognition in many proteins involved in post-transcriptional processes.15 Several structures of RRM domains in complex with RNA or DNA have been determined. These structures revealed that RNA/DNA recognition by RRM domains is mediated through their β-sheet surface, with the RNP1 and RNP2 motifs in the two central β-strands.15–18 On the other hand, the protein recognition by RRM domains is diverse. The homodimeric structure of an RRM domain, solved by NMR, was determined with the N-terminal RRM domain of the U1A protein in complex with the polyadenylation inhibition element (PIE) RNA.19 When bound to RNA, the U1A RRM domain forms the homodimer mainly by hydrophobic interactions between the two helices located at the C-terminal RRM domain.19 Likewise, the tandem RRM domains in hnRNP A1,20,21 Sex-lethal,22 HuD,23 and PABP24 interact with the α-helix (helices) and/or β-strands. In the cases of U2B″25 and CBP20,26 the recognition of RNA by the RRM domain requires the cofactors, U2A′ and CBP80, respectively. The heterologous protein–protein interactions are mediated through the α-helices and loops of the RRM domains.25,26 In all of these cases, the β-sheet surface of the RRM domain is used, with the RNP1 and RNP2 motifs, as an RNA/DNA-binding platform.

In contrast, the RRM domains of some proteins function as protein recognition domains, but not RNA recognition domains. The structures of the Y14–Mago27–29 and UPF2–UPF3b complexes30 revealed that the interaction is mediated through the β-sheet surfaces, thus preventing RNA binding.
Y14 has typical RNP motifs, whereas UPF3b lacks the clear consensus sequences of the RNP2 motif. The RRM domain of U2AF35 and the C-terminal RRM domain of U2AF65 represent an atypical RRM domain, named the U2AF homology motif (UHM), containing atypical RNP motifs, in which the aromatic residues that would normally bind RNA are absent.31–33 The U2AF35–U2AF65 and the U2AF65–SF1 complexes revealed that the protein interaction is mediated through the α-helices.31,32 Furthermore, the p14–SF3b155 peptide complex revealed that the β-sheet surface of the p14 RRM domain, which contains typical RNP motifs, is occluded largely by a C-terminal α-helix and the SF3b155 peptide, and that the adenosine-binding portion of the RNP2 motif is exposed within a pocket on the occluded surface.34

In the present study, we have solved the crystal structure of the predicted RRM domain of mouse Nup35 at 2.7 Å resolution. The monomer adopts an RRM fold, with the characteristic βαββαβ topology of the secondary structure elements. We have identified its atypical RNP motifs, which lack the conserved residues that typically bind RNA in canonical RRM domains. The structure revealed a novel mode of RRM domain homodimerization. The dimer interface involves the β-sheet surface, which is generally used to bind RNA in typical RRM domains.

Figure 1. A representation of mouse Nup35 and yeast Nup53p. The predicted RRM domains,12 the potential amphipathic α-helices,13 the Kap121p-binding domain,9 and the FG sequences are green, pink, light blue, and yellow, respectively.

Results

Crystallization and structure determination

The mouse Nup35 cDNA clone is from the FANTOM RIKEN full-length cDNA clone collection (FANTOM clone ID 5330402E05).35 Crystallization trials were carried out for selenomethionine (SeMet)-labeled samples of various fragments containing the predicted RRM domain (residues 173–252). The best crystals were obtained with the fragment consisting of residues 156–261 (Figure 1). The crystals belong to the space group P2₁2₁2₁, with unit cell constants of a = 59.5 Å, b = 104.2 Å, and c = 110.0 Å. The structure was determined by multiple-wavelength anomalous dispersion (MAD) at 2.7 Å resolution (see Materials and Methods, and Table 1). In the electron density map, the N and C-terminal artificial linkers, and 14–16 and 12 residues at the N and C termini, respectively, were disordered. The asymmetric unit contains four molecules. The structures of these molecules are essentially identical, with a root-mean-square deviation (r.m.s.d.) of all Cα positions between 0.27 Å and 0.49 Å.

Overall structure of the predicted RRM domain monomer of Nup35

The predicted RRM domain monomer of mouse Nup35 adopts an RRM fold, the characteristic βαββαβ topology of the secondary structure elements, with a four-stranded antiparallel β-sheet packed against two α-helices (Figure 2).36 Both the N and the C-terminal ends of the Nup35 RRM form an additional, short helical structure. The α2–β4 loop also contains a short β-strand, β3′. The β2–β3 loop is relatively short, consisting of only five residues. In the canonical RRMs, especially those that interact with an RNA stem–loop structure, the β2–β3 loop plays an important role in RNA binding.37
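As a rough consistency check on the crystal parameters reported above, the packing density can be estimated from the unit cell, the space group, and the four molecules per asymmetric unit. The sketch below is not part of the original analysis. It assumes a monomer mass equal to half the theoretical dimer mass quoted in the ultracentrifugation section (a value for the native protein, not the SeMet-labeled one) and uses the common approximation that the protein fraction is 1.23/Vm. All function and variable names are illustrative.

```python
# Minimal sketch: Matthews coefficient (Vm) and approximate solvent content
# for an orthorhombic crystal. Inputs marked "assumption" are not stated in
# this form in the text.

def matthews_coefficient(a, b, c, n_symops, n_per_asu, monomer_mass_da):
    """Return Vm in cubic angstroms per dalton for an orthorhombic cell."""
    cell_volume = a * b * c  # all cell angles are 90 degrees in P2(1)2(1)2(1)
    return cell_volume / (n_symops * n_per_asu * monomer_mass_da)


def solvent_fraction(vm):
    """Approximate solvent fraction from Vm (standard 1.23/Vm estimate)."""
    return 1.0 - 1.23 / vm


A, B, C = 59.5, 104.2, 110.0      # unit cell constants in angstroms (from the text)
N_SYMOPS = 4                      # general positions in P2(1)2(1)2(1)
N_PER_ASU = 4                     # molecules per asymmetric unit (from the text)
MONOMER_MASS_DA = 25784 / 2       # assumption: half the theoretical dimer mass

vm = matthews_coefficient(A, B, C, N_SYMOPS, N_PER_ASU, MONOMER_MASS_DA)
solvent = solvent_fraction(vm)
print(f"Vm = {vm:.2f} A^3/Da, solvent content = {solvent:.0%}")
```

Under these assumptions the estimate comes out near 3.3 Å³/Da, which would correspond to a fairly high solvent content; the exact figure depends on the construct mass actually present in the crystal.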

Table 1. Data collection, phasing, and refinement statistics

Data set                               Remote         Peak           Edge

A. Data collection and processing
Wavelength (Å)                         0.9640         0.9791         0.9794
Resolution range (Å)                   20–2.7         20–2.7         20–2.7
No. unique reflections                 19,082         18,827         18,919
No. measured reflections               102,977        100,173        100,262
Multiplicity                           5.4            5.3            5.3
Completeness^a (%)                     97.8 (86.8)    98.1 (88.4)    98.2 (87.8)
Rsym^a,b (%)                           8.1 (30.7)     8.7 (29.1)     7.1 (29.7)
I/σ(I)^a                               14.5 (3.2)     13.3 (3.4)     15.3 (3.4)

B. Phasing statistics
Resolution range (Å)                   20–2.7
Se sites/monomer                       4
FOM(MAD)^c                             0.39

C. Model refinement
Resolution range (Å)                   20–2.7
No. reflections                        34,813
No. protein atoms                      2550
No. water molecules                    32
Rwork^d (%)                            21.1
Rfree^d (%)                            23.4

D. Stereochemistry
r.m.s.d. from ideal
  Bond lengths (Å)                     0.014
  Bond angles (deg.)                   1.50
Residues in the Ramachandran plot
  Most favored region (%)              90.7
  Additionally allowed regions (%)     8.2
  Generously allowed regions (%)       1.1
  Disallowed regions (%)               0

a Statistics for the highest resolution shell are given in parentheses.
b $R_{\mathrm{sym}} = \sum_h \sum_i |I_{hi} - \langle I_h \rangle| / \sum_h \sum_i |I_{hi}|$, where h indicates unique reflection indices, and i indicates symmetry-equivalent indices.
c Figure of merit after SOLVE phasing.
d $R_{\mathrm{work}} = \sum |F_{\mathrm{obs}} - F_{\mathrm{calc}}| / \sum F_{\mathrm{obs}}$ for all reflections and Rfree was calculated using randomly selected reflections (5%).
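To make the Rsym definition in footnote b of Table 1 concrete, the following sketch evaluates it on a handful of made-up intensities. It is an illustration of the formula only, not the procedure used by the data-processing programs cited in the Methods. It assumes that each observation has already been mapped to its unique reflection index h, so that symmetry-equivalent measurements share the same key; the function name and the toy data are hypothetical.

```python
# Minimal sketch of Rsym = sum_h sum_i |I_hi - <I_h>| / sum_h sum_i |I_hi|.
from collections import defaultdict


def r_sym(observations):
    """Compute Rsym from (unique_hkl, intensity) pairs.

    Observations sharing the same unique_hkl key are treated as
    symmetry-equivalent measurements of one reflection.
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    numerator = 0.0
    denominator = 0.0
    for intensities in groups.values():
        mean_intensity = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += sum(abs(i) for i in intensities)
    return numerator / denominator


# Toy data (invented for illustration): two reflections, two measurements each.
toy_observations = [
    ((1, 0, 0), 100.0),
    ((1, 0, 0), 110.0),
    ((0, 1, 0), 50.0),
    ((0, 1, 0), 50.0),
]
print(f"Rsym = {r_sym(toy_observations):.4f}")
```

Only the first reflection contributes to the numerator here, because the two measurements of the second reflection agree exactly.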

A search of the Protein Data Bank with the program DALI38 revealed that the RRM domain monomer of Nup35 shares the highest level of structure similarity with the N-terminal RRM domain of Sex-lethal,22 the RRM domain of CBP20,26 the N-terminal RRM domain of PABP,24 the N-terminal RRM domain of hnRNP A1,20,21 the RRM domain of p14,34 and the N-terminal RRM domain of U1A,37 with Z-scores ranging between 10.8 and 9.7 and with r.m.s.d. ranging between 1.8 Å and 2.4 Å. Superpositions of the Cα traces of the RRM domain of Nup35 with the N-terminal RRM domains of Sex-lethal and U1A are shown in Figure 3.

Figure 2. A ribbon diagram of the Nup35 RRM domain monomer structure (stereo view). The β strands are cyan, the α helices are red, the 3₁₀ helices are green, and the random coils are gray. This Figure was drawn using MOLSCRIPT52 and Raster3D.53

Figure 3. Superposition of the backbone traces of the Nup35 RRM domain with the RRM domains of Sex-lethal and U1A. (a) Stereo diagram showing a superposition of the RRM domain of Nup35 and the N-terminal RRM domain of Sex-lethal in magenta and green, respectively. (b) A stereo diagram showing the superposition of the RRM domain of Nup35 and the N-terminal RRM domain of U1A in magenta and green, respectively. This Figure was drawn using PyMol [http://www.pymol.org].

Atypical RNP1 and RNP2 motifs in the Nup35 RRM domain

The sequences of the Nup35 RRM domain are highly conserved in all eukaryotes (Figure 4).39 The structure-based sequence alignments of mouse Nup35 with the RRM-containing proteins Sex-lethal, PABP, p14, CBP20, U1A, and hnRNP A1 are shown in the lower portion of Figure 4. The alignments revealed several sequence features of the RRM domain of Nup35. The most characteristic features are its atypical RNP motifs. The RNP1 and RNP2 consensus sequences are defined as (R/K)-G-(F/Y)-(G/A)-(F/Y)-(L/I/V)-X-(F/Y) and (V/L/I)-(Y/F)-(L/V/I)-X-X-L (where X is any amino acid), respectively, and are juxtaposed on the two central β-strands, β3 and β1, respectively.15–18 In the RNP motifs, four residues on the β-sheet surface contribute to RNA binding in typical RRM domains; namely, positions 1, 3, and 5 of RNP1 and position 2 of RNP2 (Figure 4).17,18 Six residues in the RNP motifs form the hydrophobic core, namely, positions 4, 6, and 8 of RNP1 and positions 1, 3, and 6 of RNP2.15,17

In contrast, in the mouse Nup35 RRM domain, the RNP1 and RNP2 sequences are G-N-W-M-H-I-R-Y and V-T-V-F-G-F, respectively (Figure 4). They deviate significantly from the consensus sequences: the residues on the β-sheet surface, including the four RNA-binding residues mentioned above, are different from those of the canonical RRM domains. The sequence study of 161 RRM domains by Inoue40 revealed that position 4 of RNP2 (Phe178 in Nup35) contains an aromatic residue in only 3% of the RRM domains, and that position 2 of RNP1 (Asn208 in Nup35) is occupied by Gly or Pro in 88% of them. In the mouse Nup35 RRM domain structure, these residues (Phe178 and Asn208), in addition to Trp209 and Met236, reside at the homodimer interface and contribute to the homodimer interactions (see below). On the other hand, the RNP motif residues buried in the hydrophobic core are conserved. In addition, the structurally important residues are also conserved outside the RNP motifs in Nup35 and other RRM-containing proteins (Figure 4).

Figure 4. The sequence alignments of Nup35 homologs. The upper lines are the sequence alignments of mouse Nup35 with the homologs from Xenopus laevis, Drosophila melanogaster, Caenorhabditis elegans, and Saccharomyces cerevisiae. The lower lines are the structure-based sequence alignments of mouse Nup35 with the RRM-containing proteins, Sex-lethal, PABP, p14, CBP20, U1A, and hnRNP A1, which share high levels of structural similarity (PDB accession numbers 1B7F, 1CVJ, 2F9D, 1H6K, 1URN, and 1HA1, respectively). The secondary structures of the mouse Nup35 RRM domain and the Sex-lethal N-terminal RRM domain are shown. Strictly conserved and similar residues are represented within red and yellow boxes, respectively. Residues involved in homodimerization of the Nup35 RRM domain with their main chains and side-chains are indicated by green and red triangles, respectively. Asterisks indicate conserved hydrophobic residues that contribute to the hydrophobic core.54 The RNP2 and RNP1 motifs, which correspond to the first and third β-strands, respectively, are indicated in boxes. Consensus sequences including the RNP motifs are shown below the alignments. The four residues contributing to RNA-binding are red.

Homodimerization of the Nup35 RRM domain in the crystal and in solution

In the crystallographic asymmetric unit, the RRM domain is packed as two dimers, which have essentially the same structure (r.m.s.d. 0.42 Å). The homodimer structure is shown in Figure 5. The interactions between the dimers bury a total surface area of 1181–1184 Å², representing 13.6–14.0% of the dimer surface area. In order to determine the predominant species in solution, the molecular mass of the Nup35 RRM domain was measured by analytical ultracentrifugation. Equilibrium experiments were carried out, using UV absorption. Sedimentation equilibrium yielded an estimated molecular mass of 26,131 Da, which is close to the theoretical dimeric value of 25,784 Da. The fit of the data to a single ideal species model is shown in Figure 6.
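The comparison between the Nup35 RNP motifs and the consensus sequences, described in the section on atypical RNP motifs, can be sketched as a position-by-position membership check. The snippet below is an illustrative aid rather than the authors' method: the consensus sets and the Nup35 motif sequences are taken from the text, while the representation (a list of allowed-residue strings, with None for X) and all names are assumptions made for this example.

```python
# Minimal sketch: compare the Nup35 RNP motifs with the RNP consensus
# sequences, position by position. None stands for X (any amino acid).

RNP1_CONSENSUS = ["RK", "G", "FY", "GA", "FY", "LIV", None, "FY"]
RNP2_CONSENSUS = ["VLI", "YF", "LVI", None, None, "L"]

NUP35_RNP1 = "GNWMHIRY"  # G-N-W-M-H-I-R-Y (from the text)
NUP35_RNP2 = "VTVFGF"    # V-T-V-F-G-F (from the text)

# 1-based motif positions on the beta-sheet surface that bind RNA in
# typical RRM domains, as listed in the text.
RNP1_SURFACE = (1, 3, 5)
RNP2_SURFACE = (2,)


def consensus_matches(consensus, motif):
    """Return one boolean per position: does the residue fit the consensus?"""
    return [
        allowed is None or residue in allowed
        for allowed, residue in zip(consensus, motif)
    ]


rnp1_matches = consensus_matches(RNP1_CONSENSUS, NUP35_RNP1)
rnp2_matches = consensus_matches(RNP2_CONSENSUS, NUP35_RNP2)

surface_matches = [rnp1_matches[p - 1] for p in RNP1_SURFACE]
surface_matches += [rnp2_matches[p - 1] for p in RNP2_SURFACE]

print("RNP1 per-position match:", rnp1_matches)
print("RNP2 per-position match:", rnp2_matches)
print("Any surface RNA-binding position matches consensus:", any(surface_matches))
```

None of the four surface positions fits the consensus under this check, in line with the description above. Strict set membership is a cruder criterion than the structural conservation discussed in the text, so a core position can be flagged as non-consensus even when the residue occupying it is still hydrophobic.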

Figure 6. A plot of the sedimentation equilibrium data with the residuals from the best fit to a single ideal species. This plot shows the data using 0.38 mg/ml of protein and a centrifugation speed of 18,000 rpm. The estimated partial specific volume of the protein is 0.725 ml/g, and the solvent density was calculated to be 1.0049 g/ml.
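For a single ideal species at sedimentation equilibrium, the standard textbook relation is c(r) = c(r0) exp[σ (r² − r0²)], with σ = M(1 − v̄ρ)ω²/(2RT). The sketch below evaluates σ from the values quoted in the text and in the caption of Figure 6, taking the temperature of 20 °C stated in the Methods. It is a worked illustration under these stated assumptions, not the fitting procedure of the manufacturer's software, and the function and variable names are hypothetical.

```python
# Minimal sketch: reduced buoyant molar mass (sigma) for a single ideal
# species at sedimentation equilibrium, using values quoted in the text.
import math

GAS_CONSTANT = 8.314462618  # J mol^-1 K^-1


def sigma_per_cm2(mass_da, vbar_ml_per_g, rho_g_per_ml, rpm, temp_k):
    """Return sigma = M (1 - vbar*rho) omega^2 / (2 R T) in cm^-2."""
    omega = rpm * 2.0 * math.pi / 60.0          # angular velocity, rad/s
    mass_kg_per_mol = mass_da / 1000.0          # Da (g/mol) to kg/mol
    buoyancy = 1.0 - vbar_ml_per_g * rho_g_per_ml
    sigma_per_m2 = (
        mass_kg_per_mol * buoyancy * omega ** 2 / (2.0 * GAS_CONSTANT * temp_k)
    )
    return sigma_per_m2 / 1.0e4                 # m^-2 to cm^-2


MEASURED_MASS_DA = 26131.0       # from sedimentation equilibrium (text)
THEORETICAL_DIMER_DA = 25784.0   # theoretical dimer mass (text)
VBAR = 0.725                     # partial specific volume, ml/g (Figure 6)
RHO = 1.0049                     # solvent density, g/ml (Figure 6)
RPM = 18000.0                    # rotor speed (Figure 6)
TEMP_K = 293.15                  # 20 degrees C (Materials and Methods)

sigma_measured = sigma_per_cm2(MEASURED_MASS_DA, VBAR, RHO, RPM, TEMP_K)
mass_ratio = MEASURED_MASS_DA / THEORETICAL_DIMER_DA
print(f"sigma = {sigma_measured:.3f} cm^-2")
print(f"measured / theoretical dimer mass = {mass_ratio:.3f}")
```

Because σ is proportional to M, a monomer would give half this exponent at the same speed, which is what allows the equilibrium profile to distinguish monomer from dimer.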

Dimer interface of the Nup35 RRM domain

The dimer interface of the Nup35 RRM domain involves the β-sheet surface (Figure 5). The dimer interaction is mainly hydrophobic. The aromatic side-chains of Phe178, at the end of the β1 strand, and Trp209, at the beginning of the β3 strand, of one molecule sandwich the side-chain of Met236, at the beginning of the β4 strand of the other molecule, in the dimer (Figure 5). Phe178 and Trp209 are in the atypical RNP2 and RNP1 motifs, respectively (Figure 4). The side-chain of Phe178 also contacts the main chains of Ile237 and Gly238 in the other molecule, while the side-chain of Trp209 contacts the side-chain of Ile230 and the main chain of Ser234 in the other molecule. In addition to the hydrophobic contacts, the side-chain of Asn208 forms hydrogen bonds with the main chains of Gly179 and Met236 in the other molecule (Figure 7). All of these residues are highly conserved between the RRM domains of eukaryotic Nup35 and those of the yeast homologs (Figure 4).

Figure 5. A ribbon diagram of the Nup35 RRM domain homodimer (stereo view). The 2-fold crystallographic symmetry axis is perpendicular to the plane of the paper. Monomers A and B are yellow and green, respectively. The diagram of monomer A represents a −90° rotation of Figure 2 versus the y axis. This Figure was drawn using PyMol [http://www.pymol.org].

Figure 7. A detailed view of the dimer interfaces. (a) Interacting residues are displayed as stick models. Hydrogen bonds are indicated by red dots. The molecules are color-coded as in Figure 5. This Figure was drawn using PyMol [http://www.pymol.org]. (b) A diagram of the interactions. The dimer interactions are represented by broken lines. The green and red circles represent the main chains and side-chains, respectively, of the residues involved in homodimerization. The type of interaction is indicated as black for hydrophobic or van der Waals interactions, and red for a hydrogen bond.

Discussion

In the present study, we determined the crystal structure of the predicted RRM domain of the mouse Nup35 protein. The structure adopts the characteristic βαββαβ topology, and is very similar to other reported RRM domain structures. The RRM domain of Nup35 forms a homodimer in the crystal and in solution. The β-sheet surface, with its atypical RNP motifs, makes hydrophobic interactions for homodimerization. This homodimerization mode is different from the protein–protein interaction mode of the typical RRM domains.

Nup53p and Nup59p, which are the yeast homologs of Nup35, were thought to be distantly related members of the FG repeat-containing family of nucleoporins. Nup53p, Nup59p, and mouse Nup35 contain four, six, and three FG sequences, respectively. In the RRM domain, one FG sequence (Phe178-Gly179, in mouse Nup35) is highly conserved in all of the Nup35 homologs, suggesting that this FG sequence plays an important functional role (Figure 4).39 The crystal structure of the mouse Nup35 RRM domain revealed that this highly conserved FG sequence (Phe178-Gly179) is involved in the homodimer interactions. The canonical FG repeats in FG-Nups consist of multiple clustered FG dipeptides separated by hydrophilic spacer sequences, and provide interactions with karyopherins.41 The FG repeats are reportedly highly flexible and lack an ordered secondary structure.42 In mouse Nup35, all three FG sequences are in ordered secondary structure elements; two of the three FG sequences are in the RRM domain, and one is at the C-terminal segment containing a potential amphipathic α-helix, involved in the interaction with NDC1.14 Indeed, we could not detect significant interactions between the mouse Nup35 RRM domain and importin-β by a pull-down assay (data not shown). These results suggest that the FG sequences in mouse Nup35 contribute to functions different from those of the FG repeats in FG-Nups.

Nup35 is associated tightly with the NE membrane and the nuclear lamina, and interacts physically with Nup155 and Nup93.11 Nup93 contains the α-solenoid fold, and Nup155 contains the α-solenoid and the β-propeller folds.12 They function as the structural scaffold proteins of the NPC.12 A recent study showed that the transmembrane nucleoporin NDC1 interacts with the Nup35 C-terminal segment, containing a potential amphipathic α-helix.14 It was suggested that the interaction between Nup35 and NDC1 triggers the assembly of the Nup93–Nup35 subcomplex into the NPC.14 The color-coded sequence conservation score of the homodimer surface of the Nup35 RRM domain (Figure 8) indicates that the highly conserved portions are the β-sheet surface, the surface formed by the β2 strand and the α1 helix, and the surface formed by the β4 strand and the α2 helix. These regions are likely to contribute to protein–protein interactions. In contrast, the residues on the surfaces of the N and C-terminal short helical structures are poorly conserved. As the Nup35 RRM domain dimerizes, Nup35 is likely to contribute to the formation of the octameric architecture of the NPC. Further studies based on this structure will provide more clues to reveal the biological and molecular functions of the RRM domain in Nups.

Figure 8. Surface representation and ribbon diagram of the Nup35 RRM domain homodimer. The residues are colored according to sequence conservation, ranging from white (variable residues) to orange (conserved residues). This figure was drawn using ESPript,55 Consurf,56 and PyMol [http://www.pymol.org]. (a) The Figure is in the same orientation as in Figure 5. (b) A different view of the Nup35 RRM domain homodimer. The Figure represents a 90° rotation of (a) versus the y axis. (c) The Figure represents a 90° rotation of (b) versus the y axis.

Materials and Methods

Protein preparation and crystallization

The RRM domain (156–261) of mouse Nup35 (Figure 1) was produced as a 152-residue recombinant protein with an N-terminal histidine-affinity tag and a tobacco etch virus (TEV) protease cleavage site. The SeMet-substituted protein was synthesized by the Escherichia coli cell-free system.43–46 The reaction solution was centrifuged at 16,000g at 4 °C for 20 min. The cell lysate was loaded onto a HisTrap (GE Healthcare Bio-Sciences) column (5 ml), previously equilibrated with 20 mM Tris–HCl buffer (pH 8.0) containing 1 M NaCl and 15 mM imidazole, and was eluted with 20 mM Tris–HCl buffer (pH 8.0) containing 500 mM NaCl and 500 mM imidazole. The sample buffer was exchanged to 20 mM Tris–HCl buffer (pH 8.0), containing 1 M NaCl and 15 mM imidazole with a HiPrep 26/10 desalting column. The histidine-affinity tag was cleaved by 100 μl of TEV protease (4 mg/ml) at 4 °C overnight. The reaction solution was loaded onto a HisTrap column (5 ml), and was eluted as described above. The protein sample was desalted on a HiPrep 26/10 desalting column, and was eluted with 20 mM Mes buffer (pH 5.5) containing 5 mM 2-mercaptoethanol. Next, the protein sample was loaded onto a HiTrap SP (GE Healthcare Bio-Sciences) column (5 ml), previously equilibrated with 20 mM Mes buffer (pH 5.5) containing 5 mM 2-mercaptoethanol, and was eluted with a linear gradient of 0–1.0 M NaCl in 20 mM Mes buffer (pH 5.5) with 5 mM 2-mercaptoethanol. Finally, the protein sample was loaded onto a HiLoad 16/60 Superdex 75 (GE Healthcare Bio-Sciences) column, previously equilibrated with 20 mM Tris–HCl buffer (pH 8.0) containing 150 mM NaCl and 5 mM 2-mercaptoethanol, and was eluted with this buffer. The crystals of the SeMet-substituted protein were grown at 20 °C by the sitting-drop, vapor-diffusion method (protein at 7.5 mg/ml), against a reservoir solution of 100 mM Mes buffer (pH 6.5), 12% (w/v) PEG 20,000.

Data collection and processing

For data collection, we transferred the crystals to 90% Paratone-N light hydrocarbon oil and 10% glycerol as a cryoprotectant. Data for the MAD method were collected at three different wavelengths at BL26B1 of SPring-8, Harima, Japan (Table 1). All data were processed using the HKL2000 and SCALEPACK programs.47 The positions of the Se atoms and the initial MAD phases were determined using the program SOLVE,48 and the MAD phases were improved with RESOLVE.49 The resulting electron density map was clear.

Model building and structural refinement

The data collected at the high remote wavelength (0.9640 Å) were used to refine the model. The automated model building was performed with RESOLVE.49 The remaining residues were built with the program TurboFrodo†, and multiple cycles of model building and refinement were performed. The model was refined using CNS 1.1.50 The final model has good geometry, as examined by PROCHECK:51 90.7% of the residues have ϕ/ψ angles in the most favored region of the Ramachandran plot and 100% are in the allowed regions. The data collection and refinement statistics are given in Table 1.

Analytical ultracentrifugation

The native protein prepared for the analytical ultracentrifuge experiments was synthesized and purified in the same way as the SeMet-substituted protein. All analytical ultracentrifuge experiments were carried out using a Beckman Optima XL-I analytical ultracentrifuge with an An-50 Ti rotor. The sample buffer was 20 mM Tris–HCl (pH 8.0), 150 mM NaCl and 5 mM β-mercaptoethanol, and all experiments were performed at 20 °C. Sedimentation equilibrium experiments were carried out with six-channel centerpieces with loading concentrations of 0.38 mg/ml and 0.19 mg/ml. Data were obtained at 18,000 rpm, 20,000 rpm and 23,000 rpm. A total equilibration time of 16 h was used for each speed, with scans taken at 12 h and 14 h to ensure that equilibrium had been reached. The absorbance wavelength was 280 nm, and the optical baseline was determined by overspeeding at 40,000 rpm at the end of data collection. The equilibrium data were fitted using the manufacturer's software (Figure 6).

Protein Data Bank accession code

The atomic coordinates have been deposited in the Protein Data Bank, with the accession code 1WWH.

† http://www.afmb.univ-mrs.fr/-TURBO-

Acknowledgements

We thank Dr Masaki Yamamoto for help in data collection at the RIKEN beamline BL26B1 of SPring-8. We thank Mr Satoshi Morita, Ms Yukiko Kinoshita, Mr Hiroaki Hamana, Ms Hiroko Uda-Tochio, and Ms Keiko Nagano for purification of the proteins. We thank Professor Murray Stewart, MRC, and Dr Yutaka Muto for valuable discussions. We thank Dr Satoru Unzai for help in the analysis of the analytical ultracentrifugation data. This work was supported by the RIKEN Structural Genomics/Proteomics Initiative (RSGI), the National Project on Protein Structural and Functional Analyses, Ministry of Education, Culture, Sports, Science and Technology of Japan.

References 1. Weis, K. (2003). Regulating access to the genome: nucleocytoplasmic transport throughout the cell cycle. Cell, 112, 441–451. 2. Fahrenkrog, B. & Aebi, U. (2003). The nuclear pore complex: nucleocytoplasmic transport and beyond. Nature Rev. Mol. Cell Biol. 4, 757–766. 3. Suntharalingam, M. & Wente, S. R. (2003). Peering through the pore: nuclear pore complex structure, assembly, and function. Dev. Cell, 4, 775–789. 4. Rout, M. P., Aitchison, J. D., Suprapto, A., Hjertaas, K., Zhao, Y. & Chait, B. T. (2000). The yeast nuclear pore complex: composition, architecture, and transport mechanism. J. Cell Biol. 148, 635–651. 5. Cronshaw, J. M., Krutchinsky, A. N., Zhang, W., Chait, B. T. & Matunis, M. J. (2002). Proteomic analysis of the mammalian nuclear pore complex. J. Cell Biol. 158, 915–927. 6. Pemberton, L. F. & Paschal, B. M. (2005). Mechanisms of receptor-mediated nuclear import and nuclear export. Traffic, 6, 187–198. 7. Ryan, K. J. & Wente, S. R. (2000). The nuclear pore complex: a protein machine bridging the nucleus and cytoplasm. Curr. Opin. Cell Biol. 12, 361–371.

8. Makhnevych, T., Lusk, C. P., Anderson, A. M., Aitchison, J. D. & Wozniak, R. W. (2003). Cell cycle regulated transport controlled by alterations in the nuclear pore complex. Cell, 115, 813–823.
9. Lusk, C. P., Makhnevych, T., Marelli, M., Aitchison, J. D. & Wozniak, R. W. (2002). Karyopherins in nuclear pore biogenesis: a role for Kap121p in the assembly of Nup53p into nuclear pore complexes. J. Cell Biol. 159, 267–278.
10. Scott, R. J., Lusk, C. P., Dilworth, D. J., Aitchison, J. D. & Wozniak, R. W. (2005). Interactions between Mad1p and the nuclear transport machinery in the yeast Saccharomyces cerevisiae. Mol. Biol. Cell, 16, 4362–4374.
11. Hawryluk-Gara, L. A., Shibuya, E. K. & Wozniak, R. W. (2005). Vertebrate Nup53 interacts with the nuclear lamina and is required for the assembly of a Nup93-containing complex. Mol. Biol. Cell, 16, 2382–2394.
12. Devos, D., Dokudovskaya, S., Williams, R., Alber, F., Eswar, N., Chait, B. T. et al. (2006). Simple fold composition and modular architecture of the nuclear pore complex. Proc. Natl Acad. Sci. USA, 103, 2172–2177.
13. Marelli, M., Lusk, C. P., Chan, H., Aitchison, J. D. & Wozniak, R. W. (2001). A link between the synthesis of nucleoporins and the biogenesis of the nuclear envelope. J. Cell Biol. 153, 709–724.
14. Mansfeld, J., Guttinger, S., Hawryluk-Gara, L. A., Pante, N., Mall, M., Galy, V. et al. (2006). The conserved transmembrane nucleoporin NDC1 is required for nuclear pore complex assembly in vertebrate cells. Mol. Cell, 22, 93–103.
15. Birney, E., Kumar, S. & Krainer, A. R. (1993). Analysis of the RNA-recognition motif and RS and RGG domains: conservation in metazoan pre-mRNA splicing factors. Nucl. Acids Res. 21, 5803–5816.
16. Burd, C. G. & Dreyfuss, G. (1994). Conserved structures and diversity of functions of RNA-binding proteins. Science, 265, 615–621.
17. Maris, C., Dominguez, C. & Allain, F. H. (2005). The RNA recognition motif, a plastic RNA-binding platform to regulate post-transcriptional gene expression. FEBS J. 272, 2118–2131.
18. Varani, G. & Nagai, K. (1998). RNA recognition by RNP proteins during RNA processing. Annu. Rev. Biophys. Biomol. Struct. 27, 407–445.
19. Varani, L., Gunderson, S. I., Mattaj, I. W., Kay, L. E., Neuhaus, D. & Varani, G. (2000). The NMR structure of the 38 kDa U1A protein–PIE RNA complex reveals the basis of cooperativity in regulation of polyadenylation by human U1A protein. Nature Struct. Biol. 7, 329–335.
20. Shamoo, Y., Krueger, U., Rice, L. M., Williams, K. R. & Steitz, T. A. (1997). Crystal structure of the two RNA binding domains of human hnRNP A1 at 1.75 Å resolution. Nature Struct. Biol. 4, 215–222.
21. Xu, R. M., Jokhan, L., Cheng, X., Mayeda, A. & Krainer, A. R. (1997). Crystal structure of human UP1, the domain of hnRNP A1 that contains two RNA-recognition motifs. Structure, 5, 559–570.
22. Handa, N., Nureki, O., Kurimoto, K., Kim, I., Sakamoto, H., Shimura, Y. et al. (1999). Structural basis for recognition of the tra mRNA precursor by the Sex-lethal protein. Nature, 398, 579–585.
23. Wang, X. & Tanaka Hall, T. M. (2001). Structural basis for recognition of AU-rich element RNA by the HuD protein. Nature Struct. Biol. 8, 141–145.
24. Deo, R. C., Bonanno, J. B., Sonenberg, N. & Burley, S. K. (1999). Recognition of polyadenylate RNA by the poly(A)-binding protein. Cell, 98, 835–845.
25. Price, S. R., Evans, P. R. & Nagai, K. (1998). Crystal structure of the spliceosomal U2B″-U2A′ protein complex bound to a fragment of U2 small nuclear RNA. Nature, 394, 645–650.
26. Mazza, C., Ohno, M., Segref, A., Mattaj, I. W. & Cusack, S. (2001). Crystal structure of the human nuclear cap binding complex. Mol. Cell, 8, 383–396.
27. Fribourg, S., Gatfield, D., Izaurralde, E. & Conti, E. (2003). A novel mode of RBD-protein recognition in the Y14-Mago complex. Nature Struct. Biol. 10, 433–439.
28. Lau, C. K., Diem, M. D., Dreyfuss, G. & Van Duyne, G. D. (2003). Structure of the Y14-Magoh core of the exon junction complex. Curr. Biol. 13, 933–941.
29. Bono, F., Ebert, J., Unterholzner, L., Guttler, T., Izaurralde, E. & Conti, E. (2004). Molecular insights into the interaction of PYM with the Mago-Y14 core of the exon junction complex. EMBO Rep. 5, 304–310.
30. Kadlec, J., Izaurralde, E. & Cusack, S. (2004). The structural basis for the interaction between nonsense-mediated mRNA decay factors UPF2 and UPF3. Nature Struct. Mol. Biol. 11, 330–337.
31. Kielkopf, C. L., Rodionova, N. A., Green, M. R. & Burley, S. K. (2001). A novel peptide recognition mode revealed by the X-ray structure of a core U2AF35/U2AF65 heterodimer. Cell, 106, 595–605.
32. Selenko, P., Gregorovic, G., Sprangers, R., Stier, G., Rhani, Z., Kramer, A. & Sattler, M. (2003). Structural basis for the molecular recognition between human splicing factors U2AF65 and SF1/mBBP. Mol. Cell, 11, 965–976.
33. Kielkopf, C. L., Lucke, S. & Green, M. R. (2004). U2AF homology motifs: protein recognition in the RRM world. Genes Dev. 18, 1513–1526.
34. Schellenberg, M. J., Edwards, R. A., Ritchie, D. B., Kent, O. A., Golas, M. M., Stark, H. et al. (2006). Crystal structure of a core spliceosomal protein interface. Proc. Natl Acad. Sci. USA, 103, 1266–1271.
35. Carninci, P., Waki, K., Shiraki, T., Konno, H., Shibata, K., Itoh, M. et al. (2003). Targeting a complex transcriptome: the construction of the mouse full-length cDNA encyclopedia. Genome Res. 13, 1273–1289.
36. Nagai, K., Oubridge, C., Jessen, T. H., Li, J. & Evans, P. R. (1990). Crystal structure of the RNA-binding domain of the U1 small nuclear ribonucleoprotein A. Nature, 348, 515–520.
37. Oubridge, C., Ito, N., Teo, C. H., Fearnley, I. & Nagai, K. (1995). Crystallisation of RNA-protein complexes. II. The application of protein engineering for crystallisation of the U1A protein-RNA complex. J. Mol. Biol. 249, 409–423.
38. Holm, L. & Sander, C. (1993). Protein structure comparison by alignment of distance matrices. J. Mol. Biol. 233, 123–138.
39. Marelli, M., Aitchison, J. D. & Wozniak, R. W. (1998). Specific binding of the karyopherin Kap121p to a subunit of the nuclear pore complex containing Nup53p, Nup59p, and Nup170p. J. Cell Biol. 143, 1813–1830.
40. Inoue, M., Muto, Y., Sakamoto, H., Kigawa, T., Takio, K., Shimura, Y. & Yokoyama, S. (1997). A characteristic arrangement of aromatic amino acid residues in the solution structure of the amino-terminal RNA-binding domain of Drosophila sex-lethal. J. Mol. Biol. 272, 82–94.
41. Strawn, L. A., Shen, T., Shulga, N., Goldfarb, D. S. & Wente, S. R. (2004). Minimal nuclear pore complexes define FG repeat domains essential for transport. Nature Cell Biol. 6, 197–206.
42. Denning, D. P., Patel, S. S., Uversky, V., Fink, A. L. & Rexach, M. (2003). Disorder in the nuclear pore complex: the FG repeat regions of nucleoporins are natively unfolded. Proc. Natl Acad. Sci. USA, 100, 2450–2455.
43. Kigawa, T., Yabuki, T., Matsuda, N., Matsuda, T., Nakajima, R., Tanaka, A. & Yokoyama, S. (2004). Preparation of Escherichia coli cell extract for highly productive cell-free protein expression. J. Struct. Funct. Genomics, 5, 63–68.
44. Wada, T., Shirouzu, M., Terada, T., Ishizuka, Y., Matsuda, T., Kigawa, T. et al. (2003). Structure of a conserved CoA-binding protein synthesized by a cell-free system. Acta Crystallog. sect. D, 59, 1213–1218.
45. Kigawa, T., Yabuki, T. & Yokoyama, S. (1999). Large-scale protein preparation using the cell-free synthesis. Tanpakushitsu Kakusan Koso, 44, 598–605.
46. Kigawa, T., Yabuki, T., Yoshida, Y., Tsutsui, M., Ito, Y., Shibata, T. & Yokoyama, S. (1999). Cell-free production and stable-isotope labeling of milligram quantities of proteins. FEBS Letters, 442, 15–19.
47. Otwinowski, Z. & Minor, W. (1997). Processing of X-ray diffraction data collected in oscillation mode. Methods Enzymol. 276, 307–326.
48. Terwilliger, T. C. & Berendzen, J. (1999). Automated MAD and MIR structure solution. Acta Crystallog. sect. D, 55, 849–861.
49. Terwilliger, T. (2004). SOLVE and RESOLVE: automated structure solution, density modification and model building. J. Synchrotron Radiat. 11, 49–52.
50. Brunger, A. T., Adams, P. D., Clore, G. M., DeLano, W. L., Gros, P., Grosse-Kunstleve, R. W. et al. (1998). Crystallography and NMR system: a new software suite for macromolecular structure determination. Acta Crystallog. sect. D, 54, 905–921.
51. Laskowski, R. A., MacArthur, M. W., Moss, D. S. & Thornton, J. M. (1993). PROCHECK: a program to check the stereochemical quality of protein structures. J. Appl. Crystallog. 26, 283–291.
52. Kraulis, P. J. (1991). MOLSCRIPT: a program to produce both detailed and schematic plots of protein structures. J. Appl. Crystallog. 24, 946–950.
53. Merritt, E. A. & Murphy, M. E. (1994). Raster3D Version 2.0. A program for photorealistic molecular graphics. Acta Crystallog. sect. D, 50, 869–873.
54. Kenan, D. J., Query, C. C. & Keene, J. D. (1991). RNA recognition: towards identifying determinants of specificity. Trends Biochem. Sci. 16, 214–220.
55. Gouet, P., Courcelle, E., Stuart, D. I. & Metoz, F. (1999). ESPript: analysis of multiple sequence alignments in PostScript. Bioinformatics, 15, 305–308.
56. Glaser, F., Pupko, T., Paz, I., Bell, R. E., Bechor-Shental, D., Martz, E. & Ben-Tal, N. (2003). ConSurf: identification of functional regions in proteins by surface-mapping of phylogenetic information. Bioinformatics, 19, 163–164.

Edited by J. Doudna (Received 26 June; accepted 19 July 2006) Available online 3 August 2006