Solution NMR views of dynamical ordering of biomacromolecules

Teppei Ikeya, David Ban, Donghan Lee, Yutaka Ito, Koichi Kato, Christian Griesinger

PII: S0304-4165(17)30276-3
DOI: 10.1016/j.bbagen.2017.08.020
Reference: BBAGEN 28928

To appear in:
Received date: 20 June 2017
Revised date: 22 August 2017
Accepted date: 24 August 2017

Please cite this article as: Teppei Ikeya, David Ban, Donghan Lee, Yutaka Ito, Koichi Kato, Christian Griesinger, Solution NMR views of dynamical ordering of biomacromolecules, (2017), doi: 10.1016/j.bbagen.2017.08.020

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

ACCEPTED MANUSCRIPT

Teppei Ikeya †,§,*, David Ban, Donghan Lee, Yutaka Ito †,§, Koichi Kato ∥, and Christian Griesinger ¶,*

† Department of Chemistry, Graduate School of Science and Engineering, Tokyo Metropolitan University, 1-1 Minami-Osawa, Hachioji, Tokyo 192-0373, Japan

§ CREST/Japan Science and Technology Agency (JST), 4-1-8 Honcho, Kawaguchi, Saitama 332-0012, Japan

Department of Medicine, James Graham Brown Cancer Center, University of Louisville, 505 S. Hancock St., Louisville, KY 40202, USA

Okazaki Institute for Integrative Bioscience and Institute for Molecular Science, National Institutes of Natural Sciences, 5-1 Higashiyama, Myodaiji, Okazaki 444-8787, Japan

∥ Graduate School of Pharmaceutical Sciences, Nagoya City University, Tanabe-dori 3-1, Mizuho-ku, Nagoya 467-8603, Japan

¶ Department of Structural Biology, Max Planck Institute for Biophysical Chemistry, Am Fassberg 11, Göttingen, 37077 Germany

*Corresponding authors

Abstract

Background: To understand the mechanisms related to the ‘dynamical ordering’ of macromolecules and biological systems, it is crucial to monitor, in detail, molecular interactions and their dynamics across multiple timescales. Solution nuclear magnetic resonance (NMR) spectroscopy is an ideal tool that can investigate biophysical events at the atomic level, in near-physiological buffer solutions, or even inside cells.

Scope of Review: In the past several decades, progress in solution NMR has significantly contributed to the elucidation of three-dimensional structures, the understanding of conformational motions, and the underlying thermodynamic and kinetic properties of biomacromolecules. This review discusses recent methodological developments in NMR, their applications and some of the remaining challenges.

Major Conclusions: Although a major drawback of NMR is its difficulty in studying the dynamical ordering of larger biomolecular systems, current technologies have achieved considerable success in the structural analysis of substantially large proteins and biomolecular complexes over 1 MDa and have characterised biomolecular motion across a wide range of timescales. While NMR is well suited to obtaining detailed local structural information, it also contributes valuable and unique information within hybrid approaches that combine complementary methodologies, including solution scattering and microscopic techniques.

General Significance: For living systems, the dynamic assembly and disassembly of macromolecular complexes is of utmost importance for cellular homeostasis and, if dysregulated, is implicated in human disease. It is thus instructive for advancing the study of dynamical ordering to discuss the potential of solution NMR spectroscopy and its applications.

Keywords: dynamical ordering, solution NMR spectroscopy, dynamics, large biomacromolecules, stable isotope labelling, in-cell NMR

Abbreviations

Amyotrophic Lateral Sclerosis – ALS
Carr-Purcell-Meiboom-Gill – CPMG
Cell Penetrating Peptide – CPP
Chinese Hamster Ovary Cells – CHO
Chemical/Conformational Exchange – $R_{\mathrm{ex}}$
(Chemical) Exchange Saturation Transfer – (C)EST
Chemical Shift Anisotropy – CSA
Compressed Sensing – CS
Cross-Correlated Relaxation – CCR
Dark State Exchange Saturation Transfer – DEST
Direct Interpretation of Dipolar Couplings – DIDC
Effective Transverse Relaxation Rate – $R_{2,\mathrm{eff}}$
Ensemble Refinement for Native Proteins Using a Single Alignment Tensor – ERNST
Ensemble Refinement from Unfolded Structures in Explicit Solvent Restrained with NOE and RDC data – EROS
Enzyme I – EI
Gaussian Axial Fluctuation – GAF
Heteronuclear Single Quantum Coherence – HSQC
Histidine Phosphocarrier – HPr
Human Embryonic Kidney Cells – HEK
Hydrogen/Deuterium – H/D
Intrinsically Disordered Protein – IDP
Intrinsic Relaxation Rate – $R_2^0$
Lanthanide Binding Tag – LBT
Lipari-Szabo Order Parameter – $S^2_{\mathrm{LS}}$
Longitudinal Cross-Correlated Relaxation Rate – $\eta_z$
Longitudinal Relaxation – $R_1$
Maltodextrin-Binding Protein – MBP
Maximum Entropy – MaxEnt
Model-Free – MF
Model-Free Analysis – MFA
Molecular Dynamics – MD
Multi-Dimensional Decomposition – MDD
Non-uniform Sampling – NUS
Nuclear Magnetic Resonance – NMR
Nuclear Overhauser Effect – NOE
Optimised RDC-based Iterative and Unified Model-Free – ORIUM
Paramagnetic Relaxation Enhancement – PRE
Protein Data Bank – PDB
Pseudo-Contact Shift – PCS
Quantitative Maximum Entropy Reconstruction – QME
Radio-Frequency – RF
RDC Order Parameter – $S^2_{\mathrm{RDC}}$
Residual Dipolar Coupling – RDC
Relaxation Dispersion – RD
Sparse Multidimensional Iterative Lineshape-Enhanced – SMILE
Spectroscopy by Integration of Frequency and Time Domain Information – SIFT
Stereo-Array Isotope Labelling – SAIL
Streptolysin O – SLO
Superoxide Dismutase – SOD
Third IgG-Binding Domain of Protein G – GB3
Trans-Activating Transcriptional Activator – TAT
Transverse Cross-Correlated Relaxation Rate – $\eta_{xy}$
Transverse Relaxation – $R_2$
Transverse Relaxation Optimised Spectroscopy – TROSY
Transverse Rotating Frame Spectroscopy
TROSY for Rotational Correlation Times – TRACT

1. Introduction

Living systems are an orchestration of a vast number and variety of biomolecules. For example, a cell contains a multitude of components that incorporate many different biomolecules operating collectively for its survival. Each of these biomolecules can be structurally complex and exhibit some degree of plasticity, functioning through pathways that are widely interconnected and driven by myriad interactions between proteins and other biomolecules. In fact, the interactome of the human cell has been estimated to comprise about 130,000 distinct interactions [1]. Since these interactions are critical for maintaining life, tremendous scientific effort has been devoted to characterising biomolecules in their healthy and diseased states in order to dissect their function. Dynamical ordering is a useful phrase to indicate that molecules self-assemble in the cell to exert their function, and that the assemblies are often conformationally dynamic and also transient in nature. Thus, dynamical ordering reflects not only the conformational but also the stoichiometric dynamics of complexes within the cell.

Over the past several decades, advances in structural biology have greatly expanded our knowledge of the relationship between the atomic structure of proteins and their function. To date, techniques including nuclear magnetic resonance (NMR) spectroscopy, X-ray crystallography and cryo-electron microscopy have led to the deposition of more than 123,000 structures in the Protein Data Bank (PDB) [2, 3]. These structures have provided a wealth of information on protein structure and on how structural attributes can be linked to function. Moreover, it has become clear from this plethora of structural data that the conformations accessible to a biomolecule can be heterogeneous, differing substantially from one another [4-6]. Therefore, the temporal displacement of atoms is an inherent characteristic of biomolecules, and access to new conformations underlies features such as the capacity to expose binding surfaces [7], reorganise active sites [8, 9] and uncover allosteric regions [10, 11]. It is no surprise that, over the past decades, the concept that biomolecules are inherently dynamic has become well appreciated, placing focus on the characterisation of these dynamics. In this review article, we outline the basics and applications of solution NMR methodologies for quantitatively characterising biomolecular dynamics on various timescales, including those applied to in-cell NMR spectroscopy.

2. Molecular dynamics across different NMR-accessible timescales

Accessing different timescales of molecular dynamics using NMR spectroscopy

Biomolecular motion, one of the aspects of dynamical ordering, occurs on various timescales, typically from picoseconds to seconds and longer, up to years. “Biomolecules sample a free energy surface whose distinct minima represent different structural configurations [12]. Each minimum has distinct characteristics where each minimum’s depth and width are related to the enthalpic and entropic contributions for the free energy of a given state. Here, a state refers to a unique structural configuration that will have some associated free energy value. Access to the different minima is driven by the thermal energy of the system, and the barriers between minima represent the activation kinetics of the process. These barriers can involve transitions between local minima that are shallow and that can maintain periodic behaviour (e.g. described by a harmonic oscillator), or transitions occur between low energy regions that are separated by larger barriers (greater than 1 kT).” The free energy landscape can be altered by ligand binding, which can cause large or subtle structural changes [13, 14]. Even in the absence of a ligand, many proteins exist in solution as a structural ensemble that samples a variety of structural states [13, 15]. Importantly, this means that the sampled structural states are an intrinsic property of the protein.

A variety of studies have illuminated how intrinsic motion and the resulting conformational diversity affect the functionality of biomolecules. An early study examined an antibody’s structural heterogeneity among a variety of binding-competent states capable of recognising different potential antigens [16]. However, the “true” binding-competent states for a particular antigen could only be identified when both the structural and the kinetic information were known.
This not only stresses the importance of ascertaining the conformers sampled in the ground state of biomolecules, but also underlines the need for kinetic measurements that allow the minima of the free energy landscape to be connected. Another example, in which the motional properties of a biomolecule have a direct consequence for drug discovery, is that of allosteric inhibitors [7, 10]. Recently, it has been shown that an allosteric inhibitor, which imposed no changes on the ground-state structure of the protein, directly affected an on-pathway transient state [17]. Moreover, the authors showed that the interaction with this inhibitor abolished the low-populated state and destroyed the affinity for the protein’s natural binding partner. Since traditional structure-based drug design approaches focus on understanding differences between the apo and bound states of biomolecules [7], this work suggests that more attention should be paid to the transient states explored by the protein drug target [18, 19].

One can also argue that understanding the breadth of the conformational space of a system is important for understanding function. Another group recently studied the transient formation of potentially deviant conformations of the metalloenzyme superoxide dismutase (SOD), a protein that is related to amyotrophic lateral sclerosis (ALS) [20]. Through NMR-based dynamics studies, this protein was shown to transiently visit “aggregation prone” conformations that may lead to aberrant events related to plaque formation. This is another elegant example in which looking only at ground-state structures of a protein fails to detect deviant processes that are transient in nature. It is interesting to note that mutant forms of the protein had the same as well as additional off-pathway states, and that their exchange kinetics were accelerated [21]. Therefore, by ascertaining the kinetics that underlie the interconversion between states, one can identify that the same deviant conformations are visited more frequently in the mutants. Such information may help explain their enhanced pathogenic traits. These are just a few examples that support the study of protein dynamics in order to understand the functions of biomolecules in healthy and potentially diseased states.
We refer the reader to a variety of reviews that further discuss possible connections between protein dynamics and biological function for different protein classes [11, 19, 22-24].

NMR is an ideal technique whose observables span a broad range of timescales, from picoseconds to real time (Fig. 1). Additionally, NMR maintains atomic resolution, does not physically harm the system under study and permits it to be examined in a native/physiological environment. Thus, NMR is a powerful tool for studying the native dynamics of biomolecules. Below, we present a variety of techniques that measure amplitude and kinetic information of biomolecules across the entire dynamic range accessible by NMR spectroscopy.
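As a concrete illustration of the free energy landscape picture described in this subsection, the following minimal Python sketch converts the relative free energies of a few discrete states into equilibrium populations through the Boltzmann distribution. The function name and the two-state example values are illustrative assumptions and are not taken from any of the cited studies.

```python
import math


def boltzmann_populations(free_energies_kt):
    """Equilibrium populations of discrete states.

    free_energies_kt: free energies in units of kT, relative to any
    common reference (hypothetical helper, for illustration only).
    """
    # Shift by the lowest energy so the exponentials stay well behaved.
    lowest = min(free_energies_kt)
    weights = [math.exp(-(g - lowest)) for g in free_energies_kt]
    partition = sum(weights)
    return [w / partition for w in weights]


# Assumed example: a transient state lying 3 kT above the ground state.
ground, transient = boltzmann_populations([0.0, 3.0])
print(f"ground: {ground:.3f}, transient: {transient:.3f}")
```

Under these assumed values the transient state is populated at just under 5%, which is the kind of sparsely populated state that ground-state structures alone would not reveal.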

The central principle of NMR relaxation

The common feature of all NMR-based relaxation experiments (Fig. 1) is that they rely on some mechanism(s) that relax coherences (spin states) to the equilibrium state [25, 26]. This relaxation is brought about by time-variant, locally oscillating magnetic fields. Relaxation is manifested at frequencies that match transition frequencies, which depend on the mechanism and the nucleus being queried [26]. These magnetic fields are typically decomposed into their longitudinal and transverse components, and their effect depends on their amplitude and on whether they are resonant with any transition frequency. This effect can be quantified given that any time-dependent interaction is stochastic and has no memory of its previous state in time. Therefore, the time-dependent interaction maintains an average value of zero. For it to cause relaxation, however, its variance over time must not be zero. This allows the auto-correlation function of the time dependence of a given interaction to be calculated. The maximum of this function, the variance, is the amplitude of the process, and the decay lifetime of the auto-correlation function is a direct measure of the kinetics of the underlying process(es).

This can be further extended to situations in which two different relaxation mechanisms act together, creating a cross-correlation function. This occurs when multiple (double or zero) quantum coherences are generated between two inter-nuclear vectors, which yield different combinations of cross-correlated relaxation (vide infra) [27]. Therefore, all NMR relaxation techniques effectively monitor the decay rate of coherences that is caused by unique mechanisms and that is sensitive to the timescale over which the interaction persists. This allows one to directly extract amplitude and kinetic information from phenomena that cause relaxation.
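The two quantities named above, the amplitude (the zero-delay value of the auto-correlation function) and the decay lifetime, can be illustrated with a minimal Python sketch. It assumes a uniformly sampled fluctuating quantity and uses a deliberately crude 1/e criterion for the lifetime; both function names are hypothetical, and the sketch is not the estimator used in any of the cited relaxation analyses.

```python
import math


def autocorrelation(series, max_lag):
    """Estimate C(k) = <(x_t - <x>)(x_{t+k} - <x>)> for k = 0..max_lag.

    Requires max_lag < len(series). C(0) is the variance, i.e. the
    amplitude of the fluctuating process.
    """
    n = len(series)
    mean = sum(series) / n
    centred = [x - mean for x in series]
    result = []
    for lag in range(max_lag + 1):
        products = [centred[t] * centred[t + lag] for t in range(n - lag)]
        result.append(sum(products) / len(products))
    return result


def decay_lifetime(correlation, time_step):
    """Crude lifetime: first delay at which C(k) drops below C(0)/e.

    Returns None if the function never decays that far in the window.
    """
    threshold = correlation[0] / math.e
    for lag, value in enumerate(correlation):
        if value < threshold:
            return lag * time_step
    return None


# Assumed toy trace, sampled at an arbitrary 1 ps interval.
trace = [1.0, 0.8, 0.5, 0.1, -0.4, -0.9, -1.0, -0.6, 0.0, 0.5]
corr = autocorrelation(trace, max_lag=4)
print("amplitude (variance):", corr[0])
print("lifetime estimate (ps):", decay_lifetime(corr, time_step=1.0))
```

In practice the decay would be fitted to one or more exponentials rather than read off with a threshold; the threshold is used here only to keep the sketch short.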
It is not our intention here to give extensive theoretical detail for the variety of NMR relaxation methods. The reader is referred to references [26, 28, 29], which elegantly describe the theoretical frameworks for different relaxation mechanisms and how to calculate their values. Instead, we highlight some NMR relaxation techniques, their basic concepts, and their application in understanding the importance of protein dynamics.

Conventional NMR relaxation methods that report on motion from picoseconds to the overall tumbling time

The most frequently employed analysis of protein dynamics by NMR uses conventional NMR relaxation (Fig. 1 & 2). These experiments report on the amplitude of bond vector motions on timescales from picoseconds up to the overall tumbling time ($\tau_c$), termed the sub-$\tau_c$ range. For a protein, $\tau_c$ is typically between several and tens of nanoseconds (Fig. 2). Measurements are executed to determine the longitudinal relaxation rate ($R_1$), the transverse relaxation rate ($R_2$), and the heteronuclear Nuclear Overhauser Effect (NOE). These values are subsequently combined to extract $\tau_c$ and the motional amplitudes of inter-nuclear vectors, encoded as Lipari-Szabo order parameters ($S^2_{\mathrm{LS}}$; values between 0 [full flexibility] and 1 [full rigidity]); this analysis is termed the simple model-free (MF) formalism [29-31]. Alternatively, these data can be used to ascertain the rotational anisotropy of a system, which has also been used as a structural restraint [32].

It is important to note that experiments which measure $R_2$ actually report an effective transverse relaxation rate ($R_{2,\mathrm{eff}}$), which is the sum of two different relaxation sources. The first, the intrinsic transverse relaxation rate ($R_2^0$), is caused by dipolar and chemical shift anisotropy (CSA) mechanisms which originate from processes occurring within the sub-$\tau_c$ range. The second is chemical exchange ($R_{\mathrm{ex}}$) due to slower processes, which adds to $R_2^0$ (vide infra) [33, 34]. In some cases, the simple MF formalism can fail, typically leading to an overestimation of $S^2_{\mathrm{LS}}$. This failure has two possible causes: 1) additional sub-$\tau_c$ processes that are at least an order of magnitude slower than any picosecond processes (e.g. the internal correlation time) but faster than $\tau_c$, or 2) a contribution from $R_{\mathrm{ex}}$ (vide infra) [35-37].
This can be alleviated by using the extended MF formalism, which includes an additional correlation time and order parameter within the auto-correlation function [35], as well as a “pseudo offset” to account for $R_{\mathrm{ex}}$. These experiments have been widely applied to the study of backbone $^{15}$N nuclei because of their ease of measurement. Furthermore, this methodology has been subjected to rigorous evaluation in order to reduce unwanted interferences that can hinder the accuracy of the measured relaxation rates [38-41]. These approaches and their analyses have also been extended to other nuclei, including a variety of backbone and side-chain nuclei [42].

Other useful sub-$\tau_c$ relaxation rates are the transverse ($\eta_{xy}$) and longitudinal ($\eta_z$) cross-correlated relaxation rates [43] (Fig. 2). These cross-correlated relaxation rates measure the contribution of relaxation interference between the dipolar and CSA mechanisms for a given nucleus. Cross-correlated relaxation is exploited in Transverse Relaxation Optimised Spectroscopy (TROSY) [44], which utilises the different relaxation properties of the fast and slowly relaxing components of the $^{15}$N-$^{1}$H$^{N}$ doublet. TROSY selects the slowly relaxing component because its decay is slowed by a contribution of $\eta_{xy}$, leading to narrower resonance line-widths. $\eta_{xy}$ is an ideal parameter for studying the sub-$\tau_c$ dynamics of biomolecules, as it is comparable to $R_2^0$ but is not sensitive to chemical exchange processes [38, 43]. This parameter can be determined using the TROSY for rotational correlation times (TRACT) experiment, in which the difference in relaxation rates between the fast and slowly relaxing components renders a direct estimation of $\tau_c$ [45]. Furthermore, this experiment can be extended to a two-dimensional version whereby individual $^{15}$N backbone nuclei can be measured [46, 47]. This method has been used successfully to study the $^{15}$N backbone nuclei of several large proteins and their complexes (vide infra).
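The simple and extended model-free formalisms discussed above are usually expressed through a spectral density function, the Fourier transform of the auto-correlation function. The following Python sketch writes down the textbook Lipari-Szabo form and the two-timescale extension, assuming isotropic overall tumbling; the function and parameter names are illustrative choices and the numerical inputs are assumptions, not values from the cited work.

```python
import math


def j_simple(omega, tau_c, s2, tau_e):
    """Simple model-free spectral density (isotropic tumbling assumed).

    omega: angular frequency (rad/s); tau_c: overall tumbling time (s);
    s2: order parameter (0..1); tau_e: internal correlation time (s).
    """
    tau = tau_c * tau_e / (tau_c + tau_e)  # effective internal time
    rigid = s2 * tau_c / (1.0 + (omega * tau_c) ** 2)
    internal = (1.0 - s2) * tau / (1.0 + (omega * tau) ** 2)
    return 0.4 * (rigid + internal)


def j_extended(omega, tau_c, s2_fast, s2_slow, tau_fast, tau_slow):
    """Extended model-free spectral density with two internal timescales."""
    s2 = s2_fast * s2_slow  # overall order parameter
    t_f = tau_c * tau_fast / (tau_c + tau_fast)
    t_s = tau_c * tau_slow / (tau_c + tau_slow)
    rigid = s2 * tau_c / (1.0 + (omega * tau_c) ** 2)
    fast = (1.0 - s2_fast) * t_f / (1.0 + (omega * t_f) ** 2)
    slow = (s2_fast - s2) * t_s / (1.0 + (omega * t_s) ** 2)
    return 0.4 * (rigid + fast + slow)


def effective_r2(r2_intrinsic, r_ex):
    """Measured transverse rate as the sum of intrinsic and exchange terms."""
    return r2_intrinsic + r_ex


# Assumed example: 15N Larmor frequency near 60.8 MHz, tau_c = 10 ns.
omega_n = 2.0 * math.pi * 60.8e6
print(j_simple(omega_n, 10e-9, 0.85, 50e-12))
print(j_extended(omega_n, 10e-9, 0.90, 0.80, 20e-12, 1e-9))
```

When the fast order parameter is set to 1, the extended expression collapses onto the simple one, which is a convenient consistency check when deciding whether the additional parameters are warranted by the data.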
Conventional relaxation is an effective approach to characterise the internal dynamics and overall rotational tumbling properties of a molecule, as relaxation rates are sensitive to size and shape [48, 49]. Besides this basic characterisation of a biomolecule, the amplitude of motion of an inter-nuclear vector within the sub-$\tau_c$ range is also obtained. This is particularly useful in cases where $S^2_{\mathrm{LS}}$ values (Fig. 2) can be used to separate contributions from motional parameters that cover both the sub-$\tau_c$ and slower temporal regimes (vide infra) [50].

Intrinsically disordered proteins (IDPs) remain a challenge to study using common structural biology methods (e.g. X-ray crystallography) due to their inherent flexibility. NMR has become an ideal technique to obtain information at atomic resolution about these types of systems [51]. Conventional relaxation methods have been applied to the study of IDPs [52-54] and unfolded proteins [55, 56] and, within these types of proteins, have identified stretches of residues that may be involved in long-range interactions as well as regions that can adopt transient secondary structure elements [55, 56]. However, the detailed interpretation of the relaxation rates is not clear, because classical relaxation models rely on a separation between the internal motion of a given inter-nuclear vector and the overall tumbling of the biomolecule [31]. For IDPs, there is no overall tumbling, and therefore the overall and internal motions of a given inter-nuclear vector cannot be separated. Steps have been taken to alleviate these issues through the development of new models that provide quantitative insight into the sub-$\tau_c$ motion of IDPs [54, 57, 58].

Interesting insight has been obtained from conventional relaxation approaches applied to methyl-bearing side-chain moieties [59, 60] in biomolecules. The axis order parameter of a methyl-bearing moiety reports on torsion- and libration-type fluctuations of the proximal branch-point carbon and its adjacent aliphatic carbon, which are read out on the attached methyl group(s) [61]. It was realised that the axis order parameter is an excellent proxy for the conformational entropy of a system and that methyl groups alone account for a significant contribution to a system’s overall entropy [62-64]. Thus, they are excellent reporters for assessing changes to a system. This approach has been used to quantify the role of conformational entropy during molecular recognition processes [63] and has provided insight into the role of conformational entropy during protein activity [65].

The functional aspects derived from the application of conventional relaxation techniques highlight their value, and NMR also serves as experimental validation of dynamic conformational ensembles of biomolecules derived from molecular dynamics simulation [66-68]. However, the sub-$\tau_c$ dynamics of biomolecules cover only a small part of the range that NMR techniques can access. Below, we outline NMR-based experiments that explore the dynamics of biomolecules from nanoseconds to tens of milliseconds.

Accessing the amplitude of motion from processes slower than $\tau_c$: the supra-$\tau_c$ range

NMR is not only sensitive to fast sub-$\tau_c$ motions, but provides experimental observables that cover the entire temporal range (Fig. 1). Motion that occurs between $\tau_c$ and tens of microseconds, called the supra-$\tau_c$ range, had long been a “blind spot” for NMR. Over the past decade, large methodological strides have been made to explore the amplitude of motions within this time range (Fig. 1). It is important to note that many conventional structural biology techniques cannot access this range of protein motion [33]. NMR-based experiments that measure the amplitude of motion in the supra-$\tau_c$ range have been used to elucidate the role of protein dynamics in molecular recognition (for both proteins [15] and nucleic acids [69]), allostery [70], transient formation of structural elements [71], and functional states that promote protein activity [72].

Direct access to the amplitude of motion of inter-nuclear vectors in proteins is provided chiefly by two experimental observables: the residual dipolar coupling (RDC) and cross-correlated relaxation (CCR) (Fig. 2). Although chemical shifts are also very sensitive reporters of structure [73] and of motion within this time range (Fig. 1) [74, 75], their use is highly dependent on the model used to describe them [76], and quantifying all sources that contribute to them remains a challenge [73]. Thus, we focus our discussion on RDCs and CCR.

In isotropic solutions, biomolecules tumble freely, so that all dipolar couplings average to zero because all orientations are equally populated. The couplings can be partially restored by studying proteins under anisotropic conditions that cause partial alignment of the molecule. In this scenario, orientations become unequally populated, which results in incompletely averaged dipolar couplings [77, 78].
These are termed RDCs, and their magnitude is on the order of one-thousandth of the maximal dipolar coupling [77]. RDCs are detected through a variety of experimental approaches that accurately measure scalar couplings [79]; under anisotropic conditions, the measurement provides the sum of the scalar coupling and the RDC [79]. We refer the reader to references [33, 77, 78] and all references therein for an extended discussion of the theory and functional use of RDC data. RDC data directly report on the orientation of a bond vector with respect to an external alignment frame and can be directly enforced to define global structural information for a biomolecule [77]. This has become common practice in the NMR community during structure determination and has allowed the fold of proteins as large as 82 kDa to be determined [80]. Therefore, RDCs help to push the upper limit in molecular weight of systems that can be studied by NMR. Most importantly, RDCs also report on the ensemble-averaged orientation of a given inter-nuclear vector across all structural configurations that a biomolecule can sample (Fig. 2). This averaging covers the sub-$\tau_c$ range as well as motions as slow as milliseconds (proportional to the inverse of the measured RDC value) [81, 82].
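The dependence of an RDC on bond-vector orientation described above can be sketched in a few lines of Python. The sketch assumes a rigid molecule whose partial alignment is summarised by a symmetric, traceless order (Saupe) matrix expressed in the molecular frame, and it takes the maximal dipolar coupling as an input because its sign and prefactor conventions differ between references. All names and numerical values are illustrative assumptions.

```python
import math


def rdc_from_saupe(vector, saupe, d_max_hz):
    """RDC of one inter-nuclear vector for a rigid, partially aligned molecule.

    vector: (x, y, z) in the molecular frame (normalised internally).
    saupe: 3x3 symmetric, traceless order matrix in the same frame.
    d_max_hz: maximal dipolar coupling for this spin pair (convention-
              dependent; supplied by the caller).
    """
    norm = math.sqrt(sum(c * c for c in vector))
    unit = [c / norm for c in vector]
    projected = sum(
        saupe[i][j] * unit[i] * unit[j] for i in range(3) for j in range(3)
    )
    return d_max_hz * projected


def observed_splitting(j_coupling_hz, rdc_hz):
    """Splitting measured under alignment: scalar coupling plus RDC."""
    return j_coupling_hz + rdc_hz


# Assumed axially symmetric alignment with a degree of order near 1e-3,
# matching the "one-thousandth of the maximal coupling" scale in the text.
saupe = [
    [-0.5e-3, 0.0, 0.0],
    [0.0, -0.5e-3, 0.0],
    [0.0, 0.0, 1.0e-3],
]
d_max = 20000.0  # illustrative magnitude in Hz, not a tabulated constant
for name, vec in [("along z", (0, 0, 1)), ("along x", (1, 0, 0))]:
    print(name, rdc_from_saupe(vec, saupe, d_max))
```

Setting the order matrix to zero reproduces the isotropic case, in which every coupling averages to zero regardless of orientation.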

Several groups have pursued analytical methods that extract dynamical information from RDC data [Direct Interpretation of Dipolar Couplings (DIDC) [83] and Model-free Analysis (MFA) [82]], which have been unified in the Optimised RDC-based Iterative and Unified Model-free analysis (ORIUM) [84]. In brief, obtaining dynamic information from RDC data requires a minimum of five orthogonal alignment conditions and a protein structure. The five independent elements of the RDC tensor can then be calculated for each inter-nuclear vector within the protein, and from these the averaged second-order spherical harmonics of each inter-nuclear vector [50, 82]. The sum over the products of each of the five averaged spherical harmonics with its complex conjugate yields an RDC-derived order parameter ($S^2_{\mathrm{RDC}}$) that reflects all motion from picoseconds to milliseconds (Fig. 2). It should be noted that structural noise can affect the accuracy of RDC-based order parameters, but the ORIUM method can start from random extended structures and is thus highly tolerant to structural noise [84]. Ultimately, the contribution of motion from the sub-$\tau_c$ range can be separated out by taking the ratio between $S^2_{\mathrm{RDC}}$ and $S^2_{\mathrm{LS}}$, yielding a parameter that provides the amplitude of motion purely from the supra-$\tau_c$ range [50].

An alternative model developed by Brüschweiler and co-workers [85, 86], the three-dimensional Gaussian Axial Fluctuation (3D GAF) model, has also been applied by Blackledge and co-workers to identify the amplitude of supra-$\tau_c$ motions within the protein GB1 [87]. Interestingly, the amplitudes derived from 3D GAF and MFA correlated well with one another [50]. A key use of large RDC datasets is their incorporation into ensemble refinement protocols, which provide structural representations of motion within the supra-$\tau_c$ range [13, 15]. Several of these ensembles have identified a variety of functional processes that occur within this temporal range [15, 88-91].
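The order parameter built from averaged second-order spherical harmonics can be illustrated with a short Python sketch. Instead of fitting five alignment datasets as in the analyses cited above, the sketch assumes that an explicit ensemble of bond-vector orientations in a common molecular frame is already available, and it uses the Cartesian expression that is equivalent, through the spherical harmonic addition theorem, to summing the products of the five averaged harmonics with their complex conjugates. Function names and input values are hypothetical.

```python
import math


def order_parameter(vectors):
    """S^2 = (3/2) * sum_ij <u_i u_j>^2 - 1/2 over an ensemble of vectors.

    vectors: iterable of (x, y, z) orientations in a common frame;
    each is normalised before averaging.
    """
    units = []
    for v in vectors:
        norm = math.sqrt(sum(c * c for c in v))
        units.append([c / norm for c in v])
    n = len(units)
    total = 0.0
    for i in range(3):
        for j in range(3):
            mean_ij = sum(u[i] * u[j] for u in units) / n
            total += mean_ij * mean_ij
    return 1.5 * total - 0.5


def supra_tau_c_order_parameter(s2_rdc, s2_ls):
    """Ratio that isolates the supra-tau_c amplitude, as described above."""
    return s2_rdc / s2_ls


# A vector wobbling by 20 degrees about z in the xz plane (assumed ensemble).
wobble = math.radians(20.0)
ensemble = [
    (math.sin(-wobble), 0.0, math.cos(-wobble)),
    (0.0, 0.0, 1.0),
    (math.sin(wobble), 0.0, math.cos(wobble)),
]
print("S2 from ensemble:", order_parameter(ensemble))
print("supra-tau_c S2:", supra_tau_c_order_parameter(0.60, 0.80))
```

A single fixed orientation gives a value of 1 and an isotropic distribution gives 0, matching the limits quoted for the Lipari-Szabo order parameter.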
Biological processes rely on different biomolecules interacting with one another. Therefore, the way biomolecules recognise one another is critical, and understanding the molecular features of these processes has provided key insight into protein behaviour. Traditionally, two general mechanisms describe how biomolecules interact with one another: the induced-fit and the conformational selection models [92, 93]. Induced fit dictates that the process is diffusion controlled: the target protein changes its conformation upon initial interaction and then converts to its final bound conformation. In contrast, the conformational selection model states that the bound-state conformation is sampled in the absence of ligand; binding is therefore not limited by diffusion, but relies on the rate of interconversion between conformers.

RDC data have played a critical role in evaluating different molecular recognition processes [15, 69, 94]. In particular, ensembles of structures restrained with RDC data provide access to structures that reflect motions up to microseconds [15] (Fig. 2). For the protein ubiquitin in the absence of ligand, RDC data from a total of 36 different alignment conditions were utilised to develop an ensemble called EROS. Remarkably, the EROS ensemble revealed that, in the apo state, ubiquitin samples structures that mimic those in its various bound complexes [15]. This result was only revealed when RDC data were included in the ensemble generation procedure, highlighting that the molecular recognition process of ubiquitin stems from motion within the supra-$\tau_c$ range. This approach has also been applied to nucleic acids, in particular to reveal the conformational space of HIV-1 TAR RNA [69]. Interestingly, the RDC-derived ensemble of HIV-1 TAR RNA could be used in an in silico docking protocol, which helped to find potent inhibitors of HIV-1 TAR RNA [95].
Access to a wide range of structural representations that cover the conformational space of a system indicates that conformational ensembles can be important within a computational chemistry framework. The aforementioned ensembles assume that all conformers are equally populated and therefore represent non-canonical distributions. Accelerated molecular dynamics (MD) simulations [96] restrained with RDC data can subsequently be Boltzmann reweighted, providing a method to generate canonical conformational ensembles that report on dynamics up to milliseconds. This has been applied to ubiquitin [97], the third IgG-binding domain of protein G (GB3) [98] and larger proteins such as thrombin [99] and IκBα [100]. This suggests that motion within the supra-$\tau_c$ range may be a common feature of many proteins. Ensemble generation for IDPs frequently utilises RDC data [101]; it is therefore interesting to speculate that motion within the supra-$\tau_c$ range may exist in these systems as well. However, the kinetics of interconversion between conformers have remained elusive.

A key limitation of ensemble refinement with RDC data is the requirement that many orthogonal RDC datasets be acquired. For many systems, this may not be feasible. Therefore, an observable that provides amplitude information across the same temporal regime, but under isotropic conditions, is highly desirable. The measurement of angular changes between two different inter-nuclear vectors via CCR experiments has emerged as a promising alternative and complement to RDCs.
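The Boltzmann reweighting of biased simulations mentioned earlier in this subsection can be sketched as follows. The text does not specify how the reweighting was carried out in the cited studies, so the Python sketch below assumes the generic scheme in which each frame of an accelerated MD trajectory carries a boost energy and is weighted by the exponential of that boost in units of kT. The function names and the per-frame values are hypothetical.

```python
import math


def boltzmann_weights(boost_energies_kt):
    """Normalised weights w_i proportional to exp(dV_i / kT).

    boost_energies_kt: per-frame boost energy in units of kT (assumed
    to be available from the biased simulation).
    """
    # Subtract the largest boost before exponentiating for stability.
    shift = max(boost_energies_kt)
    raw = [math.exp(dv - shift) for dv in boost_energies_kt]
    total = sum(raw)
    return [r / total for r in raw]


def reweighted_average(values, boost_energies_kt):
    """Canonical-ensemble estimate of an observable from biased frames."""
    weights = boltzmann_weights(boost_energies_kt)
    return sum(w * v for w, v in zip(weights, values))


# Assumed per-frame RDC back-calculations (Hz) and boost energies (kT).
frame_rdcs = [4.2, -1.5, 7.8, 3.1]
frame_boosts = [0.0, 2.0, 0.5, 1.0]
print(reweighted_average(frame_rdcs, frame_boosts))
```

With equal boosts the result reduces to the plain ensemble average, which corresponds to the equally populated, non-canonical ensembles described above. In practice exponential reweighting can be dominated by a few frames, so the effective number of contributing frames is worth monitoring.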

CCR monitors relaxation due to either the dipolar mechanism for both inter-nuclear vectors, the CSA mechanism for both inter-nuclear vectors, or a combination in which the dipolar mechanism acts for one inter-nuclear vector and the CSA mechanism for the other [27]. For example, dipolar/dipolar CCR rates report on the fluctuation of the projection angle between two different inter-nuclear vectors. This relaxation phenomenon is measured by monitoring multiple-quantum (zero- and double-quantum) coherences, whose underlying descriptions directly relate to the angular geometry along the polypeptide backbone between the queried inter-nuclear vectors [27]. CCR was utilised in an early study in which dipolar/dipolar CCR rates were measured between the 1HN/15N and 1Hα/13Cα inter-nuclear vectors; the rates could be correlated with a Karplus-type curve and provided a direct measurement of the ψ dihedral angle [102]. Consequently, a large variety of experiments were developed to determine the angles between inter-nuclear vectors across the polypeptide backbone of proteins [102-110] and nucleic acids [111-117]. Since CCR rates carry rich structural information, their use in structure determination has also been demonstrated [27]. It is key to note that the CCR rate depends on knowledge of τc, but τc can be determined using the methods mentioned in the sub-τc section above. Similar to RDCs, CCR rates monitor motions across the supra-τc range (Fig. 1 & 2). The capacity to monitor temporal fluctuations of the angle between inter-nuclear vectors not only provides complementary amplitude information about motion in the supra-τc range, but also offers a platform to address how a pair of inter-nuclear vectors fluctuates (i.e. whether two bond vectors move in a correlated or anticorrelated fashion) (Fig. 2). Efforts have been made to implement CCR rates within ensemble refinement procedures.
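
The angular dependence of a dipolar/dipolar CCR rate described above can be sketched numerically. The example below assumes a rigid molecule in the slow-tumbling limit, where the rate is commonly written as (2/5) d_NH d_CH τc (3cos²θ − 1)/2, with d the dipolar coupling constant of each spin pair and θ the projection angle between the two vectors. The bond lengths, the τc value and all function names are illustrative assumptions, not values taken from ref. [102].

```python
import math

MU0_OVER_4PI = 1.0e-7         # T m / A
HBAR = 1.054571817e-34        # J s
# Gyromagnetic ratios in rad s^-1 T^-1 (15N is negative)
GAMMA = {"H": 2.6752218744e8, "N": -2.7126e7, "C": 6.7283e7}

def p2(theta_deg):
    """Second Legendre polynomial of cos(theta): (3 cos^2 theta - 1) / 2."""
    c = math.cos(math.radians(theta_deg))
    return 0.5 * (3.0 * c * c - 1.0)

def dipolar_coupling(nuc_a, nuc_b, r_m):
    """Dipolar coupling constant (rad/s) for two nuclei separated by r_m metres."""
    return MU0_OVER_4PI * HBAR * GAMMA[nuc_a] * GAMMA[nuc_b] / r_m**3

def rigid_dd_ccr_rate(theta_deg, tau_c, r_nh=1.02e-10, r_ch=1.09e-10):
    """Rigid-limit HN-N / Ha-Ca dipole-dipole CCR rate in s^-1.

    Assumes slow tumbling (only the zero-frequency spectral density matters)
    and the default bond lengths given above, which are assumptions.
    """
    d_nh = dipolar_coupling("H", "N", r_nh)
    d_ch = dipolar_coupling("H", "C", r_ch)
    return 0.4 * d_nh * d_ch * tau_c * p2(theta_deg)

for theta in (0.0, 54.7, 90.0):
    print(theta, rigid_dd_ccr_rate(theta, tau_c=4.0e-9))
```

The rate changes sign at the magic angle, which is why a single CCR rate maps onto a Karplus-type curve with several possible dihedral solutions. Because the rate scales linearly with τc in this limit, an independent estimate of τc is needed before angles can be extracted.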
Following the development of the original ubiquitin RDC-based ensemble (EROS), a new ensemble was generated using MD simulations restrained with NOE and 1HN/15N RDC data; it was named ensemble refinement for native proteins using a single alignment tensor (ERNST) [118]. Compared with the X-ray structure, the conventional NMR bundle, unrestrained MD, and the EROS ensemble, ERNST best reproduced the experimental CCR rates. The CCR rates used reported on successive 1HN/15N vectors and on intra-residue fluctuations between 1HN/15N and 1Hα/13Cα vectors. A correlation between φ and ψ dihedral angles across all residues was used to test for the existence of correlated motion within ubiquitin. A pathway of individual residues across several beta strands, which compose the primary site for ubiquitin binding partners, was shown to fluctuate in an anti-correlated fashion [118]. Importantly, this work confirmed that CCR data are as sensitive to supra-τc motion as RDC data. A recent report utilised a collection of four different CCR rates coupled with RDC data to create an ensemble of GB3 restrained with both data types [119]. To directly assess correlations between inter-nuclear vectors, the ratio of the experimental rate to the product of the RDC order parameters and the rigid rate was computed. Positive (negative) ratios were proposed to stem from correlated (anticorrelated) fluctuations arising from slower processes within the supra-τc range that cause a decrease (increase) in the decay lifetime of the auto-correlation function that describes the CCR rate. This semi-quantitative method is illustrative for analysing CCR-based ensembles, but the complexity of motion between two inter-nuclear vectors still requires further development of advanced analytical methods.
Furthermore, ensemble generation techniques that evaluate the minimum data required (a combination of RDC and CCR data) would be valuable and would open the door to a variety of systems to be studied by this approach. The information content of RDC and CCR data has illuminated the existence of significant motional amplitudes within the supra-τc range for several systems. However, the kinetics associated with these ground-state fluctuations remain a challenge to characterise. Fortunately, recent technological advancements have provided the tools to access the kinetics of these rapid processes [120, 121].

Accessing the kinetics of the supra-τc range

Although conventional NMR methods are ideal for extracting the amplitudes of motions occurring within the sub-τc range, these methods are not suited to determining kinetic information within the supra-τc range. NMR-based relaxation dispersion (RD) spectroscopy has emerged as a technique that can directly measure motions as slow as tens of milliseconds and, now, processes as fast as three microseconds [120-122]. The data can be used to extract parameters that directly report on kinetics, thermodynamics, and structural features. The theoretical framework for RD to study simple two-state and multi-state processes has been addressed elsewhere, and thus we direct the reader to the following publications for more in-depth discussions [25, 123-125]. Briefly, NMR RD exploits the phenomenon of chemical exchange, which is ascribed to the modulation of the isotropic chemical shift (Fig. 2). When a biomolecule samples discrete structural states, each state has a unique chemical shift, giving rise to distinct magnetization vectors that interconvert with one another (kinetics) and accumulate a phase difference (chemical shift difference). When this occurs within the transverse plane, chemical exchange increases resonance line-widths. The contribution of chemical exchange therefore manifests as an addition to the transverse relaxation rate (R2). This contribution can be monitored by following the change in the relaxation rate as a function of an effective radio-frequency (RF) refocusing field, which can depend on the amplitude of the RF field as well as on the frequency offset between where the RF field is applied and the queried resonance [33]. The resulting dependence (Fig. 2) is subsequently fit to models that extract kinetic and structural information. In the case of ubiquitin, the RDC-based ensembles [15, 118] indicate that the sampled bound-like conformers must interconvert with one another. Traditional RD failed to detect any motion; it was therefore hypothesised that these motional events would be faster than the previously accepted fastest lifetime detectable by NMR RD (~40 μs). Transverse rotating-frame (R1ρ) RD experiments applied under supercooled conditions (265 K) were used, and an interconversion lifetime of approximately 120 μs was detected. Through an Arrhenius extrapolation, the lifetime for conformational sampling within ubiquitin was found to be ~10 μs at physiological temperature [126]. Interestingly, chemical shift predictors applied to the RDC-based ensembles were able to detect large chemical shift variances for the same residues found to show dispersion in the super-cooled experiments.
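
As an illustration of the kind of model to which a dispersion curve is fit, the sketch below evaluates the widely used two-state, fast-exchange expression for an on-resonance R1ρ experiment, R1ρ = R2,0 + pA pB Δω² kex / (kex² + ω1²). The text above does not specify which model was used in the cited studies, so this choice, the function name and the parameter values are assumptions made only for illustration.

```python
import math

def r1rho_on_resonance(nu1_hz, r2_0, p_b, delta_nu_hz, k_ex):
    """On-resonance R1rho (s^-1) for two-state exchange in the fast-exchange limit.

    nu1_hz      : spin-lock field strength in Hz
    r2_0        : exchange-free transverse relaxation rate in s^-1
    p_b         : population of the minor state (p_a = 1 - p_b)
    delta_nu_hz : chemical shift difference between the two states in Hz
    k_ex        : exchange rate constant in s^-1 (lifetime = 1 / k_ex)
    """
    omega1 = 2.0 * math.pi * nu1_hz
    delta_omega = 2.0 * math.pi * delta_nu_hz
    phi_ex = (1.0 - p_b) * p_b * delta_omega**2
    return r2_0 + phi_ex * k_ex / (k_ex**2 + omega1**2)

# Hypothetical parameters: 10 us lifetime, 10% minor state, 100 Hz shift difference
for nu1 in (0.0, 1000.0, 6400.0, 50000.0):
    print(nu1, r1rho_on_resonance(nu1, r2_0=10.0, p_b=0.1, delta_nu_hz=100.0, k_ex=1.0e5))
```

In this expression the exchange contribution falls to half of its zero-field value when the angular spin-lock frequency equals kex. A dispersion profile is therefore only well defined when the available RF fields reach the exchange rate, which is the motivation for stronger spin-lock fields.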
This indicates that structural ensembles which incorporate supra-τc information have predictive power for residues that may display RD [126]. Traditional NMR RD was limited to motions with lifetimes longer than ~40 μs; however, recent developments in cryogenically cooled probeheads (cryoprobes) and experimental methodology have extended this limit to the hundreds of nanoseconds range, which provides direct access to the supra-τc range [88, 120]. “High-power” RD experiments exploit the improved power handling of modern cryoprobes, together with modern pulse methodology and heat compensation techniques, to generate large-amplitude RF fields [120]. This is important because the fastest lifetime of motion that can be observed is approximately equal to the inverse of the amplitude of the effective spin-lock field. It was found that modern cryoprobes could safely withstand spin-lock pulses with an amplitude of up to 6.4 kHz, applied for as long as 120 ms, for 15N nuclei. This translates to the detection of lifetimes as fast as 25 μs. High-power NMR RD measured on ubiquitin corroborated the previous super-cooled RD experiments, and could detect nuclei that display smaller-amplitude motions (smaller chemical shift differences). This advantage stems from the ability to sample the dispersion more comprehensively with larger effective spin-lock fields. The minimum detectable lifetime was further shortened by moving to nuclei of larger gyromagnetic ratio, which is also advantageous because less power is deposited into the probe [121, 127]. High-power RD has been applied to both 13C and 1H nuclei, which improved the time resolution to 13 and 3 μs for 13C and 1H, respectively [121].
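
The relation between spin-lock amplitude and the fastest observable lifetime quoted above can be checked with a one-line calculation. The sketch assumes that "inverse of the amplitude" refers to the field expressed as an angular frequency, ω1 = 2πν1; this reading and the function name are assumptions, chosen because they reproduce the 25 μs figure given for a 6.4 kHz field.

```python
import math

def fastest_lifetime_s(nu1_hz):
    """Approximate shortest detectable exchange lifetime for a spin-lock field nu1 (Hz).

    Assumes the limit is 1 / omega1 with omega1 = 2 * pi * nu1.
    """
    return 1.0 / (2.0 * math.pi * nu1_hz)

# 6.4 kHz is the 15N spin-lock amplitude quoted in the text
print(fastest_lifetime_s(6400.0) * 1e6, "microseconds")
```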
Measuring RD for multiple nuclei types provides a larger data set, which increases the precision of global kinetic parameters, and greater coverage over the entirety of a biomolecule; moving to nuclei with larger gyromagnetic ratios also makes smaller contributions of chemical exchange to RD detectable. 1HN RD coupled with super-cooled conditions has permitted the detection of a significant global exchange event for residues within the β-turn of protein GB3, which would not be detectable by the original high-power 15N RD. This motion occurs with a lifetime of approximately 400 ns at physiological temperature and was attributed to changes in hydrogen bonding within a region that lies in the antigen binding site [88]. Furthermore, the motion was corroborated by RDC-derived order parameters and by an increased structural variance within the RDC-derived ensemble [88]. The combination of 15N, 13C, and 1HN high-power RD has revealed interesting features of protein behaviour. For the proteins ubiquitin and GB3, high-power methyl 13C and 1H RD revealed comparable microsecond fluctuations in the 20-55 μs range [121]. Interestingly, only methyl 13C nuclei displayed observable RD, while the attached 1H nuclei remained RD silent. The only motional event that can cause RD for 13C methyl groups but keep the 1H nuclei silent is one that acts through the γ-effect. The γ-effect refers to the difference in chemical shift of a carbon that is anti-periplanar or synclinal to another carbon. Importantly, for methyl-bearing side chains with two 13C methyl groups, the observed amplitudes between each group were disparate. To reconcile these differences, the “population shuffling” model was proposed [121]. In this model, the slower microsecond ground-state fluctuations of the backbone and side-chain moieties within the protein cause the rotameric populations to redistribute (the interconversion between rotamers is still orders of magnitude faster than the macrostate interconversion). The population shuffling model was also applied to methyl groups that sample sparsely populated states (<5%) and whose macrostate interconversion rates are orders of magnitude smaller than those of ubiquitin and GB3 [128]. It appears that population shuffling may be a general phenomenon for many proteins. An unanswered question was whether ubiquitin’s multi-specificity for its variety of binding partners stems only from its canonical binding interface or whether the process is global and allosterically coupled through the rest of the protein. This is of keen interest given the presence of motion at sites distal to the canonical binding site. Furthermore, the motion for all sites and nuclei types could be fit globally to a common timescale. A computational method was developed that takes a set of MD simulations and derives an optimised linear mode that best explains the RD data [129]. The RD-weighted mode that fulfilled the experimental RD data represents a structural model along the reaction coordinate on which the microsecond motion occurs. This is one of the first atomic models for a fast-exchange transition in the ground state. It contrasts with other models in which the interconversion between states is in the slow regime, permitting the extraction of chemical shifts of the ground and lowly populated states [130]. These experimental and analytical approaches can be applied to a wide variety of systems, making high-power RD experiments a potentially vital tool for uncovering how other proteins function.

Slower motions by NMR based methods: RD to Real-time

For motions that extend past the supra-τc range, other NMR RD-type experiments have excelled in quantifying the kinetics and amplitudes of motions for biomolecules between 40 μs and tens of milliseconds [122, 131]. For the past two decades, two particular experiments have been the focus of methodological development: the transverse rotating-frame (R1ρ) and Carr-Purcell-Meiboom-Gill (CPMG) experiments. Both monitor the transverse relaxation rate as a function of external spin-locking fields, either by manipulation of the amplitude and/or offset of an applied radio-frequency field [131] or by varying the inter-pulse delay in a train of 180° pulses [122], for R1ρ and CPMG experiments, respectively (Fig. 2). Traditional RD experiments have focused on the measurement of backbone 15N nuclei, but a variety of experiments now exist to measure all nuclei types across the backbone and side-chain moieties [122, 130, 132]. They have been widely applied to study a variety of processes including enzyme catalysis [9], folding/unfolding [122], on-/off-events [133], and allostery [134]. They have also been applied successfully to both proteins and nucleic acids [135, 136]. In scenarios where the motion is within the slow regime (i.e. chemical shifts of the lowly populated state can be extracted), structural models of the transient intermediate states can be created at atomic resolution [130, 137]. Also, for slowly exchanging proteins, RDC values that correspond to the lowly populated states can be extracted [138, 139]. Situations can arise in which, for nuclei that undergo motion within the slow exchange regime, the refocusing periods of traditional RD experiments are too short and do not sufficiently sample the exchange event. An alternative that has seen a resurgence is the family of Exchange Saturation Transfer (EST) experiments and their application to biomolecules (Fig. 2) [140-142].
These experiments are as powerful as traditional RD methods for extracting kinetic, thermodynamic, and structural parameters. Two particular EST techniques exist, Chemical Exchange Saturation Transfer (CEST) [141] and Dark State Exchange Saturation Transfer (DEST) [143], but both are identical in principle. These experiments monitor the attenuation of longitudinal magnetization at a particular RF field strength (usually between 10 and 500 Hz) as a function of the offset frequency. Since the spin-lock field is applied for several hundreds of milliseconds, many more exchange events are captured during the experiment, increasing its sensitivity to slowly exchanging events. These experiments have been very successful in elucidating the structural and kinetic features that surround the misfolding of proteins [20, 21, 144] and the formation of amyloids [143, 145], providing insight into how small proteins can interact with substrates greater than 1 MDa in size. Furthermore, these experiments have been subject to further methodological development and can benefit from high-power techniques as well [146, 147].

Finally, when the time scale of the underlying process is longer than the time it takes to record a single free induction decay, it is possible to record multiple spectra (Figs. 1 & 2). In some cases, the process is slow enough that multidimensional spectra can be collected repeatedly. This is called real-time NMR (Fig. 1). Spectra collected over time can display changes in chemical shift and/or peak intensity. This approach has been applied to study the slow turnover of substrates by enzymes [148], folding processes [149-151], the monitoring of chemical synthesis [152, 153], and the identification of structural changes in materials [154, 155]. Real-time NMR has also benefited from fast acquisition [156] and hyperpolarization [157] techniques, which not only improve the rate at which data can be collected but also increase the signal-to-noise of the data.

3. NMR approaches to large molecular systems

Optimised isotope labelling and R2 relaxation

Biomolecules in living systems often assemble into huge complexes with molecular masses beyond 100 kDa that function as supramolecular machinery. Examples of such complexes include molecular chaperones, molecular motors such as F1/F0-ATPase and protein degradation complexes like proteasomes. Although it used to be challenging to deal with larger (>30 kDa) biomacromolecules using solution NMR techniques, recent developments in this field have substantially raised the molecular size limit, enabling the analysis of larger biomolecular systems. Protein complexes over 1 MDa have been well characterised using solution NMR spectroscopy. In this section, we outline current state-of-the-art approaches to larger biomolecular systems using solution NMR techniques. Solution NMR analyses of large biomacromolecules are hampered primarily by two factors. One is substantial signal overlap due to an increasing number of observable nuclei, which complicates the resulting spectra and leads to ambiguity in NMR signal assignments. The other is severe line-broadening of signals due to enhanced R2 relaxation (caused by the increase in molecular weight), leading to loss of observable NMR signals. As discussed in the previous section, huge molecules rotate very slowly in solution and therefore exhibit fast R2, yielding severe line-broadening. Of note, even a small molecule can show significant line-broadening of its signals when it tumbles slowly in complex with a larger binding partner, which makes detailed characterisation of biomolecular assemblies difficult. NMR signals are only detected from nuclei bearing a nuclear spin. In biomolecular NMR, the nuclei typically used have a spin of 1/2 and include 1H, 13C, 15N, 19F and 31P.
Therefore, the replacement of particular atoms with isotopes can be used for spectral editing and, in some cases, can reduce the relaxation rate of particular nuclei by reducing contributions from dipolar relaxation. Isotope labelling can be achieved through metabolic and biosynthetic pathways of bacterial and eukaryotic expression systems, as well as in cell-free expression systems [158-161]. One of the most straightforward and efficient strategies to circumvent the molecular size problem in solution NMR spectroscopy is deuteration of target proteins, which suppresses peak overlap and reduces line-broadening by removing proximal protons that contribute to the dipolar relaxation of a given nucleus. The dipolar relaxation contribution between a proton and a deuteron is reduced by a factor of 16 compared with the dipolar relaxation between two protons. This simplifies the spectra and increases the sensitivity of data acquisition. Uniform deuteration (perdeuteration) of proteins is conventionally achieved by growing Escherichia coli in deuterated media. Although perdeuteration is effective for improving the spectral quality of larger proteins, it comes with a trade-off: the loss of protons reduces the number of structural restraints available for resolving the 3D structure of the target protein. Therefore, ‘selective protonation’ in a deuterated background is a favourable approach that provides narrow proton peaks as a source of structural restraints while minimizing peak overlap. Methyl groups are usually the preferred protonated positions in larger proteins because methyl signals have more favourable properties than methylene and methine signals. This is because methyl groups have reduced transverse relaxation due to their rapid motion, coupled with a 3-fold higher signal intensity because of the three equivalent protons attached to each methyl carbon.
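
The factor of 16 quoted above for deuteration can be recovered from a short calculation. The sketch assumes that the dipolar relaxation caused by a neighbouring spin S scales as γS² S(S+1), so that replacing a proton (spin 1/2) by a deuteron (spin 1) at the same distance changes the contribution by the ratio of these terms. The function names are illustrative, and identical distances and spectral densities are assumed.

```python
GAMMA_1H = 2.6752218744e8   # proton gyromagnetic ratio, rad s^-1 T^-1
GAMMA_2H = 4.10662791e7     # deuteron gyromagnetic ratio, rad s^-1 T^-1

def dipolar_strength(gamma, spin):
    """Relative dipolar relaxation efficiency of a neighbouring spin: gamma^2 * S(S+1)."""
    return gamma**2 * spin * (spin + 1.0)

def proton_to_deuteron_ratio():
    """How much weaker the dipolar contribution becomes when 1H is replaced by 2H."""
    return dipolar_strength(GAMMA_1H, 0.5) / dipolar_strength(GAMMA_2H, 1.0)

print(round(proton_to_deuteron_ratio(), 1))
```

The squared ratio of gyromagnetic ratios alone would suggest a larger reduction; the higher spin quantum number of the deuteron partly offsets it, which gives a value close to 16.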
The methyl-selective protonation schemes, which leave the aliphatic branch 12C labelled while 13C labelling the terminal methyl carbons of leucine (δ1, δ2 positions), valine (γ1, γ2 positions) and isoleucine (δ1 position) residues, are easily achieved by E. coli protein expression in a minimal medium with commercially available metabolic precursors, α-ketobutyrate and α-keto-isovalerate (Fig. 3a) [162, 163]. Metabolic labelling techniques have also been reported for methyl-selective protonation of threonine, alanine and methionine residues [164-168], and for selective protonation of the aromatic rings of tyrosine, phenylalanine and tryptophan residues [169-172]. The spectra derived from methyl selectively protonated samples give a limited number of strong signals, which facilitates peak picking and resonance assignment. In addition, the methyl groups and aromatic rings tend to be located in the core regions of proteins through hydrophobic interactions and provide important distance information for determining the global folds of proteins. It is also important to note that selective protonation has now become feasible in terms of cost and effort. Despite the significant advantage in determining accurate global folds of proteins, methyl-selective protonation within a perdeuterated background provides limited information if one attempts high-resolution structure determination. This is particularly true for ‘hydrophilic’ protein surfaces or regions where methyl-containing amino acids are sparse. The approach is thus not fully optimised for both suppressing transverse relaxation and maximising the structural information acquired. Kainosho and coworkers designed stereo-array isotope labelling (SAIL), which simultaneously achieves a 4- to 7-fold increase in signal-to-noise, narrower resonance lines, and a 40%–60% reduction in the number of signals without sacrificing essential information regarding the conformations of the backbones and side chains of all amino acid residue types (Fig. 3b) [173]. The SAIL technique uses 20 synthesised amino acids with stereospecific and regiospecific arrangements of stable isotopes, thereby enabling reduced spectral overlap, complete stereospecific assignment and the collection of longer inter-proton distance information.
This approach has made it possible to determine high-quality solution 3D structures of proteins with molecular masses higher than 30 kDa (Fig. 3c and 3d). Several types of SAIL amino acids are currently available for specific applications such as the characterisation of conformational dynamics [174] and conformational isomerism of disulphide bonds [175]. In addition to these sophisticated isotope labelling techniques, TROSY-type experiments are currently essential tools for NMR studies of large biomacromolecules [176]. TROSY-type modification of pulse sequences can significantly improve the signal-to-noise of resonances by providing sharper cross peaks through the selection of a coherence for which the cross-correlation between the dipolar and CSA interactions attenuates the relaxation rate (vide supra). The original TROSY concept was proposed for the measurement of the 1H–15N correlation of backbone amides [176] and the 1H–13C correlation of aromatic side chain resonances [177], but has subsequently been incorporated into various triple-resonance and NOE-type experiments [178]. In particular, TROSY experiments for the measurement of methyl groups (methyl-TROSY) [179] are extraordinarily powerful for studying larger biomolecular complexes greater than 100 kDa. This has yielded many outstanding applications that are described below.

Advanced techniques for the rapid collection of data sampling and NMR structural information

The spectra of large biomacromolecules suffer from severe peak overlap, which can make resonance-specific assignments intractable. Typically, this is resolved by increasing the dimensionality of an experiment: the additional dimension provides new nucleus-specific correlations that simplify assignment in frequency space. However, it is often difficult to improve the spectral resolution and the signal-to-noise ratio of multi-dimensional NMR measurements within a reasonable experimental time. This is especially problematic when dealing with proteins that have poor solubility and/or are unstable over time. Among the many approaches to address this issue, the most robust and straightforward is to employ non-uniform sampling (NUS) [180] in combination with spectral reconstruction utilising non-Fourier transform methods (Fig. 4). Generally speaking, NUS is the collection of only a subset of data points from the full experimental data matrix. The sparsely sampled data matrix is then fully reconstructed, with the missing points interpolated by various processing methods. When this approach was first applied to multi-dimensional NMR data, an exponential weighting function was employed to select the points to be measured within the sparsely sampled indirect dimensions [181]. Later, to minimise the undesirable effects of large gaps between data points and of biased distributions, Hyberts et al. proposed a sparse sampling scheme that collects data points using a sinusoidal-weighted Poisson distribution, so-called Poisson-gap sampling [182, 183]. NUS data require the reconstruction of missing data points to obtain a complete spectrum. Maximum entropy (MaxEnt) reconstruction has been widely used in the field of NMR [184, 185] and estimates the missed points with a priori metrics based on the principle of MaxEnt from information theory. Improved versions of MaxEnt have also been developed, such as forward MaxEnt [186] and quantitative maximum entropy (QME) [187]. Decomposition algorithms for multi-way arrays, such as multi-dimensional decomposition (MDD) [188], recursive MDD [189] and co-processing MDD [190], as well as SIFT [191], have been proposed; they reproduce signal intensities quantitatively and have therefore been employed for the analysis of NOE and relaxation data, where accurate peak intensities are imperative [192, 193]. More recently, a state-of-the-art reconstruction method called compressed sensing (CS) has successfully reconstructed multidimensional NMR spectra from NUS data [194, 195]. CS minimises the L1-norm of the spectrum, based on the prior assumption that a spectrum is sparse, i.e. that most of its regions contain no information (zero values). Many optimisation algorithms for L1- or L0-norm minimisation have been developed, including iterative soft thresholding [196, 197], iteratively reweighted least squares [194], NESTA [198] and sparse multidimensional iterative lineshape-enhanced (SMILE) spectral reconstruction [199]. Using NUS and these reconstruction methods, the number of sampling points can be reduced without losing resolution, and the time saved can be used to improve sensitivity by increasing the number of scans. In classical NMR structure analysis, the most commonly used structural information includes torsion angles derived from chemical shifts and scalar couplings, and short-range (up to approximately 5 Å) inter-proton distances from NOEs. However, structure determination of larger molecular systems often suffers from insufficient distance restraints due to severe signal overlap and fast transverse relaxation.
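
To make the compressed sensing reconstruction described above more concrete, the following is a minimal one-dimensional sketch of iterative soft thresholding applied to a synthetic NUS signal. It is not the implementation of refs. [196, 197]: the uniformly random sampling schedule (rather than Poisson-gap), the noise-free two-peak test signal, the threshold level and all function names are assumptions chosen to keep the example short.

```python
import numpy as np

def ist_reconstruct(fid_nus, mask, n_iter=1000, level=0.95):
    """Reconstruct a spectrum from a zero-filled, non-uniformly sampled 1D FID.

    fid_nus : complex FID with zeros at the unsampled points
    mask    : 1.0 where a point was measured, 0.0 elsewhere
    level   : threshold as a fraction of the largest residual peak
    """
    spectrum = np.zeros(mask.size, dtype=complex)
    residual = fid_nus * mask
    for _ in range(n_iter):
        res_spec = np.fft.fft(residual)
        mag = np.abs(res_spec)
        peak = mag.max()
        if peak == 0.0:
            break
        threshold = level * peak
        # Soft thresholding: keep only the part of each point above the threshold
        scale = np.where(mag > threshold, 1.0 - threshold / np.maximum(mag, 1e-300), 0.0)
        spectrum += res_spec * scale
        # Compare the current model with the data at the measured points only
        residual = (fid_nus - np.fft.ifft(spectrum)) * mask
    return spectrum

def demo():
    """Two on-grid signals (bins 40 and 100), 96 of 256 points sampled at random."""
    n = 256
    t = np.arange(n)
    fid = np.exp(2j * np.pi * 40 * t / n) + 0.5 * np.exp(2j * np.pi * 100 * t / n)
    rng = np.random.default_rng(0)
    mask = np.zeros(n)
    mask[rng.choice(n, size=96, replace=False)] = 1.0
    return ist_reconstruct(fid * mask, mask)

spec = demo()
print(np.argsort(np.abs(spec))[-2:])
```

Each iteration moves the strongest part of the residual spectrum into the model and then recomputes the mismatch at the measured time points, so sampling artefacts shrink together with the peaks that cause them. Real data are multi-dimensional, noisy and contain decaying signals, which this sketch does not address.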
Thus, long-range structural restraints are useful for increasing the accuracy and precision of 3D structure determination or for validating structural models. The RDC [200], paramagnetic relaxation enhancement (PRE) [201] and pseudo-contact shift (PCS) [202, 203] are essential structural data for modern NMR analysis of larger systems. The RDC provides the orientation of scalar-coupled spin pairs relative to an external alignment tensor (vide supra) [200, 204-206]. PRE occurs between a nucleus and an unpaired electron of a paramagnetic atom that has an isotropic g-tensor. This relaxation enhancement is especially important as it encodes long-range distance information (as far as 35 Å) [207]. PCS is only observed for paramagnetic ions with an anisotropic g-tensor, usually lanthanide ions, and provides information on the distance and angles of the nuclear spin with respect to the magnetic susceptibility tensor of the metal ion. It is notable that PCS offers longer distance information (as far as 40 Å) [207] than PRE because the magnitude of PCS depends on the inverse third power of the distance, whilst PRE depends on the inverse sixth power. Several groups have developed methods that enable the structure determination of large molecules by combining NMR data with X-ray and neutron scattering data [208-211]. Venditti et al. recently applied a hybrid method to the structural analysis of autophosphorylation of enzyme I (EI) [212], a 128-kDa dimer composed of two domains (EIN and EIC), and of its complex with the histidine phosphocarrier (HPr), with a total molecular mass of 146 kDa. EIN is further divided into two subdomains: EINα/β, which contains the active site for phosphoryl transfer, and EINα, which interacts with HPr.
Three crystal structures of intact EI, from different organisms, showed that the orientation of the EINα subdomain relative to the EINα/β subdomain differed greatly between the three structures, even though the orientation of the EIC domains remained identical. The authors elucidated the orientation of the domains in solution using RDCs, small- and wide-angle X-ray scattering and small-angle neutron scattering. The EIN and EIC domains were treated as rigid bodies, while the linker between them was allowed to vary during the simulated annealing protocol, restrained by the experimental data. The obtained structure was not consistent with the three crystal structures, and the scattering data back-calculated from the three crystal conformers differed substantially from the observed data. The results obtained by the hybrid approach indicate that crystal packing affected the domain orientation between EINα and EINα/β. This highlighted that conformational transitions between domains in solution can be larger in amplitude than what is captured in the crystal.

The current cutting edge: NMR of larger proteins (> 30 kDa)

For NMR protein structure determination, the statistics on the number of PDB entries clearly show that a significant obstacle remains at 30 kDa (Fig. 5a and 5b). The largest single-polypeptide protein whose atomic coordinates have been determined exclusively from NMR experimental data has thus far been the 723-residue enzyme malate synthase G (82 kDa), by Kay and co-workers [213]. Employing methyl-selective protonation within a perdeuterated background, data acquisition with NUS, spectrum reconstruction by MDD and the use of RDCs in addition to NOE-derived distance restraints, the authors achieved 95% backbone resonance assignment, and stereospecific assignments for ~90% and ~80% of valine and leucine methyl groups, respectively [214]. The backbone root mean square deviations (RMSDs) among the 10 lowest-energy structures and to the X-ray crystal structure (PDB ID: 1D8C) were 2.9 and 4.1 Å, respectively. Another example of a 3D structure of a large protein determined exclusively from NMR data is the two-domain 41-kDa maltodextrin-binding protein (MBP), whose structure was first determined on the basis of NOEs between amide and methyl protons and RDCs [215]; subsequently, a more accurate and precise 3D structure model was proposed based on data obtained by the SAIL approach (Fig. 3c and 3d) [173]. RMSDs of the backbone and all side chain heavy atoms to the mean coordinates for the SAIL-MBP structure were approximately 0.7/0.8 and 1.0/1.1 Å (C-terminal/N-terminal domains), respectively. As the average molecular mass of a eukaryotic protein is 40 kDa [216], the aforementioned achievements establish that solution NMR spectroscopy can be used to determine the structures of a considerable number of proteins. TROSY-based approaches offer a new avenue for the characterisation of dynamical structures and interactions of huge protein complexes. Most recently, this approach was applied to a 94 kDa complex to identify dynamic changes within Nucleophosmin 1 that contribute to its liquid-like phase separation behaviour [202]. Furthermore, Rosenzweig et al. elucidated the structural basis of protein disaggregation in the complex of the 580 kDa ClpB hexamer and 70 kDa DnaK, which serves as an ATP-dependent molecular chaperone [217]. Despite its biological importance, the ClpB-DnaK interaction is weak, and therefore the structural mechanism behind disaggregation had remained poorly understood.
Titration and PRE experiments conducted with selectively methyl-protonated samples identified the binding interface. This experimental information was supplied to the HADDOCK molecular docking programme in order to produce the first structural model of the ClpB/DnaK complex, which was validated with biochemical and/or biophysical information [218, 219]. Similarly, Huang et al. recently addressed the mechanism of molecular recognition by the chaperone SecB, a 70 kDa protein that forms a rectangular disc-like shape with four subunits, each consisting of 155 residues [220]. This protein exhibits unusually strong activity in maintaining secretory proteins in an unfolded state prior to their transmembrane transport. The transport is later mediated by the Sec translocase, which also acts as a chaperone preventing protein aggregation. The authors employed maltose-binding protein (396 amino acids) and alkaline phosphatase (PhoA, 471 amino acids) as substrates of SecB, and determined the structures of their complexes from NOE-/PRE-derived distance restraints, dihedral angle restraints from chemical shifts and the crystal structure of SecB in order to define an atomic model for how SecB interacts with unfolded proteins [221, 222]. It is particularly notable that the SecB-PhoA complex has a molecular mass of ~120 kDa. The same group also performed NMR characterisation of SecA, a 204-kDa ATPase motor of the Sec translocase [223], and of the trigger factor chaperone in complex with PhoA (~100 kDa) [224].

Kay and co-workers have intensively investigated the archaeal 20S proteasome and its complex with two 11S regulatory particles. This is one of the largest biomolecular complexes (beyond 1 MDa) analysed using solution NMR-based methods (Fig. 5c) [225-232]. The 20S proteasome forms a symmetric barrel that has proteolytic active sites located within its lumen and a gate for substrate entry at each end of the barrel.
The authors site-specifically spin-labelled residues within the gate and subsequently utilised the side-chain methyl groups of methionine residues as a means to measure PRE effects. This revealed a conformational equilibrium for residues within the gate of the proteasome that was ascribed to the opening and closing of lid domains [227]. However, a two-dimensional magnetisation exchange experiment required a three-state model to describe the conformational equilibrium and showed that the process occurs on the seconds time scale.

4. NMR-based analysis of protein structure in living cells

Isotope labelling of target proteins in cells

A remarkable application of solution NMR is the in situ observation of protein behaviour in living cells, which permits the analysis of protein structures and dynamics in a non-invasive manner. The study of biomolecules in living cells using NMR-based approaches is called in-cell NMR and is currently the only spectroscopic technique able to study biomacromolecules inside cells at atomic resolution. A final goal in
the study of ‘dynamical ordering’ is to elucidate the mechanisms behind the assembly and disassembly of various biomolecules that result in the integrated functions of living systems. In addition to biomolecular investigations in dilute solution, direct observation of target molecules in living systems by in-cell NMR could provide new insights into the 3D structures, dynamics and various binding events of biomolecules in cells. In-cell NMR was first reported by the Dötsch group, who recorded 2D 1H–15N HSQC spectra of a small (7 kDa) N-terminal metal-binding domain of the Tn501 mercuric ion reductase (NmerA) in living E. coli cells [233]. Since then, the method has been expanded from prokaryotic to eukaryotic cells such as yeast (Pichia pastoris) [234], cultured insect cells (Sf9) [187], Xenopus laevis oocytes [235, 236], Chinese hamster ovary cells (CHO-K1) [237], rat neuronal cells (B65 and RCSN-3) [238] and cultured human cells (HeLa [239], HEK293 [240, 241], A2780, neuronal SK-N-SH, etc. [238]). Incorporation of NMR-active nuclei permits observation of only the desired molecules or atoms, even in complex environments containing an enormous number of other compounds and biomacromolecules. The specific labelling of target molecules inside cells with stable isotopes is achieved mainly by two approaches: (1) intrinsic over-expression of proteins in cells and (2) incorporation of stable isotope-labelled molecules by importing them through the cellular membrane (Fig. 6). Intrinsic over-expression, which was employed in the first report of in-cell NMR using E. coli cells [233], enables isotope labelling of target proteins by exchanging the unlabelled cell culture medium for one containing stable isotope-labelled nutrients prior to the induction of protein expression. Hamatsu et al. adopted the Sf9/baculovirus protein expression system for measuring protein NMR spectra in cultured insect cells [187]. Banci et al.
employed cultured human cells and observed how SOD1 matures inside HEK293 cells, in the course of which the formation of intrasubunit disulphide bonds was observed [241]. The advantages of this ‘over-expression’ approach are that (1) proteins with poor stability or that are difficult to purify can be analysed, (2) native protein production processes are used, in contrast to the ‘incorporation’ method, and (3) in many cases, higher intracellular concentrations can be achieved than with the ‘incorporation’ method, increasing the signal-to-noise ratio of the NMR signals. A major drawback of the over-expression method is that it generates strong background signals, because it cannot completely prevent the expression of other proteins, which also become isotopically labelled. Resonances from these background proteins cause severe signal overlap, making the analysis of spectra challenging. The incorporation of purified isotopically labelled biomolecules from outside, on the other hand, can easily avoid the generation of unwanted resonances. It also permits delivery of various ‘modified’ proteins, for example, those with post-translational modifications, fusion proteins with segmental isotope labelling [242] and proteins conjugated with chemical compounds such as paramagnetic lanthanide-binding tags [243]. Four different methods have been proposed for the ‘incorporation’ approach: (1) microinjection of labelled molecules into cells (particularly X. laevis oocytes), (2) incorporation of proteins with the help of conjugated cell-penetrating peptides (CPPs) [239], (3) introduction of a pore-forming toxin protein, streptolysin O (SLO) [244] and (4) permeabilisation by electroporation [238, 245] (Fig. 6b–e).
Microinjection has yielded several important applications of in-cell NMR, including the observation of interactions between ubiquitin and intrinsic proteins in cells [236], the in-cell study of nucleic acids [246] and a 3D structure model of a protein from in-cell paramagnetic NMR data [247, 248]. One limitation of microinjection is that it is only applicable to exceedingly large cells such as X. laevis oocytes. The CPP method was developed for in-cell NMR measurements of proteins within cultured human cells. The first method utilised the trans-activating transcriptional activator (TAT) CPP tag (CPPTAT) from the Tat protein of HIV-1, which is attached to a target protein and allows it to cross the plasma membrane of cells. CPPTAT is positively charged, consisting mostly of lysine and arginine residues, which likely permits the peptide-linked proteins to pass through the cellular membrane via interaction with the anionic lipid bilayer; the peptide is then cleaved from the cargo protein. Inomata et al. connected CPPTAT with cargo proteins using two strategies: (1) fusion proteins consisting of CPPTAT–ubiquitin cargo, and (2) proteins to which CPPTAT is attached via a disulphide bond. In the former method, the cargo proteins are cleaved by endogenous ubiquitin-specific terminal proteases. In the latter, the disulphide bond with CPPTAT breaks naturally in the reducing environment of the cell. In the first report of this method, the authors successfully addressed the folding stability of proteins in cells and the binding of small molecules to FKBP12 [239]. Another example of this method was an in-cell NMR study of human calbindin D9k, where changes in its
Ca2+-bound state could be monitored following changes of the Ca2+ concentration in the cytosol [249]. The pore-forming method utilises SLO to create holes in the cell membrane, which permit proteins in the extracellular space to diffuse into the cell. The pores are repaired by adding Ca2+ ions, which prevents leakage of the translocated molecules [250]. This method can deliver molecules up to 150 kDa in size through pores with a diameter of approximately 35 nm [251]. It has the advantage that no modification of the target protein is required. Electroporation is currently often used for in-cell NMR studies because proteins can easily be incorporated into cells without any modification of the molecules or the cell membrane [238]. However, a common problem for all of the incorporation methods is that they do not achieve as high a protein concentration as the intrinsic over-expression method. The intracellular concentration of the incorporated protein is typically on the order of tens of micromolar at most. Moreover, the incorporation efficiency of each method depends largely on its compatibility with the individual protein. Therefore, to some extent, heuristic approaches have to be taken for each protein in order to find the optimum method and conditions that permit its study within cells.

Frontier applications of in-cell NMR

An intriguing challenge of in-cell NMR studies is to elucidate how the structure and dynamics of biomolecules are maintained in living environments, and how they differ from those in dilute in vitro solution. The interior of cells is a highly crowded environment containing various biomacromolecules, reaching a total of 300–400 mg/mL in E. coli cells [252]. A protein’s structure, stability and dynamics are usually assessed in vitro, in a controlled environment where only specific interactions among a limited variety of molecules are studied. However, in the cell, many non-specific interactions [253, 254] and macromolecular crowding can have dramatic effects on protein behaviour (Fig. 7). Macromolecular crowding yields excluded-volume effects [255], causing inhomogeneity in the distribution of water within the cell [256]. Protein behaviour under these conditions has been vigorously investigated both in living cells and in artificial crowding environments. It was originally proposed that macromolecular crowding stabilises protein folding through the excluded-volume effect; that is, many native protein structures prefer compact conformations, which are promoted by steric repulsion within the densely packed environment. Strictly speaking, the excluded-volume effect is composed of ‘crowding’ and ‘confinement’: crowding refers to the effect of the volume taken up by one soluble macromolecule, which is therefore unavailable to another molecule, and confinement refers to the effect of the limited space prescribed by surrounding macromolecules [255]. Although these effects have been investigated in detail using in vitro crowding systems, many of these studies were performed with only one or a few types of macromolecules as artificial crowders. In the living system, Inomata et al.
analysed protein stability in HeLa cells and concluded that the living cell environment notably decreases the folding stability of human ubiquitin [239]. This contradicted in vitro studies with macromolecular crowding agents. The authors investigated the exchange rate of backbone amide hydrogens with solvent water using NMR H/D exchange experiments (Fig. 8) [239]. A comparison of the H/D exchange rates measured for ubiquitin revealed that exchange was 15–20 times faster in HeLa cells than in vitro. The increased H/D exchange rates were observed not only for residues on the surface of the protein but also for those in the core region. The authors also performed the same H/D exchange experiments with a mutant construct of ubiquitin in which three alanine substitutions (L8A, I44A and V70A) were introduced. These point mutations are located in regions that surround the interaction surface of ubiquitin and were expected to disrupt its myriad interactions. The results showed that the amide H/D exchange rates for the mutant decreased to 30%–60% of those of the wild type, strongly suggesting that the folding destabilisation is due to specific as well as non-specific interactions with other molecules within the cell. In another folding stability study performed under crowding conditions, Danielsson et al. investigated the thermodynamics of mutant forms of the β-barrel protein and radical scavenger Cu/Zn SOD1 barrel [245]. The NMR spectrum of the I35A mutant of SOD1 barrel (SOD1I35A) demonstrated that the mutant is in equilibrium between folded and unfolded conformations in vitro at pH 6.5 and 35°C. Equilibrium constants and free energies of SOD1I35A in mammalian (A2780) and bacterial cells were carefully examined by comparing the changes in peak intensities at several temperatures.
The results showed that the environments inside both types of cells destabilised the folding of SOD1I35A and shifted the equilibrium 4-fold toward the denatured state at
37°C in A2780 cells. The authors attributed this to transient interactions of SOD1I35A with interior surfaces of the cell. Meanwhile, Smith et al. reported a similar study on the 7-kDa globular N-terminal SH3 domain of the Drosophila signal transduction protein drk (drk-N SH3), which is also in thermal equilibrium between folded and unfolded states in buffer at pH 7.2 [257, 258]. The authors introduced 19F at the 5-position of tryptophan side chains by adding 5-fluoroindole to the growth medium and examined the stability of drk-N SH3 in E. coli cells from the peak intensities in the 19F spectra. The results showed that drk-N SH3 was destabilised, as in the case of SOD1 barrel. The melting temperature (folding to unfolding) and the free energy of the unfolded protein either decreased or were unchanged compared with those in buffer. The thermodynamic analysis suggested that the unfolded ensemble of drk-N SH3 may be preferred in cells due to attractive interactions with other biomacromolecules. The same group has also investigated the stability of the IDPs α-synuclein and FlgM in bacterial cells using H/D exchange experiments [259, 260]. Using a fusion construct consisting of folded ubiquitin attached to α-synuclein, only the backbone resonances of α-synuclein could be detected, whereas ubiquitin could only be detected upon lysing the cells. The authors concluded that the disordered state of these IDPs persisted in the crowded cellular environment. In current studies of protein folding pathways and protein stability, sparsely populated transient states of proteins have received considerable attention. It is known that biomacromolecules undergo conformational fluctuations that include sparsely populated states, which can sometimes play indispensable roles in their biological functions (see sections 1 & 2). It is of interest to elucidate whether the populations of the stable and unstable states of proteins in cells are similar to those in vitro.
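To give a sense of scale for the folding-equilibrium shifts discussed above, the following minimal sketch converts a fold change in a two-state equilibrium constant into a free-energy difference, ΔΔG = RT ln(fold change), and estimates an unfolding free energy from folded and unfolded peak intensities. It assumes a strict two-state model and that peak intensities are proportional to populations; the function names and the example intensities are hypothetical and are not taken from the cited studies.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def unfolding_free_energy(i_folded, i_unfolded, temp_k):
    """Two-state unfolding free energy (kcal/mol).

    Assumption: peak intensities are proportional to the populations
    of the folded and unfolded states.
    """
    k_unfold = i_unfolded / i_folded  # K = [U]/[F]
    return -R_KCAL * temp_k * math.log(k_unfold)


def ddg_from_fold_shift(fold_shift, temp_k):
    """Free-energy change (kcal/mol) equivalent to a fold shift in K."""
    return R_KCAL * temp_k * math.log(fold_shift)


if __name__ == "__main__":
    temp_k = 310.15  # 37 degrees C
    # Hypothetical intensities: folded peak four times the unfolded peak.
    dg = unfolding_free_energy(4.0, 1.0, temp_k)
    ddg = ddg_from_fold_shift(4.0, temp_k)
    print(f"unfolding free energy: {dg:.2f} kcal/mol")
    print(f"4-fold shift: {ddg:.2f} kcal/mol")
```

Under these assumptions, a 4-fold shift toward the denatured state at 37°C corresponds to roughly 0.85 kcal/mol of destabilisation, which is small compared with typical folding free energies but large enough to change populations measurably for a marginally stable protein.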
Latham et al. used an RD-based approach for proteins in cell lysates to investigate folding processes within a crowded environment that contains various unknown biomolecules [261, 262]. Although lysates lack complicated intracellular structures, it is still worthwhile to study biomolecules in them, as they are certainly closer to physiological conditions than artificial molecular crowding agents. The authors investigated the dynamics of the four-helix-bundle FF domain from human HYPA/FBP11 (FF domain) in cellular lysates of E. coli and Saccharomyces cerevisiae [262]. The results showed that the populations of the intermediate states in both lysates were very similar to those in buffer, but the exchange rates between the ground and excited states decreased to approximately 64% and 44% of those in buffer in the E. coli and S. cerevisiae lysates, respectively. This suggested that the kinetics and thermodynamics of FF domain fluctuations were somewhat perturbed in the crowded environment. The authors also performed CPMG RD experiments for the 20S proteasome in an archaeal lysate from the thermophile Thermus thermophilus to address the exchange rate of proteasome gating under a more physiologically relevant condition [232]. They demonstrated that the populations of the open and closed forms in cell lysate did not change compared with those found in vitro and that the gate opening/closing exchange rate decreased only slightly. The authors concluded that the crowding environment, at least in the lysate, has little influence on the kinetics and thermodynamics of the gating reaction of the 20S proteasome, suggesting that the system is particularly robust, as it does not partake in non-specific interactions with other molecules under such conditions. It has been shown that solution NMR provides structural and dynamic information, including inter-atomic distances and dihedral angles, on biomacromolecules inside living systems.
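As an illustration of how a reduced exchange rate at unchanged populations would alter a CPMG relaxation dispersion profile, the sketch below evaluates the Luz-Meiboom fast-exchange expression, R2,eff = R2,0 + (pA pB Δω² / kex) [1 − (4ν/kex) tanh(kex/(4ν))]. This is a simplified two-state model valid only when kex is much larger than Δω; it is not necessarily the model used in the cited studies, and all numerical values (populations, Δω, kex, R2,0) are hypothetical.

```python
import math


def r2_eff_fast_exchange(nu_cpmg, r2_0, p_b, delta_omega, k_ex):
    """Effective R2 (s^-1) from the Luz-Meiboom fast-exchange equation.

    nu_cpmg     : CPMG frequency in Hz
    r2_0        : exchange-free transverse relaxation rate in s^-1
    p_b         : population of the minor state (0..1)
    delta_omega : chemical shift difference in rad/s
    k_ex        : exchange rate k_AB + k_BA in s^-1
    """
    phi = (1.0 - p_b) * p_b * delta_omega ** 2
    x = k_ex / (4.0 * nu_cpmg)
    return r2_0 + (phi / k_ex) * (1.0 - math.tanh(x) / x)


if __name__ == "__main__":
    # Hypothetical parameters, chosen only for illustration.
    r2_0, p_b = 10.0, 0.02
    delta_omega = 2.0 * math.pi * 100.0  # 100 Hz shift difference
    k_buffer = 5000.0                    # s^-1, "buffer" condition
    k_lysate = 0.64 * k_buffer           # exchange slowed to 64%
    for nu in (50.0, 100.0, 250.0, 500.0, 1000.0):
        buf = r2_eff_fast_exchange(nu, r2_0, p_b, delta_omega, k_buffer)
        lys = r2_eff_fast_exchange(nu, r2_0, p_b, delta_omega, k_lysate)
        print(f"{nu:7.1f} Hz  buffer {buf:6.3f}  lysate {lys:6.3f}")
```

In this fast-exchange picture, slowing the exchange while keeping the populations fixed increases the dispersion amplitude at low CPMG frequencies, which is one way a change in kinetics without a change in thermodynamics would show up in the data.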
It is also possible to reconstruct the 3D structures of proteins in the cell. In in-cell NMR with the ‘expression’ approach, the short lifetime of cells in an NMR sample tube, severe signal overlap and massive background signals hinder the collection of a sufficient amount of accurate structural information. Sakakibara et al. achieved the first de novo protein structure determination in living cells: the T. thermophilus HB8 TTHA1718 gene product in living E. coli was analysed by solution NMR experiments with NUS, signal reconstruction by MaxEnt and selective isotope labelling [263, 264] (Fig. 9a). The authors were able to completely assign the backbone and a majority of side-chain atoms as well as collect a sufficient number of NOE distance restraints. The resulting structure was well converged, with a backbone RMSD of less than 1.0 Å, and is similar to the in vitro structure, with a backbone RMSD of 1.2 Å. Slight structural differences were found in the putative heavy-metal-binding loop, where chemical shift differences reflected possible metal binding. The authors suggested that interactions with metal ions in the E. coli cytosol, or the effects of viscosity and intracellular molecular crowding, might affect the conformation of this region. Moreover, the same group recently improved the procedure for the determination of protein structures in cells with three methodological advances: QME data processing [187], automated resonance assignment [265, 266] and
Bayesian inference-assisted structure refinement [267, 268] (Fig. 9b). The new procedure enables the determination of 3D protein structures at much lower intracellular concentrations and even without artificial restraints such as hydrogen bond information. The structure of protein GB1 in living E. coli cells was determined at an order of magnitude lower concentration than in the original report on TTHA1718, which was expressed in E. coli cells at a concentration of 3–4 mM. A bioreactor system that continuously supplies fresh medium from outside the spectrometer has enabled in-cell NMR observations of proteins at even lower concentrations [269]. For de novo 3D protein structure determination in eukaryotic cells, the maximum protein concentration achievable with the current incorporation methods is too low to record 3D NOESY-type spectra and to obtain a sufficient quantity of NOE-derived distance information. Paramagnetic effects such as PCSs, PREs and RDCs, which are typically induced by lanthanide ions, provide the most promising structural data for replacing NOE-derived distance information. Particularly in in-cell NMR studies, the advantage of PCSs, PREs and RDCs is that they can be collected from 2D 1H–15N or 1H–13C correlation spectra. Unless proteins of interest have a strong natural affinity for paramagnetic lanthanide ions, which directly produce PCSs and PREs as well as RDCs via alignment in the magnetic field, lanthanide-binding tags (LBTs) introduced by chemical modification of the proteins are used for quantifying these effects. However, for in-cell NMR experiments, the intracellular reducing environment and the cytotoxicity of lanthanide ions require LBTs to be more stable. The same is true of nitroxide radicals conjugated to proteins in cells. The development of several new LBTs has made it possible to acquire structural PRE, PCS and RDC data inside cells [247, 248, 270, 271]. Global 3D structures of proteins inside X.
laevis oocytes were obtained by combining these data with 3D protein structure prediction software such as Rosetta [272, 273]. The sample concentrations in these studies were approximately 50 μM, approaching the conditions expected for a true physiological environment, where the maximal natural concentration of a protein is usually in the tens to several hundred micromolar range [274, 275]. Theillet et al. attached a DOTA-maleimide tag with a Gd3+ ion to a single cysteine residue of an IDP, α-synuclein, and observed PREs in cultured human cells [238]. The authors showed that the α-synuclein conformation is as compact as that found in vitro and that the intrinsic disorder found in vitro is maintained inside cells.
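The reason PCSs can stand in for NOE-derived distances is their long range: a PCS falls off as r^-3 from the metal, whereas a PRE falls off as r^-6. The sketch below evaluates the standard point-dipole PCS expression, δ = [Δχax (3cos²θ − 1) + (3/2) Δχrh sin²θ cos 2φ] / (12π r³), for a nucleus at polar coordinates (r, θ, φ) in the frame of the magnetic susceptibility anisotropy tensor. The function name and the tensor magnitude used in the example are illustrative assumptions, not values from the cited in-cell studies.

```python
import math


def pcs_ppm(r_angstrom, theta_rad, phi_rad, dchi_ax, dchi_rh):
    """Pseudocontact shift in ppm from the point-dipole expression.

    r_angstrom : metal-nucleus distance in Angstrom
    theta_rad, phi_rad : polar angles of the nucleus in the tensor frame
    dchi_ax, dchi_rh   : axial and rhombic anisotropy components in m^3
    """
    r_m = r_angstrom * 1e-10  # convert Angstrom to metres
    angular = (
        dchi_ax * (3.0 * math.cos(theta_rad) ** 2 - 1.0)
        + 1.5 * dchi_rh * math.sin(theta_rad) ** 2 * math.cos(2.0 * phi_rad)
    )
    return 1e6 * angular / (12.0 * math.pi * r_m ** 3)


if __name__ == "__main__":
    # Illustrative axial tensor of 40e-32 m^3 (hypothetical value).
    dchi_ax, dchi_rh = 40e-32, 0.0
    for r in (10.0, 20.0, 40.0):
        shift = pcs_ppm(r, 0.0, 0.0, dchi_ax, dchi_rh)
        print(f"r = {r:4.1f} A  PCS = {shift:7.3f} ppm")
```

With these assumed values, doubling the distance reduces the PCS eight-fold, so shifts remain measurable at distances where an r^-6 effect would already be negligible; this is what makes PCSs collected from simple 2D spectra useful as long-range restraints.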

Solution NMR studies of biomolecules in living systems or macromolecular crowding environments are gradually allowing us to understand more about their biophysical properties. However, the method is still not straightforward to perform and its range of applications is limited. One cause is the low signal sensitivity in crowded environments, which results from slow molecular tumbling due to the high viscosity of the environment and from various interactions with surrounding molecules. Accordingly, in-cell NMR studies encounter the same problems as NMR analyses of large molecules, suggesting that the limitations of in-cell NMR can be addressed by approaches similar to those taken for large systems. For instance, stereospecific isotope labelling techniques [173] and state-of-the-art reconstruction algorithms such as compressed sensing [194, 195] would be valuable for future applications. Although the short lifetime of cells in NMR sample tubes remains a problem for in-cell NMR, it may be addressed by suppressing cell death with a bioreactor system that continuously supplies fresh medium from outside the spectrometer, for both bacterial [276] and cultured human cells [269]. Several in-cell studies have observed that biomolecules behave differently in cells than in vitro [238, 239, 245], suggesting that NMR analysis of the dynamics and interactions of biomolecules under in-cell or near-physiological conditions has the potential to elucidate the mechanisms in living systems that underlie the dynamical ordering of biomolecules. Whether in-cell NMR will lead to the discovery of fundamental principles and unknown factors in living systems remains to be seen.

5. Conclusions and outlook

It is becoming increasingly clear that solution NMR spectroscopy has an armada of experimental techniques that work to delineate the dynamic 3D structures of large biomacromolecules and their complexes greater than 1 MDa, covering a broad range of timescales over which biomolecules exhibit motion (Fig. 1). Given that the temporal aspects of functional regulation have been discovered, measuring the kinetics that underlie these events is of increasing interest. The experimental techniques outlined in this article provide insight into the amplitudes as well as the kinetics of motions underlying a variety of biological functions. NMR spectroscopy will
play a future role in evaluating the dynamism of biomacromolecules but may also see an increased role within multi-disciplinary approaches that employ a variety of techniques, including solution scattering, atomic force microscopy, electron microscopy and single molecule optical imaging along with computational simulation. This may permit access to potentially larger and more complex biologically relevant systems. Over time these measures will help to elucidate the underlying mechanisms of function and dysfunction of biomolecules in living systems.

6. Acknowledgements

The authors thank Drs. Youhei Kawabata (Tokyo Metropolitan University), Saeko Yanaka (Institute for Molecular Science), Sundaresan Rajesh (GlaxoSmithKline plc.), and Prof. Jonathan Heddle (Jagiellonian University) for critical reading of this manuscript. We gratefully acknowledge financial support from Scientific Research on Innovative Areas (JP26102538, JP25120003, JP16H00779 to T. I., JP15H01645, JP16H00847 to Y.I., and JP25102008, JP25102001, JP15K21708 to K. K.) and Grants-in-Aid for Scientific Research (JP15K06979 to T. I. and JP15H02491 to K. K.) from the Japan Society for the Promotion of Science (JSPS) and from the Funding Program for Core Research for Evolutional Science and Technology (CREST JPMJCR13M3) from Japan Science and Technology Agency (JST), the James Graham Brown Foundation (D. L.), National Center for Research Resources (NCRR; CoBRE 1P30GM106396 to D.L.), and the Max Planck Society.

7. References

[1] K. Venkatesan, J.F. Rual, A. Vazquez, U. Stelzl, I. Lemmens, T. Hirozane-Kishikawa, T. Hao, M. Zenkner, X. Xin, K.I. Goh, M.A. Yildirim, N. Simonis, K. Heinzmann, F. Gebreab, J.M. Sahalie, S. Cevik, C. Simon, A.S. de Smet, E. Dann, A. Smolyar, A. Vinayagam, H. Yu, D. Szeto, H. Borick, A. Dricot, N. Klitgord, R.R. Murray, C. Lin, M. Lalowski, J. Timm, K. Rau, C. Boone, P. Braun, M.E. Cusick, F.P. Roth, D.E. Hill, J. Tavernier, E.E. Wanker, A.L. Barabasi, M. Vidal, An empirical framework for binary interactome mapping, Nat Methods, 6 (2009) 83-90.
[2] R.M. Glaeser, How good can cryo-EM become?, Nat Methods, 13 (2016) 28-32.
[3] G.M. Clore, A.M. Gronenborn, NMR structures of proteins and protein complexes beyond 20,000 M-r, Nature Structural Biology, 4 (1997) 849-853.
[4] D. Tobi, I. Bahar, Structural changes involved in protein binding correlate with intrinsic motions of proteins in the unbound state, Proceedings of the National Academy of Sciences, 102 (2005) 18908-18913.
[5] I. Nobeli, A.D. Favia, J.M. Thornton, Protein promiscuity and its implications for biotechnology, Nature Biotechnology, 27 (2009) 157-167.
[6] A. Sekhar, L.E. Kay, NMR paves the way for atomic level descriptions of sparsely populated, transiently formed biomolecular conformers, Proceedings of the National Academy of Sciences of the United States of America, 110 (2013) 12867-12874.
[7] M.R. Arkin, Y. Tang, J.A. Wells, Small-Molecule Inhibitors of Protein-Protein Interactions: Progressing toward the Reality, Chemistry & Biology, 21 (2014) 1102-1114.
[8] A.J.T. Smith, R. Müller, M.D. Toscano, P. Kast, H.W. Hellinga, D. Hilvert, K.N. Houk, Structural Reorganization and Preorganization in Enzyme Active Sites: Comparisons of Experimental and Theoretically Ideal Active Site Geometries in the Multistep Serine Esterase Reaction Cycle, Journal of the American Chemical Society, 130 (2008) 15361-15373.
[9] J.P. Loria, R.B. Berlow, E.D. Watt, Characterization of Enzyme Motions by Solution NMR Relaxation Dispersion, Accounts of chemical research, 41 (2008) 214-221.
[10] H.N. Motlagh, J.O. Wrabl, J. Li, V.J. Hilser, The ensemble nature of allostery, Nature, 508 (2014) 331-339.
[11] R. Nussinov, C.-J. Tsai, Allostery in Disease and in Drug Discovery, Cell, 153 (2013) 293-305.

[12] H. Frauenfelder, S.G. Sligar, P.G. Wolynes, The energy landscapes and motions of proteins, Science, 254 (1991) 1598-1603.
[13] D.D. Boehr, R. Nussinov, P.E. Wright, The role of dynamic conformational ensembles in biomolecular recognition, Nature chemical biology, 5 (2009) 789-796.
[14] O. Keskin, Binding induced conformational changes of proteins correlate with their intrinsic fluctuations: a case study of antibodies, BMC Structural Biology, 7 (2007) 31-11.
[15] O.F. Lange, N.-A. Lakomek, C. Farès, G.F. Schröder, K.F.A. Walter, S. Becker, J. Meiler, H. Grubmüller, C. Griesinger, B.L. de Groot, Recognition dynamics up to microseconds revealed from an RDC-derived ubiquitin ensemble in solution, Science, 320 (2008) 1471-1475.
[16] L.C. James, P. Roversi, D.S. Tawfik, Antibody Multispecificity Mediated by Conformational Diversity, Science, 299 (2003) 1362-1367.
[17] S.-R. Tzeng, C.G. Kalodimos, Allosteric inhibition through suppression of transient conformational states, Nature chemical biology, 9 (2013) 462-465.
[18] S.J. Teague, Implications of protein flexibility for drug discovery, Nature Reviews Drug Discovery, 2 (2003) 527-541.
[19] S.E.A. Ozbabacan, A. Gursoy, O. Keskin, R. Nussinov, Conformational ensembles, signal transduction and residue hot spots: Application to drug discovery, Current Opinion in Drug Discovery and Development, 13 (2010) 527-537.
[20] A. Sekhar, J.A. Rumfeldt, H.R. Broom, C.M. Doyle, G. Bouvignies, E.M. Meiering, L.E. Kay, Thermal fluctuations of immature SOD1 lead to separate folding and misfolding pathways, (2015) 1-33.
[21] A. Sekhar, J.A.O. Rumfeldt, H.R. Broom, C.M. Doyle, R.E. Sobering, E.M. Meiering, L.E. Kay, Probing the free energy landscapes of ALS disease mutants of SOD1 by NMR spectroscopy, Proceedings of the National Academy of Sciences, 113 (2016) E6939-E6945.
[22] N.R. Latorraca, A.J. Venkatakrishnan, R.O. Dror, GPCR Dynamics: Structures in Motion, Chem Rev, 117 (2017) 139-155.
[23] M. Karplus, J. Kuriyan, Molecular dynamics and protein function, Proc Natl Acad Sci U S A, 102 (2005) 6679-6685.
[24] V. Csizmok, A.V. Follis, R.W. Kriwacki, J.D. Forman-Kay, Dynamic Protein Interaction Networks and New Structural Paradigms in Signaling, Chem Rev, 116 (2016) 6424-6462.
[25] A.G. Palmer III, NMR probes of molecular dynamics: overview and comparison with other techniques, Annu. Rev. Biophys. Biomol. Struct., 30 (2003) 129-155.
[26] P. Luginbühl, K. Wüthrich, Semi-classical nuclear spin relaxation theory revisited for use with biological macromolecules, Progress in nuclear magnetic resonance spectroscopy, 40 (2002) 199-247.
[27] B. Reif, A. Diener, M. Hennig, M. Maurer, C. Griesinger, Cross-Correlated Relaxation for the Measurement of Angles between Tensorial Interactions, Journal of Magnetic Resonance, 143 (2000) 45-68.
[28] M.P. Nicholas, E. Eryilmaz, F. Ferrage, D. Cowburn, R. Ghose, Nuclear spin relaxation in isotropic and anisotropic media, Progress in nuclear magnetic resonance spectroscopy, 57 (2010) 111-158.
[29] D.M. Korzhnev, M. Billeter, A.S. Arseniev, V.Y. Orekhov, NMR studies of Brownian tumbling and internal motions in proteins, Progress in nuclear magnetic resonance spectroscopy, 38 (2001) 197-266.
[30] L.E. Kay, D.A. Torchia, A. Bax, Backbone Dynamics of Proteins as Studied by 15N Inverse Detected Heteronuclear NMR Spectroscopy: Application to Staphylococcal Nuclease, Biochemistry, 28 (1989) 8972-8979.
[31] G. Lipari, A. Szabo, Model-free approach to the interpretation of nuclear magnetic resonance relaxation in macromolecules. 2. Analysis of experimental results, Journal of the American Chemical Society, 104 (1982) 4559-4570.
[32] N. Tjandra, D.S. Garrett, A.M. Gronenborn, A. Bax, G.M. Clore, Defining long range order in NMR structure determination from the dependence of heteronuclear relaxation times on rotational diffusion anisotropy, Nature Structural & Molecular Biology, 4 (1997) 443-450.
[33] D. Ban, T.M. Sabo, C. Griesinger, D. Lee, Measuring dynamic and kinetic information in the previously inaccessible supra-τ(c) window of nanoseconds to microseconds by solution NMR spectroscopy, Molecules (Basel, Switzerland), 18 (2013) 11904-11937.


[34] D.F. Hansen, D. Yang, H. Feng, Z. Zhou, S. Wiesner, Y. Bai, L.E. Kay, An Exchange-Free Measure of 15N Transverse Relaxation: An NMR Spectroscopy Application to the Study of a Folding Intermediate with Pervasive Chemical Exchange, Journal of the American Chemical Society, 129 (2007) 11468-11479.
[35] G.M. Clore, A. Szabo, A. Bax, L.E. Kay, P.C. Driscoll, A.M. Gronenborn, Deviations from the Simple Two-Parameter Model-Free Approach to the Interpretation of Nitrogen-15 Nuclear Magnetic Relaxation of Proteins, Journal of the American Chemical Society, 112 (1990) 4989-4991.
[36] L.K. Lee, M. Rance, W.J. Chazin, A.G. Palmer III, Rotational diffusion anisotropy of proteins from simultaneous analysis of, Journal of Biomolecular NMR, 9 (1998) 287-298.
[37] D.M. Korzhnev et al., Model-Free Approach beyond the Borders of Its Applicability, 127 (1997) 184-191.
[38] N.-A. Lakomek, J. Ying, A. Bax, Measurement of 15N relaxation rates in perdeuterated proteins by TROSY-based methods, Journal of Biomolecular NMR, 53 (2012) 209-221.
[39] D.M. Korzhnev, N.R. Skrynnikov, O. Millet, D.A. Torchia, L.E. Kay, An NMR Experiment for the Accurate Measurement of Heteronuclear Spin-Lock Relaxation Rates, Journal of the American Chemical Society, 124 (2002) 10743-10753.
[40] F. Ferrage, D. Cowburn, R. Ghose, Accurate Sampling of High-Frequency Motions in Proteins by Steady-State 15N−{1H} Nuclear Overhauser Effect Measurements in the Presence of Cross-Correlated Relaxation, Journal of the American Chemical Society, 131 (2009) 6048-6049.
[41] F. Ferrage, A. Reichel, S. Battacharya, D. Cowburn, R. Ghose, On the measurement of 15N-{1H} nuclear Overhauser effects. 2. Effects of the saturation scheme and water signal suppression, Journal of magnetic resonance (San Diego, Calif. : 1997), 207 (2010) 294-303.
[42] A.G. Palmer III, NMR Characterization of the Dynamics of Biomacromolecules, Chemical reviews, 104 (2004) 3623-3640.
[43] C.D. Kroenke, J.P. Loria, L.K. Lee, M. Rance, A.G. Palmer III, Longitudinal and Transverse 1H-15N Dipolar/15N Chemical Shift Anisotropy Relaxation Interference: Unambiguous Determination of Rotational Diffusion Tensors and Chemical Exchange Effects in Biological Macromolecules, Journal of the American Chemical Society, 120 (1998) 7905-7915.
[44] K. Pervushin, R. Riek, G. Wider, K. Wüthrich, Attenuated T2 relaxation by mutual cancellation of dipole-dipole coupling and chemical shift anisotropy indicates an avenue to NMR structures of very large biological macromolecules in solution, Proceedings of the National Academy of Sciences of the United States of America, 94 (1997) 12366-12371.
[45] D. Lee, C. Hilty, G. Wider, K. Wüthrich, Effective rotational correlation times of proteins from NMR relaxation interference, Journal of magnetic resonance (San Diego, Calif. : 1997), 178 (2006) 72-76.
[46] D.M. Mitrea, J.A. Cika, C.S. Guy, D. Ban, P.R. Banerjee, C.B. Stanley, A. Nourse, A.A. Deniz, R.W. Kriwacki, Nucleophosmin integrates within the nucleolus via multi-modal interactions with proteins displaying R-rich linear motifs and rRNA, eLife, 5 (2016) e13571.
[47] N.-A. Lakomek, J.D. Kaufman, S.J. Stahl, J.M. Louis, A. Grishaev, P.T. Wingfield, A. Bax, Internal Dynamics of the Homotrimeric HIV-1 Viral Coat Protein gp41 on Multiple Time Scales, Angewandte Chemie International Edition, 52 (2013) 3911-3915.
[48] N. Rezaei-Ghaleh, F. Klama, F. Munari, M. Zweckstetter, Predicting the Rotational Tumbling of Dynamic Multidomain Proteins and Supramolecular Complexes, Angewandte Chemie International Edition, 52 (2013) 11410-11414.
[49] R. Brüschweiler, X. Liao, P.E. Wright, Long-Range Motional Restrictions in a Multidomain Zinc-Finger Protein from Anisotropic Tumbling, Science, 268 (1995) 886-891.
[50] N.-A. Lakomek, K.F.A. Walter, C. Farès, O.F. Lange, B.L. de Groot, H. Grubmüller, R. Brüschweiler, A. Munk, S. Becker, J. Meiler, C. Griesinger, Self-consistent residual dipolar coupling based model-free analysis for the robust determination of nanosecond to microsecond protein dynamics, Journal of Biomolecular NMR, 41 (2008) 139-155.
[51] F.-X. Theillet, A. Binolfi, T. Frembgen-Kesner, K. Hingorani, M. Sarkar, C. Kyne, C. Li, P.B. Crowley, L. Gierasch, G.J. Pielak, A.H. Elcock, A. Gershenson, P. Selenko, Physicochemical Properties of Cells and Their Effects on Intrinsically Disordered Proteins (IDPs), Chemical reviews, 114 (2014) 6661-6714.


[52] S.N. Khan, C. Charlier, R. Augustyniak, N. Salvi, V. Dejean, G. Bodenhausen, O. Lequin, P. Pelupessy, F. Ferrage, Distribution of Pico- and Nanosecond Motions in Disordered Proteins from Nuclear Spin Relaxation, Biophys J, 109 (2015) 988-999.
[53] M.L. Gill, R.A. Byrd, A.G. Palmer, III, Dynamics of GCN4 facilitate DNA interaction: a model-free analysis of an intrinsically disordered region, Phys Chem Chem Phys, 18 (2016) 5839-5849.
[54] A. Abyzov, N. Salvi, R. Schneider, D. Maurin, R.W.H. Ruigrok, M.R. Jensen, M. Blackledge, Identification of Dynamic Modes in an Intrinsically Disordered Protein using Temperature-dependent NMR Relaxation, Journal of the American Chemical Society, 138 (2016) 6240-6251.
[55] H. Schwalbe, K.M. Fiebig, M. Buck, J.A. Jones, S.B. Grimshaw, A. Spencer, S.J. Glaser, L.J. Smith, C.M. Dobson, Structural and Dynamical Properties of a Denatured Protein. Heteronuclear 3D NMR Experiments and Theoretical Simulations of Lysozyme in 8 M Urea, Biochemistry, (1997) 1-15.
[56] J. Klein-Seetharaman, M. Oikawa, S.B. Grimshaw, J. Wirmer, E. Duchardt, T. Ueda, T. Imoto, L.J. Smith, C.M. Dobson, H. Schwalbe, Long-Range Interactions Within a Nonnative Protein, Science, 295 (2002) 1719-1722.
[57] Y. Xue, I.S. Podkorytov, D.K. Rao, N. Benjamin, H. Sun, N.R. Skrynnikov, Paramagnetic relaxation enhancements in unfolded proteins: Theory and application to drkN SH3 domain, Protein Science, 18 (2009) 1401-1424.
[58] K. Modig, F.M. Poulsen, Model-independent interpretation of NMR relaxation data for unfolded proteins: the acid-denatured state of ACBP, Journal of Biomolecular NMR, 42 (2008) 163-177.
[59] L.K. Nicholson, L.E. Kay, D.M. Baldisseri, J. Arango, P.E. Young, A. Bax, D.A. Torchia, Dynamics of Methyl Groups in Proteins As Studied by Proton-Detected 13C NMR Spectroscopy. Application to the Leucine Residues of Staphylococcal Nuclease, Biochemistry, 31 (1992) 5253-5263.
[60] D.R. Muhandiram, T. Yamazaki, B.D. Sykes, L.E. Kay, Measurement of 2H T1 and T1rho Relaxation Times in Uniformly 13C-Labeled and Fractionally 2H-Labeled Proteins in Solution, Journal of the American Chemical Society, 117 (1995) 11536-11544.
[61] A.J. Wand, Dynamic activation of protein function: A view emerging from NMR spectroscopy, Nature Structural Biology, 8 (2001) 926-931.
[62] D. Yang, L.E. Kay, Contributions to conformational entropy arising from bond vector fluctuations measured from NMR-derived order parameters: application to protein folding, Journal of molecular biology, 263 (1996) 369-382.
[63] K.K. Frederick, M.S. Marlow, K.G. Valentine, A.J. Wand, Conformational entropy in molecular recognition by proteins, Nature, 448 (2007) 325-329.
[64] M. Akke, R. Brüschweiler, A.G. Palmer III, NMR Order Parameters and Free Energy: An Analytical Approach and Its Application to Cooperative Ca2+ Binding by Calbindin D9k, Journal of the American Chemical Society, 115 (1993) 9832-9833.
[65] S.-R. Tzeng, C.G. Kalodimos, Protein activity regulation by conformational entropy, Nature, 488 (2012) 236-240.
[66] Y. Xue, J.M. Ward, T. Yuwen, I.S. Podkorytov, N.R. Skrynnikov, Microsecond time-scale conformational exchange in proteins: using long molecular dynamics trajectory to simulate NMR relaxation dispersion data, J Am Chem Soc, 134 (2012) 2555-2562.
[67] K. Lindorff-Larsen, P. Maragakis, S. Piana, D.E. Shaw, Picosecond to Millisecond Structural Dynamics in Human Ubiquitin, J Phys Chem B, 120 (2016) 8313-8320.
[68] T. Yamaguchi, Y. Sakae, Y. Zhang, S. Yamamoto, Y. Okamoto, K. Kato, Exploration of conformational spaces of high-mannose-type oligosaccharides by an NMR-validated simulation, Angew Chem Int Ed Engl, 53 (2014) 10941-10944.
[69] Q. Zhang, A.C. Stelzer, C.K. Fisher, H.M. Al-Hashimi, Visualizing spatially correlated dynamics that directs RNA conformational transitions, Nature, 450 (2007) 1263-1267.
[70] H.T.A. Leung, P. Kukic, C. Camilloni, F. Bemporad, A. De Simone, F.A. Aprile, J.R. Kumita, M. Vendruscolo, NMR characterization of the conformational fluctuations of the human lymphocyte function-associated antigen-1 I-domain, Protein Science, 23 (2014) 1596-1606.


[71] M.R. Jensen, G. Communie, E.A. Ribeiro Jr, N. Martinez, A. Desfosses, L. Salmon, L. Mollica, F. Gabel, M. Jamin, S. Longhi, R.W.H. Ruigrok, M. Blackledge, Intrinsic disorder in measles virus nucleocapsids, Proceedings of the National Academy of Sciences, 108 (2011) 9839-9844.
[72] L. Wang, Y. Pang, T. Holder, J.R. Brender, A.V. Kurochkin, E.R.P. Zuiderweg, Functional dynamics in the active site of the ribonuclease binase, Proceedings of the National Academy of Sciences, 98 (2001) 7684-7689.
[73] D.A. Case, Chemical shifts in biomolecules, Current opinion in structural biology, 23 (2013) 172-176.
[74] M.V. Berjanskii, D.S. Wishart, A Simple Method to Measure Protein Side-Chain Mobility Using NMR Chemical Shifts, Journal of the American Chemical Society, 135 (2013) 14536-14539.
[75] M.V. Berjanskii, D.S. Wishart, A Simple Method To Predict Protein Flexibility Using Secondary Chemical Shifts, Journal of the American Chemical Society, 127 (2005) 14970-14971.
[76] P. Robustelli, K.A. Stafford, A.G. Palmer III, Interpreting Protein Structural Dynamics from NMR Chemical Shifts, Journal of the American Chemical Society, 134 (2012) 6365-6374.
[77] A. Bax, G. Kontaxis, N. Tjandra, Dipolar Couplings in Macromolecular Structure Determination, Elsevier Masson SAS, 2001.
[78] J.R. Tolman, K. Ruan, NMR Residual Dipolar Couplings as Probes of Biomolecular Dynamics, Chemical reviews, 106 (2006) 1720-1736.
[79] A. Bax, G.W. Vuister, S. Grzesiek, F. Delaglio, A.C. Wang, R. Tschudin, G. Zhu, Measurement of Homo- and Heteronuclear J Couplings from Quantitative J Correlation, Methods in Enzymology, 239 (1994) 79-105.
[80] V. Tugarinov, W.-Y. Choy, V.Y. Orekhov, L.E. Kay, Solution NMR-derived global fold of a monomeric 82-kDa enzyme, 102 (2005) 622-627.
[81] J.R. Tolman, J.M. Flanagan, M.A. Kennedy, J.H. Prestegard, NMR evidence for slow collective motions in cyanometmyoglobin, Nature Structural & Molecular Biology, 4 (1997) 292-297.
[82] J. Meiler, J.J. Prompers, W. Peti, C. Griesinger, R. Brüschweiler, Model-free approach to the dynamic interpretation of residual dipolar couplings in globular proteins, Journal of the American Chemical Society, 123 (2001) 6098-6107.
[83] J.R. Tolman, A Novel Approach to the Retrieval of Structural and Dynamic Information from Residual Dipolar Couplings Using Several Oriented Media in Biomolecular NMR Spectroscopy, Journal of the American Chemical Society, 124 (2002) 12020-12030.
[84] T.M. Sabo, C.A. Smith, D. Ban, A. Mazur, D. Lee, C. Griesinger, ORIUM: optimized RDC-based Iterative and Unified Model-free analysis, Journal of Biomolecular NMR, 58 (2014) 287-301.
[85] R. Brüschweiler, P.E. Wright, NMR Order Parameters of Biomolecules: A New Analytical Representation and Application to the Gaussian Axial Fluctuation Model, Journal of the American Chemical Society, 116 (1994) 8426-8427.
[86] T. Bremi, R. Brüschweiler, Locally Anisotropic Internal Polypeptide Backbone Dynamics by NMR Relaxation, Journal of the American Chemical Society, 119 (1997) 6672-6673.
[87] G. Bouvignies, P. Bernadó, S. Meier, K. Cho, S. Grzesiek, R. Brüschweiler, M. Blackledge, Identification of slow correlated motions in proteins using residual dipolar and hydrogen-bond scalar couplings, Proceedings of the National Academy of Sciences, 102 (2005) 13885-13890.
[88] S. Pratihar, T.M. Sabo, D. Ban, R.B. Fenwick, S. Becker, X. Salvatella, C. Griesinger, D. Lee, Kinetics of the Antibody Recognition Site in the Third IgG-Binding Domain of Protein G, Angewandte Chemie, 128 (2016) 9719-9722.
[89] C.D. Schwieters, G.M. Clore, A physical picture of atomic motions within the Dickerson DNA dodecamer in solution derived from joint ensemble refinement against NMR and large-angle X-ray scattering data, Biochemistry, 46 (2007) 1152-1166.
[90] H.M. Al-Hashimi, Y. Gosser, A. Gorin, W. Hu, A. Majumdar, D.J. Patel, Concerted motions in HIV-1 TAR RNA may allow access to bound state conformations: RNA dynamics from NMR residual dipolar couplings, J Mol Biol, 315 (2002) 95-102.
[91] L. Salmon, G. Bascom, I. Andricioaei, H.M. Al-Hashimi, A general method for constructing atomic-resolution RNA ensembles using NMR residual dipolar couplings: the basis for interhelical motions revealed, J Am Chem Soc, 135 (2013) 5457-5466.


[92] G.G. Hammes, Y.-C. Chang, T.G. Oas, Conformational selection or induced fit: a flux description of reaction mechanism, Proceedings of the National Academy of Sciences of the United States of America, 106 (2009) 13737-13741.
[93] B. Ma, S. Kumar, C.J. Tsai, R. Nussinov, Folding funnels and binding mechanisms, Protein engineering, 12 (1999) 713-720.
[94] R.B. Fenwick, S. Esteban-Martín, X. Salvatella, Understanding biomolecular motion, recognition, and allostery by use of conformational ensembles, European Biophysics Journal, 40 (2011) 1339-1355.
[95] A.C. Stelzer, A.T. Frank, J.D. Kratz, M.D. Swanson, M.J. Gonzalez-Hernandez, J. Lee, I. Andricioaei, D.M. Markovitz, H.M. Al-Hashimi, Discovery of selective bioactive small molecules by targeting an RNA dynamic ensemble, Nature chemical biology, 7 (2011) 553-559.
[96] D. Hamelberg, J. Mongan, J.A. McCammon, Accelerated molecular dynamics: A promising and efficient simulation method for biomolecules, Journal of Chemical Physics, 120 (2004) 11919-11929.
[97] P.R.L. Markwick, G. Bouvignies, L. Salmon, J.A. McCammon, M. Nilges, M. Blackledge, Toward a Unified Representation of Protein Structural Dynamics in Solution, Journal of the American Chemical Society, 131 (2009) 16968-16975.
[98] P.R.L. Markwick, G. Bouvignies, M. Blackledge, Exploring Multiple Timescale Motions in Protein GB3 Using Accelerated Molecular Dynamics and NMR Spectroscopy, Journal of the American Chemical Society, 129 (2007) 4724-4730.
[99] P.M. Gasper, B. Fuglestad, E.A. Komives, P.R.L. Markwick, J.A. McCammon, Allosteric networks in thrombin distinguish procoagulant vs. anticoagulant activities, Proceedings of the National Academy of Sciences, 109 (2012) 21216-21222.
[100] C.F. Cervantes, P.R.L. Markwick, S.-C. Sue, J.A. McCammon, H.J. Dyson, E.A. Komives, Functional Dynamics of the Folded Ankyrin Repeats of IκBα Revealed by Nuclear Magnetic Resonance, Biochemistry, 48 (2009) 8023-8031.
[101] M.R. Jensen, P.R.L. Markwick, S. Meier, C. Griesinger, M. Zweckstetter, S. Grzesiek, P. Bernadó, M. Blackledge, Quantitative Determination of the Conformational Properties of Partially Folded and Intrinsically Disordered Proteins Using NMR Dipolar Couplings, Structure/Folding and Design, 17 (2009) 1169-1185.
[102] B. Reif, M. Hennig, C. Griesinger, Direct measurement of angles between bond vectors in high-resolution NMR, Science, 276 (1997) 1230-1233.
[103] P. Pelupessy, S. Ravindranathan, G. Bodenhausen, Correlated motions of successive amide N-H bonds in proteins, Journal of Biomolecular NMR, 25 (2003) 265-280.
[104] B. Vögeli, L. Yao, Correlated Dynamics between Protein HN and HC Bonds Observed by NMR Cross Relaxation, Journal of the American Chemical Society, 131 (2009) 3668-3678.
[105] T. Carlomagno, M. Maurer, M. Hennig, C. Griesinger, Ubiquitin Backbone Motion Studied via NHN−C′Cα Dipolar−Dipolar and C′−C′Cα/NHN CSA−Dipolar Cross-Correlated Relaxation, Journal of the American Chemical Society, 122 (2000) 5105-5113.
[106] E. Chiarparin, P. Pelupessy, R. Ghose, G. Bodenhausen, Relative Orientation of CαHα-Bond Vectors of Successive Residues in Proteins through Cross-Correlated Relaxation in NMR, Journal of the American Chemical Society, 122 (2000) 1758-1761.
[107] E. Chiarparin, P. Pelupessy, R. Ghose, G. Bodenhausen, Relaxation of Two-Spin Coherence Due to Cross-Correlated Fluctuations of Dipole−Dipole Couplings and Anisotropic Shifts in NMR of 15N, 13C-Labeled Biomolecules, Journal of the American Chemical Society, 121 (1999) 6876-6883.
[108] P. Pelupessy, E. Chiarparin, R. Ghose, G. Bodenhausen, Efficient determination of angles subtended by Cα-Hα and N-HN vectors in proteins via dipole-dipole cross-correlation, Journal of Biomolecular NMR, 13 (1999) 375-380.
[109] P. Pelupessy, E. Chiarparin, R. Ghose, G. Bodenhausen, Simultaneous determination of psi and phi angles in proteins from measurements of cross-correlated relaxation effects, Journal of Biomolecular NMR, 14 (1999) 277-280.
[110] D. Yang, A. Mittermaier, Y.K. Mok, L.E. Kay, A Study of Protein Side-chain Dynamics from New 2H Auto-correlation and 13C Cross-correlation NMR Experiments: Application to the N-terminal SH3 Domain from drk, Journal of molecular biology, 276 (1998) 939-954.


[111] C. Richter, C. Griesinger, I. Felli, P.T. Cole, G. Varani, H. Schwalbe, Determination of sugar conformation in large RNA oligonucleotides from analysis of dipole–dipole cross correlated relaxation by solution NMR spectroscopy, Journal of Biomolecular NMR, 15 (1999) 241-250.
[112] C. Richter, B. Reif, C. Griesinger, H. Schwalbe, NMR Spectroscopic Determination of Angles α and ζ in RNA from CH-Dipolar Coupling, P-CSA Cross-Correlated Relaxation, Journal of the American Chemical Society, 122 (2000) 12728-12731.
[113] E. Duchardt, C. Richter, O. Ohlenschlager, M. Gorlach, J. Wohnert, H. Schwalbe, Determination of the glycosidic bond angle chi in RNA from cross-correlated relaxation of CH dipolar coupling and N chemical shift anisotropy, J Am Chem Soc, 126 (2004) 1962-1970.
[114] R. Fiala, N. Spackova, S. Foldynova-Trantirkova, J. Sponer, V. Sklenar, L. Trantirek, NMR cross-correlated relaxation rates reveal ion coordination sites in DNA, J Am Chem Soc, 133 (2011) 13790-13793.
[115] I.C. Felli, C. Richter, C. Griesinger, H. Schwalbe, Determination of RNA Sugar Pucker Mode from Cross-Correlated Relaxation in Solution NMR Spectroscopy, Journal of the American Chemical Society, 121 (1999) 1956-1957.
[116] J. Boisbouvier, B. Brutscher, A. Pardi, D. Marion, J.-P. Simorre, NMR Determination of Sugar Puckers in Nucleic Acids from CSA−Dipolar Cross-Correlated Relaxation, Journal of the American Chemical Society, 122 (2000) 6779-6780.
[117] T. Carlomagno, I.C. Felli, M. Czech, R. Fischer, M. Sprinzl, C. Griesinger, Transferred Cross-Correlated Relaxation: Application to the Determination of Sugar Pucker in an Aminoacylated tRNA-Mimetic Weakly Bound to EF-Tu, Journal of the American Chemical Society, 121 (1999) 1945-1948.
[118] R.B. Fenwick, S. Esteban-Martín, B. Richter, D. Lee, K.F.A. Walter, D. Milovanovic, S. Becker, N.A. Lakomek, C. Griesinger, X. Salvatella, Weak long-range correlated motions in a surface patch of ubiquitin involved in molecular recognition, Journal of the American Chemical Society, 133 (2011) 10336-10339.
[119] R.B. Fenwick, C.D. Schwieters, B. Vögeli, Direct Investigation of Slow Correlated Dynamics in Proteins via Dipolar Interactions, Journal of the American Chemical Society, 138 (2016) 8412-8421.
[120] D. Ban, A.D. Gossert, K. Giller, S. Becker, C. Griesinger, D. Lee, Exceeding the limit of dynamics studies on biomolecules using high spin-lock field strengths with a cryogenically cooled probehead, Journal of magnetic resonance (San Diego, Calif. : 1997), 221 (2012) 1-4.
[121] C.A. Smith, D. Ban, S. Pratihar, K. Giller, C. Schwiegk, B.L. de Groot, S. Becker, C. Griesinger, D. Lee, Population shuffling of protein conformations, Angewandte Chemie (International ed. in English), 54 (2015) 207-210.
[122] D.M. Korzhnev, L.E. Kay, Probing Invisible, Low-Populated States of Protein Molecules by Relaxation Dispersion NMR Spectroscopy: An Application to Protein Folding, Accounts of chemical research, 41 (2008) 442-451.
[123] P. Li, I.R.S. Martins, M.K. Rosen, The feasibility of parameterizing four-state equilibria using relaxation dispersion measurements, Journal of Biomolecular NMR, 51 (2011) 57-70.
[124] O. Trott, A.G. Palmer III, R1ρ Relaxation outside of the Fast-Exchange Limit, Journal of Magnetic Resonance, 154 (2002) 157-160.
[125] P. Neudecker, D.M. Korzhnev, L.E. Kay, Assessment of the Effects of Increased Relaxation Dispersion Data on the Extraction of 3-site Exchange Parameters Characterizing the Unfolding of an SH3 Domain, Journal of Biomolecular NMR, 34 (2006) 129-135.
[126] D. Ban, M. Funk, R. Gulich, D. Egger, T.M. Sabo, K.F.A. Walter, R.B. Fenwick, K. Giller, F. Pichierri, B.L. de Groot, O.F. Lange, H. Grubmüller, X. Salvatella, M. Wolf, A. Loidl, R. Kree, S. Becker, N.-A. Lakomek, D. Lee, P. Lunkenheimer, C. Griesinger, Kinetics of conformational sampling in ubiquitin, Angewandte Chemie (International ed. in English), 50 (2011) 11437-11440.
[127] C. Eichmüller, N.R. Skrynnikov, A new amide proton R1rho experiment permits accurate characterization of microsecond time-scale conformational exchange, Journal of Biomolecular NMR, 32 (2005) 281-293.
[128] T.M. Sabo, J.O. Trent, D. Lee, Population shuffling between ground and high energy excited states, Protein Science, 24 (2015) 1714-1719.


[129] C.A. Smith, D. Ban, S. Pratihar, K. Giller, M. Paulat, S. Becker, C. Griesinger, D. Lee, B.L. de Groot, Allosteric switch regulates protein–protein binding through collective motion, Proceedings of the National Academy of Sciences, 113 (2016) 3269-3274.
[130] D.M. Korzhnev, T.L. Religa, W. Banachewicz, A.R. Fersht, L.E. Kay, A Transient and Low-Populated Protein-Folding Intermediate at Atomic Resolution, Science, 329 (2010) 1312-1316.
[131] A.G. Palmer, F. Massi, Characterization of the Dynamics of Biomacromolecules Using Rotating-Frame Spin Relaxation NMR Spectroscopy, Chemical reviews, 106 (2006) 1700-1719.
[132] A.J. Baldwin, T.L. Religa, D.F. Hansen, G. Bouvignies, L.E. Kay, 13CHD2 Methyl Group Probes of Millisecond Time Scale Exchange in Proteins by 1H Relaxation Dispersion: An Application to Proteasome Gating Residue Dynamics, Journal of the American Chemical Society, 132 (2010) 10992-10995.
[133] D.M. Korzhnev, I. Bezsonova, S. Lee, T.V. Chalikian, L.E. Kay, Alternate binding modes for a ubiquitin-SH3 domain interaction studied by NMR spectroscopy, Journal of molecular biology, 386 (2009) 391-405.
[134] P.J. Farber, A. Mittermaier, Relaxation dispersion NMR spectroscopy for the study of protein allostery, Biophysical Reviews, 7 (2015) 191-200.
[135] Y. Xue, D. Kellogg, I.J. Kimsey, B. Sathyamoorthy, Z.W. Stein, M. McBrairty, H.M. Al-Hashimi, Characterizing RNA Excited States Using NMR Relaxation Dispersion, 1 ed., Elsevier Inc., 2015.
[136] E.N. Nikolova, E. Kim, A.A. Wise, P.J. O’Brien, I. Andricioaei, H.M. Al-Hashimi, Transient Hoogsteen base pairs in canonical duplex DNA, Nature, 470 (2011) 498-502.
[137] D.M. Korzhnev, X. Salvatella, M. Vendruscolo, A.A. Di Nardo, A.R. Davidson, C.M. Dobson, L.E. Kay, Low-populated folding intermediates of Fyn SH3 characterized by relaxation dispersion NMR, Nature, 430 (2004) 586-590.
[138] P. Vallurupalli, D.F. Hansen, E. Stollar, E. Meirovitch, L.E. Kay, Measurement of bond vector orientations in invisible excited states of proteins, Proceedings of the National Academy of Sciences of the United States of America, 104 (2007) 18473-18477.
[139] T.I. Igumenova, U. Brath, M. Akke, A.G. Palmer, Characterization of Chemical Exchange Using Residual Dipolar Coupling, Journal of the American Chemical Society, 129 (2007) 13396-13397.
[140] N.J. Anthis, G.M. Clore, Visualizing transient dark states by NMR spectroscopy, Quarterly Reviews of Biophysics, 48 (2015) 35-116.
[141] P. Vallurupalli, G. Bouvignies, L.E. Kay, Studying "invisible" excited protein states in slow exchange with a major state conformation, Journal of the American Chemical Society, 134 (2012) 8148-8161.
[142] S. Forsén, R.A. Hoffman, Study of Moderately Rapid Chemical Exchange Reactions by Means of Nuclear Magnetic Double Resonance, Journal of Chemical Physics, 39 (1963) 2892-2901.
[143] N.L. Fawzi, J. Ying, R. Ghirlando, D.A. Torchia, G.M. Clore, Atomic-resolution dynamics on the surface of amyloid-β protofibrils probed by solution NMR, Nature, 480 (2011) 268-272.
[144] D.S. Libich, V. Tugarinov, G.M. Clore, Intrinsic unfoldase/foldase activity of the chaperonin GroEL directly demonstrated using multinuclear relaxation-based NMR, Proceedings of the National Academy of Sciences of the United States of America, 112 (2015) 8817-8823.
[145] N.L. Fawzi, D.S. Libich, J. Ying, V. Tugarinov, G.M. Clore, Characterizing Methyl-Bearing Side Chain Contacts and Dynamics Mediating Amyloid β Protofibril Interactions Using 13C methyl-DEST and Lifetime Line Broadening, Angewandte Chemie International Edition, 53 (2014) 10345-10349.
[146] K.S. Chakrabarti, D. Ban, S. Pratihar, J.G. Reddy, S. Becker, C. Griesinger, D. Lee, High-power 1H composite pulse decoupling provides artifact free exchange-mediated saturation transfer (EST) experiments, Journal of Magnetic Resonance, 269 (2016) 65-69.
[147] D. Ban, A. Mazur, M.G. Carneiro, T.M. Sabo, K. Giller, L.M.I. Koharudin, S. Becker, A.M. Gronenborn, C. Griesinger, D. Lee, Enhanced accuracy of kinetic information from CT-CPMG experiments by transverse rotating-frame spectroscopy, Journal of Biomolecular NMR, 57 (2013) 73-82.
[148] C. Haupt, R. Patzschke, U. Weininger, S. Gröger, M. Kovermann, J. Balbach, Transient Enzyme–Substrate Recognition Monitored by Real-Time NMR, Journal of the American Chemical Society, 133 (2011) 11154-11162.


[149] M. Zeeb, J. Balbach, Protein folding studied by real-time NMR spectroscopy, Methods (San Diego, Calif.), 34 (2004) 65-74.
[150] S.D. Hoeltzli, C. Frieden, Real-time refolding studies of 6-19F-tryptophan labeled Escherichia coli dihydrofolate reductase using stopped-flow NMR spectroscopy, Biochemistry, 35 (1996) 16843-16851.
[151] J. Balbach, V. Forge, N.A.J. van Nuland, S.L. Winder, P.J. Hore, C.M. Dobson, Following protein folding in real time using NMR spectroscopy, Nature Structural Biology, 2 (1995) 865-870.
[152] M. Gal, M. Mishkovsky, L. Frydman, Real-time monitoring of chemical transformations by ultrafast 2D NMR spectroscopy, J Am Chem Soc, 128 (2006) 951-956.
[153] J.I. Guijarro, C.J. Morton, K.W. Plaxco, I.D. Campbell, C.M. Dobson, Folding kinetics of the SH3 domain of PI3 kinase by real-time NMR combined with optical spectroscopy, J Mol Biol, 276 (1998) 657-667.
[154] R. Bhattacharyya, B. Key, H. Chen, A.S. Best, A.F. Hollenkamp, C.P. Grey, In situ NMR observation of the formation of metallic lithium microstructures in lithium batteries, Nat Mater, 9 (2010) 504-510.
[155] B. Key, R. Bhattacharyya, M. Morcrette, V. Seznec, J.M. Tarascon, C.P. Grey, Real-time NMR investigations of structural changes in silicon electrodes for lithium-ion batteries, J Am Chem Soc, 131 (2009) 9239-9249.
[156] P. Schanda, B. Brutscher, Very Fast Two-Dimensional NMR Spectroscopy for Real-Time Investigation of Dynamic Events in Proteins on the Time Scale of Seconds, Journal of the American Chemical Society, 127 (2005) 8014-8015.
[157] S. Bowen, C. Hilty, Time‐Resolved Dynamic Nuclear Polarization Enhanced NMR Spectroscopy, Angewandte Chemie International Edition, 47 (2008) 5235-5237.
[158] T. Matsuda, S. Koshiba, N. Tochio, E. Seki, N. Iwasaki, T. Yabuki, M. Inoue, S. Yokoyama, T. Kigawa, Improving cell-free protein synthesis for stable-isotope labeling, J Biomol NMR, 37 (2007) 225-229.
[159] C. Opitz, S. Isogai, S. Grzesiek, An economic approach to efficient isotope labeling in insect cells using homemade 15N-, 13C- and 2H-labeled yeast extracts, J Biomol NMR, 62 (2015) 373-385.
[160] K. Kato, Y. Yamaguchi, Y. Arata, Stable-isotope-assisted NMR approaches to glycoproteins using immunoglobulin G as a model system, Prog Nucl Magn Reson Spectrosc, 56 (2010) 346-359.
[161] Y. Yamaguchi, H. Yagi, K. Kato, Stable isotope labeling of glycoproteins for NMR study, in: K. Kato, T. Peters (Eds.) NMR in Glycoscience and Glycotechnology, RSC Publishing (Cambridge), 2017, pp. 194-205.
[162] M.K. Rosen, K.H. Gardner, R.C. Willis, W.E. Parris, T. Pawson, L.E. Kay, Selective methyl group protonation of perdeuterated proteins, Journal of Molecular Biology, 263 (1996) 627-636.
[163] K.H. Gardner, L.E. Kay, Production and incorporation of N-15, C-13, H-2 (H-1-delta 1 methyl) isoleucine into proteins for multidimensional NMR studies, Journal of the American Chemical Society, 119 (1997) 7599-7600.
[164] K. Sinha, L. Jen-Jacobson, G.S. Rule, Specific Labeling of Threonine Methyl Groups for NMR Studies of Protein-Nucleic Acid Complexes, Biochemistry, 50 (2011) 10189-10191.
[165] I. Ayala, R. Sounier, N. Use, P. Gans, J. Boisbouvier, An efficient protocol for the complete incorporation of methyl-protonated alanine in perdeuterated protein, Journal of Biomolecular NMR, 43 (2009) 111-119.
[166] A. Velyvis, A.M. Ruschak, L.E. Kay, An economical method for production of (2)H, (13)CH3-threonine for solution NMR studies of large protein complexes: application to the 670 kDa proteasome, PLoS One, 7 (2012) e43725.
[167] R.L. Isaacson, P.J. Simpson, M. Liu, E. Cota, X. Zhang, P. Freemont, S. Matthews, A new labeling method for methyl transverse relaxation-optimized spectroscopy NMR spectra of alanine residues, Journal of the American Chemical Society, 129 (2007) 15428-+.
[168] M. Fischer, K. Kloiber, J. Häusler, K. Ledolter, R. Konrat, W. Schmid, Synthesis of a 13C-Methyl-Group-Labeled Methionine Precursor as a Useful Tool for Simplifying Protein Structural Analysis by NMR Spectroscopy, ChemBioChem, 8 (2007) 610-612.
[169] S. Rajesh, D. Nietlispach, H. Nakayama, K. Takio, E.D. Laue, T. Shibata, Y. Ito, A novel method for the biosynthesis of deuterated proteins with selective protonation at the aromatic rings of Phe, Tyr and Trp, Journal of Biomolecular NMR, 27 (2003) 81-86.


[170] R.J. Lichtenecker, N. Coudevylle, R. Konrat, W. Schmid, Selective Isotope Labelling of Leucine Residues by Using alpha-Ketoacid Precursor Compounds, Chembiochem, 14 (2013) 818-821.
[171] J. Schroghuber, T. Sara, M. Bisaccia, W. Schmid, R. Konrat, R.J. Lichtenecker, Novel Approaches in Selective Tryptophan Isotope Labeling by Using Escherichia coli Overexpression Media, Chembiochem, 16 (2015) 746-751.
[172] B. Ramaraju, H. McFeeters, B. Vogler, R.L. McFeeters, Bacterial production of site specific 13C labeled phenylalanine and methodology for high level incorporation into bacterially expressed recombinant proteins, J Biomol NMR, 67 (2017) 23-34.
[173] M. Kainosho, T. Torizawa, Y. Iwashita, T. Terauchi, A. Mei Ono, P. Guntert, Optimal isotope labelling for NMR protein structure determinations, Nature, 440 (2006) 52-57.
[174] C.J. Yang, M. Takeda, T. Terauchi, J. Jee, M. Kainosho, Differential Large-Amplitude Breathing Motions in the Interface of FKBP12-Drug Complexes, Biochemistry, 54 (2015) 6983-6995.
[175] M. Takeda, Y. Miyanoiri, T. Terauchi, M. Kainosho, C-13-NMR studies on disulfide bond isomerization in bovine pancreatic trypsin inhibitor (BPTI), Journal of Biomolecular NMR, 66 (2016) 37-53.
[176] K. Pervushin, R. Riek, G. Wider, K. Wuthrich, Attenuated T2 relaxation by mutual cancellation of dipole-dipole coupling and chemical shift anisotropy indicates an avenue to NMR structures of very large biological macromolecules in solution, Proc Natl Acad Sci U S A, 94 (1997) 12366-12371.
[177] K. Pervushin, D. Braun, C. Fernandez, K. Wuthrich, [N-15,H-1]/[C-13,H-1]-TROSY for simultaneous detection of backbone N-15-H-1, aromatic C-13-H-1 and side-chain N-15-H-1(2) correlations in large proteins, Journal of Biomolecular NMR, 17 (2000) 195-202.
[178] C. Fernandez, G. Wider, TROSY in NMR studies of the structure and function of large biological macromolecules, Current Opinion in Structural Biology, 13 (2003) 570-580.
[179] V. Tugarinov, P.M. Hwang, J.E. Ollerenshaw, L.E. Kay, Cross-correlated relaxation enhanced 1H-13C NMR spectroscopy of methyl groups in very high molecular weight proteins and protein complexes, J Am Chem Soc, 125 (2003) 10420-10428.
[180] M. Mobli, A.S. Stern, J.C. Hoch, Spectral reconstruction methods in fast NMR: Reduced dimensionality, random sampling and maximum entropy, Journal of Magnetic Resonance, 182 (2006) 96-105.
[181] J.C.J. Barna, E.D. Laue, M.R. Mayger, J. Skilling, S.J.P. Worrall, Exponential Sampling, an Alternative Method for Sampling in Two-Dimensional NMR Experiments, Journal of Magnetic Resonance, 73 (1987) 69-77.
[182] S.G. Hyberts, K. Takeuchi, G. Wagner, Poisson-Gap Sampling and Forward Maximum Entropy Reconstruction for Enhancing the Resolution and Sensitivity of Protein NMR Data, Journal of the American Chemical Society, 132 (2010) 2145-+.
[183] K. Kazimierczuk, A. Zawadzka, W. Kozminski, Optimization of random time domain sampling in multidimensional NMR, Journal of Magnetic Resonance, 192 (2008) 123-130.
[184] E.D. Laue, M.R. Mayger, J. Skilling, J. Staunton, Reconstruction of phase sensitive 2D NMR spectra by maximum entropy, Journal of Magnetic Resonance, 68 (1986) 14-29.
[185] J.C. Hoch, M.W. Maciejewski, M. Mobli, A.D. Schuyler, A.S. Stern, Nonuniform Sampling and Maximum Entropy Reconstruction in Multidimensional NMR, Accounts of Chemical Research, 47 (2014) 708-717.
[186] S.G. Hyberts, G.J. Heffron, N.G. Tarragona, K. Solanky, K.A. Edmonds, H. Luithardt, J. Fejzo, M. Chorev, H. Aktas, K. Colson, K.H. Falchuk, J.A. Halperin, G. Wagner, Ultrahigh-resolution H-1-C-13 HSQC spectra of metabolite mixtures using nonlinear sampling and forward maximum entropy reconstruction, Journal of the American Chemical Society, 129 (2007) 5108-5116.
[187] J. Hamatsu, D. O'Donovan, T. Tanaka, T. Shirai, Y. Hourai, T. Mikawa, T. Ikeya, M. Mishima, W. Boucher, B.O. Smith, E.D. Laue, M. Shirakawa, Y. Ito, High-resolution heteronuclear multidimensional NMR of proteins in living insect cells using a baculovirus protein expression system, J Am Chem Soc, 135 (2013) 1688-1691.
[188] V.A. Jaravine, V.Y. Orekhov, Targeted acquisition for real-time NMR spectroscopy, Journal of the American Chemical Society, 128 (2006) 13421-13426.
[189] V. Jaravine, I. Ibraghimov, V.Y. Orekhov, Removal of a time barrier for high-resolution multidimensional NMR spectroscopy, Nature Methods, 3 (2006) 605-607.

[190] M. Mayzel, J. Rosenlow, L. Isaksson, V.Y. Orekhov, Time-resolved multidimensional NMR with nonuniform sampling, Journal of Biomolecular NMR, 58 (2014) 129-139.
[191] Y. Matsuki, M.T. Eddy, J. Herzfeld, Spectroscopy by Integration of Frequency and Time Domain Information for Fast Acquisition of High-Resolution Dark Spectra, Journal of the American Chemical Society, 131 (2009) 4648-4656.
[192] Y. Matsuki, T. Konuma, T. Fujiwara, K. Sugase, Boosting Protein Dynamics Studies Using Quantitative Nonuniform Sampling NMR Spectroscopy, Journal of Physical Chemistry B, 115 (2011) 13740-13745.
[193] T.E. Linnet, K. Teilum, Non-uniform sampling of NMR relaxation data, J Biomol NMR, 64 (2016) 165-173.
[194] K. Kazimierczuk, V.Y. Orekhov, Accelerated NMR spectroscopy by using compressed sensing, Angew Chem Int Ed Engl, 50 (2011) 5556-5559.
[195] D.J. Holland, M.J. Bostock, L.F. Gladden, D. Nietlispach, Fast multidimensional NMR spectroscopy using compressed sensing, Angew Chem Int Ed Engl, 50 (2011) 6548-6551.
[196] A.S. Stern, D.L. Donoho, J.C. Hoch, NMR data processing using iterative thresholding and minimum l(1)-norm reconstruction, J Magn Reson, 188 (2007) 295-300.
[197] S.G. Hyberts, A.G. Milbradt, A.B. Wagner, H. Arthanari, G. Wagner, Application of iterative soft thresholding for fast reconstruction of NMR data non-uniformly sampled with multidimensional Poisson Gap scheduling, J Biomol NMR, 52 (2012) 315-327.
[198] S. Sun, M. Gill, Y. Li, M. Huang, R.A. Byrd, Efficient and generalized processing of multidimensional NUS NMR data: the NESTA algorithm and comparison of regularization terms, J Biomol NMR, 62 (2015) 105-117.
[199] J. Ying, F. Delaglio, D.A. Torchia, A. Bax, Sparse multidimensional iterative lineshape-enhanced (SMILE) reconstruction of both non-uniformly sampled and conventional NMR data, J Biomol NMR, (2016).
[200] N. Tjandra, A. Bax, Direct measurement of distances and angles in biomolecules by NMR in a dilute liquid crystalline medium, Science, 278 (1997) 1111-1114.
[201] J. Iwahara, C.D. Schwieters, G.M. Clore, Ensemble approach for NMR structure refinement against (1)H paramagnetic relaxation enhancement data arising from a flexible paramagnetic group attached to a macromolecule, J Am Chem Soc, 126 (2004) 5879-5896.
[202] I. Bertini, C. Luchinat, G. Parigi, R. Pierattelli, NMR spectroscopy of paramagnetic metalloproteins, Chembiochem, 6 (2005) 1536-1549.
[203] V. Gaponenko, S.P. Sarma, A.S. Altieri, D.A. Horita, J. Li, R.A. Byrd, Improving the accuracy of NMR structures of large proteins using pseudocontact shifts as long-range restraints, J Biomol NMR, 28 (2004) 205-212.
[204] G.M. Clore, M.R. Starich, A.M. Gronenborn, Measurement of Residual Dipolar Couplings of Macromolecules Aligned in the Nematic Phase of a Colloidal Suspension of Rod-Shaped Viruses, Journal of the American Chemical Society, 120 (1998) 10571-10572.
[205] S.M. Douglas, J.J. Chou, W.M. Shih, DNA-nanotube-induced alignment of membrane proteins for NMR structure determination, Proc Natl Acad Sci U S A, 104 (2007) 6644-6648.
[206] K. Chen, N. Tjandra, The use of residual dipolar coupling in studying proteins by NMR, Top Curr Chem, 326 (2012) 47-67.
[207] G.M. Clore, J. Iwahara, Theory, practice, and applications of paramagnetic relaxation enhancement for the characterization of transient low-population states of biological macromolecules and their complexes, Chem Rev, 109 (2009) 4108-4139.
[208] M.-L. Mattinen, K. Pääkkönen, T. Ikonen, J. Craven, T. Drakenberg, R. Serimaa, J. Waltho, A. Annila, Quaternary Structure Built from Subunits Combining NMR and Small-Angle X-Ray Scattering Data, Biophysical Journal, 83 (2002) 1177-1183.
[209] A. Grishaev, J. Wu, J. Trewhella, A. Bax, Refinement of multidomain protein structures by combination of solution small-angle X-ray scattering and NMR data, Journal of the American Chemical Society, 127 (2005) 16621-16628.
[210] F. Gabel, B. Simon, M. Sattler, A target function for quaternary structural refinement from small angle scattering and NMR orientational restraints, European Biophysics Journal, 35 (2006) 313-327.

[211] A. Grishaev, V. Tugarinov, L.E. Kay, J. Trewhella, A. Bax, Refined solution structure of the 82-kDa enzyme malate synthase G from joint NMR and synchrotron SAXS restraints, Journal of Biomolecular NMR, 40 (2008) 95-106.
[212] V. Venditti, C.D. Schwieters, A. Grishaev, G.M. Clore, Dynamic equilibrium between closed and partially closed states of the bacterial Enzyme I unveiled by solution NMR and X-ray scattering, Proc Natl Acad Sci U S A, 112 (2015) 11565-11570.
[213] V. Tugarinov, W.Y. Choy, V.Y. Orekhov, L.E. Kay, Solution NMR-derived global fold of a monomeric 82-kDa enzyme, Proceedings of the National Academy of Sciences of the United States of America, 102 (2005) 622-627.
[214] V. Tugarinov, L.E. Kay, Stereospecific NMR assignments of prochiral methyls, rotameric states and dynamics of valine residues in malate synthase G, Journal of the American Chemical Society, 126 (2004) 9827-9836.
[215] G.A. Mueller, W.Y. Choy, D. Yang, J.D. Forman-Kay, R.A. Venters, L.E. Kay, Global folds of proteins with low densities of NOEs using residual dipolar couplings: application to the 370-residue maltodextrin-binding protein, J Mol Biol, 300 (2000) 197-212.
[216] L. Brocchieri, S. Karlin, Protein length in eukaryotic and prokaryotic proteomes, Nucleic Acids Research, 33 (2005) 3390-3400.
[217] R. Rosenzweig, S. Moradi, A. Zarrine-Afsar, J.R. Glover, L.E. Kay, Unraveling the mechanism of protein disaggregation through a ClpB-DnaK interaction, Science, 339 (2013) 1080-1083.
[218] C. Dominguez, R. Boelens, A.M. Bonvin, HADDOCK: a protein-protein docking approach based on biochemical or biophysical information, J Am Chem Soc, 125 (2003) 1731-1737.
[219] S.J. de Vries, A.D. van Dijk, M. Krzeminski, M. van Dijk, A. Thureau, V. Hsu, T. Wassenaar, A.M. Bonvin, HADDOCK versus HADDOCK: new features and performance of HADDOCK2.0 on the CAPRI targets, Proteins, 69 (2007) 726-733.
[220] C. Huang, P. Rossi, T. Saio, C.G. Kalodimos, Structural basis for the antifolding activity of a molecular chaperone, Nature, 537 (2016) 202-206.
[221] P. Guntert, C. Mumenthaler, K. Wuthrich, Torsion angle dynamics for NMR structure calculation with the new program DYANA, J Mol Biol, 273 (1997) 283-298.
[222] P. Guntert, Automated structure determination from NMR spectra, Eur Biophys J, 38 (2009) 129-143.
[223] I. Gelis, A.M. Bonvin, D. Keramisanou, M. Koukaki, G. Gouridis, S. Karamanou, A. Economou, C.G. Kalodimos, Structural basis for signal-sequence recognition by the translocase motor SecA as determined by NMR, Cell, 131 (2007) 756-769.
[224] T. Saio, X. Guan, P. Rossi, A. Economou, C.G. Kalodimos, Structural basis for protein antiaggregation activity of the trigger factor chaperone, Science, 344 (2014) 1250494.
[225] R. Sprangers, L.E. Kay, Quantitative dynamics and binding studies of the 20S proteasome by NMR, Nature, 445 (2007) 618-622.
[226] V. Tugarinov, R. Sprangers, L.E. Kay, Probing side-chain dynamics in the proteasome by relaxation violated coherence transfer NMR spectroscopy, J Am Chem Soc, 129 (2007) 1743-1750.
[227] T.L. Religa, R. Sprangers, L.E. Kay, Dynamic Regulation of Archaeal Proteasome Gate Opening As Studied by TROSY NMR, Science, 328 (2010) 98-102.
[228] A.M. Ruschak, T.L. Religa, S. Breuer, S. Witt, L.E. Kay, The proteasome antechamber maintains substrates in an unfolded state, Nature, 467 (2010) 868-871.
[229] T.L. Religa, A.M. Ruschak, R. Rosenzweig, L.E. Kay, Site-directed methyl group labeling as an NMR probe of structure and dynamics in supramolecular protein systems: applications to the proteasome and to the ClpP protease, J Am Chem Soc, 133 (2011) 9063-9068.
[230] A.M. Ruschak, L.E. Kay, Proteasome allostery as a population shift between interchanging conformers, Proceedings of the National Academy of Sciences of the United States of America, 109 (2012) E3454-E3462.
[231] A. Velyvis, L.E. Kay, Measurement of active site ionization equilibria in the 670 kDa proteasome core particle using methyl-TROSY NMR, J Am Chem Soc, 135 (2013) 9259-9262.
[232] M.P. Latham, A. Sekhar, L.E. Kay, Understanding the mechanism of proteasome 20S core particle gating, Proceedings of the National Academy of Sciences of the United States of America, 111 (2014) 5532-5537.

[233] Z. Serber, A.T. Keatinge-Clay, R. Ledwidge, A.E. Kelly, S.M. Miller, V. Dötsch, High-resolution macromolecular NMR spectroscopy inside living cells, Journal of the American Chemical Society, 123 (2001) 2446-2447.
[234] K. Bertrand, S. Reverdatto, D.S. Burz, R. Zitomer, A. Shekhtman, Structure of proteins in eukaryotic compartments, J Am Chem Soc, 134 (2012) 12798-12806.
[235] P. Selenko, Z. Serber, B. Gade, J. Ruderman, G. Wagner, Quantitative NMR analysis of the protein G B1 domain in Xenopus laevis egg extracts and intact oocytes, Proceedings of the National Academy of Sciences of the United States of America, 103 (2006) 11904-11909.
[236] T. Sakai, H. Tochio, T. Tenno, Y. Ito, T. Kokubo, H. Hiroaki, M. Shirakawa, In-cell NMR spectroscopy of proteins inside Xenopus laevis oocytes, J Biomol NMR, 36 (2006) 179-188.
[237] I.G. Zigoneanu, G.J. Pielak, Interaction of alpha-synuclein and a cell penetrating fusion peptide with higher eukaryotic cell membranes assessed by 19F NMR, Mol Pharm, 9 (2012) 1024-1029.
[238] F.X. Theillet, A. Binolfi, B. Bekei, A. Martorana, H.M. Rose, M. Stuiver, S. Verzini, D. Lorenz, M. van Rossum, D. Goldfarb, P. Selenko, Structural disorder of monomeric alpha-synuclein persists in mammalian cells, Nature, 530 (2016) 45-50.
[239] K. Inomata, A. Ohno, H. Tochio, S. Isogai, T. Tenno, I. Nakase, T. Takeuchi, S. Futaki, Y. Ito, H. Hiroaki, M. Shirakawa, High-resolution multi-dimensional NMR spectroscopy of proteins in human cells, Nature, 458 (2009) 106-109.
[240] B. Bekei, In-cell NMR Spectroscopy in Mammalian Cells, Ph.D. thesis, Freie Universität Berlin, Germany, (2013).
[241] L. Banci, L. Barbieri, I. Bertini, E. Luchinat, E. Secci, Y. Zhao, A.R. Aricescu, Atomic-resolution monitoring of protein maturation in live human cells by NMR, Nat Chem Biol, 9 (2013) 297-299.
[242] D. Liu, R. Xu, D. Cowburn, Chapter 8 Segmental Isotopic Labeling of Proteins for Nuclear Magnetic Resonance, 462 (2009) 151-175.
[243] J. Koehler, J. Meiler, Expanding the utility of NMR restraints with paramagnetic compounds: background and practical aspects, Prog Nucl Magn Reson Spectrosc, 59 (2011) 360-389.
[244] S. Ogino, S. Kubo, R. Umemoto, S. Huang, N. Nishida, I. Shimada, Observation of NMR signals from proteins introduced into living Mammalian cells by reversible membrane permeabilization using a pore-forming toxin, streptolysin o, Journal of the American Chemical Society, 131 (2009) 10834-10835.
[245] J. Danielsson, X. Mu, L. Lang, H. Wang, A. Binolfi, F.X. Theillet, B. Bekei, D.T. Logan, P. Selenko, H. Wennerstrom, M. Oliveberg, Thermodynamics of protein destabilization in live cells, Proc Natl Acad Sci U S A, 112 (2015) 12402-12407.
[246] R. Hansel, S. Foldynova-Trantirkova, F. Lohr, J. Buck, E. Bongartz, E. Bamberg, H. Schwalbe, V. Dotsch, L. Trantirek, Evaluation of parameters critical for observing nucleic acids inside living Xenopus laevis oocytes by in-cell NMR spectroscopy, J Am Chem Soc, 131 (2009) 15761-15768.
[247] T. Muntener, D. Haussinger, P. Selenko, F.X. Theillet, In-Cell Protein Structures from 2D NMR Experiments, J Phys Chem Lett, 7 (2016) 2821-2825.
[248] B.B. Pan, F. Yang, Y. Ye, Q. Wu, C. Li, T. Huber, X.C. Su, 3D structure determination of a protein in living cells using paramagnetic NMR spectroscopy, Chem Commun (Camb), 52 (2016) 10237-10240.
[249] D.S. Hembram, T. Haremaki, J. Hamatsu, J. Inoue, H. Kamoshida, T. Ikeya, M. Mishima, T. Mikawa, N. Hayashi, M. Shirakawa, Y. Ito, An in-cell NMR study of monitoring stress-induced increase of cytosolic Ca2+ concentration in HeLa cells, Biochem Biophys Res Commun, 438 (2013) 653-659.
[250] I. Walev, S.C. Bhakdi, F. Hofmann, N. Djonder, A. Valeva, K. Aktories, S. Bhakdi, Delivery of proteins into living cells by reversible membrane permeabilization with streptolysin-O, Proceedings of the National Academy of Sciences, 98 (2001) 3185-3190.
[251] S. Bhakdi, U. Weller, I. Walev, E. Martin, D. Jonas, M. Palmer, A guide to the use of pore-forming toxins for controlled permeabilization of cell membranes, Medical microbiology and immunology, 182 (1993) 167-175.
[252] R.J. Ellis, Macromolecular crowding: obvious but underappreciated, Trends in Biochemical Sciences, 26 (2001) 597-604.
[253] A.C. Miklos, C.G. Li, N.G. Sharaf, G.J. Pielak, Volume Exclusion and Soft Interaction Effects on Protein Stability under Crowded Conditions, Biochemistry, 49 (2010) 6984-6991.

[254] Y.Q. Wang, C.G. Li, G.J. Pielak, Effects of Proteins on Protein Diffusion, Journal of the American Chemical Society, 132 (2010) 9392-9397.
[255] H.-X. Zhou, G. Rivas, A.P. Minton, Macromolecular Crowding and Confinement: Biochemical, Biophysical, and Potential Physiological Consequences, Annual Review of Biophysics, 37 (2008) 375-397.
[256] R. Harada, Y. Sugita, M. Feig, Protein crowding affects hydration structure and dynamics, J Am Chem Soc, 134 (2012) 4842-4849.
[257] O. Zhang, J.D. Forman-Kay, Structural characterization of folded and unfolded states of an SH3 domain in equilibrium in aqueous buffer, Biochemistry, 34 (1995) 6784-6794.
[258] A.E. Smith, L.Z. Zhou, A.H. Gorensek, M. Senske, G.J. Pielak, In-cell thermodynamics and a new role for protein surfaces, Proc Natl Acad Sci U S A, 113 (2016) 1725-1730.
[259] C.O. Barnes, W.B. Monteith, G.J. Pielak, Internal and global protein motion assessed with a fusion construct and in-cell NMR spectroscopy, Chembiochem, 12 (2011) 390-391.
[260] A.E. Smith, L.Z. Zhou, G.J. Pielak, Hydrogen exchange of disordered proteins in Escherichia coli, Protein Sci, 24 (2015) 706-713.
[261] M.P. Latham, L.E. Kay, Probing non-specific interactions of Ca2+-calmodulin in E. coli lysate, J Biomol NMR, 55 (2013) 239-247.
[262] M.P. Latham, L.E. Kay, A Similar In Vitro and In Cell Lysate Folding Intermediate for the FF Domain, Journal of Molecular Biology, 426 (2014) 3214-3220.
[263] D. Sakakibara, A. Sasaki, T. Ikeya, J. Hamatsu, T. Hanashima, M. Mishima, M. Yoshimasu, N. Hayashi, T. Mikawa, M. Walchli, B.O. Smith, M. Shirakawa, P. Guntert, Y. Ito, Protein structure determination in living cells by in-cell NMR spectroscopy, Nature, 458 (2009) 102-105.
[264] T. Ikeya, A. Sasaki, D. Sakakibara, Y. Shigemitsu, J. Hamatsu, T. Hanashima, M. Mishima, M. Yoshimasu, N. Hayashi, T. Mikawa, D. Nietlispach, M. Walchli, B.O. Smith, M. Shirakawa, P. Guntert, Y. Ito, NMR protein structure determination in living E. coli cells using nonlinear sampling, Nature Protocols, 5 (2010) 1051-1060.
[265] T. Ikeya, J.G. Jee, Y. Shigemitsu, J. Hamatsu, M. Mishima, Y. Ito, M. Kainosho, P. Guntert, Exclusively NOESY-based automated NMR assignment and structure determination of proteins, J Biomol NMR, 50 (2011) 137-146.
[266] E. Schmidt, P. Guntert, A new algorithm for reliable and general NMR resonance assignment, J Am Chem Soc, 134 (2012) 12817-12829.
[267] T. Ikeya, S. Ikeda, T. Kigawa, Y. Ito, P. Güntert, Protein NMR Structure Refinement based on Bayesian Inference, Journal of Physics: Conference Series, 699 (2016) 012005.
[268] T. Ikeya, T. Hanashima, S. Hosoya, M. Shimazaki, S. Ikeda, M. Mishima, P. Güntert, Y. Ito, Improved in-cell structure determination of proteins at near-physiological concentration, Scientific Reports, 6 (2016) 38312.
[269] S. Kubo, N. Nishida, Y. Udagawa, O. Takarada, S. Ogino, I. Shimada, A gel-encapsulated bioreactor system for NMR studies of protein-protein interactions in living mammalian cells, Angew Chem Int Ed Engl, 52 (2013) 1208-1211.
[270] Y. Ye, X. Liu, G. Xu, M. Liu, C. Li, Direct observation of Ca2+-induced calmodulin conformational transitions in intact Xenopus laevis oocytes by 19F NMR spectroscopy, Angew Chem Int Ed Engl, 54 (2015) 5328-5330.
[271] Y. Hikone, G. Hirai, M. Mishima, K. Inomata, T. Ikeya, S. Arai, M. Shirakawa, M. Sodeoka, Y. Ito, A new carbamidemethyl-linked lanthanoid chelating tag for PCS NMR spectroscopy of proteins in living HeLa cells, J Biomol NMR, 66 (2016) 99-110.
[272] Y. Shen, O. Lange, F. Delaglio, P. Rossi, J.M. Aramini, G. Liu, A. Eletsky, Y. Wu, K.K. Singarapu, A. Lemak, A. Ignatchenko, C.H. Arrowsmith, T. Szyperski, G.T. Montelione, D. Baker, A. Bax, Consistent blind protein structure generation from NMR chemical shift data, Proc Natl Acad Sci U S A, 105 (2008) 4685-4690.
[273] H. Yagi, K.B. Pilla, A. Maleckis, B. Graham, T. Huber, G. Otting, Three-dimensional protein fold determination from backbone amide pseudocontact shifts generated by lanthanide tags at multiple sites, Structure, 21 (2013) 883-890.

[274] B.D. Beck, Polymerization of the bacterial elongation factor for protein synthesis, EF-Tu, Eur J Biochem, 97 (1979) 495-502.
[275] M. Beck, A. Schmidt, J. Malmstroem, M. Claassen, A. Ori, A. Szymborska, F. Herzog, O. Rinner, J. Ellenberg, R. Aebersold, The quantitative proteome of a human cell line, Mol Syst Biol, 7 (2011) 549.
[276] N.G. Sharaf, C.O. Barnes, L.M. Charlton, G.B. Young, G.J. Pielak, A bioreactor for in-cell protein NMR, J Magn Reson, 202 (2010) 140-146.

8. Figure Captions

Fig. 1 The complete dynamic range of NMR and the variety of experiments that can access the entire temporal range.

Fig. 2 Schematic visualising a variety of NMR observables that are sensitive to different temporal ranges. Together, the techniques cover the entire temporal range over which protein motion occurs. Fast motion is typically characterised with conventional relaxation methods, which yield the amplitudes of rapid bond-vector motion, usually expressed as the Lipari-Szabo order parameter (S²_LS). Amplitude information for motions from rapid picosecond events up to milliseconds, which includes the supra-τc range, can be deduced from RDC and CCR approaches. The amplitudes of bond motions within this range are reported by the RDC-based order parameter (S²_RDC), whereas CCR rates report on how two different inter-nuclear vectors fluctuate with respect to one another. These features are represented by the range of potential conformations in a bundle of structures. Different CCR rates are represented by differently coloured planes between inter-nuclear vectors, such as intra-residue CCR rates (purple plane), the 13C-1H(i-1)/15N-1HN(i-2) rates (orange planes) and the 13C-1H(i-3)/13C-1H(i-4) rates. The residue numbering serves only to illustrate multiple potential rates. Hydrogen, oxygen, carbon and nitrogen atoms are coloured white, red, green and blue, respectively. Where structures interconvert (ex) with one another, leading to a modulation of the chemical shift, RD and EST are elegant tools for accessing the kinetics and structural information of the transient states. A dashed line under the subsection for the RD experiments shows the new capabilities of high-power approaches. The existence of RD is identified by a change in the effective relaxation rate as a function of the amplitude of the refocusing field, denoted here as RF. The fastest lifetime observable by RD approaches is approximately 3 µs [121] and is indicated within the µs-ms subset of the figure. Real-time NMR is also sensitive to structural and kinetic changes, which can be observed by collecting a series of spectra in which different states are followed through intensity changes and/or chemical shift differences.

Fig. 3 Selective protonation approach for protein 3D structure determination. (a) Metabolic conversion of conventional isotope-labelled precursors to methyl-protonated amino acids. (b) Chemical structures of the SAIL amino acids, in which H denotes 1H and D denotes 2H. (c) 3D structures of MBP obtained with distance restraints from methyl-protonated labelling (left, PDB ID: 1EZP), PRE data (centre, PDB ID: 2KLF) and SAIL (right, PDB ID: 2D21). The X-ray structure (PDB ID: 1DMB), the N-terminal domain and the C-terminal domain are shown in red, blue and green, respectively. The bundles of 10 NMR structures were superimposed separately for the two flexibly connected domains. (d) Distance restraints obtained with the SAIL method, represented as red lines in the ribbon model.

Fig. 4 Sampling schemes and reconstructed spectra for the indirect dimensions of 3D 15N-selected NOESY data of the protein FixJC. Schematic illustration of conventional full sampling, the non-uniform sampling schemes with 1/2, 1/4, 1/8 and 1/16 randomly selected data points, and 1/16 linearly selected sampling points for the t1 and t2 indirect dimensions of the 3D 15N-selected NOESY spectra. A representative F1(1H)-F2(15N) cross section at F3(1HN) = 8.36 ppm was extracted and is shown for the MaxEnt-processed 3D 15N-selected NOESY spectra reconstructed from the conventionally acquired ‘reference’ data, from the data with 1/2, 1/4, 1/8 and 1/16 randomly selected sampling points, and from the data with 1/16 linearly selected points.
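
The difference between the randomly and linearly selected schedules in Fig. 4 can be illustrated with a minimal Python sketch. It is not the scheme generator used for the figure: the 32 x 32 grid size, the uniform (unweighted) random selection, the function name `sampling_schedule` and the reading of ‘linearly selected’ as equal truncation of both indirect dimensions are all assumptions made for illustration.

```python
import math
import random


def sampling_schedule(n1, n2, fraction, mode="random", seed=0):
    """Return the sorted (t1, t2) increment pairs kept from an n1 x n2 grid.

    fraction is the share of the full indirect-dimension grid that is
    acquired, e.g. 1/16. Hypothetical helper, for illustration only.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    grid = [(i, j) for i in range(n1) for j in range(n2)]
    if mode == "random":
        # Assumption: points are drawn uniformly, without any weighting.
        n_keep = max(1, round(len(grid) * fraction))
        return sorted(random.Random(seed).sample(grid, n_keep))
    if mode == "linear":
        # Assumption: "linear" means truncating both indirect dimensions
        # equally, so each axis keeps sqrt(fraction) of its increments.
        k1 = max(1, round(n1 * math.sqrt(fraction)))
        k2 = max(1, round(n2 * math.sqrt(fraction)))
        return [(i, j) for i, j in grid if i < k1 and j < k2]
    raise ValueError(f"unknown mode: {mode}")


full = sampling_schedule(32, 32, 1.0)
random_16 = sampling_schedule(32, 32, 1 / 16, mode="random")
linear_16 = sampling_schedule(32, 32, 1 / 16, mode="linear")
print(len(full), len(random_16), len(linear_16))  # prints: 1024 64 64
```

Both reduced schedules contain the same number of points, but the linear one reaches only the first quarter of each evolution time, whereas the random one still samples long t1 and t2 values, which is what allows resolution to be retained after reconstruction.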

Fig. 5 Statistical distributions of molecular structures deposited in the PDB as of April 2017. (a) Size distributions of structures up to 300 kDa deposited with the ‘experimental method’ given as ‘solution NMR’ (blue) and ‘X-ray diffraction’ (green). (b) Size distribution of structures deposited as ‘solution NMR’ (blue) in the range of 50 to 550 kDa. Representative molecules are illustrated as ribbon models. Note that an ‘experimental method’ of solution NMR does not mean that those 3D structures were determined exclusively from NMR data. (c) Schematic illustration of the large biomacromolecules analysed by solution NMR that are described in this article, plotted against molecular mass and number of subunits. MSG and TF denote malate synthase G and trigger factor, respectively.

Fig. 6 Schematic illustration of the proposed approaches for incorporating isotopically labelled molecules into cells: intrinsic over-expression in cells (a), microinjection into Xenopus laevis oocytes (b), CPP-mediated delivery (c), incorporation with a pore-forming toxin (d) and electroporation (e). Representative cells used in each approach are listed.

Fig. 7 Schematic illustration of expected effects that contribute to protein stability, dynamics and 3D structures under intracellular environments.

Fig. 8 H/D exchange experiment on CPP-tagged proteins delivered into HeLa cells. (a) Schematic illustration of the H/D exchange experiment on proteins conjugated with CPP tags in HeLa cells. In this method, uniformly 2H/15N-labelled proteins carrying CPP tags are incorporated into cells, the cells are disrupted after an arbitrary period and the 1H NMR signals of the proteins are measured in the lysate. Ub denotes ubiquitin. (b) Schematic illustration of the exchange of 2H nuclei for 1H in unstable conformations under intracellular environments. The ball-and-stick diagrams with 2H (blue) and 1H (red) below correspond to the red residues in the proteins. Inomata et al. showed that the exchange rate in cells was 15–22 times that observed in vitro.

Fig. 9 Structures of the proteins TTHA1718 (a) and GB1 (b) in living E. coli. Here, 380 (20%) of 1,900 conformers yielded by the Bayesian-assisted refinement and 20 out of 100 in the final step of the conventional method are superimposed on the 20 structures determined in vitro (red), respectively. Distance restraints derived on the basis of the automatically assigned chemical shifts are represented in the white ribbon model with red lines (right).

Highlights

Solution NMR is an ideal tool to study dynamical ordering of biomacromolecules.
Biomolecular dynamics over a broad range of timescales can be characterised by NMR.
Current NMR techniques can deal with huge supramolecular machinery over 1 MDa.
The NMR approach also enables the study of protein structures and dynamics in living cells.
