High-resolution reversible folding of hyperstable RNA tetraloops using molecular dynamics simulations
Alan A. Chen and Angel E. García
PNAS, September 2013
Published online before print September 16, 2013, doi: 10.1073/pnas.1309392110
This paper combines a couple of interesting ideas, but their importance only becomes apparent in the broader context of RNA structure determination. For this paper, as with many of the other papers I write about, I’ll include two short summaries to begin. The first is a high level, general audience summary, which should be understandable regardless of your background, assuming you’ve had some exposure to scientific language. The second is a more bio-focussed summary, aimed at people in the biological sciences who are perhaps not familiar with computational structure determination or RNA.
If you understand why this paper is important just from this initial overview, you’re probably not going to get a whole lot more out of this essay, but I do recommend you go and read the paper itself. It’s incredibly well written, and honestly just a great example of how what is, to a certain extent, a methods paper can be dynamic, interesting, and engaging.
RNA is a type of molecule found inside all cells. Structurally, one can think about RNA a bit like a necklace, where each bead on the necklace is a nucleotide (one type of molecule) and the string is made of a phosphate backbone (another type of molecule). However, in reality RNA doesn’t look like a long, flexible, string-like molecule, but instead folds up into all kinds of weird and wonderful structures. Tetraloops are one example of a very stable RNA structure. These tetraloops are made of just a few nucleotides (necklace beads), but have been shown to stay stable and in a well defined three dimensional orientation even at high temperatures. This was a bit unexpected when it was first observed, because you’d think a short loop of RNA would be very unstable. However, further work helped determine the exact structure of the tetraloops (i.e. what those loops look like in 3D).
Considering we know what these tetraloops look like, we’d really like it if our computer models of RNA could reproduce this structure if we told the model what the individual beads looked like and what order they were in. The problem is that right now we can’t, i.e. we know that in real life 6 beads orientate themselves into a tetraloop, but in our computer model those same 6 beads just stay like a necklace. The obvious explanation for this is that there are problems with our RNA model.
In this paper, Chen and Garcia showed that by developing new parameters for this RNA model, they were able to implement an RNA model which successfully forms a tetraloop out of the six beads. Not only did the beads form a tetraloop, but the tetraloop matches the known experimental information, i.e. we can now make models which successfully reproduce the tetraloop structure we see in cells. This is the first time our computer models have been able to do this, so it means we might now be able to study other RNA structures more accurately.
Tetraloops are super-stable RNA structural motifs, but current computational methods fail to predict their formation, despite a wealth of structural data. Chen and Garcia showed that by developing new AMBER forcefield parameters for RNA based on experimental data and quantum mechanical calculations, they were able to reproduce the folding/packing of three different RNA tetraloops using standard simulation approaches (replica exchange molecular dynamics). These simulations reproduced all the non-canonical interactions which have been observed experimentally, and represent the first time ab initio folding of tetraloops has been seen. A semi-empirical folding pathway is described, although the fact that the simulations use temperature replica exchange means this may not represent a realistic folding route under normal conditions. These parameters provide new depth for molecular dynamics simulations of RNAs, allowing simulations to describe interactions which in previous forcefield iterations were either poorly described or under-represented.
When people think about RNA, they typically imagine a long, stretched out molecule which shepherds information from DNA to the ribosome. In reality, RNA molecules carry out a range of roles, from structural support to catalytic activity, and like proteins fold into a wide array of 3D structures and shapes. The RNA folding problem is one that is many orders of magnitude more complicated than protein folding, yet we (as a scientific community) are still struggling with protein folding! In many ways, the fact that RNA has these complicated secondary and tertiary structures is often brushed under the rug, in no small part because we’re really not that good at predicting or determining what long RNA sequences might look like in terms of their secondary and tertiary structure. Certain aspects are well understood, such as canonical Watson-Crick base pairing (A-U and C-G), the same kind of pairing that defines the DNA helix (where T takes the place of U). But while such interactions are often seen in regions of RNA, many other structures which DNA could never form are also observed.
In proteins, structure prediction is approached in a number of different ways. Pseudo ab-initio methods use calibrated energy models to try and predict how a protein might fold via molecular dynamics. Knowledge based approaches (such as ROSETTA) use existing structural data to inform on what similar sequences might look like. RNA structure prediction is not blessed with the same wealth of structural data that proteins have, although in a number of cases specific motifs or structures have been solved to a high degree of resolution, through NMR and/or X-ray crystallography. One classic example of this is the tetraloop.
The tetraloop is (as the name suggests) a short loop of four RNA nucleotides which come together in a highly non-canonical manner to create an incredibly stable loop structure which has been well characterized structurally. Despite their structural stability, such loops represent a tiny percentage of RNA nucleotides in any one given RNA fragment. That said, they may provide a structural staple, a seed for early structure to form around in the folding process, meaning their importance on a folding pathway may be significant (this is conjecture on my part).
Much work in protein folding and dynamics has, especially in the last five years, been tackled by molecular dynamics. Here, using the same approaches as the pseudo ab-initio methods described earlier, a protein is described by some energy function which varies with the protein’s conformation, allowing for a simulation of how that protein may behave in solution. An in depth discussion on molecular dynamics is far beyond the scope of this essay, but suffice to say that this provides a way to examine how a protein might fold, how it might interact with ligands, and simply how it behaves at equilibrium. Much of the early MD work showed unequivocally that proteins are wobbly, jiggly structures, as opposed to the rigid stable structures people often imagined they might be. While MD has shown great promise in terms of proteins, the energy representations for RNA are still significantly lacking. This is demonstrated by the fact that despite the wealth of structural data, current models fail to re-create the formation of the tetraloop. By contrast, given enough time, energy models will effectively predict alpha helix or beta sheet formation, despite the significant increase in the number of residues involved compared to the four nucleotide tetraloop.
In this paper, Chen and Garcia aimed to address this issue. By re-parameterizing an RNA energy model using experimental data, classical thermodynamics, and quantum mechanics, they developed a representation which overcomes the weaknesses of older models and correctly predicts tetraloop formation. Moreover, this model doesn’t just predict the formation of tetraloops, but predicts the specific structural features which underlie their unusual stability. Considering this, their new model has implications for future simulations involving both RNA in isolation and RNA with proteins. It opens the door to studying short RNA strands much more effectively, as this new model correctly captures a number of different thermodynamic ideas which older models failed to get right.
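To make the idea of an "energy function which varies with conformation" a little more concrete, here is a minimal toy sketch in Python. It is not the AMBER forcefield and not the parameters from the paper: the function names, the default constants, and the three-term form (a harmonic bond term plus Lennard-Jones and Coulomb terms for non-bonded pairs) are all illustrative assumptions on my part. Real forcefields also include angle and dihedral terms, and the "parameters" being re-fitted are essentially the constants that appear in terms like these.

```python
import math

# All constants below are illustrative placeholders, NOT real forcefield parameters.

def bond_energy(r, r0=1.0, k=100.0):
    """Harmonic spring penalty for stretching a bond away from its rest length r0."""
    return 0.5 * k * (r - r0) ** 2

def lennard_jones(r, epsilon=0.2, sigma=3.4):
    """Short-range repulsion plus weak attraction between two non-bonded atoms."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)

def coulomb(r, q1, q2, ke=332.06):
    """Electrostatics; ke is an approximate constant in kcal*Angstrom/(mol*e^2)."""
    return ke * q1 * q2 / r

def total_energy(coords, bonds, charges):
    """Sum the toy terms over every pair of atoms in one conformation.

    coords  : list of (x, y, z) tuples
    bonds   : list of (i, j) index pairs that are covalently bonded
    charges : list of partial charges, one per atom
    """
    bonded = {frozenset(pair) for pair in bonds}
    energy = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            r = math.dist(coords[i], coords[j])
            if frozenset((i, j)) in bonded:
                energy += bond_energy(r)
            else:
                energy += lennard_jones(r) + coulomb(r, charges[i], charges[j])
    return energy
```

Moving any atom changes the pairwise distances and therefore the total energy. A simulation repeatedly uses that energy (or rather its gradient, the forces) to decide how the atoms move next.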
This paper also includes a possible folding scheme, whereby the tetraloops consistently form through a specific pathway. The authors correctly comment that the physiological relevance of such a pathway may be minimal, given that the 500 ns replicas are constantly exchanging temperature. However, it does show an interesting and consistent behaviour, whereby the initial formation of the tetraloop stem reduces the degrees of freedom and triggers loop formation, with the loop itself following closely.
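For readers wondering why "exchanging temperature" muddies the pathway, the sketch below shows the standard Metropolis swap test used in temperature replica exchange in general. It is a textbook version rather than the authors’ actual simulation setup, and the function names and example energies are my own hypothetical choices. Two copies of the system running at different temperatures periodically attempt to trade places, and the swap is accepted with probability min(1, exp[(β_i − β_j)(E_i − E_j)]), where β = 1/(k_B T).

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def swap_probability(energy_i, temp_i, energy_j, temp_j):
    """Metropolis acceptance probability for swapping two replicas.

    energy_* are potential energies in kcal/mol, temp_* are temperatures in K.
    """
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)

def attempt_swap(energy_i, temp_i, energy_j, temp_j, rng=random):
    """Return True if the swap is accepted on this attempt."""
    return rng.random() < swap_probability(energy_i, temp_i, energy_j, temp_j)

# Hypothetical example: the colder replica currently holds the higher-energy
# structure, so handing it the lower-energy one is always accepted.
p_favourable = swap_probability(-100.0, 300.0, -120.0, 320.0)
# The reverse situation is only accepted some of the time.
p_unfavourable = swap_probability(-120.0, 300.0, -100.0, 320.0)
```

Because any one trajectory hops between temperatures in this way, the sequence of structures it visits is not a continuous movie of folding at a single temperature, which is why the folding scheme needs to be read with some caution.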
In summary, this paper is focused around the development of a new set of RNA forcefield parameters which yield much better results than previous iterations. The fact that these parameters can recreate the tetraloop is, in many ways, simply a good demonstration that these parameters capture a set of physical processes much better than older parameters. The thought is, therefore, that these parameters may allow us to examine the behaviour of small RNAs which we don’t have existing structural information for, and to make more detailed predictions which can subsequently be tested experimentally.