Seminar: Efficient approximations of RNA kinetics landscapes using non-redundant sampling

2017-11-08: by Yann Ponty. LIX - Batiment Turing, Ecole Polytechnique, France. The seminar will take place Nov 08, 15.30-16.30 at University of Copenhagen, SUND/SCIENCE, Building 1-04, Grønnegårdsvej 7, 1st floor library, Frederiksberg C.

While minimum free energy (MFE) and partition function-based approaches have proven largely successful at predicting the structure of ribonucleic acids (RNAs), they rely on a thermodynamic equilibrium hypothesis, an assumption that is not guaranteed to hold in finite time, either in theory or in practice. For instance, thermodynamic equilibrium is not consistent with several well-documented phenomena, including co-transcriptional folding, in which the early folding of the nascent transcript impacts its final conformation, and the selective adoption of a conformation upon ligand binding in riboswitches.

However, departing from the world of thermodynamics to embrace kinetics, the study of RNA folding in finite time, comes at a steep cost. Many of the elegant simplifications used to study RNA at equilibrium, following the seminal work of McCaskill, no longer hold. Even the most modest building block of kinetics studies, the computation of the energy barrier between two conformations, was shown to be NP-hard by Manuch et al. A current challenge in RNA kinetics is to identify the key elements of folding landscapes, and to contribute efficient heuristics for estimating individual transfer rates.
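
For readers less familiar with these quantities, the energy barrier and its link to transfer rates are commonly written as below. The notation is an assumption made for this note and is not taken from the talk: $E$ is the free energy, a path $s = s_0, s_1, \dots, s_k = t$ proceeds by elementary moves (for example adding or removing one base pair), $A$ is a rate prefactor, $R$ the gas constant and $T$ the temperature. The second line is an Arrhenius-type approximation, one common choice among several.

```latex
\begin{align*}
  B(s, t) &= \min_{\substack{\text{paths } s_0, \dots, s_k \\ s_0 = s,\ s_k = t}}
             \ \max_{0 \le i \le k} E(s_i) \;-\; E(s), \\
  k_{s \to t} &\approx A \, \exp\!\left( -\frac{B(s, t)}{R\,T} \right).
\end{align*}
```

The minimisation ranges over all paths between the two conformations, which is where the combinatorial difficulty comes from.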

In this talk, I will present a new approach, based on sampling, to generate simplified folding landscapes featuring the key conformations. Our approach to identifying the key landmarks relies on two ideas. First, we restrict our sampling to locally optimal structures, a subset of conformations which are good representatives of energy basins, yet still amenable to elegant and efficient combinatorial decompositions. Then, we introduce a dedicated data structure to avoid redundancy during the sampling, while keeping the sampling unbiased. Our rationale for generating unique structures is that the sampling probability associated with each structure is explicitly known, and that redundancy is both uninformative and time-consuming. We validate our theoretical speed-up on several datasets. Moreover, we argue that, by adaptively allowing access to structures of higher energy basins, our non-redundant strategy produces approximate landscapes that yield more precise numerical estimates of RNA dynamics.
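
To give a feel for the non-redundant idea, here is a minimal sketch on a toy model. It is not the data structure or implementation presented in the talk. A "structure" is a tuple of independent choices with positive integer weights (standing in for Boltzmann factors), and a dictionary records, for every prefix, the weight already emitted below it. The class name `NonRedundantSampler` and all parameters are hypothetical. Subtracting the emitted weight at each step means every draw follows the original distribution conditioned on not repeating an earlier sample.

```python
import random


class NonRedundantSampler:
    """Toy sampler over tuples of independent choices with integer weights."""

    def __init__(self, weights, seed=None):
        # weights[i][c]: positive integer weight of option c at position i.
        self.weights = weights
        self.rng = random.Random(seed)
        n = len(weights)
        # suffix_total[k]: total weight of all completions of positions k..n-1.
        self.suffix_total = [1] * (n + 1)
        for i in range(n - 1, -1, -1):
            self.suffix_total[i] = sum(weights[i]) * self.suffix_total[i + 1]
        # used[prefix]: weight of the structures already emitted below prefix.
        self.used = {}

    def _remaining(self, prefix, prefix_weight):
        total = prefix_weight * self.suffix_total[len(prefix)]
        return total - self.used.get(prefix, 0)

    def sample(self):
        """Return (structure, weight) never returned before, or None."""
        if self._remaining((), 1) == 0:
            return None  # every structure has been emitted
        prefix, weight = (), 1
        for options in self.weights:
            candidates = [(prefix + (c,), weight * w)
                          for c, w in enumerate(options)]
            masses = [self._remaining(p, pw) for p, pw in candidates]
            # Pick a child proportionally to its not-yet-emitted weight.
            r = self.rng.randrange(sum(masses))
            for (child, child_weight), mass in zip(candidates, masses):
                if r < mass:
                    prefix, weight = child, child_weight
                    break
                r -= mass
        # Record the emitted weight on every prefix of the new structure.
        for k in range(len(prefix) + 1):
            key = prefix[:k]
            self.used[key] = self.used.get(key, 0) + weight
        return prefix, weight


if __name__ == "__main__":
    sampler = NonRedundantSampler([[1, 2], [3, 1], [1, 1]], seed=42)
    while (drawn := sampler.sample()) is not None:
        print(drawn)
```

Integer weights keep the bookkeeping exact in this sketch. With floating-point Boltzmann factors, the subtraction would need numerical care so that exhausted branches are not revisited.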

This work was mainly performed by Juraj MICHALIK (Ecole Polytechnique & Inria Saclay, France), and was co-directed with Hélène TOUZET (Univ. Lille I & Inria Lille Nord Europe, France).