Researchers have used a novel AI-based method called Drfold to improve the accuracy of RNA models by more than 70 percent compared with traditional approaches.
Queenstown/Singapore – A research team from the Cancer Science Institute of Singapore (CSI Singapore) at the National University of Singapore (NUS) has successfully harnessed artificial intelligence (AI) and deep-learning techniques to model atomic-level RNA 3D structures from primary RNA sequences. Called Drfold, this novel AI-based method improves the accuracy of RNA models by more than 70 percent, compared to traditional approaches.
The team, which is led by Professor Zhang Yang from CSI Singapore and NUS School of Computing, published their findings in the scientific journal Nature Communications on 16 September 2023.
RNAs are large biomolecules consisting of a single chain of nucleotides, which derive their sequence order from double-stranded DNA molecules during transcription. RNAs are widely known for their role in the transcription and translation processes, which facilitate the transfer of genetic information embodied in DNA sequences into protein amino acid sequences. In recent years, RNAs have also been found to play important roles in regulating various biological processes, positioning them as novel drug targets.
It has been estimated that targeting RNAs with small molecules will expand the drug design landscape exponentially, compared to traditional protein-targeted drug discovery. Accordingly, RNA biology and its applications in developing new therapeutics represent a critical emerging field, garnering significant academic and industry investment worldwide.
Predicting RNA structures
Compared to well-folded protein structures, RNA structures and their folds are generally considered less stable due to the relatively shallow energy landscape. Therefore, traditional physics- and statistics-based force fields, which are often error-prone, cannot accurately describe the elegant and intricate folding interactions of RNAs. Meanwhile, the limited availability of experimental RNA structures in the Protein Data Bank (PDB) further constrains the accuracy of these traditional knowledge-based force fields, which are derived from the statistics of the PDB structures.
To address these challenges, Drfold uses two complementary deep-learning network pipelines: one focused on end-to-end learning, and the other on geometrical restraint learning. This approach significantly improved the accuracy of the AI-based force field. Coupling the two networks further improved accuracy beyond that of AI potentials based on a single neural network.
The key innovation lies in introducing a deep learning approach for predicting RNA tertiary structure. While traditional methods relied on homology modelling or physics-based folding simulations, both of which are limited by the accuracy of the force field, Drfold uses self-attention transformer networks to predict 3D structures from RNA sequences, marking a revolutionary shift in addressing this crucial challenge. Drfold’s strategy of integrating two parallel and complementary networks, built on end-to-end learning and geometry learning, helps to enhance the accuracy of the potential function and of RNA model prediction, making it light, highly flexible, scalable, and hence the preferred prediction method.
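To illustrate what "self-attention" means in this context, the following is a minimal sketch of a single scaled dot-product self-attention step applied to a one-hot encoded RNA sequence. It is not Drfold's architecture: the example sequence, the model width `d_model`, the random weight matrices, and the helper names `one_hot` and `self_attention` are all assumptions chosen purely for illustration.

```python
import numpy as np

NUCLEOTIDES = "ACGU"  # the four RNA bases


def one_hot(seq: str) -> np.ndarray:
    """Encode an RNA sequence as a (length, 4) one-hot matrix."""
    indices = [NUCLEOTIDES.index(base) for base in seq]
    return np.eye(len(NUCLEOTIDES))[indices]


def self_attention(x, w_q, w_k, w_v):
    """One scaled dot-product self-attention step (illustrative only)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Every position scores every other position in the sequence.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Row-wise softmax, shifted by the row maximum for numerical stability.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


rng = np.random.default_rng(0)
d_model = 8                 # hypothetical feature width
x = one_hot("GGCAUC")       # hypothetical 6-nucleotide sequence
# Random, untrained projection matrices stand in for learned weights.
w_q, w_k, w_v = (rng.normal(size=(4, d_model)) for _ in range(3))

out, weights = self_attention(x, w_q, w_k, w_v)
print(out.shape, weights.shape)  # (6, 8) (6, 6)
```

Each row of `weights` sums to one and describes how strongly one nucleotide attends to every other nucleotide, which is what lets a transformer relate positions that are far apart in the sequence.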
Dr Li Yang, a Research Scientist at CSI Singapore and first author of this study, said, “Since the biological functions of RNAs depend on the specific tertiary structures, it becomes increasingly important and necessary to determine the 3D structures of RNAs in order to facilitate RNA-based function annotation and drug discovery.”
He added, “The gold-standard methods in structural biology for determining RNA structures (biophysical experiments such as X-ray crystallography, Cryogenic Electron Microscopy (Cryo-EM), and Nuclear Magnetic Resonance (NMR) Spectroscopy) are often cost- and labour-intensive, limiting their application to a tiny portion of known RNAs. Currently, there are more than 30 million known RNA sequences in the RNAcentral database, but fewer than 500 (or 0.0017 per cent) have experimentally solved structures. This frustratingly leaves more than 99 per cent of RNA targets with no structural information. Hence, our study’s core aim is to develop new computational methods capable of predicting high-quality RNA structure models, filling this substantial information gap.”
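The percentages in the quote can be checked with a few lines of arithmetic. The sketch below uses the rounded figures as quoted (30 million sequences, 500 structures); the article gives them only as "more than" and "fewer than", so the results are approximate, and the function name is an illustrative assumption.

```python
# Rounded figures as quoted in the article (approximate bounds, not exact counts).
KNOWN_RNA_SEQUENCES = 30_000_000
SOLVED_STRUCTURES = 500


def percent_with_structures(solved: int, known: int) -> float:
    """Share of known sequences with an experimentally solved structure, in per cent."""
    return 100.0 * solved / known


coverage = percent_with_structures(SOLVED_STRUCTURES, KNOWN_RNA_SEQUENCES)
print(f"{coverage:.4f} per cent have solved structures")  # 0.0017 per cent
print(f"{100 - coverage:.4f} per cent have none")         # 99.9983 per cent
```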
Potential applications in drug design and virtual screening
Commenting on the significance of their research, Prof Zhang, Senior Principal Investigator at CSI Singapore and corresponding author of the study, highlighted, “Our primary goal for this study is to bridge the gap between the scarcity of experimental RNA structures and the increasing demand from the RNA biology field and the drug industry. In this regard, high-confidence Drfold models can be used as a starting point to guide RNA drug design and virtual screening, or to help elucidate the biological functions of RNA molecules in cells.”
Date: 08.12.2025
“Considering the potency and effectiveness of mRNA vaccines in combating pandemics, tools such as Drfold play a crucial role in predicting and optimizing RNA structures and the stability of vaccines. Furthermore, these tools can be used to study the biological functions of RNAs, particularly non-coding RNAs, and design novel RNA experiments using predicted models which follow the sequence-to-structure-to-function paradigm,” Prof Zhang added.
The group has released the Drfold source code to the public via their webpage: https://zhanggroup.org/Drfold. Its high scalability and open-source framework make it flexible and applicable to other related problems, such as RNA-protein interaction modelling.
Next steps
Moving forward, the team envisions extending their AI strategy to encompass protein-RNA interactions, an area where reliable AI approaches for high-quality protein-RNA complex structure prediction are currently absent. Such tools are highly relevant for RNA function annotation and RNA drug discovery.
In addition, the team hopes to further improve Drfold’s accuracy in single-chain RNA structure prediction. One of the inherent barriers is the limited availability of experimental RNA structures, which constrains the accuracy of the deep learning models, especially for large RNAs (longer than roughly 200 nucleotides). Novel strategies and ideas are needed to break through this bottleneck in high-accuracy RNA structure prediction, and the researchers are currently working on this with encouraging progress.