DEBFold: Computational Identification of RNA Secondary Structures for Sequences across Structural Families Using Deep Learning
Journal of Chemical Information and Modeling (IF 5.6) Pub Date: 2024-04-22, DOI: 10.1021/acs.jcim.4c00458
Tzu-Hsien Yang

RNAs are now known to play more active roles in cellular pathways than simply serving as transcription templates. These biological mechanisms may be mediated by higher-order RNA conformations, which makes understanding RNA secondary structures a necessary first step. However, experimental protocols for solving RNA structures are too costly and time-consuming for large-scale investigation. Various computational tools have therefore been developed to predict RNA secondary structures from sequences. Recently, deep networks have been investigated for predicting RNA structures directly from sequences. However, existing deep-learning-based tools suffer to varying degrees from model overfitting, owing to complicated problem formulations and flawed model training processes, which limits their application to sequences from different structural families. In this research, we designed a two-stage RNA structure prediction strategy called DEBFold (deep ensemble boosting and folding), based on convolutional encoding/decoding and self-attention mechanisms, to enhance existing thermodynamic structure models. Moreover, the model training process followed rigorous steps to achieve acceptable prediction generalization. On the family-wise reserved test sets and the PDB-derived test set, DEBFold achieves better structure prediction performance than traditional tools and existing deep-learning methods. In summary, we obtained a deep-learning-based structure prediction tool with strong across-family generalization performance. The DEBFold tool can be accessed at https://cobis.bme.ncku.edu.tw/DEBFold/.

Updated: 2024-04-22