Enhancing Molecular Property Prediction through Task-Oriented Transfer Learning: Integrating Universal Structural Insights and Domain-Specific Knowledge
Journal of Medicinal Chemistry (IF 7.3), Pub Date: 2024-05-15, DOI: 10.1021/acs.jmedchem.4c00692
Yanjing Duan 1, Xixi Yang 2, Xiangxiang Zeng 2, Wenxuan Wang 1, Youchao Deng 1, Dongsheng Cao 1, 3

Precisely predicting molecular properties is crucial in drug discovery, but the scarcity of labeled data poses a challenge for applying deep learning methods. While large-scale self-supervised pretraining has proven an effective solution, it often neglects domain-specific knowledge. To tackle this issue, we introduce Task-Oriented Multilevel Learning based on BERT (TOML-BERT), a dual-level pretraining framework that considers both structural patterns and domain knowledge of molecules. TOML-BERT achieved state-of-the-art prediction performance on 10 pharmaceutical datasets. It has the capability to mine contextual information within molecular structures and extract domain knowledge from massive pseudo-labeled data. The dual-level pretraining accomplished significant positive transfer, with its two components making complementary contributions. Interpretive analysis elucidated that the effectiveness of the dual-level pretraining lies in the prior learning of a task-related molecular representation. Overall, TOML-BERT demonstrates the potential of combining multiple pretraining tasks to extract task-oriented knowledge, advancing molecular property prediction in drug discovery.
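
The abstract does not spell out the pretraining tasks, so the following is only a minimal sketch of the two ideas it names, under stated assumptions: a BERT-style masking step that hides atom tokens so a model can learn to recover them from structural context, and a pseudo-labeling step in which a teacher assigns property values to unlabeled molecules. The names `mask_tokens`, `pseudo_label`, `toy_teacher`, the `[MASK]` token, and the 15% masking rate are all hypothetical illustrations, not the authors' implementation.

```python
import random
from typing import Callable, List, Sequence, Tuple

MASK = "[MASK]"  # hypothetical mask token; the paper's vocabulary is not given here


def mask_tokens(
    tokens: Sequence[str], rate: float, rng: random.Random
) -> Tuple[List[str], List[int]]:
    """Structural level (assumed BERT-style): hide a fraction of atom tokens.

    Returns the masked sequence and the sorted positions that were hidden,
    which would serve as prediction targets during self-supervised pretraining.
    """
    if not tokens:
        return [], []
    # Mask at least one token, and never more than the sequence holds.
    n_mask = min(len(tokens), max(1, round(len(tokens) * rate)))
    positions = sorted(rng.sample(range(len(tokens)), n_mask))
    masked = list(tokens)
    for i in positions:
        masked[i] = MASK
    return masked, positions


def pseudo_label(
    molecules: Sequence[Sequence[str]],
    teacher: Callable[[Sequence[str]], float],
) -> List[Tuple[Sequence[str], float]]:
    """Domain level (assumed): attach a teacher's prediction to each unlabeled molecule."""
    return [(mol, teacher(mol)) for mol in molecules]


def toy_teacher(tokens: Sequence[str]) -> float:
    """Stand-in teacher: fraction of N/O tokens. A real teacher would be a trained model."""
    return sum(t in {"N", "O"} for t in tokens) / len(tokens)


if __name__ == "__main__":
    rng = random.Random(0)
    atoms = ["C", "C", "O", "N", "C", "C"]  # toy atom-token sequence
    masked, positions = mask_tokens(atoms, rate=0.15, rng=rng)
    print(masked, positions)
    print(pseudo_label([atoms], toy_teacher))
```

In a full pipeline, a model would presumably be pretrained on the masked sequences first, then on the pseudo-labeled pairs, and finally fine-tuned on the small labeled dataset for the target property.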

Updated: 2024-05-15