Robust training of machine learning interatomic potentials with dimensionality reduction and stratified sampling
npj Computational Materials (IF 9.7) Pub Date: 2024-02-26, DOI: 10.1038/s41524-024-01227-4
Ji Qi, Tsz Wai Ko, Brandon C. Wood, Tuan Anh Pham, Shyue Ping Ong

Machine learning interatomic potentials (MLIPs) enable accurate simulations of materials at scales beyond those accessible to ab initio methods and play an increasingly important role in the study and design of materials. However, MLIPs are only as accurate and robust as the data on which they are trained. Here, we present DImensionality-Reduced Encoded Clusters with sTratified (DIRECT) sampling as an approach to select a robust training set of structures from a large and complex configuration space. By applying DIRECT sampling to the Materials Project relaxation trajectories dataset, which contains over one million structures and 89 elements, we develop an improved materials 3-body graph network (M3GNet) universal potential that extrapolates more reliably to unseen structures. We further show that molecular dynamics (MD) simulations with the M3GNet universal potential can be used instead of expensive ab initio MD to rapidly create a large configuration space for target systems. We combine this scheme with DIRECT sampling to develop a reliable moment tensor potential for titanium hydrides without the need for iterative augmentation of training structures. This work paves the way for robust high-throughput development of MLIPs across any compositional complexity.
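
The abstract names three ingredients of DIRECT sampling (dimensionality reduction, clustering of encoded structures, and stratified selection) without giving implementation details. The following is a minimal sketch of that general pattern, not the authors' implementation. Everything specific in it is an assumption made for illustration: the synthetic 5-dimensional feature vectors stand in for real structure encodings, PCA is done by power iteration, k-means with farthest-point seeding stands in for whatever clustering the paper uses, and each cluster contributes its members nearest the centroid. The function names (`pca_project`, `kmeans`, `direct_like_sample`) are hypothetical.

```python
import math
import random


def _dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def pca_project(rows, n_components, n_iter=200, seed=0):
    """Project rows onto their leading principal components.

    Illustrative PCA: power iteration with deflation on the covariance matrix.
    """
    rng = random.Random(seed)
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    components = []
    for _ in range(n_components):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        scale = math.sqrt(sum(x * x for x in v))
        v = [x / scale for x in v]
        for _ in range(n_iter):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient estimates the eigenvalue; subtracting its rank-one
        # term lets the next pass converge to the next component.
        cv = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        eigval = sum(v[a] * cv[a] for a in range(d))
        cov = [[cov[a][b] - eigval * v[a] * v[b] for b in range(d)]
               for a in range(d)]
        components.append(v)
    return [[sum(c[j] * comp[j] for j in range(d)) for comp in components]
            for c in centered]


def _assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: _dist2(p, centers[c]))
            for p in points]


def kmeans(points, k, n_iter=50):
    """Plain k-means with deterministic farthest-point seeding (a stand-in)."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(_dist2(p, c) for c in centers))
        centers.append(list(far))
    for _ in range(n_iter):
        labels = _assign(points, centers)
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return _assign(points, centers), centers


def direct_like_sample(features, n_components=2, n_clusters=3, per_cluster=1):
    """Reduce, cluster, then take up to `per_cluster` indices from each cluster."""
    reduced = pca_project(features, n_components)
    labels, centers = kmeans(reduced, n_clusters)
    selected = []
    for c in range(n_clusters):
        members = [i for i, lab in enumerate(labels) if lab == c]
        members.sort(key=lambda i: _dist2(reduced[i], centers[c]))
        selected.extend(members[:per_cluster])
    return sorted(selected)


# Synthetic stand-in data: three well-separated groups of 20 "structures" each.
rng = random.Random(1)
features = []
for cx, cy in [(0.0, 0.0), (10.0, 0.0), (0.0, 5.0)]:
    for _ in range(20):
        features.append([cx + rng.gauss(0, 0.1), cy + rng.gauss(0, 0.1)]
                        + [rng.gauss(0, 0.1) for _ in range(3)])

picked = direct_like_sample(features, n_components=2, n_clusters=3, per_cluster=1)
print(picked)
```

Taking a fixed number of samples from every cluster, rather than sampling the whole pool at random, is what keeps sparsely populated regions of the configuration space represented in the selected set. For the actual featurization and clustering choices, the paper itself should be consulted.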




Updated: 2024-02-26