Big data benchmarking: how do DFT methods across the rungs of Jacob's ladder perform for a dataset of 122k CCSD(T) total atomization energies?
Physical Chemistry Chemical Physics (IF 3.3), Pub Date: 2024-05-13, DOI: 10.1039/d4cp00387j
Amir Karton

Total atomization energies (TAEs) are a central quantity in density functional theory (DFT) benchmark studies. However, TAE databases obtained from experiment or high-level ab initio wavefunction theory have so far included at most hundreds of TAEs. Here, we use the GDB-9 database of 133k CCSD(T) TAEs generated by Curtiss and co-workers [B. Narayanan, P. C. Redfern, R. S. Assary and L. A. Curtiss, Chem. Sci., 2019, 10, 7449] to evaluate the performance of 14 representative DFT methods across the rungs of Jacob's ladder (namely, PBE, BLYP, B97-D, M06-L, τ-HCTH, PBE0, B3LYP, B3PW91, ωB97X-D, τ-HCTHh, PW6B95, M06, M06-2X, and MN15). We first use the A₂₅[PBE] diagnostic for nondynamical correlation to eliminate systems that potentially include significant multireference effects, for which the CCSD(T) TAEs might not be sufficiently reliable. The resulting database (denoted by GDB9-nonMR) includes 122k species. Of the considered functionals, B3LYP attains the best performance relative to the G4(MP2) reference TAEs, with a mean absolute deviation (MAD) of 4.09 kcal mol⁻¹. This first-generation hybrid functional, in which the three mixing coefficients were fitted against a small set of TAEs, is one of the few functionals that are not systematically biased towards overestimating the G4(MP2) TAEs, as demonstrated by a mean signed deviation (MSD) of 0.45 kcal mol⁻¹. B3LYP is followed by the heavily parameterized M06-L meta-GGA functional, which attains a MAD of 6.24 kcal mol⁻¹. The PW6B95, M06, M06-2X, and MN15 functionals tend to systematically overestimate the G4(MP2) TAEs and attain MADs ranging between 18.69 (M06) and 28.54 (MN15) kcal mol⁻¹. However, PW6B95 and M06-2X exhibit particularly narrow error distributions. Thus, scaling their TAEs by an empirical scaling factor reduces their MADs to merely 3.38 (PW6B95) and 2.85 (M06-2X) kcal mol⁻¹.
Empirical dispersion corrections (e.g., D3 and D4) are attractive, that is, they lower the molecular energy and thereby increase the computed TAE; their inclusion therefore worsens the performance of methods that already systematically overestimate the TAEs.
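
The error statistics quoted in the abstract (MSD, MAD, and the effect of a single empirical scaling factor) can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration: the function names are hypothetical, the numbers are synthetic rather than taken from the paper, deviations are taken as DFT minus reference, and the scaling factor is obtained by a least-squares fit through the origin, which may differ from the procedure actually used by the author.

```python
import numpy as np

def msd(calc, ref):
    """Mean signed deviation; positive means calc overestimates ref (assumed convention)."""
    return float(np.mean(calc - ref))

def mad(calc, ref):
    """Mean absolute deviation between calculated and reference values."""
    return float(np.mean(np.abs(calc - ref)))

def least_squares_scale(calc, ref):
    """Scaling factor s minimizing sum((s * calc - ref)**2).

    Assumption: a least-squares fit through the origin; the paper's own
    fitting procedure for its empirical scaling factor is not given here.
    """
    return float(np.dot(calc, ref) / np.dot(calc, calc))

# Synthetic TAEs in kcal/mol (NOT data from the paper): a functional that
# overestimates every TAE by about 2% with a narrow spread of errors.
ref = np.array([400.0, 800.0, 1200.0, 1600.0])
calc = 1.02 * ref + np.array([0.5, -0.5, 0.5, -0.5])

s = least_squares_scale(calc, ref)
scaled = s * calc

print(f"unscaled: MSD = {msd(calc, ref):.2f}, MAD = {mad(calc, ref):.2f}")
print(f"scale factor = {s:.4f}")
print(f"scaled:   MSD = {msd(scaled, ref):.2f}, MAD = {mad(scaled, ref):.2f}")
```

In this toy case the errors are large but nearly proportional to the TAE, so one multiplicative factor removes most of the deviation. This mirrors the qualitative point made above for functionals with a systematic bias and a narrow error distribution.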

Updated: 2024-05-13