Issue 20, 2024

Big data benchmarking: how do DFT methods across the rungs of Jacob's ladder perform for a dataset of 122k CCSD(T) total atomization energies?

Abstract

Total atomization energies (TAEs) are a central quantity in density functional theory (DFT) benchmark studies. However, TAE databases obtained from experiment or high-level ab initio wavefunction theory have so far included at most hundreds of TAEs. Here, we use the GDB-9 database of 133k CCSD(T) TAEs generated by Curtiss and co-workers [B. Narayanan, P. C. Redfern, R. S. Assary and L. A. Curtiss, Chem. Sci., 2019, 10, 7449] to evaluate the performance of 14 representative DFT methods across the rungs of Jacob's ladder (namely, PBE, BLYP, B97-D, M06-L, τ-HCTH, PBE0, B3LYP, B3PW91, ωB97X-D, τ-HCTHh, PW6B95, M06, M06-2X, and MN15). We first use the A25[PBE] diagnostic for nondynamical correlation to eliminate systems that potentially include significant multireference effects, for which the CCSD(T) TAEs might not be sufficiently reliable. The resulting database (denoted by GDB9-nonMR) includes 122k species. Of the considered functionals, B3LYP attains the best performance relative to the G4(MP2) reference TAEs, with a mean absolute deviation (MAD) of 4.09 kcal mol⁻¹. This first-generation hybrid functional, in which the three mixing coefficients were fitted against a small set of TAEs, is one of the few functionals that are not systematically biased towards overestimating the G4(MP2) TAEs, as demonstrated by a mean-signed deviation (MSD) of 0.45 kcal mol⁻¹. The relatively good performance of B3LYP is followed by the heavily parameterized M06-L meta-GGA functional, which attains a MAD of 6.24 kcal mol⁻¹. The PW6B95, M06, M06-2X, and MN15 functionals tend to systematically overestimate the G4(MP2) TAEs and attain MADs ranging between 18.69 (M06) and 28.54 (MN15) kcal mol⁻¹. However, PW6B95 and M06-2X exhibit particularly narrow error distributions. Thus, scaling their TAEs by an empirical scaling factor reduces their MADs to merely 3.38 (PW6B95) and 2.85 (M06-2X) kcal mol⁻¹. Empirical dispersion corrections (e.g., D3 and D4) are attractive, and therefore, their inclusion worsens the performance of methods that systematically overestimate the TAEs.
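The error statistics quoted in the abstract (MAD and MSD) and the effect of an empirical scaling factor can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the four TAEs are synthetic rather than taken from GDB9-nonMR, the function names are hypothetical, and the scaling factor is obtained by a least-squares fit through the origin, which may differ from the procedure used in the paper. The sketch only shows why a method with a large systematic overestimation but a narrow error distribution can reach a small MAD after scaling.

```python
from statistics import fmean


def msd(calc, ref):
    """Mean signed deviation: average of (calc - ref)."""
    return fmean(c - r for c, r in zip(calc, ref))


def mad(calc, ref):
    """Mean absolute deviation: average of |calc - ref|."""
    return fmean(abs(c - r) for c, r in zip(calc, ref))


def scale_factor(calc, ref):
    """Least-squares factor s minimizing sum((s*calc - ref)**2).

    Assumption: a fit through the origin; the paper may fit differently.
    """
    return sum(c * r for c, r in zip(calc, ref)) / sum(c * c for c in calc)


# Synthetic reference TAEs in kcal/mol (illustrative values only).
ref = [400.0, 800.0, 1200.0, 1600.0]

# Synthetic "DFT" TAEs: a 2% systematic overestimation plus small scatter,
# i.e. a large bias combined with a narrow error distribution.
scatter = [0.5, -0.5, 0.5, -0.5]
calc = [r * 1.02 + d for r, d in zip(ref, scatter)]

raw_msd = msd(calc, ref)
raw_mad = mad(calc, ref)

s = scale_factor(calc, ref)
scaled = [s * c for c in calc]
scaled_mad = mad(scaled, ref)

print(f"raw MSD    = {raw_msd:.2f} kcal/mol")
print(f"raw MAD    = {raw_mad:.2f} kcal/mol")
print(f"scale s    = {s:.4f}")
print(f"scaled MAD = {scaled_mad:.2f} kcal/mol")
```

In this toy case the raw MSD and MAD coincide because every deviation is positive, which mirrors a functional that systematically overestimates the reference TAEs. A factor slightly below one removes most of that bias, leaving a MAD governed by the remaining scatter.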

Article information

Article type: Paper
Submitted: 27 Jan 2024
Accepted: 01 May 2024
First published: 13 May 2024

Phys. Chem. Chem. Phys., 2024, 26, 14594-14606

A. Karton, Phys. Chem. Chem. Phys., 2024, 26, 14594 DOI: 10.1039/D4CP00387J