Materials-Discovery Workflows Guided by Symbolic Regression: Identifying Acid-Stable Oxides for Electrocatalysis
Akhil S. Nair∗, Lucas Foppa, Matthias Scheffler
The NOMAD Laboratory at the Fritz Haber Institute of the Max-Planck-Gesellschaft, Faradayweg 4-6,
14195 Berlin, Germany
AI-driven workflows will accelerate materials discovery by efficiently guiding experiments or simulations towards materials with desired properties. However, the probabilistic AI approaches commonly used in these workflows are limited by the relatively small size of high-quality datasets, and they rely on low-dimensional representations that are typically unknown. Here, we train an ensemble of symbolic-regression models in order to obtain not only (mean) predictions but also their variance. This makes it possible to use symbolic regression in sequential-learning workflows for materials discovery. Specifically, we use the prediction uncertainties derived from the variance across the ensemble models to guide the acquisition of data in previously unexplored regions of materials space. We employ the sure-independence-screening-and-sparsifying-operator (SISSO) symbolic-regression approach, which identifies analytical expressions for the target property using moderate-sized datasets. These expressions are low-dimensional representations that depend on only a few key physicochemical parameters, selected from many offered candidates. Importantly, SISSO provides materials-property maps covering the entire materials space, further reducing the risk that the workflow misses promising materials that were overlooked in the initial dataset. We demonstrate the effectiveness of the SISSO-guided workflow by identifying acid-stable oxides for the water-splitting reaction through DFT-HSE06 calculations.
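The uncertainty-guided selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation and does not use SISSO: a plain least-squares fit stands in for each symbolic-regression model, and the ensemble is built by bootstrap resampling, which is an assumption because the abstract does not state how the ensemble members are generated. The function names (fit_linear, ensemble_predict, select_next), the parameter kappa, and the lower-confidence-bound selection rule are all hypothetical choices made for illustration, assuming a target property for which lower values are better.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares fit with intercept (stand-in for one symbolic-regression model)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Evaluate the fitted linear model on new feature rows."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


def ensemble_predict(X_train, y_train, X_pool, n_models=20, seed=0):
    """Train n_models on bootstrap resamples (assumed ensemble scheme) and
    return the mean and standard deviation of their predictions on X_pool."""
    rng = np.random.default_rng(seed)
    n = len(X_train)
    preds = np.empty((n_models, len(X_pool)))
    for m in range(n_models):
        idx = rng.integers(0, n, size=n)  # sample training rows with replacement
        coef = fit_linear(X_train[idx], y_train[idx])
        preds[m] = predict_linear(coef, X_pool)
    return preds.mean(axis=0), preds.std(axis=0)


def select_next(mean, std, kappa=1.0):
    """Pick the pool index minimizing mean - kappa * std (hypothetical rule).
    Larger kappa favors uncertain, unexplored candidates; kappa = 0 is pure exploitation."""
    return int(np.argmin(mean - kappa * std))


if __name__ == "__main__":
    # Synthetic toy data only; these are not materials data from the study.
    rng = np.random.default_rng(1)
    X_train = rng.normal(size=(30, 2))
    y_train = 1.0 + 2.0 * X_train[:, 0] - X_train[:, 1] + 0.1 * rng.normal(size=30)
    X_pool = rng.normal(size=(50, 2))
    mean, std = ensemble_predict(X_train, y_train, X_pool)
    print("next candidate index:", select_next(mean, std))
```

In a sequential-learning loop, the selected candidate would be evaluated (in this work, by a DFT-HSE06 calculation), added to the training set, and the ensemble retrained before the next selection.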
Keywords: SISSO, DFT, oxides, electrocatalysis, materials discovery
References
1. J. H. Montoya et al., “Autonomous intelligent agents for accelerated materials discovery,” Chem. Sci. (2020), 11, 8517–8532.
2. R. Ouyang et al., “SISSO: A compressed-sensing method for identifying the best low-dimensional descriptor in an immensity of offered candidates,” Phys. Rev. Mater. (2018), 2(8), 083802.