An ecosystem for digital reticular chemistry

Scientific Achievement

Digital chemistry has shown great progress in energy-related applications ranging from carbon capture to gas separation. While machine learning (ML) holds promise to accelerate this field, good ML practices and datasets are still lacking in this area. We present the fundamentals of an ecosystem of datasets, methods, and good practices to advance this field.

Significance and Impact

The ecosystem we have created makes machine learning more accessible for reticular chemistry and beyond, especially for less experienced users, without compromising on rigor. This will allow a closer coupling of data-driven materials design with the synthesis and characterization of (in silico generated) materials.

Technical Approach

  • We highlight key pitfalls, such as significant data leakage between training and test sets, which prevents direct comparison of modeling approaches.
  • We identify that models are often validated incorrectly, and through inconsistent protocols, and we develop better validation criteria for ML models in this field.
  • We provide new ML methods that can be used directly on the newly curated datasets.
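
The data-leakage pitfall named above can be illustrated with a minimal sketch of a group-aware split. It is not the authors' method. It assumes that each structure carries a hypothetical group label (for example, an identifier shared by duplicate or near-duplicate frameworks) and that every member of a group must land on the same side of the split. The function names and the example labels are illustrative only.

```python
import random
from collections import defaultdict


def grouped_split(group_ids, test_fraction=0.2, seed=0):
    """Split sample indices so that no group appears in both train and test.

    group_ids: one (hypothetical) group label per sample; samples sharing a
    label are treated as near-duplicates that must stay together.
    """
    # Collect the sample indices belonging to each group.
    groups = defaultdict(list)
    for idx, gid in enumerate(group_ids):
        groups[gid].append(idx)

    # Shuffle whole groups (not individual samples) reproducibly.
    keys = sorted(groups)
    random.Random(seed).shuffle(keys)

    # Fill the test set group by group until the target size is reached.
    target = test_fraction * len(group_ids)
    train, test = [], []
    for key in keys:
        if len(test) < target:
            test.extend(groups[key])
        else:
            train.extend(groups[key])
    return sorted(train), sorted(test)


def leaked_groups(group_ids, train_idx, test_idx):
    """Return the set of group labels present in both train and test."""
    train_groups = {group_ids[i] for i in train_idx}
    test_groups = {group_ids[i] for i in test_idx}
    return train_groups & test_groups


# Toy example: six samples, two pairs of which are duplicates.
labels = ["mof-a", "mof-a", "mof-b", "mof-c", "mof-c", "mof-d"]
train_idx, test_idx = grouped_split(labels, test_fraction=0.2, seed=0)
print("train:", train_idx, "test:", test_idx)
print("leaked groups:", leaked_groups(labels, train_idx, test_idx))
```

The same `leaked_groups` check can be run on any existing split to see whether duplicates straddle the training and test sets. How groups are defined for real materials data is a separate modeling decision that this sketch leaves open.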

K. Jablonka, A. S. Rosen, A. S. Krishnapriyan, B. Smit, ACS Central Science (2023).

Better validation metrics for splitting datasets