OPTIMADE API enables seamless access and interoperability across materials databases
by Carey Sargent, EPFL, NCCR MARVEL
Data has become a critical resource in materials science. Numerous experimental databases are widely used throughout the field and high-throughput electronic structure calculations have significantly increased the amount of data available from computational simulations. This explosion has led to a new paradigm of data-driven materials science, a field underpinned by databases that can be queried by humans and machines alike through application programming interfaces (APIs).
Materials databases differ in fidelity and focus, however, so it is often useful to extract and unify data from multiple sources. Doing this is rarely straightforward. Each database has its own specialized API that governs access patterns as well as the querying and representation of the underlying data. What’s more, as individual APIs evolve, existing clients must evolve with them, which requires a significant maintenance effort.
These challenges motivated the providers of several materials databases to come together to design and implement an API specification that enables seamless access and interoperability across materials databases. The result is OPTIMADE (v1), whose specification and design have been published today in Nature Research’s journal Scientific Data. By extracting technical and scientific commonalities from existing APIs, the developers have produced a specification that can be implemented across a broad range of materials domains, database back-ends and sizes. The development of such an API is likely to ease the effective use of big, open data in materials science.
AiiDA and Materials Cloud are among the leading databases that have been driving the design of such a specification, which resulted in OPTIMADE v1.0. Both AiiDA and Materials Cloud already offer an implementation of it, together with many other worldwide crystal structure databases including AFLOW, COD, TCOD, Materials Project, NOMAD, odbx, Open Materials Database (omdb), and OQMD. Together, they give researchers easy access to more than 10,000,000 results for different materials, providing benchmarking opportunities and offering a huge opportunity for high-throughput screening and machine learning studies.
OPTIMADE can search databases, expose links between them and deliver standardized results, putting it in a good position to significantly enhance the impact and permeability of pre-existing data silos. This should enable researchers to scan through new and unexpected material families and train models capable of understanding deep correlations from all available data.
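As a rough illustration of what "standardized" means in practice, the sketch below builds the same structures query for two providers and reads the reduced formula out of a response. It is a minimal sketch, not an official client: the provider base URLs are hypothetical placeholders, the helper names `build_structures_url` and `reduced_formulas` are invented for this example, and the sample response is a hand-written, heavily trimmed stand-in for what a server would return. No network request is made.

```python
from urllib.parse import urlencode, quote


def build_structures_url(base_url, filter_expr, page_limit=5):
    """Build an OPTIMADE v1 structures query URL for one provider."""
    # Percent-encode the filter so spaces and quotes survive in the URL.
    query = urlencode(
        {"filter": filter_expr, "page_limit": page_limit}, quote_via=quote
    )
    return f"{base_url.rstrip('/')}/v1/structures?{query}"


def reduced_formulas(response):
    """Collect chemical_formula_reduced from the entries in a response."""
    return [
        entry["attributes"].get("chemical_formula_reduced")
        for entry in response.get("data", [])
    ]


# Hypothetical base URLs (assumptions for illustration only).
PROVIDERS = [
    "https://provider-a.example.org/optimade",
    "https://provider-b.example.org/optimade/",
]

# One filter string, reused unchanged for every provider:
# binary compounds that contain both silicon and oxygen.
FILTER = 'elements HAS ALL "Si","O" AND nelements=2'

urls = [build_structures_url(base, FILTER) for base in PROVIDERS]

# Hand-written, trimmed example of a response body (illustrative only).
sample_response = {
    "data": [
        {
            "type": "structures",
            "id": "example-1",
            "attributes": {"chemical_formula_reduced": "O2Si"},
        }
    ]
}

for url in urls:
    print(url)
print(reduced_formulas(sample_response))
```

The point of the design is that only the base URL changes between providers; the filter string and the code that reads the response stay the same.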
The OPTIMADE API can also be queried through a custom graphical client, available on the Materials Cloud at http://materialscloud.org/optimadeclient, which allows users to query any database that serves data according to the OPTIMADE specification.
OPTIMADE is already flexible and will be extended to more use cases going forward. Its development and adoption depend, however, on the involvement of many scientists, and contributions from the community are strongly encouraged. Questions about development, provider registration or usage can be directed to the web forum (https://matsci.org/optimade) or the mailing list (email@example.com). Further information can also be found on the OPTIMADE website, https://optimade.org.
Looking ahead, the consortium plans to standardize more filterable materials properties, integrate molecular dynamics simulations and experimental results, extend the specification beyond electronic-structure calculations, and add support for ontological references and definitions.
The OPTIMADE specification is developed openly on GitHub with releases archived on Zenodo. Version 1.0.0 was released on 1 July 2020 and is licensed under Creative Commons Attribution 4.0 International (CC-BY 4.0).
All associated code is hosted under the Materials-Consortia organization on GitHub (https://github.com/Materials-Consortia).
Andersen, C., Armiento, R., Blokhin, E. et al. OPTIMADE: an API for exchanging materials data, Sci Data 8, 217 (2021) https://doi.org/10.1038/s41597-021-00974-z