NOMAD has addressed the main obstacles to using computational data in materials science and engineering, and in doing so has advanced Open Science. First, NOMAD brought together existing data that had been scattered around the world, building on the existing NOMAD Repository. The Repository is a database where computational materials scientists from around the world can store and share their data for free. The Repository has now grown to over 50 million open access calculations, making it the largest database of its kind. In keeping with Open Science best practices, NOMAD has led the way in making sure the data is Findable, Accessible, Interoperable and Reusable (FAIR). To this end, the project's expert scientists developed 40 software programs to convert data from many different formats into a single format, making the data easier to combine and compare. This new, single-format data is stored in the NOMAD Archive, which is continuously updated as scientists perform and upload more calculations to the growing Repository. NOMAD partner institutes have also helped to establish a new, non-profit initiative, FAIR Data Infrastructure e.V., which will support extensive, sustainable data sharing in the future.
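The idea of converting many formats into a single one can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two input formats, the field names, the `NormalizedCalculation` record and the `normalize` function are hypothetical and do not reflect the actual NOMAD converters or the NOMAD Archive format. The sketch only shows the general pattern of one small parser per source format, each producing the same record with the same units, so that results from different programs become directly comparable.

```python
import json
from dataclasses import dataclass

HARTREE_TO_EV = 27.211386245988  # CODATA 2018 conversion factor


@dataclass(frozen=True)
class NormalizedCalculation:
    """Hypothetical common record; NOT the actual NOMAD Archive schema."""
    formula: str
    total_energy_ev: float
    source_format: str


def parse_keyvalue(text: str) -> NormalizedCalculation:
    """Parse a made-up 'key = value' text output that reports energy in eV."""
    fields = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            fields[key.strip()] = value.strip()
    return NormalizedCalculation(
        formula=fields["formula"],
        total_energy_ev=float(fields["energy_ev"]),
        source_format="keyvalue",
    )


def parse_json_hartree(text: str) -> NormalizedCalculation:
    """Parse a made-up JSON output that reports energy in Hartree."""
    raw = json.loads(text)
    return NormalizedCalculation(
        formula=raw["system"],
        total_energy_ev=raw["energy_hartree"] * HARTREE_TO_EV,  # unify units
        source_format="json_hartree",
    )


# One parser per source format; supporting a new format means adding an entry.
PARSERS = {
    "keyvalue": parse_keyvalue,
    "json_hartree": parse_json_hartree,
}


def normalize(source_format: str, text: str) -> NormalizedCalculation:
    """Dispatch to the parser registered for the given source format."""
    try:
        parser = PARSERS[source_format]
    except KeyError:
        raise ValueError(f"no parser registered for format {source_format!r}") from None
    return parser(text)


if __name__ == "__main__":
    records = [
        normalize("keyvalue", "formula = Si2\nenergy_ev = -10.5"),
        normalize("json_hartree", '{"system": "Si2", "energy_hartree": -0.5}'),
    ]
    # Both records now share field names and units, so they can be compared.
    for record in sorted(records, key=lambda r: r.total_energy_ev):
        print(record)
```

Keeping the parsers in a registry means that the code which combines and compares records never needs to know which program produced them.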
The amount of data in the Repository and Archive is massive. For it to be truly useful for R&D, scientists need efficient tools to search and analyze the data. NOMAD provides such tools through the NOMAD Encyclopedia, Advanced Graphics, and the Big Data Analytics Toolkit. The Encyclopedia is a user-friendly, public access point to NOMAD’s extensive data that lets users see, compare, explore, and comprehend computations for a large variety of materials. Advanced Graphics tools help experts and the general public alike to visualize complex, multi-dimensional data and experience the world of materials first-hand through virtual-reality simulations. The Analytics Toolkit presently offers more than 15 advanced tools for performing Big Data analytics to discover patterns and other useful information in NOMAD’s massive collection of data. Advanced users can also create their own automated tools based on NOMAD software. All of these tools and services have been made available using HPC infrastructure and services, letting academic and industrial users alike make the most of European HPC capabilities and resources.
In addition, NOMAD has carried out case studies to show how the tools and services can be used to solve challenges with societal impact. For example, NOMAD researchers generated and screened a massive database of materials to find the best candidates for use in polymer solar cells for sustainable energy generation. The most promising materials are now being made so that their performance can be tested in the laboratory.
Impressively, the NOMAD team has published over 60 articles in top journals in just three years, with many more to come. NOMAD researchers have given over 215 invited presentations on NOMAD’s contribution to the data revolution, including talks at major international conferences such as the Global Internet of Things Summit and the Platform for Advanced Scientific Computing Conference. More than 25 training events have also been held, making sure that NOMAD expertise is widely shared for maximal benefit.