Periodic Reporting for period 2 - NoMaD (The Novel Materials Discovery Laboratory)
Reporting period: 2017-05-01 to 2018-10-31
Scientists have already generated a large amount of valuable computational data about materials, spending billions of CPU hours at HPC centers all over the world. Using these data is currently difficult because they are stored in separate databases. Moreover, the data have been generated with many different programs and stored in many different formats, so results cannot easily be compared directly.
To address these challenges, NOMAD creates, collects, stores, and cleanses a large volume of computational materials science data, derived from the most important materials science programs available today. In addition, NOMAD develops tools for mining these data in order to find structures, correlations, and novel information that could not be discovered by studying smaller data sets. Together, the large volume of data and the innovative tools are enabling researchers in basic science and engineering to advance materials science.
The amount of data in the Repository and Archive is massive. For it to be truly useful for R&D, scientists need efficient tools to search and analyze the data. NOMAD provides such tools through the NOMAD Encyclopedia, Advanced Graphics, and the Big Data Analytics Toolkit. The Encyclopedia is a user-friendly, public access point to NOMAD’s extensive data that lets users see, compare, explore, and comprehend computations for a large variety of materials. Advanced Graphics tools help experts and the general public alike to visualize complex, multi-dimensional data and experience the world of materials first-hand through virtual-reality simulations. The Analytics Toolkit presently offers more than 15 advanced tools for performing Big Data analytics to discover patterns and other useful information in NOMAD’s massive collection of data. Advanced users can also create their own automated tools based on NOMAD software. All of these tools and services have been made available using HPC infrastructure and services, letting academic and industrial users alike make the most of European HPC capabilities and resources.
In addition, NOMAD has carried out case studies to show how the tools and services can be used to solve challenges with societal impact. For example, NOMAD researchers generated and screened a massive database of materials to find the best candidates for use in polymer solar cells for sustainable energy generation. The most promising materials are now being made so that their performance can be tested in the laboratory.
The NOMAD team has published over 60 articles in top journals in just three years, with many more to come. NOMAD researchers have given over 215 invited presentations on NOMAD’s contribution to the data revolution, including at major international conferences such as the Global Internet of Things Summit and the Platform for Advanced Scientific Computing Conference. More than 25 training events have also been held, making sure that NOMAD expertise is widely shared for maximal benefit.
• The Archive is the only materials science database in the world that contains data from all important programs used worldwide in a single format.
• For the first time, scientists can see, compare, explore, and comprehend computational materials science data through the Encyclopedia.
• Big Data analytics, which were not previously possible because the data were stored in many formats in databases scattered around the world, have been developed and proven on test datasets.
• Advanced Graphics and virtual-reality simulations have been developed to help scientists better understand and use materials science data.
One of the most important achievements of NOMAD has been the change in scientific culture towards extensive data sharing. Before NOMAD, data were not widely shared. Now, through NOMAD, the materials science community has uploaded over 50 million calculations for open-access reuse, and more are uploaded every day. While there are now other databases, they are restricted to a single program and serve only the user community generating the data. NOMAD supports all important programs, with support for new ones on request, and is open to anyone in the world.
NOMAD is ensuring that Europe leads the way in novel materials discovery, in collaboration with international networks. NOMAD is also training the next generation of scientists and engineers who will advance computational materials science, to make sure European science and industry remain competitive in global markets.
The most exciting part of NOMAD has only just begun!
Scientists are now using NOMAD tools and services for novel materials discovery. By making these tools freely and openly available, so that others can build on our work, we are improving access to and use of computational materials science data to advance basic science and drive innovation in a broad range of industries from sustainable energy to transport to healthcare and more. For example, NOMAD is currently collaborating with industry to investigate materials for green chemical production that would decrease carbon dioxide emissions and increase renewable energy usage.