ICME Case Studies: Successes and Challenges for Generation, Distribution, and Use of Public/Pre-Existing Materials Datasets: On-Demand Oral Presentations
Sponsored by: TMS Materials Processing and Manufacturing Division, TMS: Integrated Computational Materials Engineering Committee
Program Organizers: Stephen DeWitt, Oak Ridge National Laboratory; Vikas Tomar, Purdue University; James Saal, Citrine Informatics; James Warren, National Institute of Standards and Technology

Monday 8:00 AM
March 14, 2022
Room: Materials Design
Location: On-Demand Room

The Status of ML Algorithms for Structure-property Relationships Using Matbench as a Test Protocol: Anubhav Jain1; 1Lawrence Berkeley National Laboratory
    During the past few years, there has been an explosion of new ideas regarding features / descriptors, machine learning algorithms, and neural network architectures for predicting composition-property or structure-property relationships. Recently, we introduced a standard benchmark (Matbench) for measuring the performance of these algorithms through testing on a common data set, with initial conclusions showing "conventional" feature-based machine learning working well for smaller data sets and graph-based neural network methods working better for larger data sets. In this talk, I will first re-introduce the Matbench test set, which is a set of 13 supervised machine learning problems derived from 10 experimental and ab initio datasets and which range in size from 312 to 132,752 samples. I will next summarize our findings in using this data to benchmark state-of-the-art machine learning methods for property prediction, including CGCNN, MEGNet, Automatminer, Roost, CRABNet, MODNet, and others.

A Quest for Re-using 3D Materials Data: Emine Gulsoy1; Peter Voorhees1; 1Northwestern University
     Growing amounts of publicly available materials data ideally define a new age for materials research. Over the last decade, as a community, we have been actively developing, populating, and curating materials data and databases while simultaneously discussing best practices for doing so. This talk will summarize the contributions of the Center for Hierarchical Materials Design to this community-wide effort by discussing the data, databases and resources produced by the Center. In particular, the authors will present and discuss access and use of publicly contributed and available 3D/4D data using methods such as synchrotron tomography, TEM tomography and serial sectioning. This talk also aims to include a personal journey of finding and reusing data towards piloting computational methods on novel research ideas and seeding grant applications; challenges faced and lessons learned will be discussed.

Mg Database Project: Mapping Trends and Data Sets of Magnesium and Its Alloys for Improved Mechanical Performance: Suhas Eswarappa Prameela1; Suraj Ravindran2; Burigede Liu2; Padmeya Indurkar3; Babak Ravaji4; Caitlyn Schuette1; Abigail Park1; Fanuel Mammo1; Stephanie Hernandez1; Timothy Weihs1; 1Johns Hopkins University; 2Caltech; 3University of Cambridge; 4University of Houston
    Magnesium (Mg) and its alloys continue to draw interest from many researchers and funding agencies across the world. The lightweight metal is poised to bring huge benefits for a wide variety of structural applications. Lessons drawn from the last decade indicate that we need a highly synergetic experimental and computational approach to design these materials for improved mechanical performance. Artificial Intelligence (AI) is now seen as a powerful tool to help engineers design better Mg alloys. However, two critical obstacles remain in implementing successful machine learning (ML) models for Mg alloys. One is the lack of data to train the models. The second is the lack of organization of existing data in the literature. To help mitigate these problems, the Mg database project aims to collect datasets across different parameters critical to the design of Mg alloys, focusing on improving their mechanical performance.

Graph Convolutional Neural Networks for Fast, Accurate Prediction of Material Properties for Solid Solution High Entropy Alloys Using Open-source Datasets: Massimiliano (Max) Lupo Pasini1; Samuel Reeve1; Pei Zhang1; Marko Burcul2; 1Oak Ridge National Laboratory; 2Motion-S
     We present numerical results from training two deep learning (DL) models, multi-layer perceptrons (MLP) and graph convolutional neural networks (GCNN), on three open-source datasets: ab initio DFT data for the high entropy alloys (HEA) copper-gold (CuAu), iron-platinum (FePt), and silicon-steel (FeSi). This data has been generated for varying sizes of the lattice structure by running the LSMS-3 code, which implements a locally self-consistent multiple scattering method, on the OLCF supercomputer Summit. We found that: (i) multi-headed MLPs simultaneously provide accurate and robust estimates of multiple physical properties for CuAu and FePt, where the volume of data is sufficiently large; (ii) GCNNs attain higher accuracy than MLPs on the CuAu and FePt datasets because they take advantage of the topology of the lattice; and (iii) as expected, the volume of data needed by the two DL models to provide reasonably accurate results increases with the size of the lattice.
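
     As an illustration of how a graph convolution can use lattice topology, the following is a minimal sketch of a single mean-aggregation convolution step on a small periodic one-dimensional lattice. It is not the authors' GCNN and does not use the LSMS-3 data; the function names (ring_neighbors, graph_conv), the weights, and the 0.0/1.0 species encoding are all hypothetical choices made for this example. The two configurations below have the same composition but a different site ordering, and the convolution assigns them different per-site features.

```python
# Illustrative sketch only: one mean-aggregation graph convolution step on a
# periodic 1D lattice. Names, weights, and encodings are hypothetical and are
# not taken from the work described in the abstract.

def ring_neighbors(n):
    """Neighbor list for a periodic 1D chain of n sites (two neighbors each)."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def graph_conv(features, neighbors, w_self, w_neigh):
    """Return w_self * own feature + w_neigh * mean of neighbor features, per site."""
    out = []
    for i, x in enumerate(features):
        nbrs = neighbors[i]
        mean_nbr = sum(features[j] for j in nbrs) / len(nbrs)
        out.append(w_self * x + w_neigh * mean_nbr)
    return out


# Two configurations with identical composition (two sites of each species,
# encoded as 0.0 and 1.0) but different ordering on the lattice.
alternating = [0.0, 1.0, 0.0, 1.0]
clustered = [0.0, 0.0, 1.0, 1.0]
nbrs = ring_neighbors(4)

h_alt = graph_conv(alternating, nbrs, w_self=1.0, w_neigh=0.5)
h_clu = graph_conv(clustered, nbrs, w_self=1.0, w_neigh=0.5)

print(h_alt)  # [0.5, 1.0, 0.5, 1.0]
print(h_clu)  # [0.25, 0.25, 1.25, 1.25]
```

     In a practical model, the scalar weights would be learned matrices, a nonlinearity would follow each convolution, and the per-site features would be pooled into predictions of the target properties.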