Session Sheet - ICME Case Studies: Successes and Challenges for Generation, Distribution, and Use of Public/Pre-Existing Materials Datasets: Leveraging Open Datasets

Sponsored by: TMS Materials Processing and Manufacturing Division, TMS: Integrated Computational Materials Engineering Committee
Program Organizers: Stephen DeWitt, Oak Ridge National Laboratory; Vikas Tomar, Purdue University; James Saal, Citrine Informatics; James Warren, National Institute of Standards and Technology

Monday 8:30 AM
February 28, 2022
Room: 254A
Location: Anaheim Convention Center

Session Chair: Stephen DeWitt, Oak Ridge National Laboratory

8:30 AM
Filling Data Gaps in 3D Microstructure with Deep Learning: Neal Brodnik¹; Devendra Jangid¹; Michael Goebel¹; Amil Khan¹; Saisidharth Majeti¹; McLean Echlin¹; B. S. Manjunath¹; Samantha Daly¹; Tresa Pollock¹; ¹University of California Santa Barbara
    Nonlinear machine learning tools such as network approaches are promising ways to rapidly infer complex material relationships. However, this ability depends on sufficient data for training, which presents challenges when experimental collection is difficult, such as for 3D microstructures. Here, we present the application of deep learning to 3D microstructure recognition and generation in ways that facilitate the learning of broad materials concepts and creation of more robust datasets. We demonstrate how publicly available 3D object datasets can teach distribution-based morphology recognition for application to microstructural features. We also show how image synthesis techniques like super-resolution can be adapted with physics-based constraints to function on crystallographic metadata such as indexed EBSD orientation maps. Physics-based training allows for faster, more accurate learning and presents opportunities where coarser datasets can be refined for future training approaches. Together, these approaches may also offer opportunities for deep learning to generate user-defined microstructures.

8:50 AM
Holistic Merging of Experimental and Computational Datasets – A Case Study for Diffusion Coefficients: Wei Zhong¹; Ji-Cheng Zhao¹; ¹University of Maryland
    Reconciling the differences between computational and experimental datasets when merging them is usually challenging. Fortunately, for some materials properties, large amounts of data are available to allow identification of the degree of agreement and disagreement, and thus give confidence in some or all aspects of the computational datasets. Diffusion coefficients are one such case, where holistic merging of the computational and experimental datasets is possible and leverages the best of both. Examples will be given to illustrate such holistic integration in order to establish reliable diffusion coefficient (atomic mobility) databases for simulating kinetic processes and properties of alloys.

9:10 AM  Invited
Materials Innovation and Design Enabled by the Materials Project: Kristin Persson¹; ¹University of California, Berkeley
    Fueled by our ability to compute materials properties and characteristics orders of magnitude faster than they can be measured, and by recent advancements in harnessing literature data, we are entering the era of the fourth paradigm of science: data-driven materials design. The Materials Project (www.materialsproject.org) contains data derived from quantum mechanical calculations for over 140,000 materials and millions of properties. The resource supports a growing community of data-rich materials research, with over 200,000 registered users and millions of data records served each day through the API. In this talk, I will highlight some of our ongoing work, as well as success stories of how external users have used Materials Project datasets in their work, including a discussion of barriers that users have experienced and what we have done to improve our service in terms of information, accessibility, algorithmic tools, and materials data.

9:40 AM
Data-driven Model Based Comparison of Public Datasets for Online State of Charge Estimation in Lithium-ion Batteries: Meghana Sudarshan¹; Alexey Serov¹; Casey Jones¹; Vikas Tomar¹; ¹Purdue University
    Lithium-ion (Li-ion) batteries are widely used in energy storage systems, electric vehicles, and portable electronics because of their high energy density and low self-discharge. Online estimation of the state of charge (SOC) by battery management systems is crucial for accurately determining battery capacity fade and remaining useful life. Given the number of datasets available, it is essential to determine an appropriate combination of datasets that covers most sources of variability for training models to predict SOC. In this work, data-driven machine learning algorithms, chosen for their learning ability and high accuracy, are used to predict SOC from measured battery operational parameters and material parameters. Machine learning models are trained on input-output pairs from various publicly available datasets of 1860 Li-ion batteries. These models are tested for accuracy and compared based on real-time data prediction on battery material parameters as well as operational parameters, including voltage, current, and temperature.