ICME Gap Analysis in Materials Informatics: Databases, Machine Learning, and Data-Driven Design: Session III
Sponsored by: TMS Materials Processing and Manufacturing Division, TMS: Computational Materials Science and Engineering Committee, TMS: Integrated Computational Materials Engineering Committee
Program Organizers: James Saal, Citrine Informatics; Carelyn Campbell, National Institute of Standards and Technology; Raymundo Arroyave, Texas A&M University

Thursday 8:30 AM
February 27, 2020
Room: 30D
Location: San Diego Convention Ctr

Session Chair: James Saal, Citrine Informatics


8:30 AM  Invited
Machine Learning to Predict Oxidation Behavior of High-temperature Alloys: Dongwon Shin1; Rishi Pillai1; Jian Peng1; Marie Romedenne1; Bruce Pint1; J. Haynes1; 1Oak Ridge National Laboratory
    Because no physics-based model exists and fundamental data on the diffusion kinetics of oxide scales are missing, it is not yet possible to predict the high-temperature oxidation of multi-component alloys from first principles. We demonstrate a modern data analytics workflow that leverages high-quality experimental data, augmented with highly relevant thermodynamic and kinetic descriptors, to predict alloy oxidation behavior as a function of composition and temperature. The presentation will discuss the challenges and opportunities in the proposed workflow in three aspects: 1) defining quantitative target properties to represent high-temperature alloy oxidation, 2) populating scientific alloy descriptors to capture underlying mechanisms, and 3) interrogating trained surrogate machine learning models to design advanced alloys. We use the example of cyclic oxidation of NiCr-based alloys, for which data have been consistently collected over the past few decades. The research was sponsored by the Department of Energy, Vehicle Technologies Office, Propulsion Materials Program.

9:10 AM  Invited
Discovering and Navigating Gaps and Connections in Data for Materials Design: Krishna Rajan1; 1University at Buffalo, State University of New York
    In the context of this symposium's theme of “gap analysis,” this presentation will explore how materials informatics methods can be used to map out gaps in the information landscape for materials design. The key to harnessing ICME principles to aid in materials design lies in recognizing the high dimensionality of data and applying techniques that map out where connections between data can be discovered in this high-dimensional space. We provide examples of how we use these methods in applications ranging from materials modeling to materials characterization.

9:50 AM  
Machine Learning for Materials Science: Open, Online Tools in NanoHUB: Juan Verduzco1; Saaketh Desai1; Alejandro Strachan1; Tanya Faltens1; 1Purdue University
    Data science and machine learning are transforming materials science and engineering (MSE). Research is benefiting from the increased availability of mineable data and of models capable of finding hidden correlations between structure and properties and making predictions. It has become critical to expose MSE students to these tools and train them in their use. We describe a set of open simulation tools, available for online simulation in nanoHUB (https://nanohub.org/tools/mseml), that introduce students to key aspects of data science and machine learning: data query, organization, and visualization; linear regression models to explore correlations between descriptors and properties; and the use of neural networks. The examples are implemented as Jupyter notebooks, and users can easily modify the code, change the model details, or train models for other properties. Users do not need to install any software, and the tool runs in standard web browsers. This tool was used in an undergraduate class at Purdue University.
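
    The linear regression step mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from the nanoHUB notebooks: the descriptor, the property, and the synthetic data below are hypothetical and stand in for whatever a student would query from a real dataset.

```python
import numpy as np

# Hypothetical data: one descriptor per material and a property that
# depends on it linearly, plus measurement noise (all values synthetic).
rng = np.random.default_rng(42)
descriptor = rng.uniform(1.0, 3.0, size=50)
true_slope, true_intercept = 2.5, -1.0
prop = true_slope * descriptor + true_intercept + rng.normal(0.0, 0.1, size=50)

# Ordinary least squares: solve [descriptor, 1] @ [slope, intercept] ~= prop.
X = np.column_stack([descriptor, np.ones_like(descriptor)])
(slope, intercept), *_ = np.linalg.lstsq(X, prop, rcond=None)

# Coefficient of determination as a simple measure of the correlation.
pred = X @ np.array([slope, intercept])
r2 = 1.0 - np.sum((prop - pred) ** 2) / np.sum((prop - prop.mean()) ** 2)

print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r2:.3f}")
```

    With real data, a low R^2 would suggest that the chosen descriptor alone does not explain the property and that additional descriptors or a nonlinear model may be needed.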

10:10 AM Break

10:30 AM  
Automated Data Curation for Electron Microscopy Using the Materials Data Facility: Charudatta Phatak1; Jonathon Gaff2; Ian Foster1; Ben Blaiszik2; 1Argonne National Laboratory; 2University of Chicago
    Modern electron microscopy is no longer driven only by instrumentation but is increasingly linked with computational and data-driven algorithms and methods for acquisition and analysis. Harnessing these data for scientific research, especially to enable machine learning approaches, necessitates the development of a supportive data infrastructure of large, well-curated datasets that can be used reliably. We will demonstrate an automated data curation workflow for electron microscopy that imposes minimal burden on users for additional information, yet collects data in a form amenable to automated analysis and machine learning. This workflow is developed using the Materials Data Facility (MDF). We will discuss the implementation of the workflow for a multi-user transmission electron microscope facility. Our approach allows end users to create a record entry for their datasets and search through their data easily, and it streamlines sharing of the final publication-ready data. Future work will involve consolidation of records from multiple instruments.

10:50 AM  
Uncertainty Quantification and Propagation in ICME Enabled by ESPEI: Brandon Bocklund1; Richard Otis2; Zi-Kui Liu1; 1Pennsylvania State University; 2Jet Propulsion Laboratory, California Institute of Technology
    The CALPHAD method is a foundational component of the ICME approach. CALPHAD models can be used directly for prediction of equilibrium properties or used in kinetic simulations of diffusion and phase transformations. CALPHAD models rely heavily on the extrapolation of unary, binary, and ternary model parameters in the description of the Gibbs energy for multicomponent systems. However, until recently there has been no viable route for quantifying, storing, and propagating the uncertainty in CALPHAD models to other simulations. ESPEI enables uncertainty quantification for CALPHAD model parameters through its thermodynamic engine, pycalphad. This presentation will demonstrate the use of ESPEI for development of a ternary CALPHAD database with uncertainty quantification and propagation.
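
    The idea of propagating parameter uncertainty to a predicted quantity can be sketched generically as follows. This is a minimal illustration and does not use the ESPEI or pycalphad APIs: the binary regular-solution model, the parameter values, and the normally distributed "posterior" samples are all assumptions chosen for the example.

```python
import numpy as np

R = 8.314  # gas constant, J/(mol K)

def gibbs_mixing(x, T, a, b):
    """Regular-solution Gibbs energy of mixing with L = a + b*T (J/mol)."""
    ideal = R * T * (x * np.log(x) + (1.0 - x) * np.log(1.0 - x))
    excess = (a + b * T) * x * (1.0 - x)
    return ideal + excess

# Hypothetical parameter samples standing in for a sampled posterior.
rng = np.random.default_rng(0)
n_samples = 5000
a_samples = rng.normal(-20000.0, 1000.0, size=n_samples)  # assumed values
b_samples = rng.normal(5.0, 0.5, size=n_samples)          # assumed values

# Propagation: evaluate the model once per parameter sample.
x, T = 0.5, 1000.0
g_samples = gibbs_mixing(x, T, a_samples, b_samples)

# Summarize the propagated uncertainty with a mean and a 95% interval.
g_mean = g_samples.mean()
g_lo, g_hi = np.percentile(g_samples, [2.5, 97.5])
print(f"G_mix = {g_mean:.0f} J/mol, 95% interval [{g_lo:.0f}, {g_hi:.0f}]")
```

    The same pattern applies to any downstream calculation: each parameter sample yields one prediction, and the spread of the predictions reflects the parameter uncertainty.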

11:10 AM  
Computational Classification, Generation and Time-evolution Prediction of Alloy Microstructures with Deep Learning: Fei Zhou1; 1Lawrence Livermore National Laboratory
    The solidification conditions and resulting microstructures have direct and decisive effects on the mechanical properties of alloys, and are therefore of utmost importance for their performance. However, the length and time scales involved in microstructures are extremely demanding for computational modeling efforts. We demonstrate the usefulness of deep learning approaches in processing, analyzing, and quantifying electron micrograph images of alloys, including classification of microstructure types with convolutional neural networks (CNN), generation of new microstructure images with generative adversarial networks (GAN), and prediction of microstructural time evolution with recurrent neural networks (RNN).

11:30 AM  
Predicting Electronic Density of States of Nanoparticles by Principal Component Analysis and Crystal Graph Convolutional Neural Network: Kihoon Bang1; Byung Chul Yeo2; Doosun Hong1; Donghun Kim2; Sang Soo Han2; Hyuck Mo Lee1; 1Korea Advanced Institute of Science and Technology; 2Korea Institute of Science and Technology
    The computational cost of calculating the density of states (DOS) by density functional theory (DFT) is quite high, particularly for nanoparticles (NPs) with a large number of atoms. Herein, we propose a machine-learning method to rapidly predict DOS patterns of metallic NPs by combining principal component analysis (PCA) and a crystal graph convolutional neural network (CGCNN). Within the PCA-CGCNN framework, we can predict the DOSs of Pt, Pd, and Au NPs with reasonable accuracy compared to DFT calculations, and the effects of various sizes and shapes of metallic NPs have also been explored. In addition, the PCA-CGCNN method can be applied to various bimetallic alloy structures, such as core-shell, homogeneously mixed, and phase-separated ones. We also developed a separate-learning technique. Moreover, our PCA-CGCNN method is ~8,000 times faster than the DFT method. This clearly reveals that our PCA-CGCNN method provides a new paradigm in the field of electronic structure calculation.
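
    The PCA step of such an approach can be sketched as follows: each DOS curve is compressed to a few principal-component coefficients, which a separate model (the CGCNN in this abstract, not implemented here) would then predict from structure. This is an illustrative sketch only, not the authors' code; the synthetic Gaussian "DOS" curves, the energy grid, and the choice of 10 components are assumptions.

```python
import numpy as np

def make_synthetic_dos(n_samples, energies, rng):
    """Hypothetical DOS-like curves: one Gaussian peak per sample."""
    centers = rng.uniform(-4.0, 0.0, size=(n_samples, 1))
    widths = rng.uniform(0.8, 1.5, size=(n_samples, 1))
    return np.exp(-0.5 * ((energies[None, :] - centers) / widths) ** 2)

def fit_pca(curves, n_components):
    """Return the mean curve and the leading principal directions."""
    mean = curves.mean(axis=0)
    _, _, vt = np.linalg.svd(curves - mean, full_matrices=False)
    return mean, vt[:n_components]

def encode(curves, mean, components):
    """Project curves onto the principal directions (the learning targets)."""
    return (curves - mean) @ components.T

def decode(coeffs, mean, components):
    """Rebuild full curves from principal-component coefficients."""
    return coeffs @ components + mean

rng = np.random.default_rng(0)
energies = np.linspace(-10.0, 5.0, 300)
dos = make_synthetic_dos(200, energies, rng)

mean, components = fit_pca(dos, n_components=10)
coeffs = encode(dos, mean, components)   # shape (200, 10) instead of (200, 300)
recon = decode(coeffs, mean, components)

rel_err = np.linalg.norm(recon - dos) / np.linalg.norm(dos)
print(f"relative reconstruction error with 10 components: {rel_err:.2e}")
```

    In this setup a structure-to-property model only needs to output the 10 coefficients per nanoparticle; the full DOS curve is then recovered with `decode`.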