Uncertainty Quantification in Data-Driven Materials and Process Design: Data-driven Process-Structure-Property Surrogate Modeling
Sponsored by: TMS: Integrated Computational Materials Engineering Committee
Program Organizers: Yan Wang, Georgia Institute of Technology; Raymundo Arroyave, Texas A&M University; Anh Tran, Sandia National Laboratories; Dehao Liu, Binghamton University

Monday 8:00 AM
October 10, 2022
Room: 310
Location: David L. Lawrence Convention Center

Session Chair: Anh Tran, Sandia National Laboratories; Dehao Liu, Binghamton University; Ramin Bostanabad, University of California, Irvine; Sam Reeve, Oak Ridge National Laboratory


8:00 AM  
Enabling the Fourth Paradigm of Multiscale ICME Models through Versatile Gaussian Process and Bayesian Optimization: Anh Tran1; 1Sandia National Laboratories
    Gaussian processes (GPs) and Bayesian optimization (BO) have been cornerstones of Bayesian machine learning, with uncertainty quantification built in. In the first half of the talk, we will discuss a generic and versatile BO approach to tackle a handful of general optimization problems, including known and unknown constraints, multi-objective, multi-fidelity, mixed-integer, parallelization on high-performance computers, Big Data, and high-dimensional problems. In the second half of the talk, we will discuss the applications of GP/BO to several ICME models, in the context of materials design under uncertainty and in the spirit of the Materials Genome Initiative (2011). In particular, using ICME applications as forward models in the process-structure-property relationship, we will discuss how GP/BO fits in as an enabler of the data-driven fourth paradigm for materials design using multiple ICME models, including density functional theory, molecular dynamics, kinetic Monte Carlo, and crystal plasticity finite element.
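
    As a rough illustration of the GP/BO loop this abstract refers to, the sketch below fits a one-dimensional GP with a squared-exponential kernel and picks the next sample by expected improvement (for minimization). It is not the speaker's implementation: the kernel length scale, the jitter value, the candidate grid, and the quadratic `objective` standing in for an ICME forward model are all assumptions made for the example.

```python
import math

def rbf(a, b, length=0.3):
    """Squared-exponential kernel with unit signal variance (assumed length scale)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_upper_t(L, y):
    """Back substitution: solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_posterior(xs, ys, x_new, jitter=1e-6):
    """Zero-mean GP posterior mean and standard deviation at x_new."""
    K = [[rbf(a, b) + (jitter if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, ys))
    k_star = [rbf(x_new, a) for a in xs]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = max(rbf(x_new, x_new) - sum(t * t for t in v), 0.0)
    return mean, math.sqrt(var)

def expected_improvement(mean, std, best):
    """Expected improvement over the current best value (minimization)."""
    if std <= 0.0:
        return max(best - mean, 0.0)
    z = (best - mean) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best - mean) * cdf + std * pdf

def objective(x):
    """Hypothetical cheap stand-in for an expensive ICME forward model."""
    return (x - 0.6) ** 2

def bayes_opt(n_iter=5):
    """Run a few BO iterations on [0, 1], starting from three samples."""
    xs = [0.0, 0.5, 1.0]
    ys = [objective(x) for x in xs]
    grid = [i / 100 for i in range(101)]
    for _ in range(n_iter):
        best = min(ys)
        candidates = [g for g in grid if g not in xs]  # skip sampled points
        scores = [expected_improvement(*gp_posterior(xs, ys, g), best)
                  for g in candidates]
        x_next = candidates[scores.index(max(scores))]
        xs.append(x_next)
        ys.append(objective(x_next))
    return xs, ys

xs, ys = bayes_opt()
print("best sampled x:", xs[ys.index(min(ys))])
```

    The extensions listed in the abstract (constraints, multiple objectives, multiple fidelities, mixed-integer inputs) change the surrogate or the acquisition function but keep this fit, score, and sample structure.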

8:20 AM  
Learning from Multi-source Scarce Data via Latent Map Gaussian Processes: Mehdi Shishehbor1; Tammer Eweis-labolle1; Ramin Bostanabad1; 1University of California, Irvine
    I will introduce latent map Gaussian processes (LMGPs) that inherit the attractive properties of GPs but are also applicable to mixed data that have both quantitative (e.g., pressure) and qualitative (e.g., coating type) inputs. I will elaborate on the core idea of LMGPs which consists of learning a low-dimensional manifold where all qualitative inputs are represented by some latent quantitative features. Through a wide range of analytical and real-world examples, I will demonstrate the advantages of LMGPs in terms of accuracy and versatility. I will show that LMGPs (1) can handle variable-length inputs, (2) have a nice neural network interpretation, and (3) dispense with manual featurization in Bayesian optimization. I will also demonstrate that LMGPs can fuse multiple sources of information together without imposing any hard constraints on how information sources, regardless of their fidelity level, should be fused or how the covariance of the errors is structured.

8:40 AM  
Bayesian Estimation and Active Learning of Data-driven Interatomic Potentials for Propagation of Uncertainty through Molecular Dynamics: Dallas Foster1; 1Massachusetts Institute of Technology
    Data-driven interatomic potentials represent a compelling class of techniques for simulating and modeling the atomic interactions of large-scale materials. Accurate linear techniques like the Spectral Neighbor Analysis Potential (SNAP) and the Atomic Cluster Expansion (ACE) derive simple relationships between atomic configurations and their potential energy surface but tend to suffer from extrapolation errors and instabilities during long-time molecular simulations. Bayesian methodologies and active learning strategies seek to mitigate these generalization errors. In Bayesian parameter estimation, generalizability is dependent on ensuring that model densities are representative of physical principles, not solely informed by learned statistical patterns that exist in the training data. We discuss how non-Gaussian assumptions in parameter estimation can make potentials more robust, and how these Bayesian methodologies intersect with components of active learning that depend on having calibrated notions of uncertainty: sampling diverse configurations for training and indicating how trustworthy learned potentials are during molecular dynamics simulations.

9:00 AM  
Quantifying Uncertainty in Atomistic Exploration: Thomas Swinburne1; 1CNRS
    Uncertainty in atomistic simulation arises from cohesive model form and incomplete sampling. The latter is challenging to quantify when targeting observables which depend on unknown and rare thermally activated mechanisms (e.g. diffusion), as any estimate will be vulnerable to the discovery of new mechanisms. I will discuss recent efforts to rigorously quantify sampling uncertainties, producing robust kMC models or reaction-diffusion equations. Using examples of defect diffusion in alloys and structural transformations of atomic clusters, I will show how the bounding and propagation of sampling uncertainty depends critically on both the quantity of interest and the sampling method. Some of these ideas are exploited by the massively parallel TAMMBER code, which autonomously manages sampling effort such that the target uncertainties reduce maximally fast. If time allows I will also discuss how this same framework can be used to propagate and target model form uncertainty. Papers and code: https://tomswinburne.github.io

9:20 AM  
Solving Stochastic Inverse Problems for Property–structure Linkages Using Data-consistent Inversion and Machine Learning: Tim Wildey1; Anh Tran1; 1Sandia National Labs
    Determining process–structure–property linkages is one of the key objectives in materials science, and uncertainty quantification plays a critical role in understanding both process–structure and structure–property linkages. In this presentation, we demonstrate how to learn a distribution of microstructure parameters that is consistent in the sense that the forward propagation of this distribution through a computational model matches a target distribution on materials properties. This stochastic inversion formulation infers a distribution of acceptable/consistent microstructures, as opposed to a deterministic solution, which expands the range of feasible designs. To solve this stochastic inverse problem, we employ a recently developed uncertainty quantification framework based on push-forward probability measures to define a unique and numerically stable solution. To reduce the computational burden in solving both stochastic forward and stochastic inverse problems, we combine this approach with machine learning surrogate models and demonstrate the proposed methodology on two representative case studies in structure–property linkages.
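
    One common way to realize a push-forward-based update of this kind is to reweight samples from an initial distribution by the ratio of the target density to the push-forward density at each sample's predicted property, then keep samples by rejection. The sketch below does this for a scalar toy problem. The forward map `forward_model`, the uniform initial distribution, the Gaussian target, and the kernel density estimate with a rule-of-thumb bandwidth are all assumptions for illustration, not the authors' models or code.

```python
import math
import random

def forward_model(lam):
    """Hypothetical structure-to-property map (stand-in for a surrogate)."""
    return lam ** 2

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def gaussian_kde(samples):
    """Gaussian kernel density estimate with a rule-of-thumb bandwidth."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
    h = 1.06 * std * n ** (-0.2)
    def density(x):
        return sum(normal_pdf(x, s, h) for s in samples) / n
    return density

def data_consistent_samples(n_initial=1000, target_mu=0.25, target_sigma=0.05, seed=0):
    """Rejection-sample the initial distribution so that the push-forward of
    the accepted samples approximates the target property distribution."""
    rng = random.Random(seed)
    lams = [rng.random() for _ in range(n_initial)]   # initial: Uniform(0, 1)
    qs = [forward_model(lam) for lam in lams]         # push-forward samples
    predicted = gaussian_kde(qs)                      # push-forward density
    ratios = [normal_pdf(q, target_mu, target_sigma) / predicted(q) for q in qs]
    r_max = max(ratios)   # sample-based bound, so the rejection step is approximate
    return [lam for lam, r in zip(lams, ratios) if rng.random() < r / r_max]

accepted = data_consistent_samples()
mean_lam = sum(accepted) / len(accepted)
print(len(accepted), "accepted; mean parameter:", round(mean_lam, 3))
```

    The result is a set of parameter samples rather than a single best-fit value, which mirrors the abstract's point that the inversion returns a distribution of consistent microstructures.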

9:40 AM  
Anisotropic Creep Modeling and Uncertainty Quantification of an Electron Beam Melted AM Ni-Based Superalloy: Patxi Fernandez-Zelaia1; Yousub Lee1; Sebastien Dryepondt1; Michael Kirka1; 1Oak Ridge National Laboratory
    Electron beam melting powder bed fusion is a promising technology for fabricating high-temperature nickel-based superalloys. The microstructure exhibits a strong build-direction fiber texture, which is believed to drive anisotropic creep behavior. In this work we present a crystal plasticity model, which incorporates non-Schmid effects, developed for additively manufactured IN738LC. A probabilistic calibration is performed to quantify model parameter uncertainty. A sequential design approach is utilized to infer the probability density of these constitutive parameters. The established model is utilized to study the behavior of equiaxed grain clusters sometimes observed within the nominally columnar structure. Uncertainty is propagated from the constitutive model parameters to the full-field response. These features are a source of localized strain accumulation, which agrees with experimentally obtained micrographs.

10:00 AM Break

10:20 AM  
Neural Network Surrogate Predictions with Uncertainties for Materials Science: Sam Reeve1; Paul Laiu1; Pei Zhang1; Ying Yang1; Dongwon Shin1; Jong Youl Choi1; Massimiliano Lupo Pasini1; Dan Lu1; 1Oak Ridge National Laboratory
    Uncertainty bounds are a crucial part of making decisions from computational predictions; however, reliable and tractable methods of calculating those bounds are difficult and method-specific and therefore not regularly applied. In this work we combine the prediction intervals from three neural networks (PI3NN) approach, which avoids the substantial computational cost of ensemble-based uncertainty quantification (UQ), with two distinct data-driven neural network surrogates in materials science. First, we use the HydraGNN multi-task graph convolutional neural network to produce surrogate predictions for atomic systems with per-sample uncertainties from PI3NN. For multiple solid-state and molecular open-source datasets we highlight how the prediction intervals are calculated and further demonstrate the detection of regions with higher uncertainty outside the original training space. With recently developed residual neural networks trained on CALPHAD data, we show similar results with the PI3NN method across both in-distribution and out-of-distribution data (in compositional space).

10:40 AM  
Data-driven Modeling and Control for Temperature-controlled Shear Assisted Processing and Extrusion (ShAPE) using Koopman Operators: Woongjo Choi1; James Koch1; Ethan King1; Colby Wight1; Zhao Chen1; Erin Barker1; Eric Smith1; Jenna Pope1; Keerti Kappagantula1; 1Pacific Northwest National Laboratory
    Advanced manufacturing processes often require repetitive trial-and-error attempts to develop a desirable product due to the nascent physics models available to predict relevant microstructural details and properties. Empirical studies are necessary when the processing exhibits nonlinearity and the complexity of the process is prohibitive for analytical modeling and control. However, generating additional data is resource-intensive and time-consuming. In this study, an active learning strategy for data-driven modeling and control using a Koopman operator is implemented for AA7075 tubes synthesized via Shear Assisted Processing and Extrusion (ShAPE™). The complex nonlinear relationships between ShAPE controlled variables (spindle speed and extrusion rate) and resulting extruded temperature are modeled by linear dynamics in a ‘lifted’ space using a Koopman operator approach, then linear control theory is leveraged to construct a model-based controller. The results for an active learning controller are compared with the results from a conventional PID controller.
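
    The idea of modeling nonlinear dynamics as linear dynamics in a lifted space can be sketched with a least-squares fit of a finite-dimensional Koopman approximation (often called extended dynamic mode decomposition). The toy map in `simulate`, the observable dictionary in `lift`, and all numerical values below are assumptions chosen so that the lifted system is exactly linear; they are unrelated to the ShAPE process data, and control inputs are left out for brevity.

```python
def lift(x1, x2):
    """Hypothetical observable dictionary: the state plus one quadratic term."""
    return [x1, x2, x1 * x1]

def simulate(x1, x2, steps, a=0.9, b=0.5, c=1.0):
    """Toy nonlinear map standing in for measured process data."""
    traj = [(x1, x2)]
    for _ in range(steps):
        x1, x2 = a * x1, b * x2 + c * x1 * x1
        traj.append((x1, x2))
    return traj

def solve(G, H):
    """Solve G W = H by Gauss-Jordan elimination with partial pivoting."""
    n = len(G)
    M = [list(G[i]) + list(H[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]

def fit_koopman(trajectories):
    """Least-squares matrix K such that lift(next state) ~ K @ lift(state)."""
    Z, Zn = [], []
    for traj in trajectories:
        for cur, nxt in zip(traj[:-1], traj[1:]):
            Z.append(lift(*cur))
            Zn.append(lift(*nxt))
    m = len(Z[0])
    # Normal equations: (sum z z^T) W = sum z z_next^T, then K = W^T.
    G = [[sum(z[i] * z[j] for z in Z) for j in range(m)] for i in range(m)]
    H = [[sum(z[i] * zn[j] for z, zn in zip(Z, Zn)) for j in range(m)]
         for i in range(m)]
    W = solve(G, H)
    return [[W[j][i] for j in range(m)] for i in range(m)]

starts = [(1.0, 0.0), (-0.5, 1.0), (0.3, -0.7), (2.0, 0.5)]
K = fit_koopman([simulate(x1, x2, 10) for x1, x2 in starts])
for row in K:
    print([round(v, 4) for v in row])
```

    Once a linear model of this form is available, standard linear control design can be applied to the lifted state, which is the step the abstract describes for constructing the model-based controller.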

11:00 AM  
Active Learning for Density Functional Theory Simulations with DeepHyper: Amit Samanta1; Prasanna Balaprakash2; Sylvie Aubry1; Brian Lin3; 1Lawrence Livermore National Laboratory; 2Argonne National Laboratory; 3ArcelorMittal Global R&D
    Density Functional Theory (DFT) is used extensively in the materials science community to predict fundamental properties. However, DFT simulations are computationally intensive and costly, especially in the context of predicting properties for the development of new alloys. In this work, DFT was first applied to a binary Fe-Mn system to study the effect of Mn on the stacking fault energy. The outputs from DFT were fed into a neural network (NN) architecture search using the DeepHyper package that trains an ensemble of machine learning models, optimized using a genetic algorithm, to predict the configurational energy. The uncertainty associated with the predictions was quantified by DeepHyper, which was accomplished by not just perturbing the parameters of a single NN model, but by using an ensemble of NN models. In doing so, the uncertainty quantification estimates come without systematic bias inherent in methods that rely on a single NN model.
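
    The ensemble idea in this abstract, using the spread of predictions across independently trained models as the uncertainty estimate, can be illustrated without any neural network machinery. In the sketch below, straight-line fits on bootstrap resamples stand in for the NN ensemble members. This is an assumption made to keep the example short; it does not use or imitate the DeepHyper API, and the synthetic data are not DFT results.

```python
import math
import random

def fit_line(points):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return slope, my - slope * mx

def train_ensemble(points, n_members=20, seed=0):
    """Fit each member on a bootstrap resample of the training data."""
    rng = random.Random(seed)
    return [fit_line([rng.choice(points) for _ in points])
            for _ in range(n_members)]

def predict(ensemble, x):
    """Ensemble mean and member-to-member standard deviation at x."""
    preds = [slope * x + intercept for slope, intercept in ensemble]
    mean = sum(preds) / len(preds)
    std = math.sqrt(sum((p - mean) ** 2 for p in preds) / (len(preds) - 1))
    return mean, std

# Synthetic training data on [0, 1]: y = 2x plus Gaussian noise (assumed).
rng = random.Random(1)
data = [(i / 29, 2.0 * (i / 29) + rng.gauss(0.0, 0.1)) for i in range(30)]
ensemble = train_ensemble(data)

for x in (0.5, 3.0):
    mean, std = predict(ensemble, x)
    print(f"x = {x}: mean = {mean:.3f}, std = {std:.3f}")
```

    The member-to-member spread is small inside the training range and grows when extrapolating, which is the behavior that makes ensemble disagreement useful as an uncertainty signal.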

11:20 AM  
Data-driven Structure-property Mapping in Small Data Regime: Towards Increasing Generalizability: Baskar Ganapathysubramanian1; Hao Liu2; Nirmal Baishnab1; Balaji Pokuri1; Olga Wodo2; 1Iowa State University; 2University at Buffalo
    Data-driven approaches have become integral to establishing reliable microstructure-property mappings. However, materials science datasets are typically small, or the property evaluations are computationally or experimentally demanding, which calls for approaches that integrate small datasets or use smart sampling strategies. Moreover, the generalizability of such models beyond one dataset remains to be understood. We use the problem of constructing structure-property (SP) models for organic photovoltaic (OPV) applications to understand data-driven SP models. This study explores the following questions, given a few datasets with distinct microstructures annotated with the short-circuit current: Can one derive a transferable model from one specific microstructure type and use it for another? Will the salient features in individual models be consistent across the independent datasets? And how sensitive are the models to the amount of data used to construct the generalizable model?