Session Sheet - AI for Big Data Problems in Advanced Imaging, Materials Modeling and Automated Synthesis: Accelerating Discovery of Materials

Sponsored by: TMS: Computational Materials Science and Engineering Committee
Program Organizers: Mathew Cherukara, Argonne National Laboratory; Badri Narayanan, University of Louisville; Subramanian Sankaranarayanan, University of Illinois (Chicago)

Monday 8:00 AM
October 18, 2021
Room: A124
Location: Greater Columbus Convention Center

Session Chair: Devang Bhagat, University of Louisville

8:00 AM  Invited
De Novo Inverse Design of Nanoporous Materials by Machine Learning: Mathieu Bauchy¹; ¹University of California, Los Angeles
    Although simulations excel at mapping an input material to its output property, their application to inverse design has traditionally been limited by their high computing cost and lack of differentiability, so simulations are often replaced by surrogate machine learning models in inverse design problems. Here, taking the example of the inverse design of a porous matrix featuring a targeted sorption isotherm, we introduce an inverse design framework that addresses these challenges. We reformulate a lattice density functional theory of sorption as a convolutional neural network with fixed, hard-coded weights that leverages automated end-to-end differentiation. Thanks to its differentiability, the simulation is used to directly train a deep generative model, which outputs an optimal porous matrix. Importantly, this pipeline leverages for the first time the power of TPUs, an emerging family of dedicated chips that, although specialized for deep learning, are flexible enough for intensive scientific simulations.
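
    As a rough illustration of how a lattice sorption model can be written as a convolution with fixed weights, the sketch below iterates a generic mean-field lattice-gas density update on a small 2D grid. It is not the authors' model: the kernel, the interaction parameters (`w_ff`, `w_sf`), the periodic boundaries, and the function names are all assumptions chosen for the example, and it omits the automatic differentiation step.

```python
import math

# Fixed nearest-neighbour kernel on a 2D square lattice. It plays the role of
# a hard-coded (never trained) convolution weight. Assumed for illustration.
KERNEL = [[0, 1, 0],
          [1, 0, 1],
          [0, 1, 0]]


def conv2d_periodic(field, kernel):
    """Convolve a 2D field with a 3x3 kernel using periodic boundaries."""
    n, m = len(field), len(field[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            total = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    total += kernel[di + 1][dj + 1] * field[(i + di) % n][(j + dj) % m]
            out[i][j] = total
    return out


def fluid_density(pore, mu, beta=1.0, w_ff=1.0, w_sf=1.5, iters=200):
    """Fixed-point iteration of a mean-field lattice-gas density update.

    pore[i][j] is 1 for a pore site and 0 for a solid site. Parameter values
    are hypothetical and unitless.
    """
    n, m = len(pore), len(pore[0])
    solid = [[1.0 - pore[i][j] for j in range(m)] for i in range(n)]
    solid_nb = conv2d_periodic(solid, KERNEL)  # number of solid neighbours
    rho = [[0.0] * m for _ in range(n)]        # start from an empty matrix
    for _ in range(iters):
        fluid_nb = conv2d_periodic(rho, KERNEL)  # neighbouring fluid density
        rho = [
            [
                pore[i][j] / (1.0 + math.exp(
                    -beta * (mu + w_ff * fluid_nb[i][j] + w_sf * solid_nb[i][j])
                ))
                for j in range(m)
            ]
            for i in range(n)
        ]
    return rho


def uptake(pore, mu):
    """Mean fluid density over pore sites: one point on a sorption isotherm."""
    rho = fluid_density(pore, mu)
    n_pore = sum(sum(row) for row in pore)
    return sum(sum(row) for row in rho) / n_pore
```

    Sweeping `mu` and calling `uptake` at each value traces an isotherm for a given pore matrix. Because every step is a convolution followed by an elementwise function, the same update could in principle be expressed in an autodiff framework and differentiated with respect to the pore matrix.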

8:30 AM  Invited
Tuning Optoelectronic Properties of Semiconductors with First Principles Modeling and Machine Learning: Arun Kumar Mannodi Kanakkithodi¹; Maria Chan²; ¹Purdue University; ²Argonne National Laboratory
    Semiconductors with desirable electronic band structure and optical absorption are sought for solar cells, electronic devices, infrared sensors and quantum computing. In this work, we develop AI-based frameworks for the on-demand prediction of the phase stability, band gap, optical absorption spectra, photovoltaic figures of merit, dielectric constant, defect formation energies, and impurity energy levels in two broad classes of semiconductors, namely (a) halide perovskites, and (b) group IV, III-V and II-VI semiconductors. These frameworks are powered by high-throughput density functional theory (DFT) computations, unique encoding of the atom-composition-structure information, and rigorous training of advanced neural network-based predictive and optimization models. Multi-fidelity learning is applied to bridge the gap between (high quantities of) low-accuracy calculations and (lower quantities of) accurate, expensive computations and experimental measurements. AI-based recommendations are synergistically coupled with targeted synthesis and characterization, leading to successful validation and discovery of novel compositions for improved performance in solar cells.

9:00 AM
Machine Learning Polymer Property Prediction Models with Polymers Represented as Natural Language: Christopher Kuenneth¹; Rampi Ramprasad¹; ¹Georgia Institute of Technology
    Polymer informatics tools have recently been gaining ground for designing and discovering polymers that meet specific application needs. A critical component of such tools is the conversion of polymers to machine-readable representations (so-called fingerprints). The fingerprinting process has so far been based on handcrafted approaches that capture key chemical and structural features. Recently, within the domain of natural language processing, transformer-based ML models have demonstrated a new, fully ML-based path to obtain fingerprints of language. Here, we view SMILES strings as a language representation of polymers, and use more than 100 million SMILES strings to train a transformer-based ML model. The performance of the so-derived fingerprints is compared with that of traditional fingerprints using a large polymer property data set. Our new approach achieves prediction performance similar to that of existing state-of-the-art methods, but is faster, more flexible, and allows us to create fully autonomous ML pipelines.

9:20 AM  Invited
Aluminum Alloy Design Using Physics Informed Machine Learning: Fatih Sen¹; Marat Latypov¹; Heath Murphy¹; Kyle Haines¹; Shruthi Raj¹; Aurele Mariaux²; Sazol Das¹; David Anderson¹; Debdutta Roy¹; Yudie Yuan¹; Vishwanath Hegadekatte¹; ¹Novelis R&D Center, Kennesaw GA; ²Novelis R&D Center, Sierre, Switzerland
    Automotive manufacturers are increasingly using aluminum alloys to reduce the weight of vehicles and, as a result, improve their fuel efficiency and performance. Advanced aluminum alloys with targeted properties need to be developed faster, especially with the introduction of electric vehicles. In the current work, we have developed a physics-informed machine learning framework to explore the 6xxx aluminum alloy chemistry space for targeted product performance criteria. Computational thermodynamics methods were used to estimate microstructure features on historical alloy design data. These features were integrated with machine learning (ML) methods to estimate strength, ductility, formability and corrosion performance of 6xxx aluminum alloys. We used multi-objective genetic algorithm optimization to identify alloy chemistries that meet or exceed the performance targets for specific automotive applications. We conducted a lab trial for selected alloys, and their properties were evaluated and compared with the ML model predictions. The framework developed enables accelerated design of aluminum alloys.
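
    Multi-objective genetic algorithms typically rank candidates by Pareto dominance rather than by a single score. The sketch below shows only that ranking step, as a minimal illustration under stated assumptions: it is not the authors' optimizer, both objectives are assumed to be maximized, and the function names and any candidate values passed in are hypothetical.

```python
def dominates(a, b):
    """True if candidate a is at least as good as b in every objective and
    strictly better in at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates):
    """Return the candidates that no other candidate dominates."""
    return [
        p for p in candidates
        if not any(dominates(q, p) for q in candidates)
    ]
```

    For example, with hypothetical unitless (strength, ductility) pairs, `pareto_front([(1, 5), (2, 4), (3, 3), (2, 2), (1, 1)])` keeps the first three pairs, since each of them trades one objective against the other, and drops the last two, which are dominated by `(3, 3)`.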

9:50 AM
Refinements to the Production of Machine Learning Interatomic Potentials: Jared Stimac¹; Jeremy Mason¹; ¹University of California, Davis
    Machine learning potentials (MLPs) could allow dramatically accelerated simulations of atomic systems with the accuracy of quantum mechanical techniques through the use of supervised regression algorithms. The price to be paid is that MLPs have a higher computational expense than empirical potentials, both during construction and for every evaluation of the potential energy. To reduce these costs and alleviate the need for enormous datasets, our framework for producing MLPs combines an efficient implementation of a sparse Gaussian process algorithm with a novel set of descriptors for atomic environments. The descriptors are intended to be an injective embedding that imposes minimal distortion, and the sparse Gaussian process is intended to select as few inducing points (which dominate the computational complexity in all respects) as necessary. In this way, we aim to produce better-performing potentials with less training and less data than competing frameworks.

10:10 AM Break

10:30 AM  Invited
Discovery of Novel Crystal Structures via Generative Adversarial Networks: Taylor Sparks¹; Michael Alverson¹; ¹University of Utah
    The idea of material discovery has excited and perplexed scientists for centuries. Several methods have been employed to find new types of materials, ranging from replacement of atoms in a crystal structure to advanced machine learning methods for predicting entirely new crystal structures. In this work, we investigate the performance of various Generative Adversarial Network (GAN) architectures to find innovative ways of generating theoretical crystal structures that are synthesizable and stable. Over 300,000 entries from Pearson’s Crystal Database are used for the training of each GAN. The space group number, atomic positions, and lattice parameters are parsed and used to construct an input tensor for each of the network architectures. Several different GAN layer configurations are designed and analyzed, including Wasserstein GANs with weight clipping and gradient penalty, in order to identify a model that can adequately discern symmetry patterns that are present in known material crystal structures.
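
    For readers unfamiliar with the two Wasserstein GAN variants named above, the sketch below writes out the critic loss, the weight-clipping step, and the gradient-penalty term in plain Python. It is a generic illustration of the standard formulations, not the authors' implementation: the function names are hypothetical, the defaults `c=0.01` and `lam=10.0` are the values commonly used in the WGAN literature, and the gradient norms are assumed to be supplied by an autodiff framework.

```python
def critic_loss(real_scores, fake_scores):
    """Wasserstein critic loss: mean score on generated samples minus mean
    score on real samples. The critic is trained to minimize this."""
    mean_fake = sum(fake_scores) / len(fake_scores)
    mean_real = sum(real_scores) / len(real_scores)
    return mean_fake - mean_real


def clip_weights(weights, c=0.01):
    """Weight clipping: force every critic weight into [-c, c] after each
    update, a crude way to keep the critic approximately Lipschitz."""
    return [max(-c, min(c, w)) for w in weights]


def gradient_penalty(grad_norms, lam=10.0):
    """Gradient penalty: penalize deviation of the critic's gradient norm
    from 1, evaluated at points interpolated between real and generated
    samples. grad_norms holds one L2 norm per interpolated sample."""
    return lam * sum((g - 1.0) ** 2 for g in grad_norms) / len(grad_norms)
```

    In the weight-clipping variant the critic minimizes `critic_loss` and then applies `clip_weights`; in the gradient-penalty variant the clipping step is dropped and `gradient_penalty` is added to the loss instead.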