Session Sheet - Materials Informatics for Images and Multi-dimensional Datasets: Session II

Materials Informatics for Images and Multi-dimensional Datasets: Session II
Sponsored by: ACerS Basic Science Division, ACerS Electronics Division
Program Organizers: Amanda Krause, Carnegie Mellon University; Alp Sehirlioglu, Case Western Reserve University; Daniel Ruscitto, GE Aerospace Research

Wednesday 2:00 PM
October 12, 2022
Room: 310
Location: David L. Lawrence Convention Center

Session Chair: Amanda Krause, Carnegie Mellon University

2:00 PM
Materials Data Science for Reliability: Data Handling: Laura Bruckman¹; ¹Case Western Reserve University
    Materials data science for reliability requires large amounts of data of many types, often from disparate data streams. This calls for a different approach to data collection, handling, and curation in order to build data-driven, machine learning, and graph models. FAIRification of data and models (making them findable, accessible, interoperable, and reusable) is necessary to combine datasets efficiently over time. Ontologies are necessary to link datasets together in schemas and to provide insight into the relationships between data in knowledge graphs. Defects in photovoltaic (PV) cells are identified in several image types, including fluorescence, electroluminescence, infrared (IR), and white-light imaging, each of which reveals different defect features and degradation signatures. These images must be combined into hyper images for better feature extraction from large image sets, which are then coupled with I-V curve and power data from the same cells.
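
    The abstract does not define "hyper images". One plausible reading is that co-registered images of the same cell from different modalities are stacked so that each pixel carries one value per modality. The sketch below illustrates that reading only; the function name, the modality labels, and the nested-list image layout are assumptions for illustration, not the authors' pipeline.

```python
def stack_modalities(images):
    """Stack co-registered single-channel images into one multi-channel image.

    `images` maps a modality label to a 2D list of pixel values (rows of columns).
    All images are assumed to be already registered to the same pixel grid.
    Returns (channel_names, hyper), where hyper[r][c] is a tuple holding one
    value per modality, in the order given by channel_names.
    """
    if not images:
        raise ValueError("at least one image is required")
    names = list(images)  # dicts preserve insertion order
    shapes = {(len(img), len(img[0])) for img in images.values()}
    if len(shapes) != 1:
        raise ValueError("all modalities must share the same pixel grid")
    rows, cols = shapes.pop()
    hyper = [
        [tuple(images[name][r][c] for name in names) for c in range(cols)]
        for r in range(rows)
    ]
    return names, hyper


# Hypothetical 2x2 example with two modalities.
channel_names, hyper = stack_modalities({
    "EL": [[1, 2], [3, 4]],
    "IR": [[10, 20], [30, 40]],
})
```

    In practice the registration step (aligning the modalities to a common grid) is the harder problem and is not shown here.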

2:40 PM
Neighborhood Maps for Discovery of Novel Materials in Reduced Dimensions Using Machine Learning: Suchismita Goswami¹; V. Stanev²; H. Liang²; I. Takeuchi²; ¹MEST; ²UMD
    Machine learning techniques are being used to discover novel materials, compounds, and molecules. Mapping atomistic structures into feature vectors is an important step before applying any machine learning algorithm, whether for unsupervised tasks that reveal underlying patterns or for supervised tasks that make predictions. Here we use Python-based libraries to featurize crystallographic information files (CIFs) into numerical descriptors with the JarvisCFID and Sine Matrix methods. The Sine Matrix descriptor approximates the Coulomb interactions between atoms in a periodic system at reduced computational cost. We then project the high-dimensional featurized data into a two-dimensional space using t-distributed Stochastic Neighbor Embedding (t-SNE) and Uniform Manifold Approximation and Projection (UMAP). The projected data form maps of neighbors around a user-defined compound, which can be used for visualization and for predicting novel compounds. Here we present neighborhood maps for identifying novel materials similar to magnetic materials and Li-based compounds.
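
    To make the descriptor concrete, here is a minimal standard-library sketch of the sine matrix in its commonly published form: diagonal entries 0.5·Z^2.4 and off-diagonal entries Z_i·Z_j divided by a periodic, sine-based distance. It is not the authors' code. The abstract does not name the libraries used, and the function names and the toy two-atom cell below are assumptions for illustration.

```python
import math


def inverse_3x3(m):
    """Invert a 3x3 matrix given as a list of rows (adjugate formula)."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("lattice vectors are linearly dependent")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def row_times_matrix(v, m):
    """Multiply a length-3 row vector by a 3x3 matrix."""
    return [sum(v[k] * m[k][j] for k in range(3)) for j in range(3)]


def sine_matrix(lattice, positions, numbers):
    """Sine matrix for one periodic cell.

    lattice:   3x3 list whose rows are the lattice vectors (Cartesian).
    positions: Cartesian coordinates of the atoms, one [x, y, z] per atom.
    numbers:   atomic numbers Z, one per atom.
    Atoms are assumed not to coincide modulo a lattice vector.
    """
    inv = inverse_3x3(lattice)
    n = len(numbers)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        out[i][i] = 0.5 * numbers[i] ** 2.4
        for j in range(i + 1, n):
            delta = [positions[i][k] - positions[j][k] for k in range(3)]
            frac = row_times_matrix(delta, inv)          # fractional coordinates
            s = [math.sin(math.pi * x) ** 2 for x in frac]
            cart = row_times_matrix(s, lattice)          # back to Cartesian
            dist = math.sqrt(sum(x * x for x in cart))   # periodic pseudo-distance
            out[i][j] = out[j][i] = numbers[i] * numbers[j] / dist
    return out


# Toy cubic cell (edge length 4.0) with two atoms; values are illustrative only.
cell = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]
matrix = sine_matrix(cell, [[0.0, 0.0, 0.0], [2.0, 2.0, 2.0]], [55, 17])
```

    Because sin² has period π, shifting any atom by a whole lattice vector leaves the matrix unchanged, so the periodicity is captured without summing over periodic images. In a pipeline like the one described, the matrix would typically be flattened (or reduced to sorted eigenvalues) before being passed to t-SNE or UMAP.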

3:00 PM
Machine Learning Enabled Reproducible Data Analysis for Electron Microscopy: Xiaoting Zhong¹; Nestor Zaluzec²; Yu Lin³; Jiadong Gong³; ¹Lawrence Livermore National Laboratory; ²Argonne National Laboratory; ³QuesTek Innovations
    We show that region-based convolutional neural network (R-CNN) models can detect and segment complicated microstructure features in two distinct material systems using only a few (<20) training images. The first material system contains biological vesicles imaged in transmission electron microscopy (TEM) bright-field mode. The vesicle images are challenging because of their low contrast. We trained a Mask R-CNN model and achieved 86.64% validation mean average precision (mAP) on the large-vesicle segmentation task. The second material system contains nano-rods imaged using the high-angle annular dark-field (HAADF) technique. The nano-rod images are challenging because individual nano-rods overlap heavily. We trained a Faster R-CNN model and achieved 50.51% validation mAP on the nano-rod detection task. Accurate and reproducible microstructure statistics can be computed quickly from the ML-processed TEM images. This approach opens exciting opportunities for automated EM image analysis.
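
    For readers unfamiliar with the reported metric, the sketch below computes average precision (AP) for one class on one image from scored bounding boxes, using greedy matching at a single intersection-over-union (IoU) threshold and all-point interpolation. mAP is the mean of such values over classes (and, in some conventions, over several IoU thresholds). The abstract does not state which mAP variant was used, so the threshold, the matching rule, and the function names here are assumptions.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, truths, iou_threshold=0.5):
    """AP for one class on one image.

    detections: list of (score, box) pairs; truths: list of ground-truth boxes.
    Each ground-truth box can be matched by at most one detection.
    """
    if not truths:
        return 0.0
    matched = set()
    hits = []
    # Visit detections from most to least confident.
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, truth in enumerate(truths):
            if idx in matched:
                continue
            iou = box_iou(box, truth)
            if iou > best_iou:
                best_iou, best_idx = iou, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            hits.append(True)   # true positive
        else:
            hits.append(False)  # false positive
    # Precision and recall after each detection.
    precisions, recalls, tp = [], [], 0
    for rank, hit in enumerate(hits, start=1):
        tp += hit
        precisions.append(tp / rank)
        recalls.append(tp / len(truths))
    # Interpolate: precision at a recall level is the best precision to its right.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    # Area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```

    For segmentation tasks such as the vesicle example, the same procedure applies with mask IoU (pixel overlap) in place of box IoU.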

3:20 PM Break

3:40 PM
Computer Vision Applications in Materials Science and Engineering: Aroba Saleem¹; Idris Jeelani¹; ¹University of Florida
    Computer vision (CV) focuses on the extraction and analysis of meaningful information from digital images. While CV is widely used in fields such as robotics, aerospace, transportation, and medicine, its applications in Materials Science and Engineering (MSE) have been limited and sporadic, despite numerous potential benefits. CV techniques such as image classification, semantic segmentation, and object detection can be used to develop new approaches to a variety of MSE problems. These include microstructural characterization, phase identification, phase transformations, defect detection and classification, material twinning, crystal structure identification, plasticity, and fracture analysis. We present the results of an exploratory study investigating CV techniques that have been used in MSE or have potential for future use. The study maps different CV techniques to different application areas within MSE, demonstrates their use, identifies the challenges in implementing CV, and provides a roadmap for future research.

4:00 PM
Combining Limited Image and Tabular Data to Understand Failure Modes in Metals: Jonathan Owens¹; Andrew Detor¹; Jason Parolini²; Daniel Ruscitto¹; ¹GE Global Research; ²GE Gas Power
    The "small data" problem appears frequently in industrial applications of machine learning. This relative dearth of data makes the automated extraction of meaningful microstructural properties from images and tabular data difficult, as pre-trained models are typically too far out-of-distribution for specific use cases. In this talk, we explore various methods, including classical and deep-learning-based approaches, for combining tabular and image data to understand failure modes in metals via fracture surface analysis. We examine methods such as support vector machines, U-Net, and Mask R-CNN to classify the failure mode as a function of an input image and tabular data.