Joint Sessions of AIM, ICME, & 3DMS: FAIR Data I
Program Organizers: TMS Administration

Monday 9:10 AM
June 16, 2025
Room: Platinum Ballroom 7&8
Location: Anaheim Marriott

Session Chair: Deepali Patil, Worcester Polytechnic Institute


9:10 AM  
Transforming Materials Science With Concepts for a Semantically Accessible Data Space: Bernd Bayerlein1; Markus Schilling1; Henk Birkholz2; Philipp von Hartrott3; Jörg Waitelonis4; Alden A. Dima5; June W. Lau5; 1Bundesanstalt für Materialforschung und -prüfung; 2Leibniz-Institut für Werkstofforientierte Technologien; 3Fraunhofer-Institut für Werkstoffmechanik; 4Leibniz-Institut für Informationsinfrastruktur; 5National Institute of Standards and Technology
    The digital transformation in materials science enables more efficient and sustainable processes. Through technological adaptations and a commitment to the FAIR principles, materials and processes are holistically addressed across entire value chains. The Platform MaterialDigital (PMD) and related initiatives are developing innovative solutions to the challenges of digitalization. The focus is on the interoperable integration of heterogeneous materials and processes data in semantically accessible data spaces. An ontological framework, based on the PMD Core Ontology and application-specific ontologies, promotes semantic interoperability of cross-domain and multi-scale (meta)data. This framework can be extended through natural language processing in a script-supported manner, as demonstrated with the Microscopy Ontology. The presentation further highlights how freely available mechanical and microstructural datasets of various aging stages of an aluminum alloy can be semantically integrated and flexibly searched. Graph-based operations enable links between processing and microstructural properties to be established, facilitating enhanced correlation analysis and pattern recognition.

9:30 AM  
Towards Structured Data Spaces: Prototypical Application of Semantic Technologies as a Driver for Innovation in Materials Science: Markus Schilling1; Bernd Bayerlein1; Philipp von Hartrott2; Jörg Waitelonis3; Henk Birkholz4; Birgit Skrotzki5; 1Federal Institute for Materials Research and Testing; 2Fraunhofer-Institut für Werkstoffmechanik (IWM); 3FIZ – Leibniz Institute for Information Infrastructure; 4Leibniz-Institut für Werkstofforientierte Technologien IWT; 5Bundesanstalt für Materialforschung und -prüfung (BAM)
    In advancing development and digitalization within materials science, quality assurance, interoperability, and adherence to FAIR principles are essential. To address these aspects, semantic technologies are employed for storage, processing, and contextualization of data, offering machine-actionable and human-readable knowledge representations crucial for data management. This presentation showcases the prototypical application of generic approaches to knowledge representation in materials science. It includes the design and documentation of graph patterns that may be compiled into rule-based semantic shapes. The development and application of the PMD Core Ontology 3.0 (PMDco 3.0), tailored for materials science, is highlighted. Its integration into daily lab life is demonstrated through its functional incorporation into electronic lab notebooks (ELN). Examples of material processing and standardized mechanical testing illustrate how knowledge graph operations enhance ELN capabilities, providing a generalizable, unified approach for managing diverse experimental data from different sources, with potential for automation.

9:50 AM  Cancelled
Ontology-Based Materials Data Management for High Temperature Alloy Oxidation Data: Madison Wenzlick1; William Trehern1; Leebyn Chong1; Casey Carney1; Michael Gao1; Richard Oleksak1; Wissam Saidi1; 1National Energy Technology Laboratory
    High-quality, digitally structured data is essential for creating reliable data-driven models. However, managing materials datasets presents several challenges, including variable data types, formats, and relationships. Ontology-based data management is a re-emerging tool for structuring and defining materials data. In this work, an ontology was generated and applied to high temperature alloy oxidation data collected from both open-source literature and in-house testing. Metadata and data attributes from each study were standardized and translated into the ontology structure. The oxidation ontology further integrates with an existing mid-level materials data ontology to support the curation of additional materials information. This framework aids in encoding the complex relationships between the data in both a machine-readable and human-readable manner, enabling the ongoing interpretability of the dataset. The ontology can further be leveraged to support the flow of data from laboratory to database and provides a resource for advanced data-driven modeling and analysis.

10:10 AM  
FactoryNet: A Labeled Image Dataset for the Manufacturing Environment: Erick Braham1; Andrew Bowman1; William Bernstein2; 1UES / Air Force Research Laboratory; 2Air Force Research Laboratory
    Human-labeled image datasets are essential to training and developing AI models. Most high-volume image datasets have image classes of low specificity. A public, open, high-volume image dataset focused on the manufacturing environment would benefit the development of visual AI for manufacturing. The FactoryNet dataset is a growing, high-quality labeled image dataset of in-context images for the manufacturing community. Initial efforts to build the dataset have utilized web scraping, factory scans, and industry collaborations. We present the first iteration of the human-labeled dataset, looking at the data sanitization and design choices taken to this point. We look toward the future of this resource and how it is and will continue to be an open resource for the manufacturing community.

10:30 AM Break

10:50 AM  
Customizing the NIMS RDE System for Optimal Data Management: Takuya Kadohira1; Jun Fujima1; Hideki Yoshikawa1; Satoshi Minamoto1; 1National Institute for Materials Science
    The preparation of FAIR data is crucial in AI research. While basic research on materials has successfully applied FAIR principles to simulation data, applying these principles to experimental data is challenging due to quality differences; the simulation conditions are fully known, unlike the experimental ones. The NIMS RDE system, a part of the Materials Data Platform, addresses FAIRness at data submission through input and ETL functions that are customizable for different experiments. This flexibility theoretically supports the management of experimental results as FAIR data, enhancing reproducibility and reducing redundant experiments. However, the extensive customization flexibility can be challenging, as it requires identifying how much customization is truly necessary and finding ways to minimize the effort involved in such customization. Balancing this challenge involves developing a data model that identifies essential FAIRness aspects for data utilization and provides a framework for function customization.

11:10 AM  
NIMS's Data-Driven Materials Research Platform: Enhancing MLOps With Literature-Based Data Integration: Toshihisa Anazawa1; Hayato Sonokawa1; Hideki Yoshikawa1; Takuya Kadohira1; Satoshi Minamoto1; 1National Institute for Materials Science
    The NIMS Materials Data Platform (MDPF) supports data-driven materials research through "DICE", a system that comprehensively handles the creation, accumulation, and utilization of materials data. MDPF is currently developing a materials development platform using MLOps by connecting systems named "RDE" (data accumulation), "pinax" (modeling), and "MInt" (execution). Simultaneously, we are training coordinators to promote data-driven materials development. While pinax can use developers' own data as well as RDE and other public databases, materials are diverse in their applications. Even when properties are reported in academic papers, they are often not databased in a reusable form. To address this, we propose a tool for MLOps that searches related papers, collects desired property data and process conditions from relevant papers, and aggregates them as machine learning data. This system will not only incorporate the aggregated data into pinax for MLOps but also utilize metadata extracted from papers as a reference for materials developers and the coordinators.

11:30 AM  
Modular and Interoperable Materials Data Science Ontology (MDS-Onto) for Knowledge Graphs and Semantic Reasoning: Erika Barcelos1; Balashanmuga Priyan Rajamohan1; Quynh Tran1; Van Tran1; Kai Zheng1; Nathaniel Hahn1; Hayden Caldwell1; Ozan Dernek1; Pawan Tripathi1; Yinghui Wu1; Laura Bruckman1; Roger French1; 1Case Western Reserve University
    Research data and metadata often have non-standard terms and formats. The terms and concepts are frequently subjective and adopted based on local experience and choices of research teams, and this poses major challenges for data reusability and research reproducibility. Ontologies represent a critical component toward FAIRification of data, achieving semantic interoperability and reducing barriers for data sharing, usability, analysis and modeling. In addition, they serve as the backbone of knowledge graphs, where data and results are standardized by ontologies, enabling semantic reasoning and inference. In this work we introduce the Materials Data Science Ontology (MDS-Onto), a low-level, interoperable ontology built on a modular framework that simplifies ontology alignment by mapping MDS-Onto concepts to mid-level ontologies such as the Platform MaterialDigital Core Ontology (PMDco) and the Quantities, Units, Dimensions and Types (QUDT) Ontology terms. It encompasses over 20 domains in materials data science and provides a common foundation for semantic triples of the Resource Description Framework (RDF) data model. These RDF triples, or RDF statements, when stored in a graph database such as Ontotext’s GraphDB, form knowledge graphs, over which people and machines can perform semantic reasoning using SPARQL queries. These knowledge graphs enable machines to reason over billions of RDF statements, giving materials data science the ability to reason over historical data and results at scale.

11:50 AM  Cancelled
NFDI MatWerk Ontology: A Framework for FAIR Data Management in the Materials Science and Engineering: Hossein Beygi Nasrabadi1; E. Norouzi1; K. Hubaiev1; H. Fliegl1; V. Hofmann2; A. Azócar Guzmán2; S. Fathalla2; A. Z. Ihsan2; S. Sandfeld2; A. Gedsun3; J. Waitelonis1; H. Sack1; 1FIZ Karlsruhe – Leibniz-Institute for Information Infrastructure; 2Institute for Advanced Simulations – Materials Data Science and Informatics (IAS 9); 3Albert-Ludwigs Universität Freiburg
    The National Research Data Infrastructure (NFDI) is a German initiative established in 2020 to create a standardized and sustainable research data infrastructure across various disciplines. Within this framework, NFDI-MatWerk focuses on creating a digital infrastructure for Materials Science and Engineering (MSE) to facilitate improved data sharing and collaboration in this domain. This presentation introduces the NFDI MatWerk Ontology (MWO) version 3.0.0, a foundational framework designed to structure research data and enhance interoperability within the MSE community. MWO V3.0.0 has been mapped to the Basic Formal Ontology (BFO) to ensure comprehensive integration and compliance with top-level ontology standards. Additionally, it utilizes the NFDIcore mid-level ontology's modular approach, enriching metadata through standardized classes and properties. By serving as the backbone of a knowledge graph for MSE, MWO significantly enhances data discoverability and reusability, thereby accelerating materials development and technological innovation through optimized scientific exchange and data management.