About this Abstract

Meeting

MS&T25: Materials Science & Technology

Symposium

Enhancing the Accessibility of Machine Learning-Enabled Experiments

Presentation Title

Hypothesis Formation and Predictive Modeling of 2D Perovskite Spacer Cations Using Retrieval Augmented LLMs and Deep Kernel Learning

Author(s)

Jordan Marshall, Elham Foadian, Sheryl Sanchez, Utkarsh Pratiush, Rushik Desai, Mahshid Ahmadi, Sergei Kalinin, Arun Kanakkithodi

On-Site Speaker (Planned)

Jordan Marshall

Abstract Scope

In this work, we introduce a hypothesis-driven framework that couples large language models (LLMs) with machine learning to accelerate the discovery of novel spacer cations for quasi-2D perovskite materials. By combining literature mining based on Retrieval-Augmented Generation (RAG) with predictive modeling, we map underexplored regions of chemical space with greater speed and precision. Our pipeline converts a large, scattered body of scientific literature into structured, machine-learning-ready datasets. We will share how we identified Google's NotebookLM as the optimal extraction tool, designed a molecular descriptor set that blends cheminformatics and DFT features, and trained a Deep Kernel Learning model that combines graph embeddings with uncertainty-aware prediction. We will also show how active learning strategies prioritized new spacer candidates for experimental validation. The talk will focus on the challenges and breakthroughs in scaling LLM-driven hypothesis formation, and will discuss how bridging natural language understanding with predictive modeling is reshaping materials discovery.
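
The abstract does not say which acquisition rule was used to prioritize candidates. As a minimal sketch of how uncertainty-aware predictions can drive that kind of prioritization, the example below ranks candidates with an upper confidence bound (UCB), which is an assumption chosen for illustration and not necessarily the authors' method. The candidate labels, the numbers, and the names `Prediction`, `ucb_score`, and `rank_candidates` are all hypothetical placeholders, not data or code from this work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    """One candidate with a surrogate model's prediction (all values hypothetical)."""
    name: str    # placeholder candidate label
    mean: float  # predicted property value, arbitrary units
    std: float   # predictive standard deviation (model uncertainty)


def ucb_score(pred: Prediction, beta: float = 2.0) -> float:
    """Upper confidence bound: predicted mean plus beta times the uncertainty.

    beta = 0 is pure exploitation; larger beta favors uncertain candidates.
    """
    return pred.mean + beta * pred.std


def rank_candidates(preds: list[Prediction], beta: float = 2.0) -> list[Prediction]:
    """Return candidates sorted from highest to lowest UCB score."""
    return sorted(preds, key=lambda p: ucb_score(p, beta), reverse=True)


# Illustrative placeholder values only; these are not results from the talk.
candidates = [
    Prediction("spacer_A", mean=0.80, std=0.05),
    Prediction("spacer_B", mean=0.60, std=0.30),
    Prediction("spacer_C", mean=0.75, std=0.10),
]

if __name__ == "__main__":
    for p in rank_candidates(candidates, beta=2.0):
        print(f"{p.name}: UCB = {ucb_score(p, beta=2.0):.2f}")
```

With `beta=2.0` the most uncertain placeholder, `spacer_B`, is ranked first even though its predicted mean is lowest. With `beta=0` the ranking falls back to the predicted mean alone. That trade-off between exploring uncertain candidates and exploiting confident ones is what an uncertainty-aware model adds over a point predictor.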

OTHER PAPERS PLANNED FOR THIS SYMPOSIUM

Accelerating Scientific Discovery with Machine Learning: Data Analysis for Computational Beamlines

Adaptive Workflows for Lab of the Future

ATOMIC: Autonomous Characterization of 2D Materials Through Foundation Models

Autonomous Atomic Force Microscopy using Large Language Model Agents

DiffractGPT: Atomic Structure Determination from X-ray Diffraction Patterns Using a Generative Pretrained Transformer

Foundational Workflows for Processing Legacy Data and Realizing Domain-Specific Multi-Modal AI Models

From automated to autonomous – creating a general active learning service for self-driving laboratories

High-throughput, Ultra-fast Laser Sintering of Ceramics and Machine-learning Based Prediction on Processing-Microstructure-Property Relationships

Ptychography Data Pipelines at the Advanced Photon Source

Pycroscopy, AEcroscopy, and data workflows: integrating customized control, data analysis and workflows in an autonomous microscopy facility
