About this Abstract |
Meeting |
2026 TMS Annual Meeting & Exhibition |
Symposium |
AI/ML/Data Informatics for Materials Discovery: Bridging Experiment, Theory, and Modeling |
Presentation Title |
An Agentic Framework for Extracting Structured Knowledge from Materials Science Literature |
Author(s) |
Hasan Muhammad Sayeed, Casey Clark, Taylor Sparks |
On-Site Speaker (Planned) |
Hasan Muhammad Sayeed |
Abstract Scope |
We present a multi-agent LLM system for transforming unstructured materials science literature into structured datasets. Leveraging GPT-based models with tool-calling capabilities, the system extracts key information such as compositions, processing conditions, characterization methods, and properties from scientific PDFs. The extraction process is enhanced by a sequence of reasoning agents that dynamically tailor prompts, assess extraction quality, and consolidate results. A simple web interface enables users to upload documents, select models, and export outputs in CSV format for downstream analysis or machine learning workflows. By combining large language models with lightweight agentic reasoning, this framework enables scalable, accurate data extraction from literature while reducing manual effort. This work aims to accelerate materials informatics pipelines by bridging the gap between textual knowledge and structured data formats. |
Proceedings Inclusion? |
Planned: |
Keywords |
Machine Learning, Extraction and Processing, Other |
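
The workflow described in the abstract (tailor a prompt, extract fields, assess quality, consolidate, export CSV) can be illustrated with a minimal sketch. This is not the authors' implementation. The field names, the `Model` callable standing in for an LLM call, the retry rule, and the 0.5 quality threshold are all assumptions made for illustration.

```python
import csv
import io
from typing import Callable, Dict, List

# Hypothetical field set, loosely following the categories named in the abstract.
FIELDS = ["composition", "processing", "characterization", "property"]

Record = Dict[str, str]
# Stand-in for an LLM call: takes a prompt, returns a dict of extracted fields.
Model = Callable[[str], Record]


def build_prompt(chunk: str, fields: List[str]) -> str:
    """Prompt-tailoring step: ask only for the requested fields."""
    return f"Extract {', '.join(fields)} from the text below.\n\n{chunk}"


def quality(record: Record) -> float:
    """Quality-assessment step: fraction of FIELDS that are non-empty."""
    filled = sum(1 for f in FIELDS if record.get(f, "").strip())
    return filled / len(FIELDS)


def extract(chunk: str, model: Model, threshold: float = 0.5) -> Record:
    """Extract once; if quality is below threshold, retry for missing fields."""
    raw = model(build_prompt(chunk, FIELDS))
    record = {f: str(raw.get(f, "")) for f in FIELDS}
    if quality(record) < threshold:
        missing = [f for f in FIELDS if not record[f].strip()]
        retry = model(build_prompt(chunk, missing))
        for f in missing:
            record[f] = str(retry.get(f, ""))
    return record


def consolidate(records: List[Record]) -> List[Record]:
    """Consolidation step: merge records that share a composition."""
    merged: Dict[str, Record] = {}
    for rec in records:
        key = rec["composition"].strip()
        if not key:
            continue  # records without a composition are dropped in this sketch
        target = merged.setdefault(key, {f: "" for f in FIELDS})
        for f in FIELDS:
            # Keep the first non-empty value seen for each field.
            if not target[f] and rec[f].strip():
                target[f] = rec[f].strip()
    return list(merged.values())


def to_csv(records: List[Record]) -> str:
    """Export consolidated records as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()
```

In practice the `Model` callable would wrap a real LLM client and a PDF text extractor would supply the chunks. Both are left out here so the sketch stays self-contained.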