About this Abstract |
Meeting |
2026 TMS Annual Meeting & Exhibition |
Symposium |
AI/ML/Data Informatics for Materials Discovery: Bridging Experiment, Theory, and Modeling |
Presentation Title |
Introducing XRDReader for Automated Extraction and Library Generation of X‑Ray Diffraction Data from Scientific Literature |
Author(s) |
Ulizes Atlixqueno, Afnan Mostafa, Niaz Abdolrahim |
On-Site Speaker (Planned) |
Ulizes Atlixqueno |
Abstract Scope |
Extracting quantitative X-ray diffraction (XRD) data from the materials-science literature is often laborious and error-prone. We introduce XRDReader, a novel tool that leverages the large-language-model capabilities of Google Gemini to automate the end-to-end extraction of XRD patterns and their compilation into a publicly accessible library. XRDReader first retrieves open-access publications, then applies prompt-engineered pipelines for document filtering and image classification to isolate the relevant XRD figures. It then synthesizes textual metadata, digitizes noisy diffraction plots into clean datasets, and formats the results into a standardized database. Through iterative prompt engineering of the paper-filtering and image-classification steps, at both the extraction and verification stages, Gemini achieves high accuracy in identifying and describing the desired XRD figures. The resulting open-access XRD dataset and the modular XRDReader tool establish a new paradigm for scalable literature mining in scientific research. |
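The abstract does not state how XRDReader digitizes diffraction plots. As a purely illustrative sketch of one generic step in plot digitization, the Python below maps pixel coordinates traced from a figure to (2θ, intensity) pairs using a linear calibration from two known tick marks per axis. The names `AxisCalibration` and `pixels_to_pattern` are hypothetical and not part of XRDReader. The sketch assumes linear axes and that the curve pixels have already been extracted.

```python
from dataclasses import dataclass


@dataclass
class AxisCalibration:
    """Linear map from a pixel coordinate to a data value along one axis.

    Hypothetical helper: defined by two reference ticks whose pixel
    positions and data values are both known.
    """
    pixel_a: float
    value_a: float
    pixel_b: float
    value_b: float

    def to_value(self, pixel: float) -> float:
        # Data units per pixel; negative when the axis runs "backwards",
        # as image y-coordinates do relative to intensity.
        scale = (self.value_b - self.value_a) / (self.pixel_b - self.pixel_a)
        return self.value_a + (pixel - self.pixel_a) * scale


def pixels_to_pattern(points, x_axis, y_axis):
    """Convert (px, py) curve pixels to (two_theta, intensity) pairs.

    The result is sorted by two_theta so it reads as a diffraction pattern.
    """
    pattern = [(x_axis.to_value(px), y_axis.to_value(py)) for px, py in points]
    return sorted(pattern)


if __name__ == "__main__":
    # Assumed example calibration: 2-theta ticks at 10 and 90 degrees,
    # intensity ticks at 0 and 1000 (arbitrary units).
    x_axis = AxisCalibration(pixel_a=100, value_a=10.0, pixel_b=900, value_b=90.0)
    y_axis = AxisCalibration(pixel_a=500, value_a=0.0, pixel_b=100, value_b=1000.0)
    for two_theta, intensity in pixels_to_pattern([(500, 300), (100, 500)], x_axis, y_axis):
        print(f"{two_theta:.2f}\t{intensity:.1f}")
```

A real pipeline would also need to handle logarithmic axes, overlapping curves, and noise; none of that is covered here.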
Proceedings Inclusion? |
Planned: |