Abstract Scope |
Discovering materials that meet specific functional requirements is central to technological progress but remains difficult because chemical space is vast. Exploring it through experiments or density functional theory (DFT) calculations is slow and resource-intensive. Generative models such as generative adversarial networks (GANs) and diffusion models offer faster alternatives, but conditioning them on natural-language input typically requires significant architectural changes. To address these challenges, we introduce CrysText, a framework that fine-tunes open-source large language models (LLMs) to generate Crystallographic Information Files (CIFs) from simple text descriptions, conditioned on composition, symmetry, and thermodynamic stability. By using parameter-efficient fine-tuning (PEFT), CrysText balances accuracy with computational efficiency. Evaluated on the MP-20 benchmark dataset, CrysText achieves high structure match rates and low root mean square error. It also generates novel, stable structures guided by energy-above-hull values, enabling fast, scalable, accurate, and targeted materials design and thereby accelerating materials discovery. |