Regression with Large Language Models for Materials and Molecular Property Prediction

This paper demonstrates that fine-tuned LLaMA 3 models can predict molecular and material properties from composition-based input strings alone, with accuracy rivaling standard machine learning models on the QM9 dataset and outperforming GPT-3.5 and GPT-4o. The result showcases the potential of large language models to move beyond their traditional applications and tackle complex physical phenomena in materials science.

Original authors: Ryan Jacobs, Maciej P. Polak, Lane E. Schultz, Hamed Mahdavi, Vasant Honavar, Dane Morgan

Published 2026-04-22

This is an AI-generated explanation of the paper below. It is not written or endorsed by the authors. For technical accuracy, refer to the original paper.

Imagine you have a super-intelligent robot librarian named LLaMA. For years, this librarian has been famous for one thing: reading books, writing stories, translating languages, and chatting with people. It's a master of words.

But what if you asked this word-loving robot to do something completely different? What if you asked it to act like a scientist and predict the physical properties of materials—like how strong a metal is, how well a battery conducts electricity, or how much energy a molecule holds?

This paper is the story of an experiment where the researchers tried to turn LLaMA from a "word wizard" into a "science predictor."

The Big Experiment: Can a Librarian Do Math?

Usually, to predict how a material behaves, scientists use complex math models that need a very detailed "blueprint" of the material. They need to know exactly where every single atom is sitting, like a 3D map of a city.

But the researchers wanted to see if LLaMA could do it with just a name.

  • The Input: Instead of a 3D map, they just gave LLaMA a text string. For a material, it was just the chemical recipe (e.g., "Al2O3"). For a molecule, it was a text code called SMILES (which is like a shorthand recipe for how atoms are connected).
  • The Task: They asked LLaMA to read that text string and guess a number (like "This material will melt at 1200 degrees").

The Results: The "Good, Better, and Best"

Here is how LLaMA performed, using some simple analogies:

1. The "Zero-Shot" Attempt (The Untrained Librarian)
First, they asked LLaMA to guess without any training. It was a disaster. The librarian just stared blankly, said nothing, or guessed wild numbers like "-1000 degrees."

  • Lesson: You can't just ask a word-expert to do science without teaching it the rules first.

2. The "Fine-Tuned" Attempt (The Trained Librarian)
Then, they "fine-tuned" LLaMA. Imagine feeding it thousands of examples: "Here is the text 'Al2O3', and here is the correct answer: 'Melting point is 2072°C'." They did this for hundreds of thousands of examples.
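
To make the idea concrete, here is a minimal sketch of how such (input string, property value) pairs could be packaged as prompt/completion records for fine-tuning. The prompt wording, the JSONL schema, and the ethanol target value are illustrative assumptions, not the paper's actual format:

```python
import json

# Illustrative sketch: turn (input string, property value) pairs into
# prompt/completion fine-tuning records. The prompt wording and JSONL
# schema are assumptions; the ethanol target value is made up.
examples = [
    ("Al2O3", 2072.0),   # composition string -> melting point in deg C
    ("CCO", 42.0),       # SMILES for ethanol -> hypothetical target value
]

def to_record(input_string, target):
    """Wrap one training pair as a prompt/completion record."""
    return {
        "prompt": f"Predict the property value for: {input_string}",
        "completion": f"{target}",
    }

records = [to_record(s, y) for s, y in examples]
jsonl = "\n".join(json.dumps(r) for r in records)
print(jsonl)
```

Framing regression as text completion like this is what lets a language model "do math": it only ever learns to emit the right string of digits for a given input string.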

  • The Result: LLaMA got really good! It started predicting numbers with surprising accuracy.
  • The Comparison:
    • Vs. Random Guessing: LLaMA crushed it.
    • Vs. Standard Computer Models (Random Forests): LLaMA was competitive. It was often just as good as, or slightly better than, standard computer models that have been used for years.
    • Vs. The "Super-Scientists" (State-of-the-Art AI): This is where LLaMA stumbled. The absolute best AI models in the world (like PAMNet) use those detailed 3D maps of atoms. Because LLaMA only had the "text recipe" and not the "3D map," it was about 5 to 10 times less accurate than the super-scientists.
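
Comparisons like these require turning the model's free-form text answers back into numbers and scoring them. A minimal sketch of that scoring step, with made-up model outputs and mean absolute error (MAE) as the metric, might look like:

```python
import re

# Sketch of scoring a text-generating model on a regression task:
# pull the first number out of each generated string, then compare
# against the true values. The sample outputs below are made up.
NUMBER = re.compile(r"[-+]?\d+(?:\.\d+)?")

def parse_prediction(text):
    """Extract the first numeric value from a model's text output, or None."""
    match = NUMBER.search(text)
    return float(match.group()) if match else None

def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

outputs = ["Melting point is 2072 C", "The value is about 1180.5", "roughly 640"]
targets = [2072.0, 1200.0, 650.0]

preds = [parse_prediction(o) for o in outputs]
print(preds)  # [2072.0, 1180.5, 640.0]
print(mean_absolute_error(targets, preds))
```

The parsing step also explains why the untrained "zero-shot" attempt failed so badly: a model that answers with nothing, or with a wild number, scores terribly no matter how fluent its prose is.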

3. The "Text Format" Debate (SMILES vs. InChI)
The researchers tried giving LLaMA different types of text recipes.

  • SMILES: A shorter, simpler text code.
  • InChI: A longer, more complex text code.
  • The Winner: LLaMA learned much faster and better with SMILES. It's like trying to learn a recipe from a short, punchy list of ingredients (SMILES) versus a long, legalistic paragraph (InChI). The shorter, simpler text was easier for the robot to understand.
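
To see the size difference concretely, here are the two encodings side by side for two small molecules. The strings are standard textbook representations; the point is simply how much more compact SMILES is:

```python
# Side-by-side look at the two text encodings discussed above.
# SMILES and InChI strings for ethanol and benzene.
molecules = {
    "ethanol": {
        "smiles": "CCO",
        "inchi": "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3",
    },
    "benzene": {
        "smiles": "c1ccccc1",
        "inchi": "InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H",
    },
}

for name, reps in molecules.items():
    print(f"{name}: SMILES has {len(reps['smiles'])} characters, "
          f"InChI has {len(reps['inchi'])}")
```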

4. The "Coordinates" Test (Giving the 3D Map)
They tried giving LLaMA the full 3D coordinates of the atoms (the detailed blueprint) to see if that would help.

  • The Surprise: It didn't help much! Even with the full 3D map, LLaMA didn't get significantly better. This suggests that LLaMA is really good at finding patterns in text, but it might not be the best tool for processing complex 3D geometry yet.
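
For intuition, here is one simple way 3D atomic coordinates could be flattened into a text string for a prompt. The atoms below are a water molecule with approximate geometry; the exact serialization format the authors used is an assumption here:

```python
# Sketch: flatten 3D atomic coordinates into a text string for an LLM
# prompt. Approximate water geometry; the serialization format is an
# illustrative assumption, not the paper's actual encoding.
atoms = [
    ("O", 0.000, 0.000, 0.117),
    ("H", 0.000, 0.757, -0.469),
    ("H", 0.000, -0.757, -0.469),
]

def atoms_to_text(atoms):
    """Render each atom as 'symbol x y z', one atom per line."""
    return "\n".join(f"{sym} {x:.3f} {y:.3f} {z:.3f}" for sym, x, y, z in atoms)

print(atoms_to_text(atoms))
```

A string like this carries the full geometry, but as the researchers found, a model trained on ordinary text does not automatically know how to reason about the spatial relationships buried in those digits.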

Why Does This Matter?

Think of it like this:

  • Traditional AI is like a specialized mechanic. If you want to fix a specific car engine, they are the best. But if you bring them a boat engine, they might not know what to do. They need specific tools (detailed data) for every job.
  • LLaMA (The LLM) is like a universal translator. It doesn't have specialized tools, but it understands the language of the problem. If you can describe the material in text, LLaMA can learn the relationship between the description and the outcome.

The Takeaway:
This paper shows that Large Language Models (LLMs) are versatile. They aren't just for writing emails or chatting; they can actually learn to predict physical properties just by reading text descriptions.

  • Pros: You don't need to build complex 3D models or do heavy data engineering. You just need the text name of the material. It's a "plug-and-play" approach for science.
  • Cons: It's currently slower and less accurate than the specialized "super-scientist" AI models that use detailed 3D data.

The Bottom Line

The researchers are saying: "We found a new way to use these AI chatbots. They aren't perfect scientists yet, but they are surprisingly good at guessing material properties just by reading their names. This opens the door for using these powerful, flexible tools in chemistry and materials science, potentially making it easier to discover new materials in the future."
