From Literature to Lab: Closed-Loop Advancement of Perovskite Solar Cells via Domain Knowledge Guided LLM

This paper introduces PVK-LLM, a domain-knowledge-guided framework that couples a Large Language Model with hierarchical Bayesian Optimization. In a closed-loop wet-lab campaign, it proposed a previously unreported perovskite solar cell recipe that reached 26.0% power conversion efficiency, addressing the difficulty general-purpose LLMs have in navigating complex material design spaces.

Original authors: Penglei Sun, Shuyan Chen, Xiang Liu, Longhan Zhang, Huajie You, Chang Yan, Yongqi Zhang, Xiaowen Chu, Tong-yi Zhang

Published 2026-02-06

Original paper licensed under CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). This is an AI-generated explanation of the paper below. It is not written or endorsed by the authors. For technical accuracy, refer to the original paper.

The Big Picture: From a Library to a Laboratory

Imagine trying to build the perfect cake. You have a library with millions of cookbooks (scientific literature), but the recipes are written in different languages, some are missing ingredients, and the instructions are vague. You also have a very smart robot chef (a Large Language Model, or LLM) who has read all those books.

The Problem: Even though the robot chef is brilliant at general conversation, if you ask it to bake a specific type of high-tech "Perovskite Solar Cell" cake, it often fails. It might suggest mixing ingredients that don't go together, or it might guess random amounts because it doesn't truly understand the chemistry of the cake. It's like a chef who knows the word "flour" but doesn't know how much to use for a soufflé versus a bread.

The Solution: The researchers built a specialized robot chef named PVK-LLM. They didn't just give it the library; they trained it specifically to become a master of Perovskite solar cells. They taught it to read the cookbooks, understand the chemistry, and then use a smart navigation system to find the perfect recipe without wasting time on bad guesses.

How They Trained the Robot (The Three-Step School)

To turn a general smart robot into a solar cell expert, they used a "curriculum" (a school plan) with three stages:

Stage 1: The Textbook Phase (Knowledge Injection)
- The Analogy: Imagine making the robot read 4,000+ scientific papers and then take a quiz on them.
- What happened: They fed the robot a massive dataset of questions and answers about solar cells. It learned the vocabulary, the rules, and the basic science. It went from knowing "what a solar cell is" to understanding "how to make one work better."
Stage 2: The Citation & Lab Phase (Instruction Alignment)
- The Analogy: Now the robot has to prove it didn't just memorize the answers. It has to show its homework (citations) and learn to read a lab notebook (experimental data).
- What happened: They taught the robot to say, "I know this works because this specific paper says so," and to look at a table of numbers and explain why a recipe worked or failed. This stopped it from making up fake science.
Stage 3: The Live Update Phase (Knowledge Graph)
- The Analogy: The library is always getting new books. The robot needs a way to check the latest news without re-reading the whole library every day.
- What happened: They built a "smart map" (a Knowledge Graph) of all the connections between materials. If a new paper comes out, the robot can instantly look up the map to see how this new ingredient fits with old ones.

The Navigation System: Finding the Needle in the Haystack

Once the robot is smart, it needs to find the best recipe. The space of possible recipes is huge—like trying to find the perfect combination of spices in a warehouse the size of a city.

The Old Way: Blindly mixing random spices until something tastes good. This takes forever.
The New Way (PVK-BO): The robot uses a "smart compass" called Bayesian Optimization.
- Because the robot already knows the "rules of the game" (domain knowledge), it doesn't start by guessing randomly. It starts with a "warm start"—a very good guess based on what it learned in school.
- It then runs a simulation (a virtual lab) to test its guess.
- If the virtual test fails, the robot learns why and adjusts the recipe.
- It repeats this loop, getting smarter and closer to the perfect recipe with every try.

The Real-World Test: The Wet-Lab Experiment

The researchers didn't stop at computer simulations. They also tested the robot's recipes in a physical lab (a "wet lab"), where the robot made the design decisions and human researchers carried out the fabrication.

The Goal: Make a solar cell that converts light to electricity as efficiently as possible (measured by PCE).
The Process:
1. Start: The robot suggested a standard recipe. It worked okay (23.68% efficiency).
2. Diagnosis: The robot looked at the results and said, "The problem is the 'passivation layer' (a protective coating). We need to mix four specific ingredients in a very precise way."
3. Iteration: The robot suggested a new mix of four ingredients (3MTPAI, PDAI2, EDAI2, and PipDI).
4. Result: After three rounds of testing and tweaking, the robot designed a recipe that achieved 26.0% efficiency.

Why is this a big deal?
Reaching 26% puts the device close to the best reported perovskite cells, a level that usually takes years of expert trial and error. Here the robot made the design decisions itself and proposed a combination of four ingredients that, according to the paper, had not been reported in the literature before. It essentially "invented" a new, better recipe on its own.

Summary

This paper shows that if you take a general AI, teach it the specific rules of a complex science (Perovskite solar cells), and give it a smart way to navigate the search for the best solution, it can act like a super-expert scientist. It can read the literature, understand the data, and autonomously design a winning experiment that rivals the best human experts.

Technical Summary: From Literature to Lab: Closed-Loop Advancement of Perovskite Solar Cells via Domain Knowledge Guided LLM

1. Problem Statement

Perovskite solar cells (PSCs) represent a disruptive next-generation photovoltaic technology with efficiencies exceeding 27%. However, their advancement is hindered by a vast, high-dimensional design space encompassing precursor compositions, solvent engineering, and processing parameters. Traditional researcher-driven trial-and-error approaches are laborious and inefficient, often failing to identify global optima within this complex combinatorial landscape.

While Large Language Models (LLMs) offer a way to exploit textual scientific knowledge, existing systems (e.g., the GPT series, ChemCrow, Coscientist) face critical limitations in the specialized domain of PSCs:

Semantic Misalignment: They struggle to align general semantic reasoning with precise, quantitative perovskite domain knowledge, often generating scientifically invalid recipes.
Inefficiency in High-Dimensional Spaces: They lack the ability to perform quantitative extrapolation in multi-parametric design spaces, making them ineffective for orchestrating the closed-loop optimization required for device breakthroughs.

2. Methodology: The PVK-LLM Framework

The authors propose PVK-LLM, a domain-knowledge-guided framework designed to bridge the gap between unstructured scientific literature and precise materials recipes. The methodology consists of two primary components: a specialized LLM trained via curriculum learning and a hierarchical Bayesian Optimization workflow (PVK-BO).

A. PVK-LLM: Three-Stage Curriculum Learning

Built upon the Qwen2.5-32B backbone, the model undergoes a three-stage training strategy to transform a general LLM into a domain expert:

Stage I: Knowledge Injection (PVK-Sci): The model is fine-tuned on a dataset of 55,104 question-answer pairs covering seven research themes (e.g., Device Structure, Interface Engineering, Defects). This stage internalizes core domain vocabulary and semantic patterns.
Stage II: Instruction Alignment:
- PVK-Cite: Fine-tuning on 22,916 QA pairs to ground responses in specific literature citations, ensuring factual accuracy.
- PVK-Exp: Fine-tuning on 10,648 QA pairs derived from experimental tables to enable the interpretation of quantitative performance metrics and efficiency mechanisms.
Stage III: Knowledge Retrieval-Augmented Generation (RAG): The model is connected to a Perovskite Knowledge Graph (PVK-KG) containing 23,789 entities and 22,272 triples. This graph is dynamically updated via an automated pipeline, allowing the model to access the latest scientific discoveries without frequent retraining.
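The retrieval idea in Stage III can be illustrated with a minimal sketch: facts live in a graph of (head, relation, tail) triples, and the ones relevant to a question are looked up and placed in the prompt, so a new fact becomes available as soon as it is added. Everything here is an assumption for illustration, not the authors' PVK-KG pipeline: the `TripleStore` class, the relation names, the substring matching, and the prompt layout are all hypothetical, and the few triples shown only restate relationships mentioned elsewhere in this summary.

```python
from collections import defaultdict


class TripleStore:
    """Tiny in-memory knowledge graph of (head, relation, tail) triples (hypothetical)."""

    def __init__(self):
        self._by_entity = defaultdict(list)

    def add(self, head, relation, tail):
        """Insert a triple and index it under both of its entities."""
        triple = (head, relation, tail)
        for entity in {head, tail}:
            if triple not in self._by_entity[entity]:
                self._by_entity[entity].append(triple)

    def retrieve(self, question, max_facts=5):
        """Return triples touching any entity whose name appears in the question."""
        text = question.lower()
        facts = []
        for entity, triples in self._by_entity.items():
            if entity.lower() in text:
                facts.extend(t for t in triples if t not in facts)
        return facts[:max_facts]


def build_prompt(store, question):
    """Prepend retrieved facts to the question before it is sent to a model."""
    lines = [f"{h} --{r}--> {t}" for h, r, t in store.retrieve(question)]
    return "Known facts:\n" + "\n".join(lines) + "\n\nQuestion: " + question


# Illustrative triples only; relation names are made up for this sketch.
kg = TripleStore()
kg.add("PDAI2", "component_of", "passivation system")
kg.add("EDAI2", "component_of", "passivation system")
kg.add("passivation system", "applied_to", "p-i-n PSC")

before = kg.retrieve("What is known about PipDI?")
kg.add("PipDI", "component_of", "passivation system")  # graph update, no retraining
after = kg.retrieve("What is known about PipDI?")
print(before)
print(after)
```

The point of the design is that updating the graph changes what the model can cite without touching its weights; a real system would replace the substring match with proper entity linking.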

B. PVK-BO: Hierarchical Bayesian Optimization

To navigate the high-dimensional design space, the framework employs PVK-BO, an optimization engine that integrates the LLM's generative capabilities with Bayesian Optimization. The workflow operates in a closed loop:

Warmstarting: PVK-LLM generates high-quality initial candidates based on domain knowledge, addressing the "cold-start" problem.
Recipe Design & Candidate Sampling: The LLM acts as a generative proposal distribution, suggesting new recipes conditioned on historical feedback and target performance.
Surrogate Modelling: The LLM functions as a probabilistic surrogate to predict the performance of candidates before physical verification.
Acquisition Function: Instead of rigid statistical functions, PVK-LLM itself acts as an intelligent acquisition function, weighing potential performance gains against exploration value to select the next optimal recipe.
Verification: Recipes are validated via physics-based simulators (SCAPS-1D) or wet-lab experiments, with results fed back to refine the strategy.
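The five steps above can be sketched as a loop. This is a toy under stated assumptions, not the authors' implementation: in the paper the LLM plays the warmstart, proposal, surrogate, and acquisition roles, whereas here each role is a simple numeric placeholder so the control flow is runnable. The two-parameter recipe space, the quadratic `toy_simulator`, the prior guess, and the `kappa` exploration weight are all invented for illustration.

```python
import math
import random


def clamp(value):
    """Keep a normalized recipe parameter inside [0, 1]."""
    return min(1.0, max(0.0, value))


def toy_simulator(recipe):
    """Placeholder for SCAPS-1D or a wet-lab run; returns an invented PCE in %."""
    x, y = recipe
    return 26.0 - 40.0 * ((x - 0.6) ** 2 + (y - 0.3) ** 2)


def warm_start(n, rng):
    """Placeholder for LLM warmstarting: sample near a prior guess, not uniformly."""
    prior = (0.5, 0.4)  # hypothetical domain-knowledge guess
    return [tuple(clamp(p + rng.gauss(0.0, 0.1)) for p in prior) for _ in range(n)]


def propose(history, n, rng):
    """Placeholder for candidate sampling: perturb the best recipe seen so far."""
    best_recipe, _ = max(history, key=lambda item: item[1])
    return [tuple(clamp(v + rng.gauss(0.0, 0.05)) for v in best_recipe) for _ in range(n)]


def surrogate(candidate, history):
    """Placeholder surrogate: nearest-neighbour PCE plus a distance-based uncertainty."""
    recipe, pce = min(history, key=lambda item: math.dist(item[0], candidate))
    return pce, math.dist(recipe, candidate)


def acquire(candidates, history, kappa=5.0):
    """Placeholder acquisition: predicted PCE plus an exploration bonus."""
    def score(candidate):
        mean, uncertainty = surrogate(candidate, history)
        return mean + kappa * uncertainty
    return max(candidates, key=score)


def closed_loop(n_init=3, n_iter=20, seed=0):
    """Warmstart, then repeat propose -> acquire -> verify -> feed back."""
    rng = random.Random(seed)
    history = [(r, toy_simulator(r)) for r in warm_start(n_init, rng)]
    for _ in range(n_iter):
        candidates = propose(history, 8, rng)
        chosen = acquire(candidates, history)
        history.append((chosen, toy_simulator(chosen)))  # verification result fed back
    return history


history = closed_loop()
initial_best = max(pce for _, pce in history[:3])
final_best = max(pce for _, pce in history)
print(f"best after warm start: {initial_best:.2f}%, best after loop: {final_best:.2f}%")
```

Swapping `toy_simulator` for a real simulator call or a manual lab measurement leaves the loop unchanged, which is the sense in which verification is a pluggable step.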

3. Key Results

A. Benchmark Performance

Domain Knowledge: On the PVK-MCQ benchmark (1,103 multiple-choice questions), PVK-LLM achieved a state-of-the-art accuracy of 87.25%, outperforming general LLMs like Qwen2.5-72B and GPT-4o.
Qualitative Assessment: In open-ended generation tasks (PVK-QA), PVK-LLM scored highest across accuracy, completeness, relevance, and clarity.
Human Evaluation: In pairwise comparisons judged by five domain experts, PVK-LLM's answers were preferred in 69.5% of comparisons against Qwen2.5-32B, 65.5% against GPT-4, and 68.0% against Deepseek-R1.
Semantic Structure: t-SNE visualization confirmed that PVK-LLM's embedding space forms distinct, semantically meaningful clusters for perovskite materials, unlike the random distribution of the base model.

B. Simulator Experiments

Using SCAPS-1D, the framework was tested on Band Alignment and Doping Optimization tasks:

Initialization Efficiency: PVK-BO achieved a higher initial mean Power Conversion Efficiency (PCE) than random search, Sobol sequences, and Latin Hypercube sampling, effectively bypassing the initial exploration phase.
Convergence: In Band Alignment, PVK-BO reached a maximum PCE of 26.52%. In Doping Optimization, it reached 25.44%, outperforming standard Bayesian Optimization (BO), HEBO, and TuRBO baselines with lower variance and faster convergence.
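For readers unfamiliar with the initialization baselines, here is a minimal pure-Python sketch of Latin Hypercube sampling, one of the space-filling schemes PVK-BO's warm start is compared against. It is a generic textbook version written for this summary, not code from the paper; the function name and the choice of 8 samples in 2 dimensions are arbitrary.

```python
import random


def latin_hypercube(n_samples, n_dims, rng):
    """Return n_samples points in [0, 1)^n_dims with one point per bin in every dimension."""
    columns = []
    for _ in range(n_dims):
        # One random point inside each of n_samples equal-width bins, in shuffled order.
        column = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(column)
        columns.append(column)
    return [tuple(column[i] for column in columns) for i in range(n_samples)]


points = latin_hypercube(8, 2, random.Random(0))
for point in points:
    print(tuple(round(v, 3) for v in point))
```

Such schemes spread initial points evenly but use no prior knowledge of where good recipes lie, which is the gap a knowledge-based warm start is meant to close.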

C. Wet-Lab Experiment

In an autonomous closed-loop wet-lab experiment targeting p-i-n PSCs:

Discovery: PVK-LLM identified interface passivation as the bottleneck and autonomously designed a novel, unreported four-component synergistic passivation system: 3MTPAI, PDAI2, EDAI2, and PipDI.
Optimization: Through three iterative epochs, the model refined the mixture ratios.
- Epoch 0 (Baseline): 23.68% PCE.
- Epoch 1: 25.07% PCE.
- Epoch 2: 25.23% PCE.
- Epoch 3 (Champion): 26.00% PCE ($V_{OC}=1.155$ V, $J_{SC}=26.35$ mA/cm², $FF=85.44\%$).
This result approaches world records typically achieved through extensive expert trial-and-error.
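As a consistency check on the champion device, the reported parameters can be combined with the usual relation PCE = V_OC × J_SC × FF / P_in. The sketch below assumes the standard test illumination of 100 mW/cm² (AM1.5G), which this summary does not state explicitly; the function name is made up for illustration.

```python
def pce_percent(v_oc, j_sc, ff_percent, p_in=100.0):
    """PCE in % from V_OC [V], J_SC [mA/cm^2], FF [%], and P_in [mW/cm^2].

    p_in=100.0 is an assumption (standard AM1.5G illumination).
    """
    output_power = v_oc * j_sc * (ff_percent / 100.0)  # mW/cm^2
    return output_power / p_in * 100.0


champion = pce_percent(1.155, 26.35, 85.44)
print(f"{champion:.2f}")  # prints 26.00
```

Under that assumption the three reported values reproduce the stated 26.00% PCE to two decimal places.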

4. Significance and Claims

According to the paper, PVK-LLM demonstrates that domain-knowledge-guided AI can:

Internalize Scientific Expertise: By bridging the semantic gap between natural language and quantitative material science, the model can comprehend complex fabrication parameters and experimental records.
Accelerate Material Discovery: The framework effectively navigates high-dimensional design spaces, overcoming cold-start problems and reducing the iteration cycle from theoretical design to laboratory validation.
Enable Autonomous Optimization: The integration of LLMs with Bayesian Optimization allows for the autonomous proposal of novel, high-performance recipes (such as the 26.0% PCE device) without human guidance in the decision-making loop.

The authors note that wet-lab execution currently relies on a human in the loop, but the framework is designed for extensibility. By integrating standard API interfaces for laboratory automation, the system aims to evolve into a foundational agent for self-driving laboratories, applicable to other materials science challenges such as battery electrolyte optimization and organic photovoltaic screening.

Limitations Acknowledged:

The model's predictive accuracy is bounded by the quality, noise, and reporting inconsistencies of existing open-source academic literature.
Current wet-lab throughput is limited by manual intervention, though the decision-making loop is fully closed.