Prompt Readiness Levels (PRL): a maturity scale and scoring framework for production-grade prompt assets

This paper introduces Prompt Readiness Levels (PRL), a nine-level maturity scale, and the Prompt Readiness Score (PRS), a multidimensional scoring framework. Together they give organizations a standardized, auditable method for qualifying, governing, and deploying production-grade prompt assets against safety, compliance, and operational objectives.

Sébastien Guinard (Univ. Grenoble Alpes, CEA, DRT F-38000 Grenoble)

Published 2026-03-17

Imagine you are building a house. You wouldn't just hand a brick to a worker and say, "Make a wall," and hope for the best. You need blueprints, safety inspections, and a clear plan to ensure the house won't collapse when the wind blows.

For a long time, Generative AI (like the chatbots we use today) has been built more like a magic trick than a house. Engineers would type a "prompt" (an instruction), hope the AI gave a good answer, and if it worked, they'd use it. But if the AI suddenly started lying, being rude, or giving dangerous advice, there was no standard way to say, "Whoa, this isn't ready for the real world yet."

This paper, written by Sébastien Guinard, proposes a new system to fix that. It's like giving AI instructions a driver's license.

Here is the breakdown of the paper using simple analogies:

1. The Problem: The "Wild West" of AI Instructions

Right now, writing a prompt for an AI is like giving directions to a tourist who speaks a different language. Sometimes they get it right; sometimes they get lost; sometimes they drive off a cliff.

  • The Issue: Companies are using these "prompts" in critical jobs (like banking, healthcare, or customer service), but they have no shared language to say, "Is this prompt safe?" or "Is this prompt good enough?"
  • The Analogy: Imagine a restaurant where the chef just guesses the recipe every day. Sometimes the soup tastes great; sometimes it has poison in it. We need a way to grade the recipes before they go to customers.

2. The Solution: PRL (The "Driver's License" for Prompts)

The author introduces PRL (Prompt Readiness Levels). This is inspired by the TRL (Technology Readiness Levels) used by NASA to decide if a rocket is ready to fly.

Think of PRL as a 9-Step Ladder. You cannot skip steps. You can't claim your prompt is "Production Ready" if it hasn't passed the basic tests.

  • Levels 1–3 (The Sketchpad): This is the "Idea Phase."
    • Analogy: You are drawing a rough sketch of a car. Does the engine concept make sense? Does the car have wheels?
    • Goal: Just checking if the AI understands the basic task.
  • Levels 4–6 (The Test Track): This is the "Hardening Phase."
    • Analogy: You put the car on a test track. You drive it over bumps, in the rain, and at high speeds. Does it break? Does it handle well?
    • Goal: Making sure the AI gives consistent answers and doesn't get confused by typos or weird questions.
  • Levels 7–9 (The Highway & Certification): This is the "Production Phase."
    • Analogy: The car passes safety inspections, has airbags, and is legally allowed on public roads. It has a license plate and a warranty.
    • Goal: Ensuring the AI is safe from hackers, follows laws (like privacy rules), and is integrated into the company's systems.
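The ladder above can be sketched in code. This is a minimal illustration: the phase names follow the analogies in this summary, not the paper's official level definitions, and the one-step promotion rule is a simplified reading of "you cannot skip steps."

```python
# Hypothetical sketch of the PRL ladder: phase names are paraphrased from
# this summary's analogies, not taken from the paper itself.
PHASES = {
    range(1, 4): "Sketchpad (idea)",        # Levels 1-3
    range(4, 7): "Test Track (hardening)",  # Levels 4-6
    range(7, 10): "Highway (production)",   # Levels 7-9
}

def phase(level: int) -> str:
    """Map a PRL level (1-9) to its phase."""
    for levels, name in PHASES.items():
        if level in levels:
            return name
    raise ValueError("PRL level must be between 1 and 9")

def promote(current: int, target: int) -> int:
    """Advance exactly one level at a time; skipping levels is forbidden."""
    if target != current + 1:
        raise ValueError("cannot skip PRL levels")
    return target

level = promote(3, 4)
print(level, phase(level))  # → 4 Test Track (hardening)
```

The point of `promote` is just to encode the ladder's core constraint: a prompt claiming "production ready" (level 7+) must have passed through every hardening level below it.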

3. The Scorecard: PRS (The "Report Card")

Just having a "Level" isn't enough; you need a score. The paper introduces PRS (Prompt Readiness Score).

Think of this as a 5-Point Report Card for your AI prompt. To get a high score, you can't just be good at one thing. You must be good at all of them. If you fail one, you fail the whole test.

The 5 subjects on the report card are:

  1. Reliability (R): Does it give the same answer every time, or does it hallucinate (make things up)?
  2. Stability (S): Does it break if someone types a typo or uses weird slang?
  3. Compliance (C): Is it safe? Does it refuse to answer if asked to build a bomb? Does it follow privacy laws?
  4. Governance (G): Do we know who wrote it? Do we have a backup plan? Is it version-controlled (like saving a document with "v1," "v2")?
  5. Operations (O): Is it cheap and fast to run?

The "No Weak Link" Rule:
The paper's rule is strict: if your prompt is amazing at being fast (Operations) but terrible at being safe (Compliance), its overall score collapses to zero.

  • Analogy: Imagine a race car that goes 200 mph but has no brakes. It doesn't matter how fast it is; it's not allowed on the track.
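The gated scoring idea can be sketched as follows. Note this is a hedged illustration: the paper's exact aggregation formula and thresholds are not given in this summary, so the failing floor of 50 and the unweighted mean are assumptions chosen only to demonstrate the "no weak link" rule.

```python
# Hypothetical PRS aggregation: the floor value and the unweighted mean are
# illustrative assumptions, not the paper's published formula.
DIMENSIONS = ("reliability", "stability", "compliance", "governance", "operations")

def prompt_readiness_score(scores: dict, floor: float = 50.0) -> float:
    """Return a 0-100 PRS. Any single dimension below `floor` gates the score to 0."""
    missing = set(DIMENSIONS) - scores.keys()
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    if any(scores[d] < floor for d in DIMENSIONS):
        return 0.0  # the "no weak link" rule: one failure fails the whole test
    return sum(scores[d] for d in DIMENSIONS) / len(DIMENSIONS)

# A race car with no brakes: excellent Operations, failing Compliance.
fast_but_unsafe = {"reliability": 90, "stability": 88, "compliance": 30,
                   "governance": 80, "operations": 95}
balanced = {"reliability": 85, "stability": 80, "compliance": 82,
            "governance": 88, "operations": 90}

print(prompt_readiness_score(fast_but_unsafe))  # → 0.0
print(prompt_readiness_score(balanced))         # → 85.0
```

The gate-then-average shape is what makes the score non-gameable: a team cannot trade safety away for speed, because the weakest dimension, not the average, decides whether the prompt passes at all.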

4. Why This Matters

Before this paper, saying "Our AI is ready" was just a marketing claim. Now, teams can say:

"Our prompt is PRL Level 7. It has passed security tests, follows GDPR laws, and has a score of 85/100. Here is the evidence."

This helps:

  • Managers decide when a prompt is worth further investment.
  • Regulators verify that the AI is safe to deploy.
  • Engineers know exactly what to fix before the next step.

Summary

This paper is a rulebook for growing up. It tells us that writing prompts for AI isn't just "typing words." It's engineering. By using the PRL ladder and the PRS report card, we can turn messy, risky AI experiments into reliable, safe, and trustworthy tools that we can actually use in the real world.

In short: It turns AI prompts from "magic spells" into "industrial machinery" that we can inspect, certify, and trust.
