From Business Events to Auditable Decisions: Ontology-Governed Graph Simulation for Enterprise AI

This paper introduces LOM-action, an ontology-governed graph simulation framework that transforms enterprise AI decision-making by grounding agent responses in event-driven, deterministic sandbox simulations to ensure auditability and achieve significantly higher tool-chain reliability than existing LLM baselines.

Hongyin Zhu, Jinming Liang, Mengjun Hou, Ruifan Tang, Xianbin Zhu, Jingyuan Yang, Yuanman Mao, Feng Wu

Published 2026-04-13

The Big Problem: The "Fluent but Wrong" AI

Imagine you hire a brilliant, fast-talking consultant (a standard Large Language Model or LLM) to manage your company's finances. You ask, "Can we approve this $50,000 expense?"

The consultant answers instantly: "Yes, absolutely! Here is the approval." They sound confident, and the grammar is perfect. But here's the catch: They didn't check the rules. They didn't look at the current budget, they didn't check if the manager has the authority to sign off on that amount, and they didn't see that the company is currently in a "frozen spending" mode due to a merger.

They just guessed based on general knowledge. In the real world, this is dangerous. If the decision is later challenged, you can't reconstruct why they said yes, because they never actually followed a process. They just "felt" it was right.

This is what the paper calls "Illusive Accuracy." The AI looks smart (high accuracy), but it's actually hallucinating a decision because it skipped the necessary steps to check the specific rules of the moment.


The Solution: LOM-action (The "Simulation Sandbox")

The authors propose a new system called LOM-action. Instead of letting the AI guess, they force it to play a "what-if" game before it makes a decision.

Think of it like a Flight Simulator for business decisions.

  1. The Real World (The Enterprise Ontology): This is your company's actual rulebook, database, and org chart. It's huge and complex.
  2. The Event (The Trigger): A business event happens (e.g., "A manager submits an expense report").
  3. The Sandbox (The Simulation): Before the AI says "Yes" or "No," it creates a copy of the company's rulebook in a safe, isolated room (a sandbox).
    • It applies the specific rules for this event (e.g., "Oh, this manager is in the Marketing department, so they have a $5k limit," or "The company is in a freeze, so no new spending").
    • It cuts out the parts of the rulebook that don't apply and adds the new constraints.
    • Crucially: It does this without touching the real company database. It's just a simulation.
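The sandbox idea above can be sketched in a few lines of Python. This is a toy illustration, not the paper's actual data model: the ontology dict, the field names, and the constraints are all hypothetical, chosen to match the expense-report example.

```python
import copy

# Hypothetical toy "enterprise ontology": departments, limits, and flags
# are illustrative stand-ins for the paper's graph-structured rulebook.
ontology = {
    "departments": {"Marketing": {"expense_limit": 5_000}},
    "global_flags": {"spending_freeze": True},
}

def build_sandbox(ontology, event):
    """Prepare a sandbox for one event. The real ontology is never mutated:
    all event-specific pruning and constraints happen on a deep copy."""
    sandbox = copy.deepcopy(ontology)
    dept = sandbox["departments"][event["department"]]
    # Keep only the rules that apply to this event, plus new constraints.
    sandbox["active_constraints"] = {
        "limit": dept["expense_limit"],
        "frozen": sandbox["global_flags"]["spending_freeze"],
    }
    return sandbox

event = {"department": "Marketing", "amount": 50_000}
sandbox = build_sandbox(ontology, event)
# The original rulebook is untouched — the simulation stayed in its box.
assert "active_constraints" not in ontology
```

The key design point the paper emphasizes is the last line: because the simulation only ever touches the copy, a failed or exploratory "what-if" run can never corrupt the real company data.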

How It Works: The Three-Step Dance

The paper describes a strict three-step process that the AI must follow, like a pilot going through a pre-flight checklist:

  1. Phase 1: The Translator (Scenario Parsing)
    The AI reads the messy human request and translates it into strict business rules.

    • Analogy: You tell the pilot, "I want to fly to London." The pilot translates that into: "Check wind speed, check fuel levels, check runway 22 availability."
  2. Phase 2: The Simulator (Sandbox Simulation)
    The AI goes into the "Sandbox." It takes a copy of the company's data and physically removes or changes the parts that don't fit the current situation.

    • Analogy: The pilot runs the flight simulator. The computer simulates the wind, the fuel burn, and the runway conditions. It creates a specific "flight path" that is valid only for this specific trip.
    • The Magic: If the simulation shows the path is blocked (e.g., "No valid path exists because the budget is frozen"), the AI stops. It doesn't guess. It reports the blockage.
  3. Phase 3: The Decision (Derivation)
    The AI looks only at the result of the simulation.

    • Analogy: The pilot looks at the simulator's output. If the simulator says "Go," the pilot says "Go." If the simulator says "Crash," the pilot says "Cancel."
    • The Audit Trail: Because the AI followed the simulation steps, we have a perfect record (a receipt) of exactly why the decision was made. "We said no because the simulation showed the budget was frozen."
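The three phases above can be sketched as a small pipeline. Again, this is a hedged sketch, not the paper's implementation: the parser is a stub, and the constraint names and return shapes are assumptions made for the running expense example.

```python
# Hypothetical sandbox state produced by the earlier copy-and-prune step.
sandbox = {"active_constraints": {"limit": 5_000, "frozen": True}}

def parse_scenario(request):
    # Phase 1 (Translator): turn a messy request into explicit checks.
    return {"amount": request["amount"], "checks": ["freeze", "limit"]}

def simulate(sandbox, scenario):
    # Phase 2 (Simulator): walk the sandbox, recording every rule consulted.
    trail = []
    c = sandbox["active_constraints"]
    if "freeze" in scenario["checks"]:
        trail.append(f"spending_freeze={c['frozen']}")
        if c["frozen"]:  # path blocked — stop, don't guess
            return {"path_found": False, "reason": "budget frozen", "trail": trail}
    if "limit" in scenario["checks"]:
        trail.append(f"limit={c['limit']} vs amount={scenario['amount']}")
        if scenario["amount"] > c["limit"]:
            return {"path_found": False, "reason": "over limit", "trail": trail}
    return {"path_found": True, "reason": "all checks passed", "trail": trail}

def derive(result):
    # Phase 3 (Derivation): the verdict reads ONLY the simulation output.
    verdict = "APPROVE" if result["path_found"] else "REJECT"
    return verdict, result["reason"], result["trail"]

verdict, reason, trail = derive(simulate(sandbox, parse_scenario({"amount": 50_000})))
```

Note that `trail` is the "receipt": every rule the simulator consulted is logged in order, so the final verdict can be replayed and audited step by step.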

The "Dual-Mode" Brain

The system has two ways of thinking, like a human having a fast "reflex" mode and a slower "thoughtful" mode:

  • Skill Mode (The Reflex): If the AI has seen this type of problem before, it uses a pre-approved "tool" (like a calculator or a database query) to get the answer instantly. It's fast and safe.
  • Reasoning Mode (The Thoughtful): If the problem is new and complex, the AI pauses, loads the simulated data into its "working memory," and thinks through the logic step-by-step.
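A minimal dispatch sketch for the dual-mode idea might look like the following. The registry, function names, and the fallback reasoner are all hypothetical; the paper's actual skill library and reasoning loop are not shown here.

```python
# Hypothetical registry of pre-approved "skills" (fast, vetted tools).
KNOWN_SKILLS = {
    "expense_approval": lambda event: event["amount"] <= 5_000,
}

def reason_step_by_step(event):
    # Placeholder for the slow path: load the sandbox into working memory
    # and reason through the logic step by step. Toy rule for illustration.
    return event["amount"] <= 5_000

def decide(event):
    skill = KNOWN_SKILLS.get(event["type"])
    if skill is not None:
        return ("skill", skill(event))            # reflex: run the vetted tool
    return ("reasoning", reason_step_by_step(event))  # deliberate: think it through

mode, approved = decide({"type": "expense_approval", "amount": 2_000})
```

The design choice mirrors the paper's framing: known problem shapes go through cheap, pre-approved tools, and only genuinely novel cases pay the cost of full step-by-step reasoning.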

Why This Matters: The "Illusive Accuracy" Trap

The paper tested this against top-tier AI models (like Doubao and DeepSeek).

  • The Top Models: They got the final answer right 80% of the time. But when you checked how they got there, they skipped the simulation steps. They just guessed. Their "Tool-Chain F1" (a score for following the process) was terrible (around 24-36%).
  • LOM-action: It got the answer right 94% of the time, and it followed the process almost perfectly (a Tool-Chain F1 around 98%).

The Lesson: In a business, being "right by accident" is a liability. If you get sued, you can't say, "The AI guessed right." You need to say, "The AI followed the rules, ran the simulation, and the simulation said yes."

Summary Metaphor: The Traffic Light

  • Standard AI: A driver who sees a red light but thinks, "I'm a good driver, I'll just speed through it because I feel like it." They might make it across safely (Accurate), but they broke the law and have no record of why they thought it was safe.
  • LOM-action: A driver who stops, checks the traffic camera feed (Simulation), sees the light is red, checks the police report (Audit Trail), and waits. If the light turns green, they go. If it stays red, they wait. They have a perfect log of every second they waited.

In short: This paper argues that for AI to be trusted in business, it shouldn't just be a smart talker; it must be a disciplined simulator that proves its work before making a single decision.
