Marking Data-Informativity and Data-Driven Supervisory Control of Discrete-Event Systems

This paper introduces the concept of marking data-informativity and develops algorithms to design valid nonblocking supervisors for unknown discrete-event systems from available behavioral data. It also addresses cases where the data are insufficient, via restricted informativity and informatizability frameworks.

Yingying Liu, Kuma Fuchiwaki, Kai Cai

Published Mon, 09 Ma

Imagine you are teaching a robot to navigate a maze to reach a treasure chest (the "goal"). In the old way of doing things (Model-Based Control), you would first need a perfect, blueprinted map of the entire maze before you could give the robot any instructions. You'd know exactly where the walls are, where the traps are, and every possible path.

But what if the maze is in a dark room, or it's a brand-new environment you've never seen? You don't have a map. However, you do have a notebook of observations. You've watched the robot move a few times, and you've noted:

  1. What it did: "It went Left, then Right, then Up."
  2. What it achieved: "When it went Left-Right-Up, it found the treasure."
  3. What it definitely didn't do: "It never went Down-Left, because that leads to a bottomless pit (which we know exists from physics, even if we haven't seen it yet)."

This paper is about how to teach the robot to reach the treasure using only that notebook, without ever seeing the full map.

The Core Problem: The "Unknown" Maze

The authors ask: Can we write a set of rules (a supervisor) that guarantees the robot reaches the treasure and never gets stuck, even though we don't know the full layout of the maze?

The answer depends on the quality of your notes. If your notes are too vague, you might give the robot a rule like "Always go Right," but in the real maze, going Right might lead to a dead end that you didn't see in your notes.

The Key Concept: "Marking Data-Informativity"

The authors introduce a fancy term called Marking Data-Informativity. Let's break this down with a metaphor:

Imagine you are a detective trying to solve a crime. You have a list of suspects (all the possible maze layouts that fit your notes).

  • Data-Informativity means your notes are so detailed that every single suspect agrees on the same safe path to the treasure.
  • If your notes are "informative," you can say, "No matter which suspect is the real criminal, if they follow these rules, they will get the treasure and won't get stuck."
  • If your notes are not informative, it means there is a "suspect" (a possible maze layout) where your rules would cause the robot to crash or get stuck.
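The detective metaphor above can be made concrete. A rough sketch (hypothetical names and structure, not the paper's formal definition): the data are "informative" if a single control policy succeeds in *every* candidate system that is consistent with the notes.

```python
# Toy sketch of the informativity idea (illustrative, not the paper's
# algorithm). Each candidate "maze" is an automaton represented as
# state -> {event: next_state}, with a distinguished "goal" state.
candidates = [
    {"s0": {"L": "s1"}, "s1": {"R": "goal"}, "goal": {}},
    {"s0": {"L": "s1", "D": "pit"}, "s1": {"R": "goal"}, "goal": {}, "pit": {}},
]

# A fixed sequence of controlled moves (the "rules" we give the robot).
policy = ["L", "R"]

def reaches_goal(system, policy, start="s0"):
    """Does following `policy` from `start` end in the goal state?"""
    state = start
    for event in policy:
        if event not in system.get(state, {}):
            return False  # the policy attempts a move this maze lacks
        state = system[state][event]
    return state == "goal"

# Informative w.r.t. this policy: every suspect maze agrees it works.
informative = all(reaches_goal(sys_, policy) for sys_ in candidates)
print(informative)  # True: both candidates reach the goal under L, R
```

If even one consistent candidate fails under the policy, the data are not informative for that goal, matching the "one suspect where the robot crashes" case above.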

The "Marking" part is crucial. In robotics, "marking" means reaching a goal state. The paper emphasizes that it's not enough to just keep the robot moving; it must actually finish the job (reach the treasure). Many previous methods ignored this, leading to robots that moved forever in circles without ever finding the goal.

The Three Types of Data

To solve this, the paper says you need three specific types of notes:

  1. Observation (D): What the robot actually did.
  2. Marked Observation (D_m): Which of those actions actually led to the goal.
  3. Negative Knowledge (D^-): What you know is impossible (e.g., "The robot can't fly," or "It can't go through a wall").

The Magic Trick: The paper shows that having a good list of "impossible things" (D^-) is just as important as having a long list of "things that happened." If you know what can't happen, you can safely assume that if a path isn't in your notes, it probably leads to a dead end or a wall, so you don't have to worry about it.
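A minimal sketch of the three data sets, using tuples of event labels (the names and encoding are illustrative, not the paper's notation):

```python
# D: observed strings -- what the robot actually did.
D = {("left",), ("left", "right"), ("left", "right", "up")}

# D_m: observed strings that reached a marked (goal) state.
D_m = {("left", "right", "up")}

# D_minus: strings known a priori to be impossible (negative knowledge).
D_minus = {("down", "left")}

# Basic consistency checks: every marked observation must also be an
# observation, and nothing observed may be declared impossible.
assert D_m <= D
assert D.isdisjoint(D_minus)
```

Any algorithm working from such notes starts by assuming the true system contains at least D, marks at least D_m, and contains nothing in D_minus.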

What Happens When the Notes Aren't Good Enough?

Sometimes, your notes are just too sparse. You might not have enough "Negative Knowledge" to rule out dangerous paths. In this case, the paper says: "Don't give up! Just aim for a smaller goal."

They introduce a concept called Restricted Marking Data-Informativity.

  • Analogy: You wanted the robot to find the "Golden Treasure," but your notes aren't good enough to guarantee that. However, your notes are good enough to guarantee it can find the "Silver Coin" nearby.
  • The paper provides an algorithm to automatically shrink your goal from "Golden Treasure" to the "Largest Possible Safe Goal" that your notes can support. It's like a GPS that says, "I can't get you to the beach, but I can definitely get you to the park."
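One way to picture the goal-shrinking step (a hypothetical sketch, not the paper's actual algorithm): keep only those target strings that every candidate system consistent with the data can realize.

```python
# Achievable strings in each candidate system consistent with the notes
# (illustrative data; each set is one "suspect" maze's capability).
candidates = [
    {("L",), ("L", "R"), ("L", "R", "U")},  # system A
    {("L",), ("L", "R")},                   # system B
]

# The original specification: the "golden treasure" goals.
desired_goal = {("L", "R", "U"), ("L", "R")}

# Largest safe goal: target strings achievable in *all* candidates.
safe_goal = desired_goal.intersection(*candidates)
print(safe_goal)  # {('L', 'R')} -- the guaranteed "silver coin"
```

Here ("L", "R", "U") is dropped because system B cannot realize it, so no supervisor could guarantee it against that suspect; ("L", "R") survives because every candidate supports it.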

The Algorithm: The "Safety Filter"

The authors built a step-by-step recipe (an algorithm) to check your notes:

  1. Build a "Shadow Maze": They create a digital model based only on your notes.
  2. Test the Rules: They simulate every possible move the robot could make.
  3. The "Uncontrollable" Test: They ask, "If an event the robot can't control occurs (like a sudden wind gust), does the robot still stay safe?"
    • If the answer is Yes for all possibilities: Great! You have a valid supervisor.
    • If the answer is No: The algorithm automatically trims your goal to the safest, largest possible version that does work.
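The trimming loop in step 3 resembles a standard fixed-point computation from supervisory control. A rough sketch under assumed names (not the paper's exact algorithm): repeatedly remove any state from which an uncontrollable event escapes the safe set, until nothing changes.

```python
# state -> [(event, next_state)] for a tiny example system.
transitions = {
    "s0": [("go", "s1")],
    "s1": [("gust", "cliff"), ("go", "goal")],
    "s2": [("go", "goal")],
}
uncontrollable = {"gust"}            # events no supervisor can disable
safe = {"s0", "s1", "s2", "goal"}    # "cliff" is the unsafe state

changed = True
while changed:
    changed = False
    for state in list(safe):
        for event, nxt in transitions.get(state, []):
            # Only uncontrollable escapes doom a state: controllable
            # moves into unsafe territory can simply be disabled.
            if event in uncontrollable and nxt not in safe:
                safe.discard(state)
                changed = True

print(sorted(safe))  # ['goal', 's0', 's2']
```

Note that "s1" is removed because a gust can force it off the cliff, but "s0" survives: its only route to "s1" uses the controllable event "go", which the supervisor can disable.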

Why This Matters

This is a big deal for the future of AI and robotics.

  • Old Way: You need a perfect map before you can start. (Slow, expensive, impossible for unknown environments).
  • New Way: You can start controlling a system immediately with just a few observations and a bit of common sense about what's impossible.

In summary: This paper teaches us how to be a smart teacher. Instead of waiting for a perfect textbook (the model) to teach a student (the robot), we can use a few good examples and a list of "don'ts" to guide the student to success, even if we don't know the whole story yet. If the examples aren't enough, we simply adjust the student's goals to something achievable, ensuring they never get stuck or fail.