Self-Supervised Inductive Logic Programming

This paper introduces "Poker," a new self-supervised Inductive Logic Programming system. Poker learns recursive logic programs from positive examples and a general second-order background theory by automatically generating and labeling its own negative examples. This removes the need for the expert-curated negative data and task-specific background theories that limit existing systems such as Louise.

Stassa Patsantzis

Published 2026-03-05

The Big Problem: The "Expert" Bottleneck

Imagine you want to teach a robot to understand a secret language (like a grammar for a specific type of code or a set of rules for drawing shapes).

In the old way of doing this (called Inductive Logic Programming or ILP), you act like a strict teacher. You have to provide:

  1. Examples of what is correct (Positive examples).
  2. Examples of what is wrong (Negative examples).
  3. A rulebook (Background theory) that explains the basic vocabulary and how the robot should think.

The Catch: To make this work, you, the human, have to be an expert. You have to hand-pick the "wrong" examples so the robot doesn't get confused, and you have to write a custom rulebook for every single new problem. If you want to teach the robot a new language, you have to sit down and write a whole new manual. This is slow and tedious, and it limits how useful these robots can be in the real world.
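To make the three ingredients concrete, here is a minimal Python sketch of a classic ILP task and of what "correct" means for a candidate rule. Everything in it is an assumption made for illustration: the target language (strings of the form a^n b^n), the names `ILPTask`, `is_consistent`, `is_anbn` and `accept_all`, and the use of plain Python functions in place of logic programs. None of it comes from the paper.

```python
from dataclasses import dataclass, field
from typing import Callable, List

Hypothesis = Callable[[str], bool]


@dataclass
class ILPTask:
    positives: List[str]                       # examples of what is correct
    negatives: List[str]                       # examples of what is wrong
    # Names of background predicates. Unused by the check below; listed
    # only to mirror the third ingredient of the classic setup.
    background: List[str] = field(default_factory=list)


def is_consistent(hypothesis: Hypothesis, task: ILPTask) -> bool:
    """A rule is acceptable if it covers every positive and no negative."""
    covers_all_positives = all(hypothesis(x) for x in task.positives)
    covers_no_negatives = not any(hypothesis(x) for x in task.negatives)
    return covers_all_positives and covers_no_negatives


def is_anbn(s: str) -> bool:
    """Hypothetical target rule: n copies of 'a' followed by n copies of 'b'."""
    n = len(s) // 2
    return n >= 1 and s == "a" * n + "b" * n


def accept_all(s: str) -> bool:
    """An over-general rule that calls every string correct."""
    return True


task = ILPTask(positives=["ab", "aabb"], negatives=["ba", "aab"],
               background=["a", "b"])
no_negatives = ILPTask(positives=task.positives, negatives=[],
                       background=task.background)

print(is_consistent(is_anbn, task))
print(is_consistent(accept_all, task))
print(is_consistent(accept_all, no_negatives))
```

With the hand-picked negatives in place, the over-general rule is rejected. Once the negatives are removed, nothing in the task rules it out, which is why the classic setup leans so heavily on an expert supplying good "wrong" examples.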

The Solution: The "Poker" System

The author, Stassa Patsantzis, introduces a new system called Poker. Think of Poker not as a card game, but as a detective who teaches itself.

Poker changes the game by asking: "What if I don't have a custom rulebook or a list of 'wrong' answers? Can I figure it out anyway?"

Here is how Poker works, using a few simple analogies:

1. The "Blank Slate" Background

Instead of giving the robot a specific rulebook, Poker gives it a universal, generic toolkit. Imagine giving a chef a kitchen with every possible ingredient and every possible cooking tool, but no recipe.

  • Old Way: You give the chef a specific recipe for "Spaghetti Carbonara" and tell them, "Do not add chocolate."
  • Poker Way: You give the chef a kitchen full of ingredients and say, "Here are three examples of good pasta. Figure out the rest."

2. The "Self-Supervised" Detective Work

Poker starts with a few labeled examples (the "good" pasta) and a huge pile of unlabeled examples (a mix of good and bad pasta that nobody has sorted yet).

Poker uses a clever trick called "Contradiction Detection":

  • Step 1: Poker makes a guess at a rule (a hypothesis) that fits the known "good" examples.
  • Step 2: It tests this rule against the unlabeled pile.
  • Step 3: If accepting an unlabeled string as "good" leads to a contradiction (for example, it clashes with the logic of the known good examples), Poker realizes: "Wait, I made a mistake. Either this unlabeled string is actually a 'bad' example, or my rule is too loose."
  • Step 4: Poker creates its own negative examples. It essentially says, "This string looked like it could be right, but accepting it breaks the rules. So I will mark it as 'Wrong' and learn from it."

It's like a student taking a practice test. If they get a question wrong, they don't just move on; they analyze why it was wrong, create a new "wrong" example to remember, and update their study guide.
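The four steps above can be sketched as a short Python loop. This is a toy under strong assumptions, not the paper's procedure: the hypothesis space is a hand-written dictionary of Python predicates (`CANDIDATES`) rather than logic programs, the target language a^n b^n is invented for the example, and the function name `self_label` is hypothetical. The loop tentatively treats each unlabeled string as "wrong"; if no surviving rule can both cover the known good examples and reject that string, the assumption is contradictory and the string is labeled "good" instead.

```python
from typing import Callable, Dict, List, Tuple

Hypothesis = Callable[[str], bool]


def is_anbn(s: str) -> bool:
    n = len(s) // 2
    return n >= 1 and s == "a" * n + "b" * n


def starts_with_a(s: str) -> bool:
    return s.startswith("a")


def any_string(s: str) -> bool:
    return True


# Hypothetical, hand-written hypothesis space (an assumption of this sketch).
CANDIDATES: Dict[str, Hypothesis] = {
    "anbn": is_anbn,
    "starts_with_a": starts_with_a,
    "any_string": any_string,
}


def self_label(
    positives: List[str],
    unlabelled: List[str],
    candidates: Dict[str, Hypothesis],
) -> Tuple[List[str], List[str], List[str]]:
    # Step 1: keep only rules that fit the known good examples.
    live = {name: h for name, h in candidates.items()
            if all(h(p) for p in positives)}
    pos, neg = list(positives), []
    for u in unlabelled:
        # Steps 2-3: tentatively assume u is a negative example and see
        # which rules are still compatible with that assumption.
        survivors = {name: h for name, h in live.items() if not h(u)}
        if survivors:
            # Step 4: no contradiction, so keep u as a self-made negative
            # and drop the rules that were too loose.
            neg.append(u)
            live = survivors
        else:
            # Contradiction: every remaining rule accepts u, so u cannot
            # be negative. Label it positive instead.
            pos.append(u)
    return pos, neg, sorted(live)


pos, neg, rules = self_label(
    positives=["ab", "aabb"],
    unlabelled=["aaabbb", "ba", "aab", "abab"],
    candidates=CANDIDATES,
)
print(pos, neg, rules)
```

In this toy run only the `anbn` rule survives, and the strings "ba", "aab" and "abab" end up as self-generated negatives. The greedy loop always favors the most specific surviving rule, so its result depends on which candidates are available; it is meant only to show the shape of the idea.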

3. The "SONF" (The Universal Rulebook)

To make this possible without a custom manual, the author invented something called a Second-Order Definite Normal Form (SONF).

  • Analogy: Imagine instead of writing a specific rule for "How to build a chair," you write a rule for "How to build any furniture."
  • This SONF is a very general set of logic patterns. It's broad enough to cover a whole family of grammars (like the rules for English sentences or fractal shapes) without needing to be tweaked for each specific task. It removes the need for the human to be a grammar expert.
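To give a feel for what a "second-order" pattern is, the sketch below fills in a single generic clause shape with different predicate names. The shape used here is the "chain" pattern that is common in the ILP literature; it is an illustrative assumption and not the SONF defined in the paper. The names `CHAIN` and `instantiate` and the predicate symbols `s`, `a`, `b` are likewise hypothetical.

```python
from itertools import product
from typing import List

# A second-order pattern: P, Q and R stand for predicate names that are
# not fixed in advance. X, Y and Z are ordinary first-order variables.
CHAIN = "{P}(X, Y) :- {Q}(X, Z), {R}(Z, Y)."


def instantiate(template: str, head: str, body_symbols: List[str]) -> List[str]:
    """Fill the predicate slots with every pair of available symbols."""
    return [
        template.format(P=head, Q=q, R=r)
        for q, r in product(body_symbols, repeat=2)
    ]


clauses = instantiate(CHAIN, head="s", body_symbols=["a", "b", "s"])
for clause in clauses:
    print(clause)
```

One generic pattern yields nine task-specific clauses here, including recursive ones such as `s(X, Y) :- a(X, Z), s(Z, Y).` The learner's job is then to pick the instantiations that fit the examples, so the human never has to write those clauses by hand.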

The Results: Poker vs. Louise

The author tested Poker against a state-of-the-art system called Louise.

  • Louise (The Old Way): When given only "good" examples and no "bad" ones, Louise got confused. It started accepting everything as correct (over-generalizing). It was like a student who, having never been corrected, thinks "All words are spelled correctly."
  • Poker (The New Way): As Poker generated more and more of its own "bad" examples to test itself, the rules it learned became more accurate.
    • More unlabeled data = Better performance.
    • It learned to distinguish between "1010" (correct) and "1100" (incorrect) even without being told which was which initially.
    • It successfully learned complex patterns like Context-Free Grammars (mathematical rules for language) and L-Systems (rules for drawing fractal plants and shapes).
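Over-generalizing shows up clearly when acceptance of good strings and rejection of bad strings are measured separately. The sketch below does that for two hand-written rules on an invented a^n b^n language. The data, the function names (`rates`, `accept_all`, `is_anbn`) and the numbers it prints are illustrative assumptions, not results from the paper.

```python
from typing import Callable, List, Tuple

Hypothesis = Callable[[str], bool]


def rates(hypothesis: Hypothesis,
          positives: List[str],
          negatives: List[str]) -> Tuple[float, float]:
    """Return (share of good strings accepted, share of bad strings rejected)."""
    tpr = sum(hypothesis(x) for x in positives) / len(positives)
    tnr = sum(not hypothesis(x) for x in negatives) / len(negatives)
    return tpr, tnr


def accept_all(s: str) -> bool:
    """What a learner with no negative examples can drift towards."""
    return True


def is_anbn(s: str) -> bool:
    n = len(s) // 2
    return n >= 1 and s == "a" * n + "b" * n


# Hypothetical held-out test strings.
test_pos = ["ab", "aabb", "aaabbb"]
test_neg = ["ba", "aab", "abab", "b"]

print(rates(accept_all, test_pos, test_neg))
print(rates(is_anbn, test_pos, test_neg))
```

The accept-everything rule scores perfectly on the good strings and rejects none of the bad ones, which is the failure mode described for Louise above. A rule that has been tightened by negative examples can score well on both.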

Why This Matters

  1. No More Manual Labor: You don't need to be an expert to teach the AI. You just need to give it a few examples and a pile of raw data.
  2. Self-Correction: The system creates its own "negative feedback" loop. It learns by finding its own mistakes, just like a human does.
  3. Versatility: Because it uses a universal "furniture-building" rulebook (SONF), it can switch from learning language rules to learning drawing rules without needing a new manual.

Summary in One Sentence

Poker is an AI that learns complex rules by teaching itself: it uses a universal toolkit and invents its own "wrong answers" to avoid getting confused, freeing humans from the burden of writing custom manuals for every new problem.
