SLAP: Shortcut Learning for Abstract Planning

The paper proposes SLAP, a method that uses model-free reinforcement learning to automatically discover new abstract-action "shortcuts" within existing Task and Motion Planning (TAMP) systems, substantially shortening plans and improving task success rates compared to traditional planning and hierarchical RL approaches.

Y. Isabel Liu, Bowen Li, Benjamin Eysenbach, Tom Silver

Published 2026-03-03

Imagine you are trying to clean up a messy room. You have a strict set of rules (a "manual") that tells you exactly how to move things: Pick up a toy, Walk to the bin, Drop the toy.

If you have a tower of blocks blocking a specific spot, the manual says: "Pick up block A, move it. Pick up block B, move it. Pick up block C, move it." You do this one by one until the spot is clear. It works, but it takes forever.

A clever child, however, wouldn't follow the manual step-by-step. They would pick up the toy they need, then slap the whole tower of blocks aside with their hand, clearing the space in one go, and then drop the toy. It's faster, messier, and definitely not in the manual, but it gets the job done.

This paper is about teaching robots to be that clever child.

Here is the breakdown of SLAP (Shortcut Learning for Abstract Planning) in simple terms:

1. The Problem: The Robot is Too Rigid

Current robots are great at following a "To-Do List" (called Task and Motion Planning). They can break a big job into small, logical steps like "Pick," "Place," and "Move."

  • The Catch: The robot can only do what humans explicitly programmed it to do. If the robot needs to "slap" a tower of blocks to clear a path, it can't do that because "slapping" isn't on its list of approved moves. It will try to move the blocks one by one, which is slow and inefficient.

2. The Solution: SLAP (The Robot's "Aha!" Moment)

The authors created a system called SLAP. Think of SLAP as a smart coach that watches the robot try to solve problems and says, "Hey, you're doing it the hard way. Let's try a shortcut."

Here is how SLAP works, using a Video Game Analogy:

  • The Map (Abstract Planning): Imagine a map of a video game level. The robot knows the "official" paths (the long, winding roads where it moves one block at a time).
  • The Cheat Codes (Shortcuts): SLAP uses a technique called Reinforcement Learning (trial and error) to find "cheat codes" or "portals" on the map.
    • Instead of walking from Point A to Point B, the robot learns a new move: "If I hold this block and spin my arm, I can knock the whole tower over."
    • This new move is a Shortcut. It connects two points on the map that were previously far apart.
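The "map plus portals" picture can be sketched as a toy graph search. This is purely illustrative (the paper's abstract planner and state representation are far richer); the state names and the simple breadth-first search below are made up for this example:

```python
from collections import deque

# Toy abstract planning graph: each key maps a state to reachable next states.
# "tower3" means three blocks still block the target spot.
graph = {
    "tower3": ["tower2"],   # move blocks one at a time...
    "tower2": ["tower1"],
    "tower1": ["clear"],
    "clear": ["done"],      # place the toy
}

def plan_length(graph, start, goal):
    """Breadth-first search; returns the shortest plan length in steps."""
    frontier, seen = deque([(start, 0)]), {start}
    while frontier:
        state, steps = frontier.popleft()
        if state == goal:
            return steps
        for nxt in graph.get(state, []):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, steps + 1))
    return None

print(plan_length(graph, "tower3", "done"))  # 4 steps: the "official" path

# A learned shortcut ("slap the whole tower aside") adds one new edge:
graph["tower3"].append("clear")
print(plan_length(graph, "tower3", "done"))  # 2 steps: slap, then place
```

The key point the sketch captures: a shortcut is just a new edge on the existing map, so the planner's ordinary graph search immediately exploits it with no other changes.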

3. How It Learns (The Training Camp)

SLAP doesn't just guess; it practices.

  1. Identify the Gap: It looks at the "official map" and sees a long, boring path between two states (e.g., "Holding the target block" and "The floor is clear").
  2. Create a Mini-Game: It isolates just that specific problem and creates a tiny, focused training environment.
  3. Practice: The robot tries thousands of random moves in this mini-game. Eventually, it stumbles upon a cool, dynamic move (like a "slap," "wiggle," or "wipe") that clears the path instantly.
  4. Save the Move: Once the robot masters this new move, SLAP saves it as a new "option" on the main map.
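The four steps above can be sketched as a tiny training loop. All names here are hypothetical stand-ins, not the paper's API, and the "practice" step is caricatured as random search where real SLAP trains a reinforcement-learning policy in a simulator:

```python
import random

random.seed(0)

# Hypothetical mini-game: clear a 3-block tower. A "move" is an arm command;
# command 7 happens to be the dynamic "slap" that clears everything at once.
# This stands in for a real physics simulator.
SLAP_COMMAND = 7

def mini_game_success(command):
    """Returns True if the tried command clears the tower in one motion."""
    return command == SLAP_COMMAND

def learn_shortcut(start, goal, trials=10_000):
    """Steps 2-3: practice random moves in an isolated mini-game until one
    bridges start -> goal, then package it as a reusable option."""
    for _ in range(trials):
        command = random.randrange(10)   # trial-and-error exploration
        if mini_game_success(command):
            return {"from": start, "to": goal, "policy": command}
    return None

# Step 1: a long, boring gap spotted on the abstract map.
gap = ("holding_block_tower_in_way", "holding_block_floor_clear")

# Steps 2-3: practice in the isolated mini-game.
shortcut = learn_shortcut(*gap)

# Step 4: save the mastered move as a new option on the main map.
options = []
if shortcut is not None:
    options.append(shortcut)
```

The structure is the point: the gap defines a small, self-contained training problem, and whatever behavior solves it gets promoted to a first-class action the planner can use later.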

4. The Result: Faster and Smarter

When the robot faces a new, real-world task:

  • Old Way: It follows the long, winding road of the manual.
  • SLAP Way: It looks at the map, sees the new "portal" (shortcut) it learned, and jumps straight through it.

In the experiments:

  • The robot solved tasks 50% to 73% faster than the old way.
  • It succeeded more often than robots that tried to learn the whole task from scratch without any rules (which is like trying to learn to drive by just spinning the wheel randomly).
  • It discovered moves humans never programmed, like slapping a tower of blocks or wiping a table with a tool to gather toys.

Why This Matters

This is a bridge between two worlds:

  1. Planning: The logical, step-by-step thinking of a computer (good for long, complex tasks).
  2. Learning: The creative, trial-and-error flexibility of a human (good for finding fast, clever solutions).

SLAP lets a robot keep its logical brain while giving it the ability to "improvise" physically. It's like giving a robot a rulebook, but then teaching it how to break the rules when doing so leads to a faster, better result.

In short: SLAP teaches robots to stop being rigid bureaucrats and start being creative problem-solvers, finding the "slap" instead of the "step-by-step."
