The Reward Function and the Least Cost Principle for Gravitation and other Laws of Physics

This paper proposes a "least cost principle" to frame physical laws as solutions to an inverse optimal control problem, inferring that gravitational and Coulomb forces optimize a reward function that favors high relative velocities and quasi-circular orbits.

Original author: Rubén Moreno-Bote

Published 2026-03-27. Author reviewed.

This is an AI-generated explanation of a preprint that has not been peer-reviewed. It is not medical advice. Do not make health decisions based on this content.

Imagine the universe as a massive, cosmic video game. In this game, every particle (like an electron or a star) is a character, and the "laws of physics" (like gravity) are the rules that dictate how they move.

Usually, we ask: "How do these characters move?" and the answer is, "They follow Newton's laws."

But this paper asks a much deeper, almost philosophical question: "Why do they move that way? What is the game trying to achieve?"

The author, Rubén Moreno-Bote, suggests that these rules are not arbitrary. Instead, they can be read as the solution to an optimization problem: particles move as if the universe were trying to get the best possible score while using the least amount of effort.

Here is the breakdown of the paper's big ideas, translated into everyday language:

1. The "Least Cost" Principle: The Lazy Universe

Imagine you are driving a car. You want to get to your destination, but you also want to save gas and avoid hitting the brakes or the gas pedal too hard. You want a smooth ride.

The paper proposes that the universe works the same way. It follows a "Least Cost Principle."

  • The Cost: This is the "effort" of the movement, measured by how hard a particle accelerates. If a particle has to change its velocity suddenly (like slamming on the brakes), that costs a lot of "effort points." The universe hates wasting points on abrupt accelerations.
  • The Reward: This is the "score" the universe gets for certain kinds of motion. The universe wants to maximize this score.

The laws of motion (like gravity) are simply the result of the universe trying to maximize the score while minimizing the effort.
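
To make the "cost" side concrete, here is a minimal numerical sketch. It assumes the effort is the time integral of half the squared acceleration, which matches this summary's "minimizing acceleration" but is not a formula quoted from the paper. The function name `effort` and the two sample paths are hypothetical.

```python
import math

def effort(positions, dt):
    """Approximate the integral of 0.5 * |acceleration|^2 over a sampled 2D path.

    Assumption: this quadratic form stands in for the paper's cost term.
    Acceleration is estimated with central finite differences.
    """
    total = 0.0
    for i in range(1, len(positions) - 1):
        (x0, y0), (x1, y1), (x2, y2) = positions[i - 1], positions[i], positions[i + 1]
        ax = (x2 - 2.0 * x1 + x0) / dt**2
        ay = (y2 - 2.0 * y1 + y0) / dt**2
        total += 0.5 * (ax * ax + ay * ay) * dt
    return total

dt = 0.001
steps = 1001  # roughly one unit of time

# A particle coasting in a straight line at constant speed.
straight = [(i * dt, 0.0) for i in range(steps)]
# A particle on a unit circle at unit angular speed (always accelerating inward).
circle = [(math.cos(i * dt), math.sin(i * dt)) for i in range(steps)]

straight_effort = effort(straight, dt)
circle_effort = effort(circle, dt)
print(straight_effort, circle_effort)
```

Under this assumption the straight path costs essentially nothing, while the circular path costs about 0.5 per unit of time. An effort term alone would therefore never produce an orbit; some reward has to make the extra effort worthwhile.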

2. Reverse Engineering the Game (Inverse Optimal Control)

Usually, scientists start with the rules (the laws of physics) and predict the movement.

  • Standard Physics: "Here is gravity. Now, tell me where the planet will go."
  • This Paper: "Here is where the planet actually goes. Now, tell me what the 'score' (reward) must have been to make it move that way?"

The author uses a mathematical technique called Inverse Optimal Control. Think of it like watching a master chef cook a perfect dish. You don't know the recipe, but by tasting the food and watching the movements, you can infer which ingredients (rewards) the chef was trying to highlight.
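
The idea of working backward from behavior to reward can be sketched with a deliberately tiny toy. This is a one-step illustration, not the paper's formulation: it assumes a hypothetical agent that picks an action `a` to maximize `theta * g * a - 0.5 * a**2`, where `g` is a known feature of the situation and `theta` is the hidden reward weight. All names here are invented for the example.

```python
def best_action(theta, g):
    """Forward problem: the action that maximizes theta*g*a - 0.5*a**2.

    Setting the derivative (theta*g - a) to zero gives a = theta*g.
    """
    return theta * g

def infer_theta(features, actions):
    """Inverse problem: least-squares estimate of theta from observed behavior,
    assuming every observed action was optimal for the same hidden theta."""
    numerator = sum(g * a for g, a in zip(features, actions))
    denominator = sum(g * g for g in features)
    return numerator / denominator

# Pretend we can only watch the behavior, not the hidden reward weight.
hidden_theta = 2.5
features = [0.5, 1.0, -1.5, 2.0]
observed_actions = [best_action(hidden_theta, g) for g in features]

recovered_theta = infer_theta(features, observed_actions)
print(recovered_theta)
```

The forward function plays the role of "standard physics" (rule in, motion out), and `infer_theta` plays the role of the paper's question (motion in, reward out).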

3. What is the Universe "Rewarding"?

When the author reverse-engineered the laws of gravity and electricity (Coulomb forces), they discovered what the universe considers a "good move." The universe is secretly cheering for two specific things:

A. The "Dance Partner" Reward (Relative Motion)

The universe loves it when particles are moving relative to each other.

  • The Analogy: Imagine a dance floor. If everyone stands still, it's boring. If everyone runs in the exact same direction at the exact same speed, it's also a bit dull.
  • The Reward: The universe gets a high score when particles are zipping around relative to one another. It wants the particles to be active and dynamic, not frozen in place.

B. The "Circular Orbit" Reward (Orthogonal Motion)

This is the most fascinating part. The paper found that the universe gets a massive bonus when particles move sideways relative to each other, rather than straight toward or away from each other.

  • The Analogy: Imagine two people holding hands.
    • If they run straight toward each other, they crash. (Bad score).
    • If they run straight away, they drift apart forever. (Bad score).
    • If they run in a circle around each other, holding hands, that's a perfect score.
  • The Result: In this picture, planets orbit stars in near-circular paths (circles or ellipses) instead of crashing into them or flying off into space because that orbital dance is the kind of motion the inferred reward favors.
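
The difference between "toward or away" and "sideways" motion in the analogy above can be made precise by splitting the relative velocity of two particles into a radial part (along the line joining them) and a tangential part (perpendicular to it). The sketch below is only an illustration of that decomposition in 2D; the function name and test cases are hypothetical, and it does not compute the paper's reward.

```python
import math

def split_relative_velocity(pos_a, vel_a, pos_b, vel_b):
    """Return (radial, tangential) components of b's velocity relative to a.

    radial > 0 means the particles are separating, radial < 0 approaching.
    tangential is the signed sideways component (counterclockwise positive).
    """
    rx, ry = pos_b[0] - pos_a[0], pos_b[1] - pos_a[1]
    vx, vy = vel_b[0] - vel_a[0], vel_b[1] - vel_a[1]
    dist = math.hypot(rx, ry)
    ux, uy = rx / dist, ry / dist      # unit vector from a to b
    radial = vx * ux + vy * uy         # projection onto the joining line
    tangential = ux * vy - uy * vx     # projection onto the perpendicular
    return radial, tangential

origin, at_rest = (0.0, 0.0), (0.0, 0.0)
partner = (1.0, 0.0)

cases = {
    "toward": split_relative_velocity(origin, at_rest, partner, (-1.0, 0.0)),
    "away": split_relative_velocity(origin, at_rest, partner, (1.0, 0.0)),
    "sideways": split_relative_velocity(origin, at_rest, partner, (0.0, 1.0)),
    # Both particles share the same velocity: no relative motion at all.
    "in_step": split_relative_velocity(origin, (0.0, 1.0), partner, (0.0, 1.0)),
}
for name, (radial, tangential) in cases.items():
    print(name, radial, tangential)
```

Only the "sideways" case has a purely tangential relative velocity, which is the motion of a circular orbit. The "in_step" case has zero relative velocity, matching the dull dance floor where everyone moves together.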

4. Why Does This Matter?

The paper suggests that complexity (like life, stars, and galaxies) needs this specific kind of motion to exist.

  • If the universe just minimized effort without a reward, everything would just sit still or move in straight lines forever.
  • If the universe just maximized motion without caring about effort, nothing would limit how violently particles accelerate, and everything would be chaotic.

By balancing effort (minimizing acceleration) with rewards (encouraging relative motion and circular orbits), the universe creates the structured, beautiful, and complex systems we see today.

Summary Metaphor

Think of the universe as a gymnast on a balance beam.

  • The Cost: Falling off the beam (too much acceleration/force).
  • The Reward: Performing a beautiful, complex routine (relative motion and orbits).

The laws of gravity aren't just random rules; they are the gymnast's strategy to perform the most spectacular routine possible without falling off. The universe is constantly trying to keep the dance going, spinning in circles, and staying active, all while using the least amount of effort possible to do it.
