Graph Neural Model Predictive Control for High-Dimensional Systems

Imagine you are trying to control a giant, living octopus made of soft rubber. This octopus has thousands of tiny muscles and segments. If you want it to grab a cup of coffee without crushing it, or to wiggle through a tight cave, you need to tell every single part of its body exactly what to do, many times per second.

This is the challenge of high-dimensional control. The problem is that calculating the perfect movement for thousands of parts is like trying to solve a math puzzle with a million pieces while running a marathon. It's too slow for real-time use.

This paper presents a clever new way to solve that puzzle using Graph Neural Networks (GNNs) and a smart shortcut called Condensing. Here is how it works, broken down into simple concepts:

1. The Octopus as a Social Network (The Graph)

Instead of treating the robot as one giant, messy blob of math, the researchers imagine it as a social network.

The Nodes: Each segment of the robot (like a finger or a spine segment) is a "person" in the network.
The Edges: The connections between them are "friendships." A segment only really cares about its immediate neighbors (the segments touching it), not the ones on the other side of the body.

By using a Graph Neural Network (GNN), the computer learns how these "friends" interact. It learns that if you pull on one segment, the one next to it moves a little, and the one next to that moves a tiny bit. It ignores the ones far away. This keeps the math simple and fast, just like how you only need to talk to your immediate neighbors at a party, not everyone in the room.

2. The "Condensing" Shortcut (The Magic Eraser)

Even with the social network idea, the math is still too heavy. The computer needs to predict the future position of every segment to decide what to do next. This is like trying to plan a 20-step dance routine for 1,000 dancers all at once.

The researchers used a technique called Condensing.

The Analogy: Imagine you are the choreographer. Instead of writing down the exact position of every single dancer for every step of the dance, you realize: "If I tell the first dancer to move left, the second one must follow, and the third one must follow."
The Result: You don't need to calculate the position of the 999 followers. You only need to calculate the move for the leader (the control input). The rest of the dance is automatically determined by the rules of the network.

This "Condensing" algorithm strips away all the unnecessary variables, turning a massive, impossible math problem into a tiny, solvable one.

3. The Supercomputer Muscle (GPU)

To make this happen fast enough to control a real robot, they ran the calculations on a Graphics Processing Unit (GPU).

The Analogy: If a normal computer processor is a single chef cooking a meal, a GPU is a kitchen with 10,000 chefs working in perfect sync.
Because the "social network" approach computes each segment's contribution independently before the contributions are summed, the GPU can work on all 1,000 segments at the same time. That parallelism is what lets each control update finish within the 10-millisecond window that a 100 Hz controller allows.

The Results: A Real-World Test

The team tested this on a soft robotic trunk (basically a flexible, elephant-like arm).

The Challenge: They made the robot try to trace a figure-eight and a circle in the air, and also dodge obstacles.
The Competition: They compared their method against older, standard ways of controlling robots (like Koopman operators and SSMs).
The Winner: The new GNN method cut the tracking error by 63.6% compared with the best competing method. It followed the target path with sub-centimeter precision (less than the width of a fingernail).
Speed: It ran at 100 Hz, meaning it made a new decision every 10 milliseconds, fast enough to respond to changes as they happen.

Why This Matters

Before this, controlling a soft robot with thousands of moving parts was like trying to steer a ship with a thousand rudders using a calculator. It was too slow and clunky.

This paper gives us a GPS and a steering wheel for these complex machines. By realizing that parts only talk to their neighbors (the Graph) and by only calculating the leader's moves (the Condensing), we can finally control soft, flexible robots in real-time. This opens the door for robots that can safely work in hospitals, explore disaster zones, or handle delicate objects without breaking them.

1. Problem Statement

Controlling high-dimensional robotic systems, particularly soft robots with many degrees of freedom (DoFs), presents a fundamental trade-off between model fidelity and computational tractability.

The Challenge: Accurate modeling of soft robots requires complex, high-dimensional nonlinear dynamics. Traditional physics-based models (e.g., Finite Element Methods) are too computationally expensive for real-time control, while reduced-order models often fail to capture complex deformations or restrict control to a subset of the state space.
The Gap: While Graph Neural Networks (GNNs) have emerged as powerful data-driven tools for modeling relational dynamics, integrating them into Model Predictive Control (MPC) remains difficult. Standard MPC formulations using GNNs often result in Optimal Control Problems (OCPs) that are too large to solve in real-time due to the cubic scaling of traditional solvers with respect to the state dimension.
Goal: To develop a framework that enables real-time, closed-loop optimal control for systems with up to 1,000 nodes (high-dimensional states) while maintaining high tracking accuracy and the ability to enforce constraints across the entire system structure.

2. Methodology

The proposed framework, GNN-MPC, integrates three core components: Graph Neural Network dynamics, a structure-exploiting condensing algorithm, and GPU-accelerated implementation.

A. System Modeling with GNNs

Graph Representation: The system is modeled as a graph $G=(V, E)$ where nodes represent discrete segments of the robot (subsystems) and edges represent localized physical interactions.
Assumptions: The authors assume localized interactions (Assumption 2): each node interacts with at most $d$ neighbors, where $d$ is small and bounded relative to the total number of nodes $M$ ($d \ll M$). This mirrors the physics of continuum soft robots, where forces propagate locally.
Dynamics Learning: A GNN (based on the Interaction Network architecture) learns the forward dynamics. It predicts velocity increments using message-passing mechanisms that aggregate neighbor states. The state update is performed via a backward Euler integration scheme.
Linearization: For MPC, the nonlinear GNN dynamics are linearized around a nominal trajectory using automatic differentiation to generate local Jacobian matrices ($A_{ij}$, $B_i$).
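The modeling steps above can be sketched in a few lines. This is a minimal illustration and not the authors' model: a hand-written nonlinear spring law stands in for the learned edge network, the chain topology, the function names (`neighbors`, `step`, `jacobians`) and all constants are assumptions, the integration step is a simple semi-implicit Euler update chosen for illustration, and central finite differences replace automatic differentiation so that the example needs only NumPy. It shows that when each node reads only its direct neighbors, the Jacobian blocks $A_{ij}$ vanish for non-adjacent nodes.

```python
import numpy as np

def neighbors(i, M):
    """Chain graph (assumed topology): node i touches i-1 and i+1 where they exist."""
    return [j for j in (i - 1, i + 1) if 0 <= j < M]

def step(x, u, dt=0.01, k=5.0, c=0.5):
    """One message-passing step on a chain of M nodes.

    x has shape (M, 2) with (position, velocity) per node; u has shape (M,).
    The spring law below is a stand-in for a learned edge network.
    """
    M = x.shape[0]
    x_next = np.empty_like(x)
    for i in range(M):
        msg = 0.0
        for j in neighbors(i, M):
            d = x[j, 0] - x[i, 0]
            msg += k * (d + 0.1 * d**3)        # message from neighbor j to node i
        dv = dt * (msg - c * x[i, 1] + u[i])   # predicted velocity increment
        v_new = x[i, 1] + dv
        x_next[i, 0] = x[i, 0] + dt * v_new    # position advanced with updated velocity
        x_next[i, 1] = v_new
    return x_next

def jacobians(x, u, eps=1e-6):
    """Central finite-difference Jacobians of step() with respect to x and u."""
    M, n = x.shape[0], x.size
    A = np.zeros((n, n))
    B = np.zeros((n, M))
    for col in range(n):
        d = np.zeros(n)
        d[col] = eps
        d = d.reshape(x.shape)
        A[:, col] = (step(x + d, u) - step(x - d, u)).ravel() / (2 * eps)
    for col in range(M):
        d = np.zeros(M)
        d[col] = eps
        B[:, col] = (step(x, u + d) - step(x, u - d)).ravel() / (2 * eps)
    return A, B

M = 6
rng = np.random.default_rng(0)
x0 = rng.normal(size=(M, 2))
u0 = rng.normal(size=M)
A, B = jacobians(x0, u0)

# blocks[i, j] sums the absolute entries of the 2x2 block A_ij.
blocks = np.abs(A).reshape(M, 2, M, 2).sum(axis=(1, 3))
pattern = blocks > 0
print(pattern.astype(int))   # nonzero blocks sit on the block tridiagonal only
```

The block-tridiagonal pattern is the structure a sparsity-aware solver can exploit: the number of nonzero blocks grows with the number of nodes times the neighborhood size, not with the square of the number of nodes.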

B. Structure-Aware Condensing Algorithm

To solve the resulting OCP efficiently, the authors adapt and extend condensing techniques:

Standard Condensing: Typically eliminates the state variables from the OCP, reducing it to a Quadratic Program (QP) in the control inputs only. However, standard condensing scales poorly with system size ($O(n_x^2)$ or worse in the state dimension $n_x$).
Local-Scalable Condensing: Leveraging the sparse structure of the GNN (where each node only depends on a bounded neighborhood), the authors derive a distributed condensing algorithm.
- The algorithm computes the condensed matrices ($\Gamma_u$, $\Gamma_x$, $H$, $g$) recursively for each node based only on its local neighborhood.
- Theorem 1 proves that, under the localized interaction assumption, the computational complexity of condensing scales linearly ($O(M)$) with the number of system nodes $M$, rather than cubically.
- This allows the Hessian and gradient of the QP to be constructed via distributed summation of local contributions.
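To make the condensed quantities concrete, here is a sketch of textbook dense condensing on a small linear spring chain. It is not the authors' local-scalable algorithm: it builds the full $\Gamma_x$ and $\Gamma_u$, assumes time-invariant matrices, and uses a hypothetical cost $\tfrac12\sum_k \|x_k\|_Q^2 + \tfrac12\sum_k \|u_k\|_R^2$ with made-up per-node weights `w`. The helper names (`chain_system`, `condense`) are illustrative. What it does show is the elimination $X = \Gamma_x x_0 + \Gamma_u U$ and the fact that the Hessian $H$ equals a sum of per-node contributions.

```python
import numpy as np

def chain_system(M, dt=0.05, k=2.0, c=0.3):
    """Linear spring chain: node i holds (position, velocity) and has one input."""
    A = np.eye(2 * M)
    B = np.zeros((2 * M, M))
    for i in range(M):
        p, v = 2 * i, 2 * i + 1
        A[p, v] += dt                  # position integrates velocity
        A[v, v] -= dt * c              # damping
        for j in (i - 1, i + 1):       # coupling to direct neighbors only
            if 0 <= j < M:
                A[v, 2 * j] += dt * k
                A[v, p] -= dt * k
        B[v, i] = dt
    return A, B

def condense(A, B, N):
    """Return (Gx, Gu) with X = Gx @ x0 + Gu @ U, X = [x_1..x_N], U = [u_0..u_{N-1}]."""
    nx, nu = B.shape
    Gx = np.zeros((N * nx, nx))
    Gu = np.zeros((N * nx, N * nu))
    Ak = np.eye(nx)
    for k in range(N):
        if k > 0:
            # x_{k+1} = A x_k + B u_k: propagate the previous block row through A.
            Gu[k * nx:(k + 1) * nx, :] = A @ Gu[(k - 1) * nx:k * nx, :]
        Gu[k * nx:(k + 1) * nx, k * nu:(k + 1) * nu] = B
        Ak = A @ Ak
        Gx[k * nx:(k + 1) * nx, :] = Ak
    return Gx, Gu

M, n, N = 4, 2, 10
nx = M * n
A, B = chain_system(M)
Gx, Gu = condense(A, B, N)

rng = np.random.default_rng(1)
x0 = rng.normal(size=nx)
U = rng.normal(size=N * M)

# Sanity check: the condensed prediction matches a step-by-step rollout.
x, traj = x0.copy(), []
for k in range(N):
    x = A @ x + B @ U[k * M:(k + 1) * M]
    traj.append(x)
X_rollout = np.concatenate(traj)
X_condensed = Gx @ x0 + Gu @ U

# QP in the inputs only: minimize 0.5 U'HU + g'U (reference trajectory is zero).
w = np.array([1.0, 2.0, 3.0, 4.0])               # assumed per-node weights
Qbar = np.kron(np.eye(N), np.diag(np.repeat(w, n)))
Rbar = 0.1 * np.eye(N * M)
H = Gu.T @ Qbar @ Gu + Rbar
g = Gu.T @ Qbar @ (Gx @ x0)

# The same Hessian assembled as a sum of per-node contributions.
H_sum = Rbar.copy()
for i in range(M):
    rows = np.concatenate([np.arange(k * nx + i * n, k * nx + (i + 1) * n) for k in range(N)])
    Gi = Gu[rows, :]                              # rows of Gu that predict node i
    H_sum += w[i] * (Gi.T @ Gi)

U_opt = np.linalg.solve(H, -g)                    # unconstrained minimizer
```

The decision vector has `N * M` entries here, whereas the uncondensed problem would also carry the `N * nx` predicted states. In this dense sketch every node's rows of `Gu` are formed from the full matrices; the point of the local-scalable variant described above is to obtain them from each node's neighborhood instead.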

C. Implementation & Optimization

Hardware Acceleration: The condensing and QP construction steps are implemented in JAX and executed on GPUs, leveraging parallel operations for each node.
Solver: The resulting condensed QP is solved using an Interior Point Method (IPM) solver (HPIPM) within a Sequential Quadratic Programming (SQP) Real-Time Iteration (RTI) scheme.
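The control loop implied by this scheme can be illustrated on a toy problem. The sketch below is written in the spirit of a real-time iteration and makes several assumptions that are not from the paper: the plant is a damped pendulum, the names `f`, `jac`, and `rti_step` are hypothetical, there are no constraints, and the condensed QP is therefore solved with `np.linalg.solve` in place of an interior point solver such as HPIPM. Each control period performs exactly one linearize, condense, and solve pass, then shifts the input sequence as a warm start.

```python
import numpy as np

DT = 0.05

def f(x, u):
    """Damped pendulum, explicit Euler step (toy plant, not the robot model)."""
    th, om = x
    return np.array([th + DT * om, om + DT * (-np.sin(th) - 0.1 * om + u)])

def jac(x, u):
    """Analytic Jacobians of f with respect to the state and the input."""
    th, _ = x
    A = np.array([[1.0, DT], [-DT * np.cos(th), 1.0 - 0.1 * DT]])
    B = np.array([[0.0], [DT]])
    return A, B

def rti_step(x0, U, Q, R):
    """One SQP iteration: linearize along the rollout of U, condense, solve the QP."""
    N, nx = len(U), len(x0)
    X = np.zeros((N, nx))                 # nominal states x_1..x_N
    Gu = np.zeros((N * nx, N))            # sensitivities of X with respect to U
    x = x0
    for k in range(N):
        A, B = jac(x, U[k])               # linearize at the nominal point (x_k, u_k)
        if k > 0:
            Gu[k * nx:(k + 1) * nx] = A @ Gu[(k - 1) * nx:k * nx]
        Gu[k * nx:(k + 1) * nx, k:k + 1] = B
        x = f(x, U[k])
        X[k] = x
    Qbar = np.kron(np.eye(N), Q)
    H = Gu.T @ Qbar @ Gu + R * np.eye(N)  # Gauss-Newton Hessian
    g = Gu.T @ Qbar @ X.ravel() + R * U   # gradient of the cost at the nominal U
    return U + np.linalg.solve(H, -g)     # unconstrained QP step

Q = np.diag([10.0, 1.0])                  # assumed state weights (target is the origin)
R = 0.1                                   # assumed input weight
N = 20                                    # horizon length
x = np.array([1.0, 0.0])                  # start 1 rad away from the target
U = np.zeros(N)

for t in range(200):
    U = rti_step(x, U, Q, R)              # one QP solve per control period
    x = f(x, U[0])                        # apply only the first input to the plant
    U = np.append(U[1:], U[-1])           # shift the sequence as a warm start

print(x)                                  # state after 10 simulated seconds
```

Doing a single iteration per period, rather than iterating the SQP to convergence, keeps the per-step cost fixed, which is what makes a hard rate such as 100 Hz attainable; the warm start lets successive iterations refine the solution over time.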

3. Key Contributions

GNN Dynamics with Structure-Aware Condensing: The first framework to combine GNNs with MPC for high-dimensional systems while explicitly preserving and exploiting the sparse graph structure to achieve linear scaling in computational complexity.
Highly Efficient Implementation: A GPU-parallelized implementation that achieves near-constant computation times for systems with up to 1,000 nodes at a 100 Hz control frequency.
Experimental Validation: Successful deployment on a physical soft robotic trunk, demonstrating superior performance over existing baselines (Koopman operators and Spectral Submanifolds) in both simulation and hardware.

4. Experimental Results

Simulation Results

Prediction Accuracy: The GNN model achieved open-loop Root Mean Square Error (RMSE) comparable to Koopman operator models and significantly better than Spectral Submanifold (SSM) and standard MLP baselines.
Scalability: The framework maintained real-time performance (100 Hz) as the number of segments increased from 1 to 1,000. The QP solve time remained nearly constant, while the condensing step (the computational bottleneck) scaled sub-linearly due to GPU parallelization.

Hardware Experiments (Soft Robotic Trunk)

Setup: A physical soft trunk robot with 4 measured nodes (representing the whole body) controlled at 100 Hz.
Tracking Performance:
- GNN-MPC: Achieved average tracking errors of 6.25 mm (figure-eight) and 3.67 mm (circle).
- Baselines: The Koopman method had errors of ~23.76 mm, and SSM had ~17.15 mm.
- Improvement: GNN-MPC achieved a 63.6% lower tracking error than the best baseline.
Robustness: The GNN successfully learned dynamics from random-walk trajectories, whereas baselines required specific quasi-periodic training data and failed on random inputs.
Obstacle Avoidance: The system demonstrated full-body obstacle avoidance, successfully deflecting specific nodes to avoid collisions while maintaining the rest of the robot's trajectory, a capability difficult to achieve with reduced-order models.
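The 63.6% figure can be reproduced from the tracking errors listed above, assuming the baseline errors refer to the same figure-eight trajectory as the 6.25 mm GNN-MPC result (the summary does not say so explicitly). The variable names are illustrative.

```python
gnn_mm = 6.25                                     # GNN-MPC, figure-eight
baselines_mm = {"Koopman": 23.76, "SSM": 17.15}   # reported baseline errors

# The best baseline is the one with the lowest error.
best_name, best_err = min(baselines_mm.items(), key=lambda kv: kv[1])

# Relative error reduction with respect to that baseline, in percent.
improvement = 100.0 * (best_err - gnn_mm) / best_err
print(f"{best_name}: {improvement:.1f}% lower error")   # SSM: 63.6% lower error
```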

5. Significance and Impact

Bridging the Gap: This work addresses the long-standing trade-off between model complexity and control speed for high-dimensional systems. It shows that data-driven models (GNNs) can be used for rigorous, constraint-based optimal control if the underlying sparsity is exploited.
Scalability: The linear scaling property ($O(M)$) is a significant step toward controlling systems with thousands of degrees of freedom, opening the door to real-time control of complex soft robots, swarm robotics, and other distributed systems.
Full-State Control: Unlike previous methods that often control only a subset of states (e.g., end-effector), this approach enables full-body control, allowing for complex tasks like whole-body collision avoidance and precise manipulation in constrained environments.
Generalization: The GNN's ability to generalize from diverse training data (random walks) makes it more robust for real-world deployment compared to methods requiring specific trajectory identification.

In summary, the paper presents a mathematically rigorous and practically validated framework that enables the real-time control of highly complex, high-dimensional robotic systems by unifying the representational power of Graph Neural Networks with the efficiency of structure-exploiting optimization.