Token Bottleneck: One Token to Remember Dynamics
This paper introduces Token Bottleneck (ToBo), a self-supervised learning framework that compresses dynamic scenes into a compact token to predict future states with minimal hints, thereby enabling robust sequential scene understanding and robotic manipulation across both simulated and real-world environments.