MotionBits: Video Segmentation through Motion-Level Analysis of Rigid Bodies
This paper introduces MotionBits, a novel concept and learning-free segmentation method that identifies the smallest manipulable rigid bodies through kinematic spatial twist equivalence. The method outperforms state-of-the-art embodied perception models on the new MoRiBo benchmark and enables more effective downstream robotic manipulation and reasoning tasks.
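
The summary does not spell out how spatial twist equivalence is computed, so the following is only a minimal planar sketch of the underlying kinematic fact, not the paper's implementation. It assumes point positions and instantaneous velocities are already available in one fixed frame. In that frame, every point on the same rigid body yields the same spatial twist (omega, vx, vy), while points on a differently moving body do not. The function names `planar_twist`, `point_velocity`, and `same_twist`, and the tolerance, are hypothetical choices for illustration.

```python
import math


def point_velocity(twist, p):
    """Velocity of point p on a planar rigid body with spatial twist (omega, vx, vy).

    (vx, vy) is the velocity of the body point that coincides with the
    fixed-frame origin, so u(p) = v + omega * perp(p), with perp(p) = (-py, px).
    """
    omega, vx, vy = twist
    return (vx - omega * p[1], vy + omega * p[0])


def planar_twist(p1, u1, p2, u2):
    """Recover the spatial twist from two distinct points and their velocities."""
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    d2 = dx * dx + dy * dy
    if d2 == 0.0:
        raise ValueError("points must be distinct")
    # u2 - u1 = omega * perp(p2 - p1), so project onto perp(d) = (-dy, dx).
    omega = ((u2[0] - u1[0]) * (-dy) + (u2[1] - u1[1]) * dx) / d2
    # v = u1 - omega * perp(p1)
    vx = u1[0] + omega * p1[1]
    vy = u1[1] - omega * p1[0]
    return (omega, vx, vy)


def same_twist(a, b, tol=1e-6):
    """Two regions are treated as one rigid body if their twists agree (assumed tolerance)."""
    return all(math.isclose(x, y, abs_tol=tol) for x, y in zip(a, b))


# Illustrative data: two point pairs on body A, one pair on a different body B.
twist_a = (0.5, 1.0, -2.0)   # rotating and translating
twist_b = (0.0, 1.0, -2.0)   # same translation, no rotation
pts = [(0.0, 0.0), (2.0, 1.0), (-3.0, 4.0), (5.0, 5.0)]

est_a1 = planar_twist(pts[0], point_velocity(twist_a, pts[0]),
                      pts[1], point_velocity(twist_a, pts[1]))
est_a2 = planar_twist(pts[2], point_velocity(twist_a, pts[2]),
                      pts[3], point_velocity(twist_a, pts[3]))
est_b = planar_twist(pts[2], point_velocity(twist_b, pts[2]),
                     pts[3], point_velocity(twist_b, pts[3]))

print(same_twist(est_a1, est_a2))  # regions on the same body
print(same_twist(est_a1, est_b))   # regions on different bodies
```

Because the twist is expressed in the fixed frame rather than in a body-attached frame, the comparison does not depend on where on the body the points are sampled, which is what makes it usable as a grouping criterion. A full 3D treatment would use six-dimensional twists and noisy motion estimates, which this sketch leaves out.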