MomentMix Augmentation with Length-Aware DETR for Temporally Robust Moment Retrieval
This paper proposes MomentMix, a data augmentation strategy that combines ForegroundMix and BackgroundMix, together with a Length-Aware Decoder. Together, they address limited feature diversity and prediction bias, significantly improving the localization accuracy of short moments in Video Moment Retrieval.