RAGTrack: Language-aware RGBT Tracking with Retrieval-Augmented Generation
This paper introduces RAGTrack, a Retrieval-Augmented Generation (RAG) framework for RGB-Thermal (RGBT) tracking. It integrates textual descriptions via Multi-modal Large Language Models (MLLMs) and combines adaptive token fusion with context-aware reasoning to overcome appearance variations and modality gaps.