Efficient Domain-Adaptive Multi-Task Dense Prediction with Vision Foundation Models
This paper introduces FAMDA, a simple yet effective unsupervised domain adaptation framework for multi-task dense prediction. FAMDA leverages Vision Foundation Models (VFMs) as teachers within a self-training paradigm, using them to generate high-quality pseudo-labels on unlabeled target-domain data. These pseudo-labels supervise the training of highly efficient student networks, which achieve state-of-the-art multi-task dense prediction performance while remaining suitable for resource-constrained robotics applications.
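To make the self-training recipe concrete, the following is a minimal PyTorch sketch of the teacher-student loop the abstract describes; the module names (`teacher`, `student`), the two example tasks (segmentation and depth), and the confidence threshold are illustrative assumptions, not FAMDA's actual API.

```python
import torch
import torch.nn.functional as F

def train_student(teacher, student, target_loader, optimizer, device="cuda"):
    """Hypothetical self-training loop: a frozen VFM teacher pseudo-labels
    unlabeled target-domain images; a compact multi-task student learns
    from those pseudo-labels. Details here are assumptions for illustration."""
    teacher.eval()   # VFM teacher stays frozen; it only produces pseudo-labels
    student.train()
    for images in target_loader:          # unlabeled target-domain batch
        images = images.to(device)
        with torch.no_grad():
            # Teacher emits dense predictions for each task.
            seg_logits_t, depth_t = teacher(images)
            pseudo_seg = seg_logits_t.argmax(dim=1)           # (B, H, W) class map
            conf = seg_logits_t.softmax(dim=1).amax(dim=1)    # per-pixel confidence
            mask = conf > 0.9   # keep high-confidence pixels (assumed threshold)

        seg_logits_s, depth_s = student(images)
        # Multi-task loss: confidence-masked cross-entropy for segmentation,
        # L1 regression against the teacher's depth estimate.
        seg_loss = F.cross_entropy(seg_logits_s, pseudo_seg, reduction="none")
        seg_loss = (seg_loss * mask).sum() / mask.sum().clamp(min=1)
        depth_loss = F.l1_loss(depth_s, depth_t)
        loss = seg_loss + depth_loss

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

The design point the sketch illustrates is that the expensive VFM runs only in inference mode to produce supervision, so all gradient computation happens in the small student, keeping the deployed model cheap.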