MERGETUNE: Continued Fine-Tuning of Vision-Language Models
This paper introduces MERGETUNE, a model-agnostic continued fine-tuning strategy for vision-language models. It leverages linear mode connectivity and a second-order surrogate to recover pretrained knowledge after adaptation, mitigating catastrophic forgetting and achieving state-of-the-art performance without additional parameters or data replay.
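
To make the notion of linear mode connectivity concrete, the following is a minimal toy sketch and not the MERGETUNE procedure itself. It assumes two parameter vectors are called linearly connected when the loss along the straight line between them never rises above the linear interpolation of the endpoint losses. The helper names (`interpolate`, `loss_barrier`) and both toy losses are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import Callable, List, Sequence


def interpolate(theta_a: Sequence[float], theta_b: Sequence[float], alpha: float) -> List[float]:
    """Return (1 - alpha) * theta_a + alpha * theta_b, element-wise."""
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(theta_a, theta_b)]


def loss_barrier(
    loss_fn: Callable[[Sequence[float]], float],
    theta_a: Sequence[float],
    theta_b: Sequence[float],
    num_points: int = 11,
) -> float:
    """Largest excess of the path loss over the interpolated endpoint losses.

    A barrier close to zero suggests the two points are linearly connected
    under this (assumed) definition; a large barrier suggests they are not.
    """
    loss_a = loss_fn(theta_a)
    loss_b = loss_fn(theta_b)
    barrier = 0.0
    for i in range(num_points):
        alpha = i / (num_points - 1)
        path_loss = loss_fn(interpolate(theta_a, theta_b, alpha))
        baseline = (1.0 - alpha) * loss_a + alpha * loss_b
        barrier = max(barrier, path_loss - baseline)
    return barrier


def convex_loss(theta: Sequence[float]) -> float:
    """Toy convex loss with a single minimum at theta = (1, ..., 1)."""
    return sum((t - 1.0) ** 2 for t in theta)


def double_well_loss(theta: Sequence[float]) -> float:
    """Toy non-convex loss with minima at t = -1 and t = +1 per coordinate."""
    return sum((t * t - 1.0) ** 2 for t in theta)


if __name__ == "__main__":
    # Two points in the same convex basin: no barrier along the line.
    print(loss_barrier(convex_loss, [0.0, 0.0], [2.0, 2.0]))
    # Two separate minima of a double well: a barrier appears at the midpoint.
    print(loss_barrier(double_well_loss, [-1.0], [1.0]))
```

In this toy setting the convex loss shows no barrier, while the double-well loss shows one at the midpoint of the path. In the vision-language setting the two endpoints would be model weights and the loss would be evaluated on data, neither of which is modeled here.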