Learning to Think Fast and Slow for Visual Language Models
This paper introduces DualMindVLM, a visual language model with a dual-mode thinking mechanism that dynamically selects between fast, intuitive responses and slow, deliberate reasoning based on problem complexity, achieving state-of-the-art performance with substantially improved token efficiency.
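The summary above describes mode selection only at a high level. As a rough illustration of the idea, the following is a minimal Python sketch of a complexity-based router that picks between a fast and a slow generation path; it is not DualMindVLM's actual method, and every name and value in it (`DualModeRouter`, the length-based complexity heuristic, the 0.5 threshold, the stand-in generation functions) is an assumption made for this sketch.

```python
# Minimal sketch of a dual-mode inference router (hypothetical, not the
# paper's implementation). A complexity score routes easy prompts to a
# cheap fast path and hard prompts to a deliberate slow path.

from dataclasses import dataclass
from typing import Callable


@dataclass
class DualModeRouter:
    complexity_fn: Callable[[str], float]  # scores a prompt's difficulty in [0, 1]
    fast_fn: Callable[[str], str]          # short, direct answer (few tokens)
    slow_fn: Callable[[str], str]          # long, step-by-step reasoning
    threshold: float = 0.5                 # hypothetical cutoff between modes

    def answer(self, prompt: str) -> str:
        # Easy prompts take the fast path to save tokens;
        # hard prompts take the slow path for accuracy.
        if self.complexity_fn(prompt) < self.threshold:
            return self.fast_fn(prompt)
        return self.slow_fn(prompt)


if __name__ == "__main__":
    # Toy stand-ins: complexity approximated by prompt length.
    router = DualModeRouter(
        complexity_fn=lambda p: min(len(p) / 200.0, 1.0),
        fast_fn=lambda p: "[fast] short answer",
        slow_fn=lambda p: "[slow] step-by-step reasoning ...",
    )
    print(router.answer("What color is the cat?"))
    print(router.answer(
        "Given the chart in the image, compare the quarterly revenue trends "
        "across regions and explain the anomaly in Q3 relative to the prior "
        "two years."
    ))
```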