OddGridBench: Exposing the Lack of Fine-Grained Visual Discrepancy Sensitivity in Multimodal Large Language Models
This paper introduces OddGridBench, a benchmark revealing that current multimodal large language models significantly underperform humans in detecting fine-grained visual discrepancies. It also proposes OddGrid-GRPO, a reinforcement learning framework that improves this sensitivity through curriculum learning and distance-aware rewards.