Harnessing Chain-of-Thought Reasoning in Multimodal Large Language Models for Face Anti-Spoofing

This paper addresses the generalization limitations of traditional Face Anti-Spoofing by introducing FaceCoT, the first large-scale Visual Question Answering dataset enriched with Chain-of-Thought reasoning and generated via reinforcement learning. Together with a CEPL training strategy, it enables Multimodal Large Language Models to achieve superior robustness and interpretability across diverse spoofing attacks.

Honglu Zhang, Zhiqin Fang, Ningning Zhao + 4 more | 2026-03-03 | cs

Improving Wildlife Out-of-Distribution Detection: Africa's Big Five

This study addresses the challenge of overconfident predictions in closed-world animal classification by demonstrating that feature-based out-of-distribution detection methods, particularly Nearest Class Mean with ImageNet pre-trained features, significantly outperform existing techniques in identifying unknown wildlife species within the context of Africa's Big Five.

Mufhumudzi Muthivhi, Jiahao Huo, Fredrik Gustafsson + 1 more | 2026-03-03 | cs.AI

Advancing Complex Video Object Segmentation via Progressive Concept Construction

The paper introduces Segment Concept (SeC), a novel video object segmentation framework that leverages Large Vision-Language Models to progressively construct high-level object-centric representations. It achieves state-of-the-art performance on SeCVOS, a new Semantic Complex Scenarios benchmark, where it significantly outperforms existing methods such as SAM 2.

Zhixiong Zhang, Shuangrui Ding, Xiaoyi Dong + 7 more | 2026-03-03 | cs.AI