Customizing Visual Emotion Evaluation for MLLMs: An Open-vocabulary, Multifaceted, and Scalable Approach

This paper addresses the limitations of existing visual emotion evaluation methods for Multimodal Large Language Models (MLLMs) by proposing an open-vocabulary, automated Emotion Statement Judgment framework. Evaluation under this framework shows that current models are strong at context-based interpretation but fall well short of humans in understanding subjective perception.

Daiqing Wu, Dongbao Yang, Sicheng Zhao + 2 more · 2026-03-03 · cs

CircuitSense: A Hierarchical MLLM Benchmark Bridging Visual Comprehension and Symbolic Reasoning in Engineering Design Process

The paper introduces CircuitSense, a hierarchical benchmark of over 8,000 circuit problems that evaluates Multi-modal Large Language Models across perception, analysis, and design tasks. It reveals a critical performance gap: models excel at visual recognition but struggle to derive symbolic equations and perform the mathematical reasoning essential to engineering design.

Arman Akbari, Jian Gao, Yifei Zou + 6 more · 2026-03-03 · cs

VA-Adapter: Adapting Ultrasound Foundation Model to Echocardiography Probe Guidance

To overcome the challenge of individual variability in echocardiography probe guidance, the authors propose VA-Adapter, a lightweight module that integrates vision-action sequences into an ultrasound foundation model, enabling online inference of individual 3D cardiac structure. It achieves superior performance with significantly fewer parameters than existing methods.

Teng Wang, Haojun Jiang, Yuxuan Wang + 4 more · 2026-03-03 · cs

LinearSR: Unlocking Linear Attention for Stable and Efficient Image Super-Resolution

This paper introduces LinearSR, a holistic framework for stable and efficient photorealistic image super-resolution. It overcomes linear attention's historical training instability and the perception-distortion trade-off through novel strategies (ESGF, an SNR-based MoE, and TAG), achieving state-of-the-art quality with exceptional computational efficiency.

Xiaohui Li, Shaobin Zhuang, Shuo Cao + 6 more · 2026-03-03 · cs
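For context on the LinearSR entry: linear attention replaces softmax attention's O(N²) score matrix with a kernel feature map φ, so attention can be computed as φ(Q)(φ(K)ᵀV) in O(N) time. The paper's specific stabilization components (ESGF, SNR-based MoE, TAG) are not described here; the following is only a minimal generic sketch of kernelized linear attention using the common elu(x)+1 feature map, not LinearSR's implementation.

```python
import numpy as np

def linear_attention(Q, K, V):
    """Kernelized linear attention: O(N) in sequence length N.

    Q, K: (N, d) arrays; V: (N, d_v) array.
    Equivalent to row-normalized phi(Q) @ phi(K).T applied to V,
    but computed without materializing the N x N matrix.
    """
    # elu(x) + 1: a common positive feature map for linear attention
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))
    Qp, Kp = phi(Q), phi(K)
    KV = Kp.T @ V                # (d, d_v), cost O(N * d * d_v)
    Z = Qp @ Kp.sum(axis=0)      # (N,) per-row normalizer
    return (Qp @ KV) / Z[:, None]
```

The associativity trick (computing φ(K)ᵀV first) is what makes the cost linear in N; the trade-off, as the summary notes, is that replacing softmax with a fixed kernel has historically hurt training stability and perceptual quality in generative tasks.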