Implicit U-KAN2.0: Dynamic, Efficient and Interpretable Medical Image Segmentation

This paper introduces Implicit U-KAN 2.0, a novel medical image segmentation model that combines second-order neural ordinary differential equations (SONO) with MultiKAN layers in a two-phase encoder-decoder architecture to achieve superior performance, enhanced interpretability, and dimension-independent approximation capabilities while reducing computational costs.

Chun-Wun Cheng, Yining Zhao, Yanqi Cheng + 3 more · 2026-03-05 · 🤖 cs.LG

Beyond Accuracy: What Matters in Designing Well-Behaved Image Classification Models?

This paper presents a large-scale analysis of 326 image classification models across nine quality dimensions beyond accuracy, revealing that vision-language models, self-supervised initialization, and dataset size significantly influence model behavior, and introduces the QUBA score to holistically rank and recommend models based on specific user needs.

Robin Hesse, Doğukan Bağcı, Bernt Schiele + 2 more · 2026-03-05 · 🤖 cs.LG

Intelligent Diagnosis Using Dual-Branch Attention Network for Rare Thyroid Carcinoma Recognition with Ultrasound Imaging

This paper proposes the Channel-Spatial Attention Synergy Network (CSASN), a novel multitask learning framework that integrates dual-branch EfficientNet and ViT architectures with attention mechanisms to effectively address data imbalance and morphological heterogeneity for the accurate diagnosis of rare thyroid carcinoma subtypes using ultrasound imaging.

Peiqi Li, Yincheng Gao, Renxing Li + 10 more · 2026-03-05 · 💻 cs

Why 1 + 1 < 1 in Visual Token Pruning: Beyond Naive Integration via Multi-Objective Balanced Covering

This paper introduces Multi-Objective Balanced Covering (MoB), a novel visual token pruning framework that leverages Hausdorff distance and ϵ-covering theory to derive a closed-form error bound and dynamically balance prompt alignment with visual preservation, achieving significant inference acceleration with minimal performance loss across diverse multimodal models.

Yangfu Li, Hongjian Zhan, Tianyi Chen + 2 more · 2026-03-05 · 💬 cs.CL

EgoWorld: Translating Exocentric View to Egocentric View using Rich Exocentric Observations

EgoWorld is a novel framework that reconstructs semantically coherent egocentric views from rich exocentric observations—including point clouds, 3D hand poses, and text—by leveraging depth estimation and diffusion models to overcome the limitations of existing 2D-based translation methods and achieve state-of-the-art performance across diverse datasets.

Junho Park, Andrew Sangwoo Ye, Taein Kwon · 2026-03-05 · 🤖 cs.AI

Fast Equivariant Imaging: Acceleration for Unsupervised Learning via Augmented Lagrangian and Auxiliary PnP Denoisers

This paper introduces Fast Equivariant Imaging (FEI), a novel unsupervised learning framework that leverages the Augmented Lagrangian method and auxiliary Plug-and-Play denoisers to achieve a 10x training acceleration and improved generalization for deep imaging tasks like X-ray CT reconstruction and inpainting without requiring ground-truth data.

Guixian Xu, Jinglai Li, Junqi Tang · 2026-03-05 · 🤖 cs.LG