Evaluating GPT-5 as a Multimodal Clinical Reasoner: A Landscape Commentary

This landscape commentary evaluates the GPT-5 family against GPT-4o, revealing substantial improvements in expert-level textual reasoning and multimodal synthesis that approach state-of-the-art performance in tasks like mammography, while highlighting that generalist models still lag behind specialized systems in perception-critical domains such as neuroradiology.

Alexandru Florea, Shansong Wang, Mingzhe Hu + 5 more | 2026-03-06 | 💻 cs

Evaluating and Correcting Human Annotation Bias in Dynamic Micro-Expression Recognition

This paper introduces the Global Anti-Monotonic Differential Selection Strategy (GAMDSS), a novel architecture that mitigates human annotation bias in cross-cultural micro-expression recognition by dynamically re-selecting keyframes to construct robust spatio-temporal representations, thereby improving model performance and standardizing annotation practices without increasing the parameter count.

Feng Liu, Bingyu Nan, Xuezhong Qian + 1 more | 2026-03-06 | 💻 cs

DSA-SRGS: Super-Resolution Gaussian Splatting for Dynamic Sparse-View DSA Reconstruction

This paper proposes DSA-SRGS, the first super-resolution Gaussian splatting framework for dynamic sparse-view DSA reconstruction, which integrates a Multi-Fidelity Texture Learning Module with confidence-aware supervision and Radiative Sub-Pixel Densification to recover fine-grained vascular details while avoiding blurring and hallucination artifacts.

Shiyu Zhang, Zhicong Wu, Huangxuan Zhao + 7 more | 2026-03-06 | 💻 cs

MADCrowner: Margin Aware Dental Crown Design with Template Deformation and Refinement

The paper proposes MADCrowner, a margin-aware framework that combines a template deformation network (CrownDeformR) with a novel margin segmentation network (CrownSegger) to automatically generate high-precision, clinically feasible dental crowns by addressing limitations in spatial resolution and surface overextension found in existing learning-based methods.

Linda Wei, Chang Liu, Wenran Zhang + 9 more | 2026-03-06 | 💻 cs

RMK RetinaNet: Rotated Multi-Kernel RetinaNet for Robust Oriented Object Detection in Remote Sensing Imagery

The paper proposes RMK RetinaNet, a rotated object detection framework for remote sensing imagery that addresses limitations in receptive field adaptation, multi-scale feature fusion, and angle regression discontinuity through a Multi-Scale Kernel Block, Multi-Directional Contextual Anchor Attention, a Bottom-up Path, and an Euler Angle Encoding Module, achieving state-of-the-art performance on benchmark datasets.

Huiran Sun | 2026-03-06 | 💻 cs

LAW & ORDER: Adaptive Spatial Weighting for Medical Diffusion and Segmentation

This paper introduces "LAW & ORDER," a dual-adapter framework that employs Learnable Adaptive Weighting to stabilize diffusion-based medical image synthesis and Optimal Region Detection to enhance efficient segmentation, collectively addressing spatial imbalance to significantly improve generative quality and segmentation accuracy while maintaining a lightweight model architecture.

Anugunj Naman, Ayushman Singh, Gaibo Zhang + 1 more | 2026-03-06 | 💻 cs

Guiding Diffusion-based Reconstruction with Contrastive Signals for Balanced Visual Representation

This paper proposes Diffusion Contrastive Reconstruction (DCR), a method that injects contrastive signals derived from reconstructed images into the diffusion process to resolve gradient conflicts and jointly optimize both discriminative and detail-perceptive abilities, thereby overcoming the limitations of CLIP's visual encoder for balanced visual representation.

Boyu Han, Qianqian Xu, Shilong Bao + 4 more | 2026-03-06 | 💻 cs

Meta-D: Metadata-Aware Architectures for Brain Tumor Analysis and Missing-Modality Segmentation

The paper presents Meta-D, a metadata-aware architecture that leverages categorical scanner information to dynamically modulate feature extraction for improved 2D brain tumor detection and to serve as a robust anchor for cross-attention mechanisms in 3D missing-modality segmentation, achieving significant performance gains and parameter reduction.

SangHyuk Kim, Daniel Haehn, Sumientra Rampersad | 2026-03-06 | 💻 cs

Revisiting Shape from Polarization in the Era of Vision Foundation Models

This paper demonstrates that by addressing domain gaps through a high-quality dataset of 3D-scanned objects, DINOv3 priors, and sensor-aware augmentation, a lightweight polarization-based model trained on a small dataset can significantly outperform both state-of-the-art Shape from Polarization methods and large-scale RGB-only Vision Foundation Models in single-shot surface normal estimation.

Chenhao Li, Taishi Ono, Takeshi Uemori + 1 more | 2026-03-06 | 💻 cs

On Multi-Step Theorem Prediction via Non-Parametric Structural Priors

This paper introduces a training-free, non-parametric approach to multi-step theorem prediction that overcomes the scalability limitations of vanilla in-context learning by leveraging Theorem Precedence Graphs to encode temporal dependencies and impose topological constraints, achieving state-of-the-art accuracy on the FormalGeo7k benchmark without gradient-based optimization.

Junbo Zhao, Ting Zhang, Can Li + 3 more | 2026-03-06 | 🤖 cs.AI