Automated Dose-Based Anatomic Region Classification of Radiotherapy Treatment for Big Data Applications

This paper presents a scalable, automated deep-learning software solution that accurately classifies radiotherapy treatment sites into six anatomic regions by analyzing dose-volume overlaps with segmented organs, thereby overcoming metadata inconsistencies to enable reliable curation of large-scale, multi-institutional radiotherapy datasets.

Justin Hink, Yasin Abdulkadir, Jack Neylon + 1 more | 2026-03-02 | 🔬 physics

CycleBEV: Regularizing View Transformation Networks via View Cycle Consistency for Bird's-Eye-View Semantic Segmentation

CycleBEV is a training-only regularization framework that enhances Bird's-Eye-View semantic segmentation by introducing an inverse view transformation network to enforce cycle consistency between perspective and BEV spaces, thereby improving geometric and semantic feature learning without increasing inference complexity.

Jeongbin Hong, Dooseop Choi, Taeg-Hyun An + 2 more | 2026-03-02 | 🤖 cs.AI

Hyperdimensional Cross-Modal Alignment of Frozen Language and Image Models for Efficient Image Captioning

This paper introduces HDFLIM, a framework that achieves efficient image captioning by aligning frozen vision and language models through hyperdimensional computing operations like binding and bundling, thereby eliminating the need for computationally intensive multimodal fine-tuning while maintaining performance comparable to end-to-end training methods.

Abhishek Dalvi, Vasant Honavar | 2026-03-02 | 🤖 cs.AI
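
For readers unfamiliar with the binding and bundling operations named in the summary above, the following is a minimal sketch of the generic hyperdimensional computing primitives on bipolar vectors. It is not HDFLIM's implementation: the dimensionality `D`, the helper names (`random_hv`, `bind`, `bundle`, `similarity`), and the role/filler example are all assumptions chosen for illustration.

```python
import numpy as np

D = 10_000  # hypervector dimensionality (illustrative assumption)
rng = np.random.default_rng(0)

def random_hv():
    """Draw a random bipolar hypervector with entries in {-1, +1}."""
    return rng.choice(np.array([-1, 1]), size=D)

def bind(a, b):
    """Binding: elementwise product. Self-inverse for bipolar vectors."""
    return a * b

def bundle(*hvs):
    """Bundling: elementwise majority vote (use an odd count to avoid ties)."""
    return np.sign(np.sum(hvs, axis=0)).astype(int)

def similarity(a, b):
    """Normalized dot product; close to 0 for unrelated hypervectors."""
    return float(a @ b) / D

# Hypothetical role/filler record: three roles, each bound to a filler.
role_image, role_text, role_extra = random_hv(), random_hv(), random_hv()
image, text, extra = random_hv(), random_hv(), random_hv()
record = bundle(bind(role_image, image),
                bind(role_text, text),
                bind(role_extra, extra))

# Unbinding with a role recovers a noisy copy of its filler.
recovered = bind(record, role_image)
```

Under these assumptions, `similarity(recovered, image)` is well above chance while `similarity(recovered, text)` stays near zero, which is the property that lets a bundled record be queried by role.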

Suppressing Prior-Comparison Hallucinations in Radiology Report Generation via Semantically Decoupled Latent Steering

This paper introduces Semantically Decoupled Latent Steering (SDLS), a training-free inference-time framework that utilizes LLM-driven semantic decomposition and QR-based orthogonalization to generate intervention vectors that specifically suppress prior-comparison hallucinations in radiology report generation while preserving clinical accuracy.

Ao Li, Rui Liu, Mingjie Li + 6 more | 2026-03-02 | 💻 cs
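
As a rough illustration of what QR-based orthogonalization of an intervention vector can look like, the sketch below removes from a target direction every component that lies in the span of a set of directions to be preserved, then subtracts the resulting direction from a hidden state. This is a generic linear-algebra sketch, not the SDLS procedure: the function names (`decoupled_direction`, `steer`), the `strength` parameter, and the way the directions are obtained are all assumptions.

```python
import numpy as np

def decoupled_direction(target, preserve):
    """Return the unit vector along the part of `target` orthogonal to
    the column space of `preserve`.

    target:   shape (d,), the direction to suppress (hypothetical input).
    preserve: shape (d, k), k < d, directions that must stay untouched.
    """
    # Reduced QR: q has orthonormal columns spanning the preserve subspace.
    q, _ = np.linalg.qr(preserve)
    residual = target - q @ (q.T @ target)
    return residual / np.linalg.norm(residual)

def steer(hidden, direction, strength=1.0):
    """Subtract the component of `hidden` along the unit `direction`."""
    return hidden - strength * (hidden @ direction) * direction
```

With `strength=1.0` the steered hidden state has no remaining component along the suppressed direction, while its projections onto the preserved directions are unchanged, because the suppressed direction was made orthogonal to them first.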

HiDrop: Hierarchical Vision Token Reduction in MLLMs via Late Injection, Concave Pyramid Pruning, and Early Exit

HiDrop is a novel framework that significantly accelerates Multimodal Large Language Models (MLLMs) by aligning token pruning with hierarchical layer functions through Late Injection, Concave Pyramid Pruning, and Early Exit mechanisms, achieving a 90% reduction in visual tokens with a 1.72x training speedup while maintaining original performance.

Hao Wu, Yingqi Fan, Jinyang Dai + 3 more | 2026-03-02 | 💬 cs.CL