Beyond Heuristic Prompting: A Concept-Guided Bayesian Framework for Zero-Shot Image Recognition

This paper proposes a Concept-Guided Bayesian Framework for zero-shot image recognition that enhances Vision-Language Models by treating class-specific concepts as latent variables. It combines an LLM-driven concept synthesis pipeline with diversity enforcement and a training-free adaptive soft-trim likelihood, and reports better performance than heuristic prompting methods.

Hui Liu, Kecheng Chen, Jialiang Wang, Xianming Liu, Wenya Wang, Haoliang Li | 2026-03-10 | cs

IMSE: Intrinsic Mixture of Spectral Experts Fine-tuning for Test-Time Adaptation

The paper proposes IMSE, a test-time adaptation method that fine-tunes only the singular values of Vision Transformer linear layers via a spectral mixture of experts and a diversity maximization loss to prevent feature collapse, achieving state-of-the-art performance with significantly fewer trainable parameters.

Sunghyun Baek (Korea Advanced Institute of Science and Technology), Jaemyung Yu (Korea Advanced Institute of Science and Technology), Seunghee Koh (Korea Advanced Institute of Science and Technology), Minsu Kim (LG Energy Solution), Hyeonseong Jeon (LG Energy Solution), Junmo Kim (Korea Advanced Institute of Science and Technology) | 2026-03-10 | cs

Condition-Triggered Cryptographic Asset Control via Dormant Authorization Paths

This paper introduces Condition-Triggered Dormant Authorization Paths (CT-DAP), a cryptographic framework that enables secure, conditional control and revocable delegation of digital assets through dormant authorization paths activated only by simultaneous user and administrative factors, thereby eliminating the need for persistent key exposure or trusted intermediaries while maintaining regulatory compliance.

Jian Sheng Wang | 2026-03-10 | cs

Text to Automata Diagrams: Comparing TikZ Code Generation with Direct Image Synthesis

This study evaluates the effectiveness of vision-language and large language models in converting scanned student-drawn automata diagrams into TikZ code, finding that while direct image-to-text generation often yields errors, human-corrected descriptions significantly improve the accuracy of the resulting digital diagrams for educational applications like automated grading.

Ethan Young, Zichun Wang, Aiden Taylor, Chance Jewell, Julian Myers, Satya Sri Rajiteswari Nimmagadda, Anthony White, Aniruddha Maiti, Ananya Jana | 2026-03-10 | cs

RAPID: Redundancy-Aware and Compatibility-Optimal Edge-Cloud Partitioned Inference for Diverse VLA models

The paper introduces RAPID, a novel Edge-Cloud Collaborative inference framework designed to optimize the deployment of Vision Language Action models by addressing visual noise interference and step-wise task redundancy, thereby achieving up to a 1.73x speedup with minimal overhead.

Zihao Zheng, Sicheng Tian, Hangyu Cao, Chenyue Li, Jiayu Chen, Maoliang Li, Xinhao Sun, Hailong Zou, Guojie Luo, Xiang Chen | 2026-03-10 | cs

Decomposition-Driven Multi-Table Retrieval and Reasoning for Numerical Question Answering

This paper proposes DMRAL, a decomposition-driven framework for numerical multi-table question answering over large-scale table collections. It constructs a table relationship graph and combines aligned question decomposition, coverage-aware retrieval, and sub-question-guided reasoning, significantly outperforming existing methods.

Feng Luo, Hai Lan, Hui Luo, Zhifeng Bao, Xiaoli Wang, J. Shane Culpepper, Shazia Sadiq | 2026-03-10 | cs