cs papers | Gist.Science

Beyond Heuristic Prompting: A Concept-Guided Bayesian Framework for Zero-Shot Image Recognition

This paper proposes a Concept-Guided Bayesian Framework for zero-shot image recognition that enhances Vision-Language Models by treating class-specific concepts as latent variables, utilizing an LLM-driven synthesis pipeline with diversity enforcement and a training-free adaptive soft-trim likelihood to achieve superior performance over heuristic prompting methods.

Hui Liu, Kecheng Chen, Jialiang Wang, Xianming Liu, Wenya Wang, Haoliang Li | 2026-03-10 | 💻 cs

Geometric Transformation-Embedded Mamba for Learned Video Compression

This paper proposes a streamlined learned video compression framework that replaces traditional motion estimation with a direct transform strategy, utilizing a cascaded Mamba module with embedded geometric transformations and a locality refinement network to achieve superior perceptual quality and temporal consistency at low bitrates.

Hao Wei, Yanhui Zhou, Chenyang Ge | 2026-03-10 | 💻 cs

Ares: Adaptive Reasoning Effort Selection for Efficient LLM Agents

The paper introduces Ares, a framework that dynamically selects the optimal reasoning effort level for each step of an LLM agent's task using a lightweight router, achieving up to 52.7% reduction in token usage with minimal impact on success rates compared to static high-effort strategies.

Jingbo Yang, Bairu Hou, Wei Wei, Yujia Bao, Shiyu Chang | 2026-03-10 | 💻 cs

SageSched: Efficient LLM Scheduling Confronting Demand Uncertainty and Hybridity

SageSched is an efficient LLM scheduler that addresses demand uncertainty and hybrid resource requirements by integrating lightweight output-length prediction with a comprehensive cost model and an uncertainty-aware policy, achieving an efficiency improvement of over 28.7% in diverse testbed experiments.

Zhenghao Gan, Yichen Bao, Yifei Liu, Chen Chen, Quan Chen, Minyi Guo | 2026-03-10 | 💻 cs

Enhancing Unregistered Hyperspectral Image Super-Resolution via Unmixing-based Abundance Fusion Learning

This paper proposes an unmixing-based fusion framework that decouples spatial-spectral information and employs a coarse-to-fine deformable aggregation module to effectively mitigate registration errors and achieve state-of-the-art performance in unregistered hyperspectral image super-resolution.

Yingkai Zhang, Tao Zhang, Jing Nie, Ying Fu | 2026-03-10 | 💻 cs

Social Proof is in the Pudding: The (Non)-Impact of Social Proof on Software Downloads

Through two field experiments on GitHub involving the manipulation of repository stars and package download counts, the study finds that social proof metrics have no discernible impact on subsequent software downloads or developer engagement, suggesting that open-source software choices are not easily gamed by inflating these indicators.

Lucas Shen, Gaurav Sood | 2026-03-10 | 💻 cs

RLPR: Radar-to-LiDAR Place Recognition via Two-Stage Asymmetric Cross-Modal Alignment for Autonomous Driving

This paper presents RLPR, a robust framework for radar-to-LiDAR place recognition that employs a dual-stream network and a two-stage asymmetric cross-modal alignment strategy to achieve state-of-the-art accuracy and zero-shot generalization across diverse radar types and adverse weather conditions.

Zhangshuo Qi, Jingyi Xu, Luqi Cheng, Shichen Wen, Guangming Xiong | 2026-03-10 | 💻 cs

IMSE: Intrinsic Mixture of Spectral Experts Fine-tuning for Test-Time Adaptation

The paper proposes IMSE, a test-time adaptation method that fine-tunes only the singular values of Vision Transformer linear layers via a spectral mixture of experts and a diversity maximization loss to prevent feature collapse, achieving state-of-the-art performance with significantly fewer trainable parameters.

Sunghyun Baek (Korea Advanced Institute of Science and Technology), Jaemyung Yu (Korea Advanced Institute of Science and Technology), Seunghee Koh (Korea Advanced Institute of Science and Technology), Minsu Kim (LG Energy Solution), Hyeonseong Jeon (LG Energy Solution), Junmo Kim (Korea Advanced Institute of Science and Technology) | 2026-03-10 | 💻 cs

SWE-Fuse: Empowering Software Agents via Issue-free Trajectory Learning and Entropy-aware RLVR Training

SWE-Fuse is a novel training framework that enhances software engineering agents by fusing issue-free trajectory learning with entropy-aware RLVR to overcome the limitations of noisy real-world issue descriptions, achieving state-of-the-art performance on the SWE-bench Verified benchmark.

Xin-Cheng Wen, Binbin Chen, Haoxuan Lan, Hang Yu, Peng Di, Cuiyun Gao | 2026-03-10 | 💻 cs

Omnidirectional Humanoid Locomotion on Stairs via Unsafe Stepping Penalty and Sparse LiDAR Elevation Mapping

This paper presents a robust framework for safe omnidirectional humanoid stair locomotion that combines a single-stage training strategy with dense unsafe stepping penalties and a refined sparse LiDAR elevation mapping system to achieve high success rates in both simulation and real-world deployments.

Yuzhi Jiang, Yujun Liang, Junhao Li, Han Ding, Lijun Zhu | 2026-03-10 | 💻 cs

A Hybrid Vision Transformer Approach for Mathematical Expression Recognition

This paper proposes a Hybrid Vision Transformer approach with 2D positional encoding and a coverage attention decoder to address the complexities of mathematical expression recognition, achieving a state-of-the-art BLEU score of 89.94 on the IM2LATEX-100K dataset.

Anh Duy Le, Van Linh Pham, Vinh Loi Ly, Nam Quan Nguyen, Huu Thang Nguyen, Tuan Anh Tran | 2026-03-10 | 💻 cs

Condition-Triggered Cryptographic Asset Control via Dormant Authorization Paths

This paper introduces Condition-Triggered Dormant Authorization Paths (CT-DAP), a cryptographic framework that enables secure, conditional control and revocable delegation of digital assets through dormant authorization paths activated only by simultaneous user and administrative factors, thereby eliminating the need for persistent key exposure or trusted intermediaries while maintaining regulatory compliance.

Jian Sheng Wang | 2026-03-10 | 💻 cs

Unsupervised Domain Adaptation for Audio Deepfake Detection with Modular Statistical Transformations

This paper presents a modular, unsupervised domain adaptation pipeline that combines Wav2Vec 2.0 embeddings with statistical transformations like CORAL alignment and feature selection to significantly improve cross-domain generalization for audio deepfake detection without requiring labeled target data.

Urawee Thani, Gagandeep Singh, Priyanka Singh | 2026-03-10 | 💻 cs

Text to Automata Diagrams: Comparing TikZ Code Generation with Direct Image Synthesis

This study evaluates the effectiveness of vision-language and large language models in converting scanned student-drawn automata diagrams into TikZ code, finding that while direct image-to-text generation often yields errors, human-corrected descriptions significantly improve the accuracy of the resulting digital diagrams for educational applications like automated grading.

Ethan Young, Zichun Wang, Aiden Taylor, Chance Jewell, Julian Myers, Satya Sri Rajiteswari Nimmagadda, Anthony White, Aniruddha Maiti, Ananya Jana | 2026-03-10 | 💻 cs

$L^3$: Scene-agnostic Visual Localization in the Wild

The paper introduces $L^3$, a novel map-free visual localization framework that achieves high accuracy and robustness in sparse, wild scenes by leveraging online feed-forward 3D reconstruction and two-stage pose refinement, thereby eliminating the need for offline scene preprocessing and storage.

Yu Zhang, Muhua Zhu, Yifei Xue, Tie Ji, Yizhen Lao | 2026-03-10 | 💻 cs

AI Agents, Language, Deep Learning and the Next Revolution in Science

This paper proposes that intelligent, human-supervised AI agents built on deep learning and large language models represent the next evolution of the scientific method, enabling researchers to manage unprecedented data complexity and scale discovery, as demonstrated by the Dr. Sai system in particle physics.

Ke Li, Beijiang Liu, Bruce Mellado, Changzheng Yuan, Zhengde Zhang | 2026-03-10 | 💻 cs

ConnChecker: Automated Root-Cause Analysis for Formal Connectivity Check via Graph

ConnChecker is a novel graph-based automated tool that accelerates formal connectivity verification in complex SoC designs by categorizing counterexamples and localizing root causes, achieving up to an 80% reduction in debugging time.

Do Ngoc Tiep, Nguyen Linh Anh, Luu Danh Minh | 2026-03-10 | 💻 cs

The Li-Chao Tree: Algorithm Specification and Analysis

This paper provides the first formal specification and comprehensive analysis of the Li-Chao tree, a widely used data structure in competitive programming, by establishing its algorithmic details, proving correctness, and evaluating its theoretical and empirical performance.

Chao Li | 2026-03-10 | 💻 cs

RAPID: Redundancy-Aware and Compatibility-Optimal Edge-Cloud Partitioned Inference for Diverse VLA models

The paper introduces RAPID, a novel Edge-Cloud Collaborative inference framework designed to optimize the deployment of Vision Language Action models by addressing visual noise interference and step-wise task redundancy, thereby achieving up to a 1.73x speedup with minimal overhead.

Zihao Zheng, Sicheng Tian, Hangyu Cao, Chenyue Li, Jiayu Chen, Maoliang Li, Xinhao Sun, Hailong Zou, Guojie Luo, Xiang Chen | 2026-03-10 | 💻 cs

Decomposition-Driven Multi-Table Retrieval and Reasoning for Numerical Question Answering

This paper proposes DMRAL, a decomposition-driven framework that constructs a table relationship graph and employs aligned question decomposition with coverage-aware retrieval and sub-question guided reasoning to significantly outperform existing methods in numerical multi-table question answering over large-scale table collections.

Feng Luo, Hai Lan, Hui Luo, Zhifeng Bao, Xiaoli Wang, J. Shane Culpepper, Shazia Sadiq | 2026-03-10 | 💻 cs
