Small Language Models for Efficient Agentic Tool Calling: Outperforming Large Models with Targeted Fine-tuning
This paper demonstrates that a single epoch of domain-adapted fine-tuning is enough for a 350M-parameter Small Language Model (OPT-350M) to significantly outperform larger models and existing baselines on tool-calling tasks. The fine-tuned model achieves a 77.55% pass rate on ToolBench, showing that targeted training can make generative AI more cost-effective and scalable for enterprise use.
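To make the setup concrete, the sketch below shows one way a tool-calling example might be serialized into a single training string for causal-LM fine-tuning. This is an illustrative assumption, not the paper's actual data pipeline: the section layout (`User:`/`Tools:`/`Call:`) and the field names (`query`, `tools`, `call`) are hypothetical, and ToolBench's real schema may differ.

```python
import json


def format_tool_call_example(query: str, tools: list, call: dict) -> str:
    """Serialize one tool-calling example as a single training string.

    The User/Tools/Call layout here is a hypothetical scheme chosen for
    illustration; it is not taken from the paper or from ToolBench itself.
    """
    # One JSON line per available tool specification.
    tool_specs = "\n".join(json.dumps(t, sort_keys=True) for t in tools)
    # The target completion is the JSON-encoded tool call the model
    # should learn to emit after the prompt.
    return (
        f"User: {query}\n"
        f"Tools:\n{tool_specs}\n"
        f"Call: {json.dumps(call, sort_keys=True)}"
    )


# Hypothetical example in the spirit of ToolBench-style tasks.
example = format_tool_call_example(
    query="What is the weather in Paris?",
    tools=[{"name": "get_weather", "params": ["city"]}],
    call={"name": "get_weather", "arguments": {"city": "Paris"}},
)
print(example)
```

Strings of this shape could then be tokenized and fed to a standard causal-LM training loop for the single-epoch fine-tuning the paper describes.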