ViLAM: Distilling Vision-Language Reasoning into Attention Maps for Social Robot Navigation
ViLAM distills the vision-language reasoning of large Vision-Language Models into spatial attention maps that guide socially compliant robot navigation; real-world experiments show marked gains in navigation success rate.