Template-assisted Contrastive Learning of Task-oriented Dialogue Sentence Embeddings

This paper introduces Template-aware Dialogue Sentence Embedding (TaDSE), a novel self-supervised contrastive learning method that leverages easily obtainable token-level template information to generate high-quality sentence embeddings for task-oriented dialogues, achieving significant performance improvements over state-of-the-art methods on five benchmark datasets.

Minsik Oh, Jiwei Li, Guoyin Wang · 2026-04-14 · cs.CL
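The objective family TaDSE builds on is in-batch contrastive learning of sentence embeddings. Below is a minimal NumPy sketch of a generic SimCSE-style InfoNCE loss; the template-conditioning that defines TaDSE itself is not reproduced here, and the function name and temperature value are illustrative only.

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.05):
    """Generic in-batch contrastive (InfoNCE) loss.

    anchors, positives: (batch, dim) arrays; row i of `positives` is the
    positive pair for row i of `anchors`; every other row in the batch
    serves as an in-batch negative.
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    sims = a @ p.T / temperature                      # (batch, batch) logits
    logits = sims - sims.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))               # diagonal = positive pairs
```

Training pushes each anchor toward its own positive (the diagonal) and away from the other sentences in the batch.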

SCITUNE: Aligning Large Language Models with Human-Curated Scientific Multimodal Instructions

The paper introduces SciTune, a framework that aligns large language models with human-curated scientific multimodal instructions, resulting in a model (LLaMA-SciTune) that significantly outperforms state-of-the-art systems on scientific visual and language benchmarks, even surpassing human performance in certain categories.

Sameera Horawalavithana, Sai Munikoti, Ian Stewart, Henry Kvinge, Karl Pazdernik · 2026-04-14 · cs.CL

CROP: Conservative Reward for Model-based Offline Policy Optimization

This paper proposes CROP, a model-based offline reinforcement learning algorithm that introduces a conservative reward estimator to mitigate distribution shift and overestimation by minimizing both estimation error and the rewards of random actions, achieving competitive performance through a streamlined objective.

Hao Li, Xiao-Hu Zhou, Shu-Hai Li, Mei-Jiang Gui, Xiao-Liang Xie, Shi-Qi Liu, Shuang-Yi Wang, Zhen-Qiu Feng, Zeng-Guang Hou · 2026-04-14 · cs.LG
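A minimal sketch of the idea stated in the summary: fit a reward model that minimizes estimation error on logged data while also pushing down the predicted rewards of randomly sampled actions. This uses a plain linear model with gradient descent; all names, hyperparameters, and the specific penalty form here are illustrative assumptions, not CROP's actual algorithm.

```python
import numpy as np

def fit_conservative_reward(features, rewards, random_features,
                            beta=0.5, lr=0.1, steps=500):
    """features: (N, d) state-action features with observed `rewards` (N,);
    random_features: (M, d) features of randomly sampled (out-of-data) actions;
    beta: weight of the conservatism penalty on random-action rewards."""
    w = np.zeros(features.shape[1])
    for _ in range(steps):
        err = features @ w - rewards                  # estimation error term
        grad = features.T @ err / len(rewards)        # gradient of 0.5 * MSE
        grad += beta * random_features.mean(axis=0)   # gradient of beta * mean reward
        w -= lr * grad
    return w
```

With `beta = 0` this reduces to ordinary least-squares reward regression; increasing `beta` lowers predicted rewards on random actions, which is the conservatism that discourages the policy from exploiting out-of-distribution overestimates.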

Deep Optimizer States: Towards Scalable Training of Transformer Models Using Interleaved Offloading

This paper introduces "Deep Optimizer States," a novel technique that dynamically interleaves CPU and GPU computations by splitting model subgroups based on a performance model to exploit memory utilization fluctuations, thereby overcoming memory bottlenecks and achieving 2.5× faster training iterations compared to state-of-the-art offloading approaches.

Avinash Maurya, Jie Ye, M. Mustafa Rafique, Franck Cappello, Bogdan Nicolae · 2026-04-14 · cs.LG

The Phantom of PCIe: Constraining Generative Artificial Intelligences for Practical Peripherals Trace Synthesizing

This paper introduces Phantom, a framework that combines generative AI with a novel PCIe-specific constraint filter to eliminate hallucinations and synthesize high-fidelity, protocol-compliant Transaction Layer Packet (TLP) traces for practical device simulation.

Zhibai Huang, Chen Chen, James Yen, Yihan Shen, Yongchen Xie, Zhixiang Wei, Kailiang Xu, Yun Wang, Fangxin Liu, Tao Song, Mingyuan Xia, Zhengwei Qi · 2026-04-14 · cs.LG
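The generate-then-filter pattern the summary describes can be sketched generically: a generative model proposes candidate packets and a protocol-constraint filter discards any that violate the rules, so only compliant samples reach the trace. The toy "TLP" below is just a dict with one length field; real PCIe TLP constraints are far richer, and this sketch is an assumption-laden illustration of the pattern, not Phantom's filter.

```python
def synthesize_trace(generate, is_protocol_compliant, n):
    """Draw from `generate()` until `n` protocol-compliant samples are kept.

    generate: zero-argument callable producing one candidate packet.
    is_protocol_compliant: predicate rejecting hallucinated/invalid packets.
    """
    trace = []
    while len(trace) < n:
        candidate = generate()
        if is_protocol_compliant(candidate):
            trace.append(candidate)
    return trace
```

Rejection filtering like this guarantees every emitted packet satisfies the constraints, at the cost of extra generator calls when the model's hit rate is low.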

WebLLM: A High-Performance In-Browser LLM Inference Engine

The paper introduces WebLLM, an open-source JavaScript framework that leverages WebGPU and WebAssembly to enable high-performance, privacy-preserving large language model inference entirely within web browsers, achieving up to 80% of native device performance.

Charlie F. Ruan, Yucheng Qin, Akaash R. Parthasarathy, Xun Zhou, Ruihang Lai, Hongyi Jin, Yixin Dong, Bohan Hou, Meng-Shiun Yu, Yiyan Zhai, Sudeep Agarwal, Hangrui Cao, Siyuan Feng, Tianqi Chen · 2026-04-14 · cs.LG

A Multiparty Homomorphic Encryption Approach to Confidential Federated Kaplan Meier Survival Analysis

This paper introduces a privacy-preserving federated framework using threshold CKKS homomorphic encryption to enable multi-institutional Kaplan-Meier survival analysis, achieving results that closely match a centralized analysis while preventing the reconstruction of sensitive individual records through encrypted aggregation and threshold decryption.

Narasimha Raghavan Veeraragavan, Svetlana Boudko, Jan Franz Nygård · 2026-04-14 · stat
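For reference, the statistic the encrypted protocol computes is the standard Kaplan-Meier estimator: at each event time, multiply the running survival probability by (1 − deaths / at-risk). The plain, centralized sketch below shows only that estimator, not the paper's federated or encrypted machinery.

```python
def kaplan_meier(times, events):
    """Plain (unencrypted, centralized) Kaplan-Meier estimator.

    times: observed durations; events: 1 = event occurred, 0 = censored.
    Returns a list of (time, survival_probability) curve steps.
    """
    n_at_risk = len(times)
    order = sorted(range(len(times)), key=lambda i: times[i])
    survival, curve = 1.0, []
    i = 0
    while i < len(order):
        t = times[order[i]]
        deaths = withdrawn = 0
        while i < len(order) and times[order[i]] == t:   # group ties at time t
            if events[order[i]]:
                deaths += 1
            else:
                withdrawn += 1
            i += 1
        if deaths:                                        # curve steps only at events
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= deaths + withdrawn                   # censored leave the risk set
    return curve
```

In the federated setting, the per-time death and at-risk counts would be aggregated across institutions under encryption rather than computed from pooled raw records.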

Influencing Humans to Conform to Preference Models for RLHF

This paper demonstrates that human preference data quality for Reinforcement Learning from Human Feedback (RLHF) can be significantly improved by designing specific interventions—such as visualizing underlying model quantities, training users on the model, and modifying elicitation questions—to align human expression with the algorithm's preference model assumptions without altering their underlying reward functions.

Stephane Hatgis-Kessell, W. Bradley Knox, Serena Booth, Peter Stone · 2026-04-14 · cs.LG
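The preference model most RLHF pipelines commonly assume is Bradley-Terry: the probability that a human prefers option A over option B is a logistic function of the difference in their returns. The interventions described above aim to make human choices better match this assumption; the sketch below is just the model itself, with a hypothetical function name.

```python
import math

def bradley_terry_preference(return_a, return_b):
    """P(A preferred over B) under the Bradley-Terry / logistic model:
    sigmoid of the return difference."""
    return 1.0 / (1.0 + math.exp(return_b - return_a))
```

Equal returns give indifference (probability 0.5), and the two orderings' probabilities always sum to 1, which is exactly the consistency that real human annotators often violate.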

ExPath: Targeted Pathway Inference for Biological Knowledge Bases via Graph Learning and Explanation

ExPath is a novel subgraph inference framework that integrates experimental molecular data and biological foundation models to identify biologically meaningful targeted pathways in knowledge bases, significantly outperforming existing explainers in fidelity and pathway length preservation.

Rikuto Kotoge, Ziwei Yang, Zheng Chen, Yushun Dong, Yasuko Matsubara, Jimeng Sun, Yasushi Sakurai · 2026-04-14 · cs.LG

If an LLM Were a Character, Would It Know Its Own Story? Evaluating Lifelong Learning in LLMs

This paper introduces LIFESTATE-BENCH, a novel benchmark utilizing narrative datasets like Hamlet to evaluate lifelong learning in large language models, revealing that while non-parametric methods outperform parametric ones in managing stateful interactions, all models still struggle with catastrophic forgetting over extended engagements.

Siqi Fan, Xiusheng Huang, Yiqun Yao, Xuezhi Fang, Kang Liu, Peng Han, Shuo Shang, Aixin Sun, Yequan Wang · 2026-04-14 · cs.CL