cs.AI papers | Gist.Science

From Data Statistics to Feature Geometry: How Correlations Shape Superposition

This paper challenges the standard view of superposition in neural networks by demonstrating that, unlike in idealized uncorrelated settings where interference is merely noise, realistic feature correlations allow models to arrange features so that interference becomes constructive, thereby naturally forming the semantic clusters and cyclical structures observed in real language models.

Lucas Prieto, Edward Stevinson, Melih Barsbey, Tolga Birdal, Pedro A. M. MedianoWed, 11 Ma🤖 cs.AI

Understanding the Use of a Large Language Model-Powered Guide to Make Virtual Reality Accessible for Blind and Low Vision People

This paper presents a study of a large language model-powered "sighted guide" for blind and low vision users in social virtual reality, revealing that participants adapt their interaction from a tool-based approach when alone to a companionable relationship in the presence of others, thereby offering key design recommendations for future accessible VR guides.

Jazmin Collins, Sharon Y Lin, Tianqi Liu, Andrea Stevenson Won, Shiri AzenkotWed, 11 Ma🤖 cs.AI

Emotional Modulation in Swarm Decision Dynamics

This paper extends the classical "bee equation" for swarm decision-making into an agent-based model where emotional valence and arousal modulate interaction rates, revealing how emotional contagion and non-linear amplification mechanisms bias consensus outcomes and accelerate convergence in collective choices.

David Freire-ObregónWed, 11 Ma🤖 cs.AI

BEACON: Language-Conditioned Navigation Affordance Prediction under Occlusion

This paper introduces BEACON, a language-conditioned navigation system that overcomes the limitations of existing 2D image-space methods by predicting an occlusion-aware Bird's-Eye View affordance heatmap from surround-view RGB-D observations, thereby significantly improving the accuracy of inferring traversable targets in occluded regions.

Xinyu Gao, Gang Chen, Javier Alonso-MoraWed, 11 Ma🤖 cs.AI

Towards a Neural Debugger for Python

This paper introduces "neural debuggers," a new class of language models that emulate traditional debugging interactions like setting breakpoints and stepping through code to enable both forward and inverse execution prediction, thereby laying the foundation for more powerful agentic coding systems and automated debugging.

Maximilian Beck, Jonas Gehring, Jannik Kossen, Gabriel SynnaeveWed, 11 Ma🤖 cs.AI

When Learning Rates Go Wrong: Early Structural Signals in PPO Actor-Critic

This paper introduces the Overfitting-Underfitting Indicator (OUI) as an efficient, early-stage metric based on hidden neuron activation patterns to distinguish optimal learning rates in PPO actor-critic training, demonstrating its superior ability to prune unpromising runs compared to traditional criteria by revealing distinct structural signatures in actor and critic networks.

Alberto Fernández-Hernández, Cristian Pérez-Corral, Jose I. Mestre, Manuel F. Dolz, Jose Duato, Enrique S. Quintana-OrtíWed, 11 Ma🤖 cs.AI

No Image, No Problem: End-to-End Multi-Task Cardiac Analysis from Undersampled k-Space

The paper proposes k-MTR, a novel framework that bypasses the traditional image reconstruction step by directly learning multi-task cardiac diagnostic features from undersampled k-space data through a shared semantic manifold, thereby eliminating reconstruction artifacts and achieving competitive performance across regression, classification, and segmentation tasks.

Yundi Zhang, Sevgi Gokce Kafali, Niklas Bubeck, Daniel Rueckert, Jiazhen PanWed, 11 Ma🤖 cs.AI

Towards Flexible Spectrum Access: Data-Driven Insights into Spectrum Demand

This paper presents a data-driven methodology using geospatial analytics and machine learning to accurately estimate and identify the drivers of spectrum demand variations across urban regions, achieving 70% predictive accuracy to support flexible spectrum access policies for future 6G networks.

Mohamad Alkadamani, Amir Ghasemi, Halim YanikomerogluWed, 11 Ma🤖 cs.AI

Adaptive Clinical-Aware Latent Diffusion for Multimodal Brain Image Generation and Missing Modality Imputation

The paper introduces ACADiff, an adaptive clinical-aware latent diffusion framework that synthesizes missing multimodal brain imaging data (sMRI, FDG-PET, and AV45-PET) by integrating imaging observations with GPT-4o-encoded clinical metadata, achieving superior generation quality and robust diagnostic performance even when up to 80% of modalities are missing.

Rong Zhou, Houliang Zhou, Yao Su, Brian Y. Chen, Yu Zhang, Lifang He, Alzheimer's Disease Neuroimaging InitiativeWed, 11 Ma🤖 cs.AI

AI-Enabled Data-driven Intelligence for Spectrum Demand Estimation

This paper presents an AI-driven, data-driven framework that utilizes validated proxies from site license and crowdsourced data to accurately forecast spectrum demand across five major Canadian cities, thereby enabling more efficient dynamic spectrum planning and resource allocation for regulators and network operators.

Colin Brown, Mohamad Alkadamani, Halim YanikomerogluWed, 11 Ma🤖 cs.AI

MSSR: Memory-Aware Adaptive Replay for Continual LLM Fine-Tuning

The paper proposes MSSR, a memory-aware adaptive replay framework that estimates sample-level memory strength to dynamically schedule rehearsal intervals, effectively mitigating catastrophic forgetting while maintaining fast adaptation in continual LLM fine-tuning.

Yiyang Lu, Yu He, Jianlong Chen, Hongyuan ZhaWed, 11 Ma🤖 cs.AI

Emerging Extrinsic Dexterity in Cluttered Scenes via Dynamics-aware Policy Learning

This paper introduces the Dynamics-Aware Policy Learning (DAPL) framework, which leverages explicit world modeling to learn contact-induced dynamics, enabling robots to achieve robust extrinsic dexterity in cluttered environments without hand-crafted heuristics and significantly outperforming existing manipulation methods in both simulation and real-world deployments.

Yixin Zheng, Jiangran Lyu, Yifan Zhang, Jiayi Chen, Mi Yan, Yuntian Deng, Xuesong Shi, Xiaoguang Zhao, Yizhou Wang, Zhizheng Zhang, He WangWed, 11 Ma🤖 cs.AI

A Graph-Based Approach to Spectrum Demand Prediction Using Hierarchical Attention Networks

This paper introduces HR-GAT, a hierarchical resolution graph attention network that leverages geospatial data to predict spectrum demand with 21% higher accuracy than baseline models, effectively addressing spatial autocorrelation challenges to enable more efficient spectrum sharing and policy-making.

Mohamad Alkadamani, Halim Yanikomeroglu, Amir GhasemiWed, 11 Ma🤖 cs.AI

SCENEBench: An Audio Understanding Benchmark Grounded in Assistive and Industrial Use Cases

This paper introduces SCENEBench, a comprehensive benchmark suite designed to evaluate Large Audio Language Models on critical non-speech and cross-lingual audio understanding tasks relevant to assistive and industrial applications, revealing significant performance gaps and latency challenges in current state-of-the-art models.

Laya Iyer, Angelina Wang, Sanmi KoyejoWed, 11 Ma🤖 cs.AI

MA-EgoQA: Question Answering over Egocentric Videos from Multiple Embodied Agents

This paper introduces MA-EgoQA, a novel benchmark and dataset featuring 1,700 questions across five categories designed to evaluate the ability of AI models to understand and reason over multiple long-horizon egocentric videos from embodied agents, alongside a proposed baseline model named EgoMAS that highlights current limitations in system-level multi-agent understanding.

Kangsan Kim, Yanlai Yang, Suji Kim, Woongyeong Yeo, Youngwan Lee, Mengye Ren, Sung Ju HwangWed, 11 Ma🤖 cs.AI

Correction of Transformer-Based Models with Smoothing Pseudo-Projector

This paper introduces the smoothing pseudo-projector, a lightweight, multigrid-inspired module that corrects hidden representations in transformer-based models to suppress noise from label-irrelevant inputs, thereby improving training dynamics and robustness without altering the core architecture.

Vitaly BulgakovWed, 11 Ma🤖 cs.AI

MITRA: An AI Assistant for Knowledge Retrieval in Physics Collaborations

The paper introduces MITRA, an on-premise Retrieval-Augmented Generation (RAG) system designed to enhance knowledge retrieval in large-scale physics collaborations like CMS by employing a novel automated pipeline for document processing and a two-tiered vector database architecture to accurately answer context-aware questions while ensuring data privacy.

Abhishikth Mallampalli, Sridhara DasuWed, 11 Ma🤖 cs.AI

Exploiting Label-Aware Channel Scoring for Adaptive Channel Pruning in Split Learning

This paper proposes ACP-SL, an adaptive channel pruning scheme for Split Learning that utilizes a label-aware channel importance scoring module to compress smashed data, thereby significantly reducing communication overhead while improving test accuracy and training efficiency.

Jialei Tan, Zheng Lin, Xiangming Cai, Ruoxi Zhu, Zihan Fang, Pingping Chen, Wei NiWed, 11 Ma🤖 cs.AI

Ego: Embedding-Guided Personalization of Vision-Language Models

The paper proposes "Ego," an efficient personalization method for vision-language models that extracts visual tokens representing target concepts via internal attention mechanisms to serve as memory, enabling strong performance across single-concept, multi-concept, and video personalization tasks without requiring additional training stages or external modules.

Soroush Seifi, Simon Gardier, Vaggelis Dorovatas, Daniel Olmeda Reino, Rahaf AljundiWed, 11 Ma🤖 cs.AI

EXPLORE-Bench: Egocentric Scene Prediction with Long-Horizon Reasoning

This paper introduces EXPLORE-Bench, a benchmark derived from real first-person videos to evaluate the ability of multimodal large language models to perform long-horizon egocentric scene prediction, revealing significant performance gaps compared to humans and demonstrating that stepwise reasoning offers partial improvements at a computational cost.

Chengjun Yu, Xuhan Zhu, Chaoqun Du, Pengfei Yu, Wei Zhai, Yang Cao, Zheng-Jun ZhaWed, 11 Ma🤖 cs.AI

← Previous Next →