cs.AI papers | Gist.Science

RbtAct: Rebuttal as Supervision for Actionable Review Feedback Generation

This paper introduces RbtAct, a framework that leverages peer review rebuttals as implicit supervision to train large language models to generate more actionable and specific review feedback through a novel perspective-conditioned task and a new dataset called RMR-75K.

Sihong Wu, Yiling Ma, Yilun Zhao, Tiansheng Hu, Owen Jiang, Manasi Patwardhan, Arman CohanWed, 11 Ma🤖 cs.AI

EXPLORE-Bench: Egocentric Scene Prediction with Long-Horizon Reasoning

This paper introduces EXPLORE-Bench, a benchmark derived from real first-person videos to evaluate the ability of multimodal large language models to perform long-horizon egocentric scene prediction, revealing significant performance gaps compared to humans and demonstrating that stepwise reasoning offers partial improvements at a computational cost.

Chengjun Yu, Xuhan Zhu, Chaoqun Du, Pengfei Yu, Wei Zhai, Yang Cao, Zheng-Jun ZhaWed, 11 Ma🤖 cs.AI

Ego: Embedding-Guided Personalization of Vision-Language Models

The paper proposes "Ego," an efficient personalization method for vision-language models that extracts visual tokens representing target concepts via internal attention mechanisms to serve as memory, enabling strong performance across single-concept, multi-concept, and video personalization tasks without requiring additional training stages or external modules.

Soroush Seifi, Simon Gardier, Vaggelis Dorovatas, Daniel Olmeda Reino, Rahaf AljundiWed, 11 Ma🤖 cs.AI

World2Mind: Cognition Toolkit for Allocentric Spatial Reasoning in Foundation Models

World2Mind is a training-free toolkit that enhances foundation models' allocentric spatial reasoning by constructing structured cognitive maps and an Allocentric-Spatial Tree, enabling significant performance gains and even allowing text-only models to achieve complex 3D spatial reasoning comparable to advanced multimodal systems.

Shouwei Ruan, Bin Wang, Zhenyu Wu, Qihui Zhu, Yuxiang Zhang, Hang Su, Yubin WangWed, 11 Ma🤖 cs.AI

First Estimation of Model Parameters for Neutrino-Induced Nucleon Knockout Using Simulation-Based Inference

This paper demonstrates that simulation-based inference (SBI) is a viable and potentially superior alternative to traditional empirical tuning for determining neutrino interaction model parameters, as it successfully reproduces and slightly improves upon the MicroBooNE collaboration's tuned GENIE configuration while also approximating the NuWro simulation.

Karla Tame-Narvaez, Steven Gardiner, Aleksandra Ciprijanovic, Giuseppe CeratiWed, 11 Ma⚛️ hep-ph

Quantifying the Necessity of Chain of Thought through Opaque Serial Depth

This paper introduces the concept of "opaque serial depth" to formally quantify the limits of internal reasoning in language models without externalized chain-of-thought, providing computational bounds for various architectures and an open-source tool to analyze the potential for non-externalized reasoning.

Jonah Brown-Cohen, David Lindner, Rohin ShahWed, 11 Ma🤖 cs.AI

A Hybrid Quantum-Classical Framework for Financial Volatility Forecasting Based on Quantum Circuit Born Machines

This paper proposes and validates a hybrid quantum-classical framework that integrates a Long Short-Term Memory (LSTM) network with a Quantum Circuit Born Machine (QCBM) to significantly improve financial volatility forecasting accuracy on high-frequency stock market data compared to traditional classical models.

Yixiong ChenWed, 11 Ma⚛️ quant-ph

Exploiting Label-Aware Channel Scoring for Adaptive Channel Pruning in Split Learning

This paper proposes ACP-SL, an adaptive channel pruning scheme for Split Learning that utilizes a label-aware channel importance scoring module to compress smashed data, thereby significantly reducing communication overhead while improving test accuracy and training efficiency.

Jialei Tan, Zheng Lin, Xiangming Cai, Ruoxi Zhu, Zihan Fang, Pingping Chen, Wei NiWed, 11 Ma🤖 cs.AI

MITRA: An AI Assistant for Knowledge Retrieval in Physics Collaborations

The paper introduces MITRA, an on-premise Retrieval-Augmented Generation (RAG) system designed to enhance knowledge retrieval in large-scale physics collaborations like CMS by employing a novel automated pipeline for document processing and a two-tiered vector database architecture to accurately answer context-aware questions while ensuring data privacy.

Abhishikth Mallampalli, Sridhara DasuWed, 11 Ma🤖 cs.AI

Correction of Transformer-Based Models with Smoothing Pseudo-Projector

This paper introduces the smoothing pseudo-projector, a lightweight, multigrid-inspired module that corrects hidden representations in transformer-based models to suppress noise from label-irrelevant inputs, thereby improving training dynamics and robustness without altering the core architecture.

Vitaly BulgakovWed, 11 Ma🤖 cs.AI

MA-EgoQA: Question Answering over Egocentric Videos from Multiple Embodied Agents

This paper introduces MA-EgoQA, a novel benchmark and dataset featuring 1,700 questions across five categories designed to evaluate the ability of AI models to understand and reason over multiple long-horizon egocentric videos from embodied agents, alongside a proposed baseline model named EgoMAS that highlights current limitations in system-level multi-agent understanding.

Kangsan Kim, Yanlai Yang, Suji Kim, Woongyeong Yeo, Youngwan Lee, Mengye Ren, Sung Ju HwangWed, 11 Ma🤖 cs.AI

SCENEBench: An Audio Understanding Benchmark Grounded in Assistive and Industrial Use Cases

This paper introduces SCENEBench, a comprehensive benchmark suite designed to evaluate Large Audio Language Models on critical non-speech and cross-lingual audio understanding tasks relevant to assistive and industrial applications, revealing significant performance gaps and latency challenges in current state-of-the-art models.

Laya Iyer, Angelina Wang, Sanmi KoyejoWed, 11 Ma🤖 cs.AI

A Graph-Based Approach to Spectrum Demand Prediction Using Hierarchical Attention Networks

This paper introduces HR-GAT, a hierarchical resolution graph attention network that leverages geospatial data to predict spectrum demand with 21% higher accuracy than baseline models, effectively addressing spatial autocorrelation challenges to enable more efficient spectrum sharing and policy-making.

Mohamad Alkadamani, Halim Yanikomeroglu, Amir GhasemiWed, 11 Ma🤖 cs.AI

Emerging Extrinsic Dexterity in Cluttered Scenes via Dynamics-aware Policy Learning

This paper introduces the Dynamics-Aware Policy Learning (DAPL) framework, which leverages explicit world modeling to learn contact-induced dynamics, enabling robots to achieve robust extrinsic dexterity in cluttered environments without hand-crafted heuristics and significantly outperforming existing manipulation methods in both simulation and real-world deployments.

Yixin Zheng, Jiangran Lyu, Yifan Zhang, Jiayi Chen, Mi Yan, Yuntian Deng, Xuesong Shi, Xiaoguang Zhao, Yizhou Wang, Zhizheng Zhang, He WangWed, 11 Ma🤖 cs.AI

LCA: Local Classifier Alignment for Continual Learning

This paper proposes Local Classifier Alignment (LCA), a novel loss function that resolves the mismatch between task-specific classifiers and adapted backbones in continual learning, thereby enhancing generalization and robustness while achieving state-of-the-art performance on standard benchmarks.

Tung Tran, Danilo Vasconcellos Vargas, Khoat ThanWed, 11 Ma🤖 cs.AI

Influencing LLM Multi-Agent Dialogue via Policy-Parameterized Prompts

This paper proposes a lightweight, training-free framework that parameterizes prompts as actions to dynamically influence LLM multi-agent dialogue behaviors, demonstrating through experiments that this policy-based approach effectively controls conversational dynamics across various indicators.

Hongbo Bo, Jingyu Hu, Weiru LiuWed, 11 Ma🤖 cs.AI

MSSR: Memory-Aware Adaptive Replay for Continual LLM Fine-Tuning

The paper proposes MSSR, a memory-aware adaptive replay framework that estimates sample-level memory strength to dynamically schedule rehearsal intervals, effectively mitigating catastrophic forgetting while maintaining fast adaptation in continual LLM fine-tuning.

Yiyang Lu, Yu He, Jianlong Chen, Hongyuan ZhaWed, 11 Ma🤖 cs.AI

MedMASLab: A Unified Orchestration Framework for Benchmarking Multimodal Medical Multi-Agent Systems

MedMASLab is a unified framework and benchmarking platform that addresses architectural fragmentation in medical multi-agent systems by introducing a standardized communication protocol, an automated zero-shot clinical reasoning evaluator, and an extensive multimodal benchmark spanning 473 diseases to reveal critical performance gaps in cross-specialty transitions.

Yunhang Qian, Xiaobin Hu, Jiaquan Yu, Siyang Xin, Xiaokun Chen, Jiangning Zhang, Peng-Tao Jiang, Jiawei Liu, Hongwei Bran LiWed, 11 Ma🤖 cs.AI

AI-Enabled Data-driven Intelligence for Spectrum Demand Estimation

This paper presents an AI-driven, data-driven framework that utilizes validated proxies from site license and crowdsourced data to accurately forecast spectrum demand across five major Canadian cities, thereby enabling more efficient dynamic spectrum planning and resource allocation for regulators and network operators.

Colin Brown, Mohamad Alkadamani, Halim YanikomerogluWed, 11 Ma🤖 cs.AI

Adaptive Clinical-Aware Latent Diffusion for Multimodal Brain Image Generation and Missing Modality Imputation

The paper introduces ACADiff, an adaptive clinical-aware latent diffusion framework that synthesizes missing multimodal brain imaging data (sMRI, FDG-PET, and AV45-PET) by integrating imaging observations with GPT-4o-encoded clinical metadata, achieving superior generation quality and robust diagnostic performance even when up to 80% of modalities are missing.

Rong Zhou, Houliang Zhou, Yao Su, Brian Y. Chen, Yu Zhang, Lifang He, Alzheimer's Disease Neuroimaging InitiativeWed, 11 Ma🤖 cs.AI

← Previous Next →