cs.AI papers | Gist.Science

Guiding Diffusion-based Reconstruction with Contrastive Signals for Balanced Visual Representation

This paper proposes Diffusion Contrastive Reconstruction (DCR), a method that injects contrastive signals derived from reconstructed images into the diffusion process to resolve gradient conflicts and jointly optimize both discriminative and detail-perceptive abilities, thereby overcoming the limitations of CLIP's visual encoder for balanced visual representation.

Boyu Han, Qianqian Xu, Shilong Bao + 4 more | 2026-03-06 | 💻 cs

Attention's Gravitational Field: A Power-Law Interpretation of Positional Correlation

This paper introduces the Attention Gravitational Field (AGF) concept, which decouples positional encodings from semantic embeddings to optimize Large Language Model architecture, demonstrating that positional correlations follow a power-law distribution consistent with Newton's Law of Universal Gravitation and yielding superior accuracy and interpretability.

Edward Zhang | 2026-03-06 | 💻 cs

Meta-D: Metadata-Aware Architectures for Brain Tumor Analysis and Missing-Modality Segmentation

The paper presents Meta-D, a metadata-aware architecture that leverages categorical scanner information to dynamically modulate feature extraction for improved 2D brain tumor detection and to serve as a robust anchor for cross-attention mechanisms in 3D missing-modality segmentation, achieving significant performance gains and parameter reduction.

SangHyuk Kim, Daniel Haehn, Sumientra Rampersad | 2026-03-06 | 💻 cs

EchoGuard: An Agentic Framework with Knowledge-Graph Memory for Detecting Manipulative Communication in Longitudinal Dialogue

EchoGuard is an agentic AI framework that utilizes a Knowledge Graph-based memory system to track longitudinal dialogue, detect psychologically grounded manipulative patterns, and guide users toward self-discovery through targeted Socratic prompts.

Ratna Kandala, Niva Manchanda, Akshata Kishore Moharir + 1 more | 2026-03-06 | 🤖 cs.AI

LLM-Grounded Explainability for Port Congestion Prediction via Temporal Graph Attention Networks

This paper introduces AIS-TGNN, a framework that combines Temporal Graph Attention Networks with structured Large Language Models to predict port congestion using AIS data while generating operationally interpretable, evidence-grounded natural language explanations that maintain high predictive accuracy and directional consistency.

Zhiming Xue, Yujue Wang | 2026-03-06 | 🤖 cs.AI

On the Strengths and Weaknesses of Data for Open-set Embodied Assistance

This paper investigates the generalization capabilities of a multimodal foundation model fine-tuned on diverse synthetic interactive data for the novel task of Open-Set Corrective Assistance, demonstrating that effective open-set assistive intelligence requires datasets encompassing multimodal grounding, defect inference, and exposure to varied scenarios.

Pradyumna Tambwekar, Andrew Silva, Deepak Gopinath + 3 more | 2026-03-06 | 🤖 cs.AI

VISA: Value Injection via Shielded Adaptation for Personalized LLM Alignment

The paper proposes VISA, a closed-loop framework that utilizes Group Relative Policy Optimization to inject fine-grained human values into Large Language Models while preserving semantic integrity and mitigating the alignment tax typically caused by standard fine-tuning.

Jiawei Chen, Tianzhuo Yang, Guoxi Zhang + 3 more | 2026-03-06 | 🤖 cs.AI

Multilevel Training for Kolmogorov Arnold Networks

This paper introduces a multilevel training framework for Kolmogorov-Arnold Networks (KANs) that leverages their structural equivalence to multichannel MLPs and the properties of spline basis functions to create a properly nested hierarchy of models, resulting in orders-of-magnitude improvements in training accuracy and speed, particularly for physics-informed neural networks.

Ben S. Southworth, Jonas A. Actor, Graham Harper + 1 more | 2026-03-06 | 🔢 math

SCoUT: Scalable Communication via Utility-Guided Temporal Grouping in Multi-Agent Reinforcement Learning

SCoUT is a scalable multi-agent reinforcement learning framework that improves coordination by dynamically grouping agents into latent clusters for efficient value estimation and employing counterfactual advantages to precisely guide decentralized communication decisions on when and whom to message.

Manav Vora, Gokul Puthumanaillam, Hiroyasu Tsukamoto + 1 more | 2026-03-06 | 🤖 cs.AI

Design Behaviour Codes (DBCs): A Taxonomy-Driven Layered Governance Benchmark for Large Language Models

This paper introduces the Dynamic Behavioral Constraint (DBC) benchmark, a model-agnostic, inference-time governance framework that demonstrates a 36.8% relative reduction in risk exposure and enhanced EU AI Act compliance across multiple LLM families compared to standard safety prompts, validated through a rigorous, taxonomy-driven red-teaming protocol.

G. Madan Mohan, Veena Kiran Nambiar, Kiranmayee Janardhan | 2026-03-06 | 🤖 cs.AI

An Approach to Simultaneous Acquisition of Real-Time MRI Video, EEG, and Surface EMG for Articulatory, Brain, and Muscle Activity During Speech Production

This paper presents a novel framework for the simultaneous acquisition of real-time MRI, EEG, and surface EMG to capture brain, muscle, and articulatory activity during speech, featuring a specialized artifact suppression pipeline to overcome technical challenges and enable unprecedented insights into speech neuroscience.

Jihwan Lee, Parsa Razmara, Kevin Huang + 16 more | 2026-03-06 | 🤖 cs.AI

On Multi-Step Theorem Prediction via Non-Parametric Structural Priors

This paper introduces a training-free, non-parametric approach to multi-step theorem prediction that overcomes the scalability limitations of vanilla in-context learning by leveraging Theorem Precedence Graphs to encode temporal dependencies and impose topological constraints, achieving state-of-the-art accuracy on the FormalGeo7k benchmark without gradient-based optimization.

Junbo Zhao, Ting Zhang, Can Li + 3 more | 2026-03-06 | 🤖 cs.AI

Causally Robust Reward Learning from Reason-Augmented Preference Feedback

This paper introduces ReCouPLe, a lightweight framework that leverages natural language rationales as causal guidance to train reward models that are robust to spurious correlations and capable of zero-shot transfer to novel tasks, significantly outperforming baselines in reward accuracy and downstream policy performance under distribution shifts.

Minjune Hwang, Yigit Korkmaz, Daniel Seita + 1 more | 2026-03-06 | 🤖 cs.AI

K-Gen: A Multimodal Language-Conditioned Approach for Interpretable Keypoint-Guided Trajectory Generation

K-Gen is an interpretable multimodal framework that leverages Multimodal Large Language Models to generate reasoning-guided keypoints from rasterized maps and text, which are then refined into realistic trajectories, outperforming existing baselines on autonomous driving benchmarks.

Mingxuan Mu, Guo Yang, Lei Chen + 2 more | 2026-03-06 | 🤖 cs.AI

SEA-TS: Self-Evolving Agent for Autonomous Code Generation of Time Series Forecasting Algorithms

SEA-TS is a self-evolving agent framework that autonomously generates and optimizes time series forecasting code through innovations like Metric-Advantage MCTS and global steerable reasoning, achieving superior accuracy and discovering novel algorithmic patterns that outperform state-of-the-art methods and human-engineered baselines.

Longkun Xu, Xiaochun Zhang, Qiantu Tuo + 1 more | 2026-03-06 | 🤖 cs.AI

Interpretable Pre-Release Baseball Pitch Type Anticipation from Broadcast 3D Kinematics

This paper presents a scalable, interpretable framework that achieves 80.4% accuracy in classifying eight professional baseball pitch types using only monocular 3D body kinematics, revealing that upper-body mechanics—particularly wrist position and trunk tilt—are the primary predictors while establishing an empirical ceiling for grip-based distinctions.

Jerrin Bright, Michelle Lu, John Zelek | 2026-03-06 | 🤖 cs.AI

DeformTrace: A Deformable State Space Model with Relay Tokens for Temporal Forgery Localization

This paper proposes DeformTrace, a hybrid architecture combining State Space Models with deformable dynamics and relay tokens to achieve state-of-the-art temporal forgery localization by addressing challenges in boundary ambiguity, sparse forgeries, and long-range modeling.

Xiaodong Zhu, Suting Wang, Yuanming Zheng + 5 more | 2026-03-06 | 🤖 cs.AI

Bounded State in an Infinite Horizon: Proactive Hierarchical Memory for Ad-Hoc Recall over Streaming Dialogues

To address the fidelity-efficiency dilemma in infinite-horizon dialogue streams where existing memory mechanisms fail to support ad-hoc recall, this paper introduces STEM-Bench, a new benchmark for evaluation, and ProStream, a proactive hierarchical memory framework that achieves bounded-state inference with high reasoning fidelity through multi-granular distillation and adaptive spatiotemporal optimization.

Bingbing Wang, Jing Li, Ruifeng Xu | 2026-03-06 | 🤖 cs.AI

FedAFD: Multimodal Federated Learning via Adversarial Fusion and Distillation

FedAFD is a unified multimodal federated learning framework that enhances both client and server performance by employing a bi-level adversarial alignment and granularity-aware fusion for personalized local learning, alongside a similarity-guided ensemble distillation mechanism to effectively handle model heterogeneity and modality discrepancies.

Min Tan, Junchao Ma, Yinfu Feng + 6 more | 2026-03-06 | 🤖 cs.AI

Free Lunch for Pass@$k$? Low Cost Diverse Sampling for Diffusion Language Models

This paper proposes a training-free, low-cost intervention for Diffusion Language Models that sequentially repels intermediate samples in a batch to enhance generative diversity and improve Pass@$k$ performance on reasoning tasks without significant computational overhead.

Sean Lamont, Christian Walder, Paul Montague + 2 more | 2026-03-06 | 🤖 cs.AI
