Thin Keys, Full Values: Reducing KV Cache via Low-Dimensional Attention Selection
This paper proposes reducing KV cache memory usage in transformers by using low-dimensional keys for attention selection while keeping full-dimensional values for semantic transfer. The strategy is validated across multiple models and datasets, achieving up to 75% cache savings with minimal performance degradation.
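To make the key/value asymmetry concrete, here is a minimal single-head sketch of the idea in PyTorch: attention scores are computed from keys projected to a small dimension, while the values (and the attention output) keep the full model dimension. The names `ThinKeyAttention` and `d_select` are illustrative assumptions, not the paper's API, and details such as causal masking, multi-head splitting, and the actual cache-management code are omitted.

```python
# Sketch only: ThinKeyAttention / d_select are hypothetical names, not from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ThinKeyAttention(nn.Module):
    """Single-head attention with low-dimensional keys and full-dimensional values.

    Queries and keys are projected to a small dimension d_select, so the cached
    keys are cheap; values keep the full width d_model, so the attention output
    retains full semantic capacity.
    """
    def __init__(self, d_model: int, d_select: int):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_select, bias=False)  # thin query
        self.k_proj = nn.Linear(d_model, d_select, bias=False)  # thin key (cached)
        self.v_proj = nn.Linear(d_model, d_model, bias=False)   # full value (cached)
        self.out_proj = nn.Linear(d_model, d_model, bias=False)
        self.scale = d_select ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model)
        q = self.q_proj(x)   # (B, T, d_select)
        k = self.k_proj(x)   # (B, T, d_select) -- small per-token cache entry
        v = self.v_proj(x)   # (B, T, d_model)  -- full-width values
        scores = (q @ k.transpose(-2, -1)) * self.scale  # selection in low dimension
        attn = F.softmax(scores, dim=-1)
        return self.out_proj(attn @ v)  # semantic transfer uses full values

# Usage: with d_select = d_model // 4, the key portion of the per-token cache
# shrinks to a quarter of its full-dimensional size; values stay full width.
x = torch.randn(2, 16, 256)
layer = ThinKeyAttention(d_model=256, d_select=64)
print(layer(x).shape)  # torch.Size([2, 16, 256])
```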