cs.LG papers | Gist.Science

SWE-MiniSandbox: Container-Free Reinforcement Learning for Building Software Engineering Agents

SWE-MiniSandbox is a lightweight, container-free framework that leverages kernel-level isolation and environment pre-caching to significantly reduce storage and setup overhead while maintaining performance comparable to traditional container-based pipelines for scaling reinforcement learning in software engineering agents.

Danlong Yuan, Wei Wu, Zhengren Wang, Xueliang Zhao, Huishuai Zhang, Dongyan Zhao | 2026-03-09 | 🤖 cs.AI

MiDAS: A Multimodal Data Acquisition System and Dataset for Robot-Assisted Minimally Invasive Surgery

This paper introduces MiDAS, an open-source, platform-agnostic system that enables non-invasive, time-synchronized multimodal data acquisition for robot-assisted minimally invasive surgery, validated by demonstrating that its external sensing approach achieves gesture recognition performance comparable to proprietary telemetry while releasing the first annotated dataset for hernia repair suturing.

Keshara Weerasinghe (MD), Seyed Hamid Reza Roodabeh (MD), Andrew Hawkins (MD), Zhaomeng Zhang, Zachary Schrader, Homa Alemzadeh | 2026-03-09 | 🤖 cs.LG

An Adaptive Model Selection Framework for Demand Forecasting under Horizon-Induced Degradation to Support Business Strategy and Operations

This paper introduces AHSIV, an adaptive framework that addresses horizon-induced model ranking instability in demand forecasting by integrating horizon-aware error metrics, structural demand classification, and multi-objective optimization to provide robust, operationally coherent model selection for heterogeneous business environments.

Adolfo González, Víctor Parada | 2026-03-09 | 🤖 cs.AI

GaiaFlow: Semantic-Guided Diffusion Tuning for Carbon-Frugal Search

GaiaFlow is a novel framework that achieves carbon-frugal search by integrating semantic-guided diffusion tuning, retrieval-guided Langevin dynamics, and adaptive efficiency protocols to balance high retrieval accuracy with significantly reduced environmental impact.

Rong Fu, Jia Yee Tan, Chunlei Meng, Shuo Yin, Xiaowen Ma, Wangyu Wu, Muge Qi, Guangzhen Yao, Zhaolu Kang, Zeli Su, Simon Fong | 2026-03-09 | 🤖 cs.LG

MolCrystalFlow: Molecular Crystal Structure Prediction via Flow Matching

MolCrystalFlow is a novel flow-based generative model that predicts molecular crystal structures by disentangling intramolecular complexity from intermolecular packing through rigid body embeddings and Riemannian manifold representations, thereby outperforming existing methods and enabling data-driven discovery of periodic molecular crystals.

Cheng Zeng, Harry W. Sullivan, Thomas Egg, Maya M. Martirossyan, Philipp Höllmer, Jirui Jin, Richard G. Hennig, Adrian Roitberg, Stefano Martiniani, Ellad B. Tadmor, Mingjie Liu | 2026-03-09 | 🔬 cond-mat.mtrl-sci

The Limits of Long-Context Reasoning in Automated Bug Fixing

This paper demonstrates that while agentic workflows improve bug-fixing performance by decomposing tasks into short-context steps, current large language models fail to effectively reason over genuinely long contexts (e.g., 64k tokens), revealing a significant gap between nominal context length and usable reasoning capacity.

Ravi Raju, Mengmeng Ji, Shubhangi Upasani, Bo Li, Urmish Thakker | 2026-03-09 | 🤖 cs.LG

FLoRG: Federated Fine-tuning with Low-rank Gram Matrices and Procrustes Alignment

The paper proposes FLoRG, a federated fine-tuning framework that utilizes single low-rank Gram matrix aggregation and Procrustes alignment to eliminate aggregation errors and decomposition drift, thereby achieving superior downstream accuracy and significantly reduced communication overhead compared to existing state-of-the-art methods.

Chuiyang Meng, Ming Tang, Vincent W. S. Wong | 2026-03-09 | 🤖 cs.AI

Conditionally Site-Independent Neural Evolution of Antibody Sequences

This paper introduces CoSiNE, a deep neural network-parameterized continuous-time Markov chain that bridges the gap between expressive deep learning and classical phylogenetic models to capture epistatic interactions in antibody evolution, thereby outperforming state-of-the-art language models in variant effect prediction and enabling efficient affinity optimization through a novel Guided Gillespie sampling scheme.

Stephen Zhewen Lu, Aakarsh Vermani, Kohei Sanno, Jiarui Lu, Frederick A Matsen, Milind Jagota, Yun S. Song | 2026-03-09 | 🤖 cs.LG

What Topological and Geometric Structure Do Biological Foundation Models Learn? Evidence from 141 Hypotheses

Through an autonomous AI-driven screening of 141 hypotheses, this study demonstrates that biological foundation models like scGPT and Geneformer learn genuine, shared geometric and topological structures in their internal representations that are biologically meaningful yet more localized to specific tissues like immune cells than previously assumed.

Ihor Kendiukhov | 2026-03-09 | 🤖 cs.LG

Exploratory Memory-Augmented LLM Agent via Hybrid On- and Off-Policy Optimization

The paper introduces EMPO$^2$, a hybrid reinforcement learning framework that integrates memory-augmented on- and off-policy optimization to overcome exploration bottlenecks in LLM agents, achieving significant performance gains on benchmark tasks and demonstrating superior adaptability to out-of-distribution scenarios without parameter updates.

Zeyuan Liu, Jeonghye Kim, Xufang Luo, Dongsheng Li, Yuqing Yang | 2026-03-09 | 🤖 cs.AI

Modality Collapse as Mismatched Decoding: Information-Theoretic Limits of Multimodal LLMs

This paper reframes the modality collapse observed in multimodal LLMs as a mismatched decoding problem, demonstrating through information-theoretic analysis and empirical validation that the accessibility of non-text information is fundamentally limited by the decoder's training objective and scoring rule rather than the encoder's architecture or alignment.

Jayadev Billa | 2026-03-09 | 🤖 cs.AI

Coverage-Aware Web Crawling for Domain-Specific Supplier Discovery via a Web--Knowledge--Web Pipeline

This paper proposes a Coverage-Aware Web Crawling framework utilizing a Web--Knowledge--Web pipeline and ecological species-richness estimators to iteratively discover and map under-represented SME suppliers in specialized sectors, achieving superior precision and efficiency compared to baseline methods in the semiconductor equipment manufacturing industry.

Yijiashun Qi, Yijiazhen Qi, Tanmay Wagh | 2026-03-09 | 🤖 cs.LG

Weight Updates as Activation Shifts: A Principled Framework for Steering

This paper establishes a principled framework linking activation steering to weight updates, demonstrating that targeting post-block outputs achieves near-full fine-tuning accuracy with minimal parameters and that jointly adapting both spaces surpasses the performance limits of either method alone.

Dyah Adila, John Cooper, Alexander Yun, Avi Trost, Frederic Sala | 2026-03-09 | 🤖 cs.LG

Reparameterized Tensor Ring Functional Decomposition for Multi-Dimensional Data Recovery

This paper proposes a reparameterized Tensor Ring functional decomposition that leverages Implicit Neural Representations and a structured basis combination to overcome the high-frequency modeling limitations of traditional methods, achieving superior performance in multi-dimensional data recovery tasks such as image inpainting and point cloud reconstruction.

Yangyang Xu, Junbo Ke, You-Wei Wen, Chao Wang | 2026-03-09 | 🤖 cs.AI

LMU-Based Sequential Learning and Posterior Ensemble Fusion for Cross-Domain Infant Cry Classification

This paper proposes a compact acoustic framework that combines multi-branch CNN feature extraction with an efficient Legendre Memory Unit (LMU) for temporal modeling and a calibrated posterior ensemble fusion strategy to achieve robust, real-time cross-domain infant cry classification despite limited annotations and strong domain shifts.

Niloofar Jazaeri, Hilmi R. Dajani, Marco Janeczek, Martin Bouchard | 2026-03-09 | 🤖 cs.LG

Whisper-RIR-Mega: A Paired Clean-Reverberant Speech Benchmark for ASR Robustness to Room Acoustics

This paper introduces Whisper-RIR-Mega, a benchmark dataset pairing clean LibriSpeech utterances with real room impulse responses to evaluate and demonstrate the performance degradation of various Whisper models under reverberant conditions, while providing open-source resources for reproducible research on robust ASR.

Mandip Goswami | 2026-03-09 | 🤖 cs.AI

Rigidity-Aware Geometric Pretraining for Protein Design and Conformational Ensembles

This paper introduces RigidSSL, a rigidity-aware self-supervised learning framework that pretrains on static and dynamic protein structures using a bi-directional flow matching objective to jointly optimize geometric understanding and conformational dynamics, thereby significantly improving protein designability, novelty, and the modeling of realistic conformational ensembles.

Zhanghan Ni, Yanjing Li, Zeju Qiu, Bernhard Schölkopf, Hongyu Guo, Weiyang Liu, Shengchao Liu | 2026-03-09 | 🤖 cs.AI

mlx-vis: GPU-Accelerated Dimensionality Reduction and Visualization on Apple Silicon

mlx-vis is a Python library built on Apple's MLX framework that delivers GPU-accelerated dimensionality reduction, k-nearest neighbor graph construction, and Metal-based circle-splatting rendering for rapid visualization and animation on Apple Silicon devices.

Han Xiao | 2026-03-09 | 🤖 cs.LG

Traversal-as-Policy: Log-Distilled Gated Behavior Trees as Externalized, Verifiable Policies for Safe, Robust, and Efficient Agents

This paper proposes "Traversal-as-Policy," a framework that distills sandboxed execution logs into verifiable Gated Behavior Trees to replace implicit LLM policies with explicit, state-conditioned macro traversals, thereby significantly improving success rates, eliminating safety violations, and reducing computational costs across diverse autonomous agent benchmarks.

Peiran Li, Jiashuo Sun, Fangzhou Lin, Shuo Xing, Tianfu Fu, Suofei Feng, Chaoqun Ni, Zhengzhong Tu | 2026-03-09 | 🤖 cs.AI

Information-Theoretic Privacy Control for Sequential Multi-Agent LLM Systems

This paper addresses the risk of amplified privacy leakage in sequential multi-agent LLM systems by formalizing compositional leakage through mutual information, deriving a theoretical bound on its propagation, and proposing a privacy-regularized training framework that enforces system-level privacy guarantees rather than relying on local agent constraints alone.

Sadia Asif, Mohammad Mohammadi Amiri | 2026-03-09 | 🤖 cs.LG
