cs.LG papers | Gist.Science

CARE What Fails: Contrastive Anchored-REflection for Verifiable Multimodal

CARE (Contrastive Anchored REflection) is a failure-centric post-training framework for multimodal reasoning that enhances Group-relative Reinforcement Learning with Verifiable Rewards (RLVR) by leveraging an anchored-contrastive objective and Reflection-Guided Resampling to transform erroneous rollouts into effective supervision signals, thereby significantly improving accuracy and training stability on visual-reasoning benchmarks.

Yongxin Wang, Zhicheng Yang, Meng Cao, Mingfei Han, Haokun Lin, Yingying Zhu, Xiaojun Chang, Xiaodan Liang | 2026-03-09 | 🤖 cs.AI
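
As a rough illustration of the "group-relative" part of the CARE entry above, the sketch below normalizes verifiable rewards within one group of rollouts. It shows the common mean/standard-deviation normalization used in group-relative RL, not CARE's anchored-contrastive objective or its resampling step, which this listing does not specify. The function name `group_relative_advantages` and the `eps` parameter are assumptions made for the example.

```python
import statistics

def group_relative_advantages(rewards, eps=1e-8):
    """Hypothetical helper: center and scale rewards within one rollout group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std over the group
    # eps (an assumed constant) avoids division by zero when all rewards match.
    return [(r - mean) / (std + eps) for r in rewards]

# Two correct and two incorrect rollouts for the same prompt.
mixed = group_relative_advantages([1.0, 0.0, 0.0, 1.0])

# Every rollout in the group failed the verifier.
all_failed = group_relative_advantages([0.0, 0.0, 0.0, 0.0])
```

Under this normalization, a group in which every rollout fails yields all-zero advantages, so such a group contributes no learning signal on its own.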

LLMTM: Benchmarking and Optimizing LLMs for Temporal Motif Analysis in Dynamic Graphs

This paper introduces LLMTM, a comprehensive benchmark for evaluating Large Language Models on temporal motif analysis in dynamic graphs, and proposes a cost-effective, structure-aware dispatcher that trades off accuracy against computational cost by routing queries between standard prompting and a specialized tool-augmented agent.

Bing Hao, Minglai Shao, Zengyi Wo, Yunlong Chu, Yuhang Liu, Ruijie Wang | 2026-03-09 | 🤖 cs.AI

Bayesian Monocular Depth Refinement via Neural Radiance Fields

The paper proposes MDENeRF, an iterative Bayesian framework that refines smooth monocular depth estimates by fusing them with high-frequency geometric details and uncertainty derived from Neural Radiance Fields, thereby enhancing scene understanding for applications like autonomous navigation.

Arun Muthukkumar | 2026-03-09 | 🤖 cs.LG

Purification Before Fusion: Toward Mask-Free Speech Enhancement for Robust Audio-Visual Speech Recognition

This paper proposes a novel end-to-end audio-visual speech recognition framework that integrates speech enhancement via a Conformer-based bottleneck fusion module to implicitly refine noisy audio features without explicit mask generation, thereby preserving semantic integrity and outperforming existing mask-based methods on the LRS3 benchmark under noisy conditions.

Linzhi Wu, Xingyu Zhang, Hao Yuan, Yakun Zhang, Changyan Zheng, Liang Xie, Tiejun Liu, Erwei Yin | 2026-03-09 | 🤖 cs.AI

Beyond Mapping: Domain-Invariant Representations via Spectral Embedding of Optimal Transport Plans

This paper proposes a novel domain adaptation method that derives domain-invariant representations by interpreting smoothed optimal transport plans as bipartite graph adjacency matrices and applying spectral embedding, demonstrating strong performance across acoustic and electrical defect detection tasks while mitigating the sensitivity of traditional Monge map approximations to regularization and hyperparameters.

Abdel Djalil Sad Saoud, Fred Maurice Ngolè Mboula, Hanane Slimani | 2026-03-09 | 🤖 cs.LG

Laser interferometry as a robust neuromorphic platform for machine learning

This paper presents a robust neuromorphic platform for machine learning that implements optical neural networks using only linear optical resources and coherent states, achieving necessary nonlinearity through phase-shift encoding to enable straightforward experimental in situ training and inference while demonstrating high resilience to photon losses.

Amanuel Anteneh, Kyungeun Kim, J. M. Schwarz, Israel Klich, Olivier Pfister | 2026-03-09 | 🔬 physics.optics

Neural Signals Generate Clinical Notes in the Wild

This paper introduces CELM, the first foundation model for clinical EEG-to-language generation that leverages a large-scale dataset of 9,922 reports and 11,000 hours of recordings to achieve significant improvements in summarizing long-term EEG data and generating comprehensive clinical reports.

Jathurshan Pradeepkumar, Zheng Chen, Jimeng Sun | 2026-03-09 | 🤖 cs.AI

Online unsupervised Hebbian learning in deep photonic neuromorphic networks

This paper presents and experimentally demonstrates a purely photonic deep neuromorphic network that achieves 100% accuracy on a letter recognition task by utilizing a local optical feedback mechanism with non-volatile phase-change material synapses to enable online, unsupervised Hebbian learning without inefficient optical-electrical-optical conversions.

Xi Li, Disha Biswas, Peng Zhou, Wesley H. Brigner, Anna Capuano, Joseph S. Friedman, Qing Gu | 2026-03-09 | 🔬 physics.optics

ZK-HybridFL: Zero-Knowledge Proof-Enhanced Hybrid Ledger for Federated Learning

ZK-HybridFL is a secure, scalable decentralized federated learning framework that integrates a DAG ledger with zero-knowledge proofs and sidechains to enable privacy-preserving model validation, robust adversarial detection, and efficient on-chain verification while outperforming existing solutions in convergence speed, accuracy, and latency.

Amirhossein Taherpour, Xiaodong Wang | 2026-03-09 | 🤖 cs.LG

EDIS: Diagnosing LLM Reasoning via Entropy Dynamics

This paper introduces the Entropy Dynamics Instability Score (EDIS), a metric that leverages the temporal evolution of token-level entropy to diagnose and improve LLM reasoning by identifying characteristic instability patterns associated with erroneous solutions.

Chenghua Zhu, Siyan Wu, Xiangkang Zeng, Zishan Xu, Zhaolu Kang, Yifu Guo, Yuquan Lu, Junduan Huang, Guojing Zhou | 2026-03-09 | 🤖 cs.LG
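
To make "token-level entropy" in the EDIS entry above concrete, here is a minimal sketch that computes the Shannon entropy of the next-token distribution at each decoding step. It only produces the raw entropy trace. The EDIS score itself is not defined in this listing, so no instability score is computed. The helper names and the toy logits are assumptions made for the example.

```python
import math

def token_entropy(logits):
    """Shannon entropy (in nats) of softmax(logits) for one decoding step."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def entropy_trace(step_logits):
    """Hypothetical helper: entropy at every step of one generated sequence."""
    return [token_entropy(logits) for logits in step_logits]

# Toy logits over a 4-token vocabulary for three decoding steps.
trace = entropy_trace([
    [0.0, 0.0, 0.0, 0.0],    # uniform: maximal uncertainty, log(4) nats
    [5.0, 0.0, 0.0, 0.0],    # one dominant token: low entropy
    [1.0, 1.0, -1.0, -1.0],  # two plausible tokens: intermediate entropy
])
```

A metric built on entropy dynamics would then look at how such a trace evolves across steps rather than at any single value.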

Restoring Exploration after Post-Training: Latent Exploration Decoding for Large Reasoning Models

This paper introduces Latent Exploration Decoding (LED), a training-free decoding strategy that leverages high-entropy intermediate layer posteriors to counteract exploration collapse in post-trained Large Reasoning Models, thereby significantly improving accuracy across multiple benchmarks.

Wenhui Tan, Fiorenzo Parascandolo, Enver Sangineto, Jianzhong Ju, Zhenbo Luo, Qian Cao, Rita Cucchiara, Ruihua Song, Jian Luan | 2026-03-09 | 🤖 cs.LG

Stress-Testing Alignment Audits With Prompt-Level Strategic Deception

This paper introduces an automatic red-team pipeline that successfully stress-tests alignment audits by generating strategic system prompts capable of deceiving both black-box and white-box methods into making confident, incorrect assessments of misaligned models, thereby revealing the first documented evidence of activation-based strategic deception.

Oliver Daniels, Perusha Moodley, Benjamin M. Marlin, David Lindner | 2026-03-09 | 🤖 cs.LG

Latent Poincaré Shaping for Agentic Reinforcement Learning

The paper introduces LaPha, a method that trains AlphaZero-like LLM agents in a hyperbolic Poincaré latent space to leverage negative curvature for efficient search and dense process rewards, significantly boosting mathematical reasoning performance on benchmarks like MATH-500 and AIME.

Hanchen Xia, Baoyou Chen, Zelin Zang, Yutang Ge, Guojiang Zhao, Siyu Zhu | 2026-03-09 | 🤖 cs.LG
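
For readers unfamiliar with the Poincaré latent space mentioned in the LaPha entry above, the sketch below implements the standard geodesic distance on the Poincaré ball. It illustrates the geometry only. How LaPha uses this space for search or process rewards is not described in this listing, and the function name is an assumption made for the example.

```python
import math

def poincare_distance(u, v):
    """Geodesic distance between two points inside the open unit ball."""
    def sq_norm(x):
        return sum(c * c for c in x)

    if sq_norm(u) >= 1.0 or sq_norm(v) >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    diff = sq_norm([a - b for a, b in zip(u, v)])
    denom = (1.0 - sq_norm(u)) * (1.0 - sq_norm(v))
    # d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    return math.acosh(1.0 + 2.0 * diff / denom)

# Same Euclidean gap (0.1), very different hyperbolic distances.
near_origin = poincare_distance([0.0, 0.0], [0.1, 0.0])
near_boundary = poincare_distance([0.85, 0.0], [0.95, 0.0])
```

Distances grow rapidly toward the boundary of the ball, which is the property that makes negatively curved spaces a natural fit for tree-like structures.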

Validating Interpretability in siRNA Efficacy Prediction: A Perturbation-Based, Dataset-Aware Protocol

This paper introduces a perturbation-based validation protocol to ensure the faithfulness of saliency maps in siRNA efficacy prediction, revealing critical failure modes across datasets and proposing a biology-informed regularizer to enhance the reliability of explanation-guided therapeutic design.

Zahra Khodagholi, Niloofar Yousefi | 2026-03-09 | 🤖 cs.LG

Towards Autonomous Mathematics Research

This paper introduces Aletheia, an autonomous AI research agent powered by advanced reasoning models and tool use that successfully generates, verifies, and revises mathematical proofs from Olympiad problems to PhD-level research, achieving milestones such as fully AI-generated papers and the autonomous solution of open problems while proposing new frameworks for quantifying AI autonomy and transparency.

Tony Feng, Trieu H. Trinh, Garrett Bingham, Dawsen Hwang, Yuri Chervonyi, Junehyuk Jung, Joonkyung Lee, Carlo Pagano, Sang-hyun Kim, Federico Pasqualotto, Sergei Gukov, Jonathan N. Lee, Junsu Kim, Kaiying Hou, Golnaz Ghiasi, Yi Tay, YaGuang Li, Chenkai Kuang, Yuan Liu, Hanzhao Lin, Evan Zheran Liu, Nigamaa Nayakanti, Xiaomeng Yang, Heng-Tze Cheng, Demis Hassabis, Koray Kavukcuoglu, Quoc V. Le, Thang Luong | 2026-03-09 | 🤖 cs.AI

Stochastic Parroting in Temporal Attention -- Regulating the Diagonal Sink

This paper investigates the "diagonal sink" phenomenon in temporal attention mechanisms, where information degeneration causes a bias toward initial tokens, and proposes theoretical sensitivity bounds alongside effective regularization methods to mitigate this issue.

Victoria Hankemeier, Malte Schilling | 2026-03-09 | 🤖 cs.LG

SWE-MiniSandbox: Container-Free Reinforcement Learning for Building Software Engineering Agents

SWE-MiniSandbox is a lightweight, container-free framework that leverages kernel-level isolation and environment pre-caching to significantly reduce storage and setup overhead while maintaining performance comparable to traditional container-based pipelines for scaling reinforcement learning in software engineering agents.

Danlong Yuan, Wei Wu, Zhengren Wang, Xueliang Zhao, Huishuai Zhang, Dongyan Zhao | 2026-03-09 | 🤖 cs.AI

MiDAS: A Multimodal Data Acquisition System and Dataset for Robot-Assisted Minimally Invasive Surgery

This paper introduces MiDAS, an open-source, platform-agnostic system that enables non-invasive, time-synchronized multimodal data acquisition for robot-assisted minimally invasive surgery, validated by demonstrating that its external sensing approach achieves gesture recognition performance comparable to proprietary telemetry while releasing the first annotated dataset for hernia repair suturing.

Keshara Weerasinghe (MD), Seyed Hamid Reza Roodabeh (MD), Andrew Hawkins (MD), Zhaomeng Zhang, Zachary Schrader, Homa Alemzadeh | 2026-03-09 | 🤖 cs.LG

An Adaptive Model Selection Framework for Demand Forecasting under Horizon-Induced Degradation to Support Business Strategy and Operations

This paper introduces AHSIV, an adaptive framework that addresses horizon-induced model ranking instability in demand forecasting by integrating horizon-aware error metrics, structural demand classification, and multi-objective optimization to provide robust, operationally coherent model selection for heterogeneous business environments.

Adolfo González, Víctor Parada | 2026-03-09 | 🤖 cs.AI

GaiaFlow: Semantic-Guided Diffusion Tuning for Carbon-Frugal Search

GaiaFlow is a novel framework that achieves carbon-frugal search by integrating semantic-guided diffusion tuning, retrieval-guided Langevin dynamics, and adaptive efficiency protocols to balance high retrieval accuracy with significantly reduced environmental impact.

Rong Fu, Jia Yee Tan, Chunlei Meng, Shuo Yin, Xiaowen Ma, Wangyu Wu, Muge Qi, Guangzhen Yao, Zhaolu Kang, Zeli Su, Simon Fong | 2026-03-09 | 🤖 cs.LG
