A-3PO: Accelerating Asynchronous LLM Training with Staleness-aware Proximal Policy Approximation
The paper introduces A-3PO, a method that accelerates asynchronous LLM training by 1.8x. Rather than computing the proximal policy in Decoupled PPO with extra forward passes, A-3PO approximates it through simple interpolation, eliminating that cost while maintaining comparable performance.
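The summary does not spell out the interpolation, but the idea can be sketched as blending behavior-policy and current-policy log-probabilities with a staleness-dependent weight, so no separate proximal-policy forward pass is needed. The function names, the weighting form, and `max_staleness` below are illustrative assumptions, not details from the paper:

```python
import math

def approx_proximal_logprob(logp_behavior: float, logp_current: float,
                            staleness: int, max_staleness: int = 8) -> float:
    """Approximate the proximal policy's log-prob by interpolation.

    Assumed scheme: the weight alpha grows with staleness, so stale
    samples pull the estimate toward the behavior policy while fresh
    samples collapse to the current policy.
    """
    alpha = min(staleness / max_staleness, 1.0)
    return alpha * logp_behavior + (1.0 - alpha) * logp_current

def ppo_ratio(logp_current: float, logp_prox: float) -> float:
    # Importance ratio against the approximated proximal policy,
    # computed without an extra forward pass through a third model.
    return math.exp(logp_current - logp_prox)

# Fresh sample (staleness 0): the proxy equals the current policy, ratio = 1.
lp_fresh = approx_proximal_logprob(-2.0, -1.5, staleness=0)
# Maximally stale sample: the proxy equals the behavior policy.
lp_stale = approx_proximal_logprob(-2.0, -1.5, staleness=8)
print(lp_fresh, lp_stale, ppo_ratio(-1.5, lp_fresh))
```

Under this sketch, the clipping ratio in the PPO objective is formed against the interpolated proxy instead of a separately evaluated proximal policy.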