cs.LG papers | Gist.Science

Escaping Model Collapse via Synthetic Data Verification: Near-term Improvements and Long-term Convergence

This paper demonstrates that injecting external verification into synthetic data retraining can prevent model collapse and yield near-term improvements, though theoretical analysis and experiments across linear regression, VAEs, and LLMs show that long-term performance ultimately converges to the verifier's knowledge center and may plateau or decline if the verifier is imperfect.

Bingji Yi, Qiyuan Liu, Yuwei Cheng, Haifeng Xu2026-03-09🤖 cs.LG

Mixed Monotonicity Reachability Analysis of Neural ODE: A Trade-Off Between Tightness and Efficiency

This paper proposes a novel, computationally efficient interval-based reachability analysis method for Neural ODEs that leverages continuous-time mixed monotonicity to trade off tightness for scalability, making it particularly suitable for high-dimensional and safety-critical real-time applications.

Abdelrahman Sayed Sayed, Pierre-Jean Meyer, Mohamed Ghazel2026-03-09🤖 cs.LG

Real-Time Learning of Predictive Dynamic Obstacle Models for Robotic Motion Planning

This paper presents a real-time online framework that utilizes modified sliding-window Hankel Dynamic Mode Decomposition with singular-value hard thresholding and Cadzow projection to denoise partial measurements and construct predictive models for dynamic obstacle motion, enabling stable, variance-aware forecasting suitable for robotic motion planning.

Stella Kombo, Masih Haseli, Skylar X. Wei, Joel W. Burdick2026-03-09🤖 cs.LG

KLASS: KL-Guided Fast Inference in Masked Diffusion Models

The paper introduces KLASS, a training-free, KL-divergence-guided sampling method that significantly accelerates inference in masked diffusion models by unmasking multiple stable tokens per iteration, achieving state-of-the-art speed and performance across text, image, and molecular generation tasks.

Seo Hyun Kim, Sunwoo Hong, Hojung Jung, Youngrok Park, Se-Young Yun2026-03-09🤖 cs.LG

CADM: Cluster-customized Adaptive Distance Metric for Categorical Data Clustering

The paper proposes CADM, a cluster-customized adaptive distance metric that dynamically adjusts distance measurements based on the unique attribute distributions within each cluster to improve categorical and mixed data clustering performance.

Taixi Chen, Yiu-ming Cheung, Yiqun Zhang2026-03-09🤖 cs.LG

FireScope: Wildfire Risk Prediction with a Chain-of-Thought Oracle

The paper introduces FireScope, a novel VLM-based framework and accompanying FireScope-Bench dataset that leverage chain-of-thought reasoning to significantly improve the generalization, interpretability, and accuracy of cross-continental wildfire risk prediction by integrating visual, climatic, and geographic factors.

Mario Markov (INSAIT, Sofia University "St. Kliment Ohridski"), Stefan Maria Ailuro (INSAIT, Sofia University "St. Kliment Ohridski"), Luc Van Gool (INSAIT, Sofia University "St. Kliment Ohridski"), Konrad Schindler (ETH Zurich), Danda Pani Paudel (INSAIT, Sofia University "St. Kliment Ohridski")2026-03-09🤖 cs.LG

EgoCogNav: Cognition-aware Human Egocentric Navigation

The paper introduces EgoCogNav, a multimodal framework that predicts perceived path uncertainty to jointly forecast egocentric trajectories and head motion, supported by the new Cognition-aware Egocentric Navigation (CEN) dataset to better model human cognitive factors in navigation.

Zhiwen Qiu, Ziang Liu, Wenqian Niu, Tapomayukh Bhattacharjee, Saleh Kalantari2026-03-09🤖 cs.LG

SPINE: Token-Selective Test-Time Reinforcement Learning with Entropy-Band Regularization

The paper proposes SPINE, a token-selective test-time reinforcement learning framework that improves reasoning model performance by updating only high-entropy decision-critical tokens with entropy-band regularization, thereby preventing response collapse and enhancing stability without requiring external labels or reward models.

Jianghao Wu, Yasmeen George, Jin Ye, Yicheng Wu, Daniel F. Schmidt, Jianfei Cai2026-03-09🤖 cs.LG

DAISI: Data Assimilation with Inverse Sampling using Stochastic Interpolants

The paper introduces DAISI, a scalable data assimilation algorithm that leverages flow-based generative models with a novel inverse-sampling step to integrate forecast information and guide conditional sampling, enabling accurate filtering in complex, nonlinear systems where traditional Gaussian-based methods fail.

Martin Andrae, Erik Larsson, So Takao, Tomas Landelius, Fredrik Lindsten2026-03-09🤖 cs.LG

Diffusion Fine-Tuning via Reparameterized Policy Gradient of the Soft Q-Function

This paper introduces Soft Q-based Diffusion Finetuning (SQDF), a novel KL-regularized reinforcement learning method that employs a reparameterized policy gradient of a training-free soft Q-function, enhanced by discount factors, consistency models, and off-policy replay buffers, to effectively align diffusion models with downstream objectives while mitigating reward over-optimization and preserving sample diversity.

Hyeongyu Kang, Jaewoo Lee, Woocheol Shin, Kiyoung Om, Jinkyoo Park2026-03-09🤖 cs.AI

Whatever Remains Must Be True: Filtering Drives Reasoning in LLMs, Shaping Diversity

This paper proposes a novel training framework that leverages the $\alpha$ -divergence family to explicitly filter incorrect answers and control the precision-diversity trade-off, thereby overcoming the diversity loss inherent in standard Reinforcement Learning and achieving state-of-the-art performance on the Lean theorem-proving benchmark.

Germán Kruszewski, Pierre Erbacher, Jos Rozen, Marc Dymetman2026-03-09🤖 cs.AI

A-3PO: Accelerating Asynchronous LLM Training with Staleness-aware Proximal Policy Approximation

The paper introduces A-3PO, a method that accelerates asynchronous LLM training by 1.8x by approximating the computationally expensive proximal policy in Decoupled PPO through simple interpolation, thereby eliminating the need for extra forward passes while maintaining comparable performance.

Xiaocan Li, Shiliang Wu, Zheng Shen2026-03-09🤖 cs.AI

DFIR-DETR: Frequency-Domain Iterative Refinement and Dynamic Feature Aggregation for Small Object Detection

DFIR-DETR is a transformer-based small object detector that addresses key limitations in standard architectures by introducing Dynamic Content-Feature Aggregation for adaptive attention, a norm-preserving Dynamic Feature Pyramid Network for detail recovery, and a Frequency-domain Iterative Refinement module to preserve high-frequency boundaries, achieving state-of-the-art performance on NEU-DET and VisDrone benchmarks with high efficiency.

Bo Gao, Jingcheng Tong, Xingsheng Chen, Han Yu, Zichen Li2026-03-09🤖 cs.LG

Two-dimensional RMSD projections for reaction path visualization and validation

This paper introduces a novel two-dimensional visualization framework that maps reaction trajectories onto a permutation-corrected RMSD plane with gradient-enhanced Gaussian Process energy interpolation, enabling more effective comparison and validation of optimization histories across different computational methods for complex chemical reactions.

Rohit Goswami2026-03-09🔬 cond-mat.mtrl-sci

Uncertainty-Aware Subset Selection for Robust Visual Explainability under Distribution Shifts

This paper addresses the degradation of existing subset-based visual explanation methods under out-of-distribution conditions by introducing a training-free framework that integrates layer-wise uncertainty estimation with submodular optimization to generate robust, diverse, and informative attributions.

Madhav Gupta, Vishak Prasad C, Ganesh Ramakrishnan2026-03-09🤖 cs.LG

Data-Driven Global Sensitivity Analysis for Engineering Design Based on Individual Conditional Expectations

This paper proposes a novel global sensitivity analysis method based on Individual Conditional Expectation (ICE) curves that overcomes the limitations of traditional Partial Dependence Plots (PDPs) in capturing input interactions, offering a mathematically proven, more informative metric for explainable machine learning in engineering design.

Pramudita Satria Palar, Paul Saves, Rommel G. Regis, Koji Shimoyama, Shigeru Obayashi, Nicolas Verstaevel, Joseph Morlier2026-03-09🤖 cs.AI

A Novel Patch-Based TDA Approach for Computed Tomography Imaging

This paper introduces a novel patch-based Topological Data Analysis approach for 3D CT imaging that significantly outperforms traditional 3D cubical complex methods and radiomic features in both classification accuracy and computational efficiency, accompanied by the release of a Python package to facilitate its adoption.

Dashti A. Ali, Aras T. Asaad, Jacob J. Peoples, Mohammad Hamghalam, Natalie Gangai, Richard K. G. Do, Alice C. Wei, Amber L. Simpson2026-03-09🤖 cs.LG

Understanding and Improving Hyperbolic Deep Reinforcement Learning

This paper addresses the optimization challenges in hyperbolic deep reinforcement learning by identifying the destabilizing effects of large-norm embeddings and introducing Hyper++, a new agent that employs feature regularization, categorical value loss, and improved layer formulations to achieve stable, faster, and superior performance compared to existing Euclidean and hyperbolic baselines.

Timo Klein, Thomas Lang, Andrii Shkabrii, Alexander Sturm, Kevin Sidak, Lukas Miklautz, Claudia Plant, Yllka Velaj, Sebastian Tschiatschek2026-03-09🤖 cs.AI

CARE What Fails: Contrastive Anchored-REflection for Verifiable Multimodal

CARE (Contrastive Anchored REflection) is a failure-centric post-training framework for multimodal reasoning that enhances Group-relative Reinforcement Learning with Verifiable Rewards (RLVR) by leveraging an anchored-contrastive objective and Reflection-Guided Resampling to transform erroneous rollouts into effective supervision signals, thereby significantly improving accuracy and training stability on visual-reasoning benchmarks.

Yongxin Wang, Zhicheng Yang, Meng Cao, Mingfei Han, Haokun Lin, Yingying Zhu, Xiaojun Chang, Xiaodan Liang2026-03-09🤖 cs.AI

LLMTM: Benchmarking and Optimizing LLMs for Temporal Motif Analysis in Dynamic Graphs

This paper introduces LLMTM, a comprehensive benchmark for evaluating Large Language Models on temporal motif analysis in dynamic graphs, and proposes a cost-effective, structure-aware dispatcher that intelligently balances high accuracy and computational expense by routing queries between standard prompting and a specialized tool-augmented agent.

Bing Hao, Minglai Shao, Zengyi Wo, Yunlong Chu, Yuhang Liu, Ruijie Wang2026-03-09🤖 cs.AI

← Previous Next →