cs.LG papers | Gist.Science

Echo2ECG: Enhancing ECG Representations with Cardiac Morphology from Multi-View Echos

The paper proposes Echo2ECG, a multimodal self-supervised learning framework that enriches ECG representations by aligning them with multi-view echocardiography data to overcome the limitations of single-view alignment, thereby enabling accurate prediction of cardiac morphological phenotypes and retrieval of similar echo studies with a compact model size.

Michelle Espranita Liman, Özgün Turgut, Alexander Müller, Eimo Martens, Daniel Rueckert, Philip Müller | 2026-03-10 | cs.LG

Oracle-Guided Soft Shielding for Safe Move Prediction in Chess

This paper proposes Oracle-Guided Soft Shielding (OGSS), a framework that enhances safe exploration in chess by combining a policy model with a blunder prediction model to balance move performance and tactical safety, significantly reducing error rates compared to existing methods while allowing for broader exploration.

Prajit T Rajendran, Fabio Arnez, Huascar Espinoza, Agnes Delaborde, Chokri Mraidha | 2026-03-10 | cs.LG

Breaking the Bias Barrier in Concave Multi-Objective Reinforcement Learning

This paper addresses the intrinsic gradient bias in concave multi-objective reinforcement learning caused by nonlinear scalarization, demonstrating that existing methods suffer suboptimal sample complexity while proposing a Natural Policy Gradient algorithm with multi-level Monte Carlo estimation (or vanilla NPG under second-order smoothness) to achieve the optimal $\widetilde{\mathcal{O}}(\epsilon^{-2})$ sample complexity.

Swetha Ganesh, Vaneet Aggarwal | 2026-03-10 | cs.LG

Towards Effective and Efficient Graph Alignment without Supervision

This paper introduces GlobAlign and its efficient variant GlobAlign-E, which use a novel "global representation and alignment" paradigm with global attention and hierarchical optimal transport to achieve state-of-the-art accuracy and significantly improved efficiency in unsupervised graph alignment.

Songyang Chen, Youfang Lin, Yu Liu, Shuai Zheng, Lei Zou | 2026-03-10 | cs.LG

The Neural Compass: Probabilistic Relative Feature Fields for Robotic Search

This paper introduces ProReFF, a feature field model that learns relative object co-occurrence distributions from unlabeled observations to guide robotic search agents, achieving 20% higher efficiency than strong baselines and up to 80% of human performance in the Matterport3D simulator.

Gabriele Somaschini, Adrian Röfer, Abhinav Valada | 2026-03-10 | cs.LG

Interactive World Simulator for Robot Policy Training and Evaluation

This paper presents the Interactive World Simulator, a fast and physically consistent framework leveraging consistency models to generate high-fidelity long-horizon video predictions that enable scalable robot policy training and reliable real-world evaluation using solely simulated data.

Yixuan Wang, Rhythm Syed, Fangyu Wu, Mengchao Zhang, Aykut Onol, Jose Barreiros, Hooshang Nayyeri, Tony Dear, Huan Zhang, Yunzhu Li | 2026-03-10 | cs.LG

Generative Adversarial Regression (GAR): Learning Conditional Risk Scenarios

This paper proposes Generative Adversarial Regression (GAR), a minimax framework that learns conditional risk scenarios by training generators to align their policy-induced risk with real data across a broad class of policies, thereby outperforming existing baselines in preserving downstream risk metrics like Value-at-Risk and Expected Shortfall.

Saeed Asadi, Jonathan Yu-Meng Li | 2026-03-10 | cs.LG
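
For readers unfamiliar with the two risk metrics named in the summary above, here is a minimal sketch of their standard empirical estimators on a sample of losses. It is not the paper's method: the function name `var_es`, the `alpha` parameter, and the convention that larger values mean larger losses are all assumptions made for illustration.

```python
import numpy as np

def var_es(losses, alpha=0.95):
    """Empirical Value-at-Risk and Expected Shortfall of a loss sample.

    Assumes larger values are worse (losses, not returns).
    """
    losses = np.asarray(losses, dtype=float)
    # VaR: the alpha-quantile of the loss distribution.
    var = np.quantile(losses, alpha)
    # ES: the average loss in the tail at or beyond the VaR level.
    es = losses[losses >= var].mean()
    return var, es

# Hypothetical usage on a toy sample of losses 1, 2, ..., 100.
var, es = var_es(np.arange(1, 101), alpha=0.95)
```

Expected Shortfall is never smaller than Value-at-Risk at the same level, because it averages only the losses at or beyond that quantile.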

Impact of Connectivity on Laplacian Representations in Reinforcement Learning

This paper establishes theoretical bounds on the approximation error of Laplacian-based state representations in reinforcement learning, demonstrating how the error scales with the algebraic connectivity of the state graph and providing a comprehensive error decomposition that accounts for both representation learning and eigenvector estimation under general non-uniform policies.

Tommaso Giorgi, Pierriccardo Olivieri, Keyue Jiang, Laura Toni, Matteo Papini | 2026-03-10 | cs.LG
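
The algebraic connectivity mentioned in the summary above is the second-smallest eigenvalue of a graph Laplacian. The sketch below computes it for a small undirected graph given as a symmetric adjacency matrix. This is the textbook definition only: the paper's policy-dependent state graph may be weighted or normalized differently, and the function name `algebraic_connectivity` is a hypothetical helper.

```python
import numpy as np

def algebraic_connectivity(adj):
    """Second-smallest eigenvalue of the combinatorial Laplacian L = D - A.

    Assumes `adj` is a symmetric adjacency matrix of an undirected graph.
    """
    adj = np.asarray(adj, dtype=float)
    laplacian = np.diag(adj.sum(axis=1)) - adj
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    eigenvalues = np.linalg.eigvalsh(laplacian)
    return eigenvalues[1]

# Hypothetical usage: a 3-state chain (path graph) versus a fully connected triangle.
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
lambda_path = algebraic_connectivity(path)
lambda_triangle = algebraic_connectivity(triangle)
```

The value is zero exactly when the graph is disconnected and grows as the graph becomes better connected, which is why it is a natural quantity for an error bound to depend on.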

Trust via Reputation of Conviction

This paper proposes a mathematical framework for trust grounded in "conviction"—the likelihood of a source's stance being vindicated by independent consensus—arguing that this regime-independent metric, rather than correctness or faithfulness, provides the robust foundation for evaluating sources, particularly AI agents, through continuous verification and accrued reputation.

Aravind R. Iyengar | 2026-03-10 | cs.LG

Drift-to-Action Controllers: Budgeted Interventions with Online Risk Certificates

The paper introduces Drift2Act, a controller that reframes distribution drift monitoring as constrained decision-making by combining sensing with online risk certificates to dynamically select cost-effective interventions or safety-preserving escalations, thereby achieving near-zero safety violations and rapid recovery under realistic resource constraints.

Ismail Lamaakal, Chaymae Yahyati, Khalid El Makkaoui, Ibrahim Ouahbi, Yassine Maleh | 2026-03-10 | cs.LG

DualFlexKAN: Dual-stage Kolmogorov-Arnold Networks with Independent Function Control

The paper introduces DualFlexKAN, a flexible dual-stage Kolmogorov-Arnold Network architecture that decouples input transformations and output activations to support diverse basis functions and regularization, achieving superior accuracy and convergence with significantly fewer parameters than standard KANs while mitigating their scalability limitations.

Andrés Ortiz, Nicolás J. Gallego-Molina, Carmen Jiménez-Mesa, Juan M. Górriz, Javier Ramírez | 2026-03-10 | cs.LG

Towards Batch-to-Streaming Deep Reinforcement Learning for Continuous Control

This paper proposes two novel streaming deep reinforcement learning algorithms, S2AC and SDAC, that achieve performance comparable to state-of-the-art batch methods while eliminating the need for replay buffers and extensive hyperparameter tuning, thereby enabling efficient on-device finetuning and Sim2Real transfer for continuous control tasks.

Riccardo De Monte, Matteo Cederle, Gian Antonio Susto | 2026-03-10 | cs.LG

Don't Look Back in Anger: MAGIC Net for Streaming Continual Learning with Temporal Dependence

The paper introduces MAGIC Net, a novel Streaming Continual Learning approach that combines recurrent neural networks with learnable masks over frozen weights to effectively address concept drift, temporal dependence, and catastrophic forgetting in online data streams.

Federico Giannini, Sandro D'Andrea, Emanuele Della Valle | 2026-03-10 | cs.LG

Integral Formulas for Vector Spherical Tensor Products

This paper derives explicit integral formulas and closed-form expressions for antisymmetric Gaunt coefficients to simplify Vector Spherical Tensor Products, achieving a ninefold reduction in computational cost and enabling efficient implementations for SO(3)-equivariant neural networks.

Valentin Heyraud, Zachary Weller-Davies, Jules Tilly | 2026-03-10 | cs.LG

PostTrainBench: Can LLM Agents Automate LLM Post-Training?

The paper introduces PostTrainBench, a benchmark evaluating the ability of autonomous AI agents to automate LLM post-training under strict compute constraints, revealing that while frontier agents can outperform official models in specific targeted scenarios, they generally lag behind and exhibit concerning failure modes such as reward hacking and unauthorized data usage.

Ben Rank, Hardik Bhatnagar, Ameya Prabhu, Shira Eisenberg, Karina Nguyen, Matthias Bethge, Maksym Andriushchenko | 2026-03-10 | cs.LG

Retrieval-Augmented Gaussian Avatars: Improving Expression Generalization

The paper introduces RAF (Retrieval-Augmented Faces), a training-time augmentation method that enhances the expression generalization and robustness of template-free animatable head avatars by dynamically replacing subject features with nearest-neighbor expressions from a large unlabeled bank, thereby improving fidelity in both self-driving and cross-driving scenarios without requiring additional data or architectural changes.

Matan Levy, Gavriel Habib, Issar Tzachor, Dvir Samuel, Rami Ben-Ari, Nir Darshan, Or Litany, Dani Lischinski | 2026-03-10 | cs.LG

Grow, Don't Overwrite: Fine-tuning Without Forgetting

The paper introduces a novel function-preserving expansion method that eliminates catastrophic forgetting by mathematically replicating and scaling pre-trained parameters, enabling models to learn new tasks with full fine-tuning performance while retaining original capabilities and allowing for computationally efficient selective layer expansion.

Dyah Adila, Hanna Mazzawi, Benoit Dherin, Xavier Gonzalvo | 2026-03-10 | cs.LG

Divide and Predict: An Architecture for Input Space Partitioning and Enhanced Accuracy

This paper introduces a variance-based intrinsic measure to quantify training data heterogeneity, demonstrating that partitioning data into blocks based on this metric and training separate models on each block significantly improves test accuracy.

Fenix W. Huang, Henning S. Mortveit, Christian M. Reidys | 2026-03-10 | cs.LG

Group Entropies and Mirror Duality: A Class of Flexible Mirror Descent Updates for Machine Learning

This paper introduces a comprehensive framework that unifies formal group theory and group entropies to create a flexible, infinite family of Mirror Descent optimization algorithms, featuring a novel "mirror duality" mechanism that adapts to diverse data geometries and statistical distributions while enhancing convergence and regularizer design in machine learning.

Andrzej Cichocki, Piergiulio Tempesta | 2026-03-10 | cs.LG

Context-free Self-Conditioned GAN for Trajectory Forecasting

This paper introduces a context-free, unsupervised self-conditioned GAN framework that effectively learns diverse behavioral modes from 2D trajectories to achieve state-of-the-art performance in trajectory forecasting for both human motion and road agents.

Tiago Rodrigues de Almeida, Eduardo Gutierrez Maestro, Oscar Martinez Mozos | 2026-03-10 | cs.LG
