cs.LG papers | Gist.Science

Reward-Conditioned Reinforcement Learning

This paper introduces Reward-Conditioned Reinforcement Learning (RCRL), a framework that trains a single agent to optimize a family of reward specifications from a shared off-policy dataset, enabling robust and efficient adaptation to changing task preferences without sacrificing the simplicity of single-task training.

Michal Nauman, Marek Cygan, Pieter Abbeel | 2026-03-06 | 🤖 cs.LG

Synchronization-based clustering on the unit hypersphere

This paper introduces a novel clustering algorithm for data on the unit hypersphere, based on the $d$-dimensional generalized Kuramoto model, and reports accuracy comparable or superior to traditional methods in experiments on both synthetic and real-world datasets.

Zinaid Kapić, Aladin Crnkić, Goran Mauša | 2026-03-06 | 🤖 cs.LG

Aura: Universal Multi-dimensional Exogenous Integration for Aviation Time Series

This paper introduces Aura, a universal framework that enhances aviation time series forecasting by explicitly encoding and integrating three distinct types of multi-dimensional exogenous factors through a tailored tripartite mechanism, achieving state-of-the-art performance on large-scale industrial datasets.

Jiafeng Lin, Mengren Zheng, Simeng Ye + 5 more | 2026-03-06 | 🤖 cs.AI

Axiomatic On-Manifold Shapley via Optimal Generative Flows

This paper proposes a novel Axiomatic On-Manifold Shapley framework that utilizes optimal generative flows and Wasserstein-2 geodesics to eliminate off-manifold artifacts, ensuring geometric efficiency, reparameterization invariance, and superior semantic alignment in model attribution.

Cenwei Zhang, Lin Zhu, Manxi Lin + 1 more | 2026-03-06 | 🤖 cs.AI

ARC-TGI: Human-Validated Task Generators with Reasoning Chain Templates for ARC-AGI

This paper introduces ARC-TGI, an open-source framework featuring human-validated Python generators that produce diverse, rule-consistent ARC-AGI tasks with natural language reasoning chains and code, addressing dataset overfitting by ensuring training examples collectively reveal underlying rules for scalable benchmarking.

Jens Lehmann, Syeda Khushbakht, Nikoo Salehfard + 4 more | 2026-03-06 | 🤖 cs.AI

BLINK: Behavioral Latent Modeling of NK Cell Cytotoxicity

This paper introduces BLINK, a trajectory-based recurrent state-space model that learns latent interaction dynamics from time-lapse microscopy to accurately predict and forecast NK cell cytotoxic outcomes while providing an interpretable representation of cellular behavioral modes.

Iman Nematollahi, Jose Francisco Villena-Ossa, Alina Moter + 6 more | 2026-03-06 | 🤖 cs.LG

Decoupling Task and Behavior: A Two-Stage Reward Curriculum in Reinforcement Learning for Robotics

This paper proposes a two-stage reward curriculum that decouples task-specific objectives from behavioral terms to improve exploration and training stability in robotic reinforcement learning, demonstrating superior performance and robustness across multiple environments compared to direct full-reward training.

Kilian Freitag, Knut Åkesson, Morteza Haghir Chehreghani | 2026-03-06 | 🤖 cs.LG

FedBCD: Communication-Efficient Accelerated Block Coordinate Gradient Descent for Federated Learning

This paper proposes FedBCGD and its accelerated variant FedBCGD+, novel federated learning algorithms that split model parameters into blocks to enable selective client uploads, thereby significantly reducing communication overhead and achieving faster convergence for large-scale deep models compared to existing methods.

Junkang Liu, Fanhua Shang, Yuanyuan Liu + 3 more | 2026-03-06 | 🤖 cs.AI

SRasP: Self-Reorientation Adversarial Style Perturbation for Cross-Domain Few-Shot Learning

The paper proposes SRasP, a novel method for Cross-Domain Few-Shot Learning that mitigates domain shift and improves generalization by using global semantic guidance to reorient and aggregate style gradients, thereby stabilizing training and encouraging convergence to flatter, more transferable solutions.

Wenqian Li, Pengfei Fang, Hui Xue | 2026-03-06 | 🤖 cs.LG

Particle-Guided Diffusion for Gas-Phase Reaction Kinetics

This paper demonstrates that a particle-guided diffusion model trained on advection-reaction-diffusion solutions can effectively generate physically consistent concentration fields and accurately predict outlet concentrations for gas-phase chemical reactions, even under unseen parameter conditions.

Andrew Millard, Henrik Pedersen | 2026-03-06 | 🔬 physics

Recurrent Graph Neural Networks and Arithmetic Circuits

This paper establishes an exact correspondence between the computational power of recurrent graph neural networks and recurrent arithmetic circuits over real numbers by demonstrating their mutual ability to simulate each other's computations.

Timon Barlag, Vivian Holzapfel, Laura Strieker + 2 more | 2026-03-06 | 🤖 cs.AI

Feature Resemblance: On the Theoretical Understanding of Analogical Reasoning in Transformers

This paper theoretically and empirically demonstrates that analogical reasoning in transformers emerges from a unified mechanism where entities with similar properties are encoded into aligned representations, a capability that depends critically on specific training curricula and the explicit inclusion of identity bridges in the data.

Ruichen Xu, Wenjing Yan, Ying-Jun Angela Zhang | 2026-03-06 | 🤖 cs.LG

Federated Causal Discovery Across Heterogeneous Datasets under Latent Confounding

This paper introduces fedCI and fedCI-IOD, a novel framework for privacy-preserving federated causal discovery that enables conditional independence testing and latent confounding analysis across heterogeneous, distributed datasets with non-identical variables and mixed data types.

Maximilian Hahn, Alina Zajak, Dominik Heider + 1 more | 2026-03-06 | 🤖 cs.AI

The Impact of Preprocessing Methods on Racial Encoding and Model Robustness in CXR Diagnosis

This study demonstrates that simple lung cropping preprocessing effectively mitigates racial shortcut learning in chest X-ray diagnosis models by suppressing spurious racial cues while preserving diagnostic accuracy, thereby avoiding the typical fairness-accuracy trade-off.

Dishantkumar Sutariya, Eike Petersen | 2026-03-06 | 🤖 cs.LG

Balancing Privacy-Quality-Efficiency in Federated Learning through Round-Based Interleaving of Protection Techniques

This paper proposes Alt-FL, a novel federated learning framework that employs a round-based interleaving strategy of Differential Privacy, Homomorphic Encryption, and synthetic data to effectively balance privacy protection, learning quality, and system efficiency under varying resource constraints.

Yenan Wang, Carla Fabiana Chiasserini, Elad Michael Schiller | 2026-03-06 | 🤖 cs.LG

A Geometry-Adaptive Deep Variational Framework for Phase Discovery in the Landau-Brazovskii Model

This paper introduces GeoDVF, a geometry-adaptive deep variational framework that jointly optimizes neural network-parameterized order parameters and trainable domain sizes to eliminate artificial stress and robustly discover stable and metastable ordered phases in the Landau-Brazovskii model from random initializations.

Yuchen Xie, Jianyuan Yin, Lei Zhang | 2026-03-06 | 🔬 cond-mat.mtrl-sci

Trainable Bitwise Soft Quantization for Input Feature Compression

This paper proposes a trainable bitwise soft quantization layer that compresses neural network input features using sigmoid-approximated step functions, achieving significant data transmission reductions (5x to 16x) with minimal accuracy loss for efficient IoT applications.

Karsten Schrödter, Jan Stenkamp, Nina Herrmann + 1 more | 2026-03-06 | 🤖 cs.LG
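
The "sigmoid-approximated step functions" mentioned in the summary above can be illustrated with a generic smooth staircase. This is only a minimal sketch of that general idea, not the paper's layer: the function names, the fixed thresholds, and the `temperature` parameter are all hypothetical, and the bitwise encoding and training procedure are not shown.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def hard_quantize(x: float, thresholds: list[float]) -> int:
    """Staircase: count how many thresholds x exceeds (not differentiable)."""
    return sum(1 for t in thresholds if x > t)


def soft_quantize(x: float, thresholds: list[float], temperature: float) -> float:
    """Smooth staircase: each sigmoid approximates one unit step.

    A smaller temperature gives sharper steps; in a trainable layer the
    thresholds would be learned (assumption), here they are fixed.
    """
    return sum(sigmoid((x - t) / temperature) for t in thresholds)


# Hypothetical thresholds splitting [0, 1] into four levels.
levels = [0.25, 0.5, 0.75]
print(hard_quantize(0.6, levels))
print(soft_quantize(0.6, levels, temperature=0.1))
print(soft_quantize(0.6, levels, temperature=0.001))
```

As the temperature shrinks, the soft output approaches the hard staircase value while remaining differentiable for any positive temperature, which is what makes gradient-based training of such a layer possible.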

Incentive Aware AI Regulations: A Credal Characterisation

This paper proposes a mechanism design framework for AI regulation that forces providers to bet on their model's compliance, proving that such mechanisms can achieve perfect market outcomes if and only if the set of non-compliant distributions forms a credal set, thereby bridging mechanism design and imprecise probability to create enforceable regulations.

Anurag Singh, Julian Rodemann, Rajeev Verma + 2 more | 2026-03-06 | 🤖 cs.LG

Towards a data-scale independent regulariser for robust sparse identification of non-linear dynamics

This paper introduces the Sequential Thresholding of Coefficient of Variation (STCV), a novel sparse regression algorithm that replaces magnitude-based thresholding with a dimensionless statistical metric to ensure robust and accurate identification of governing equations in non-linear dynamics, even when data is normalized and noisy.

Jay Raut, Daniel N. Wilke, Stephan Schmidt | 2026-03-06 | 🤖 cs.LG
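
To see why a dimensionless metric such as the coefficient of variation (standard deviation divided by absolute mean) is independent of data scale while a magnitude threshold is not, consider the small sketch below. It is not the STCV algorithm itself: the coefficient samples, the idea that they come from repeated fits, and both cutoff values are assumptions made purely for illustration.

```python
import statistics


def coefficient_of_variation(samples: list[float]) -> float:
    """Sample standard deviation divided by |mean|; unitless."""
    mean = statistics.fmean(samples)
    if mean == 0:
        return float("inf")
    return statistics.stdev(samples) / abs(mean)


# Hypothetical estimates of one coefficient from repeated fits (assumption).
stable_term = [1.9, 2.0, 2.1]
# The same term after the data are rescaled by 1e-3.
rescaled_term = [v * 1e-3 for v in stable_term]
# Hypothetical estimates of a spurious term that fluctuates around zero.
noisy_term = [0.3, -0.2, 0.1]

magnitude_cutoff = 0.1  # illustrative magnitude threshold
cv_cutoff = 1.0         # illustrative dimensionless threshold

for name, term in [("stable", stable_term),
                   ("rescaled", rescaled_term),
                   ("noisy", noisy_term)]:
    keep_by_magnitude = abs(statistics.fmean(term)) > magnitude_cutoff
    keep_by_cv = coefficient_of_variation(term) < cv_cutoff
    print(name, keep_by_magnitude, keep_by_cv)
```

The magnitude rule discards the rescaled term even though it is the same physical term, whereas the coefficient of variation is identical before and after rescaling and still separates the stable term from the noisy one.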

Stable-LoRA: Stabilizing Feature Learning of Low-Rank Adaptation

This paper introduces Stable-LoRA, a weight-shrinkage optimization strategy that resolves the feature learning instability caused by non-zero initialization in Low-Rank Adaptation (LoRA) while preserving its benefits and achieving superior performance across diverse tasks without additional memory costs.

Yize Wu, Ke Gao, Ling Li + 1 more | 2026-03-06 | 🤖 cs.AI
