cs.LG papers | Gist.Science

CauKer: Classification Time Series Foundation Models Can Be Pretrained on Synthetic Data

The paper introduces CauKer, a novel algorithm that combines Gaussian Process kernel composition with Structural Causal Models to generate diverse, causally coherent synthetic time series, enabling sample-efficient pre-training of classification foundation models that exhibit clear scaling laws across varying dataset sizes and model capacities.

Shifeng Xie, Vasilii Feofanov, Ambroise Odonnat, Lei Zan, Marius Alonso, Jianfeng Zhang, Themis Palpanas, Lujia Pan, Keli Zhang, Ievgen Redko · 2026-03-10 · 🤖 cs.LG

GraphProp: Training the Graph Foundation Models using Graph Properties

GraphProp is a two-phase framework for training graph foundation models that first learns structural generalization by predicting graph invariants and then leverages these representations as positional encodings to enhance cross-domain performance in graph-level tasks, particularly outperforming existing methods in scenarios with limited data or missing node attributes.

Ziheng Sun, Qi Feng, Lehao Lin, Chris Ding, Jicong Fan · 2026-03-10 · 🤖 cs.LG

Time-Scale Coupling Between States and Parameters in Recurrent Neural Networks

This paper demonstrates that gating mechanisms in recurrent neural networks act as data-driven preconditioners that couple state-space time-scales with parameter-space dynamics, inducing lag-dependent and anisotropic effective learning rates that complement optimizer-driven adaptivity to enhance trainability.

Lorenzo Livi · 2026-03-10 · 🤖 cs.LG

ECHO: Frequency-aware Hierarchical Encoding for Variable-length Signals

The paper introduces ECHO, a novel foundation model that leverages band-split architecture and frequency positional embeddings to achieve state-of-the-art performance in anomaly detection and fault classification across variable-length, arbitrary sampling rate machine signals without requiring padding or cropping.

Yucong Zhang, Juan Liu, Ming Li · 2026-03-10 · 🤖 cs.LG

Constraint Learning in Multi-Agent Dynamic Games from Demonstrations of Local Nash Interactions

This paper presents an inverse dynamic game algorithm that uses mixed-integer linear programs to learn parametric constraints from multi-agent interaction demonstrations by encoding Karush-Kuhn-Tucker conditions, thereby providing theoretical guarantees for recovering inner approximations of safe and unsafe sets to enable robust motion planning.

Zhouyu Zhang, Chih-Yuan Chiu, Glen Chou · 2026-03-10 · 🤖 cs.LG

CbLDM: A Diffusion Model for recovering nanostructure from atomic pair distribution function

This paper proposes CbLDM, a Condition-based Latent Diffusion Model that utilizes conditional priors and Laplacian matrices to effectively and stably recover the nanostructures of monometallic nanoparticles from their atomic pair distribution functions, addressing the highly ill-posed nature of the inverse problem.

Jiarui Cao, Zhiyang Zhang, Heming Wang, Jun Xu, Ling Lan, Simon J. L. Billinge, Ran Gu · 2026-03-10 · 🔬 cond-mat.mtrl-sci

Entropy-Driven Curriculum for Multi-Task Training in Human Mobility Prediction

This paper proposes a unified training framework that combines entropy-driven curriculum learning, which sequences training from simple to complex trajectories based on Lempel-Ziv compression, with multi-task learning to simultaneously optimize location, distance, and direction predictions, thereby achieving state-of-the-art performance and significantly faster convergence in human mobility prediction.

Tianye Fang, Xuanshu Luo, Martin Werner · 2026-03-10 · 🤖 cs.LG

Synthetic data for ratemaking: imputation-based methods vs adversarial networks and autoencoders

This paper benchmarks Multivariate Imputation by Chained Equations (MICE) against deep generative models like Variational Autoencoders and Conditional Tabular GANs for synthetic ratemaking data, finding that MICE offers a simpler yet high-fidelity alternative that effectively preserves statistical distributions and supports robust Generalized Linear Model training.

Yevhen Havrylenko, Meelis Käärik, Artur Tuttar · 2026-03-10 · 🤖 cs.LG

Faster Gradient Methods for Highly-Smooth Stochastic Bilevel Optimization

This paper proposes the F²SA-$p$ method, which uses $p$-th order finite differences to achieve a nearly optimal $\tilde{\mathcal{O}}(p \epsilon^{-4-2/p})$ complexity for finding $\epsilon$-stationary points in stochastic bilevel optimization with highly smooth objectives, improving upon previous first-order bounds and nearly matching the lower bound when $p$ is large.

Lesi Chen, Junru Li, El Mahdi Chayti, Jingzhao Zhang · 2026-03-10 · 🤖 cs.LG

Behavioral Inference at Scale: The Fundamental Asymmetry Between Motivations and Belief Systems

Through large-scale experiments with over 1.5 million LLM-generated behavioral sequences, this paper reveals a fundamental asymmetry in behavioral inference where agent motivations are nearly perfectly recoverable while belief systems remain largely opaque due to inherent information-theoretic limits and architectural constraints, particularly within a "neutral zone" of behavioral ambiguity.

Jason Starace, Terence Soule · 2026-03-10 · 🤖 cs.LG

Synthetic Homes: An Accessible Multimodal Pipeline for Producing Residential Building Data with Generative AI

This paper introduces a modular, multimodal framework that leverages generative AI to synthesize realistic residential building data from public images and information, thereby overcoming data accessibility and privacy barriers to advance energy modeling and machine learning research.

Jackson Eshbaugh, Chetan Tiwari, Jorge Silveyra · 2026-03-10 · 🤖 cs.LG

Physics-Aware Neural Operators for Direct Inversion in 3D Photoacoustic Tomography

The paper introduces PANO, a physics-aware neural operator that performs direct, single-pass inversion of raw sensor data into high-quality 3D photoacoustic images, outperforming traditional algorithms and enabling real-time reconstruction across diverse sparse acquisition settings to facilitate the clinical translation of 3D PACT.

Jiayun Wang, Yousuf Aborahama, Arya Khokhar, Yang Zhang, Chuwei Wang, Karteekeya Sastry, Julius Berner, Yilin Luo, Boris Bonev, Zongyi Li, Kamyar Azizzadenesheli, Lihong V. Wang, Anima Anandkumar · 2026-03-10 · 🤖 cs.LG

Fast reconstruction of degenerate populations of conductance-based neuron models from spike times

This paper presents a fast, scalable method that combines deep learning with Dynamic Input Conductances (DICs) to reconstruct diverse, degenerate populations of conductance-based neuron models directly from spike times, effectively bridging experimental recordings and mechanistic biophysical parameters.

Julien Brandoit, Damien Ernst, Guillaume Drion, Arthur Fyon · 2026-03-10 · 🤖 cs.LG

MICA: Multi-Agent Industrial Coordination Assistant

This paper introduces MICA, a privacy-preserving, speech-interactive multi-agent system that leverages Adaptive Step Fusion and a safety-audited coordination topology to deliver robust, real-time industrial assistance for assembly and maintenance tasks on resource-constrained hardware.

Di Wen, Kunyu Peng, Junwei Zheng, Yufan Chen, Yitian Shi, Jiale Wei, Ruiping Liu, Kailun Yang, Rainer Stiefelhagen · 2026-03-10 · 🤖 cs.LG

ORIC: Benchmarking Object Recognition under Contextual Incongruity in Large Vision-Language Models

This paper introduces the ORIC framework and benchmark to evaluate and improve Large Vision-Language Models' object recognition capabilities under contextual incongruity, demonstrating that such scenarios significantly degrade performance and that targeted Visual Reinforcement Fine-Tuning can effectively mitigate these failures.

Zhaoyang Li, Zhan Ling, Yuchen Zhou, Litian Gong, Erdem Bıyık, Hao Su · 2026-03-10 · 🤖 cs.LG

ORN-CBF: Learning Observation-conditioned Residual Neural Control Barrier Functions via Hypernetworks

This paper proposes ORN-CBF, a hypernetwork-based learning framework that uses Hamilton-Jacobi reachability analysis to generate observation-conditioned neural control barrier functions, providing safety guarantees and improved generalization in partially observable environments, as evaluated in simulation and hardware experiments.

Bojan Derajic, Sebastian Bernhard, Wolfgang Hönig · 2026-03-10 · 🤖 cs.LG

Empirical PAC-Bayes bounds for Markov chains

This paper introduces the first fully empirical PAC-Bayes bound for Markov chains by deriving a data-dependent estimate for the pseudo-spectral gap, thereby eliminating the need for unknown constants related to mixing properties that typically hinder practical generalization guarantees.

Vahe Karagulyan, Pierre Alquier · 2026-03-10 · 🤖 cs.LG

Linear probes rely on textual evidence: Results from leakage mitigation studies in language models

This paper demonstrates that linear probes used to detect harmful behaviors in language models are heavily reliant on explicit textual evidence, as their performance significantly degrades when such surface-level cues are filtered out or when models are trained to express behaviors without verbalization.

Gerard Boxo, Aman Neelappa, Shivam Raval · 2026-03-10 · 🤖 cs.LG

AEGIS: Authentic Edge Growth In Sparsity for Link Prediction in Edge-Sparse Bipartite Knowledge Graphs

The paper introduces AEGIS, an edge-only augmentation framework that resamples existing training edges to enhance link prediction in edge-sparse bipartite knowledge graphs, demonstrating that authenticity-constrained resampling preserves data integrity while semantic KNN augmentation further boosts performance when node descriptions are available.

Hugh Xuechen Liu, Kıvanç Tatar · 2026-03-10 · 🤖 cs.LG

Aurora: Towards Universal Generative Multimodal Time Series Forecasting

Aurora is a multimodal time series foundation model that leverages text and image modalities to guide temporal representation learning and prototype-based flow matching, achieving state-of-the-art zero-shot cross-domain generalization across diverse forecasting benchmarks.

Xingjian Wu, Jianxin Jin, Wanghui Qiu + 4 more · 2026-03-10 · 🤖 cs.LG
