On the Non-Identifiability of Steering Vectors in Large Language Models
This paper demonstrates that steering vectors in large language models are fundamentally non-identifiable: many distinct interventions, including mutually orthogonal perturbations, produce behaviorally indistinguishable results. This reveals inherent limits on interpreting steering vectors as unique internal representations unless additional structural constraints are imposed.
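The core phenomenon can be illustrated with a toy linear model (this is an illustrative sketch, not the paper's actual construction): if hidden states are read out through a matrix W, any perturbation lying in the null space of W changes the intervention vector without changing the output at all, so distinct steering vectors are behaviorally indistinguishable. The readout matrix W, hidden state h, and steering vector v below are all hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a hidden state h is mapped to output logits by a readout matrix W.
d_hidden, d_out = 8, 3
W = rng.normal(size=(d_out, d_hidden))
h = rng.normal(size=d_hidden)

# A candidate "steering vector" v shifts the hidden state: h -> h + v.
v = rng.normal(size=d_hidden)

# Build a perturbation delta in the null space of W (orthogonal to W's rows),
# so W @ delta == 0 and v versus v + delta cannot be told apart from the logits.
_, _, Vt = np.linalg.svd(W)
null_basis = Vt[d_out:]  # rows spanning the null space of W
delta = null_basis.T @ rng.normal(size=d_hidden - d_out)

logits_a = W @ (h + v)
logits_b = W @ (h + v + delta)

print(np.allclose(logits_a, logits_b))  # True: behaviorally indistinguishable
print(np.allclose(v, v + delta))        # False: the interventions are distinct
```

In a deep nonlinear network the null space of a single readout is replaced by a much less tractable equivalence class of interventions, which is what makes the identifiability question substantive.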