cs.LG papers | Gist.Science

Making Reconstruction FID Predictive of Diffusion Generation FID

This paper introduces interpolated FID (iFID), a novel metric that achieves a strong correlation with diffusion generation FID by interpolating latent representations between dataset samples and their nearest neighbors, thereby overcoming the limitations of traditional reconstruction FID in predicting generative model quality.

Tongda Xu, Mingwei He, Shady Abu-Hussein, Jose Miguel Hernandez-Lobato, Haotian Zhang, Kai Zhao, Chao Zhou, Ya-Qin Zhang, Yan Wang | 2026-03-09 | cs.LG
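
The interpolation step named in the summary above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `interpolate_with_nearest_neighbor`, the Euclidean nearest-neighbor search, and the mixing weight `alpha` are all assumptions, and the later steps (decoding the interpolated latents and computing FID against the dataset) are omitted.

```python
import numpy as np


def interpolate_with_nearest_neighbor(latents: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Blend each latent with its nearest neighbor (hypothetical helper).

    latents: float array of shape (num_samples, latent_dim).
    alpha:   assumed mixing weight; 0 keeps the original latent,
             1 replaces it with its nearest neighbor.
    """
    # Pairwise squared Euclidean distances between all latents.
    sq_norms = np.sum(latents**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * latents @ latents.T

    # Exclude each sample from being its own nearest neighbor.
    np.fill_diagonal(sq_dists, np.inf)
    nn_index = np.argmin(sq_dists, axis=1)

    # Linear interpolation between each latent and its neighbor.
    return (1.0 - alpha) * latents + alpha * latents[nn_index]
```

Under these assumptions, the interpolated latents would be passed through the decoder and scored with a standard FID routine in place of the plain reconstructions.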

When Rubrics Fail: Error Enumeration as Reward in Reference-Free RL Post-Training for Virtual Try-On

This paper introduces Implicit Error Counting (IEC), a reference-free reinforcement learning post-training method that enumerates and weights errors to generate rewards, demonstrating superior performance over Rubrics as Rewards (RaR) in virtual try-on tasks where multiple valid outputs exist and ideal reference answers are unavailable.

Wisdom Ikezogwo, Mehmet Saygin Seyfioglu, Ranjay Krishna, Karim Bouyarmane | 2026-03-09 | cs.AI

The Value of Graph-based Encoding in NBA Salary Prediction

This paper demonstrates that integrating graph-based embeddings of on-court and off-court player data into tabular datasets significantly improves the accuracy of supervised machine learning models for predicting NBA player salaries, particularly for veterans and high-earning outliers where traditional methods fail.

Junhao Su, David Grimsman, Christopher Archibald | 2026-03-09 | cs.LG

Reinforcement Learning for Power-Flow Network Analysis

This paper proposes a reinforcement learning framework with a probabilistic reward function and a Gaussian baseline to discover power-flow network configurations that yield a significantly higher number of equilibrium points than current computational algebra methods can identify.

Alperen Ergur, Julia Lindberg, Vinny Miller | 2026-03-09 | cs.LG

Improved Scaling Laws via Weak-to-Strong Generalization in Random Feature Ridge Regression

This paper demonstrates that in random feature ridge regression, a strong student model trained on imperfect labels from a weak teacher can achieve substantially improved scaling laws and even reach minimax optimal rates, regardless of whether the teacher's own test error decays with sample size.

Diyuan Wu, Lehan Chen, Theodor Misiakiewicz, Marco Mondelli | 2026-03-09 | cs.LG

Parallelization Strategies for Dense LLM Deployment: Navigating Through Application-Specific Tradeoffs and Bottlenecks

This paper investigates parallelization strategies for deploying dense LLMs, demonstrating that while Tensor Parallelism optimizes latency and Pipeline Parallelism enhances throughput, a hybrid approach allows for effective control over the inherent latency-throughput tradeoff to meet specific application requirements.

Burak Topcu, Musa Oguzhan Cim, Poovaiah Palangappa, Meena Arunachalam, Mahmut Taylan Kandemir | 2026-03-09 | cs.LG

Warm Starting State-Space Models with Automata Learning

This paper establishes a formal correspondence between Moore machines and state-space models to demonstrate that initializing continuous SSMs with symbolically learned automata significantly improves training efficiency and accuracy compared to random initialization, thereby effectively leveraging symbolic inductive bias for learning complex systems.

William Fishell, Sam Nicholas Kouteili, Mark Santolucito | 2026-03-09 | cs.LG

Random Dot Product Graphs as Dynamical Systems: Limitations and Opportunities

This paper establishes a geometric framework using principal fiber bundles to identify fundamental obstructions in learning differential equations from temporal Random Dot Product Graphs, characterizing the interplay between gauge ambiguity, spectral gaps, and holonomy while demonstrating that symmetric dynamics can resolve gauge issues to enable vector field recovery.

Giulio Valentino Dalla Riva | 2026-03-09 | cs.LG

The Rise of AI in Weather and Climate Information and its Impact on Global Inequality

This paper argues that while AI promises to revolutionize climate information, its current reliance on Global North-dominated infrastructure and biased data risks exacerbating global inequality, necessitating a shift toward data-centric development, shared digital public infrastructure, and co-produced knowledge to ensure equitable outcomes.

Amirpasha Mozaffari, Amanda Duarte, Lina Teckentrup, Stefano Materia, Gina E. C. Charnley, Lluis Palma, Eulalia Baulenas Serra, Dragana Bojovic, Paula Checchia, Aude Carreric, Francisco Doblas-Reyes | 2026-03-09 | cs.AI

Unsupervised domain adaptation for radioisotope identification in gamma spectroscopy

This paper demonstrates that unsupervised domain adaptation, specifically through minimizing maximum mean discrepancy (MMD) between synthetic and unlabeled real-world data, significantly improves the generalization and testing accuracy of machine learning models for radioisotope identification in gamma spectroscopy.

Peter Lalor, Ayush Panigrahy, Alex Hagen | 2026-03-09 | cs.LG
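
The maximum mean discrepancy named in the summary above has a standard sample estimate, sketched here. This is an illustration only: the RBF kernel, the `bandwidth` parameter, and the function names are assumptions, and the paper's network, training loop, and spectra are not reproduced. In a domain adaptation setup, a quantity of this kind would typically be computed on feature vectors from synthetic and unlabeled real data and added to the classification loss.

```python
import numpy as np


def rbf_kernel(a: np.ndarray, b: np.ndarray, bandwidth: float = 1.0) -> np.ndarray:
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd_squared(source: np.ndarray, target: np.ndarray, bandwidth: float = 1.0) -> float:
    """Biased (V-statistic) estimate of squared MMD between two samples."""
    k_ss = rbf_kernel(source, source, bandwidth)
    k_tt = rbf_kernel(target, target, bandwidth)
    k_st = rbf_kernel(source, target, bandwidth)
    return float(k_ss.mean() + k_tt.mean() - 2.0 * k_st.mean())
```

The estimate is zero when both samples are identical and grows as the two feature distributions move apart, which is what makes it usable as an alignment penalty.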

Revisiting the (Sub)Optimality of Best-of-N for Inference-Time Alignment

This paper challenges prior claims of Best-of-N's suboptimality by demonstrating that, under practical assumptions and when evaluated via win-rate rather than expected reward, properly tuned Best-of-N is both statistically and computationally optimal, while also proposing a simple variant that eliminates reward hacking without sacrificing performance.

Ved Sriraman, Adam Block | 2026-03-09 | cs.AI
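
For readers unfamiliar with the baseline discussed above, plain Best-of-N selection can be sketched in a few lines. The `generate` and `reward` callables are hypothetical stand-ins for a language model sampler and a reward model; the paper's tuning procedure and its variant that avoids reward hacking are not reproduced here.

```python
import random
from typing import Callable, Tuple, TypeVar

Response = TypeVar("Response")


def best_of_n(
    prompt: str,
    generate: Callable[[str, random.Random], Response],
    reward: Callable[[str, Response], float],
    n: int,
    rng: random.Random,
) -> Tuple[Response, float]:
    """Draw n candidate responses and return the one with the highest reward."""
    if n < 1:
        raise ValueError("n must be at least 1")

    # Sample n independent candidates from the (hypothetical) base policy.
    candidates = [generate(prompt, rng) for _ in range(n)]

    # Score each candidate with the (hypothetical) reward model.
    scores = [reward(prompt, candidate) for candidate in candidates]

    # Keep the highest-scoring candidate; ties go to the earliest sample.
    best_index = max(range(n), key=scores.__getitem__)
    return candidates[best_index], scores[best_index]
```

The only tunable quantity in this sketch is `n`: larger values push the selected response toward higher reward at the cost of more samples per prompt.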

Full Dynamic Range Sky-Modelling For Image Based Lighting

This paper introduces Icarus, a deep learning-based all-weather sky model that overcomes the limitations of existing methods in handling full dynamic range and class-imbalanced solar regions to generate photorealistic, user-controllable environment maps for accurate Image-Based Lighting.

Ian J. Maquignaz | 2026-03-09 | cs.LG

MIRACL: A Diverse Meta-Reinforcement Learning for Multi-Objective Multi-Echelon Combinatorial Supply Chain Optimisation

The paper introduces MIRACL, a novel hierarchical Meta-MORL framework that enables few-shot generalization and efficient adaptation for multi-objective multi-echelon supply chain optimization by decomposing tasks into structured subproblems and employing a Pareto-based strategy to achieve superior performance over conventional baselines.

Rifny Rachman, Josh Tingey, Richard Allmendinger, Wei Pan, Pradyumn Shukla, Bahrul Ilmi Nasution | 2026-03-09 | cs.LG

Score-Guided Proximal Projection: A Unified Geometric Framework for Rectified Flow Editing

This paper introduces Score-Guided Proximal Projection (SGPP), a unified geometric framework that reformulates Rectified Flow editing as a proximal optimization problem to overcome the limitations of existing inversion and sampling methods by theoretically guaranteeing manifold convergence while enabling a continuous, training-free trade-off between identity preservation and generative flexibility.

Vansh Bansal, James G Scott | 2026-03-09 | cs.LG

TML-Bench: Benchmark for Data Science Agents on Tabular ML Tasks

This paper introduces TML-Bench, a benchmark for evaluating the end-to-end correctness and reliability of autonomous coding agents on Kaggle-style tabular machine learning tasks, demonstrating that the MiniMax-M2.1 model achieves the best aggregate performance across four competitions under varying time budgets.

Mykola Pinchuk | 2026-03-09 | cs.AI

Bridging Domains through Subspace-Aware Model Merging

This paper introduces SCORE, a novel model merging method that resolves singular subspace conflicts between domain-specific models by projecting them into a shared orthogonal basis, thereby significantly improving generalization to unseen domains compared to existing approaches.

Levy Chaves, Chao Zhou, Rebekka Burkholz, Eduardo Valle, Sandra Avila | 2026-03-09 | cs.AI

Knowing without Acting: The Disentangled Geometry of Safety Mechanisms in Large Language Models

This paper proposes the Disentangled Safety Hypothesis (DSH), which reveals that large language models separate safety "recognition" and "refusal execution" into distinct geometric subspaces, enabling the development of the Refusal Erasure Attack (REA) to bypass safety mechanisms by surgically disabling the refusal axis while preserving harmful content generation.

Jinman Wu, Yi Xie, Shen Lin, Shiqian Zhao, Xiaofeng Chen | 2026-03-09 | cs.AI

First-Order Softmax Weighted Switching Gradient Method for Distributed Stochastic Minimax Optimization with Stochastic Constraints

This paper proposes a first-order Softmax-Weighted Switching Gradient method for distributed stochastic minimax optimization under stochastic constraints, achieving optimal oracle complexity and high-probability convergence guarantees in both full and partial client participation settings while avoiding the instability of traditional primal-dual approaches.

Zhankun Luo, Antesh Upadhyay, Sang Bin Moon, Abolfazl Hashemi | 2026-03-09 | cs.LG

The Coordination Gap: Alternation Metrics for Temporal Dynamics in Multi-Agent Battle of the Exes

This paper introduces temporally sensitive Alternation (ALT) metrics to reveal that conventional outcome-based evaluations can severely mischaracterize multi-agent coordination, as demonstrated by Q-learning agents in a Battle of the Exes variant that achieve high traditional fairness scores but perform significantly worse than random baselines in actual turn-taking dynamics.

Nikolaos Al. Papadopoulos, Konstantinos Psannis | 2026-03-09 | cs.LG

Sparse Crosscoders for diffing MoEs and Dense models

This paper utilizes crosscoders to demonstrate that Mixture of Experts (MoE) models develop more specialized, focused representations with fewer unique features compared to the broader, general-purpose feature distributions found in dense models of equivalent active parameter count.

Marmik Chaudhari, Nishkal Hundia, Idhant Gulati | 2026-03-09 | cs.LG
