cs.LG papers | Gist.Science

TianQuan-S2S: A Subseasonal-to-Seasonal Global Weather Model via Incorporate Climatology State

The paper introduces TianQuan-S2S, a novel global subseasonal-to-seasonal weather forecasting model that integrates climatological states into patch embeddings and utilizes an uncertainty-augmented Transformer to overcome the limitations of over-smoothing and inadequate climate representation, thereby outperforming both traditional numerical methods and advanced data-driven models in deterministic and ensemble forecasting.

Guowen Li, Xintong Liu, Yang Liu + 11 more | 2026-03-06 | cs

Noise2Ghost: Self-supervised deep convolutional reconstruction for ghost imaging

The paper introduces Noise2Ghost, a self-supervised deep learning method that achieves superior noise reduction and reconstruction quality in ghost imaging without requiring clean reference data, thereby enabling high-quality imaging in low-light scenarios such as dose-sensitive x-ray fluorescence and biological studies.

Mathieu Manni, Dmitry Karpov, K. Joost Batenburg + 2 more | 2026-03-06 | physics

Differentially Private and Scalable Estimation of the Network Principal Component

This paper proposes a novel, instance-specific differentially private (DP) framework based on the Propose-Test-Release mechanism that enables scalable and accurate estimation of network principal components on large real-world graphs, achieving a 180-fold runtime improvement over existing baselines while also providing the first DP solution for the Densest-$k$-subgraph problem.

Alireza Khayatian, Anil Vullikanti, Aritra Konar | 2026-03-06 | cs

Variational Formulation of Particle Flow

This paper presents a variational inference formulation of log-homotopy particle flow as a Fisher-Rao gradient flow, deriving Gaussian and Gaussian mixture approximations that recover the Exact Daum and Huang flow under linear Gaussian assumptions while enhancing expressiveness for multi-modal estimation.

Yinzhuang Yi, Jorge Cortés, Nikolay Atanasov | 2026-03-06 | cs

ReactDance: Hierarchical Representation for High-Fidelity and Coherent Long-Form Reactive Dance Generation

ReactDance is a novel diffusion framework that achieves high-fidelity, coherent long-form reactive dance generation by employing Hierarchical Finite Scalar Quantization for fine-grained spatial control and a Blockwise Local Context strategy for efficient, temporally consistent sequence synthesis.

Jingzhong Lin, Xinru Li, Yuanyuan Qi + 8 more | 2026-03-06 | cs

Learning Virtual Machine Scheduling in Cloud Computing through Language Agents

This paper proposes MiCo, a hierarchical language agent framework that leverages large language models to design adaptive heuristics for solving the complex Online Dynamic Multidimensional Bin Packing problem in cloud VM scheduling, achieving a 96.9% competitive ratio in large-scale, real-world scenarios.

JieHao Wu, Ziwei Wang, Junjie Sheng + 3 more | 2026-03-06 | cs

Ice Cream Doesn't Cause Drowning: Benchmarking LLMs Against Statistical Pitfalls in Causal Inference

This paper introduces CausalPitfalls, a comprehensive benchmark designed to rigorously evaluate and expose the significant limitations of large language models in handling statistical causal inference pitfalls, such as Simpson's paradox, through both direct and code-assisted prompting protocols.

Jin Du, Li Chen, Xun Xian + 6 more | 2026-03-06 | cs
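
Simpson's paradox, named in the CausalPitfalls summary above, can be made concrete with a small worked example. The sketch below is not taken from the paper: the counts are illustrative numbers chosen for this example, and the names `counts`, `stratum_rates`, and `pooled_rate` are hypothetical. It shows an arm that wins inside every stratum yet loses once the strata are pooled.

```python
from fractions import Fraction

# Illustrative (successes, trials) counts per arm and stratum.
# These numbers are assumptions for the example, not data from the paper.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def stratum_rates(arm):
    """Success rate of one arm within each stratum, as exact fractions."""
    return {name: Fraction(s, n) for name, (s, n) in counts[arm].items()}


def pooled_rate(arm):
    """Success rate of one arm after merging all strata."""
    successes = sum(s for s, _ in counts[arm].values())
    trials = sum(n for _, n in counts[arm].values())
    return Fraction(successes, trials)


# Arm A is better in every stratum...
a_wins_each_stratum = all(
    stratum_rates("A")[name] > stratum_rates("B")[name]
    for name in ("small", "large")
)
# ...but arm B is better once strata are pooled, because the arms
# were applied to very different mixes of easy and hard cases.
b_wins_pooled = pooled_rate("B") > pooled_rate("A")

print(a_wins_each_stratum, b_wins_pooled)
```

The reversal comes from unequal stratum sizes: arm A is mostly applied to the harder "large" stratum, so its pooled rate is dragged down even though it is better within each stratum.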

ShIOEnv: A Command Evaluation Environment for Grammar-Constrained Synthesis and Execution Behavior Modeling

This paper introduces ShIOEnv, a grammar-constrained, self-supervised Bash environment that generates 2.1 million system-grounded input-output pairs to significantly improve the accuracy of modeling complex command-line execution behaviors compared to prior execution-free approaches.

Jarrod Ragsdale, Rajendra Boppana | 2026-03-06 | cs

VTool-R1: VLMs Learn to Think with Images via Reinforcement Learning on Multimodal Tool Use

VTool-R1 is a novel framework that leverages reinforcement learning to train vision-language models to generate multimodal chains of thought by strategically interleaving text with intermediate visual reasoning steps using Python-based editing tools, thereby enhancing performance on structured visual tasks without requiring process-based supervision.

Mingyuan Wu, Jingcheng Yang, Jize Jiang + 6 more | 2026-03-06 | cs

Attribute-Efficient PAC Learning of Sparse Halfspaces with Constant Malicious Noise Rate

This paper presents an attribute-efficient PAC learning algorithm for sparse halfspaces that tolerates a constant malicious noise rate using $\mathrm{poly}(s, \log d)$ samples, obtained through simple variants of hinge loss minimization under specific concentration and margin conditions.

Shiwei Zeng, Jie Shen | 2026-03-06 | cs

Highly Efficient and Effective LLMs with Multi-Boolean Architectures

This paper introduces a novel framework that enables direct finetuning of large language models using multi-kernel Boolean parameters without latent weights, significantly reducing complexity while outperforming existing ultra-low-bit quantization and binarization techniques.

Ba-Hien Tran, Van Minh Nguyen | 2026-03-06 | cs

Continuous Chain of Thought Enables Parallel Exploration and Reasoning

This paper introduces Continuous Chain of Thought (CoT2), a framework that replaces discrete token sampling with continuously-valued tokens to enable parallel exploration of multiple reasoning traces, offering theoretical guarantees for solving combinatorial problems and demonstrating improved performance through novel supervision and policy optimization strategies.

Halil Alperen Gozeten, M. Emrullah Ildiz, Xuechen Zhang + 3 more | 2026-03-06 | cs

SealQA: Raising the Bar for Reasoning in Search-Augmented Language Models

The paper introduces SealQA, a new benchmark comprising three challenging flavors (Seal-0, Seal-Hard, and LongSeal) designed to evaluate search-augmented language models on fact-seeking tasks with noisy or conflicting web results, revealing that even frontier models struggle significantly with reasoning accuracy, robustness to noise, and long-context document retrieval.

Thinh Pham, Nguyen Nguyen, Pratibha Zunjare + 3 more | 2026-03-06 | cs

FPGA-Enabled Machine Learning Applications in Earth Observation: A Systematic Review

This systematic review analyzes 68 experiments deploying machine learning models on FPGAs for Earth Observation, introducing dual taxonomies for model architectures and implementation strategies to address the challenges of onboard processing in the NewSpace era.

Cédric Léonard, Dirk Stober, Martin Schulz | 2026-03-06 | cs

HSG-12M: A Large-Scale Benchmark of Spatial Multigraphs from the Energy Spectra of Non-Hermitian Crystals

This paper introduces Poly2Graph, an automated pipeline for generating HSG-12M, a pioneering 16.7-million-scale dataset of spatial multigraphs derived from non-Hermitian crystal energy spectra, which bridges condensed matter physics and geometry-aware graph learning by preserving vital geometric information often discarded in existing benchmarks.

Xianquan Yan, Hakan Akgün, Kenji Kawaguchi + 2 more | 2026-03-06 | cond-mat.mes-hall

EDINET-Bench: Evaluating LLMs on Complex Financial Tasks using Japanese Financial Statements

This paper introduces EDINET-Bench, a challenging open-source benchmark derived from ten years of Japanese financial reports to evaluate LLMs on complex tasks like fraud detection and earnings forecasting, revealing that current models struggle significantly without specialized scaffolding and highlighting the need for more realistic evaluation frameworks.

Issa Sugiura, Takashi Ishida, Taro Makino + 4 more | 2026-03-06 | cs

SPEED-RL: Faster Training of Reasoning Models via Online Curriculum Learning

The paper introduces SPEED-RL, an adaptive online curriculum learning method that selectively samples intermediate-difficulty prompts, accelerating reinforcement learning training for reasoning models by 2x to 6x without compromising accuracy; the speedup is supported both theoretically and empirically.

Ruiqi Zhang, Daman Arora, Song Mei + 1 more | 2026-03-06 | cs
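
The SPEED-RL summary above says only that the method samples intermediate-difficulty prompts. One way such a filter could look is sketched below, under stated assumptions: difficulty is estimated as the pass rate over a few cheap rollouts, and a prompt is kept only when that rate falls strictly between two thresholds. The function names, the thresholds, and the toy solver are hypothetical and are not the paper's implementation.

```python
import random


def estimate_pass_rate(solve, prompt, n_rollouts, rng):
    """Fraction of n_rollouts attempts on which solve() succeeds."""
    return sum(solve(prompt, rng) for _ in range(n_rollouts)) / n_rollouts


def select_intermediate(prompts, solve, n_rollouts=8, low=0.0, high=1.0, seed=0):
    """Keep prompts whose estimated pass rate lies strictly between low and high.

    Assumption: prompts that are always solved or never solved carry little
    learning signal, so they are skipped before the expensive training step.
    """
    rng = random.Random(seed)
    kept = []
    for prompt in prompts:
        if low < estimate_pass_rate(solve, prompt, n_rollouts, rng) < high:
            kept.append(prompt)
    return kept


# Toy stand-in for a model: each prompt has a fixed success probability.
TRUE_PASS_PROB = {"easy": 1.0, "medium": 0.5, "hard": 0.0}


def toy_solve(prompt, rng):
    """Return True with the prompt's fixed probability (hypothetical solver)."""
    return rng.random() < TRUE_PASS_PROB[prompt]


selected = select_intermediate(["easy", "medium", "hard"], toy_solve, n_rollouts=64)
print(selected)
```

In this toy setup the "easy" and "hard" prompts are filtered out because every rollout gives the same outcome, so only the prompt with mixed outcomes reaches training.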

Bures-Wasserstein Flow Matching for Graph Generation

This paper introduces BWFlow, a graph generation framework that overcomes the limitations of independent node-edge modeling by utilizing Bures-Wasserstein optimal transport on Markov random fields to construct a smooth, theoretically grounded probability path for the joint evolution of graph components, resulting in improved training convergence and sampling efficiency.

Keyue Jiang, Jiahao Cui, Xiaowen Dong + 1 more | 2026-03-06 | cs

From Bandit Regret to FDR Control: Online Selective Generation with Adversarial Feedback Unlocking

This paper proposes ExSUL, a novel online learning framework that enables selective generation for large language models to robustly control the False Discovery Rate (FDR) and achieve optimal regret bounds in non-stationary and adversarial environments by converting bandit regret into FDR guarantees and unlocking additional learning signals from partial user feedback.

Minjae Lee, Yoonjae Jung, Sangdon Park | 2026-03-06 | cs

Structured Kolmogorov-Arnold Neural ODEs for Interpretable Learning and Symbolic Discovery of Nonlinear Dynamics

This paper introduces Structured Kolmogorov-Arnold Neural ODEs (SKANODEs), a framework that combines structured state-space modeling with Kolmogorov-Arnold Networks to accurately recover interpretable physical latent states and discover compact symbolic governing equations for nonlinear dynamical systems, outperforming black-box neural ODEs and classical identification methods across synthetic and real-world datasets.

Wei Liu, Kiran Bacsa, Loon Ching Tang + 1 more | 2026-03-06 | physics
