Reject, Resample, Repeat: Understanding Parallel Reasoning in Language Model Inference

This paper introduces a particle filtering framework for rigorously analyzing the accuracy-cost tradeoffs of parallel inference methods in large language models. It establishes theoretical guarantees, identifies fundamental limits, and demonstrates that sampling error alone does not fully predict final model accuracy.

Noah Golowich, Fan Chen, Dhruv Rohatgi, Raghav Singhal, Carles Domingo-Enrich, Dylan J. Foster, Akshay Krishnamurthy · Tue, 10 Ma · cs.LG
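The "reject, resample, repeat" loop in the title builds on standard particle-filtering machinery: maintain a population of candidates with importance weights, then resample in proportion to those weights. A minimal sketch of the generic multinomial resampling step (an illustration of the classical technique, not the paper's specific algorithm):

```python
import random

def resample(particles, weights):
    """Multinomial resampling: draw len(particles) new particles with
    probability proportional to their weights, then reset all weights
    to uniform. Low-weight candidates tend to be dropped; high-weight
    ones are duplicated."""
    n = len(particles)
    total = sum(weights)
    probs = [w / total for w in weights]
    new_particles = random.choices(particles, weights=probs, k=n)
    return new_particles, [1.0 / n] * n
```

In an LLM-inference setting, the "particles" would be partial reasoning traces and the weights would come from a scoring or verifier model; repeating the weight-then-resample step concentrates compute on promising traces.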

Cost-Driven Representation Learning for Linear Quadratic Gaussian Control: Part II

This paper establishes finite-sample guarantees for cost-driven state representation learning in infinite-horizon, time-invariant Linear Quadratic Gaussian (LQG) control. It analyzes two approaches, explicit latent modeling and implicit MuZero-like dynamics, and introduces a key technical ingredient: a proof of persistency of excitation for a novel stochastic process arising from quadratic regression.

Yi Tian, Kaiqing Zhang, Russ Tedrake, Suvrit Sra · Tue, 10 Ma · cs.LG

Making LLMs Optimize Multi-Scenario CUDA Kernels Like Experts

This paper introduces MSKernelBench, a comprehensive benchmark covering diverse multi-scenario GPU kernels, and CUDAMaster, a multi-agent, hardware-aware optimization system built on it. CUDAMaster achieves significant speedups, often matching or surpassing closed-source libraries such as cuBLAS, advancing general-purpose automated CUDA kernel optimization beyond current ML-focused methods.

Yuxuan Han, Meng-Hao Guo, Zhengning Liu, Wenguang Chen, Shi-Min Hu · Tue, 10 Ma · cs.LG

Sampling via Stochastic Interpolants by Langevin-based Velocity and Initialization Estimation in Flow ODEs

This paper proposes a novel sampling method for unnormalized Boltzmann densities. It uses a sequence of Langevin samplers to efficiently simulate a probability flow ODE derived from linear stochastic interpolants, generating intermediate samples and robustly estimating the velocity field. The authors provide theoretical convergence guarantees and demonstrate effectiveness on challenging multimodal distributions and Bayesian inference tasks.

Chenguang Duan, Yuling Jiao, Gabriele Steidl, Christian Wald, Jerry Zhijian Yang, Ruizhe Zhang · Thu, 12 Ma · stat
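The Langevin samplers the summary mentions are built from a simple update rule: move along the gradient of the log-density and add Gaussian noise. A minimal sketch of one unadjusted Langevin step (the generic building block, not the paper's full interpolant-based scheme):

```python
import math
import random

def langevin_step(x, grad_log_density, step_size):
    """One unadjusted Langevin update in 1D:
        x' = x + eps * grad log p(x) + sqrt(2 * eps) * z,  z ~ N(0, 1).
    Iterating this produces approximate samples from the density p."""
    noise = random.gauss(0.0, 1.0)
    return x + step_size * grad_log_density(x) + math.sqrt(2 * step_size) * noise
```

For a standard Gaussian target, `grad_log_density` is simply `lambda v: -v`; running the chain for many steps yields samples whose mean and variance approach 0 and 1 (up to a discretization bias of order `step_size`). The paper's method runs such samplers at intermediate times of a stochastic interpolant to estimate the velocity field of the flow ODE.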