BTZSC: A Benchmark for Zero-Shot Text Classification Across Cross-Encoders, Embedding Models, Rerankers and LLMs

This paper introduces BTZSC, a comprehensive benchmark of 22 datasets designed to systematically evaluate and compare the zero-shot text classification capabilities of NLI cross-encoders, embedding models, rerankers, and instruction-tuned LLMs, revealing that modern rerankers currently achieve state-of-the-art performance while embedding models offer the best accuracy-latency trade-off.

Ilias Aarab · Fri, 13 Mar · 💬 cs.CL
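
Embedding-based classification, one of the four approach families BTZSC compares, reduces to nearest-label search in embedding space. The sketch below shows the idea; the encoder checkpoint, label prompts, and example text are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of zero-shot classification with an embedding model:
# embed a short description of each label, embed the document, and pick
# the label with the highest cosine similarity. Model and prompts are
# placeholder choices, not the benchmark's configuration.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder checkpoint

labels = ["sports", "politics", "technology"]
label_embs = model.encode([f"This text is about {l}." for l in labels])

def classify(text: str) -> str:
    doc = model.encode([text])[0]
    sims = label_embs @ doc / (
        np.linalg.norm(label_embs, axis=1) * np.linalg.norm(doc)
    )
    return labels[int(np.argmax(sims))]

print(classify("The team clinched the title in the final minutes."))
```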

A Quantitative Characterization of Forgetting in Post-Training

This paper provides a theoretical framework for quantifying forgetting in continual post-training of generative models by demonstrating how the choice of divergence objective (forward vs. reverse KL), replay strategies, and geometric overlap between old and new task distributions determine whether models suffer from mass forgetting or controlled component drift.

Krishnakumar Balasubramanian, Shiva Prasad Kasiviswanathan · Fri, 13 Mar · 📊 stat
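
The forward/reverse KL distinction at the heart of the analysis has a standard intuition: forward KL penalizes missing mass of the target (mass-covering), while reverse KL penalizes placing mass where the target has none (mode-seeking). A toy numeric check, with made-up discrete distributions rather than anything from the paper:

```python
# Two candidate fits q to a two-mode target p: one spreads mass over both
# modes, one collapses onto a single mode. Forward KL(p||q) punishes the
# collapsed fit; reverse KL(q||p) punishes the spread-out fit.
import numpy as np

p = np.array([0.5, 0.5, 1e-8]); p /= p.sum()    # target with two modes
q_cover = np.array([0.45, 0.45, 0.10])          # mass-covering fit
q_mode = np.array([0.98, 0.01, 0.01])           # mode-seeking fit

def kl(a, b):
    return float(np.sum(a * np.log(a / b)))

for name, q in [("mass-covering", q_cover), ("mode-seeking", q_mode)]:
    print(f"{name}: KL(p||q)={kl(p, q):.3f}  KL(q||p)={kl(q, p):.3f}")
```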

SSRCA: a novel machine learning pipeline to perform sensitivity analysis for agent-based models

This paper introduces SSRCA, a novel machine learning pipeline that performs sensitivity analysis on complex agent-based models by identifying sensitive parameters, revealing common output patterns, and determining the specific input values that generate them, as demonstrated through a tumor spheroid growth model where it outperforms the Sobol' method in robustness.

Edward H. Rohr, John T. Nardini · 2026-03-11 · 🧬 q-bio
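
The general shape of such a pipeline (sample parameters, simulate, cluster the output patterns, trace clusters back to inputs) can be sketched in a few lines. Everything below is a stand-in: a logistic growth curve replaces the tumor-spheroid ABM, and the sampling ranges and cluster count are arbitrary.

```python
# Cluster-based sensitivity-analysis sketch: which (r, K) values produce
# which growth-curve patterns? Logistic growth stands in for an expensive
# agent-based simulation; this is not the SSRCA implementation.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
t = np.linspace(0, 10, 50)

def toy_model(r, K):
    return K / (1 + (K - 1) * np.exp(-r * t))  # starts at 1, saturates at K

params = rng.uniform([0.1, 2.0], [2.0, 20.0], size=(200, 2))  # (r, K) draws
outputs = np.array([toy_model(r, K) for r, K in params])

clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(outputs)
for c in range(3):
    sel = params[clusters == c]
    print(f"pattern {c}: mean r={sel[:, 0].mean():.2f}, mean K={sel[:, 1].mean():.2f}")
```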

Accounting for shared covariates in semi-parametric Bayesian additive regression trees

This paper proposes a novel extension to semi-parametric Bayesian additive regression trees (BART) that resolves non-identifiability and bias issues by modifying tree-generation moves to allow shared covariates between linear and non-parametric components, thereby enabling the modeling of complex interactions while maintaining competitive performance across simulation and real-world applications.

Estevão B. Prado, Andrew C. Parnell, Keefe Murphy + 3 more · 2026-03-10 · 🤖 cs.LG
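
The underlying model class is y = x'β + f(x) + ε, and the identifiability problem arises because a covariate appearing in both components lets the two parts trade signal. A quick stand-in demonstration (gradient boosting in place of BART, with an invented data-generating process):

```python
# Naive two-stage fit of y = 1.5*x + sin(3x) + noise with the same covariate
# in the linear and nonparametric parts. The linear fit absorbs part of
# sin(3x), biasing beta_hat below 1.5: the confounding the paper targets.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)
x = rng.uniform(-2, 2, size=(500, 1))
y = 1.5 * x[:, 0] + np.sin(3 * x[:, 0]) + rng.normal(0, 0.3, 500)

lin = LinearRegression().fit(x, y)                              # parametric part
f_hat = GradientBoostingRegressor().fit(x, y - lin.predict(x))  # shared covariate

print("beta_hat =", lin.coef_[0])  # noticeably below the true 1.5
```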

Convergence and complexity of block majorization-minimization for constrained block-Riemannian optimization

This paper establishes the asymptotic convergence and $\widetilde{O}(\epsilon^{-2})$ iteration complexity of block majorization-minimization algorithms for smooth nonconvex optimization problems with block constraints on Riemannian manifolds, demonstrating their broad applicability and superior performance over standard Euclidean approaches.

Yuchen Li, Laura Balzano, Deanna Needell + 1 more · 2026-03-10 · 📊 stat
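
A concrete special case makes the block structure visible: the best rank-one approximation of a matrix is a smooth problem over a product of two spheres, and alternating closed-form block updates with renormalization as the retraction is a block MM scheme. The instance below is illustrative only, not the paper's general algorithm.

```python
# Block updates on S^{m-1} x S^{n-1} for the leading singular pair of A:
# each block subproblem has a closed-form minimizer, and dividing by the
# norm retracts the iterate back onto the sphere.
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(30, 20))
u = rng.normal(size=30); u /= np.linalg.norm(u)  # block 1 on the sphere
v = rng.normal(size=20); v /= np.linalg.norm(v)  # block 2 on the sphere

for _ in range(50):
    u = A @ v; u /= np.linalg.norm(u)    # exact u-block update + retraction
    v = A.T @ u; v /= np.linalg.norm(v)  # exact v-block update + retraction

print("u'Av =", u @ A @ v, " sigma_max =", np.linalg.svd(A, compute_uv=False)[0])
```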

Curse of Dimensionality in Neural Network Optimization

This paper demonstrates that training shallow neural networks with Lipschitz-continuous activation functions to approximate smooth target functions suffers from the curse of dimensionality: the population risk decays at a rate bounded by a power of time that depends inversely on the input dimension, whether the optimization is analyzed via empirical risk, population risk, or 2-Wasserstein gradient flow dynamics.

Sanghoon Na, Haizhao Yang · 2026-03-06 · 🔢 math
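
Schematically, a bound of this kind takes the following shape; the constant and the exact exponent below are placeholders, since the summary only states that the power of time scales inversely with dimension.

```latex
% Schematic shape of the lower bound (the constant c > 0 and the exact
% exponent are placeholders, not values from the paper): for input
% dimension d and training time t, the population risk of the trained
% shallow network f_t obeys
R(f_t) \gtrsim t^{-c/d},
% so the achievable decay rate in t deteriorates as the dimension d grows.
```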

Enabling stratified sampling in high dimensions via nonlinear dimensionality reduction

This paper proposes a method to enable effective stratified sampling in high-dimensional spaces by using neural active manifolds to identify a one-dimensional latent space that captures model variability, allowing for the creation of input partitions that align with model level sets to significantly reduce variance in uncertainty propagation.

Gianluca Geraci, Daniele E. Schiavazzi, Andrea Zanoni · 2026-03-06 · 🔢 math
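
The mechanics of stratifying on a learned one-dimensional coordinate can be sketched with a linear surrogate in place of the neural active manifold; the test function, dimensions, and bin count below are invented for illustration, and for brevity the same samples are reused for fitting and estimation.

```python
# Stratified estimate of E[f(X)] using a 1-D latent coordinate z = X @ w:
# a least-squares direction stands in for the paper's neural active
# manifold, and quantile bins of z define the strata.
import numpy as np

rng = np.random.default_rng(3)
d = 10
f = lambda X: np.exp(X @ (np.arange(1, d + 1) / d))  # toy model output

X = rng.uniform(-1, 1, size=(2000, d))
w, *_ = np.linalg.lstsq(X, f(X), rcond=None)         # pilot fit of a direction
z = X @ w                                            # latent coordinate

edges = np.quantile(z, np.linspace(0, 1, 11))        # 10 equal-mass strata
strata = np.clip(np.searchsorted(edges, z) - 1, 0, 9)
stratified = np.mean([f(X[strata == k]).mean() for k in range(10)])
print("stratified:", stratified, " plain MC:", f(X).mean())
```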