Healthy LLMs? Benchmarking LLM Knowledge of UK Government Public Health Information

This paper introduces PubHealthBench, a new benchmark of over 8,000 questions derived from UK government guidance, to evaluate LLM knowledge of public health information. It finds that state-of-the-art models excel at multiple-choice tasks but that their performance on free-form responses remains limited, highlighting the need for additional safeguards in real-world applications.

Joshua Harris, Fan Grayson, Felix Feldman + 8 more · 2026-03-05 · cs.LG

Emotion-Gradient Metacognitive RSI (Part I): Theoretical Foundations and Single-Agent Architecture

This paper establishes the theoretical foundations and single-agent architecture of the Emotion-Gradient Metacognitive RSI (EG-MRSI) framework. The framework integrates introspective metacognition and emotion-driven intrinsic motivation to enable provably safe recursive self-improvement through differentiable reward signals and quantifiable semantic learning metrics.

Rintaro Ando · 2026-03-05 · cs.AI

Convergence, Sticking and Escape: Stochastic Dynamics Near Critical Points in SGD

This paper analyzes the convergence and escape dynamics of Stochastic Gradient Descent (SGD) in one-dimensional landscapes. It establishes that SGD reliably converges to local minima but may linger near local maxima, depending on noise variance and geometry, and it provides specific results for the probability of escaping from a sharp maximum to a neighboring minimum.

Dmitry Dudukalov, Artem Logachov, Vladimir Lotov + 3 more · 2026-03-05 · cs.LG

A Copula Based Supervised Filter for Feature Selection in Diabetes Risk Prediction Using Machine Learning

This paper proposes a computationally efficient supervised filter for feature selection, based on an upper-tail concordance score implied by the Gumbel copula, that identifies features that are jointly extreme with the positive class. The filter ranks clinically relevant predictors of diabetes risk across large-scale and clinical datasets, outperforming standard filters and matching strong baselines.

Agnideep Aich, Md Monzur Murshed, Sameera Hewage + 1 more · 2026-03-05 · cs.LG

Boosting In-Context Learning in LLMs Through the Lens of Classical Supervised Learning

This paper introduces Supervised Calibration (SC), a loss-minimization framework that enhances In-Context Learning in Large Language Models. SC learns optimal per-class affine transformations that correct systematic biases and alter the orientation of decision boundaries, achieving state-of-the-art performance across multiple models and datasets.

Korel Gundem, Juncheng Dong, Dennis Zhang + 2 more · 2026-03-05 · cs.AI