Validation of a Small Language Model for DSM-5 Substance Category Classification in Child Welfare Records

This study validates that a locally hosted 20-billion-parameter small language model can reliably classify specific DSM-5 substance categories within child welfare investigation narratives, achieving near-perfect agreement with human experts for five major substance types despite limitations with low-prevalence categories.

Brian E. Perron, Dragan Stoll, Bryan G. Victor, Zia Qia, Andreas Jud, Joseph P. Ryan | Tue, 10 Ma | cs.CL

Breaking Training Bottlenecks: Effective and Stable Reinforcement Learning for Coding Models

This paper introduces MicroCoder-GRPO, an enhanced Group Relative Policy Optimization framework featuring innovations like conditional truncation masking and diversity-driven temperature selection, alongside a challenging new dataset and robust evaluator, to overcome training bottlenecks in modern coding models and achieve significant performance gains on LiveCodeBench v6.

Zongqian Li, Shaohan Huang, Zewen Chi, Yixuan Su, Lexin Zhou, Li Dong, Nigel Collier, Furu Wei | Tue, 10 Ma | cs.LG

Scaling Data Difficulty: Improving Coding Models via Reinforcement Learning on Fresh and Challenging Problems

This paper introduces MicroCoder, a high-quality dataset of curated, recent, and challenging competitive programming problems processed through a four-stage framework with automatic difficulty filtering, which significantly boosts coding model performance on unseen hard tasks compared to existing baselines.

Zongqian Li, Tengchao Lv, Shaohan Huang, Yixuan Su, Qinzheng Sun, Qiufeng Yin, Ying Xin, Scarlett Li, Lei Cui, Nigel Collier, Furu Wei | Tue, 10 Ma | cs.LG