Vibe Researching as Wolf Coming: Can AI Agents with Skills Replace or Augment Social Scientists?

This paper argues that AI agents equipped with specialized skills can augment, but not fully replace, social scientists by executing codifiable research tasks autonomously through "vibe researching," while highlighting the enduring necessity of human theoretical originality and tacit knowledge alongside the profession's emerging risks of stratification and pedagogical crisis.

Yongjun Zhang · 2026-03-10 · 💻 cs

A Mathematical Theory of Agency and Intelligence

This paper introduces "bipredictability" (P) as a fundamental, bounded measure of shared information between observations, actions, and outcomes to distinguish mere agency from true intelligence, demonstrating that current AI systems lack the self-monitoring feedback loops necessary for adaptive learning and proposing a thalamocortical-inspired architecture to restore it.

Wael Hafez, Chenan Wei, Rodrigo Pena, Amir Nazeri, Cameron Reid · 2026-03-10 · 🔢 math

Scaling Search Relevance: Augmenting App Store Ranking with LLM-Generated Judgments

This paper addresses the scarcity of expert textual relevance labels in large-scale app store search by leveraging a specialized, fine-tuned LLM to generate millions of high-quality labels, which, when used to augment the production ranker, significantly improves both offline metrics and real-world conversion rates, particularly for tail queries lacking reliable behavioral data.

Evangelia Christakopoulou, Vivekkumar Patel, Hemanth Velaga, Sandip Gaikwad, Sean Suchter, Venkat Sundaranatha · 2026-03-10 · 🤖 cs.LG

Attn-QAT: 4-Bit Attention With Quantization-Aware Training

This paper introduces Attn-QAT, the first systematic 4-bit quantization-aware training framework for attention mechanisms that ensures stable FP4 training and inference by matching low-precision recomputation in the backward pass and correcting implicit precision assumptions, thereby eliminating quality drops and delivering up to 1.5x speedup on FP4-capable GPUs without relying on outlier-mitigation heuristics.

Peiyuan Zhang, Matthew Noto, Wenxuan Tan, Chengquan Jiang, Will Lin, Wei Zhou, Hao Zhang · 2026-03-10 · 🤖 cs.LG

How Well Do Multimodal Models Reason on ECG Signals?

This paper introduces a reproducible, scalable framework for evaluating multimodal models on ECG signals by decomposing reasoning into "Perception" (verified via code generation) and "Deduction" (verified via retrieval against clinical criteria) to address the limitations of existing manual or superficial evaluation methods.

Maxwell A. Xu, Harish Haresamudram, Catherine W. Liu, Patrick Langer, Jathurshan Pradeepkumar, Wanting Mao, Sunita J. Ferns, Aradhana Verma, Jimeng Sun, Paul Schmiedmayer, Xin Liu, Daniel McDuff, Emily B. Fox, James M. Rehg · 2026-03-10 · 🤖 cs.LG

Conformal Prediction for Risk-Controlled Medical Entity Extraction Across Clinical Domains

This paper proposes a conformal prediction framework that ensures safe, domain-specific deployment of LLMs for medical entity extraction by adapting calibration thresholds to counteract the distinct underconfidence observed in structured FDA labels and overconfidence in free-text radiology reports, thereby achieving target coverage guarantees with manageable rejection rates across diverse clinical settings.

Manil Shrestha, Edward Kim · 2026-03-10 · 💬 cs.CL
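
The idea of adapting calibration thresholds per domain can be illustrated with a generic split-conformal sketch. This is not the paper's procedure: the function names, the use of a nonconformity score such as one minus model confidence, and the accept/reject rule are all assumptions made here for illustration.

```python
import math


def conformal_threshold(cal_scores, alpha):
    """Split-conformal quantile of calibration nonconformity scores.

    cal_scores: nonconformity scores (hypothetical choice: 1 - confidence)
                measured on held-out examples whose labels are known.
    alpha:      target miscoverage rate, e.g. 0.1 for 90% coverage.
    """
    n = len(cal_scores)
    if n == 0:
        raise ValueError("need at least one calibration score")
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points to certify this alpha: accept everything.
        return float("inf")
    return sorted(cal_scores)[rank - 1]


def calibrate_per_domain(cal_by_domain, alpha):
    """Compute one threshold per domain, so an overconfident domain and an
    underconfident domain each get their own cutoff."""
    return {
        domain: conformal_threshold(scores, alpha)
        for domain, scores in cal_by_domain.items()
    }


def accept(score, domain, thresholds):
    """Keep an extraction if its score is within the domain's threshold;
    anything above it is rejected (for example, routed to human review)."""
    return score <= thresholds[domain]
```

Calibrating each domain separately is what lets the cutoff move in opposite directions for differently miscalibrated sources; a single pooled threshold would over-reject in one domain and under-reject in the other.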

HarmonyCell: Automating Single-Cell Perturbation Modeling under Semantic and Distribution Shifts

HarmonyCell is an end-to-end agent framework that automates single-cell perturbation modeling by combining an LLM-driven semantic unifier to resolve metadata incompatibilities and an adaptive Monte Carlo Tree Search engine to synthesize architectures that handle distribution shifts, thereby achieving high execution success and outperforming expert baselines without manual engineering.

Wenxuan Huang, Mingyu Tsoi, Yanhao Huang, Xinjie Mao, Xue Xia, Hao Wu, Jiaqi Wei, Yuejin Yang, Lang Yu, Cheng Tan, Xiang Zhang, Zhangyang Gao, Siqi Sun · 2026-03-10 · 💻 cs

LLM-assisted Semantic Option Discovery for Facilitating Adaptive Deep Reinforcement Learning

This paper proposes a novel LLM-driven closed-loop framework that maps natural language instructions to executable rules and semantically annotates options to enhance the data efficiency, interpretability, and cross-environment transferability of Deep Reinforcement Learning, with experimental validation showing superior performance in constraint compliance and skill reuse.

Chang Yao, Jinghui Qin, Kebing Jin, Hankz Hankui Zhuo · 2026-03-10 · 💻 cs

Leveraging Model Soups to Classify Intangible Cultural Heritage Images from the Mekong Delta

This paper proposes a robust framework combining the hybrid CoAtNet architecture with model soups ensembling to effectively classify Intangible Cultural Heritage images from the Mekong Delta, achieving state-of-the-art performance on the ICH-17 dataset by reducing variance and enhancing generalization in data-scarce, high-similarity settings.

Quoc-Khang Tran, Minh-Thien Nguyen, Nguyen-Khang Pham · 2026-03-10 · 🤖 cs.LG
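
For readers unfamiliar with model soups, the core operation is averaging the weights of several fine-tuned models that share one architecture, yielding a single model rather than a multi-model ensemble at inference time. The sketch below shows the uniform variant on plain Python lists. The summary does not say which soup recipe the paper uses, so the uniform choice, the `uniform_soup` name, and the dict-of-lists weight format are all assumptions for illustration.

```python
def uniform_soup(state_dicts):
    """Average parameters element-wise across models of the same architecture.

    state_dicts: list of dicts mapping a parameter name to a flat list of
                 floats (a hypothetical stand-in for real framework tensors).
    """
    if not state_dicts:
        raise ValueError("need at least one model")
    keys = state_dicts[0].keys()
    for sd in state_dicts:
        if sd.keys() != keys:
            raise ValueError("all models must have the same parameter names")

    soup = {}
    for key in keys:
        length = len(state_dicts[0][key])
        if any(len(sd[key]) != length for sd in state_dicts):
            raise ValueError(f"shape mismatch for parameter {key!r}")
        # zip(*...) groups the i-th entry of this parameter across all models.
        columns = zip(*(sd[key] for sd in state_dicts))
        soup[key] = [sum(col) / len(state_dicts) for col in columns]
    return soup
```

Because the result is one set of weights, inference cost matches that of a single model, which is the usual motivation for souping over output-level ensembling.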