DRUPI: Dataset Reduction Using Privileged Information

The paper introduces DRUPI (Dataset Reduction Using Privileged Information), a framework that enhances dataset reduction by synthesizing auxiliary privileged information, such as feature or attention labels, alongside the reduced data to significantly improve model training performance across various benchmarks.

Shaobo Wang, Youxin Jiang, Tianle Niu, Yantai Yang, Ruiji Zhang, Shuhao Hu, Shuaiyu Zhang, Chenghao Sun, Weiya Li, Conghui He, Xuming Hu, Linfeng Zhang · Wed, 11 Ma · cs.AI

Robust Training of Neural Networks at Arbitrary Precision and Sparsity

This paper introduces a unified framework that models quantization and sparsification as additive noise to derive a principled, noise-corrective gradient path, enabling the stable training of neural networks at arbitrary low precisions and sparsity levels without relying on heuristic estimators like the Straight-Through Estimator.

Chengxi Ye, Grace Chu, Yanfeng Liu, Yichi Zhang, Lukasz Lew, Li Zhang, Mark Sandler, Andrew Howard · Wed, 11 Ma · cs.AI
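The additive-noise view this summary describes can be illustrated with a toy uniform quantizer: the quantized weight decomposes as q(w) = w + n, where the rounding residual n = q(w) − w is bounded by half a quantization step. The quantizer below is a minimal generic sketch of that decomposition, not the authors' training framework.

```python
import numpy as np

def quantize(w, num_bits=4):
    """Uniform quantizer over [-1, 1] with 2**num_bits - 1 levels."""
    scale = (2 ** num_bits - 1) / 2.0
    return np.round(np.clip(w, -1.0, 1.0) * scale) / scale

w = np.array([0.13, -0.72, 0.50])
q = quantize(w)
n = q - w  # additive-noise decomposition: q(w) = w + n
# |n| never exceeds half a quantization step, i.e. 1 / (2**num_bits - 1).
```

Treating n as noise (rather than back-propagating through the non-differentiable round) is what lets a corrective gradient path replace heuristics like the Straight-Through Estimator.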

Sparse Variational Student-t Processes for Heavy-tailed Modeling

This paper introduces Sparse Variational Student-t Processes (SVTP), a scalable framework that extends sparse inducing point methods to Student-t processes via novel inference algorithms and natural gradient optimization, achieving superior robustness to outliers and heavy-tailed data with significantly faster convergence and lower prediction error compared to sparse Gaussian processes on large datasets.

Jian Xu, Delu Zeng, John Paisley · Wed, 11 Ma · cs.AI
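The heavy-tailed prior at the core of SVTP can be sketched via the standard Gaussian scale-mixture construction of the multivariate Student-t: a correlated Gaussian draw rescaled by a chi-square factor. This is a generic illustration of a Student-t process prior sample, not the paper's sparse variational inference algorithm; the RBF kernel and parameter values are arbitrary choices for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf_kernel(x, lengthscale=1.0):
    """Squared-exponential kernel matrix for 1-D inputs."""
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def student_t_process_sample(K, nu, n_samples=1):
    """Multivariate Student-t draw as a Gaussian scale mixture:
    f = sqrt(nu / g) * L z, with g ~ chi-square(nu), z ~ N(0, I)."""
    n = K.shape[0]
    L = np.linalg.cholesky(K + 1e-8 * np.eye(n))  # jitter for stability
    z = rng.normal(size=(n_samples, n)) @ L.T
    g = rng.chisquare(nu, size=(n_samples, 1))
    return np.sqrt(nu / g) * z

x = np.linspace(0.0, 5.0, 50)
f = student_t_process_sample(rbf_kernel(x), nu=3.0, n_samples=4)
# Small nu gives heavy tails; as nu -> infinity this recovers a GP draw.
```

The shared scale factor g is what makes outlier-heavy draws more probable than under a Gaussian process, which is the robustness property the paper exploits.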

Enhancing Heterogeneous Multi-Agent Cooperation in Decentralized MARL via GNN-driven Intrinsic Rewards

This paper proposes CoHet, a novel algorithm that leverages Graph Neural Network-driven intrinsic rewards to enable effective decentralized learning and cooperation among heterogeneous multi-agent systems despite challenges like partial observability and reward sparsity, demonstrating superior performance over state-of-the-art methods in standard benchmarks.

Jahir Sadik Monon, Deeparghya Dutta Barua, Md. Mosaddek Khan · Wed, 11 Ma · cs.AI

FinTexTS: Financial Text-Paired Time-Series Dataset via Semantic-Based and Multi-Level Pairing

The paper introduces FinTexTS, a large-scale financial text-paired time-series dataset constructed via a novel semantic-based and multi-level pairing framework that overcomes the limitations of simple keyword matching by leveraging LLMs to align news articles with stock prices across macro, sector, related company, and target-company levels, thereby significantly improving stock price forecasting performance.

Jaehoon Lee, Suhwan Park, Tae Yoon Lim, Seunghan Lee, Jun Seo, Dongwan Kang, Hwanil Choi, Minjae Kim, Sungdong Yoo, SoonYoung Lee, Yongjae Lee, Wonbin Ahn · Wed, 11 Ma · cs.AI

UAT-LITE: Inference-Time Uncertainty-Aware Attention for Pretrained Transformers

The paper proposes UAT-LITE, an inference-time framework that injects Monte Carlo dropout into the self-attention mechanisms of pretrained transformers to estimate token-level epistemic uncertainty and modulate attention, thereby significantly improving model calibration and selective prediction performance without requiring additional training or weight modifications.

Elias Hossain, Shubhashis Roy Dipta, Subash Neupane, Rajib Rana, Ravid Shwartz-Ziv, Ivan Garibay, Niloofar Yousefi · Wed, 11 Ma · cs.AI
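The mechanism the summary describes, Monte Carlo dropout applied inside attention at inference with the sample variance serving as an uncertainty signal, can be sketched on toy attention logits. Applying dropout to the pre-softmax scores and the function names below are illustrative assumptions, not the exact UAT-LITE formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def mc_dropout_attention(scores, n_samples=32, p_drop=0.1):
    """Run several stochastic passes with dropout on the attention
    logits; return the mean weights and their per-entry variance,
    a token-level proxy for epistemic uncertainty."""
    draws = []
    for _ in range(n_samples):
        keep = rng.random(scores.shape) >= p_drop
        draws.append(softmax(np.where(keep, scores, -1e9)))
    draws = np.stack(draws)
    return draws.mean(axis=0), draws.var(axis=0)

scores = rng.normal(size=(4, 4))  # toy single-head attention logits
mean_w, var_w = mc_dropout_attention(scores)
```

Because only the forward pass is perturbed, this kind of estimate requires no retraining or weight modification, which is the point of doing it at inference time.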

From Self-Evolving Synthetic Data to Verifiable-Reward RL: Post-Training Multi-turn Interactive Tool-Using Agents

This paper introduces EigenData, a unified framework that combines a self-evolving multi-agent system for synthesizing verifiable tool-use dialogues with a verifier-based reinforcement learning recipe, enabling scalable post-training of interactive agents that achieve state-of-the-art performance on complex multi-turn benchmarks without relying on expensive human annotation.

Jiaxuan Gao, Jiaao Chen, Chuyi He, Shusheng Xu, Di Jin, Yi Wu · Wed, 11 Ma · cs.AI

Empowering All-in-Loop Health Management of Spacecraft Power System in the Mega-Constellation Era via Human-AI Collaboration

This paper addresses the challenges of health management for spacecraft power systems in the emerging mega-constellation era by proposing the "Aligning Underlying Capabilities" principle and introducing SpaceHMchat, an open-source Human-AI collaboration framework validated on a realistic hardware platform and a new large-scale dataset to achieve high-precision, interpretable, and efficient all-in-loop health management.

Yi Di, Zhibin Zhao, Fujin Wang, Xue Liu, Jiafeng Tang, Jiaxin Ren, Zhi Zhai, Xuefeng Chen · Wed, 11 Ma · cs.AI

Reinforcement Learning for Self-Improving Agent with Skill Library

This paper introduces SAGE, a novel Reinforcement Learning framework that enhances LLM-based agents' self-improvement capabilities by utilizing a skill library with sequential rollouts and skill-integrated rewards, achieving significantly higher goal completion rates and greater efficiency than existing methods on the AppWorld benchmark.

Jiongxiao Wang, Qiaojing Yan, Yawei Wang, Yijun Tian, Soumya Smruti Mishra, Zhichao Xu, Megha Gandhi, Panpan Xu, Lin Lee Cheong · Wed, 11 Ma · cs.AI

Small Language Models for Efficient Agentic Tool Calling: Outperforming Large Models with Targeted Fine-tuning

This paper demonstrates that single-epoch, domain-adapted fine-tuning of a 350M-parameter Small Language Model (OPT-350M) can significantly outperform larger models and existing baselines on tool-calling tasks, achieving a 77.55% pass rate on ToolBench and showing that targeted training can make generative AI more cost-effective and scalable for enterprise use.

Polaris Jhandi, Owais Kazi, Shreyas Subramanian, Neel Sendas · Wed, 11 Ma · cs.AI

Multi-Agent Reinforcement Learning with Communication-Constrained Priors

This paper proposes a communication-constrained multi-agent reinforcement learning framework that utilizes a generalized model and dual mutual information estimator to distinguish between lossy and lossless messages, thereby quantifying their impact on global rewards to enhance cooperative policy learning in complex, dynamic environments.

Guang Yang, Tianpei Yang, Jingwen Qiao, Yanqing Wu, Jing Huo, Xingguo Chen, Yang Gao · Wed, 11 Ma · cs.AI

AlphaApollo: A System for Deep Agentic Reasoning

AlphaApollo is an agentic reasoning system that enhances foundation models' performance on complex, long-horizon tasks by orchestrating multi-turn agentic reasoning, turn-level reinforcement learning for tool-use optimization, and a propose-judge-update evolution loop with verification.

Zhanke Zhou, Chentao Cao, Xiao Feng, Xuan Li, Zongze Li, Xiangyu Lu, Jiangchao Yao, Weikai Huang, Tian Cheng, Jianghangfan Zhang, Tangyu Jiang, Linrui Xu, Yiming Zheng, Brando Miranda, Tongliang Liu, Sanmi Koyejo, Masashi Sugiyama, Bo Han · Wed, 11 Ma · cs.AI