To Mix or To Merge: Toward Multi-Domain Reinforcement Learning for Large Language Models

This paper introduces M2RL, a comprehensive study comparing mixed multi-task training with separate training followed by model merging for multi-domain Reinforcement Learning with Verifiable Rewards (RLVR), revealing that reasoning-intensive domains exhibit synergistic effects with minimal interference and providing mechanistic insights through extensive experiments.

Haoqing Wang, Xiang Long, Ziheng Li, Yilong Xu, Tingguang Li, Yehui Tang | Tue, 10 Ma | cs

RIS Control through the Lens of Stochastic Network Calculus: An O-RAN Framework for Delay-Sensitive 6G Applications

This paper proposes DARIO, an O-RAN-compliant framework that leverages a novel Stochastic Network Calculus model to dynamically assign Reconfigurable Intelligent Surfaces (RIS) to users, achieving significant uplink delay reductions for heterogeneous 6G applications by solving a nonlinear integer program near-optimally with low computational overhead.

Oscar Adamuz-Hinojosa, Lanfranco Zanzi, Vincenzo Sciancalepore, Marco Di Renzo, Xavier Costa-Pérez | Tue, 10 Ma | cs

Self-Attention And Beyond the Infinite: Towards Linear Transformers with Infinite Self-Attention

This paper introduces Infinite Self-Attention (InfSA) and its linear-time variant, Linear-InfSA, a spectral reformulation of self-attention as a diffusion process on token graphs that achieves state-of-the-art ImageNet accuracy and enables efficient, memory-free inference at ultra-high resolutions (up to 9216×9216) by replacing the quadratic softmax cost with a Neumann series approximation.

Giorgio Roffo, Luke Palmer | Tue, 10 Ma | cs

HarmonyCell: Automating Single-Cell Perturbation Modeling under Semantic and Distribution Shifts

HarmonyCell is an end-to-end agent framework that automates single-cell perturbation modeling by combining an LLM-driven semantic unifier to resolve metadata incompatibilities and an adaptive Monte Carlo Tree Search engine to synthesize architectures that handle distribution shifts, thereby achieving high execution success and outperforming expert baselines without manual engineering.

Wenxuan Huang, Mingyu Tsoi, Yanhao Huang, Xinjie Mao, Xue Xia, Hao Wu, Jiaqi Wei, Yuejin Yang, Lang Yu, Cheng Tan, Xiang Zhang, Zhangyang Gao, Siqi Sun | Tue, 10 Ma | cs

Building the ethical AI framework of the future: from philosophy to practice

This paper proposes an ethics-by-design control architecture that operationalizes AI governance across the entire lifecycle by embedding philosophical reasoning into a triple-gate enforcement structure (Metric, Governance, and Eco) with measurable triggers and audit trails, thereby translating normative commitments into testable controls compatible with existing MLOps pipelines and major regulatory frameworks like the EU AI Act and NIST RMF.

Jasper Kyle Catapang | Tue, 10 Ma | cs

Causal Analysis of Author Demographics in Academic Peer Review

Using causal inference on a dataset of 530 papers, this study quantifies statistically significant disadvantages in academic peer review rankings for authors from minority racial groups, female authors, and those affiliated with institutions in the Global South, highlighting the urgent need for fairness interventions in both traditional and AI-driven assessment systems.

Uttamasha Anjally Oyshi, Gibson Nkhata, Susan Gauch | Tue, 10 Ma | cs

Margin-Consistent Deep Subtyping of Invasive Lung Adenocarcinoma via Perturbation Fidelity in Whole-Slide Image Analysis

This paper proposes a margin-consistent deep subtyping framework for invasive lung adenocarcinoma that integrates attention-weighted aggregation, contrastive regularization, and a novel Perturbation Fidelity scoring mechanism to achieve robust, high-accuracy classification across multiple architectures and demonstrate cross-institutional generalizability on whole-slide images.

Meghdad Sabouri Rad, Junze (Vincent) Huang, Mohammad Mehdi Hosseini, Rakesh Choudhary, Saverio J. Carello, Ola El-Zammar, Michel R. Nasr, Bardia Rodd | Tue, 10 Ma | cs

PaLMR: Towards Faithful Visual Reasoning via Multimodal Process Alignment

PaLMR is a novel framework that enhances the faithfulness of multimodal large language models by aligning both the reasoning process and outcomes through a perception-aligned data layer and a hierarchical reward fusion scheme, thereby significantly reducing visual hallucinations while achieving state-of-the-art performance on key benchmarks.

Yantao Li, Qiang Hui, Chenyang Yan, Kanzhi Cheng, Fang Zhao, Chao Tan, Huanling Gao, Jianbing Zhang, Kai Wang, Xinyu Dai, Shiguo Lian | Tue, 10 Ma | cs

Digital Twin-Enabled Mobility-Aware Cooperative Caching in Vehicular Edge Computing

This paper proposes a Digital Twin-enabled framework (DAPR) that integrates asynchronous federated learning, a GRU-VAE prediction model, and deep reinforcement learning to optimize client selection and content request prediction, thereby significantly improving cache hit ratios and reducing transmission latency in vehicular edge computing systems.

Jiahao Zeng, Zhenkui Shi, Chunpei Li, Mengkai Yan, Hongliang Zhang, Sihan Chen, Xiantao Hu, Xianxian Li | Tue, 10 Ma | cs

GameVerse: Can Vision-Language Models Learn from Video-based Reflection?

The paper introduces GameVerse, a comprehensive benchmark featuring a novel reflect-and-retry paradigm and a hierarchical taxonomy across 15 games, demonstrating that Vision-Language Models can effectively improve their gameplay policies through video-based reflection by combining failure trajectories with expert tutorials.

Kuan Zhang, Dongchen Liu, Qiyue Zhao, Jinkun Hou, Xinran Zhang, Qinlei Xie, Miao Liu, Yiming Li | Tue, 10 Ma | cs

ASMIL: Attention-Stabilized Multiple Instance Learning for Whole Slide Imaging

The paper introduces ASMIL, a unified framework that addresses unstable attention dynamics, overfitting, and over-concentrated attention in attention-based multiple instance learning for whole slide imaging by employing an anchor model with a normalized sigmoid function and token random dropping, resulting in significant performance improvements over state-of-the-art methods.

Linfeng Ye, Shayan Mohajer Hamidi, Zhixiang Chi, Guang Li, Mert Pilanci, Takahiro Ogawa, Miki Haseyama, Konstantinos N. Plataniotis | Tue, 10 Ma | cs

Science Literacy: Generative AI as Enabler of Coherence in the Teaching, Learning, and Assessment of Scientific Knowledge and Reasoning

This chapter explores the potential of generative AI to enhance K-16+ science literacy by proposing a coherent architectural framework that aligns the teaching, learning, and assessment of scientific knowledge and reasoning, while addressing associated challenges and outlining future research needs.

Xiaoming Zhai, James W. Pellegrino, Matias Rojas, Jongchan Park, Matthew Nyaaba, Clayton Cohn, Gautam Biswas | Tue, 10 Ma | cs