Guided Flow Policy: Learning from High-Value Actions in Offline Reinforcement Learning

The paper introduces Guided Flow Policy (GFP), an offline reinforcement learning method that couples a multi-step flow-matching policy with a distilled one-step actor so that learning focuses on high-value actions. By avoiding the limitations of indiscriminate behavior regularization, it achieves state-of-the-art performance across diverse benchmarks.

Franki Nguimatsia Tiofack, Théotime Le Hellard, Fabian Schramm + 2 more | 2026-03-06 | 💻 cs

ClinNoteAgents: An LLM Multi-Agent System for Predicting and Interpreting Heart Failure 30-Day Readmission from Clinical Notes

ClinNoteAgents is a novel LLM-based multi-agent system that effectively predicts and interprets 30-day heart failure readmission risks by transforming unstructured clinical notes into structured risk factors and clinician-style abstractions, offering a scalable and interpretable solution for data-limited healthcare settings.

Rongjia Zhou, Chengzhuo Li, Carl Yang + 1 more | 2026-03-06 | 💻 cs

Achieving Olympia-Level Geometry Large Language Model Agent via Complexity Boosting Reinforcement Learning

The paper introduces InternGeometry, an LLM agent enhanced by Complexity-Boosting Reinforcement Learning and a dynamic memory mechanism. The agent iteratively proposes and verifies auxiliary constructions, achieving medalist-level performance on IMO geometry problems with significantly less training data than previous expert models.

Haiteng Zhao, Junhao Shen, Yiming Zhang + 7 more | 2026-03-06 | 💻 cs

HydroGEM: A Self Supervised Zero Shot Hybrid TCN Transformer Foundation Model for Continental Scale Streamflow Quality Control

HydroGEM is a self-supervised, zero-shot hybrid TCN-Transformer foundation model that effectively performs continental-scale streamflow quality control by detecting and reconstructing sensor anomalies with high accuracy and cross-national generalization, thereby addressing the scalability limitations of manual hydrological data validation.

Ijaz Ul Haq, Byung Suk Lee, Julia N. Perdrial + 1 more | 2026-03-06 | 💻 cs

FluenceFormer: Transformer-Driven Multi-Beam Fluence Map Regression for Radiotherapy Planning

This paper introduces FluenceFormer, a transformer-driven, two-stage framework that leverages a physics-informed Fluence-Aware Regression loss to achieve superior, geometry-aware fluence map prediction for radiotherapy planning, significantly outperforming existing CNN and single-stage methods in energy conservation and structural fidelity.

Ujunwa Mgboh, Rafi Ibn Sultan, Joshua Kim + 2 more | 2026-03-06 | 💻 cs

When Do Tools and Planning Help Large Language Models Think? A Cost- and Latency-Aware Benchmark

This paper presents a cost- and latency-aware benchmark showing that tool-augmented planning significantly improves accuracy on complex knowledge-intensive tasks such as Event-QA but often incurs prohibitive latency costs. For tasks such as persuasive response generation, it offers no benefit, or even degrades performance, and simple one-shot prompting is more efficient.

Subha Ghoshal, Ali Al-Bustami | 2026-03-06 | 💻 cs