M3-ACE: Rectifying Visual Perception in Multimodal Math Reasoning via Multi-Agentic Context Engineering

The paper proposes M3-ACE, a multi-agentic context engineering framework that rectifies inaccurate visual perception in multimodal math reasoning. It decouples perception from reasoning and uses collaborative agents with specialized tools to refine visual evidence dynamically, achieving state-of-the-art performance on benchmarks such as MathVision.

Peijin Xie, Zhen Xu, Bingquan Liu, Baoxun Wang | 2026-03-10 | 💻 cs

A Hierarchical Error-Corrective Graph Framework for Autonomous Agents with LLM-Based Action Generation

This paper proposes the Hierarchical Error-Corrective Graph Framework (HECG) for autonomous agents, which integrates Multi-Dimensional Transferable Strategy (MDTS) for precise candidate selection, Error Matrix Classification (EMC) for structured failure attribution, and Causal-Context Graph Retrieval (CCGR) for enhanced contextual reasoning to improve execution reliability in complex, multi-step tasks.

Cong Cao, Jingyao Zhang, Kun Tong | 2026-03-10 | 💻 cs

Revealing Behavioral Plasticity in Large Language Models: A Token-Conditional Perspective

This paper introduces Token-Conditioned Reinforcement Learning (ToCoRL), a framework that leverages the intrinsic behavioral plasticity of Large Language Models to internalize and stabilize inference-time adaptations, enabling precise control over behavioral modes like switching from reasoning to direct answering without degrading overall capabilities.

Liyuan Mao, Le Yu, Jing Zhou, Chujie Zheng, Bowen Yu, Chang Gao, Shixuan Liu, An Yang, Weinan Zhang, JunYang Lin | 2026-03-10 | 🤖 cs.LG

Human-Aware Robot Behaviour in Self-Driving Labs

This paper proposes an AI-driven perception method with hierarchical human intention prediction to enable mobile robot chemists in self-driving laboratories to proactively distinguish between human preparatory actions and transient interactions, thereby overcoming the inefficiencies of passive obstruction detection and streamlining human-robot coordination in shared-access scenarios.

Satheeshkumar Veeramani, Anna Kisil, Abigail Bentley, Hatem Fakhruldeen, Gabriella Pizzuto, Andrew I. Cooper | 2026-03-10 | 💻 cs

SYNAPSE: Framework for Neuron Analysis and Perturbation in Sequence Encoding

The paper introduces SYNAPSE, a systematic, training-free framework that analyzes and stress-tests Transformer models by extracting layer representations and applying forward-hook interventions to reveal domain-independent internal organization, functional stability through redundant neuron subsets, and specific vulnerabilities to small manipulations.

Jesús Sánchez Ochoa, Enrique Tomás Martínez Beltrán, Alberto Huertas Celdrán | 2026-03-10 | 🤖 cs.LG

Efficient Policy Learning with Hybrid Evaluation-Based Genetic Programming for Uncertain Agile Earth Observation Satellite Scheduling

This paper proposes a Hybrid Evaluation-based Genetic Programming (HE-GP) framework that dynamically switches between exact and approximate evaluation modes within an Online Scheduling Algorithm to efficiently solve the Uncertain Agile Earth Observation Satellite Scheduling Problem, achieving significant computational cost reductions while maintaining superior scheduling performance compared to existing methods.

Junhua Xue, Yuning Chen | 2026-03-10 | 💻 cs

A prospective clinical feasibility study of a conversational diagnostic AI in an ambulatory primary care clinic

This prospective feasibility study demonstrates that a conversational AI system (AMIE) can safely and effectively conduct clinical history-taking and generate diagnostic suggestions in a real-world urgent care setting, achieving high patient satisfaction and diagnostic accuracy comparable to primary care providers while requiring no real-time human intervention.

Peter Brodeur, Jacob M. Koshy, Anil Palepu, Khaled Saab, Ava Homiar, Roma Ruparel, Charles Wu, Ryutaro Tanno, Joseph Xu, Amy Wang, David Stutz, Hannah M. Ferrera, David Barrett, Lindsey Crowley, Jihyeon Lee, Spencer E. Rittner, Ellery Wulczyn, Selena K. Zhang, Elahe Vedadi, Christine G. Kohn, Kavita Kulkarni, Vinay Kadiyala, Sara Mahdavi, Wendy Du, Jessica Williams, David Feinbloom, Renee Wong, Tao Tu, Petar Sirkovic, Alessio Orlandi, Christopher Semturs, Yun Liu, Juraj Gottweis, Dale R. Webster, Joëlle Barral, Katherine Chou, Pushmeet Kohli, Avinatan Hassidim, Yossi Matias, James Manyika, Rob Fields, Jonathan X. Li, Marc L. Cohen, Vivek Natarajan, Mike Schaekermann, Alan Karthikesalingam, Adam Rodman | 2026-03-10 | 🤖 cs.LG

The Boiling Frog Threshold: Criticality and Blindness in World Model-Based Anomaly Detection Under Gradual Drift

This paper investigates world model-based anomaly detection under gradual observation drift, revealing a universal sharp detection threshold that depends on the interaction between detector sensitivity, noise floor, and environment-specific dynamics, while identifying critical failure modes such as the undetectability of sinusoidal drift and agent collapse prior to detection.

Zhe Hong | 2026-03-10 | 🤖 cs.LG

R2F: Repurposing Ray Frontiers for LLM-free Object Navigation

The paper proposes R2F, an LLM-free framework for zero-shot open-vocabulary object navigation that repurposes ray frontiers as direction-conditioned semantic hypotheses to achieve competitive performance with real-time execution, eliminating the latency and computational overhead of iterative large-model queries.

Francesco Argenziano, John Mark Alexis Marcelo, Michele Brienza, Abdel Hakim Drid, Emanuele Musumeci, Daniele Nardi, Domenico D. Bloisi, Vincenzo Suriani | 2026-03-10 | 💻 cs