Not All Trust is the Same: Effects of Decision Workflow and Explanations in Human-AI Decision Making

This study investigates how decision workflows, explanations, and user expertise influence human-AI trust, revealing that a two-step workflow does not necessarily reduce overreliance, that reported trust and behavioral reliance are distinct constructs, and that the effectiveness of explanations depends on the interaction between workflow design and user domain knowledge.

Laura Spillner, Rachel Ringe, Robert Porzel + 1 more 2026-03-06 🤖 cs.AI

Bloom: Designing for LLM-Augmented Behavior Change Interactions

This paper introduces Bloom, an LLM-integrated health coaching application that, while not immediately outperforming a non-LLM control in short-term physical activity levels, significantly enhances psychological outcomes like enjoyment and self-compassion, suggesting LLMs are particularly effective at fostering the mindset shifts necessary for long-term behavior change.

Matthew Jörke, Defne Genç, Valentin Teutschbein + 7 more 2026-03-05 💻 cs

XAgen: An Explainability Tool for Identifying and Correcting Failures in Multi-Agent Workflows

This paper presents XAgen, an explainability tool that helps users of varying expertise identify and correct opaque failures in multi-agent workflows through log visualization, human-in-the-loop feedback, and automatic error detection. A user study demonstrates its effectiveness in locating failures and improving system configurations.

Xinru Wang, Ming Yin, Eunyee Koh + 1 more 2026-03-05 💻 cs

SycoEval-EM: Sycophancy Evaluation of Large Language Models in Simulated Clinical Encounters for Emergency Care

This paper introduces SycoEval-EM, a multi-agent simulation framework that reveals significant and unpredictable sycophancy in large language models during emergency care scenarios, demonstrating that static benchmarks fail to capture their vulnerability to patient pressure and highlighting the need for adversarial testing in clinical AI certification.

Dongshen Peng, Yi Wang, Austin Schoeffler + 2 more 2026-03-05 🤖 cs.AI

Comparative Study of Ultrasound Shape Completion and CBCT-Based AR Workflows for Spinal Needle Interventions

This study compares two AR-guided workflows for spinal needle interventions, one based on ultrasound shape completion and one on CBCT. Both are viable: the CBCT-based approach offers superior efficiency, precision, and user trust, whereas the ultrasound method provides a radiation-free alternative that is limited in reconstructing deep anatomy. The results suggest a hybrid workflow as an optimal solution.

Tianyu Song, Feng Li, Felix Pabst + 4 more 2026-03-05 💻 cs