FinSheet-Bench: From Simple Lookups to Complex Reasoning, Where LLMs Break on Financial Spreadsheets

This paper introduces FinSheet-Bench, a synthetic benchmark modeled on real private equity fund structures that evaluates LLMs on financial spreadsheet tasks. It finds that even the best-performing models lack the accuracy required for unsupervised professional use, particularly on complex, large-scale documents, and suggests that reliable extraction will require separating document understanding from deterministic computation.

Jan Ravnik, Matjaž Ličen, Felix Bührmann, Bithiah Yuan, Felix Stinson, Tanvi Singh 2026-03-10 💻 cs

Self-Supervised Evolutionary Learning of Neurodynamic Progression and Identity Manifolds from EEG During Safety-Critical Decision Making

This paper proposes a self-supervised evolutionary learning framework that extracts individualized neurodynamic progressions and identity manifolds from unlabeled EEG data during safety-critical decision-making, enabling robust user authentication, anomaly detection, and improved generalization without relying on external labels or predefined cognitive models.

Xiaoshan Zhou, Carol C. Menassa, Vineet R. Kamat 2026-03-10 💻 cs

VisualScratchpad: Inference-time Visual Concepts Analysis in Vision Language Models

This paper introduces VisualScratchpad, an interactive inference-time analysis tool that leverages sparse autoencoders and attention mechanisms to visualize and debug vision language models by linking visual concepts to text tokens, thereby revealing previously underexplored failure modes such as limited cross-modal alignment and misleading visual concepts.

Hyesu Lim, Jinho Choi, Taekyung Kim, Byeongho Heo, Jaegul Choo, Dongyoon Han 2026-03-10 💻 cs

Agora: Teaching the Skill of Consensus-Finding with AI Personas Grounded in Human Voice

The paper introduces Agora, an AI-powered platform that leverages LLMs to simulate diverse human perspectives on policy issues, enabling users to practice consensus-building and demonstrating through a preliminary study that access to authentic voice explanations significantly enhances problem-solving skills and the quality of collective decisions compared to viewing aggregate data alone.

Suyash Fulay, Prerna Ravi, Emily Kubin, Shrestha Mohanty, Michiel Bakker, Deb Roy 2026-03-10 💻 cs

Uber's Failover Architecture: Reconciling Reliability and Efficiency in Hyperscale Microservice Infrastructure

Uber's Failover Architecture (UFA) replaces the company's costly uniform 2x capacity model with a differentiated, criticality-based approach that opportunistically shares resources and preempts non-critical services during peak failovers, thereby reducing steady-state provisioning from 2x to 1.3x and eliminating over one million CPU cores while maintaining 99.97% availability.

Mayank Bansal, Milind Chabbi, Kenneth Bogh, Srikanth Prodduturi, Kevin Xu, Amit Kumar, David Bell, Ranjib Dey, Yufei Ren, Sachin Sharma, Juan Marcano, Shriniket Kale, Subhav Pradhan, Ivan Beschastnikh, Miguel Covarrubias, Chien-Chih Liao, Sandeep Koushik Sheshadri, Wen Luo, Kai Song, Ashish Samant, Sahil Rihan, Nimish Sheth, Uday Kiran Medisetty 2026-03-10 💻 cs

Pre-Clinical Latency Characterization of VRxBioRelax: A Real-Time EMG Biofeedback System for Muscle Relaxation in Virtual Reality

This paper introduces VRxBioRelax, a real-time virtual reality biofeedback system that uses sEMG data to drive an immersive relaxation environment, demonstrating through extensive pre-clinical testing that its average end-to-end latency of 25.34 ms meets both VR comfort and clinical benchmarks for effective muscle relaxation training.

Melanie Baumgartner, Raphael Weibel, Tobias Hoesli, Aydin Javadov, Rayna Ney, Helen Schwerdt, Florian von Wangenheim, Joseph Ollier 2026-03-10 💻 cs

The Yerkes-Dodson Curve for AI Agents: Emergent Cooperation Under Environmental Pressure in Multi-Agent LLM Simulations

This paper demonstrates that emergent cooperation in multi-agent LLM simulations follows a Yerkes-Dodson inverted-U relationship with environmental pressure, where medium stress optimizes cooperative trade while extreme pressure causes behavioral collapse, and suggests that calibrating such pressure serves as an effective curriculum design strategy for agent development.

Ivan Pasichnyk 2026-03-10 💻 cs

Enhancing OLAP Resilience at LinkedIn

This paper presents a holistic resiliency framework for Apache Pinot at LinkedIn, featuring Query Workload Isolation, Impact-Free Rebalancing, Maintenance Zone Awareness, and Adaptive Server Selection, which collectively ensure stable subsecond query latency and high availability for petabyte-scale OLAP workloads under failures and load spikes.

Praveen Chaganlal, Jia Guo, Vivek Vaidyanathan, Dino Occhialini, Sonam Mandal, Subbu Subramaniam, Siddharth Teotia, Tianqi Li, Xiaxuan Gao, Florence Zhang 2026-03-10 💻 cs