Scaling Real-Time Traffic Analytics on Edge-Cloud Fabrics for City-Scale Camera Networks

This paper presents a scalable, AI-driven Intelligent Transportation System that orchestrates edge-cloud resources to process thousands of city-scale CCTV streams in real time, utilizing DNNs for detection, Spatio-Temporal GNNs for forecasting, and continuous federated learning to maintain accuracy under strict latency and bandwidth constraints.

Akash Sharma, Pranjal Naman, Roopkatha Banerjee + 11 more · 2026-03-06 · 💻 cs

Unlocking Python's Cores: Hardware Usage and Energy Implications of Removing the GIL

This study evaluates Python 3.14.2's experimental free-threaded build, revealing that while it significantly improves execution time and energy efficiency for parallelizable workloads, it incurs higher memory usage and increased energy consumption for sequential or highly contended tasks, indicating that its adoption depends on specific workload characteristics rather than offering a universal performance boost.

José Daniel Montoya Salazar · 2026-03-06 · 💻 cs

Overcoming Latency-bound Limitations of Distributed Graph Algorithms using the HPX Runtime System

This paper presents a distributed library prototype implementing Breadth-First Search, PageRank, and Triangle Counting using the HPX runtime system, demonstrating that its asynchronous execution and unified programming model allow it to significantly outperform conventional frameworks like GraphX and PBGL by overcoming latency-bound limitations through fine-grained parallelism and overlapping communication with computation.

Karame Mohammadiporshokooh, Panagiotis Syskakis, Andrew Lumsdaine + 1 more · 2026-03-06 · 💻 cs

FedEMA-Distill: Exponential Moving Average Guided Knowledge Distillation for Robust Federated Learning

FedEMA-Distill is a robust and communication-efficient federated learning framework that leverages server-side exponential moving average smoothing and ensemble knowledge distillation from compressed client logits to achieve superior accuracy, faster convergence, and Byzantine resilience under non-IID data conditions without requiring client-side software modifications.

Hamza Reguieg, Mohamed El Kamili, Essaid Sabir · 2026-03-06 · 💻 cs

Classification of Local Optimization Problems in Directed Cycles

This paper presents a complete classification of the distributed computational complexity for local optimization problems in directed cycles within both deterministic and randomized LOCAL models, identifying four distinct complexity classes and providing an efficient meta-algorithm to automatically determine the complexity and synthesize optimal distributed algorithms for any given problem.

Thomas Boudier, Fabian Kuhn, Augusto Modanese + 2 more · 2026-03-06 · 💻 cs

Modality Inflation: Energy Characterization and Optimization Opportunities for MLLM Inference

This paper investigates "modality inflation" in multimodal large language models by providing the first stage-level energy analysis across vision encoding, prefill, and decoding, revealing significant energy overheads and GPU underutilization while demonstrating that stage-wise dynamic voltage and frequency scaling (DVFS) offers an effective optimization strategy for energy-efficient inference.

Mona Moghadampanah, Adib Rezaei Shahmirzadi, Farhana Amin + 1 more · 2026-03-06 · 💻 cs

Combining Serverless and High-Performance Computing Paradigms to support ML Data-Intensive Applications

This paper introduces Cylon, a high-performance distributed data frame solution that leverages a serverless communicator using NAT Traversal TCP Hole Punching to enable direct communication between AWS Lambda functions, achieving scaling efficiency within 6.5% of traditional EC2 clusters for data-intensive machine learning applications.

Mills Staylor, Arup Kumar Sarker, Gregor von Laszewski + 3 more · 2026-03-06 · 💻 cs

Universal Pattern Formation by Oblivious Robots Under Sequential Schedulers

This paper demonstrates that oblivious robots operating under sequential schedulers possess computational power that is orthogonal to, and often greater than, that of robots under the fully synchronous scheduler (FSYNC), proving that Universal Pattern Formation is solvable in the former without additional assumptions while remaining unsolvable in the latter even with strong capabilities, with the exception that Gathering requires weak multiplicity detection in the sequential setting.

Paola Flocchini, Alfredo Navarra, Debasish Pattanayak + 2 more · 2026-03-06 · 💻 cs