The cs.DC category (Distributed, Parallel, and Cluster Computing) sits at the intersection of computer science theory and distributed computing. Here, researchers study how networks of machines coordinate to solve complex problems, with applications ranging from artificial intelligence to secure data systems. This work underpins modern digital infrastructure, translating abstract mathematical proofs into the robust protocols that power our connected world.

Gist.Science processes every new preprint in this category directly from arXiv, ensuring you never miss a breakthrough. For each submission, we provide both a clear, plain-language explanation of the core ideas and a detailed technical summary for experts. This dual approach makes the latest findings in computer science accessible to everyone, from curious students to seasoned researchers.

Below are the latest papers in this field, curated to help you stay ahead of the curve.

Prism: Cost-Efficient Multi-LLM Serving via GPU Memory Ballooning

Prism is a memory-centric LLM co-serving framework that utilizes a novel memory ballooning technique called kvcached to dynamically reclaim and reallocate GPU memory across multiple models, thereby unifying spatial and temporal sharing to improve cost-efficiency and SLO adherence in production environments.

Shan Yu, Yifan Qiao, Mingyuan Ma, Yangmin Li, Shuo Yang, Xinyuan Tong, Yang Wang, Zhiqiang Xie, Yuwei An, Shiyi Cao, Ke Bao, Deepak Vij, Xiaoning Ding, Yichen Wang, Qingda Lu, Zhong Wang, Gao Gao, Har (…) · 2026-06-12 · 🤖 cs.AI

A Communication Complexity Lower Bound for Nonuniformly Convex Consensus Optimization

This paper establishes a new communication complexity lower bound of $\Omega\!\left(\chi_{\mathcal G} \sqrt{\kappa_g}\,\log\frac{n}{\chi_{\mathcal G}}\log\frac{1}{\varepsilon}\right)$ for nonuniformly convex consensus optimization over time-varying networks. Using a construction that embeds time-rotating star gadgets into expander graphs, it demonstrates that the round complexity achievable under uniform regularity cannot be matched in the nonuniform regime.

Demyan Yarmoshik, Maxim Klimenko · 2026-06-12 · 🔢 math

M*: A Modular, Extensible, Serving System for Multimodal Models

This paper introduces M*, a modular and extensible serving system that represents composite multimodal models as dataflow graphs to enable flexible component composition and distributed optimization, achieving significant latency and throughput improvements over existing frameworks across diverse tasks like text-to-image, text-to-speech, and robotic planning.

Atindra Jha, Naomi Sagan, Keisuke Kamahori, Irmak Sivgin, Rohan Sanda, Steven Gao, Mark Horowitz, Luke Zettlemoyer, Olivia Hsu, Jure Leskovec, Baris Kasikci, Stephanie Wang · 2026-06-12 · 🤖 cs.AI

Impossibility Results for Strong Linearizability: The Difficulty of Consistent Refereeing

This paper establishes that implementing various concurrent objects with strong linearizability in lock-free or wait-free settings inherently requires a form of "consistent refereeing" that, while weaker than consensus, demands high coordination power and leads to new impossibility results for objects like window registers, interfering primitives, and stacks.

Hagit Attiya (Technion - Israel Institute of Technology), Armando Castañeda (Universidad Nacional Autónoma de México), Constantin Enea (LIX, Ecole Polytechnique, CNRS and Institut Polytechnique de Par (…) · 2026-06-11 · 💻 cs

MPK: A Compiler and Runtime for Mega-Kernelizing Tensor Programs

Mirage Persistent Kernel (MPK) is a novel compiler and runtime system that automatically transforms multi-GPU tensor programs into a single high-performance mega-kernel using SM-level graph representations to enable cross-operator pipelining and fine-grained overlap of computation and communication, thereby significantly reducing LLM inference latency compared to traditional kernel-per-operator approaches.

Xinhao Cheng, Zhihao Zhang, Yu Zhou, Jianan Ji, Jinchen Jiang, Zepeng Zhao, Ziruo Xiao, Zihao Ye, Yingyi Huang, Ruihang Lai, Hongyi Jin, Bohan Hou, Mengdi Wu, Yixin Dong, Anthony Yip, Zihao Ye, Songti (…) · 2026-06-11 · 🤖 cs.LG

DASH: Deterministic Attention Scheduling for High-throughput Reproducible LLM Training

DASH (Deterministic Attention Scheduling for High-Throughput) addresses the significant performance overhead of deterministic attention in LLM training by formulating the backward pass as a DAG scheduling problem and introducing novel strategies like Descending Q-Tile Iteration and Shift Scheduling, which reduce pipeline stalls and improve throughput by up to 1.28× on NVIDIA H800 GPUs.

Xinwei Qiang, Hongmin Chen, Shixuan Sun, Jingwen Leng, Xin Liu, Minyi Guo · 2026-06-11 · 🤖 cs.LG

LCLs Beyond Bounded Degrees

This paper demonstrates that while polynomial complexity gaps for Locally Checkable Labelings (LCLs) vanish on trees with unbounded degrees due to the ability to distinguish infinitely many local cases, these gaps can be restored by restricting problems to Locally Finite Labelings (LFLs), which ensure each node falls into one of finitely many local cases, thereby limiting deterministic complexities to either $O(\log n)$ or $\Theta(n^{1/k})$.

Gustav Schmid · 2026-06-11 · 💻 cs