Adaptive Multi-Objective Tiered Storage Configuration for KV Cache in LLM Service

This paper introduces Kareto, an adaptive multi-objective optimizer that efficiently navigates the complex configuration space of tiered KV cache storage to dynamically balance cost, throughput, and latency, significantly outperforming static strategies in LLM inference services.

Xianzhe Zheng, Zhengheng Wang, Ruiyan Ma, Rui Wang, Xiyu Wang, Rui Chen, Peng Zhang, Sicheng Pan, Zhangheng Huang, Chenxin Wu, Yi Zhang, Bo Cai, Kan Liu, Teng Ma, Yin Du, Dong Deng, Sai Wu, Guoyun Zhu, Wei Zhang, Feifei Li · 2026-03-11 · cs

Granulon: Awakening Pixel-Level Visual Encoders with Adaptive Multi-Granularity Semantics for MLLM

Granulon is a novel multimodal large language model that leverages a DINOv3-based visual encoder enhanced with a text-conditioned granularity controller and adaptive token aggregation to dynamically unify pixel-level perception with coarse-grained semantics, significantly improving accuracy and reducing hallucinations compared to existing approaches.

Junyuan Mao, Qiankun Li, Linghao Meng, Zhicheng He, Xinliang Zhou, Kun Wang, Yang Liu, Yueming Jin · 2026-03-11 · cs

VisionCreator-R1: A Reflection-Enhanced Native Visual-Generation Agentic Model

The paper introduces VisionCreator-R1, a native visual-generation agent with explicit reflection mechanisms. It is trained via Reflection-Plan Co-Optimization (RPCO), a methodology that addresses credit assignment challenges, and it outperforms state-of-the-art models on both single-image and multi-image generation benchmarks.

Jinxiang Lai, Wenzhe Zhao, Zexin Lu, Hualei Zhang, Qinyu Yang, Rongwei Quan, Zhimin Li, Shuai Shao, Song Guo, Qinglin Lu · 2026-03-11 · cs

HMR-1: Hierarchical Massage Robot with Vision-Language-Model for Embodied Healthcare

This paper addresses the lack of standardized benchmarks and datasets in embodied healthcare by introducing MedMassage-12K, a large-scale multimodal acupoint massage dataset, and proposing HMR-1, a hierarchical framework that leverages vision-language models for high-level acupoint grounding and low-level trajectory control to enable robust robotic massage therapy.

Rongtao Xu, Mingming Yu, Xiaofeng Han, Yu Zhang, Kaiyi Hu, Zhe Feng, Zenghuang Fu, Changwei Wang, Weiliang Meng, Xiaopeng Zhang · 2026-03-11 · cs

Impact of Different Failures on a Robot's Perceived Reliability

This study demonstrates that in human-robot interaction, different failure types affect perceived reliability differently, with mistakes being less damaging than slips or lapses, and that trust can be effectively recovered through subsequent successful executions without the need for explicit social repair actions.

Andrew Violette, Zhanxin Wu, Haruki Nishimura, Masha Itkina, Leticia Priebe Rocha, Mark Zolotas, Guy Hoffman, Hadas Kress-Gazit · 2026-03-11 · cs

HeteroFedSyn: Differentially Private Tabular Data Synthesis for Heterogeneous Federated Settings

The paper proposes HeteroFedSyn, the first differentially private framework for synthesizing tabular data in horizontal federated settings, which achieves utility comparable to centralized methods by introducing noise-efficient dependency metrics, unbiased noise correction, and adaptive selection strategies to handle heterogeneous data distributions.

Xiaochen Li, Fengyu Gao, Xizixiang Wei, Tianhao Wang, Cong Shen, Jing Yang · 2026-03-11 · cs

NaviNote: Enabling In-situ Spatial Annotation Authoring to Support Exploration and Navigation for Blind and Low Vision People

This paper presents NaviNote, a voice-based system that combines high-precision visual localization with an agentic architecture to enable blind and low vision users to author in-situ spatial annotations and navigate unfamiliar environments with greater accuracy.

Ruijia Chen, Yuheng Wu, Charlie Houseago, Filipe Gaspar, Filippo Aleotti, Dorian Gálvez-López, Oliver Johnston, Diego Mazala, Guillermo Garcia-Hernando, Maryam Bandukda, Gabriel Brostow, Jessica Van Brummelen · 2026-03-11 · cs