CGL: Advancing Continual GUI Learning via Reinforcement Fine-Tuning

This paper introduces CGL, a continual GUI learning framework that mitigates catastrophic forgetting by dynamically balancing Supervised Fine-Tuning and Reinforcement Learning through an entropy-guided proportion adjustment mechanism and a specialized gradient surgery strategy, validated on a new AndroidControl-CL benchmark.

Zhenquan Yao, Zitong Huang, Yihan Zeng, Jianhua Han, Hang Xu, Chun-Mei Feng, Jianwei Ma, Wangmeng Zuo · 2026-03-10 · 🤖 cs.LG
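The summary does not specify which gradient surgery variant CGL uses; a common form (PCGrad-style projection, used here purely as an illustrative assumption) resolves conflicts between the SFT and RL gradients by projecting out the component of one that opposes the other:

```python
import numpy as np

def gradient_surgery(g_sft: np.ndarray, g_rl: np.ndarray) -> np.ndarray:
    """Combine two task gradients, projecting away the conflicting
    component of g_sft (PCGrad-style) when the gradients oppose each
    other. Illustrative sketch only, not the paper's implementation."""
    if np.dot(g_sft, g_rl) < 0:
        # Remove the part of g_sft that points against g_rl.
        g_sft = g_sft - (np.dot(g_sft, g_rl) / np.dot(g_rl, g_rl)) * g_rl
    return g_sft + g_rl
```

After projection, the combined update no longer has a negative component along the RL gradient, which is the usual motivation for this kind of surgery in multi-objective fine-tuning.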

Roots Beneath the Cut: Uncovering the Risk of Concept Revival in Pruning-Based Unlearning for Diffusion Models

This paper reveals that pruning-based unlearning in diffusion models is inherently insecure because the locations of pruned weights act as side-channel signals that enable a novel, data-free, and training-free attack to fully revive erased concepts, prompting a call for safer pruning mechanisms that conceal these locations.

Ci Zhang, Zhaojun Ding, Chence Yang, Jun Liu, Xiaoming Zhai, Shaoyi Huang, Beiwen Li, Xiaolong Ma, Jin Lu, Geng Yuan · 2026-03-10 · 🤖 cs.LG

Margin-Consistent Deep Subtyping of Invasive Lung Adenocarcinoma via Perturbation Fidelity in Whole-Slide Image Analysis

This paper proposes a margin-consistent deep subtyping framework for invasive lung adenocarcinoma that integrates attention-weighted aggregation, contrastive regularization, and a novel Perturbation Fidelity scoring mechanism to achieve robust, high-accuracy classification across multiple architectures and demonstrate cross-institutional generalizability on whole-slide images.

Meghdad Sabouri Rad, Junze (Vincent) Huang, Mohammad Mehdi Hosseini, Rakesh Choudhary, Saverio J. Carello, Ola El-Zammar, Michel R. Nasr, Bardia Rodd · 2026-03-10 · 💻 cs

PaLMR: Towards Faithful Visual Reasoning via Multimodal Process Alignment

PaLMR is a novel framework that enhances the faithfulness of multimodal large language models by aligning both the reasoning process and outcomes through a perception-aligned data layer and a hierarchical reward fusion scheme, thereby significantly reducing visual hallucinations while achieving state-of-the-art performance on key benchmarks.

Yantao Li, Qiang Hui, Chenyang Yan, Kanzhi Cheng, Fang Zhao, Chao Tan, Huanling Gao, Jianbing Zhang, Kai Wang, Xinyu Dai, Shiguo Lian · 2026-03-10 · 💻 cs

GameVerse: Can Vision-Language Models Learn from Video-based Reflection?

The paper introduces GameVerse, a comprehensive benchmark featuring a novel reflect-and-retry paradigm and a hierarchical taxonomy across 15 games, demonstrating that Vision-Language Models can effectively improve their gameplay policies through video-based reflection by combining failure trajectories with expert tutorials.

Kuan Zhang, Dongchen Liu, Qiyue Zhao, Jinkun Hou, Xinran Zhang, Qinlei Xie, Miao Liu, Yiming Li · 2026-03-10 · 💻 cs

ASMIL: Attention-Stabilized Multiple Instance Learning for Whole Slide Imaging

The paper introduces ASMIL, a unified framework that addresses unstable attention dynamics, overfitting, and over-concentrated attention in attention-based multiple instance learning for whole slide imaging by employing an anchor model with a normalized sigmoid function and token random dropping, resulting in significant performance improvements over state-of-the-art methods.

Linfeng Ye, Shayan Mohajer Hamidi, Zhixiang Chi, Guang Li, Mert Pilanci, Takahiro Ogawa, Miki Haseyama, Konstantinos N. Plataniotis · 2026-03-10 · 💻 cs
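The two ingredients named in the ASMIL summary, a normalized sigmoid attention function and random token (instance) dropping, can be sketched as a minimal MIL pooling step. All function and parameter names below are illustrative assumptions, not the authors' code:

```python
import numpy as np

def asmil_pool(instances: np.ndarray, scores: np.ndarray,
               drop_rate: float = 0.2, rng=None) -> np.ndarray:
    """MIL pooling sketch: sigmoid attention normalized over the bag,
    with random instance dropping during training to discourage
    over-concentrated attention. Illustrative only."""
    rng = rng or np.random.default_rng(0)
    keep = rng.random(len(instances)) >= drop_rate  # random token dropping
    inst, s = instances[keep], scores[keep]
    a = 1.0 / (1.0 + np.exp(-s))  # sigmoid attention per instance
    a = a / a.sum()               # normalize weights over the bag
    return a @ inst               # attention-weighted bag embedding
```

The sigmoid keeps each instance's weight bounded before normalization, while the dropping forces the model to spread attention across more patches of the slide.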

SJD-PV: Speculative Jacobi Decoding with Phrase Verification for Autoregressive Image Generation

This paper introduces SJD-PV, a training-free acceleration framework for autoregressive image generation that leverages phrase-level speculative verification based on token co-occurrence statistics to jointly validate multiple correlated tokens, achieving up to 30% faster decoding without compromising visual fidelity.

Zhehao Yu, Baoquan Zhang, Bingqi Shan, Xinhao Liu, Dongliang Zhou, Guotao Liang, Guangming Ye, Yunming Ye · 2026-03-10 · 💻 cs
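Phrase-level speculative verification can be sketched as accepting the longest prefix of a drafted phrase whose joint plausibility, here the target-model token probability weighted by a pairwise co-occurrence score, stays above a threshold. The scoring rule and names are hypothetical; the paper's exact statistic is not given in the summary:

```python
def verify_phrase(draft_tokens, target_probs, cooc, threshold=0.05):
    """Accept the longest prefix of a drafted token phrase whose
    running score (target probability times pairwise co-occurrence)
    stays above `threshold`. Illustrative sketch only."""
    accepted, prev = [], None
    for t in draft_tokens:
        p = target_probs[t]
        if prev is not None:
            # Weight by how often this token follows the previous one.
            p *= cooc.get((prev, t), 0.0)
        if p < threshold:
            break  # reject this token and everything after it
        accepted.append(t)
        prev = t
    return accepted
```

Jointly validating correlated tokens this way is what lets the decoder commit several tokens per step instead of one, which is where the reported speedup would come from.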