Angular Gradient Sign Method: Uncovering Vulnerabilities in Hyperbolic Networks

This paper introduces the Angular Gradient Sign Method, a novel adversarial attack for hyperbolic networks that leverages the geometric decomposition of gradients to apply perturbations solely along angular (semantic) directions, thereby achieving higher fooling rates and revealing unique vulnerabilities in hierarchical embeddings compared to conventional Euclidean-based methods.

Minsoo Jo, Dongyoon Yang, Taesup Kim · 2026-03-10 · cs.LG
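
The summary above describes splitting a gradient into radial and angular parts and perturbing only along the angular part. A minimal sketch of that idea follows. It assumes a plain Euclidean projection onto the direction of the embedding; the paper's exact hyperbolic formulation is not given in the summary, and the function names `split_radial_angular` and `angular_sign_step` are hypothetical.

```python
import numpy as np

def split_radial_angular(x, grad, tiny=1e-12):
    """Split grad into a part along x (radial) and a part orthogonal to x (angular).

    Assumption: a Euclidean projection is used; this is an illustration,
    not the paper's definition.
    """
    norm = np.linalg.norm(x)
    if norm < tiny:
        # At the origin there is no radial direction; treat everything as angular.
        return np.zeros_like(grad), grad.copy()
    unit = x / norm
    radial = np.dot(grad, unit) * unit
    angular = grad - radial
    return radial, angular

def angular_sign_step(x, grad, eps):
    """Take a sign-gradient step of size eps using only the angular component."""
    _, angular = split_radial_angular(x, grad)
    return x + eps * np.sign(angular)

x = np.array([0.3, 0.4])      # hypothetical embedding inside the unit ball
grad = np.array([1.0, 0.0])   # hypothetical loss gradient at x
radial, angular = split_radial_angular(x, grad)
x_adv = angular_sign_step(x, grad, eps=0.01)
```

Note that taking the elementwise sign of the angular component does not, in general, keep the step exactly orthogonal to `x`; whether and how the paper handles this is not stated in the summary.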

Video2Layout: Recall and Reconstruct Metric-Grounded Cognitive Map for Spatial Reasoning

The paper proposes Video2Layout, a two-stage framework that reconstructs metric-grounded spatial layouts using continuous object boundary coordinates instead of discretized grids, thereby enhancing fine-grained spatial reasoning in Multimodal Large Language Models and achieving superior performance on spatial benchmarks.

Yibin Huang, Wang Xu, Wanyue Zhang, Helu Zhi, Jingjing Huang, Yangbin Xu, Yangang Sun, Conghui Zhu, Tiejun Zhao · 2026-03-10 · cs

UnfoldLDM: Deep Unfolding-based Blind Image Restoration with Latent Diffusion Priors

The paper proposes UnfoldLDM, a deep unfolding framework that integrates a multi-granularity degradation-aware module for robust degradation estimation and a degradation-resistant latent diffusion model with an over-smoothing correction transformer to effectively address blind image restoration by overcoming degradation-specific dependencies and suppressing over-smoothing bias.

Chunming He, Rihan Zhang, Zheng Chen, Bowen Yang, Chengyu Fang, Yunlong Lin, Yulun Zhang, Fengyang Xiao, Sina Farsiu · 2026-03-10 · cs

Yo'City: Personalized and Boundless 3D Realistic City Scene Generation via Self-Critic Expansion

This paper introduces Yo'City, an agentic framework that leverages large models for hierarchical planning and a self-critic expansion loop to generate personalized, boundless, and spatially coherent 3D realistic city scenes, outperforming existing state-of-the-art methods across multiple evaluation metrics.

Keyang Lu, Sifan Zhou, Hongbin Xu, Gang Xu, Zhifei Yang, Yikai Wang, Zhen Xiao, Jieyi Long, Ming Li · 2026-03-10 · cs

ForamDeepSlice: A High-Accuracy Deep Learning Framework for Foraminifera Species Classification from 2D Micro-CT Slices

This study introduces ForamDeepSlice, a high-accuracy deep learning framework that combines an ensemble of ConvNeXt-Large and EfficientNetV2-Small models with a rigorous specimen-level split dataset to achieve 95.64% accuracy in classifying foraminifera species from 2D micro-CT slices, while also providing an interactive dashboard for real-time identification and 3D matching.

Abdelghafour Halimi, Ali Alibrahim, Didier Barradas-Bautista, Ronell Sicat, Abdulkader M. Afifi · 2026-03-10 · cs.LG

MAViD: A Multimodal Framework for Audio-Visual Dialogue Understanding and Generation

MAViD is a novel multimodal framework that employs a Conductor-Creator architecture, combining autoregressive audio and diffusion-based video generation with a specialized fusion module, to overcome existing limitations and achieve seamless, long-duration, and contextually coherent audio-visual dialogue understanding and generation.

Youxin Pang, Jiajun Liu, Lingfeng Tan, Yong Zhang, Feng Gao, Xiang Deng, Zhuoliang Kang, Xiaoming Wei, Yebin Liu · 2026-03-10 · cs

When Token Pruning is Worse than Random: Understanding Visual Token Information in VLLMs

This paper reveals that visual token information in Vision Large Language Models progressively vanishes at a depth-dependent "information horizon," beyond which existing pruning methods underperform random selection, leading to a novel strategy that integrates random pruning to achieve state-of-the-art efficiency without sacrificing accuracy.

Yahong Wang, Juncheng Wu, Zhangkai Ni, Longzhen Yang, Yihang Liu, Chengmei Yang, Ying Wen, Lianghua He, Xianfeng Tang, Hui Liu, Yuyin Zhou · 2026-03-10 · cs
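
The random-selection baseline mentioned in the summary above can be illustrated with a short sketch. It assumes visual tokens are stored as rows of an array and that a fixed fraction is kept; the function name `random_prune` and the `keep_ratio` parameter are hypothetical, and the paper's actual strategy (including where the "information horizon" lies) is not specified in the summary.

```python
import numpy as np

def random_prune(tokens, keep_ratio, rng):
    """Keep a uniformly random subset of token rows, preserving their order.

    Assumption: tokens has shape (num_tokens, dim) and 0 < keep_ratio <= 1.
    """
    n = tokens.shape[0]
    n_keep = max(1, int(round(n * keep_ratio)))
    # Sample indices without replacement, then sort to keep positional order.
    keep = np.sort(rng.choice(n, size=n_keep, replace=False))
    return tokens[keep], keep

rng = np.random.default_rng(0)
tokens = np.arange(20, dtype=float).reshape(10, 2)  # 10 hypothetical visual tokens
kept, idx = random_prune(tokens, keep_ratio=0.5, rng=rng)
```

Sorting the sampled indices keeps the surviving tokens in their original sequence order, which matters when positional information is attached to each token.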

Two-Step Data Augmentation for Masked Face Detection and Recognition: Turning Fake Masks to Real

This paper presents a two-step generative data augmentation framework combining rule-based mask warping and unpaired image-to-image translation to address the scarcity of masked face datasets, achieving performance improvements with minimal training data while explicitly noting its origins as a resource-constrained coursework project that lacked downstream quantitative evaluation.

Yan Yang, George Bebis, Mircea Nicolescu · 2026-03-10 · cs.LG

ReMeDI: Refined Memory for Disambiguation of Identities with SAM3 in Surgical Segmentation

The paper introduces ReMeDI-SAM3, a training-free extension of SAM3 that enhances surgical instrument segmentation in endoscopy by implementing relevance-aware memory filtering, piecewise interpolation, and feature-based re-identification to overcome challenges like occlusions and rapid motion, achieving significant zero-shot performance improvements over existing methods.

Valay Bundele, Mehran Hosseinzadeh, Hendrik P. A. Lensch · 2026-03-10 · cs

It is not always greener on the other side: Greenery perception across demographics and personalities in multiple cities

This study analyzes the discrepancies between objective and subjective urban greenery perceptions across five countries using street view imagery and a survey of 1,000 participants, revealing that while demographics and personality have little influence, an individual's geographic location is a primary factor shaping how they perceive green spaces.

Matias Quintana, Fangqi Liu, Jussi Torkko, Youlong Gu, Xiucheng Liang, Yujun Hou, Koichi Ito, Yihan Zhu, Mahmoud Abdelrahman, Tuuli Toivonen, Yi Lu, Filip Biljecki · 2026-03-10 · cs