GeoTop: Advancing Image Classification with Geometric-Topological Analysis

GeoTop is a mathematically principled framework that unifies Topological Data Analysis and Lipschitz-Killing Curvatures. By integrating robust topological signatures with precise geometric features, it resolves the diagnostic ambiguity of topologically equivalent structures and achieves superior accuracy and interpretability in image classification tasks such as skin lesion diagnosis.

Mariem Abaach, Ian Morilla · 2026-03-05 · 🤖 cs.LG

Catch Me If You Can Describe Me: Open-Vocabulary Camouflaged Instance Segmentation with Diffusion

This paper proposes a novel diffusion-based method for Open-Vocabulary Camouflaged Instance Segmentation (OVCIS) that effectively fuses multi-scale textual-visual features to overcome the challenges of blending boundaries and segmenting unseen object classes, demonstrating superior performance on benchmarks with applications in surveillance, wildlife monitoring, and military reconnaissance.

Tuan-Anh Vu, Duc Thanh Nguyen, Qing Guo + 4 more · 2026-03-05 · 🤖 cs.AI

FireANTs: Adaptive Riemannian Optimization for Multi-Scale Diffeomorphic Matching

The paper introduces FireANTs, a training-free, GPU-accelerated multi-scale Adaptive Riemannian Optimization algorithm that achieves significantly faster and more memory-efficient dense diffeomorphic image matching than both traditional methods and deep learning approaches while maintaining robust generalization across diverse modalities and anatomical structures.

Rohit Jena, Pratik Chaudhari, James C. Gee · 2026-03-05 · 💻 cs

Natural Adversaries: Fuzzing Autonomous Vehicles with Realistic Roadside Object Placements

This paper introduces TrashFuzz, a black-box fuzzing algorithm that manipulates the realistic placement of common roadside objects to generate adversarial scenarios causing autonomous vehicles to misperceive traffic signals and violate traffic laws, demonstrating significant vulnerabilities in the Apollo system without relying on unnatural adversarial patches.

Yang Sun, Haoyu Wang, Christopher M. Poskitt + 1 more · 2026-03-05 · 💻 cs

Building a Mind Palace: Structuring Environment-Grounded Semantic Graphs for Effective Long Video Analysis with LLMs

This paper introduces VideoMindPalace, a framework that organizes long-form video content into a topologically structured semantic graph built from hand-object interactions, activity zones, and layout mapping. The framework, presented alongside a new benchmark (VMB), is designed to significantly enhance the spatio-temporal coherence and human-aligned reasoning capabilities of Large Vision Language Models.

Zeyi Huang, Yuyang Ji, Xiaofang Wang + 11 more · 2026-03-05 · 💻 cs