eess.IV papers | Gist.Science

Segment-to-Act: Label-Noise-Robust Action-Prompted Video Segmentation Towards Embodied Intelligence

This paper addresses the unexplored challenge of label noise in action-based video object segmentation by introducing the ActiSeg-NL benchmark, analyzing the impact of textual and mask annotation noise, and proposing a Parallel Mask Head Mechanism to enhance robustness for embodied intelligence applications.

Wenxin Li, Kunyu Peng, Di Wen + 4 more | 2026-03-05 | 🤖 cs.LG

Fast Equivariant Imaging: Acceleration for Unsupervised Learning via Augmented Lagrangian and Auxiliary PnP Denoisers

This paper introduces Fast Equivariant Imaging (FEI), a novel unsupervised learning framework that leverages the Augmented Lagrangian method and auxiliary Plug-and-Play denoisers to achieve a 10x training acceleration and improved generalization for deep imaging tasks like X-ray CT reconstruction and inpainting without requiring ground-truth data.

Guixian Xu, Jinglai Li, Junqi Tang | 2026-03-05 | 🤖 cs.LG

Implicit U-KAN2.0: Dynamic, Efficient and Interpretable Medical Image Segmentation

This paper introduces Implicit U-KAN 2.0, a novel medical image segmentation model that combines second-order neural ordinary differential equations (SONO) with MultiKAN layers in a two-phase encoder-decoder architecture to achieve superior performance, enhanced interpretability, and dimension-independent approximation capabilities while reducing computational costs.

Chun-Wun Cheng, Yining Zhao, Yanqi Cheng + 3 more | 2026-03-05 | 🤖 cs.LG

GeoTop: Advancing Image Classification with Geometric-Topological Analysis

GeoTop is a mathematically principled framework that unifies Topological Data Analysis and Lipschitz-Killing Curvatures, combining robust topological signatures with precise geometric features to resolve the diagnostic ambiguity of topologically equivalent structures. This yields superior accuracy and interpretability in image classification tasks such as skin lesion diagnosis.

Mariem Abaach, Ian Morilla | 2026-03-05 | 🤖 cs.LG

Field imaging framework for morphological characterization of aggregates with computer vision: Algorithms and applications

This dissertation presents a comprehensive field imaging framework that leverages advanced computer vision algorithms, including 2D instance segmentation and an integrated 3D reconstruction-segmentation-completion approach, to overcome the limitations of traditional methods and enable accurate morphological characterization of construction aggregates across diverse field scenarios.

Haohang Huang | 2026-03-05 | 🤖 cs.AI

Cryo-SWAN: the Multi-Scale Wavelet-decomposition-inspired Autoencoder Network for molecular density representation of molecular volumes

Cryo-SWAN is a multi-scale wavelet-decomposition-inspired variational autoencoder that effectively learns robust 3D molecular density representations from voxelized data, outperforming state-of-the-art methods in reconstruction quality and enabling advanced applications like denoising and conditional shape generation for structural biology.

Rui Li, Artsemi Yushkevich, Mikhail Kudryashev + 1 more | 2026-03-05 | 🤖 cs.AI
