From Semantics to Pixels: Coarse-to-Fine Masked Autoencoders for Hierarchical Visual Understanding

The paper proposes C2FMAE, a coarse-to-fine masked autoencoder that addresses the tension between global semantics and local details in self-supervised learning. It combines a cascaded decoder with a progressive masking curriculum, trained on a newly constructed multi-granular dataset, to achieve hierarchical visual understanding and superior performance across various vision tasks.

Wenzhao Xiang, Yue Wu, Hongyang Yu, Feng Gao, Fan Yang, Xilin Chen | 2026-03-11 | 🤖 cs.LG

From Data Statistics to Feature Geometry: How Correlations Shape Superposition

This paper challenges the standard view of superposition in neural networks by demonstrating that, unlike in idealized uncorrelated settings where interference is merely noise, realistic feature correlations allow models to arrange features so that interference becomes constructive, thereby naturally forming the semantic clusters and cyclical structures observed in real language models.

Lucas Prieto, Edward Stevinson, Melih Barsbey, Tolga Birdal, Pedro A. M. Mediano | 2026-03-11 | 🤖 cs.AI

Differentiable Microscopy Designs an All Optical Phase Retrieval Microscope

This paper introduces "differentiable microscopy" (∂μ), a data-driven, top-down design framework that automatically optimizes optical systems for phase retrieval, demonstrating superior performance over existing methods and experimentally validating its effectiveness on biological samples.

Kithmini Herath, Hasindu Kariyawasam, Ramith Hettiarachchi, Udith Haputhanthri, Dineth Jayakody, Raja N. Ahmad, Azeem Ahmad, Balpreet S. Ahluwalia, Chamira U. S. Edussooriya, Dushan N. Wadduwage | 2026-03-10 | 🔬 physics.optics

Goldilocks Test Sets for Face Verification

This paper proposes three high-quality, controlled test sets (Hadrian, Eclipse, and ND-Twins) designed to challenge face verification models on natural variations in facial attributes and similar-looking identities, while introducing "Goldilocks" rules to ensure balanced difficulty and demographic fairness without artificially degrading image quality.

Haiyu Wu, Sicong Tian, Aman Bhatta, Jacob Gutierrez, Grace Bezold, Genesis Argueta, Karl Ricanek Jr., Michael C. King, Kevin W. Bowyer | 2026-03-10 | 💻 cs

Exploring Diffusion Models' Corruption Stage in Few-Shot Fine-tuning and Mitigating with Bayesian Neural Networks

This paper identifies a "corruption stage" in few-shot fine-tuned diffusion models caused by a narrowed learning distribution and proposes a Bayesian Neural Network approach with variational inference to broaden this distribution, thereby mitigating corruption and improving image fidelity, quality, and diversity without additional inference costs.

Xiaoyu Wu, Jiaru Zhang, Yang Hua, Bohan Lyu, Hao Wang, Tao Song, Haibing Guan | 2026-03-10 | 🤖 cs.LG

Autoassociative Learning of Structural Representations for Modeling and Classification in Medical Imaging

This paper introduces a neurosymbolic system that reconstructs medical images using visual primitives to generate high-level structural explanations, achieving superior classification accuracy and transparency compared to conventional deep learning models in diagnosing histological abnormalities.

Zuzanna Buchnajzer, Kacper Dobek, Stanisław Hapke, Daniel Jankowski, Krzysztof Krawiec | 2026-03-10 | 🤖 cs.LG

From Pixels to Predicates: Learning Symbolic World Models via Pretrained Vision-Language Models

This paper proposes a method that leverages pretrained vision-language models to learn compact, abstract symbolic world models from limited visual demonstrations, enabling zero-shot generalization and long-horizon planning for complex robotic tasks across novel objects, environments, and goals.

Ashay Athalye, Nishanth Kumar, Tom Silver, Yichao Liang, Jiuguang Wang, Tomás Lozano-Pérez, Leslie Pack Kaelbling | 2026-03-10 | 🤖 cs.LG