LAMM-ViT: AI Face Detection via Layer-Aware Modulation of Region-Guided Attention
The paper introduces LAMM-ViT, a Vision Transformer for detecting AI-generated faces. It combines Region-Guided Multi-Head Attention with dynamic Layer-aware Mask Modulation to capture hierarchical structural inconsistencies in faces produced by diverse generative models, and it reports state-of-the-art generalization performance.
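
To make the two named components concrete, the following is a minimal sketch, not the authors' implementation. It assumes that region guidance can be modeled as an additive bias on the attention logits for patches inside a facial region, and that layer awareness can be modeled as a per-layer strength for that bias. The function names (`layer_strength`, `region_guided_attention`), the linear depth schedule, and the 0/1 region mask are all hypothetical; the summary describes the modulation as dynamic, so the fixed schedule here is only a stand-in.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def layer_strength(layer, num_layers, base=1.0):
    """Hypothetical schedule: region guidance grows linearly with depth.

    A learned, input-dependent modulation would replace this in practice.
    """
    return base * (layer + 1) / num_layers


def region_guided_attention(q, k, v, region_mask, strength):
    """Single-head attention with logits biased toward a region (assumed form).

    q, k, v: lists of per-patch vectors, one vector per patch.
    region_mask: one 0/1 entry per patch, 1 for patches inside the region.
    strength: scalar that scales the region bias for the current layer.
    """
    d = len(q[0])
    out = []
    for qi in q:
        # Scaled dot-product logit plus the layer-scaled region bias.
        logits = [
            sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) + strength * mj
            for kj, mj in zip(k, region_mask)
        ]
        weights = softmax(logits)
        # Weighted sum of value vectors, computed channel by channel.
        out.append(
            [sum(w * vj[c] for w, vj in zip(weights, v)) for c in range(len(v[0]))]
        )
    return out
```

In this sketch, a strength of zero reduces to ordinary scaled dot-product attention, and a larger strength shifts attention mass toward the masked region. A multi-head version would assign a different region mask to each head, which is one plausible reading of "region-guided" but is likewise an assumption.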