On the Power of Source Screening for Learning Shared Feature Extractors
This paper shows that screening the available data sources and training only on a selected subset of high-quality, relevant ones is sufficient to learn a shared feature extractor at the statistically optimal rate, even when a substantial portion of the available data is discarded.