BabAR: from phoneme recognition to developmental measures of young children's speech production

The paper introduces BabAR, a cross-linguistic automatic phoneme recognition system trained on the newly curated TinyVox corpus of over half a million child vocalizations. It shows that multilingual pretraining and contextual fine-tuning yield accurate measures of speech maturity, which supports large-scale developmental speech analysis.

Marvin Lavechin, Elika Bergelson, Roger Levy | 2026-03-06 | eess