Raising Bars, Not Parameters: LilMoo, a Compact Language Model for Hindi
This paper introduces LilMoo, a 0.6-billion-parameter Hindi language model trained from scratch on a high-quality, transparently curated corpus. LilMoo outperforms similarly sized multilingual baselines, demonstrating that specialized pretraining can close low-resource language gaps without relying on opaque foundation models.