Leveraging Large Language Models for Automated Scalable Development of Open Scientific Databases
This paper introduces a scalable, domain-agnostic, web-based framework that uses Large Language Models (LLMs) to automate data collection, data filtering, and the construction of open scientific databases. It achieves 90% overlap with expert-curated datasets while significantly reducing manual workload.
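To make the overlap figure concrete, the following is a minimal sketch of one way such a score could be computed. The summary does not define the metric, so this is an assumption: overlap is taken here as the fraction of expert-curated entries that also appear in the automatically built database, matched on a normalized identifier. The function names and the example identifiers are hypothetical and not taken from the paper.

```python
from typing import Iterable


def normalize(identifier: str) -> str:
    """Normalize an entry identifier (e.g. a DOI) for comparison.

    Assumption: entries are matched on a case-insensitive string key.
    """
    return identifier.strip().lower()


def overlap_fraction(auto_ids: Iterable[str], expert_ids: Iterable[str]) -> float:
    """Fraction of expert-curated entries also found in the automated set.

    This is one possible definition of "overlap"; the paper may use another.
    """
    auto = {normalize(i) for i in auto_ids}
    expert = {normalize(i) for i in expert_ids}
    if not expert:
        raise ValueError("expert-curated set is empty")
    return len(auto & expert) / len(expert)


# Hypothetical identifiers, for illustration only.
expert = ["10.1000/a", "10.1000/b", "10.1000/c", "10.1000/d"]
auto = ["10.1000/A ", "10.1000/b", "10.1000/c", "10.1000/x"]

print(overlap_fraction(auto, expert))
```

Under this definition, entries that appear only in the automated set do not lower the score, so a separate precision-style measure would be needed to capture how many irrelevant entries the automated pipeline admits.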