Connecting Voices: LoReSpeech as a Low-Resource Speech Parallel Corpus
This paper introduces LoReSpeech, a low-resource speech-to-speech translation corpus. It is constructed by aligning short collaborative recordings (LoReASR) with long-form audio using forced-alignment tools such as the Montreal Forced Aligner (MFA). The corpus aims to advance multilingual ASR, direct speech translation, and linguistic preservation for underrepresented languages.