Maximizing Data Efficiency for Cross-Lingual TTS Adaptation by Self-Supervised Representation Mixing and Embedding Initialization Paper • 2402.01692 • Published Jan 23, 2024
Generative Speech Foundation Model Pretraining for High-Quality Speech Extraction and Restoration Paper • 2409.16117 • Published Sep 24, 2024
DeSTA2.5-Audio: Toward General-Purpose Large Audio Language Model with Self-Generated Cross-Modal Alignment Paper • 2507.02768 • Published Jul 3, 2025