End-to-end Whispered Speech Recognition with Frequency-weighted Approaches and Pseudo Whisper Pre-training Paper • 2005.01972 • Published May 5, 2020
USAD 2.0: Scaling Representation Distillation for Universal Audio Understanding Paper • 2606.06444 • Published 9 days ago • 3
Improving Multilingual Speech Models on ML-SUPERB 2.0: Fine-tuning with Data Augmentation and LID-Aware CTC Paper • 2505.24200 • Published May 30, 2025
Nemotron 3 Nano Omni: Efficient and Open Multimodal Intelligence Paper • 2604.24954 • Published Apr 27 • 25