The Smol Training Playbook 📚 The secrets to building world-class LLMs • Featured • 2.47k
Vietnamese speech dataset Collection for any speech-related tasks including but not limited to: speech-to-text & text-to-speech, speech classification, speaker verification, etc. • 34 items • Updated Jul 8 • 35
suzii/vi-whisper-large-v3-turbo-v1 Automatic Speech Recognition • 0.8B • Updated May 6 • 421 • 13
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages Paper • 2309.09400 • Published Sep 17, 2023 • 85