firdhokk/speech-emotion-recognition-with-openai-whisper-large-v3 • Audio Classification • 0.6B • Updated Jul 17 • 19.3k • 86
MiniMax-Speech: Intrinsic Zero-Shot Text-to-Speech with a Learnable Speaker Encoder • Paper • 2505.07916 • Published May 12 • 132