Hierarchical Codec Diffusion for Video-to-Speech Generation Paper • 2604.15923 • Published 29 days ago • 2
audeering/wav2vec2-large-robust-6-ft-age-gender Audio Classification • 90.8M • Updated Nov 27, 2023 • 20.8k • 6
FaceLLM Collection A multimodal large language model trained specifically for facial image understanding. Project page: https://www.idiap.ch/paper/facellm • 3 items • Updated Jul 23, 2025 • 4
DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control Paper • 2410.13830 • Published Oct 17, 2024 • 26
NaturalSpeech3 FACodec • Convert and reconstruct speech files • 178