DreamFoley: Scalable VLMs for High-Fidelity Video-to-Audio Generation • Paper • 2512.06022 • Published 28 days ago • 3
EMMA: Efficient Multimodal Understanding, Generation, and Editing with a Unified Architecture • Paper • 2512.04810 • Published 27 days ago • 25