GGT-100K: Generative Ground Truth for Generalizable Real-World Image Restoration Paper • 2605.31039 • Published 5 days ago • 34
SmartDirector: Keyframe-Conditioned Cinematic Video Generation with Narrative Pacing Control Paper • 2605.27891 • Published 7 days ago • 6
minWM: A Full-Stack Open-Source Framework for Real-Time Interactive Video World Models Paper • 2605.30263 • Published 6 days ago • 53
Lens: Rethinking Training Efficiency for Foundational Text-to-Image Models Paper • 2605.21573 • Published 14 days ago • 107
LiteFrame: Efficient Vision Encoders Unlock Frame Scaling in Video LLMs Paper • 2605.17260 • Published 17 days ago • 25
KVPO: ODE-Native GRPO for Autoregressive Video Alignment via KV Semantic Exploration Paper • 2605.14278 • Published 20 days ago • 37
Lance: Unified Multimodal Modeling by Multi-Task Synergy Paper • 2605.18678 • Published 16 days ago • 78
LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation Paper • 2605.18739 • Published 16 days ago • 112
DiffusionOPD: A Unified Perspective of On-Policy Distillation in Diffusion Models Paper • 2605.15055 • Published 20 days ago • 19
Causal Forcing++: Scalable Few-Step Autoregressive Diffusion Distillation for Real-Time Interactive Video Generation Paper • 2605.15141 • Published 20 days ago • 93
AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation Paper • 2605.13724 • Published 21 days ago • 101
World Action Models: The Next Frontier in Embodied AI Paper • 2605.12090 • Published 22 days ago • 67
SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture Paper • 2605.12500 • Published 22 days ago • 191