Thinking-while-Generating: Interleaving Textual Reasoning throughout Visual Generation Paper • 2511.16671 • Published Nov 20 • 15
Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm Paper • 2511.04570 • Published Nov 6 • 210
Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models Paper • 2305.16223 • Published May 25, 2023
COVE: Unleashing the Diffusion Feature Correspondence for Consistent Video Editing Paper • 2406.08850 • Published Jun 13, 2024
Everything to the Synthetic: Diffusion-driven Test-time Adaptation via Synthetic-Domain Alignment Paper • 2406.04295 • Published Jun 6, 2024
Efficient Diffusion Transformer with Step-wise Dynamic Attention Mediators Paper • 2408.05710 • Published Aug 11, 2024 • 2
ENAT: Rethinking Spatial-temporal Interactions in Token-based Image Synthesis Paper • 2411.06959 • Published Nov 11, 2024
IMG: Calibrating Diffusion Models via Implicit Multimodal Guidance Paper • 2509.26231 • Published Sep 30 • 17