Energy-Based Transformers are Scalable Learners and Thinkers Paper • 2507.02092 • Published Jul 2 • 69
Model Already Knows the Best Noise: Bayesian Active Noise Selection via Attention in Video Diffusion Model Paper • 2505.17561 • Published May 23 • 31
Autoregressive Image Generation without Vector Quantization Paper • 2406.11838 • Published Jun 17, 2024 • 5
PLADIS: Pushing the Limits of Attention in Diffusion Models at Inference Time by Leveraging Sparsity Paper • 2503.07677 • Published Mar 10 • 86