Masking Teacher and Reinforcing Student for Distilling Vision-Language Models Paper • 2512.22238 • Published 8 days ago • 16
Temporal Alignment Guidance: On-Manifold Sampling in Diffusion Models Paper • 2510.11057 • Published Oct 13, 2025 • 30
EXAONE 4.0: Unified Large Language Models Integrating Non-reasoning and Reasoning Modes Paper • 2507.11407 • Published Jul 15, 2025 • 58
Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation Paper • 2507.10524 • Published Jul 14, 2025 • 70
Revisiting Multi-Agent Debate as Test-Time Scaling: A Systematic Study of Conditional Effectiveness Paper • 2505.22960 • Published May 29, 2025 • 16
Learning Adaptive Parallel Reasoning with Language Models Paper • 2504.15466 • Published Apr 21, 2025 • 44
Phantom of Latent for Large Language and Vision Models Paper • 2409.14713 • Published Sep 23, 2024 • 28
Scaling Synthetic Data Creation with 1,000,000,000 Personas Paper • 2406.20094 • Published Jun 28, 2024 • 104
Towards Fast Multilingual LLM Inference: Speculative Decoding and Specialized Drafters Paper • 2406.16758 • Published Jun 24, 2024 • 20
Towards Fast Inference: Exploring and Improving Blockwise Parallel Drafts Paper • 2404.09221 • Published Apr 14, 2024 • 1
Towards Fast Multilingual LLM Inference: Speculative Decoding and Specialized Drafters Paper • 2406.16758 • Published Jun 24, 2024 • 20
Towards Fast Multilingual LLM Inference: Speculative Decoding and Specialized Drafters Paper • 2406.16758 • Published Jun 24, 2024 • 20 • 3
Block Transformer: Global-to-Local Language Modeling for Fast Inference Paper • 2406.02657 • Published Jun 4, 2024 • 41
Block Transformer: Global-to-Local Language Modeling for Fast Inference Paper • 2406.02657 • Published Jun 4, 2024 • 41