Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs Paper • 2508.14896 • Published Aug 20 • 22
view article Article How I Built 7 Custom Gradio Components in Just 12 Days! By elismasilva • Aug 12 • 7
MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning Paper • 2507.16812 • Published Jul 22 • 63
view article Article Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face Jul 29 • 190
Kimi-K2 Collection Moonshot's MoE LLMs with 1 trillion parameters, exceptional on agentic intellegence • 3 items • Updated 1 day ago • 128
view article Article Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment By NormalUhr • Feb 11 • 79
Understanding R1-Zero-Like Training: A Critical Perspective Paper • 2503.20783 • Published Mar 26 • 56
view article Article Illustrating Reinforcement Learning from Human Feedback (RLHF) Dec 9, 2022 • 361
view article Article Multi-Label Classification Model From Scratch: Step-by-Step Tutorial By Valerii-Knowledgator • Jan 8, 2024 • 47