P1: Mastering Physics Olympiads with Reinforcement Learning Paper • 2511.13612 • Published 1 day ago
DINOv3 Collection DINOv3: foundation models producing excellent dense features, outperforming SotA without fine-tuning - https://arxiv.org/abs/2508.10104 • 13 items • Updated Aug 21
TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling Paper • 2508.17445 • Published Aug 24
ODYSSEY: Open-World Quadrupeds Exploration and Manipulation for Long-Horizon Tasks Paper • 2508.08240 • Published Aug 11
Small LM Agent Interpretable Demo 🚀 Demo accompanying the article "Interpretability with SLM finetuning"
Gemma 3 QAT Collection Quantization-Aware Trained (QAT) Gemma 3 checkpoints. The models preserve quality similar to half precision while using 3x less memory • 15 items • Updated Jul 10
The Ultra-Scale Playbook 🌌 The ultimate guide to training LLMs on large GPU clusters