What is the difference between FSDP and DeepSpeed ZeRO-3? Is it safe to say FSDP is better than ZeRO-3 in terms of speed and memory savings?
unsloth/Qwen3-VL-8B-Instruct-unsloth-bnb-4bit Image-Text-to-Text • 9B • Updated 15 days ago • 20.4k • 7
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models Paper • 2410.07985 • Published Oct 10, 2024 • 33
MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning Paper • 2507.16812 • Published Jul 22 • 63
Article OpenReasoning-Nemotron: A Family of State-of-the-Art Distilled Reasoning Models By nvidia and 3 others • Jul 18 • 50