cutelemonlili/Qwen2.5-Math-7B_lora_MATH_training_Qwen_QwQ_32B_Preview_checkpoint-1500 Updated Jul 20, 2025
cutelemonlili/Qwen2.5-Math-7B_lora_MATH_training_Qwen_QwQ_32B_Preview_checkpoint-750 Updated Jul 20, 2025 • 1
cutelemonlili/Qwen2.5-Math-7B_lora_MATH_training_Qwen_QwQ_32B_Preview_checkpoint-1000 Updated Jul 20, 2025
cutelemonlili/Qwen2.5-Math-7B_lora_MATH_training_Qwen_QwQ_32B_Preview_checkpoint-1250 Updated Jul 20, 2025 • 1
cutelemonlili/Qwen2.5-Math-7B_lora_MATH_training_Qwen_QwQ_32B_Preview_checkpoint-250 Updated Jul 20, 2025 • 1
cutelemonlili/Qwen2.5-Math-7B_lora_MATH_training_Qwen_QwQ_32B_Preview_checkpoint-500 Updated Jul 20, 2025
cutelemonlili/Qwen2.5-Math-7B_lora_MATH_training_Qwen_QwQ_32B_Preview_checkpoint-2000 Updated Jul 20, 2025 • 1
cutelemonlili/Qwen2.5-Math-7B_lora_MATH_training_Qwen_QwQ_32B_Preview_checkpoint-1750 Updated Jul 20, 2025
Pritish92/lavida-variant-D-seed0-oracleaug-alpha0p001 Reinforcement Learning • Updated 5 days ago • 21
Pritish92/lavida-variant-B-seed0-selfdistill-alpha0p02 Reinforcement Learning • Updated 5 days ago • 22