hdong0/Qwen3-1.7B-base-Open-R1-GRPO_deepscaler_acc_8192_nokl • Text Generation • 2B • Updated Oct 7 • 10
hdong0/deepseek-Qwen-1.5B-batch-mix-GRPO_deepscaler_acc_seq_end_mask_thin_mu_8_warmed_4x4 • Text Generation • 2B • Updated 27 days ago • 58
hdong0/deepseek-Qwen-7B-batch-mix-GRPO_deepscaler_acc_seq_end_mask_thin_mu_8_warmed_4x4 • Text Generation • 8B • Updated 25 days ago • 80