hdong0/Qwen3-8B-base-Open-R1-GRPO_dapo_acc_16384_nokl Text Generation • 8B • Updated 24 days ago • 122
hdong0/Qwen3-8B-base-Open-R1-GRPO_dapo_acc_4096_nokl Text Generation • 8B • Updated 25 days ago • 171
hdong0/Qwen3-8B-base-Open-R1-GRPO_dapo_acc_2048_nokl Text Generation • 8B • Updated 26 days ago • 145
hdong0/Qwen3-8B-base-Open-R1-GRPO_dapo_acc_2048_to_16384_nokl Text Generation • 8B • Updated 22 days ago • 253
hdong0/Qwen3-8B-base-Open-R1-GRPO_dapo_acc_4096_to_16384_nokl Text Generation • 8B • Updated 20 days ago • 135
hdong0/Qwen3-8B-base-Open-R1-GRPO_dapo_acc_8192_to_16384_nokl Text Generation • 8B • Updated 19 days ago • 160
DCAgent/staqc-sandboxes-traces-terminus-2_Qwen3-8B-Base Text Generation • 308k • Updated 9 days ago • 26
ChenWu98/zebra_logic_train_for_sft_correct_responses_gpt-oss-20b_shortest_from_qwen3-8b-base Updated about 5 hours ago