Common Reasoning
  TIGER-Lab/WebInstruct-verified Viewer • Updated about 22 hours ago • 462k • 542 • 49
Coder SFT Data
  ise-uiuc/Magicoder-Evol-Instruct-110K Viewer • Updated Dec 28, 2023 • 111k • 1.93k • 166
  theblackcat102/evol-codealpaca-v1 Viewer • Updated Mar 10, 2024 • 111k • 1.67k • 170
  Multilingual-Multimodal-NLP/McEval-Instruct Viewer • Updated Jun 12, 2024 • 35.9k • 81 • 35
  KodCode/KodCode-V1-SFT-4o Viewer • Updated Mar 16 • 410k • 280 • 10
Coder DPO
  argilla/ultrafeedback-binarized-preferences-cleaned Viewer • Updated Dec 11, 2023 • 60.9k • 2.93k • 152
  argilla/ultrafeedback-multi-binarized-quality-preferences-cleaned Viewer • Updated Dec 11, 2023 • 155k • 32 • 5
Funny Questions (Long-COT)
  JackGao/brain-teaser-chinese Viewer • Updated Mar 4 • 1.15k • 14 • 5
  Conard/fortune-telling Viewer • Updated Feb 17 • 207 • 538 • 166
Reasoning Model
  deepcogito/cogito-v1-preview-qwen-32B Text Generation • 33B • Updated Apr 8 • 126 • 115
Pretrain Data Utils
  mlfoundations/fasttext-oh-eli5 Updated Aug 1, 2024 • 28
  hkust-nlp/preselect-fasttext-classifier Text Classification • Updated Mar 6 • 10 • 8
  HuggingFaceFW/fineweb-edu-classifier Text Classification • 0.1B • Updated Nov 17, 2024 • 2.3k • 199
Coder SFT Data (Long-COT)
  nvidia/Llama-Nemotron-Post-Training-Dataset Viewer • Updated May 8 • 3.91M • 6.6k • 608
  open-r1/codeforces-cots Viewer • Updated Mar 28 • 254k • 6.82k • 194
  nvidia/OpenCodeReasoning Viewer • Updated May 4 • 753k • 3.29k • 513
  nvidia/OpenCodeReasoning-2 Viewer • Updated May 17 • 2.16M • 1.66k • 47
Math SFT Data
  BytedTsinghua-SIA/DAPO-Math-17k Viewer • Updated Apr 18 • 1.79M • 8.29k • 121
  nvidia/OpenMathInstruct-2 Viewer • Updated Nov 25, 2024 • 22M • 13.3k • 212
  nvidia/OpenMathReasoning Viewer • Updated May 27 • 5.68M • 12.2k • 361
  miromind-ai/MiroMind-M1-SFT-719K Viewer • Updated Jul 22 • 719k • 908 • 14
WebPage Related
  HuggingFaceM4/WebSight Viewer • Updated Mar 26, 2024 • 2.75M • 12k • 377
  bytedance-research/Web-Bench Viewer • Updated May 19 • 1k • 277 • 7
  luzimu/WebGen-Bench Viewer • Updated Sep 29 • 6.77k • 163 • 1
Coder Models
  agentica-org/DeepCoder-14B-Preview Text Generation • 15B • Updated May 11 • 568 • 678
  Qwen/Qwen2.5-Coder-32B-Instruct Text Generation • 33B • Updated Jan 12 • 275k • 1.95k