Papers
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published Jan 28 • 123

Alignment Dataset
English and other model alignment datasets.
H-D-T/Buzz-8b-Large-v0.5 Text Generation • 8B • Updated May 14, 2024 • 31 • 29
allenai/WildChat-1M Viewer • Updated Oct 17, 2024 • 838k • 7.9k • 401
nvidia/ChatQA-Training-Data Viewer • Updated Jun 4, 2024 • 442k • 835 • 172
nvidia/ChatRAG-Bench Viewer • Updated May 24, 2024 • 34.6k • 2.02k • 114

Personalization LLM
User-LLM: Efficient LLM Contextualization with User Embeddings Paper • 2402.13598 • Published Feb 21, 2024 • 20

Indic Datasets
List of text and voice datasets to train and finetune Indic LLMs
ai4bharat/sangraha Viewer • Updated Mar 5 • 268M • 15.9k • 65
uonlp/CulturaX Viewer • Updated Dec 16, 2024 • 7.18B • 10.5k • 556
pary/hind_encorp Updated Jan 18, 2024 • 104 • 2
PleIAs/YouTube-Commons Updated Jun 26, 2024 • 4.19k • 367