BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution Paper • 2510.08697 • Published 23 days ago • 33
Estimating the Carbon Footprint of BLOOM, a 176B Parameter Language Model Paper • 2211.02001 • Published Nov 3, 2022
Hype, Sustainability, and the Price of the Bigger-is-Better Paradigm in AI Paper • 2409.14160 • Published Sep 21, 2024 • 3
From Efficiency Gains to Rebound Effects: The Problem of Jevons' Paradox in AI's Polarized Environmental Debate Paper • 2501.16548 • Published Jan 27
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language Paper • 2506.20920 • Published Jun 26 • 73
SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published Apr 7 • 200
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 243
Towards Best Practices for Open Datasets for LLM Training Paper • 2501.08365 • Published Jan 14 • 63
SelfCodeAlign: Self-Alignment for Code Generation Paper • 2410.24198 • Published Oct 31, 2024 • 24
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published Jun 25, 2024 • 97
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations Paper • 2405.18392 • Published May 28, 2024 • 12
StarCoder 2 and The Stack v2: The Next Generation Paper • 2402.19173 • Published Feb 29, 2024 • 148
Power Hungry Processing: Watts Driving the Cost of AI Deployment? Paper • 2311.16863 • Published Nov 28, 2023 • 6
What's in the Box? A Preliminary Analysis of Undesirable Content in the Common Crawl Corpus Paper • 2105.02732 • Published May 6, 2021