Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds Paper • 2511.08892 • Published 8 days ago • 167
The Best of N Worlds: Aligning Reinforcement Learning with Best-of-N Sampling via max@k Optimisation Paper • 2510.23393 • Published 23 days ago • 20
Article Building the Open Agent Ecosystem Together: Introducing OpenEnv 28 days ago • 127
The Complexity Trap: Simple Observation Masking Is as Efficient as LLM Summarization for Agent Context Management Paper • 2508.21433 • Published Aug 29 • 7
Repository-Level Pre-Trained OpenCoder 🧩 Collection All the checkpoints from Table 3 of the paper “On Pretraining for Project-Level Code Completion.” • 33 items • Updated Oct 17 • 3
PIPer: On-Device Environment Setup via Online Reinforcement Learning Paper • 2509.25455 • Published Sep 29 • 36
SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents Paper • 2505.20411 • Published May 26 • 89
OpenDevin: An Open Platform for AI Software Developers as Generalist Agents Paper • 2407.16741 • Published Jul 23, 2024 • 74
Agentless: Demystifying LLM-based Software Engineering Agents Paper • 2407.01489 • Published Jul 1, 2024 • 64
Interpretability & Analysis of LMs Collection Outstanding research in LM interpretability and evaluation, summarized • 134 items • Updated about 1 month ago • 116
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published Jun 25, 2024 • 98
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions Paper • 2406.15877 • Published Jun 22, 2024 • 48
Long Code Arena Collection All the resources for our Long Code Arena benchmark! • 13 items • Updated Mar 14 • 6
Long Code Arena: a Set of Benchmarks for Long-Context Code Models Paper • 2406.11612 • Published Jun 17, 2024 • 25
Kotlin ML Pack Collection A collection of datasets, fine-tuned models and benchmarks to train your models for perfect Kotlin code generation. • 9 items • Updated Jun 11, 2024 • 24
💫 StarCoder2 Collection StarCoder2 models and datasets! • 8 items • Updated Mar 1, 2024 • 89
ChatMusician: Understanding and Generating Music Intrinsically with LLM Paper • 2402.16153 • Published Feb 25, 2024 • 60