Miguel Guerrero
PRO
apol
34 followers · 196 following
https://miguelguerrero.eu
apolmig
AI & ML interests
nlp, avatars, gans, time series, memory, education, govtech
Recent Activity
upvoted an article about 1 hour ago
Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries
reacted to sergiopaniego's post with 🚀 about 1 hour ago
We just released a big blog surveying 16 OSS frameworks for async RL training of LLMs! We're building a new async GRPO trainer for TRL and as first step, we needed to understand how the ecosystem solves this problem today.

The problem: in synchronous RL training, generation dominates wall-clock time. 32K-token rollouts on a 32B model take hours while training GPUs sit completely idle. With reasoning models and agentic RL making rollouts longer and more variable, this only gets worse.

The ecosystem converged on the same fix: separate inference + training onto different GPU pools, rollout buffer, and async weight sync.

We compared 16 frameworks across 7 axes: orchestration, buffer design, weight sync, staleness management, partial rollouts, LoRA, and MoE support.

This survey is step one. The async GRPO trainer for TRL is next!

https://huggingface.co/blog/async-rl-training-landscape
updated a Space 1 day ago
apol/spain-persona-research-observatory
Organizations
apol's models (3)
apol/med-llm-triage-es • 1B • Updated Feb 18 • 22 • 1
apol/test • Updated Mar 10, 2022
apol/dalle-mini • Text-to-Image • Updated Aug 17, 2021 • 11 • 9