$V_{0.5}$: Generalist Value Model as a Prior for Sparse RL Rollouts
Paper
• 2603.10848 • Published
$V_{0.5}$: Generalist Value Model as a Prior for Sparse RL Rollouts
ScaleEnv: Scaling Environment Synthesis from Scratch for Generalist Interactive Tool-Use Agent Training