Workflow-GYM: Towards Long-Horizon Evaluation of Computer-use Agentic tasks in Real-World Professional Fields • Paper • 2606.11042 • Published 15 days ago • 21
Reverse-Engineered Reasoning for Open-Ended Generation • Paper • 2509.06160 • Published Sep 7, 2025 • 151
CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization • Paper • 2507.06181 • Published Jul 8, 2025 • 45