Cogito, Ergo Ludo: An Agent that Learns to Play by Reasoning and Planning • Paper • 2509.25052 • Published Sep 29 • 4
Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward • Paper • 2510.03222 • Published Oct 3 • 74
PaSa: An LLM Agent for Comprehensive Academic Paper Search • Paper • 2501.10120 • Published Jan 17 • 53