DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search • Paper • 2509.25454 • Published Sep 29 • 140
Diffusion Transformers with Representation Autoencoders • Paper • 2510.11690 • Published Oct 13 • 165
MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use • Paper • 2509.24002 • Published Sep 28 • 173
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs • Paper • 2510.11696 • Published Oct 13 • 176
AuON: A Linear-time Alternative to Semi-Orthogonal Momentum Updates • Paper • 2509.24320 • Published Sep 29 • 2