DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search Paper • 2509.25454 • Published Sep 29 • 136
Diffusion Transformers with Representation Autoencoders Paper • 2510.11690 • Published 21 days ago • 160
MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use Paper • 2509.24002 • Published Sep 28 • 170
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published 21 days ago • 169
AuON: A Linear-time Alternative to Semi-Orthogonal Momentum Updates Paper • 2509.24320 • Published Sep 29 • 1