RefineBench: Evaluating Refinement Capability of Language Models via Checklists Paper • 2511.22173 • Published 7 days ago • 12
DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research Paper • 2511.19399 • Published 9 days ago • 53
OmniScientist: Toward a Co-evolving Ecosystem of Human and AI Scientists Paper • 2511.16931 • Published 13 days ago • 6
WorldGen: From Text to Traversable and Interactive 3D Worlds Paper • 2511.16825 • Published 13 days ago • 20
O-Mem: Omni Memory System for Personalized, Long Horizon, Self-Evolving Agents Paper • 2511.13593 • Published 16 days ago • 24
OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe Paper • 2511.16334 • Published 13 days ago • 90
Nemotron Elastic: Towards Efficient Many-in-One Reasoning LLMs Paper • 2511.16664 • Published 13 days ago • 24
Can World Simulators Reason? Gen-ViRe: A Generative Visual Reasoning Benchmark Paper • 2511.13853 • Published 16 days ago • 34