Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence Paper • 2510.20579 • Published 6 days ago • 50
Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs Paper • 2510.18876 • Published 8 days ago • 35
Grasp-Any-Region Collection Models and datasets for Grasp-Any-Region • 4 items • Updated 8 days ago • 1
Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR Paper • 2508.14029 • Published Aug 19 • 118
Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology Paper • 2507.07999 • Published Jul 10 • 48