CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents Paper • 2603.24440 • Published 2 days ago • 83
Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs? Paper • 2603.24472 • Published 2 days ago • 37
CohereLabs/cohere-transcribe-03-2026 Automatic Speech Recognition • Updated about 5 hours ago • 3.96k • 250
Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale Paper • 2603.25040 • Published 1 day ago • 82
From Static Templates to Dynamic Runtime Graphs: A Survey of Workflow Optimization for LLM Agents Paper • 2603.22386 • Published 4 days ago • 49
DA-Flow: Degradation-Aware Optical Flow Estimation with Diffusion Models Paper • 2603.23499 • Published 3 days ago • 46
Repurposing Geometric Foundation Models for Multi-view Diffusion Paper • 2603.22275 • Published 4 days ago • 41
Rethinking Token-Level Policy Optimization for Multimodal Chain-of-Thought Paper • 2603.22847 • Published 4 days ago • 21
OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis Paper • 2603.20278 • Published 10 days ago • 86
ThinkJEPA: Empowering Latent World Models with Large Vision-Language Reasoning Model Paper • 2603.22281 • Published 4 days ago • 13
mSFT: Addressing Dataset Mixtures Overfitting Heterogeneously in Multi-task SFT Paper • 2603.21606 • Published 5 days ago • 36
LongCat-Flash-Prover: Advancing Native Formal Reasoning via Agentic Tool-Integrated Reinforcement Learning Paper • 2603.21065 • Published 6 days ago • 71
Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model Paper • 2603.21986 • Published 4 days ago • 111
On the Direction of RLVR Updates for LLM Reasoning: Identification and Exploitation Paper • 2603.22117 • Published 4 days ago • 22
Beyond Single Tokens: Distilling Discrete Diffusion Models via Discrete MMD Paper • 2603.20155 • Published 7 days ago • 8
3DreamBooth: High-Fidelity 3D Subject-Driven Video Generation Model Paper • 2603.18524 • Published 9 days ago • 57