The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding Paper • 2512.19693 • Published 4 days ago • 60
OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe Paper • 2511.16334 • Published Nov 20 • 91
Scaling Spatial Intelligence with Multimodal Foundation Models Paper • 2511.13719 • Published Nov 17 • 45
PhysX-Anything: Simulation-Ready Physical 3D Assets from Single Image Paper • 2511.13648 • Published Nov 17 • 52
First Try Matters: Revisiting the Role of Reflection in Reasoning Models Paper • 2510.08308 • Published Oct 9 • 24
VChain: Chain-of-Visual-Thought for Reasoning in Video Generation Paper • 2510.05094 • Published Oct 6 • 37
4DNeX: Feed-Forward 4D Generative Modeling Made Easy Paper • 2508.13154 • Published Aug 18 • 62
GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning Paper • 2505.17022 • Published May 22 • 27
Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models Paper • 2505.10554 • Published May 15 • 120
Even Small Reasoners Should Quote Their Sources: Introducing the Pleias-RAG Model Family Paper • 2504.18225 • Published Apr 25 • 13