MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe Paper • 2509.18154 • Published Sep 16 • 49
MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization Paper • 2510.08540 • Published 25 days ago • 108
RLPR: Extrapolating RLVR to General Domains without Verifiers Paper • 2506.18254 • Published Jun 23 • 31
LLMtimesMapReduce: Simplified Long-Sequence Processing using Large Language Models Paper • 2410.09342 • Published Oct 12, 2024 • 39