Beyond Multiple Choice: Verifiable OpenQA for Robust Vision-Language RFT • Paper • 2511.17405 • Published 13 days ago • 10
SofT-GRPO: Surpassing Discrete-Token LLM Reinforcement Learning via Gumbel-Reparameterized Soft-Thinking Policy Optimization • Paper • 2511.06411 • Published 25 days ago • 16
Generating an Image From 1,000 Words: Enhancing Text-to-Image With Structured Captions • Paper • 2511.06876 • Published 24 days ago • 26
Routing Manifold Alignment Improves Generalization of Mixture-of-Experts LLMs • Paper • 2511.07419 • Published 24 days ago • 25
IterResearch: Rethinking Long-Horizon Agents via Markovian State Reconstruction • Paper • 2511.07327 • Published 24 days ago • 73
HaluMem: Evaluating Hallucinations in Memory Systems of Agents • Paper • 2511.03506 • Published 29 days ago • 92
Grounding Computer Use Agents on Human Demonstrations • Paper • 2511.07332 • Published 24 days ago • 103
Adaptive Multi-Agent Response Refinement in Conversational Systems • Paper • 2511.08319 • Published 23 days ago • 40
VideoSSR: Video Self-Supervised Reinforcement Learning • Paper • 2511.06281 • Published 25 days ago • 24
The Path Not Taken: RLVR Provably Learns Off the Principals • Paper • 2511.08567 • Published 23 days ago • 31
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B • Paper • 2511.06221 • Published 25 days ago • 127
Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds • Paper • 2511.08892 • Published 22 days ago • 192
SliderEdit: Continuous Image Editing with Fine-Grained Instruction Control • Paper • 2511.09715 • Published 22 days ago • 8
Black-Box On-Policy Distillation of Large Language Models • Paper • 2511.10643 • Published 21 days ago • 46
PAN: A World Model for General, Interactable, and Long-Horizon World Simulation • Paper • 2511.09057 • Published 22 days ago • 74
Depth Anything 3: Recovering the Visual Space from Any Views • Paper • 2511.10647 • Published 21 days ago • 91
SpatialThinker: Reinforcing 3D Reasoning in Multimodal LLMs via Spatial Rewards • Paper • 2511.07403 • Published 24 days ago • 13
LiteAttention: A Temporal Sparse Attention for Diffusion Transformers • Paper • 2511.11062 • Published 20 days ago • 30