-
Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models
Paper • 2602.12036 • Published • 95 -
Reinforcement Learning for Self-Improving Agent with Skill Library
Paper • 2512.17102 • Published • 36 -
Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation
Paper • 2512.23705 • Published • 45 -
Schoenfeld's Anatomy of Mathematical Reasoning by Language Models
Paper • 2512.19995 • Published • 16
Collections
Discover the best community collections!
Collections including paper arxiv:2512.17102
-
O-Mem: Omni Memory System for Personalized, Long Horizon, Self-Evolving Agents
Paper • 2511.13593 • Published • 27 -
OmniScientist: Toward a Co-evolving Ecosystem of Human and AI Scientists
Paper • 2511.16931 • Published • 8 -
General Agentic Memory Via Deep Research
Paper • 2511.18423 • Published • 167 -
MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling
Paper • 2511.11793 • Published • 187
-
Scaling Agent Learning via Experience Synthesis
Paper • 2511.03773 • Published • 82 -
ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration
Paper • 2511.21689 • Published • 125 -
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization
Paper • 2601.05242 • Published • 228 -
Reinforcement Learning for Self-Improving Agent with Skill Library
Paper • 2512.17102 • Published • 36
-
Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models
Paper • 2602.12036 • Published • 95 -
Reinforcement Learning for Self-Improving Agent with Skill Library
Paper • 2512.17102 • Published • 36 -
Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation
Paper • 2512.23705 • Published • 45 -
Schoenfeld's Anatomy of Mathematical Reasoning by Language Models
Paper • 2512.19995 • Published • 16
-
O-Mem: Omni Memory System for Personalized, Long Horizon, Self-Evolving Agents
Paper • 2511.13593 • Published • 27 -
OmniScientist: Toward a Co-evolving Ecosystem of Human and AI Scientists
Paper • 2511.16931 • Published • 8 -
General Agentic Memory Via Deep Research
Paper • 2511.18423 • Published • 167 -
MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling
Paper • 2511.11793 • Published • 187
-
Scaling Agent Learning via Experience Synthesis
Paper • 2511.03773 • Published • 82 -
ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration
Paper • 2511.21689 • Published • 125 -
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization
Paper • 2601.05242 • Published • 228 -
Reinforcement Learning for Self-Improving Agent with Skill Library
Paper • 2512.17102 • Published • 36