Understanding Reasoning in LLMs through Strategic Information Allocation under Uncertainty Paper • 2603.15500 • Published 2 days ago
Scaling Data Difficulty: Improving Coding Models via Reinforcement Learning on Fresh and Challenging Problems Paper • 2603.07779 • Published 10 days ago
Breaking Training Bottlenecks: Effective and Stable Reinforcement Learning for Coding Models Paper • 2603.07777 • Published 10 days ago
Sparse-BitNet: 1.58-bit LLMs are Naturally Friendly to Semi-Structured Sparsity Paper • 2603.05168 • Published 13 days ago
Scaling Agentic Capabilities, Not Context: Efficient Reinforcement Finetuning for Large Toolspaces Paper • 2603.06713 • Published 13 days ago
Learning When to Act or Refuse: Guarding Agentic Reasoning Models for Safe Multi-Step Tool Use Paper • 2603.03205 • Published 15 days ago
Self-Hinting Language Models Enhance Reinforcement Learning Paper • 2602.03143 • Published Feb 3
Training LLMs for Divide-and-Conquer Reasoning Elevates Test-Time Scalability Paper • 2602.02477 • Published Feb 2
Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge Paper • 2601.08808 • Published Jan 13