- Less is More: Recursive Reasoning with Tiny Networks
  Paper • 2510.04871 • Published • 463
- SpikingBrain Technical Report: Spiking Brain-inspired Large Models
  Paper • 2509.05276 • Published • 3
- Self-Adapting Language Models
  Paper • 2506.10943 • Published • 6
- The Art of Scaling Reinforcement Learning Compute for LLMs
  Paper • 2510.13786 • Published • 30
Collections including paper arxiv:2506.10943
- Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models
  Paper • 2312.06585 • Published • 29
- Enable Language Models to Implicitly Learn Self-Improvement From Data
  Paper • 2310.00898 • Published • 23
- ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent
  Paper • 2312.10003 • Published • 44
- Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
  Paper • 2401.01335 • Published • 68

- microsoft/Phi-4-mini-flash-reasoning
  Text Generation • 4B • Updated • 2.76k • 246
- Promptomatix: An Automatic Prompt Optimization Framework for Large Language Models
  Paper • 2507.14241 • Published • 17
- Kosmos-2.5: A Multimodal Literate Model
  Paper • 2309.11419 • Published • 55
- Self-Adapting Language Models
  Paper • 2506.10943 • Published • 6