SRFT: A Single-Stage Method with Supervised and Reinforcement Fine-Tuning for Reasoning • Paper • 2506.19767 • Published Jun 24 • 15
Recent highlights • Collection • Some recent models worth checking out • 18 items • Updated Nov 1, 2024 • 54