Repo for the paper "Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability".
Qihan Ren
jasonrqh
AI & ML interests
explainable AI, LLM
Recent Activity
upvoted a paper about 2 hours ago
Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs?
commented on a paper about 17 hours ago
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability