DFT
This model was presented in the paper "On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification".
Code: https://github.com/yongliang-wu/DFT?tab=readme-ov-file