OpenGVLab/InternVL3_5-GPT-OSS-20B-A4B-Preview Image-Text-to-Text • 0.4B • Updated Aug 29 • 37.6k • 74
Article Simplifying Alignment: From RLHF to Direct Preference Optimization (DPO) By ariG23498 • Jan 19 • 32