Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled Image-Text-to-Text • 28B • Updated 8 days ago • 585k • 2.62k
view article Article DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge Feb 7, 2025 • 286
view article Article Transformers v5: Simple model definitions powering the AI ecosystem +2 Dec 1, 2025 • 307