Yang Su
yang-su2000
AI & ML interests
Long-Horizon RL Agent Alignment
Recent Activity
liked a dataset about 21 hours ago: openai/gdpval
liked a dataset about 2 months ago: Agent-Ark/Toucan-1.5M
new activity 7 months ago in Qwen/Qwen3-32B: The correct way of fine-tuning on multi-turn trajectories