SunSwallow

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 9 days ago

V-ReasonBench: Toward Unified Reasoning Benchmark Suite for Video Generation Models

Paper • 2511.16668 • Published 10 days ago • 52

upvoted 3 papers about 2 months ago

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9 • 265

Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning

Paper • 2509.22601 • Published Sep 26 • 29

Training-Free Group Relative Policy Optimization

Paper • 2510.08191 • Published Oct 9 • 44

upvoted a paper 2 months ago

From Uniform to Heterogeneous: Tailoring Policy Optimization to Every Token's Nature

Paper • 2509.16591 • Published Sep 20 • 2

upvoted a paper 3 months ago

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2 • 224

upvoted a paper 4 months ago

WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent

Paper • 2508.05748 • Published Aug 7 • 139

upvoted a collection 4 months ago

OpenMathReasoning

Models and datasets from "AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset" • 7 items • Updated 6 days ago • 45