Alan

wizardII

·

wizard-III

AI & ML interests

RL & LLM

Recent Activity

updated a collection about 1 month ago

Archer2.0

5 items • Updated Oct 8 • 1

updated a model about 1 month ago

Fate-Zero/Archer2.0-Code-1.5B-Preview

2B • Updated Oct 8 • 2 • 3

upvoted 3 papers about 1 month ago

ASPO: Asymmetric Importance Sampling Policy Optimization

Paper • 2510.06062 • Published Oct 7 • 13

DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

Paper • 2509.25454 • Published Sep 29 • 137

Attention as a Compass: Efficient Exploration for Process-Supervised RL in Reasoning Models

Paper • 2509.26628 • Published Sep 30 • 14

New activity in deepseek-ai/DeepSeek-R1-0528-Qwen3-8B about 2 months ago

This model was distilled using only SFT, or through a combination of SFT and RL?

#23 opened about 2 months ago by

upvoted a paper about 2 months ago

Tree Search for LLM Agent Reinforcement Learning

Paper • 2509.21240 • Published Sep 25 • 87

updated a Space about 2 months ago

README

published a Space about 2 months ago

README

New activity in Fate-Zero/Archer2.0-Code-1.5B-Train_IS_Ratio about 2 months ago

[bot] Conversion to Parquet

#1 opened about 2 months ago by

parquet-converter

updated 2 datasets about 2 months ago

Fate-Zero/Archer2.0-Math-1.5B

Viewer • Updated Sep 26 • 70.8k • 59

Fate-Zero/Archer2.0-Code-1.5B-Train_IS_Ratio

Viewer • Updated Sep 25 • 87.6k • 16

published a dataset about 2 months ago

Fate-Zero/Archer2.0-Code-1.5B-Train_IS_Ratio

Viewer • Updated Sep 25 • 87.6k • 16

liked a model about 2 months ago

Fate-Zero/Archer2.0-Code-1.5B-Preview

2B • Updated Oct 8 • 2 • 3

updated a dataset 2 months ago

Fate-Zero/Archer2.0-Code-1.5B

Viewer • Updated Sep 8 • 8.87k • 100

updated a collection 2 months ago

Archer2.0

5 items • Updated Oct 8 • 1

published a dataset 2 months ago

Fate-Zero/Archer2.0-Math-1.5B

Viewer • Updated Sep 26 • 70.8k • 59

updated 2 collections 2 months ago

Archer1.0

4 items • Updated Sep 9

Archer2.0

5 items • Updated Oct 8 • 1

published a model 2 months ago

Fate-Zero/Archer2.0-Math-1.5B-Preview