rzzhan's Collections
ExGRPO
Updated Oct 3
A collection of models trained using ExGRPO.
rzzhan/ExGRPO-Qwen2.5-Math-7B-Zero • Text Generation • 8B • Updated 10 days ago • 89
rzzhan/ExGRPO-LUFFY-7B-Continual • Text Generation • 8B • Updated 10 days ago • 10
rzzhan/ExGRPO-Qwen2.5-7B-Instruct • Text Generation • 8B • Updated 10 days ago • 7
rzzhan/ExGRPO-Qwen2.5-Math-1.5B-Zero • Text Generation • 2B • Updated 10 days ago • 33
rzzhan/ExGRPO-Llama3.1-8B-Zero • Text Generation • 8B • Updated 10 days ago • 11
rzzhan/ExGRPO-Llama3.1-8B-Instruct • Text Generation • 8B • Updated 10 days ago • 9
ExGRPO: Learning to Reason from Experience • Paper • 2510.02245 • Published Oct 2 • 77