DPO-RM/Qwen2.5-Math-1.5B-prime-no_logSoftmax_refRM-beta1-eurus_rl_15k-step110-reward
Safetensors
qwen2
main
Commit History
Add files using upload-large-folder tool
b7e7e91 (verified), FlippyDora committed on May 5
initial commit
aa67ca3 (verified), FlippyDora committed on May 5