DPO-RM/Qwen2.5-Math-1.5B-prime-no_logSoftmax_refRM-beta1-eurus_rl_15k-step110-reward
Safetensors
qwen2
Qwen2.5-Math-1.5B-prime-no_logSoftmax_refRM-beta1-eurus_rl_15k-step110-reward/vocab.json (branch: main)
FlippyDora: Add files using upload-large-folder tool (commit b7e7e91, verified, 7 months ago)
Safe
2.78 MB
File too large to display; check the raw version instead.