wh-zhu's Collections
PSFT

updated Aug 28

PSFT+RL models

  • wh-zhu/Qwen2.5-7B-PSFT-RL-DAPO-90

    8B • Updated Aug 12 • 4

  • wh-zhu/Qwen2.5-7B-Instruct-PSFT-1300

    8B • Updated Jul 26 • 3

  • wh-zhu/Qwen2.5-7B-SFT-RL-DAPO-90

    8B • Updated Aug 13 • 3

  • wh-zhu/Qwen2.5-7B-Instruct-SFT-700

    8B • Updated Jul 26 • 3

  • wh-zhu/llama3.1-8B-PSFT-dapo90

    8B • Updated Aug 13 • 3

  • wh-zhu/Llama3.1-8B-Instruct-PSFT-1500

    8B • Updated Jul 26 • 3

  • wh-zhu/Llama3.1-8B-Instruct-SFT-1200

    8B • Updated Jul 27 • 3

  • wh-zhu/llama3.1-8B-SFT-dapo100

    8B • Updated Aug 14

  • wh-zhu/LLama3.1-8B-Instruct-SFT200warmup-PSFT

    8B • Updated Aug 19

  • wh-zhu/Qwen2.5-7B-Instruct-SFT100warmup-PSFT

    8B • Updated Aug 19