Transformers
PyTorch
English
llama
reward model
RLHF
RLAIF
text-generation-inference

Commit History

Update README.md
9dcf3eb

banghua committed on

Delete global_step1400
de54e9f

banghua committed on

Create README.md
126c676

banghua committed on

Duplicate from banghua/n_rm
6f8f5dc

Banghua Zhu committed on