RaushanTurganbay/reward_model_deberta_large_Anthropic_hh

Text Classification

RaushanTurganbay HF Staff committed on Dec 2, 2023

Commit 61a8927 · 1 Parent(s): dbfbc30

Update README.md

Files changed (1)

README.md +2 -1

README.md CHANGED

@@ -7,7 +7,8 @@ language:
 pipeline_tag: text-classification
 ---
-deberta-large-v3-reward-model: A reward model trained on deberta-large-v3 using Anthropic-hh dataset.
+A reward model trained on deberta-large-v3 using Anthropic-hh dataset. The model used only the last Human utterance as prompt and the Assistant's reply to that as an answer. It achieves an accuracy of 87% on this dataset.
 To use this model for reward scoring:
 ```python