CobraMamba/mamba-gpt-3b-v2

chiliu committed on Jul 28, 2023

Commit 13d7a7a · 1 Parent(s): 935f4d9

add benchmark

Files changed (2)

.gitattributes +1 -0
README.md +16 -0

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+pytorch_model.bin filter=lfs diff=lfs merge=lfs -text

README.md CHANGED

@@ -13,6 +13,22 @@ license: apache-2.0
 ---
 # Model Card
+** The Best 3B Model! Surpassing dolly-v2-12b **
+The best 3B model on the [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard), with performance surpassing dolly-v2-12b
+| Metric                | Value |
+|-----------------------|-------|
+| MMLU (5-shot)         | 27.1  |
+| ARC (25-shot)         | 42.2  |
+| HellaSwag (10-shot)   | 71.5  |
+| TruthfulQA (0-shot)   | 36.7  |
+| Avg.                  | 44.4  |
+We use state-of-the-art [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) to run the benchmark tests above.
 ## Summary
 We have fine-tuned the open-lama model and surpassed the original model in multiple evaluation subtasks, making it currently the best performing 3B model with comparable performance to llama-7b
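
The `Avg.` row in the added benchmark table can be checked with a short calculation. The sketch below assumes the average is an unweighted mean of the four listed scores; the diff does not state how it was computed, and the name `scores` is illustrative only.

```python
# Scores copied from the benchmark table added in this commit.
# Assumption: "Avg." is the plain arithmetic mean of the four metrics.
scores = {
    "MMLU (5-shot)": 27.1,
    "ARC (25-shot)": 42.2,
    "HellaSwag (10-shot)": 71.5,
    "TruthfulQA (0-shot)": 36.7,
}

average = sum(scores.values()) / len(scores)
print(f"Avg. = {average:.1f}")  # prints "Avg. = 44.4"
```

Under that assumption, the mean is 44.375, which rounds to the 44.4 reported in the table.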