---
model_type: llama
pipeline_tag: text-generation
datasets:
- togethercomputer/RedPajama-Data-1T-Sample
tags:
- llama
---
This is [Llama 2 13b](https://huggingface.co/meta-llama/Llama-2-13b-hf) with some additional attention heads from original-flavor Llama 33b frankensteined on.

Fine-tuned on ~10M tokens from RedPajama to settle in the transplants a little.

Not intended for use as-is - this model is meant to serve as a base for further tuning, hopefully with a greater capacity for learning than 13b.
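If you want to continue tuning from this checkpoint, a minimal sketch with 🤗 Transformers is below. It assumes the repo id `chargoddard/llama2-22b` (matching the leaderboard link further down) and a standard causal-LM setup; swap in whatever training framework you prefer.

```python
# Minimal sketch: load this checkpoint as a base for further fine-tuning.
# Repo id assumed from the leaderboard details link; adjust if you use a local copy.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "chargoddard/llama2-22b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the checkpoint's native precision
    device_map="auto",    # requires `accelerate`; spreads layers across available GPUs
)

# From here, attach your training loop of choice (full fine-tuning, LoRA/PEFT, etc.).
```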
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_chargoddard__llama2-22b).
| Metric               | Value |
|----------------------|-------|
| Avg.                 | 46.85 |
| ARC (25-shot)        | 58.53 |
| HellaSwag (10-shot)  | 82.55 |
| MMLU (5-shot)        | 54.68 |
| TruthfulQA (0-shot)  | 39.84 |
| Winogrande (5-shot)  | 76.32 |
| GSM8K (5-shot)       | 9.93  |
| DROP (3-shot)        | 6.08  |