llama-2-13b-hf-smooth

This model is `meta-llama/Llama-2-13b-hf` with SmoothQuant smoothing applied. No quantization has been applied: the weights are rescaled but remain in floating point.

Smoothing Configuration

Parameter	Value
Base model	`meta-llama/Llama-2-13b-hf`
Smoothing alpha (migration strength)	`0.85`
Activation scales source	`mit-han-lab/smoothquant-scales`
Quantization	None (smoothing only)
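
To make the role of the migration strength concrete, here is a minimal sketch of the SmoothQuant smoothing step for a single linear layer, using the per-channel scale s_j = max|X_j|^alpha / max|W_j|^(1 - alpha) with alpha = 0.85 from the table above. It is written in dependency-free Python for illustration only. The helper names (`smooth_scales`, `linear`) and the toy data are assumptions, not the script used to produce this checkpoint, and a real pipeline would typically read the activation maxima from calibrated scales and fold the activation division into the preceding normalization layer.

```python
# Illustrative sketch of SmoothQuant smoothing for one linear layer.
# All helper names and toy values are hypothetical.

ALPHA = 0.85  # migration strength listed in the configuration table


def smooth_scales(act_absmax, weight, alpha=ALPHA, eps=1e-5):
    """Per-input-channel scale s_j = max|X_j|^alpha / max|W_j|^(1 - alpha)."""
    scales = []
    for j, a in enumerate(act_absmax):
        # weight has shape [out][in]; take the max over output rows for input channel j
        w_absmax = max(abs(row[j]) for row in weight)
        scales.append((max(a, eps) ** alpha) / (max(w_absmax, eps) ** (1.0 - alpha)))
    return scales


def linear(x, weight):
    """y = W x for one input vector x and a weight matrix of shape [out][in]."""
    return [sum(xi * wi for xi, wi in zip(x, row)) for row in weight]


# Toy data: input channel 1 carries an activation outlier.
x = [0.5, 40.0, -1.0]
weight = [[0.2, -0.1, 0.4],
          [-0.3, 0.05, 0.1]]
act_absmax = [abs(v) for v in x]  # stand-in for calibrated activation scales

s = smooth_scales(act_absmax, weight)
x_smooth = [xi / si for xi, si in zip(x, s)]                        # activations divided by s
w_smooth = [[wi * si for wi, si in zip(row, s)] for row in weight]  # weights multiplied by s

y_ref = linear(x, weight)
y_smooth = linear(x_smooth, w_smooth)
```

Because the activations are divided by the same scales that multiply the weights, `y_smooth` matches `y_ref` up to floating-point error, while the outlier in `x_smooth` is much smaller than in `x`. That is the sense in which a smoothed model is expected to behave like its base model before any quantization is applied.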

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the smoothed checkpoint and its tokenizer with the standard auto classes.
model = AutoModelForCausalLM.from_pretrained("jsyeom/llama-2-13b-hf-smooth")
tokenizer = AutoTokenizer.from_pretrained("jsyeom/llama-2-13b-hf-smooth")

Safetensors

Model size	13B params
Tensor type	F16
