llama-2-13b-hf-smooth

This model has SmoothQuant smoothing applied. No quantization has been applied.

Smoothing Configuration

Parameter                              Value
Base model                             meta-llama/Llama-2-13b-hf
Smoothing alpha (migration strength)   0.85
Activation scales source               mit-han-lab/smoothquant-scales
Quantization                           None (smoothing only)
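
As a rough illustration of what "smoothing only" means, the sketch below applies the SmoothQuant scaling rule, s_j = max|X_j|^alpha / max|W_j|^(1 - alpha), to a single toy linear layer in pure Python. The function names (smooth_scales, apply_smoothing, matvec) and the toy values are hypothetical and are not taken from this repository. In a real model the division of the activations by s is typically folded into the preceding normalization layer rather than applied at run time.

```python
# Minimal sketch of SmoothQuant-style smoothing for one linear layer y = W x.
# Hypothetical helper names and toy data; not the code used to build this model.

def smooth_scales(act_absmax, weight, alpha=0.85, eps=1e-5):
    """Per-input-channel scale s_j = max|X_j|^alpha / max|W_j|^(1 - alpha).

    act_absmax: per-channel activation abs-max values, length C_in.
    weight: list of rows, shape [C_out][C_in].
    """
    scales = []
    for j, a in enumerate(act_absmax):
        # Largest weight magnitude feeding from input channel j.
        w_absmax = max(abs(row[j]) for row in weight)
        s = (max(a, eps) ** alpha) / (max(w_absmax, eps) ** (1.0 - alpha))
        scales.append(s)
    return scales


def apply_smoothing(x, weight, scales):
    """Return (x / s, W * s): activations shrink, weights absorb the scale."""
    x_s = [xj / s for xj, s in zip(x, scales)]
    w_s = [[wij * s for wij, s in zip(row, scales)] for row in weight]
    return x_s, w_s


def matvec(weight, x):
    """Plain matrix-vector product for the toy layer."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in weight]


# Toy layer: channel 0 carries an activation outlier.
x = [100.0, 0.5]
W = [[0.1, 0.2], [0.3, -0.4]]
scales = smooth_scales(act_absmax=[100.0, 0.5], weight=W, alpha=0.85)
x_s, W_s = apply_smoothing(x, W, scales)
y_before = matvec(W, x)
y_after = matvec(W_s, x_s)
```

The layer output is unchanged up to floating-point rounding, because the scale applied to the weights cancels the one removed from the activations. Only the distribution of magnitudes between activations and weights shifts, which is why a smoothed checkpoint can still be stored and run in its original precision.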

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the smoothed (unquantized) weights and the matching tokenizer from the Hub.
model = AutoModelForCausalLM.from_pretrained("jsyeom/llama-2-13b-hf-smooth")
tokenizer = AutoTokenizer.from_pretrained("jsyeom/llama-2-13b-hf-smooth")

Model size: 13B params (Safetensors)
Tensor type: F16