jusjinuk/Meta-Llama-3-8B-3bit-SqueezeLLM

	---
	base_model:
	- meta-llama/Meta-Llama-3-8B
	base_model_relation: quantized
	license: llama3
	---
	# Model Card

	- Base model: `meta-llama/Meta-Llama-3-8B`
	- Quantization method: SqueezeLLM
	- Target bit-width: 3
	- Backend kernel: Any-Precision-LLM kernel (`ap-gemv`)
	- Calibration data: RedPajama (1024 sequences of 4096 tokens each)
	- Calibration objective: Next-token prediction
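
To make the "3-bit" setting concrete, here is a minimal sketch of sensitivity-weighted non-uniform quantization in the spirit of SqueezeLLM: each weight row is clustered into 2^3 = 8 centroids with a weighted 1-D k-means, and each weight is then stored as a 3-bit index into that lookup table. This is an illustration only, not the code used to produce this checkpoint. The function name `weighted_kmeans_1d`, the random toy row, and the random sensitivity values are all assumptions made for the example; in the real method the sensitivities would be derived from calibration data.

```python
import random


def weighted_kmeans_1d(values, weights, bits=3, iters=20):
    """Cluster scalar `values` into 2**bits centroids, weighting each value
    by `weights` (a stand-in for per-weight sensitivity).

    Hypothetical helper for illustration; returns (centroids, codes).
    """
    k = 2 ** bits
    lo, hi = min(values), max(values)
    # Start with centroids spread evenly over the value range.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]

    def assign():
        # Map every value to the index of its nearest centroid.
        return [min(range(k), key=lambda j: (v - centroids[j]) ** 2)
                for v in values]

    for _ in range(iters):
        codes = assign()
        for j in range(k):
            wsum = sum(w for w, c in zip(weights, codes) if c == j)
            if wsum > 0:
                # Sensitivity-weighted mean of the values in cluster j.
                centroids[j] = sum(
                    w * v for v, w, c in zip(values, weights, codes) if c == j
                ) / wsum
    # Final assignment so the codes match the final centroids.
    return centroids, assign()


# Toy data: one "weight row" and made-up sensitivities (assumptions).
random.seed(0)
row = [random.gauss(0.0, 0.02) for _ in range(256)]
sensitivity = [random.random() + 1e-3 for _ in range(256)]

centroids, codes = weighted_kmeans_1d(row, sensitivity, bits=3)
dequantized = [centroids[c] for c in codes]  # lookup-table dequantization
```

In this sketch, each row would be stored as 256 three-bit codes plus an 8-entry centroid table, which is why a lookup-table kernel is needed at inference time.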

	# How to run
	- Follow the instructions in the GuidedQuant repository: https://github.com/snu-mllab/GuidedQuant.
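
As a possible first step, the checkpoint files can be fetched locally with the `huggingface_hub` package. The sketch below assumes `huggingface_hub` is installed and that the repository ID matches this model page; the helper name `download_checkpoint` is hypothetical. It only downloads files. Loading the weights and running inference still go through the GuidedQuant code linked above.

```python
REPO_ID = "jusjinuk/Meta-Llama-3-8B-3bit-SqueezeLLM"  # assumed from this model page


def download_checkpoint(local_dir=None):
    """Download all files of the model repository and return the local path.

    Hypothetical helper; needs network access and `pip install huggingface_hub`.
    """
    # Imported lazily so that defining the helper does not require the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
```

Calling `download_checkpoint()` stores the files in the default Hugging Face cache, while `download_checkpoint("./checkpoint")` places them in a directory of your choice.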

	# References
	- [Model Paper](https://arxiv.org/abs/2505.07004)