---
base_model:
- meta-llama/Meta-Llama-3-8B
base_model_relation: quantized
license: llama3
---
# Model Card
- Base model: `meta-llama/Meta-Llama-3-8B`
- Quantization method: SqueezeLLM
- Target bit-width: 3
- Backend kernel: Any-Precision-LLM kernel (`ap-gemv`)
- Calibration data: RedPajama (1024 sentences of 4096 tokens each)
- Calibration objective: Next-token prediction
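
To give a feel for what a 3-bit target means in practice, the following is a rough back-of-the-envelope sketch of weight storage. It assumes a round figure of 8e9 parameters (not an exact count for this checkpoint) and ignores lookup-table overhead and any layers kept at higher precision, so the real file size will differ somewhat.

```python
def weight_storage_gib(num_params: float, bits: int) -> float:
    """Approximate storage for `num_params` weights at `bits` bits each, in GiB."""
    return num_params * bits / 8 / 2**30

# Assumption: ~8e9 parameters as a round figure for an 8B model.
NUM_PARAMS = 8e9

fp16_gib = weight_storage_gib(NUM_PARAMS, 16)  # 16-bit baseline
w3_gib = weight_storage_gib(NUM_PARAMS, 3)     # 3-bit target from this card

print(f"16-bit: {fp16_gib:.1f} GiB, 3-bit: {w3_gib:.1f} GiB")
```

Under these assumptions the quantized weights take 3/16 of the 16-bit footprint, which is the main reason a 3-bit model can fit on much smaller GPUs.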
# How to run
- Follow the instructions at https://github.com/snu-mllab/GuidedQuant.
# References
- [Model Paper](https://arxiv.org/abs/2505.07004)