# LLaMA 3.7B - Bfloat16
This is one of the checkpoints accompanying the paper *1-Bit-Wonder: Improving QAT Performance in the Low-Bit Regime through K-Means Quantization*. Instructions for running inference with the model can be found in the corresponding repository.
⚠️ **IMPORTANT:** This model is intended for research purposes only. It is provided as-is and carries no warranty for production use.
## Model Details
- Architecture: LLaMA
- Size: 3.7B (3,747,523,584 parameters)
## Directory Structure

```
.
├── config.json             # HuggingFace model config
├── generation_config.json  # Default generation settings
├── tokenizer.json          # Tokenizer
└── model.safetensors       # Weights (in Bfloat16)
```
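Since the checkpoint follows the standard HuggingFace layout, it can be loaded directly with the `transformers` library. Below is a minimal inference sketch; the repository ID `your-org/llama-3.7b-bf16` is a placeholder and should be replaced with this model's actual repository ID or a local path to the files above.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo ID -- substitute the actual repository ID or local path.
repo_id = "your-org/llama-3.7b-bf16"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # weights are stored in Bfloat16
    device_map="auto",           # place weights on available GPU(s)
)

# Simple greedy generation as a smoke test.
inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For the paper's intended evaluation setup, follow the inference instructions in the corresponding repository rather than this generic sketch.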
## License
See LICENSE file in the repository.