This is the official QAT FP-Quant checkpoint of meta-llama/Llama-3.1-8B-Instruct, produced as described in the "Bridging the Gap Between Promise and Performance for Microscaling FP4 Quantization" paper.

This model can be run on Blackwell-generation NVIDIA GPUs via QuTLASS and FP-Quant in either transformers or vLLM.
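A minimal sketch of loading the checkpoint via transformers. This assumes a Blackwell-generation GPU with QuTLASS and the FP-Quant integration installed; since the checkpoint is pre-quantized, a standard `from_pretrained` call should suffice, but exact package requirements may differ by version.

```python
# Sketch: running the pre-quantized NVFP4 checkpoint with transformers.
# Assumes a Blackwell (SM100-class) NVIDIA GPU with QuTLASS and the
# FP-Quant transformers integration installed.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ISTA-DASLab/Llama-3.1-8B-Instruct-FPQuant-QAT-NVFP4"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="cuda")

# Standard chat-template generation.
messages = [{"role": "user", "content": "Explain FP4 quantization in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```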

The approximate recipe for training this model (up to local batch size and LR) is available here.

This checkpoint has the following performance relative to the original model and to round-to-nearest (RTN) quantization:

| Model | MMLU | GSM8k | Hellaswag | Winogrande | Avg |
|---|---|---|---|---|---|
| meta-llama/Llama-3.1-8B-Instruct | 72.8 | 85.1 | 80.0 | 77.9 | 78.9 |
| RTN | 67.0 | 77.4 | 77.3 | 74.4 | 74.0 |
| QAT (this checkpoint) | 68.9 | 81.6 | 79.0 | 75.1 | 76.1 |