This is the official QAT FP-Quant checkpoint of meta-llama/Llama-3.1-8B-Instruct, produced as described in the "Bridging the Gap Between Promise and Performance for Microscaling FP4 Quantization" paper.

This model can be run on Blackwell-generation NVIDIA GPUs via QuTLASS and FP-Quant in either transformers or vLLM.
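A minimal sketch of loading the checkpoint via transformers. This assumes a Blackwell-generation GPU with QuTLASS and the FP-Quant integration installed; since the checkpoint is pre-quantized, a standard `from_pretrained` call should suffice, but exact package requirements may differ by version.

```python
# Sketch: running the pre-quantized NVFP4 checkpoint with transformers.
# Assumes a Blackwell (SM100-class) NVIDIA GPU with QuTLASS and the
# FP-Quant transformers integration installed.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ISTA-DASLab/Llama-3.1-8B-Instruct-FPQuant-QAT-NVFP4"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="cuda")

# Standard chat-template generation.
messages = [{"role": "user", "content": "Explain FP4 quantization in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```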

The approximate recipe for training this model (up to local batch size and LR) is available here.

This checkpoint has the following performance relative to the original model and to round-to-nearest (RTN) quantization:

| Model | MMLU | GSM8k | Hellaswag | Winogrande | Avg |
|---|---|---|---|---|---|
| meta-llama/Llama-3.1-8B-Instruct | 72.8 | 85.1 | 80.0 | 77.9 | 78.9 |
| RTN | 67.0 | 77.4 | 77.3 | 74.4 | 74.0 |
| QAT (this checkpoint) | 68.9 | 81.6 | 79.0 | 75.1 | 76.1 |