High-quality QAT FP4 models for use with the fp_quant vLLM/Transformers integration on NVIDIA Blackwell GPUs. See https://arxiv.org/abs/2509.23202 for details.
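A minimal sketch of loading one of these checkpoints through the Transformers integration. The model id is taken from the list below; actually running the FP4 path end to end assumes the fp_quant kernels are installed and a supported (Blackwell) GPU is available.

```python
# Minimal loading sketch with Transformers (assumes the fp_quant integration
# and its FP4 kernels are installed; quantization details are read from the
# checkpoint's config).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ISTA-DASLab/Qwen3-8B-FPQuant-QAT-NVFP4"  # any model from the list below

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the dtype stored in the checkpoint
    device_map="auto",    # place the model on the available GPU(s)
)

inputs = tokenizer("Microscaling FP4 quantization is", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```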
Papers
- Bridging the Gap Between Promise and Performance for Microscaling FP4 Quantization
- The Geometry of LLM Quantization: GPTQ as Babai's Nearest Plane Algorithm
Models (135)
- ISTA-DASLab/NVIDIA-Nemotron-Nano-9B-v2-W4A4-nvfp4-gptq-identity-transform-actorder (7B params)
- ISTA-DASLab/NVIDIA-Nemotron-Nano-9B-v2-W4A4-nvfp4-gptq-identity-transform (7B params)
- ISTA-DASLab/NVIDIA-Nemotron-Nano-9B-v2-W4A4-nvfp4-gptq-hadamard-transform (7B params)
- ISTA-DASLab/NVIDIA-Nemotron-Nano-9B-v2-W4A4-nvfp4-gptq-hadamard-transform-actorder (7B params)
- ISTA-DASLab/NVIDIA-Nemotron-Nano-9B-v2-W4A4-mxfp4-gptq-hadamard-transform (7B params)
- ISTA-DASLab/NVIDIA-Nemotron-Nano-9B-v2-W4A4-mxfp4-gptq-identity-transform (7B params)
- ISTA-DASLab/Qwen3-8B-FPQuant-QAT-NVFP4 (5B params)
- ISTA-DASLab/Qwen3-8B-FPQuant-QAT-MXFP4 (5B params)
- ISTA-DASLab/Llama-3.1-8B-Instruct-FPQuant-QAT-NVFP4 (5B params)
- ISTA-DASLab/Llama-3.1-8B-Instruct-FPQuant-QAT-MXFP4 (5B params)
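A hedged sketch of running one of the checkpoints above through vLLM's offline API. Whether a given vLLM build picks up the fp_quant FP4 quantization from the checkpoint, and on which GPUs the kernels run, depends on the installed integration (the description above targets Blackwell GPUs).

```python
# Offline-inference sketch with vLLM (assumes a vLLM build with fp_quant FP4
# support; the quantization scheme is read from the model's config).
from vllm import LLM, SamplingParams

llm = LLM(model="ISTA-DASLab/Llama-3.1-8B-Instruct-FPQuant-QAT-NVFP4")
params = SamplingParams(temperature=0.7, max_tokens=64)

outputs = llm.generate(["Summarize NVFP4 vs. MXFP4 in two sentences."], params)
print(outputs[0].outputs[0].text)
```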