FP8 weights
#41
opened by chriswritescode
Could you push an FP8 release? It looks like llmcompressor does not support the architecture yet.
Has anyone gotten this to convert?
@getfit: Thanks for your question! We used the llmcompressor recipe to create the FP8 checkpoint for Maverick here: https://huggingface.co/meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8.
We'll confirm with the team on an ETA for adding FP8 and INT4 checkpoints for Scout. cc:
@wukaixingxp
@Hamid-Nazeri
Just found this model, for anyone looking:
https://huggingface.co/RedHatAI/Llama-4-Scout-17B-16E-Instruct-FP8-dynamic
It should work with vLLM, but I haven't tested it yet.
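For anyone wondering what an "FP8" checkpoint actually stores: a rough, pure-Python sketch of E4M3 rounding with a single per-tensor scale is below. This is illustrative only — the function names are made up, and real recipes (e.g. llmcompressor's FP8 schemes, or the "dynamic" variant above, which computes activation scales at runtime) differ in details such as per-channel scaling.

```python
def e4m3_values():
    """Enumerate all finite values representable in OCP FP8 E4M3
    (1 sign bit, 4 exponent bits with bias 7, 3 mantissa bits)."""
    vals = {0.0}
    for e in range(16):          # exponent field
        for m in range(8):       # mantissa field
            if e == 15 and m == 7:
                continue         # this encoding is NaN in E4M3
            if e == 0:
                v = (m / 8) * 2.0 ** -6            # subnormal
            else:
                v = (1 + m / 8) * 2.0 ** (e - 7)   # normal
            vals.add(v)
            vals.add(-v)
    return sorted(vals)

E4M3 = e4m3_values()  # 241 distinct values; max magnitude is 448

def quantize_fp8(weights):
    """Scale so the largest weight maps to E4M3 max (448),
    then round each scaled weight to the nearest E4M3 value."""
    amax = max(abs(w) for w in weights)
    scale = amax / 448.0 if amax else 1.0
    q = [min(E4M3, key=lambda v: abs(w / scale - v)) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]
```

The point is that each weight is stored as one of only ~240 representable values plus a shared scale, which is why FP8 halves memory versus BF16 with a small, bounded rounding error.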