FP8 weights
#41
opened by chriswritescode
Could you push an FP8 release? It looks like llmcompressor does not support the architecture yet.
Has anyone gotten this to convert?
@getfit: Thanks for your question! We used the llmcompressor recipe to create the FP8 checkpoint for Maverick here: https://huggingface.co/meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8.
We'll confirm with the team on an ETA for adding FP8 and INT4 checkpoints for Scout. cc:
@wukaixingxp
@Hamid-Nazeri
Just found this model, for anyone looking:
https://huggingface.co/RedHatAI/Llama-4-Scout-17B-16E-Instruct-FP8-dynamic
It should work with vLLM, but I haven't tested it yet.
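For anyone wondering what an "FP8" checkpoint actually stores: a rough, pure-Python sketch of E4M3 rounding with a single per-tensor scale is below. This is illustrative only — the function names are made up, and real recipes (e.g. llmcompressor's FP8 schemes, or the "dynamic" variant above, which computes activation scales at runtime) differ in details such as per-channel scaling.

```python
def e4m3_values():
    """Enumerate all finite values representable in OCP FP8 E4M3
    (1 sign bit, 4 exponent bits with bias 7, 3 mantissa bits)."""
    vals = {0.0}
    for e in range(16):          # exponent field
        for m in range(8):       # mantissa field
            if e == 15 and m == 7:
                continue         # this encoding is NaN in E4M3
            if e == 0:
                v = (m / 8) * 2.0 ** -6            # subnormal
            else:
                v = (1 + m / 8) * 2.0 ** (e - 7)   # normal
            vals.add(v)
            vals.add(-v)
    return sorted(vals)

E4M3 = e4m3_values()  # 241 distinct values; max magnitude is 448

def quantize_fp8(weights):
    """Scale so the largest weight maps to E4M3 max (448),
    then round each scaled weight to the nearest E4M3 value."""
    amax = max(abs(w) for w in weights)
    scale = amax / 448.0 if amax else 1.0
    q = [min(E4M3, key=lambda v: abs(w / scale - v)) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]
```

The point is that each weight is stored as one of only ~240 representable values plus a shared scale, which is why FP8 halves memory versus BF16 with a small, bounded rounding error.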