FP8?

#3
by notafraud - opened

Hello! Can you please upload an fp8 version of the model? GGUF runs slower than fp8, so it makes sense.

  1. GGUF can be slower because the weights have to be dequantized on the fly, but actual performance also depends on hardware (see the sketch after this list).
  2. He has other quants; no idea why they haven't been added to the repo.
    P.S. Sadly, I wouldn't be able to convert it to fp8 myself.
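
To illustrate point 1: the dequantization overhead comes from block formats like Q8_0, which store weights as blocks of 32 int8 values plus one fp16 scale per block, and these must be expanded back to floats at inference time. A minimal sketch of the idea (illustrative only, not llama.cpp's actual kernel):

```python
# Illustrative Q8_0 dequantization: GGUF Q8_0 packs weights into blocks of
# 32 int8 values with one fp16 scale per block; expanding them back to
# float is the "decompression" cost mentioned in point 1.
import numpy as np

BLOCK_SIZE = 32

def dequantize_q8_0(scales: np.ndarray, quants: np.ndarray) -> np.ndarray:
    """scales: (n_blocks,) float16; quants: (n_blocks, 32) int8."""
    return (scales.astype(np.float32)[:, None]
            * quants.astype(np.float32)).reshape(-1)
```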

There is a full model available, so it shouldn't be a problem to make one.

I mean, if it's fp32... is it?

Yes, I could probably download the diffusers version and cast it to fp8, but that model is twice the size of the fp16 one. An fp8 version would be great.
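
For reference, here's roughly what such a cast could look like. This is a minimal sketch, assuming a single safetensors checkpoint and PyTorch >= 2.1 (which added `torch.float8_e4m3fn`); the filenames are hypothetical:

```python
# Minimal sketch: naive (unscaled) cast of a checkpoint to fp8.
# Filenames are hypothetical; requires PyTorch >= 2.1 and a recent
# safetensors release that supports float8 dtypes.
import torch
from safetensors.torch import load_file, save_file

state_dict = load_file("pony-v7-base.safetensors")

fp8_state_dict = {}
for name, tensor in state_dict.items():
    if tensor.dtype in (torch.float32, torch.float16, torch.bfloat16):
        # Direct cast, no scaling: values beyond the fp8 range saturate,
        # which can hurt quality -- hence the interest in *scaled* fp8.
        fp8_state_dict[name] = tensor.to(torch.float8_e4m3fn)
    else:
        fp8_state_dict[name] = tensor  # keep non-float tensors as-is

save_file(fp8_state_dict, "pony-v7-base-fp8.safetensors")
```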

https://x.com/AstraliteHeart/status/1982558682362872264 : "GGUFs ... of various sizes" + https://hf.co/purplesmartai/pony-v7-base/blob/main/gguf/comparison.png
Where are they?
@Mescalamba
The model is fp16.

Why not bf16? That would be better.

A scaled FP8 version, if possible, would be best, I think.

PurpleSmartAI, INC org

Scaled FP8 soon

https://huggingface.co/silveroxides/pony-v7-base-fp8_scaled-and-GGUF
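
For context, "scaled" fp8 means each tensor is divided by a per-tensor scale so its values fit the fp8 range, and the scale is stored next to the quantized weights; this avoids the saturation problem of a plain cast. A rough sketch of the idea (not necessarily the exact recipe that repo uses):

```python
# Illustrative per-tensor scaled-fp8 quantization; the actual layout and
# key names in silveroxides' repo may differ.
import torch

FP8_MAX = torch.finfo(torch.float8_e4m3fn).max  # ~448 for e4m3fn

def quantize_scaled_fp8(w: torch.Tensor):
    # Per-tensor scale; clamp avoids division by zero on all-zero tensors.
    scale = w.abs().max().float().clamp(min=1e-12) / FP8_MAX
    q = (w.float() / scale).to(torch.float8_e4m3fn)
    return q, scale  # both are stored in the checkpoint

def dequantize_scaled_fp8(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.float() * scale
```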

With identical settings, it just generates a black image.

See my comment in that repo; hybrid fp8 requires https://github.com/silveroxides/ComfyUI_Hybrid-Scaled_fp8-Loader

With this loader, it works. In total, it gives me around a 15% increase in generation speed. Not bad, but I expected more.

This comment has been hidden (marked as Resolved)

It would still be nice to have a normal FP8 scaled version that doesn't need that node, TBH.

@qpqpqpqpqpqp , I don't know why your comment is hidden, but here's an answer anyway. I meant that fp8 is faster than fp16; in my opinion, GGUF models are just garbage. Every GGUF model I've tried has been slower than fp16, often several times slower, and usually worse quality than fp8. I think they might only be useful if you have a very powerful video card with fairly low VRAM.
