FP8?
Hello! Can you please upload an fp8 version of the model? GGUF runs slower than fp8, so it makes sense.
- GGUF can be slower because the quantized weights have to be dequantized at inference time, but actual performance also depends on hardware
- He has other quants, no idea why they aren't in the repo
P.S. Sadly, I would fail to convert it to fp8 myself.
The full model is there, so it shouldn't be a problem to make one.
I mean, if it's fp32... is it?
Yes, I can probably download the diffusers version and cast it to fp8, but that model is twice the size of an fp16 one. An FP8 version would be great.
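For reference, the size difference mentioned above is just bytes per weight. A rough sketch, assuming a hypothetical parameter count (`num_params` below is made up for illustration and is not taken from the model card) and ignoring file metadata and any layers kept in higher precision:

```python
# Bytes used to store one weight in each dtype.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "fp8": 1}

def checkpoint_size_gb(num_params: int, dtype: str) -> float:
    """Approximate weight size in GB (10**9 bytes) for a given dtype."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9

# Hypothetical parameter count, for illustration only.
num_params = 7_000_000_000

for dtype in ("fp32", "fp16", "fp8"):
    print(f"{dtype}: ~{checkpoint_size_gb(num_params, dtype):.1f} GB")
```

So an fp32 checkpoint is twice the size of an fp16 one, and an fp8 cast is half the size of fp16.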
https://x.com/AstraliteHeart/status/1982558682362872264 : "GGUFs ... of various sizes" + https://hf.co/purplesmartai/pony-v7-base/blob/main/gguf/comparison.png
Where are they?
@Mescalamba
The model is fp16.
Why not bf16? Would be better
FP8 Scaled if possible would be best I think.
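For anyone wondering what "scaled" buys you over a plain cast, here is a minimal pure-Python sketch. It assumes simple per-tensor scaling into the FP8 E4M3 range (largest finite value 448); the helper names are made up for illustration, and the actual scheme used by any particular repo or loader may differ (per-layer scales, mixed precision, and so on).

```python
import math

E4M3_MAX = 448.0  # largest finite value in the FP8 E4M3 format

def round_to_e4m3(x: float) -> float:
    """Round x to the nearest FP8 E4M3 value, saturating at +/-448."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), E4M3_MAX)
    # frexp gives mag = m * 2**e with 0.5 <= m < 1, so floor(log2(mag)) == e - 1.
    exp = max(math.frexp(mag)[1] - 1, -6)  # -6 is the smallest normal exponent
    step = 2.0 ** (exp - 3)                # 3 mantissa bits: 8 steps per power of two
    return sign * min(round(mag / step) * step, E4M3_MAX)

def quantize_scaled(weights):
    """Pick one scale per tensor so the largest weight maps to 448."""
    amax = max(abs(w) for w in weights)
    scale = amax / E4M3_MAX if amax > 0 else 1.0
    return [round_to_e4m3(w / scale) for w in weights], scale

def dequantize(q, scale):
    """Recover approximate weights from FP8 values and the stored scale."""
    return [v * scale for v in q]

weights = [0.0004, -0.0011, 0.0027, -0.0003]   # made-up small weights
plain = [round_to_e4m3(w) for w in weights]    # direct cast, no scale
q, scale = quantize_scaled(weights)
scaled = dequantize(q, scale)

print("plain :", plain)
print("scaled:", scaled)
```

With the plain cast, the smallest weights here round to zero, because they fall below FP8's resolution near zero. With the scale, every weight stays within a few percent of its original value, at the cost of storing one extra number per tensor.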
Scaled FP8 soon
https://huggingface.co/silveroxides/pony-v7-base-fp8_scaled-and-GGUF
With identical settings it generates just a black image.
See my comment in that repo, hybrid fp8 requires https://github.com/silveroxides/ComfyUI_Hybrid-Scaled_fp8-Loader
With this loader, it works. In total, it gives me around a 15% increase in the speed of generations. Not bad, but I expected more.
It would still be nice to have just a normal FP8 Scaled version that doesn't need that node TBH.
@qpqpqpqpqpqp, I don't know why your comment is hidden, but here's an answer anyway. I meant that it is faster than fp16. In my opinion, GGUF models are garbage in general. Every GGUF model I tried was slower than fp16, often several times slower, and usually of worse quality than fp8. I think they might be useful if you have a very powerful video card with fairly low VRAM.