bf16 and VRAM
Now that Mega v12 is using bf16, is it going to need more VRAM? I'm currently running v10 on a 12 GB VRAM RTX 5070 Ti laptop GPU; it uses 100% of my GPU, but I can use it without any issues (hardware-wise). I want to try Mega v12, but I don't know whether I'll be able to run it. I don't know much about these things, which is why I'm asking.
No, you can run it in lower precision
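For rough intuition, weight size scales with bytes per parameter. Here's a minimal back-of-envelope sketch, assuming a ~14B-parameter transformer and counting weights only (activations, text encoder, VAE and framework overhead add more on top):

```python
# Rough weight-size estimate only; actual VRAM use will be higher.
def weight_size_gb(n_params: float, bytes_per_param: float) -> float:
    return n_params * bytes_per_param / 1024**3

N = 14e9  # ~14B parameters (assumed model size)
print(f"bf16 (2 bytes/param): ~{weight_size_gb(N, 2):.1f} GB")  # ~26 GB
print(f"fp8  (1 byte/param) : ~{weight_size_gb(N, 1):.1f} GB")  # ~13 GB
```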
Mega v12 is still fp8. I only start with bf16 to perform merging and then save in fp8.
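A minimal sketch of that kind of workflow, assuming two safetensors checkpoints with matching keys and a simple weighted average (the actual merge recipe, weights, and file names here are placeholders, not the author's exact method):

```python
# Sketch: do the merge math in bf16, then save the result in fp8 (e4m3).
import torch
from safetensors.torch import load_file, save_file

a = load_file("model_a.safetensors")   # placeholder file names
b = load_file("model_b.safetensors")
alpha = 0.5  # blend ratio (assumption)

merged = {}
for key in a:
    # compute in bf16 for accuracy...
    t = a[key].to(torch.bfloat16) * (1 - alpha) + b[key].to(torch.bfloat16) * alpha
    # ...then store in fp8 to halve the on-disk / in-VRAM weight size
    merged[key] = t.to(torch.float8_e4m3fn)

save_file(merged, "merged_fp8.safetensors")
```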
Is there a quality difference between the full bf16 merge and the one saved in fp8? I'm assuming there is, but I'm just wondering how big the difference is. Do you mind posting the bf16 version for those of us who are able to run it, just for testing?
@jerrydev11
bf16 would give higher quality, but the model has enough params not to suck while being quantized
https://huggingface.co/Phr00t/WAN2.2-14B-Rapid-AllInOne/discussions/10 The OP there used to make fp16 versions of Wan Rapid.
Bummer that fp8 is not supported on Mac (MPS); fp16 and bf16 (a bit faster) are.
The only solution is to use quants.
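If you want to check what your backend accepts before downloading anything, a quick probe like this works (a sketch; the exact exception type and message vary by PyTorch version):

```python
# Sketch: probe which weight dtypes the current backend will accept.
import torch

device = "mps" if torch.backends.mps.is_available() else "cpu"
for dtype in (torch.float16, torch.bfloat16, torch.float8_e4m3fn):
    try:
        torch.zeros(8, dtype=dtype, device=device)
        print(f"{dtype}: ok on {device}")
    except (RuntimeError, TypeError) as e:
        print(f"{dtype}: not supported on {device} ({e})")
```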