4-bit (UINT4 with SVD rank 32) quantization of black-forest-labs/FLUX.1-Kontext-dev using SDNQ.
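
To give a rough intuition for what "UINT4 with SVD rank 32" can mean, here is a minimal NumPy sketch of a generic scheme: keep a rank-32 SVD approximation of a weight matrix in full precision and quantize only the residual to 16 levels. This is an illustration under stated assumptions, not SDNQ's actual implementation, which this card does not describe. The helper names (`quantize_uint4`, `dequantize_uint4`, `quantize_with_svd`) and the synthetic test matrix are hypothetical, and each 4-bit value is stored in its own `uint8` for readability rather than packed two per byte.

```python
import numpy as np

def quantize_uint4(w):
    """Asymmetric per-row quantization of a float matrix to integers in [0, 15]."""
    w_min = w.min(axis=1, keepdims=True)
    w_max = w.max(axis=1, keepdims=True)
    scale = np.maximum(w_max - w_min, 1e-8) / 15.0  # 16 levels -> 15 steps
    q = np.clip(np.round((w - w_min) / scale), 0, 15).astype(np.uint8)
    return q, scale, w_min

def dequantize_uint4(q, scale, w_min):
    """Map the 4-bit codes back to approximate float values."""
    return q.astype(np.float32) * scale + w_min

def quantize_with_svd(w, rank=32):
    """Keep the top-`rank` singular components in float; quantize the residual."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    a = u[:, :rank] * s[:rank]   # shape (rows, rank)
    b = vt[:rank, :]             # shape (rank, cols)
    q, scale, w_min = quantize_uint4(w - a @ b)
    return q, scale, w_min, a, b

# Synthetic weights (an assumption for the demo): a few large-magnitude
# directions plus small noise, so the low-rank part dominates the value range.
rng = np.random.default_rng(0)
strong = 5.0 * rng.standard_normal((128, 8)) @ rng.standard_normal((8, 128))
w = (strong + 0.1 * rng.standard_normal((128, 128))).astype(np.float32)

# Baseline: quantize the matrix directly.
q_plain, scale_plain, min_plain = quantize_uint4(w)
err_plain = np.abs(w - dequantize_uint4(q_plain, scale_plain, min_plain)).mean()

# Low-rank variant: quantize only what the rank-32 approximation misses.
q, scale, w_min, a, b = quantize_with_svd(w, rank=32)
err_svd = np.abs(w - (dequantize_uint4(q, scale, w_min) + a @ b)).mean()

print(f"mean abs error, plain UINT4:     {err_plain:.4f}")
print(f"mean abs error, UINT4 + rank-32: {err_svd:.4f}")
```

On this synthetic matrix the low-rank variant reconstructs the weights more closely, because the residual spans a much narrower range than the original values. How much this helps on real model weights depends on their structure.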

Usage:

pip install git+https://github.com/Disty0/sdnq

import torch
import diffusers
from diffusers.utils import load_image
from sdnq import SDNQConfig # import sdnq to register it into diffusers and transformers

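# Load the pre-quantized pipeline; importing sdnq above registers the quantization format
pipe = diffusers.FluxKontextPipeline.from_pretrained("Disty0/FLUX.1-Kontext-dev-SDNQ-uint4-svd-r32", torch_dtype=torch.bfloat16)
# Offload idle pipeline components to the CPU to lower peak VRAM usage
pipe.enable_model_cpu_offload()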

input_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/cat.png")
image = pipe(
    image=input_image,
    prompt="Add a hat to the cat",
    guidance_scale=2.5,
    generator=torch.manual_seed(0),
).images[0]
image.save("flux-kontext-dev-sdnq-uint4-svd-r32.png")

Original BF16 vs SDNQ quantization comparison:

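| Quantization  | Model Size | Visualization         |
|---------------|------------|-----------------------|
| Input Image   | -          | Input Image (image)   |
| Original BF16 | 23.8 GB    | Original BF16 (image) |
| SDNQ UINT4    | 6.8 GB     | SDNQ UINT4 (image)    |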
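
For a quick sense of scale, the two model sizes listed above can be turned into a compression ratio. This is plain arithmetic on the reported figures; the variable names are only illustrative.

```python
# Sizes as reported in the comparison above, in GB.
bf16_gb = 23.8
uint4_gb = 6.8

ratio = bf16_gb / uint4_gb                   # how many times smaller
saved_pct = (1 - uint4_gb / bf16_gb) * 100   # share of the size removed

print(f"{ratio:.1f}x smaller, {saved_pct:.0f}% saved")
# prints "3.5x smaller, 71% saved"
```

The ratio is below the 4x that going from 16 bits to 4 bits per weight would suggest on its own. Plausible reasons include quantization metadata, the low-rank SVD factors, and components kept at higher precision, but the card does not give a breakdown.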