4-bit (UINT4 with SVD rank 32) quantization of vladmandic/Qwen-Lightning using SDNQ.

Usage:

```shell
pip install git+https://github.com/Disty0/sdnq
```

```python
import torch
import diffusers
from sdnq import SDNQConfig  # importing sdnq registers it with diffusers and transformers

pipe = diffusers.QwenImagePipeline.from_pretrained(
    "Disty0/Qwen-Image-Lightning-SDNQ-uint4-svd-r32",
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()

prompt = "a tiny astronaut hatching from an egg on the moon, Ultra HD, 4K, cinematic composition."
negative_prompt = " "
image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=1024,
    height=1024,
    num_inference_steps=8,
    true_cfg_scale=1.0,
    generator=torch.manual_seed(0),
).images[0]

image.save("qwen-image-lightning-sdnq-uint4-svd-r32.png")
```
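The quantization itself was done with SDNQ. As a rough illustration only, a similar configuration could be applied while loading the base transformer; the `SDNQConfig` keyword arguments below (`weights_dtype`, `use_svd`, `svd_rank`) and the single-component loading are assumptions for illustration, not the exact script used to produce this checkpoint.

```python
# Rough sketch: applying an SDNQ uint4 + SVD rank-32 config at load time.
# NOTE: the SDNQConfig keyword arguments below are assumptions for illustration;
# check the SDNQ repository (https://github.com/Disty0/sdnq) for the actual options.
import torch
import diffusers
from sdnq import SDNQConfig  # importing sdnq registers the backend with diffusers/transformers

quant_config = SDNQConfig(
    weights_dtype="uint4",  # assumed name: 4-bit unsigned integer weights
    use_svd=True,           # assumed name: enable the low-rank SVD correction
    svd_rank=32,            # assumed name: rank of the SVD correction
)

# Quantize the diffusion transformer of the base model while loading it.
transformer = diffusers.QwenImageTransformer2DModel.from_pretrained(
    "Qwen/Qwen-Image",
    subfolder="transformer",
    quantization_config=quant_config,
    torch_dtype=torch.bfloat16,
)
```

The low-rank SVD term typically absorbs part of the 4-bit quantization error, which is what the "svd-r32" suffix in the repository name refers to.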

Original BF16 vs SDNQ quantization comparison:

| Quantization  | Model Size | Visualization |
|---------------|------------|---------------|
| Original BF16 | 40.9 GB    | (image)       |
| SDNQ UINT4    | 11.6 GB    | (image)       |
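To sanity-check the reduction on a loaded model, one approximate option is to sum the storage of parameters and buffers. This is a generic PyTorch sketch that reuses the `pipe` object from the usage example above; the result will not exactly match the on-disk sizes in the table.

```python
# Approximate in-memory footprint of the loaded transformer.
# Uses the `pipe` object from the usage example above; numbers will differ
# somewhat from the on-disk checkpoint sizes listed in the table.
def memory_footprint_gb(module) -> float:
    total = sum(p.numel() * p.element_size() for p in module.parameters())
    total += sum(b.numel() * b.element_size() for b in module.buffers())
    return total / 1024**3

print(f"transformer: {memory_footprint_gb(pipe.transformer):.1f} GB")
```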

Base model: Qwen/Qwen-Image
