---
license: apache-2.0
base_model:
- vladmandic/Qwen-Lightning
base_model_relation: quantized
library_name: diffusers
tags:
- sdnq
- qwen_image
- 4-bit
---

4-bit (UINT4 with SVD rank 32) quantization of [vladmandic/Qwen-Lightning](https://huggingface.co/vladmandic/Qwen-Lightning) using [SDNQ](https://github.com/vladmandic/sdnext/wiki/SDNQ-Quantization).

Usage:

```
pip install git+https://github.com/Disty0/sdnq
```

```py
import torch
import diffusers
from sdnq import SDNQConfig  # import sdnq to register it into diffusers and transformers

pipe = diffusers.QwenImagePipeline.from_pretrained(
    "Disty0/Qwen-Image-Lightning-SDNQ-uint4-svd-r32",
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()

prompt = "a tiny astronaut hatching from an egg on the moon, Ultra HD, 4K, cinematic composition."
negative_prompt = " "

image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=1024,
    height=1024,
    num_inference_steps=8,
    true_cfg_scale=1.0,
    generator=torch.manual_seed(0),
).images[0]
image.save("qwen-image-lightning-sdnq-uint4-svd-r32.png")
```

Original BF16 vs SDNQ quantization comparison:

| Quantization | Model Size | Visualization |
| --- | --- | --- |
| Original BF16 | 40.9 GB | ![Original BF16](https://cdn-uploads.huggingface.co/production/uploads/6456af6195082f722d178522/OQ9vhQij2b4tBMxzlOa4e.png) |
| SDNQ UINT4 | 11.6 GB | ![SDNQ UINT4](https://cdn-uploads.huggingface.co/production/uploads/6456af6195082f722d178522/i803Rv8HAwhi8b0H1J-jU.png) |
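As a quick sanity check on the sizes in the comparison table, a minimal sketch of the arithmetic: the UINT4 checkpoint is roughly 3.5x smaller than the original BF16 one (slightly less than the ideal 4x of pure 4-bit storage, since the SVD rank-32 correction and unquantized layers add overhead):

```python
# Compression ratio implied by the comparison table above.
bf16_size_gb = 40.9   # original BF16 checkpoint size
uint4_size_gb = 11.6  # SDNQ UINT4 (SVD rank 32) checkpoint size

ratio = bf16_size_gb / uint4_size_gb
print(f"compression ratio: {ratio:.2f}x")  # ~3.53x
```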