Hybrid-Sensitivity-Weighted-Quantization (HSWQ)
High-fidelity FP8 quantization for diffusion models (Z Image Turbo family). HSWQ uses sensitivity and importance analysis instead of a naive uniform cast, and offers two modes: standard-compatible (V1) and high-performance scaled (V2).
Technical details: md/HSWQ_ Hybrid Sensitivity Weighted Quantization.md
How to quantize: How to quantize Z Image.md
Z Image Benchmark Test Results: Z Image Benchmark Test Results.md
Overview
| Feature | V1: Standard Compatible | V2: High Performance Scaled |
|---|---|---|
| Compatibility | Full (100%), any FP8 loader | Limited: scaled models do not perform well in current ComfyUI |
| File format | Standard FP8 (torch.float8_e4m3fn) | Extended FP8 (weights + .scale metadata) |
| Image quality (SSIM) | ~0.96 (theoretical limit) | Unable to measure at this time |
| Mechanism | Optimal clipping (smart clipping) | Full-range scaling (dynamic scaling) |
| Use case | Distribution, general users | In-house, max quality, server-side |
Files shrink to roughly 60-70% of the FP16 size (some layers stay in FP16) while keeping the best quality each use case allows.
Architecture
Dual Monitor System - During calibration, two metrics are collected:
- Sensitivity (output variance): identifies the layers that hurt image quality most if corrupted. The top fraction, set by the keep ratio, is kept in FP16.
- Importance (input mean absolute value): measures per-channel contribution and is used as the weights in the weighted histogram.
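As a rough illustration of the two metrics, the sketch below computes them in plain Python from activations that have already been captured. The function names and the sample data are hypothetical; a real calibration run would presumably gather these tensors with forward hooks on each layer, which is not shown here.

```python
from statistics import fmean, pvariance

def sensitivity(outputs):
    """Variance over every output value a layer produced during calibration."""
    flat = [v for sample in outputs for v in sample]
    return pvariance(flat)

def importance(inputs):
    """Mean absolute input per channel; `inputs` holds one channel vector per sample."""
    n_channels = len(inputs[0])
    return [fmean([abs(sample[c]) for sample in inputs]) for c in range(n_channels)]

# Hypothetical activations for one layer: 3 calibration samples, 2 channels.
layer_inputs = [[0.5, -2.0], [-1.5, 4.0], [1.0, -3.0]]
layer_outputs = [[0.1, -0.1], [0.3, -0.3], [0.2, -0.2]]

print("sensitivity:", sensitivity(layer_outputs))
print("importance:", importance(layer_inputs))
```

Here the second input channel carries three times the average magnitude of the first, so its weights would count three times as much in the weighted error.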
Rigorous FP8 Grid Simulation - Uses a physical grid (all 0-255 values cast to torch.float8_e4m3fn) instead of theoretical formulas, so MSE matches real runtime.
Weighted MSE Optimization - Finds parameters that minimize quantization error using the importance histogram.
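The grid simulation and the weighted search can be sketched without PyTorch by decoding the 256 E4M3FN bit patterns by hand. This is only an approximation of the idea under stated assumptions: it works on individual weights rather than a histogram, it maps the candidate amax onto the FP8 maximum so the clipping trade-off is visible, and the helper names (decode_e4m3fn, snap, weighted_mse, best_amax) are made up for this example.

```python
from bisect import bisect_left

def decode_e4m3fn(byte):
    """Decode one E4M3FN bit pattern: 1 sign, 4 exponent, 3 mantissa bits, bias 7."""
    sign = -1.0 if byte & 0x80 else 1.0
    exp, man = (byte >> 3) & 0xF, byte & 0x7
    if exp == 0xF and man == 0x7:
        return None  # NaN; E4M3FN has no infinities
    if exp == 0:
        return sign * (man / 8.0) * 2.0 ** -6  # subnormal
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - 7)

# All finite values reachable from the 256 bit patterns (+0 and -0 collapse).
GRID = sorted({v for v in map(decode_e4m3fn, range(256)) if v is not None})
FP8_MAX = GRID[-1]

def snap(x):
    """Nearest grid value, saturating at the ends. Ties go to the lower value,
    a simplification; real casts typically round ties to even."""
    i = bisect_left(GRID, x)
    if i == 0:
        return GRID[0]
    if i == len(GRID):
        return GRID[-1]
    lo, hi = GRID[i - 1], GRID[i]
    return lo if x - lo <= hi - x else hi

def weighted_mse(weights, importance, amax):
    """Importance-weighted squared error after clipping to amax and
    mapping amax onto the FP8 maximum (an assumed parameterization)."""
    scale = amax / FP8_MAX
    err = 0.0
    for w, imp in zip(weights, importance):
        clipped = max(-amax, min(amax, w))
        err += imp * (w - snap(clipped / scale) * scale) ** 2
    return err / sum(importance)

def best_amax(weights, importance, n_candidates=64):
    """Grid search over clipping thresholds up to the largest |weight|."""
    peak = max(abs(w) for w in weights)
    candidates = [peak * (k / n_candidates) for k in range(1, n_candidates + 1)]
    return min(candidates, key=lambda a: weighted_mse(weights, importance, a))

weights = [0.02, -0.03, 0.05, -0.01, 0.9]   # hypothetical, one outlier
importance = [1.0, 1.0, 1.0, 1.0, 0.01]     # hypothetical per-weight importance
amax = best_amax(weights, importance)
print(amax, weighted_mse(weights, importance, amax))
```

Building the grid from actual bit patterns means subnormals and the 448 ceiling are handled the same way they are at runtime, which is the point of avoiding a closed-form error model.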
Modes
- V1 (scaled=False): No scaling; only the clipping threshold (amax) is optimized. Output is standard FP8 weights. Use when you need maximum compatibility.
- V2 (scaled=True): Weights are scaled to FP8 range, quantized, and inverse scale S is stored in Safetensors (.scale). Unavailable until a dedicated loader exists.
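The difference between the two modes can be sketched on a handful of weights. This is a simplified pure-Python model, not the project's code: quantize_v1, quantize_v2 and dequantize_v2 are hypothetical names, the FP8 cast is emulated by snapping to a hand-decoded E4M3FN grid, and S is taken per tensor from the largest absolute weight.

```python
from bisect import bisect_left

def _decode(b):
    """FP8 E4M3FN bit pattern -> float, or None for the NaN encoding."""
    sign = -1.0 if b & 0x80 else 1.0
    exp, man = (b >> 3) & 0xF, b & 0x7
    if exp == 0xF and man == 0x7:
        return None
    if exp == 0:
        return sign * (man / 8.0) * 2.0 ** -6
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - 7)

GRID = sorted({v for v in map(_decode, range(256)) if v is not None})
FP8_MAX = GRID[-1]

def snap(x):
    """Nearest representable FP8 value, saturating at the grid limits."""
    i = bisect_left(GRID, x)
    if i == 0:
        return GRID[0]
    if i == len(GRID):
        return GRID[-1]
    lo, hi = GRID[i - 1], GRID[i]
    return lo if x - lo <= hi - x else hi

def quantize_v1(weights, amax):
    """scaled=False: clip to +/-amax, then store plain FP8 values."""
    return [snap(max(-amax, min(amax, w))) for w in weights]

def quantize_v2(weights):
    """scaled=True: stretch the weights over the FP8 range and
    return the quantized values together with the inverse scale S."""
    S = max(abs(w) for w in weights) / FP8_MAX
    return [snap(w / S) for w in weights], S

def dequantize_v2(q, S):
    """What a dedicated loader would have to do: multiply back by S."""
    return [v * S for v in q]

w = [0.011, -0.25, 0.4]                 # hypothetical weights
v1 = quantize_v1(w, amax=0.3)           # usable as-is by any FP8 loader
q, S = quantize_v2(w)                   # q is meaningless without S
print(v1)
print(dequantize_v2(q, S), "S =", S)
```

V1 values are already in their final numeric range, which is why any FP8 loader can read them. V2 values only make sense after multiplying by S, so a loader that ignores the .scale entry would produce wrong weights.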
Recommended Parameters
- Samples: 32 (recommended).
- Keep ratio: 0.05-0.25 (5-25%) - keeps critical layers in FP16; for Z Image Turbo (ZIT), 5-10% often gives sufficient quality.
- Steps: 25 (recommended) - to include early denoising sensitivity.
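To make the keep ratio concrete, the following sketch picks which layers stay in FP16 from a table of sensitivity scores. The layer names and scores are invented, and the rounding rule (round to nearest, at least one layer) is an assumption rather than something the project documents.

```python
def layers_to_keep_fp16(sensitivity_by_layer, keep_ratio):
    """Return the names of the most sensitive layers, up to keep_ratio of all layers."""
    n_keep = max(1, round(len(sensitivity_by_layer) * keep_ratio))
    ranked = sorted(sensitivity_by_layer, key=sensitivity_by_layer.get, reverse=True)
    return set(ranked[:n_keep])

# Hypothetical sensitivity scores for 8 layers.
scores = {
    "block0.attn": 9.1, "block0.mlp": 0.4,
    "block1.attn": 2.2, "block1.mlp": 0.3,
    "block2.attn": 1.7, "block2.mlp": 0.2,
    "final.proj": 7.5, "embed": 0.9,
}

print(layers_to_keep_fp16(scores, keep_ratio=0.25))
```

With 8 layers and a keep ratio of 0.25, two layers stay in FP16 and the other six are quantized.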
Benchmark (Reference)
| Model | SSIM (Avg) | File size | Compatibility |
|---|---|---|---|
| Original FP16 | 1.0000 | 100% (6.5GB) | High |
| Naive FP8 | 0.75-0.92 | 50% | High |
| HSWQ V1 | 0.88-0.99 | 60-70% (FP16 mixed) | High |
| HSWQ V2 | Unable to measure at this time | 60-70% (FP16 mixed) | Low (custom loader) |
HSWQ V1 gives a clear gain over Naive FP8 with full compatibility; V2 is unavailable until a dedicated loader exists.
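For readers unfamiliar with the metric, here is a minimal single-window SSIM in plain Python. It is a simplification for illustration only: standard SSIM averages the same formula over local windows, and this README does not state which implementation or settings produced the numbers above. Pixel values are assumed to be flattened grayscale in the range 0 to 1.

```python
from statistics import fmean

def global_ssim(x, y, data_range=1.0):
    """SSIM formula applied once to the whole image (no sliding window)."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = fmean(x), fmean(y)
    vx = fmean([(a - mx) ** 2 for a in x])
    vy = fmean([(b - my) ** 2 for b in y])
    cov = fmean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    )

reference = [0.1, 0.5, 0.9, 0.3, 0.7, 0.2]        # hypothetical FP16 output pixels
quantized = [0.12, 0.48, 0.88, 0.33, 0.69, 0.2]   # hypothetical FP8 output pixels

print(global_ssim(reference, reference))
print(global_ssim(reference, quantized))
```

An identical pair scores 1.0, matching the Original FP16 row, and any deviation pulls the score below 1.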
Available Models
Quantized checkpoints use the suffix _hswq_r32_r0.05_v1 (r32 = 32 calibration samples, r0.05 = keep ratio 0.05, v1 = HSWQ V1).
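If you need to read these settings back from a filename, a small parser along the following lines should work. The function name and the returned keys are hypothetical. Matching is case-insensitive because one listed file spells the sample count as R32.

```python
import re

SUFFIX = re.compile(
    r"_hswq_r(?P<samples>\d+)_r(?P<keep>\d*\.\d+)_v(?P<version>\d+)(?:\.safetensors)?$",
    re.IGNORECASE,
)

def parse_hswq_name(filename):
    """Split an HSWQ checkpoint name into base model and quantization settings."""
    m = SUFFIX.search(filename)
    if m is None:
        return None  # not an HSWQ-style name
    return {
        "base": filename[: m.start()],
        "samples": int(m["samples"]),
        "keep_ratio": float(m["keep"]),
        "version": int(m["version"]),
    }

print(parse_hswq_name("harukiMIX_zit2603_hswq_r32_r0.05_v1.safetensors"))
print(parse_hswq_name("zit_hswq_R32_r0.05_v1.safetensors"))
```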
| Filename | Base Model | Version | License |
|---|---|---|---|
| darkBeastMar2126Latest_dbzit8SDAFOK_hswq_r32_r0.05_v1.safetensors | darkBeastMar2126Latest_dbzit8SDAFOK | v8 | Apache 2.0 |
| harukiMIX_zit2603_hswq_r32_r0.05_v1.safetensors | harukiMIX_zit2603 | v2603 | Apache 2.0 |
| moodyRealMix_zitV5DPO_hswq_r32_r0.05_v1.safetensors | moodyRealMix_zitV5DPO | v5 | Apache 2.0 |
| moodyRealMix_zitV6DPO_hswq_r32_r0.05_v1.safetensors | moodyRealMix_zitV6DPO | v6 | Apache 2.0 |
| moodyWildMix_v02_hswq_r32_r0.05_v1.safetensors | moodyWildMix_v02 | v0.2 | Apache 2.0 |
| unstableRevolution_V2Fp16_hswq_r32_r0.05_v1.safetensors | unstableRevolution_V2Fp16 | v2 | Apache 2.0 |
| zit_hswq_R32_r0.05_v1.safetensors | Official Z Image Turbo weights redistributed as Comfy split checkpoints: Comfy-Org/z_image_turbo (diffusion file used for quantization: split_files/diffusion_models/z_image_turbo_bf16.safetensors) | Turbo official | See upstream repo |
Credits & License
Base Models
These models are derived from the work of their respective creators or from upstream distributions. All credit for training and aesthetic tuning belongs to the original authors.
- darkBeastMar2126Latest_dbzit8SDAFOK: Created by AiMetatron.
- harukiMIX_zit2603: Created by HARUKI3.
- moodyRealMix_zitV5DPO / moodyRealMix_zitV6DPO: Created by catlover1937 (Moody Real Mix on Civitai).
- moodyWildMix_v02: Created by catlover1937.
- unstableRevolution_V2Fp16: Created by Peli86.
- Official Z Image Turbo (ZIT): Distributed as Comfy-Org/z_image_turbo on Hugging Face; follow the upstream terms.
Disclaimer: These models are provided for optimization and research purposes. Please adhere to the original licenses of the base models.