flux_qint_8bit

Pre-quantized FLUX models using optimum-quanto for memory-efficient inference on consumer hardware.

Summary

| Metric | Value |
|---|---|
| Total Models | 8 |
| Total Size | 146.3 GB |
| Quantization | qint8 (8-bit integer) |
| Platforms | MPS, CUDA, CPU |

Available Quantizations

| Model | Transformer | Text Encoder | Path |
|---|---|---|---|
| FLUX.1 Canny [dev] | βœ… 11.09 GB | βœ… 4.56 GB | flux-1-canny-dev/ |
| FLUX.1 Depth [dev] | βœ… 11.09 GB | βœ… 4.56 GB | flux-1-depth-dev/ |
| FLUX.1 Fill [dev] | βœ… 11.09 GB | βœ… 4.56 GB | flux-1-fill-dev/ |
| FLUX.1 Kontext [dev] | βœ… 11.09 GB | βœ… 4.56 GB | flux-1-kontext-dev/ |
| FLUX.1 [dev] | βœ… 11.09 GB | βœ… 4.56 GB | flux-1-dev/ |
| FLUX.1 [schnell] | βœ… 11.08 GB | βœ… 4.56 GB | flux-1-schnell/ |
| FLUX.2 [dev] | βœ… 30.02 GB | βœ… 22.37 GB | flux-2-dev/ |
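
Each model lives in its own subfolder, so you rarely need the full 146 GB download. A minimal sketch of fetching a single model with snapshot_download's allow_patterns (the FLUX.1 [dev] folder is used here as an example):

from huggingface_hub import snapshot_download

# Download only the FLUX.1-dev subfolder instead of the whole repository.
quant_path = snapshot_download(
    "VincentGOURBIN/flux_qint_8bit",
    allow_patterns=["flux-1-dev/*"],
)
print(quant_path)  # local snapshot directory containing flux-1-dev/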

Model Details

FLUX.1 Canny [dev]

Source: black-forest-labs/FLUX.1-Canny-dev

Pipeline: FluxControlPipeline

Use case: 12B canny edge-guided generation model

| Component | Params | Size | Path |
|---|---|---|---|
| Transformer | 12.0B | 11.09 GB | flux-1-canny-dev/transformer/qint8 |
| Text Encoder (T5-XXL) | 4.7B | 4.56 GB | flux-1-canny-dev/text_encoder/qint8 |
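
A minimal sketch of running the Canny model with its quantized transformer. The edge-map preprocessing (OpenCV here), the input file name, the prompt, and the guidance value are illustrative assumptions rather than part of this repository; the Depth model works the same way with FluxControlPipeline, using a depth map as control_image instead of an edge map.

import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import FluxControlPipeline
from diffusers.models import FluxTransformer2DModel
from diffusers.utils import load_image
from huggingface_hub import snapshot_download
from optimum.quanto import QuantizedDiffusersModel

class QuantizedFluxTransformer2DModel(QuantizedDiffusersModel):
    base_class = FluxTransformer2DModel

quant_path = snapshot_download("VincentGOURBIN/flux_qint_8bit",
                               allow_patterns=["flux-1-canny-dev/*"])

pipe = FluxControlPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Canny-dev",
    transformer=None,
    torch_dtype=torch.bfloat16,
)
pipe.to("mps")  # or "cuda"
pipe.transformer = QuantizedFluxTransformer2DModel.from_pretrained(
    f"{quant_path}/flux-1-canny-dev/transformer/qint8"
).to("mps")

# Build a canny edge map from a reference image ("reference.png" is a
# hypothetical local file; any edge detector producing an RGB edge map works).
reference = load_image("reference.png")
gray = cv2.cvtColor(np.array(reference), cv2.COLOR_RGB2GRAY)
control_image = Image.fromarray(cv2.Canny(gray, 100, 200)).convert("RGB")

image = pipe(
    prompt="a watercolor painting of a lighthouse at dusk",
    control_image=control_image,
    num_inference_steps=28,
    guidance_scale=30.0,  # illustrative; the Canny [dev] model is typically run with high guidance
).images[0]
image.save("canny_output.png")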

FLUX.1 Depth [dev]

Source: black-forest-labs/FLUX.1-Depth-dev

Pipeline: FluxControlPipeline

Use case: 12B depth-guided generation model

| Component | Params | Size | Path |
|---|---|---|---|
| Transformer | 12.0B | 11.09 GB | flux-1-depth-dev/transformer/qint8 |
| Text Encoder (T5-XXL) | 4.7B | 4.56 GB | flux-1-depth-dev/text_encoder/qint8 |

FLUX.1 Fill [dev]

Source: black-forest-labs/FLUX.1-Fill-dev

Pipeline: FluxFillPipeline

Use case: 12B inpainting/outpainting model

| Component | Params | Size | Path |
|---|---|---|---|
| Transformer | 12.0B | 11.09 GB | flux-1-fill-dev/transformer/qint8 |
| Text Encoder (T5-XXL) | 4.7B | 4.56 GB | flux-1-fill-dev/text_encoder/qint8 |
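
A minimal sketch of inpainting with the Fill model and its quantized transformer. The image and mask file names, the prompt, and the guidance value are illustrative assumptions; the mask is expected to be white where the image should be repainted.

import torch
from diffusers import FluxFillPipeline
from diffusers.models import FluxTransformer2DModel
from diffusers.utils import load_image
from huggingface_hub import snapshot_download
from optimum.quanto import QuantizedDiffusersModel

class QuantizedFluxTransformer2DModel(QuantizedDiffusersModel):
    base_class = FluxTransformer2DModel

quant_path = snapshot_download("VincentGOURBIN/flux_qint_8bit",
                               allow_patterns=["flux-1-fill-dev/*"])

pipe = FluxFillPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Fill-dev",
    transformer=None,
    torch_dtype=torch.bfloat16,
)
pipe.to("mps")  # or "cuda"
pipe.transformer = QuantizedFluxTransformer2DModel.from_pretrained(
    f"{quant_path}/flux-1-fill-dev/transformer/qint8"
).to("mps")

# "photo.png" and "mask.png" are hypothetical local files.
image = load_image("photo.png")
mask = load_image("mask.png")

result = pipe(
    prompt="a red vintage car parked in the driveway",
    image=image,
    mask_image=mask,
    num_inference_steps=28,
    guidance_scale=30.0,  # illustrative; Fill [dev] is typically run with high guidance
).images[0]
result.save("fill_output.png")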

FLUX.1 Kontext [dev]

Source: black-forest-labs/FLUX.1-Kontext-dev

Pipeline: FluxKontextPipeline

Use case: 12B image editing model (in-context generation)

| Component | Params | Size | Path |
|---|---|---|---|
| Transformer | 12.0B | 11.09 GB | flux-1-kontext-dev/transformer/qint8 |
| Text Encoder (T5-XXL) | 4.7B | 4.56 GB | flux-1-kontext-dev/text_encoder/qint8 |
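
A minimal sketch of instruction-based image editing with the Kontext model and its quantized transformer. The input file name, the edit instruction, and the guidance value are illustrative assumptions.

import torch
from diffusers import FluxKontextPipeline
from diffusers.models import FluxTransformer2DModel
from diffusers.utils import load_image
from huggingface_hub import snapshot_download
from optimum.quanto import QuantizedDiffusersModel

class QuantizedFluxTransformer2DModel(QuantizedDiffusersModel):
    base_class = FluxTransformer2DModel

quant_path = snapshot_download("VincentGOURBIN/flux_qint_8bit",
                               allow_patterns=["flux-1-kontext-dev/*"])

pipe = FluxKontextPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Kontext-dev",
    transformer=None,
    torch_dtype=torch.bfloat16,
)
pipe.to("mps")  # or "cuda"
pipe.transformer = QuantizedFluxTransformer2DModel.from_pretrained(
    f"{quant_path}/flux-1-kontext-dev/transformer/qint8"
).to("mps")

# Kontext takes the image to edit plus an instruction-style prompt
# ("portrait.png" is a hypothetical local file).
source = load_image("portrait.png")
edited = pipe(
    prompt="make the jacket bright red, keep everything else unchanged",
    image=source,
    num_inference_steps=28,
    guidance_scale=2.5,  # illustrative value
).images[0]
edited.save("kontext_output.png")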

FLUX.1 [dev]

Source: black-forest-labs/FLUX.1-dev

Pipeline: FluxPipeline

Use case: 12B high-quality generation model (guidance distilled)

| Component | Params | Size | Path |
|---|---|---|---|
| Transformer | 12.0B | 11.09 GB | flux-1-dev/transformer/qint8 |
| Text Encoder (T5-XXL) | 4.7B | 4.56 GB | flux-1-dev/text_encoder/qint8 |

FLUX.1 [schnell]

Source: black-forest-labs/FLUX.1-schnell

Pipeline: FluxPipeline

Use case: 12B fast 4-step generation model (Apache 2.0 license)

| Component | Params | Size | Path |
|---|---|---|---|
| Transformer | 12.0B | 11.08 GB | flux-1-schnell/transformer/qint8 |
| Text Encoder (T5-XXL) | 4.7B | 4.56 GB | flux-1-schnell/text_encoder/qint8 |
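
FLUX.1 [schnell] is timestep-distilled, so it is called differently from [dev]: a handful of steps, no classifier-free guidance, and a 256-token prompt limit. A minimal sketch (the prompt and device are illustrative):

import torch
from diffusers import FluxPipeline
from diffusers.models import FluxTransformer2DModel
from huggingface_hub import snapshot_download
from optimum.quanto import QuantizedDiffusersModel

class QuantizedFluxTransformer2DModel(QuantizedDiffusersModel):
    base_class = FluxTransformer2DModel

quant_path = snapshot_download("VincentGOURBIN/flux_qint_8bit",
                               allow_patterns=["flux-1-schnell/*"])

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell",
    transformer=None,
    torch_dtype=torch.bfloat16,
)
pipe.to("mps")  # or "cuda"
pipe.transformer = QuantizedFluxTransformer2DModel.from_pretrained(
    f"{quant_path}/flux-1-schnell/transformer/qint8"
).to("mps")

image = pipe(
    "A majestic mountain at sunset",
    num_inference_steps=4,     # distilled model: ~4 steps is enough
    guidance_scale=0.0,        # schnell does not use classifier-free guidance
    max_sequence_length=256,   # schnell's prompt length is capped at 256 tokens
).images[0]
image.save("schnell_output.png")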

FLUX.2 [dev]

Source: black-forest-labs/FLUX.2-dev

Pipeline: Flux2Pipeline

Use case: 32B unified multi-modal model (text-to-image, inpainting, depth, canny, etc.)

| Component | Params | Size | Path |
|---|---|---|---|
| Transformer | 32.0B | 30.02 GB | flux-2-dev/transformer/qint8 |
| Text Encoder (Mistral) | 24.0B | 22.37 GB | flux-2-dev/text_encoder/qint8 |

Usage Examples

FLUX.1 Models (text-to-image, inpainting, depth, canny, etc.)

from diffusers import FluxPipeline  # or FluxFillPipeline, FluxControlPipeline, etc.
from diffusers.models import FluxTransformer2DModel
from transformers import T5EncoderModel
from optimum.quanto import QuantizedDiffusersModel, QuantizedTransformersModel
from huggingface_hub import snapshot_download
import torch

REPO_ID = "VincentGOURBIN/flux_qint_8bit"
quant_path = snapshot_download(REPO_ID)

# Quantized model classes for FLUX.1
class QuantizedFluxTransformer2DModel(QuantizedDiffusersModel):
    base_class = FluxTransformer2DModel

class QuantizedT5EncoderModel(QuantizedTransformersModel):
    auto_class = T5EncoderModel

# Example: Load FLUX.1-dev with quantized transformer
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=None,
    torch_dtype=torch.bfloat16
)

transformer = QuantizedFluxTransformer2DModel.from_pretrained(
    f"{quant_path}/flux-1-dev/transformer/qint8"
)
pipe.to("mps")  # or "cuda"; move the remaining bf16 components (text encoders, VAE) to the same device
pipe.transformer = transformer.to("mps")

# Optional: Load quantized T5 text encoder (saves ~9GB)
# text_encoder_2 = QuantizedT5EncoderModel.from_pretrained(
#     f"{quant_path}/flux-1-dev/text_encoder/qint8"
# )
# pipe.text_encoder_2 = text_encoder_2.to("mps")

image = pipe("A majestic mountain at sunset", num_inference_steps=28).images[0]
image.save("output.png")

FLUX.2 Models (unified multi-modal)

from diffusers import Flux2Pipeline
from diffusers.models import Flux2Transformer2DModel
from transformers import AutoModel
from optimum.quanto import QuantizedDiffusersModel, QuantizedTransformersModel
from huggingface_hub import snapshot_download
import torch

REPO_ID = "VincentGOURBIN/flux_qint_8bit"
quant_path = snapshot_download(REPO_ID)

# Quantized model classes for FLUX.2
class QuantizedFlux2Transformer2DModel(QuantizedDiffusersModel):
    base_class = Flux2Transformer2DModel

class QuantizedFlux2TextEncoder(QuantizedTransformersModel):
    auto_class = AutoModel

# Load FLUX.2-dev with quantized transformer
pipe = Flux2Pipeline.from_pretrained(
    "black-forest-labs/FLUX.2-dev",
    transformer=None,
    torch_dtype=torch.bfloat16
)

transformer = QuantizedFlux2Transformer2DModel.from_pretrained(
    f"{quant_path}/flux-2-dev/transformer/qint8"
)
pipe.to("mps")  # or "cuda"; move the remaining bf16 components (text encoder, VAE) to the same device
pipe.transformer = transformer.to("mps")

# Optional: Load quantized Mistral text encoder (saves ~36GB)
# text_encoder = QuantizedFlux2TextEncoder.from_pretrained(
#     f"{quant_path}/flux-2-dev/text_encoder/qint8"
# )
# pipe.text_encoder = text_encoder.to("mps")

image = pipe("A beautiful landscape", num_inference_steps=28, guidance_scale=4.0).images[0]
image.save("output.png")

Memory Requirements

| Model Family | Transformer (qint8) | Text Encoder (qint8) | Total | RAM to Quantize |
|---|---|---|---|---|
| FLUX.2 | ~30 GB | ~22 GB | ~52 GB | ~64 GB |
| FLUX.1 | ~11 GB | ~4.4 GB | ~15 GB | ~24 GB |

Compatibility

| Platform | Status | Notes |
|---|---|---|
| MPS (Apple Silicon) | βœ… Fully supported | Best for M1/M2/M3 Macs |
| CUDA (NVIDIA) | βœ… Fully supported | RTX 3090+ recommended |
| CPU | ⚠️ Slow | Not recommended for production |
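
A small helper for picking the backend at runtime, matching the table above; use the resulting string wherever the examples in this card hard-code "mps".

import torch

if torch.backends.mps.is_available():
    device = "mps"       # Apple Silicon
elif torch.cuda.is_available():
    device = "cuda"      # NVIDIA GPUs
else:
    device = "cpu"       # supported, but very slow
print(f"Running on: {device}")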

Installation

pip install diffusers transformers accelerate safetensors
pip install optimum[quanto]
pip install huggingface_hub

Important Notes

  • VAE is NOT quantized: quantizing the VAE causes visual artifacts
  • LoRA compatible: quantized models work with LoRA adapters (unlike GGUF); see the sketch after this list
  • Text encoders are optional: transformer-only quantization already saves significant memory while the text encoder runs in bfloat16
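
As a rough sketch of the LoRA point above, adapters are loaded with the standard diffusers LoRA API; the adapter repository name below is a placeholder, and compatibility should be verified with the specific adapter you use.

# Continuing from the FLUX.1 usage example above (pipe already has the
# quantized transformer attached). "some-user/some-flux-lora" is a placeholder.
pipe.load_lora_weights("some-user/some-flux-lora", adapter_name="style")

image = pipe(
    "A majestic mountain at sunset, in the adapter's style",
    num_inference_steps=28,
).images[0]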

File Structure

flux_qint_8bit/
β”œβ”€β”€ flux-2-dev/           # FLUX.2 models (if present)
β”‚   β”œβ”€β”€ transformer/
β”‚   β”‚   └── qint8/
β”‚   └── text_encoder/
β”‚       └── qint8/
β”œβ”€β”€ flux-1-dev/           # FLUX.1 models
β”‚   β”œβ”€β”€ transformer/
β”‚   β”‚   └── qint8/
β”‚   └── text_encoder/
β”‚       └── qint8/
β”œβ”€β”€ flux-1-schnell/       # Fast model
β”‚   └── ...
└── README.md

Generated With

flux-quantizer: a Gradio tool for batch quantizing and publishing FLUX models.


Last updated: 2025-12-23 18:18 UTC
