dhairyashil committed on
Commit 6dd89d5 · 1 Parent(s): cc3ad99

add readme and config

Files changed (2)
  1. README.md +105 -3
  2. config.json +59 -0
README.md CHANGED
@@ -1,3 +1,105 @@
- ---
- license: apache-2.0
- ---
+ ---
+ language: en
+ license: apache-2.0
+ tags:
+ - text-to-image
+ - diffusion
+ - mflux
+ - development
+ datasets:
+ - custom
+ ---
+
+ # FLUX.1-dev-mflux-8bit
+
+ [![Hugging Face](https://img.shields.io/badge/🤗%20Hugging%20Face-FLUX.1--dev--mflux--8bit-blue)](https://huggingface.co/dhairyashil/FLUX.1-dev-mflux-8bit)
+
+ ![comparison_output](comparison.png)
+
+ An 8-bit quantized version of the [FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev) text-to-image model, produced with the [mflux](https://github.com/filipstrand/mflux) (version 0.6.2) quantization workflow.
+
+ ## Overview
+
+ This repository contains the 8-bit quantized FLUX.1-dev model, which significantly reduces the memory footprint while preserving most of the generation quality. The quantization was performed with the mflux library.
+
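+ As a note on provenance, this kind of quantized copy can be produced with mflux's `mflux-save` command, which downloads the original weights and writes a quantized version to disk. The invocation below is a sketch based on the mflux README; the output path is an arbitrary example and flag names may vary between releases:
+
+ ```bash
+ # sketch: save an 8-bit quantized copy of FLUX.1-dev to a local directory
+ mflux-save \
+   --path "FLUX.1-dev-mflux-8bit" \
+   --model dev \
+   --quantize 8
+ ```
+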
+ ### Benefits of 8-bit Quantization
+
+ - **Reduced Memory Usage**: ~50% reduction in memory requirements compared to the original model (see the back-of-envelope sketch after this list)
+ - **Faster Loading Times**: Smaller model size means quicker initialization
+ - **Lower Storage Requirements**: Significantly smaller disk footprint
+ - **Accessibility**: Can run on consumer hardware with limited VRAM
+ - **Minimal Quality Loss**: Maintains nearly identical output quality to the original model
+
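+ The ~50% memory figure follows directly from the per-weight storage cost. Here is a back-of-envelope sketch in Python (the 12B parameter count comes from this repository's config.json; the full pipeline also bundles the text encoders and VAE, so repo-level sizes are larger):
+
+ ```python
+ # Back-of-envelope: weight storage for the 12B-parameter transformer alone.
+ params = 12_000_000_000      # parameter count from config.json
+ bytes_fp16 = params * 2      # float16 stores 2 bytes per weight
+ bytes_8bit = params * 1      # 8-bit quantization stores ~1 byte per weight
+
+ gib = 1024 ** 3
+ print(f"fp16 : ~{bytes_fp16 / gib:.0f} GiB")  # ~22 GiB
+ print(f"8-bit: ~{bytes_8bit / gib:.0f} GiB")  # ~11 GiB
+ # The T5 and CLIP text encoders and the VAE add several more GB,
+ # which is how the repo-level numbers reach ~36 GB fp16 vs ~18 GB 8-bit.
+ ```
+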
+ ## Model Structure
+
+ This repository contains the following components:
+
+ - `text_encoder/`: CLIP text encoder (8-bit quantized)
+ - `text_encoder_2/`: T5 text encoder (8-bit quantized)
+ - `tokenizer/`: CLIP tokenizer configuration and vocabulary
+ - `tokenizer_2/`: T5 tokenizer configuration
+ - `transformer/`: Main diffusion transformer (8-bit quantized)
+ - `vae/`: Variational autoencoder for image encoding/decoding (8-bit quantized)
+
+ ## Usage
+
+ ### Requirements
+
+ - Python
+ - PyTorch
+ - Transformers
+ - Diffusers
+ - [mflux](https://github.com/filipstrand/mflux) library (for 8-bit model support)
+
+ ### Installation
+
+ ```bash
+ pip install torch diffusers transformers accelerate
+ uv tool install mflux  # check the mflux README for more details
+ ```
+
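+ After installation, a quick sanity check that the CLI is on your PATH (this assumes mflux installs the `mflux-generate` entry point, as its README indicates):
+
+ ```bash
+ mflux-generate --help
+ ```
+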
+ ### Example Usage
+
+ ```bash
+ # generate an image from the 8-bit quantized model in this repository
+ mflux-generate \
+   --path "dhairyashil/FLUX.1-dev-mflux-8bit" \
+   --model dev \
+   --steps 25 \
+   --seed 2 \
+   --height 1920 \
+   --width 1024 \
+   --prompt "hot chocolate dish"
+ ```
+
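+ Besides the CLI, mflux exposes a Python API. The snippet below is a minimal sketch assuming the API shape shown in the mflux README around v0.6 (names such as `Flux1`, `Config`, `from_alias`, and `generate_image` come from that README and may differ in other releases):
+
+ ```python
+ # Minimal sketch of programmatic generation with mflux.
+ # API names follow the mflux README (~v0.6); verify against your installed release.
+ from mflux import Flux1, Config
+
+ # Load FLUX.1-dev with 8-bit quantization applied at load time.
+ flux = Flux1.from_alias(alias="dev", quantize=8)
+
+ image = flux.generate_image(
+     seed=2,
+     prompt="hot chocolate dish",
+     config=Config(
+         num_inference_steps=25,
+         height=1920,
+         width=1024,
+     ),
+ )
+ image.save(path="output.png")
+ ```
+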
+ ### Comparison Output
+
+ The images generated from the above prompt with each model variant are shown at the top.
+
+ The fp16 and 8-bit results look nearly identical: the 8-bit version maintains excellent quality while using significantly less memory.
+
+ A [4-bit version](https://huggingface.co/dhairyashil/FLUX.1-dev-mflux-4bit) is also available for testing, though with a more noticeable quality difference.
+
+ ## Performance Comparison
+
+ | Model Version   | Memory Usage | Inference Speed  | Quality            |
+ |-----------------|--------------|------------------|--------------------|
+ | Original FP16   | ~36 GB       | Base             | Base               |
+ | 8-bit Quantized | ~18 GB       | Nearly identical | Nearly identical   |
+ | 4-bit Quantized | ~9 GB        | Nearly identical | Moderately reduced |
+
+ ## Other Highlights
+
+ - Minimal quality degradation compared to the original model
+ - Nearly identical inference speed
+ - Rare artifacts that are imperceptible in most use cases
+
+ ## Acknowledgements
+
+ - [Black Forest Labs](https://huggingface.co/black-forest-labs) for creating the original FLUX.1 model family
+ - [Filip Strand](https://github.com/filipstrand) for developing the mflux quantization methodology
+ - The Hugging Face team for their Diffusers and Transformers libraries
+ - All contributors to the development version for their testing and improvements
+
+ ## License
+
+ This model inherits the license of the original FLUX.1-dev model. Please refer to the [original model repository](https://huggingface.co/black-forest-labs/FLUX.1-dev) for licensing information.
config.json ADDED
@@ -0,0 +1,59 @@
+ {
+   "_class_name": "FluxPipeline",
+   "_diffusers_version": "0.19.0",
+   "force_zeros_for_empty_prompt": true,
+   "add_watermarker": false,
+   "feature_extractor": [
+     "transformers",
+     "CLIPImageProcessor"
+   ],
+   "text_encoder": [
+     "transformers",
+     "CLIPTextModel"
+   ],
+   "text_encoder_2": [
+     "transformers",
+     "T5EncoderModel"
+   ],
+   "tokenizer": [
+     "transformers",
+     "CLIPTokenizer"
+   ],
+   "tokenizer_2": [
+     "transformers",
+     "T5TokenizerFast"
+   ],
+   "transformer": [
+     "diffusers",
+     "FluxTransformer2DModel"
+   ],
+   "vae": [
+     "diffusers",
+     "AutoencoderKL"
+   ],
+   "model_type": "flux-rectified-flow",
+   "architecture": "rectified-flow-transformer",
+   "parameters": 12000000000,
+   "prediction_type": "flow",
+   "max_sequence_length": 256,
+   "requires_safety_checker": false,
+   "safety_checker": null,
+   "original_model": "black-forest-labs/FLUX.1-dev",
+   "model_description": "A development version of the 12 billion parameter rectified flow transformer capable of generating images from text descriptions using a hybrid architecture of multimodal and parallel diffusion transformer blocks",
+   "quantization": {
+     "method": "mflux",
+     "version": "0.6.2",
+     "bits": 8,
+     "original_dtype": "float16"
+   },
+   "memory_requirements": {
+     "original_fp16": "~36 GB",
+     "quantized_8bit": "~18 GB"
+   },
+   "recommended_inference_parameters": {
+     "steps": 25,
+     "guidance_scale": 3.5,
+     "max_sequence_length": 256
+   },
+   "license": "apache-2.0"
+ }