It's been a while since we shipped native quantization support in diffusers 🧨

We currently support bitsandbytes as the official backend, but using others like torchao is already very simple.

This post is just a reminder of what's possible (rough sketches for each item follow the list):
1. Loading a model with a quantization config
2. Saving a model with a quantization config
3. Loading a pre-quantized model
4. enable_model_cpu_offload()
5. Training and loading LoRAs into quantized checkpoints
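For item 1, a minimal sketch of loading a model with a 4-bit bitsandbytes config. The checkpoint, subfolder, and dtype here are just illustrative choices, not part of the post:

```python
import torch
from diffusers import BitsAndBytesConfig, SD3Transformer2DModel

# 4-bit NF4 quantization config for the bitsandbytes backend
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Quantize the transformer on the fly while loading it
transformer = SD3Transformer2DModel.from_pretrained(
    "stabilityai/stable-diffusion-3.5-large",  # assumed checkpoint; any supported model works
    subfolder="transformer",
    quantization_config=nf4_config,
    torch_dtype=torch.bfloat16,
)
```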
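Items 2 and 3, saving and then reloading the quantized weights, roughly look like this (continuing from the snippet above; the local path is a placeholder):

```python
# Saving serializes the quantization config alongside the weights
transformer.save_pretrained("sd3.5-transformer-nf4")  # placeholder path

# Reloading the pre-quantized checkpoint; no quantization config needed this time
transformer = SD3Transformer2DModel.from_pretrained("sd3.5-transformer-nf4")
```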
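Item 4 works at the pipeline level: plug the quantized transformer into a pipeline and turn on CPU offload. The pipeline class and checkpoint are again assumptions for the sketch:

```python
from diffusers import StableDiffusion3Pipeline

# Build the pipeline around the quantized transformer
pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3.5-large",  # assumed checkpoint
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)

# Keep sub-models on CPU and move each one to the GPU only while it runs
pipe.enable_model_cpu_offload()
```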
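And the loading half of item 5, putting a LoRA on top of the quantized checkpoint; the LoRA repo id and prompt are made-up placeholders:

```python
# Load LoRA weights on top of the quantized transformer and run inference
pipe.load_lora_weights("your-username/your-sd3-lora")  # placeholder LoRA repo
image = pipe("a photo of an astronaut riding a horse").images[0]
```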
Docs:
https://huggingface.co/docs/diffusers/main/en/quantization/bitsandbytes