---
tags:
- accuracy recovery adapter
- AI Toolkit
---

# Accuracy Recovery Adapters

This repo contains various accuracy recovery adapters (ARAs) that I have trained, primarily for use with [AI Toolkit](https://github.com/ostris/ai-toolkit).

An ARA is a LoRA trained via student-teacher training, where the student is the model quantized down to low precision and the teacher is the same model at high precision. The goal is a side-chain LoRA, kept at bfloat16, that runs in parallel with the heavily quantized layers of a network to compensate for the precision lost when weights are quantized. The training is done on a per-layer basis in order to match the parent output as closely as possible.
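
To make the idea concrete, here is a minimal sketch, assuming PyTorch, of what per-layer distillation for a single linear layer could look like. The `ARALinear` class, the crude rounding quantizer, and the loop below are illustrative stand-ins for this README, not the code actually used to train these adapters.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ARALinear(nn.Module):
    """A heavily quantized linear layer with a bfloat16 LoRA side chain (illustrative)."""

    def __init__(self, parent: nn.Linear, rank: int = 16):
        super().__init__()
        # Crude stand-in for a real 3-bit quantizer: snap each weight row to
        # 8 signed levels. A production ARA would use a proper quantizer here.
        with torch.no_grad():
            w = parent.weight.float()
            scale = (w.abs().amax(dim=1, keepdim=True) / 3.0).clamp_min(1e-8)
            q = torch.clamp((w / scale).round(), -4, 3) * scale
        self.register_buffer("q_weight", q.to(torch.bfloat16))
        self.register_buffer(
            "q_bias", None if parent.bias is None else parent.bias.detach().clone()
        )
        # The side-chain LoRA stays in bfloat16; these are the only trained weights.
        self.lora_down = nn.Linear(parent.in_features, rank, bias=False, dtype=torch.bfloat16)
        self.lora_up = nn.Linear(rank, parent.out_features, bias=False, dtype=torch.bfloat16)
        nn.init.zeros_(self.lora_up.weight)  # adapter starts as a no-op

    def forward(self, x):
        return F.linear(x, self.q_weight, self.q_bias) + self.lora_up(self.lora_down(x))


# Per-layer distillation: drive the quantized layer plus adapter toward the
# high-precision parent's output on matched inputs.
parent = nn.Linear(1024, 1024, dtype=torch.bfloat16)
student = ARALinear(parent, rank=16)
opt = torch.optim.AdamW(
    [p for n, p in student.named_parameters() if "lora" in n], lr=1e-4
)

for _ in range(1000):
    # Random inputs stand in for real layer activations captured from data.
    x = torch.randn(8, 1024, dtype=torch.bfloat16)
    with torch.no_grad():
        target = parent(x)  # teacher output at high precision
    loss = F.mse_loss(student(x), target)
    opt.zero_grad()
    loss.backward()
    opt.step()
```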

While these adapters can be used for inference, my primary goal is to make large models finetunable on consumer-grade hardware. With the 3-bit Qwen-Image adapter, it is now possible to train a LoRA on top of it, with 1 MP images, on a 24 GB GPU such as a 3090/4090.
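
For a rough sense of why this fits, here is a back-of-envelope on weight memory alone, assuming the transformer is on the order of 20B parameters (an assumption on my part; exact sizes vary):

```python
# Back-of-envelope weight memory for a ~20B-parameter transformer (assumed size).
params = 20e9
print(f"bf16 : {params * 2 / 2**30:.1f} GiB")      # ~37 GiB: too big for a 24 GB card
print(f"3-bit: {params * 3 / 8 / 2**30:.1f} GiB")  # ~7 GiB: leaves room for training state
```

The rank-16 bfloat16 side chains add comparatively little on top of that.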

I have found the sweet spot, at least for [Qwen-Image](https://huggingface.co/Qwen/Qwen-Image), to be 3-bit quantization with a rank-16 adapter.

More info, examples, links, training scripts, AI Toolkit example configs, and adapters to come soon.

## License

All adapters inherit their parent model's license: Apache 2.0 for Apache 2.0 models, the BFL License for BFL-licensed models, etc.

## Qwen-Image 3-Bit Quantization