Upload folder using huggingface_hub
README.md CHANGED

@@ -10,7 +10,7 @@ tags:
 - image-text-to-text
 ---

-<!-- README Version: v1.1 -->
+<!-- README Version: v1.2 -->

 # Qwen2.5-VL-7B-Instruct (Abliterated)
@@ -39,7 +39,7 @@ Qwen2.5-VL-7B-Instruct is an instruction-tuned multimodal large language model that

 ```
 qwen2.5-vl-7b-instruct/
-├── qwen2.5-vl-7b-instruct-abliterated.safetensors #
+├── qwen2.5-vl-7b-instruct-abliterated.safetensors # 16GB (FP16 SafeTensors)
 ├── qwen2.5-vl-7b-instruct-abliterated-f16.gguf # 15GB (FP16 GGUF)
 ├── qwen2.5-vl-7b-instruct-abliterated-q5-k-m.gguf # 5.1GB (Q5_K_M quantized)
 └── qwen2.5-vl-7b-instruct-abliterated-q4-k-m.gguf # 4.4GB (Q4_K_M quantized)
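Since each format in the listing above lives in its own file, a single variant can be fetched without cloning the whole repository. A minimal sketch using `huggingface_hub`; the `repo_id` below is a placeholder assumption, only the filename comes from the listing:

```python
# Minimal sketch: download just the Q4_K_M GGUF instead of the full repo.
# The repo_id is a hypothetical placeholder; substitute the actual repository name.
from huggingface_hub import hf_hub_download

gguf_path = hf_hub_download(
    repo_id="your-namespace/qwen2.5-vl-7b-instruct-abliterated",  # placeholder
    filename="qwen2.5-vl-7b-instruct-abliterated-q4-k-m.gguf",
)
print(gguf_path)  # local cache path of the downloaded file
```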
@@ -49,7 +49,7 @@ qwen2.5-vl-7b-instruct/

 ### Format Descriptions

-- **SafeTensors (FP16)**: Full precision format for transformers/diffusers libraries (
+- **SafeTensors (FP16)**: Full precision format for transformers/diffusers libraries (16GB)
 - **GGUF F16**: Full precision GGUF format for llama.cpp and compatible runtimes (15GB)
 - **GGUF Q5_K_M**: 5-bit mixed quantization balancing quality and size (5.1GB)
 - **GGUF Q4_K_M**: 4-bit mixed quantization for maximum efficiency (4.4GB)
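For the SafeTensors (FP16) entry above, a minimal loading sketch with `transformers` (assumes a recent release with Qwen2.5-VL support and roughly 16GB of free VRAM; the `repo_id` is again a placeholder, and generation code is omitted):

```python
# Minimal sketch: load the FP16 SafeTensors checkpoint with transformers.
# repo_id is a hypothetical placeholder, not taken from this repository.
import torch
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

repo_id = "your-namespace/qwen2.5-vl-7b-instruct-abliterated"  # placeholder

model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    repo_id,
    torch_dtype=torch.float16,  # FP16 weights, matching the checkpoint
    device_map="auto",          # spread layers across available devices
)
processor = AutoProcessor.from_pretrained(repo_id)
```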
@@ -58,7 +58,7 @@ qwen2.5-vl-7b-instruct/

 | Format | VRAM Required | Disk Space | Recommended GPU |
 |--------|---------------|------------|-----------------|
-| FP16 SafeTensors | ~
+| FP16 SafeTensors | ~16-18GB | 16GB | RTX 4090, A100, A6000 |
 | FP16 GGUF | ~15-16GB | 15GB | RTX 4090, A100, A6000 |
 | Q5_K_M GGUF | ~6-7GB | 5.1GB | RTX 3090, RTX 4070 Ti, V100 |
 | Q4_K_M GGUF | ~5-6GB | 4.4GB | RTX 3060 12GB, RTX 4060 Ti |
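The VRAM column in the table above roughly tracks the weight file size plus headroom for activations and the KV cache. A back-of-envelope check, assuming ~7.6B parameters and approximate effective bits per weight for each format (both figures are assumptions, not repository metadata):

```python
# Rough sanity check of the hardware table: weight memory ≈ params * bits / 8.
params = 7.6e9  # approximate parameter count for a 7B-class VLM

formats = {
    "FP16":   16.0,  # bits per weight
    "Q5_K_M": 5.5,   # approximate effective average for this mixed quantization
    "Q4_K_M": 4.85,  # approximate effective average
}

for name, bits in formats.items():
    gb = params * bits / 8 / 1e9
    # Actual VRAM use adds KV cache, activations, and runtime overhead (~1-2GB here).
    print(f"{name}: ~{gb:.1f} GB of weights")
```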
@@ -261,9 +261,10 @@ For technical issues or questions:

 ## Version History

+- **v1.2** (2025-10-29): Corrected SafeTensors file size (16GB) and VRAM requirements
 - **v1.1** (2025-10-29): Updated documentation with accurate file information and abliterated model details
 - **v1.0** (2025-10-28): Initial README with Hugging Face metadata

 ---

-**Model Format**: SafeTensors + GGUF | **Precision**: FP16, Q5_K_M, Q4_K_M | **Size**: 4.4GB -
+**Model Format**: SafeTensors + GGUF | **Precision**: FP16, Q5_K_M, Q4_K_M | **Size**: 4.4GB - 16GB