Upload folder using huggingface_hub
README.md CHANGED

@@ -10,7 +10,7 @@ tags:
 - image-text-to-text
 ---

-<!-- README Version: v1.1 -->
+<!-- README Version: v1.2 -->

 # Qwen2.5-VL-7B-Instruct (Abliterated)
@@ -39,7 +39,7 @@ Qwen2.5-VL-7B-Instruct is an instruction-tuned multimodal large language model that

 ```
 qwen2.5-vl-7b-instruct/
-├── qwen2.5-vl-7b-instruct-abliterated.safetensors #
+├── qwen2.5-vl-7b-instruct-abliterated.safetensors # 16GB (FP16 SafeTensors)
 ├── qwen2.5-vl-7b-instruct-abliterated-f16.gguf # 15GB (FP16 GGUF)
 ├── qwen2.5-vl-7b-instruct-abliterated-q5-k-m.gguf # 5.1GB (Q5_K_M quantized)
 └── qwen2.5-vl-7b-instruct-abliterated-q4-k-m.gguf # 4.4GB (Q4_K_M quantized)
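Since each format in the listing above lives in its own file, a single variant can be fetched without cloning the whole repository. A minimal sketch using `huggingface_hub`; the `repo_id` below is a placeholder assumption, only the filename comes from the listing:

```python
# Minimal sketch: download just the Q4_K_M GGUF instead of the full repo.
# The repo_id is a hypothetical placeholder; substitute the actual repository name.
from huggingface_hub import hf_hub_download

gguf_path = hf_hub_download(
    repo_id="your-namespace/qwen2.5-vl-7b-instruct-abliterated",  # placeholder
    filename="qwen2.5-vl-7b-instruct-abliterated-q4-k-m.gguf",
)
print(gguf_path)  # local cache path of the downloaded file
```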
@@ -49,7 +49,7 @@ qwen2.5-vl-7b-instruct/

 ### Format Descriptions

-- **SafeTensors (FP16)**: Full precision format for transformers/diffusers libraries (
+- **SafeTensors (FP16)**: Full precision format for transformers/diffusers libraries (16GB)
 - **GGUF F16**: Full precision GGUF format for llama.cpp and compatible runtimes (15GB)
 - **GGUF Q5_K_M**: 5-bit mixed quantization balancing quality and size (5.1GB)
 - **GGUF Q4_K_M**: 4-bit mixed quantization for maximum efficiency (4.4GB)
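For the SafeTensors (FP16) entry above, a minimal loading sketch with `transformers` (assumes a recent release with Qwen2.5-VL support and roughly 16GB of free VRAM; the `repo_id` is again a placeholder, and generation code is omitted):

```python
# Minimal sketch: load the FP16 SafeTensors checkpoint with transformers.
# repo_id is a hypothetical placeholder, not taken from this repository.
import torch
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

repo_id = "your-namespace/qwen2.5-vl-7b-instruct-abliterated"  # placeholder

model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    repo_id,
    torch_dtype=torch.float16,  # FP16 weights, matching the checkpoint
    device_map="auto",          # spread layers across available devices
)
processor = AutoProcessor.from_pretrained(repo_id)
```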
@@ -58,7 +58,7 @@ qwen2.5-vl-7b-instruct/

 | Format | VRAM Required | Disk Space | Recommended GPU |
 |--------|---------------|------------|-----------------|
-| FP16 SafeTensors | ~
+| FP16 SafeTensors | ~16-18GB | 16GB | RTX 4090, A100, A6000 |
 | FP16 GGUF | ~15-16GB | 15GB | RTX 4090, A100, A6000 |
 | Q5_K_M GGUF | ~6-7GB | 5.1GB | RTX 3090, RTX 4070 Ti, V100 |
 | Q4_K_M GGUF | ~5-6GB | 4.4GB | RTX 3060 12GB, RTX 4060 Ti |
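The VRAM column in the table above roughly tracks the weight file size plus headroom for activations and the KV cache. A back-of-envelope check, assuming ~7.6B parameters and approximate effective bits per weight for each format (both figures are assumptions, not repository metadata):

```python
# Rough sanity check of the hardware table: weight memory ≈ params * bits / 8.
params = 7.6e9  # approximate parameter count for a 7B-class VLM

formats = {
    "FP16":   16.0,  # bits per weight
    "Q5_K_M": 5.5,   # approximate effective average for this mixed quantization
    "Q4_K_M": 4.85,  # approximate effective average
}

for name, bits in formats.items():
    gb = params * bits / 8 / 1e9
    # Actual VRAM use adds KV cache, activations, and runtime overhead (~1-2GB here).
    print(f"{name}: ~{gb:.1f} GB of weights")
```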
@@ -261,9 +261,10 @@ For technical issues or questions:

 ## Version History

+- **v1.2** (2025-10-29): Corrected SafeTensors file size (16GB) and VRAM requirements
 - **v1.1** (2025-10-29): Updated documentation with accurate file information and abliterated model details
 - **v1.0** (2025-10-28): Initial README with Hugging Face metadata

 ---

-**Model Format**: SafeTensors + GGUF | **Precision**: FP16, Q5_K_M, Q4_K_M | **Size**: 4.4GB -
+**Model Format**: SafeTensors + GGUF | **Precision**: FP16, Q5_K_M, Q4_K_M | **Size**: 4.4GB - 16GB