Matrix committed
Upload README.md with huggingface_hub

README.md CHANGED

@@ -5,7 +5,6 @@ license: gemma
 pipeline_tag: image-text-to-text
 tags:
 - llama-cpp
-- gguf-my-repo
 extra_gated_heading: Access Gemma on Hugging Face
 extra_gated_prompt: To access Gemma on Hugging Face, you’re required to review and
   agree to Google’s usage license. To do this, please ensure you’re logged in to Hugging
@@ -14,45 +13,26 @@ extra_gated_button_content: Acknowledge license
 ---
 
 # matrixportal/gemma-3-4b-it-GGUF
-This model was converted to GGUF format from [`google/gemma-3-4b-it`](https://huggingface.co/google/gemma-3-4b-it)
 Refer to the [original model card](https://huggingface.co/google/gemma-3-4b-it) for more details on the model.
 
-## 
-
-Step 1: Clone llama.cpp from GitHub.
-```
-git clone https://github.com/ggerganov/llama.cpp
-```
-
-Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag, along with any other hardware-specific flags (for example, `LLAMA_CUDA=1` for NVIDIA GPUs on Linux).
-```
-cd llama.cpp && LLAMA_CURL=1 make
-```
-
-Step 3: Run inference through the main binary.
-```
-./llama-cli --hf-repo matrixportal/gemma-3-4b-it-GGUF --hf-file gemma-3-4b-it-f16.gguf -p "The meaning to life and the universe is"
-```
-or
-```
-./llama-server --hf-repo matrixportal/gemma-3-4b-it-GGUF --hf-file gemma-3-4b-it-f16.gguf -c 2048
-```
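The two removed inference commands differ only in their trailing flags. A minimal sketch that factors out the shared arguments (the repo and file names come from this card; everything else assumes a local llama.cpp build as in the steps above):

```shell
# Factor the shared llama.cpp download flags into one variable so switching
# quants is a one-line change. Assumes llama-cli / llama-server were built
# as in the steps above; repo and file names are from this model card.
hf_repo="matrixportal/gemma-3-4b-it-GGUF"
hf_file="gemma-3-4b-it-f16.gguf"   # swap for any .gguf listed in the repo
common="--hf-repo $hf_repo --hf-file $hf_file"

# Print the two invocations (run them directly once llama.cpp is built):
printf '%s\n' "./llama-cli $common -p \"The meaning to life and the universe is\""
printf '%s\n' "./llama-server $common -c 2048"
```

With the flags factored out this way, pointing both the CLI and the server at a different quant means editing only `hf_file`.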
+This model was converted to GGUF format from [`google/gemma-3-4b-it`](https://huggingface.co/google/gemma-3-4b-it)
+
+## 🔍 Quantized Models Download List 🔍
+
+**✨ Recommended for CPU:** `Q4_K_M` | **⚡ Recommended for ARM CPU:** `Q4_0` | **🏆 Best Quality:** `Q8_0`
+
+| 🚀 Download | 🔢 Type | 📝 Notes |
+|:---------|:-----|:------|
+| [Download](https://huggingface.co/matrixportal/gemma-3-4b-it-GGUF/resolve/main/gemma-3-4b-it-q2_k.gguf) | Q2_K | Basic quantization |
+| [Download](https://huggingface.co/matrixportal/gemma-3-4b-it-GGUF/resolve/main/gemma-3-4b-it-q3_k_s.gguf) | Q3_K_S | Small size |
+| [Download](https://huggingface.co/matrixportal/gemma-3-4b-it-GGUF/resolve/main/gemma-3-4b-it-q3_k_m.gguf) | Q3_K_M | Balanced quality |
+| [Download](https://huggingface.co/matrixportal/gemma-3-4b-it-GGUF/resolve/main/gemma-3-4b-it-q3_k_l.gguf) | Q3_K_L | Better quality |
+| [Download](https://huggingface.co/matrixportal/gemma-3-4b-it-GGUF/resolve/main/gemma-3-4b-it-q4_0.gguf) | Q4_0 | Fast on ARM |
+| [Download](https://huggingface.co/matrixportal/gemma-3-4b-it-GGUF/resolve/main/gemma-3-4b-it-q4_k_s.gguf) | Q4_K_S | Fast, recommended |
+| [Download](https://huggingface.co/matrixportal/gemma-3-4b-it-GGUF/resolve/main/gemma-3-4b-it-q4_k_m.gguf) | Q4_K_M ⭐ | Best balance |
+| [Download](https://huggingface.co/matrixportal/gemma-3-4b-it-GGUF/resolve/main/gemma-3-4b-it-q5_0.gguf) | Q5_0 | Good quality |
+| [Download](https://huggingface.co/matrixportal/gemma-3-4b-it-GGUF/resolve/main/gemma-3-4b-it-q5_k_s.gguf) | Q5_K_S | Balanced |
+| [Download](https://huggingface.co/matrixportal/gemma-3-4b-it-GGUF/resolve/main/gemma-3-4b-it-q5_k_m.gguf) | Q5_K_M | High quality |
+| [Download](https://huggingface.co/matrixportal/gemma-3-4b-it-GGUF/resolve/main/gemma-3-4b-it-q6_k.gguf) | Q6_K 🏆 | Very good quality |
+| [Download](https://huggingface.co/matrixportal/gemma-3-4b-it-GGUF/resolve/main/gemma-3-4b-it-q8_0.gguf) | Q8_0 ⚡ | Fast, best quality |
+| [Download](https://huggingface.co/matrixportal/gemma-3-4b-it-GGUF/resolve/main/gemma-3-4b-it-f16.gguf) | F16 | Maximum accuracy |
+
+💡 **Tip:** Use `F16` for maximum precision when quality is critical.
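All of the links in the table above follow one URL pattern, so the direct-download URL for any listed quant can be built from its type. A small sketch (the `resolve/main` pattern comes from the table's links; the `curl` and `huggingface-cli` lines are standard tools, not commands from this card):

```shell
# Build the direct-download URL for any quant in the table above; the
# https://huggingface.co/<repo>/resolve/main/<file> pattern is taken from
# the table's links.
repo="matrixportal/gemma-3-4b-it-GGUF"
quant="q4_k_m"                                  # the table's CPU pick
file="gemma-3-4b-it-${quant}.gguf"
url="https://huggingface.co/${repo}/resolve/main/${file}"
echo "$url"
# Fetch with either of:
#   curl -L -O "$url"
#   huggingface-cli download "$repo" "$file"
```

Changing `quant` to any other type from the table (e.g. `q8_0`) yields the matching file's URL.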