Qwen2.5-3B-Instruct-Q4_K_M-GGUF

A GGUF build of Qwen2.5-3B-Instruct, quantized to Q4_K_M, for mobile and edge deployment.

Model Details

Run with llama.cpp's llama-cli:

./llama-cli -m Qwen2.5-3B-Instruct-Q4_K_M.gguf -p "Hello, how are you?"

This model is intended for on-device inference on iOS using llama.swiftui, the SwiftUI example app that ships with llama.cpp.

Prompt format (ChatML):

<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
{user_message}<|im_end|>
<|im_start|>assistant
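
When a caller has to assemble the raw prompt string itself, the template above can be filled in with a small helper. This is a minimal sketch, not part of the model or of llama.cpp: the function name `build_chatml_prompt` and its parameters are hypothetical, and it assumes a single-turn conversation and a newline after the final `<|im_start|>assistant` tag.

```python
# Hypothetical helper: fills in the single-turn ChatML template shown above.
DEFAULT_SYSTEM_PROMPT = "You are a helpful assistant."


def build_chatml_prompt(user_message: str,
                        system_prompt: str = DEFAULT_SYSTEM_PROMPT) -> str:
    """Return the prompt string for one system turn and one user turn."""
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        # Assumption: the assistant tag is followed by a newline so the
        # model starts generating the reply on the next line.
        "<|im_start|>assistant\n"
    )


if __name__ == "__main__":
    print(build_chatml_prompt("Hello, how are you?"))
```

Generation is expected to end when the model emits `<|im_end|>`, so a caller would typically treat that token as the stop marker.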

Qwen Research License (following the base model license)

Format: GGUF
Model size: 3B params
Architecture: qwen2
Quantization: 4-bit

