---
base_model: Qwen/Qwen2.5-Coder-0.5B
datasets: None
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- torch
- trl
- unsloth
- llama
- gguf
---

# Uploaded model

- **Developed by:** student-abdullah
- **License:** apache-2.0
- **Quantized from model:** Qwen2.5-Coder-0.5B
- **Created on:** 6th July, 2025

---

# Acknowledgement
---

# Quantization Description

This model was quantized from the Qwen2.5-Coder-0.5B base model using *selective quantization* to increase its inference speed while preserving its ability to generate relevant and accurate responses related to Python programming. The quantization kept the following layers at *32-bit* precision (a small inspection sketch follows the layer table below):
- q_proj
- k_proj
- v_proj
- o_proj
- down_proj
- gate_proj
- up_proj
- lm_head

The remaining layers were quantized to *q3_k_l*.

---

# Model Description

| Layer Name | Role (Short) | Type |
| ---------------------------- | ---------------------------------------------------------- | -------------- |
| `q_proj`, `k_proj`, `v_proj` | Compute query, key, and value for the attention mechanism  | Attention Proj |
| `o_proj`                     | Projects attention output back to the model hidden size    | Attention Proj |
| `down_proj`                  | Projects MLP output down to the hidden size                | MLP            |
| `gate_proj`                  | First part of the gated MLP, controls information flow     | MLP            |
| `up_proj`                    | Expands the hidden size in the MLP                         | MLP            |
| `lm_head`                    | Final linear layer producing logits                        | Output Head    |
| `embed_tokens`               | Token embedding layer                                      | Input Embed    |
| `norm`                       | Final layer norm                                           | Normalization  |
| `*_layernorm`                | Normalize inputs to layers                                 | Normalization  |
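To check which tensors were left at 32-bit precision and which were quantized, the GGUF file can be inspected with the `gguf` Python package (`pip install gguf`). This is a minimal sketch, not part of the release; the filename below is a placeholder for the actual GGUF file in this repository.

```python
# Sketch: list every tensor in the GGUF file together with its quantization
# type, so the selective-quantization scheme described above can be checked.
from gguf import GGUFReader

# Placeholder filename - substitute the actual GGUF file from this repo.
reader = GGUFReader("qwen2.5-coder-0.5b-selective-q3_k_l.gguf")

for tensor in reader.tensors:
    # tensor.tensor_type is a GGMLQuantizationType enum member (e.g. F32, Q3_K)
    print(f"{tensor.name:45s} {tensor.tensor_type.name}")
```

Tensors belonging to the projection layers listed above should report `F32`, while the remaining tensors should report the lower-bit block types used by the *q3_k_l* preset.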
---

# Model Architecture

```
Qwen2ForCausalLM(
  (model): Qwen2Model(
    (embed_tokens): Embedding(151936, 896, padding_idx=151665)
    (layers): ModuleList(
      (0-23): 24 x Qwen2DecoderLayer(
        (self_attn): Qwen2Attention(
          (q_proj): Linear(in_features=896, out_features=896, bias=True)
          (k_proj): Linear(in_features=896, out_features=128, bias=True)
          (v_proj): Linear(in_features=896, out_features=128, bias=True)
          (o_proj): Linear(in_features=896, out_features=896, bias=False)
          (rotary_emb): LlamaRotaryEmbedding()
        )
        (mlp): Qwen2MLP(
          (gate_proj): Linear(in_features=896, out_features=4864, bias=False)
          (up_proj): Linear(in_features=896, out_features=4864, bias=False)
          (down_proj): Linear(in_features=4864, out_features=896, bias=False)
          (act_fn): SiLU()
        )
        (input_layernorm): Qwen2RMSNorm((896,), eps=1e-06)
        (post_attention_layernorm): Qwen2RMSNorm((896,), eps=1e-06)
      )
    )
    (norm): Qwen2RMSNorm((896,), eps=1e-06)
    (rotary_emb): LlamaRotaryEmbedding()
  )
  (lm_head): Linear(in_features=896, out_features=151936, bias=False)
)
```
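The module listing above reflects the unquantized base architecture. As a minimal sketch (assuming `transformers` and `torch` are installed), it can be reproduced by loading `Qwen/Qwen2.5-Coder-0.5B` and printing the module tree; note that the exact module names can vary slightly between transformers versions.

```python
# Sketch: reproduce the architecture listing above from the base checkpoint.
# This loads the original Qwen/Qwen2.5-Coder-0.5B weights, not the GGUF file.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-0.5B")
print(model)  # prints the Qwen2ForCausalLM module tree shown above

# Cross-check the projection layers from the Model Description table
for name, module in model.named_modules():
    if name.endswith(("q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj")):
        print(name, module)
```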
---

# Performance & Limitations

- YET TO BE EXAMINED

---

# Model Performance Evaluation

- YET TO BE EVALUATED