---
base_model: unsloth/llama-3.2-3b-instruct
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
license: apache-2.0
language:
- en
- sw
datasets:
- saillab/alpaca_swahili_taco
metrics:
- bleu
- accuracy
- cer
- rouge
pipeline_tag: text-generation
---

# 🧠 SALAMA LLM – Swahili Instruction-Tuned Text Generation Model

**👨‍💻 Developer:** AI4NNOV
**✍️ Authors:** AI4NNOV
**📦 Version:** v1.0
**📜 License:** Apache 2.0
**🛠️ Model Type:** Instruction-Tuned Large Language Model
**🧩 Base Model:** `Jacaranda/UlizaLlama`

---

## 🌍 Overview

**SALAMA LLM** is the **language understanding and generation engine** of the **SALAMA Framework**, a modular Speech-to-Speech (STS) AI pipeline built for African languages. The model is fine-tuned on Swahili instruction datasets to enable natural, culturally relevant responses in text generation, summarization, question answering, and translation.

This model represents a major step in bridging the linguistic digital divide by providing **high-quality Swahili AI text generation** capabilities within an open, scalable framework.

---

## 🧱 Model Architecture

SALAMA LLM is based on **Jacaranda/UlizaLlama**, fine-tuned using **Parameter-Efficient Fine-Tuning (PEFT)** via **LoRA/QLoRA**. The architecture supports mixed Swahili-English text inputs while focusing on fluent Swahili text generation for both casual and formal domains.

| Parameter | Value |
|------------|--------|
| **Base Model** | `Jacaranda/UlizaLlama` |
| **Fine-Tuning** | QLoRA / LoRA (PEFT) |
| **Precision** | 4-bit quantization |
| **Optimizer** | AdamW |
| **Learning Rate** | 2e-5 |
| **Epochs** | 3–5 |
| **Frameworks** | Transformers, TRL, PEFT, Unsloth |
| **Languages** | Swahili (sw), English (en) |
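The training script itself is not part of this card. As an illustration only, the sketch below shows how a QLoRA setup matching the table above (4-bit base model, LoRA adapters, AdamW at 2e-5) could be configured with Transformers and PEFT; the LoRA rank, alpha, dropout, and target modules are assumed values, not the released configuration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "Jacaranda/UlizaLlama"

# Load the base model in 4-bit (NF4) precision, matching the "Precision" row above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Attach LoRA adapters to the attention projections
# (rank, alpha, dropout, and target_modules are illustrative, not the released config)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# Training would then proceed with TRL's SFTTrainer (or Unsloth) using AdamW,
# a learning rate of 2e-5, and 3-5 epochs over the instruction data listed below.
```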
---

## 📚 Datasets

| Dataset | Description | Purpose |
|----------|--------------|----------|
| `saillab/alpaca_swahili_taco` | Swahili Alpaca-style instruction-response dataset | Instruction tuning |
| `Jacaranda/kiswallama-pretrained` | 321M Swahili tokens, custom tokenizer (20K vocab) | Base Swahili adaptation |
| Custom Swahili QA corpus | Curated Q&A and summarization samples | Conversational fine-tuning |

---

## 🧠 Model Capabilities

✅ Text generation in **Swahili and English**
✅ Instruction-following, summarization, and dialogue
✅ Question answering and translation (EN ↔ SW)
✅ Sentiment and named-entity recognition
✅ Contextually and culturally aligned text generation

---

## 📊 Evaluation Metrics

| Metric | Score | Description |
|---------|-------|-------------|
| **BLEU** | 0.49 | Measures fluency and translation accuracy |
| **ROUGE-L** | 0.61 | Summarization recall and overlap |
| **Accuracy (QA)** | 95.5% | Accuracy on Swahili QA tasks |
| **CER** | 0.28 | Character Error Rate |
| **F1 (avg)** | 0.90+ | Weighted average across tasks |

A minimal sketch of how such metrics can be computed with the Hugging Face `evaluate` library is included at the end of this card.

---

## ⚙️ Usage (Python Example)

Below is a quick example to load and use **SALAMA LLM** for Swahili text generation:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load model and tokenizer
model_name = "EYEDOL/salama-llm"  # Change to your Hugging Face repo name
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Swahili text prompt
prompt = "Andika sentensi fupi kuhusu umuhimu wa elimu."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=120,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.05
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

**🦩 Example Output:**
> "Elimu ni msingi wa maendeleo, humwezesha mtu kuelewa dunia na kuboresha maisha yake na jamii kwa ujumla."

---

## ⚡ Key Features

- 🧩 Optimized for African low-resource NLP contexts
- 💬 Instruction-following in Swahili and English
- ⚙️ Lightweight and efficient (QLoRA fine-tuned; runs on a single 24 GB GPU)
- 🌍 Culturally aligned text generation
- 🦢 Open-source and extendable to other African languages

---

## 🚫 Limitations

- ⚠️ May underperform with heavy code-switching (Swahili-English mix)
- 👀 Not yet optimized for rare dialects or poetic forms
- 📚 Limited exposure to specialized (medical/legal) corpora
- 🔊 Relies on accurate STT transcription in end-to-end speech-to-speech use

---

## 🔗 Related Models

| Model | Description |
|--------|-------------|
| [`EYEDOL/salama-stt`](https://huggingface.co/EYEDOL/salama-stt) | Swahili Speech-to-Text model (Whisper-small fine-tuned) |
| [`EYEDOL/salama-tts`](https://huggingface.co/EYEDOL/salama-tts) | Swahili Text-to-Speech model (VITS architecture) |

---

## 🧾 Citation

If you use **SALAMA LLM**, please cite:

```bibtex
@misc{salama_llm_2025,
  title={SALAMA LLM: Swahili Instruction-Tuned Text Generation Model},
  author={AI4NNOV},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/EYEDOL/salama-llm}}
}
```

---

**💡 "Elimu ni msingi wa maendeleo – Knowledge is the foundation of progress."**
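---

## 🧪 Appendix: Metric Computation Sketch

This is the metric-computation sketch referenced under Evaluation Metrics. It is illustrative only: it uses the Hugging Face `evaluate` library (with `jiwer` for CER and `rouge_score` for ROUGE), and the predictions and references below are placeholders, not data from the actual evaluation harness used for this card.

```python
# pip install evaluate jiwer rouge_score
import evaluate

bleu = evaluate.load("bleu")
rouge = evaluate.load("rouge")
cer = evaluate.load("cer")

# Placeholder Swahili model outputs and gold references
predictions = ["Elimu ni msingi wa maendeleo ya mtu na jamii."]
references = ["Elimu ni msingi wa maendeleo kwa mtu binafsi na jamii."]

# BLEU expects a list of reference lists per prediction
print(bleu.compute(predictions=predictions, references=[[r] for r in references]))
# ROUGE output includes the rougeL score reported in the table above
print(rouge.compute(predictions=predictions, references=references))
# Character Error Rate
print(cer.compute(predictions=predictions, references=references))
```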