---
base_model: unsloth/llama-3.2-3b-instruct
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
license: apache-2.0
language:
- en
- sw
datasets:
- saillab/alpaca_swahili_taco
metrics:
- bleu
- accuracy
- cer
- rouge
pipeline_tag: text-generation
---

# 🧠 SALAMA LLM – Swahili Instruction-Tuned Text Generation Model

**👨‍💻 Developer:** AI4NNOV
**✍️ Authors:** AI4NNOV
**📦 Version:** v1.0
**📜 License:** Apache 2.0
**🛠️ Model Type:** Instruction-Tuned Large Language Model
**🧩 Base Model:** `Jacaranda/UlizaLlama`

---

## 🌍 Overview

**SALAMA LLM** is the **language understanding and generation engine** of the **SALAMA Framework**, a modular Speech-to-Speech (STS) AI pipeline built for African languages. The model is fine-tuned on Swahili instruction datasets to enable natural, culturally relevant responses in text generation, summarization, question answering, and translation.

This model represents a major step in bridging the linguistic digital divide by providing **high-quality Swahili AI text generation** capabilities within an open, scalable framework.

---

## 🧱 Model Architecture

SALAMA LLM is based on **Jacaranda/UlizaLlama**, fine-tuned using **Parameter-Efficient Fine-Tuning (PEFT)** via **LoRA/QLoRA**. The architecture supports mixed Swahili-English text inputs while focusing on fluent Swahili text generation for both casual and formal domains.

| Parameter | Value |
|------------|--------|
| **Base Model** | `Jacaranda/UlizaLlama` |
| **Fine-Tuning** | QLoRA / LoRA (PEFT) |
| **Precision** | 4-bit quantization |
| **Optimizer** | AdamW |
| **Learning Rate** | 2e-5 |
| **Epochs** | 3–5 |
| **Frameworks** | Transformers, TRL, PEFT, Unsloth |
| **Languages** | Swahili (sw), English (en) |
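The training script itself is not part of this card. As an illustration only, the sketch below shows how a QLoRA setup matching the table above (4-bit base model, LoRA adapters, AdamW at 2e-5) could be configured with Transformers and PEFT; the LoRA rank, alpha, dropout, and target modules are assumed values, not the released configuration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "Jacaranda/UlizaLlama"

# Load the base model in 4-bit (NF4) precision, matching the "Precision" row above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Attach LoRA adapters to the attention projections
# (rank, alpha, dropout, and target_modules are illustrative, not the released config)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# Training would then proceed with TRL's SFTTrainer (or Unsloth) using AdamW,
# a learning rate of 2e-5, and 3-5 epochs over the instruction data listed below.
```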
---

## 📚 Datasets

| Dataset | Description | Purpose |
|----------|--------------|----------|
| `saillab/alpaca_swahili_taco` | Swahili Alpaca-style instruction-response dataset | Instruction tuning |
| `Jacaranda/kiswallama-pretrained` | 321M Swahili tokens, custom tokenizer (20K vocab) | Base Swahili adaptation |
| Custom Swahili QA corpus | Curated Q&A and summarization samples | Conversational fine-tuning |

---

## 🧠 Model Capabilities

✅ Text generation in **Swahili and English**
✅ Instruction-following, summarization, and dialogue
✅ Question answering and translation (EN ↔ SW)
✅ Sentiment and named-entity recognition
✅ Contextually and culturally aligned text generation

---

## 📊 Evaluation Metrics

| Metric | Score | Description |
|---------|-------|-------------|
| **BLEU** | 0.49 | Measures fluency and translation accuracy |
| **ROUGE-L** | 0.61 | Summarization recall and overlap |
| **Accuracy (QA)** | 95.5% | Accuracy on Swahili QA tasks |
| **CER** | 0.28 | Character Error Rate |
| **F1 (avg)** | 0.90+ | Weighted average across tasks |

A minimal sketch of how such metrics can be computed with the Hugging Face `evaluate` library is included at the end of this card.

---

## ⚙️ Usage (Python Example)

Below is a quick example to load and use **SALAMA LLM** for Swahili text generation:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load model and tokenizer
model_name = "EYEDOL/salama-llm"  # Change to your Hugging Face repo name
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Swahili text prompt
prompt = "Andika sentensi fupi kuhusu umuhimu wa elimu."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=120,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.05
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

**🦩 Example Output:**
> "Elimu ni msingi wa maendeleo, humwezesha mtu kuelewa dunia na kuboresha maisha yake na jamii kwa ujumla."

---

## ⚡ Key Features

- 🧩 Optimized for African low-resource NLP contexts
- 💬 Instruction-following in Swahili and English
- ⚙️ Lightweight and efficient (QLoRA fine-tuned; runs on a single 24 GB GPU)
- 🌍 Culturally aligned text generation
- 🦢 Open-source and extendable to other African languages

---

## 🚫 Limitations

- ⚠️ May underperform with heavy code-switching (Swahili-English mix)
- 👀 Not yet optimized for rare dialects or poetic forms
- 📚 Limited exposure to specialized (medical/legal) corpora
- 🔊 Relies on accurate STT transcription in end-to-end speech-to-speech use

---

## 🔗 Related Models

| Model | Description |
|--------|-------------|
| [`EYEDOL/salama-stt`](https://huggingface.co/EYEDOL/salama-stt) | Swahili Speech-to-Text model (Whisper-small fine-tuned) |
| [`EYEDOL/salama-tts`](https://huggingface.co/EYEDOL/salama-tts) | Swahili Text-to-Speech model (VITS architecture) |

---

## 🧾 Citation

If you use **SALAMA LLM**, please cite:

```bibtex
@misc{salama_llm_2025,
  title={SALAMA LLM: Swahili Instruction-Tuned Text Generation Model},
  author={AI4NNOV},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/EYEDOL/salama-llm}}
}
```

---

**💡 "Elimu ni msingi wa maendeleo – Knowledge is the foundation of progress."**
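---

## 🧪 Appendix: Metric Computation Sketch

This is the metric-computation sketch referenced under Evaluation Metrics. It is illustrative only: it uses the Hugging Face `evaluate` library (with `jiwer` for CER and `rouge_score` for ROUGE), and the predictions and references below are placeholders, not data from the actual evaluation harness used for this card.

```python
# pip install evaluate jiwer rouge_score
import evaluate

bleu = evaluate.load("bleu")
rouge = evaluate.load("rouge")
cer = evaluate.load("cer")

# Placeholder Swahili model outputs and gold references
predictions = ["Elimu ni msingi wa maendeleo ya mtu na jamii."]
references = ["Elimu ni msingi wa maendeleo kwa mtu binafsi na jamii."]

# BLEU expects a list of reference lists per prediction
print(bleu.compute(predictions=predictions, references=[[r] for r in references]))
# ROUGE output includes the rougeL score reported in the table above
print(rouge.compute(predictions=predictions, references=references))
# Character Error Rate
print(cer.compute(predictions=predictions, references=references))
```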