# 🧠 SALAMA LLM: Swahili Instruction-Tuned Text Generation Model

**👨‍💻 Developer:** AI4NNOV
**✍️ Authors:** AI4NNOV
**📦 Version:** v1.0
**📜 License:** Apache 2.0
**🛠️ Model Type:** Instruction-Tuned Large Language Model
**🧩 Base Model:** `Jacaranda/UlizaLlama`

---

## 📖 Overview

**SALAMA LLM** is the **language understanding and generation engine** of the **SALAMA Framework**, a modular Speech-to-Speech (STS) AI pipeline built for African languages.
The model is fine-tuned on Swahili instruction datasets to enable natural, culturally relevant responses in text generation, summarization, question answering, and translation.

This model is a major step toward bridging the linguistic digital divide, providing **high-quality Swahili AI text generation** capabilities within an open, scalable framework.

---

## 🧱 Model Architecture

SALAMA LLM is based on **Jacaranda/UlizaLlama**, fine-tuned using **Parameter-Efficient Fine-Tuning (PEFT)** via **LoRA/QLoRA**.
The architecture supports mixed Swahili-English text inputs while focusing on fluent Swahili text generation for both casual and formal domains.

| Parameter | Value |
|-----------|-------|
| **Base Model** | `Jacaranda/UlizaLlama` |
| **Fine-Tuning** | QLoRA / LoRA (PEFT) |
| **Precision** | 4-bit quantization |
| **Optimizer** | AdamW |
| **Learning Rate** | 2e-5 |
| **Epochs** | 3–5 |
| **Frameworks** | Transformers, TRL, PEFT, Unsloth |
| **Languages** | Swahili (sw), English (en) |

---
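
To make the table concrete, below is a minimal sketch of a QLoRA setup with these hyperparameters, using the Transformers + PEFT + bitsandbytes stack. The LoRA rank, alpha, dropout, and target modules are illustrative assumptions; the card does not publish them.

```python
# QLoRA setup sketch; values marked "assumed" are not from the card.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                    # 4-bit quantization (per the table)
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    "Jacaranda/UlizaLlama",               # base model (per the table)
    quantization_config=bnb_config,
    device_map="auto",
)

lora = LoraConfig(
    r=16,                                 # assumed rank
    lora_alpha=32,                        # assumed scaling factor
    lora_dropout=0.05,                    # assumed dropout
    target_modules=["q_proj", "v_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora)
model.print_trainable_parameters()

# Training then runs AdamW at lr=2e-5 for 3-5 epochs (per the table),
# e.g. with trl's SFTTrainer.
```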

## 🧠 Model Capabilities

✅ Text generation in **Swahili and English**
✅ Instruction-following, summarization, and dialogue
✅ Question answering and translation (EN ↔ SW)
✅ Sentiment analysis and named-entity recognition
✅ Contextually and culturally aligned text generation

---

```python
outputs = model.generate(
    # ... generation settings ...
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
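
For a self-contained starting point, here is a minimal inference sketch using the standard Transformers API. The repo ID follows the citation URL below; the prompt, dtype, and sampling settings are illustrative assumptions.

```python
# Minimal inference sketch (illustrative settings, not the card's originals).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EYEDOL/salama-llm"  # repo ID taken from the citation URL
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Swahili instruction: "Explain the importance of education." (illustrative)
prompt = "Eleza umuhimu wa elimu."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```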

**🦩 Example Output:**

> “Elimu ni msingi wa maendeleo, humwezesha mtu kuelewa dunia na kuboresha maisha yake na jamii kwa ujumla.”
> (“Education is the foundation of development; it enables a person to understand the world and improve their own life and society as a whole.”)

---

- 🧩 Optimized for African low-resource NLP contexts
- 💬 Instruction-following in Swahili and English
- ⚙️ Lightweight and efficient (QLoRA fine-tuned; runs on a single 24 GB GPU)
- 🌍 Culturally aligned text generation
- 🦶 Open-source and extendable to other African languages

---

## 🚫 Limitations

- ⚠️ May underperform with heavy code-switching (Swahili-English mix)
- 🤔 Not yet optimized for rare dialects or poetic forms
- 📚 Limited exposure to specialized (medical/legal) corpora
- 🎙️ Relies on accurate STT transcription in end-to-end speech-to-speech use

---

## 🔗 Related Models

| Model | Description |
|-------|-------------|
| [`EYEDOL/salama-stt`](https://huggingface.co/EYEDOL/salama-stt) | Swahili Speech-to-Text model (Whisper-small fine-tuned) |
| [`EYEDOL/salama-tts`](https://huggingface.co/EYEDOL/salama-tts) | Swahili Text-to-Speech model (VITS architecture) |

---

## 🧾 Citation

If you use **SALAMA LLM**, please cite:

```bibtex
@misc{salama_llm_2025,
  title={SALAMA LLM: Swahili Instruction-Tuned Text Generation Model},
  author={AI4NNOV},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/EYEDOL/salama-llm}}
}
```

---

**💡 “Elimu ni msingi wa maendeleo” – “Knowledge is the foundation of progress.”**