EYEDOL committed on
Commit
cd7c5a4
verified
1 Parent(s): d0ead2f

Update README.md

  1. README.md +113 -0
README.md CHANGED
@@ -9,6 +9,15 @@ tags:
  license: apache-2.0
  language:
  - en
+ - sw
+ datasets:
+ - saillab/alpaca_swahili_taco
+ metrics:
+ - bleu
+ - accuracy
+ - cer
+ - rouge
+ pipeline_tag: text-generation
  ---
 
  # Uploaded model
@@ -20,3 +29,107 @@ language:
  This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
 
  [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
+
+
+ # Model Card: SALAMA LLM
+
+ **Model Name:** SALAMA LLM
+ **Developed by:** [Your Team or Organization Name]
+ **Model Type:** Large Language Model (LLM)
+ **Base Models:** UlizaLlama-7B, Llama 3.2, Google Gemma (2B–9B)
+ **Language(s):** Swahili, English
+ **License:** Apache 2.0
+ **Repository:** [Hugging Face Link Here]
+
+ ---
+
+ ## Overview
+
+ SALAMA LLM is the central **language understanding and generation module** within the **SALAMA Framework**, a scalable, end-to-end **speech-to-speech AI system** for African languages.
+ It interprets transcribed speech, performs reasoning, and generates contextually appropriate responses in Swahili and English.
+
+ This model was fine-tuned on Swahili-centric instruction data to enhance fluency, comprehension, and cultural relevance for conversational and task-based applications.
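
The position of the LLM inside the speech-to-speech flow described above can be sketched roughly as follows. This is a minimal illustration, not the SALAMA implementation: the function `speech_to_speech` and the three stage callables are hypothetical names, and the stub stages only stand in for real STT, LLM, and TTS models.

```python
from typing import Callable

# Hypothetical stage signatures; none of these are SALAMA APIs.
SpeechToText = Callable[[bytes], str]
TextGenerator = Callable[[str], str]
TextToSpeech = Callable[[str], bytes]


def speech_to_speech(audio: bytes, stt: SpeechToText,
                     llm: TextGenerator, tts: TextToSpeech) -> bytes:
    """Chain the three stages: transcribe, generate a reply, synthesize audio."""
    transcript = stt(audio)   # transcription errors made here reach the LLM
    reply = llm(transcript)   # the role SALAMA LLM plays in the framework
    return tts(reply)


# Stub stages so the sketch runs without any model weights.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(text: str) -> str:
    return f"Jibu: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


result = speech_to_speech(b"Habari yako?", fake_stt, fake_llm, fake_tts)
```

Passing the stages in as callables keeps the LLM swappable and makes it easy to test the text stage on its own with reference transcripts.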
+
+ ---
+
+ ## ✳️ Architecture
+
+ SALAMA LLM builds on **UlizaLlama (7B)** and uses **Parameter-Efficient Fine-Tuning (PEFT)** with **LoRA/QLoRA** for resource-efficient adaptation.
+ Training was conducted on a mixture of:
+ - Instructional and dialogue datasets in Swahili and English
+ - Domain-specific corpora for comprehension, summarization, question answering, and translation
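
To give a feel for why LoRA counts as resource-efficient, the sketch below counts the trainable parameters a low-rank adapter adds. Every number in it is an assumption rather than a value from this card: it assumes Llama-2-7B-style dimensions (hidden size 4096, 32 layers), a hypothetical rank of 16, and adapters on two square projection matrices per layer.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """LoRA adds A (rank x d_in) and B (d_out x rank) beside a frozen d_out x d_in weight."""
    return rank * (d_in + d_out)


# Assumed values, not taken from the model card.
HIDDEN_SIZE = 4096      # Llama-2-7B-style hidden size
NUM_LAYERS = 32         # Llama-2-7B-style decoder depth
RANK = 16               # hypothetical LoRA rank
ADAPTED_PER_LAYER = 2   # hypothetical: two square projections per layer

per_matrix = lora_trainable_params(HIDDEN_SIZE, HIDDEN_SIZE, RANK)
total_trainable = per_matrix * ADAPTED_PER_LAYER * NUM_LAYERS
full_matrix = HIDDEN_SIZE * HIDDEN_SIZE

print(f"adapter params per matrix: {per_matrix:,}")
print(f"total trainable params:    {total_trainable:,}")
print(f"fraction of one matrix:    {per_matrix / full_matrix:.4%}")
```

Under these assumptions each adapter is under 1% of the matrix it modifies, which is what lets fine-tuning fit on a single GPU.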
+
+ ---
+
+ ## 🧾 Training Data
+
+ | Dataset | Source | Tokens / Examples | Purpose |
+ |---------|--------|-------------------|---------|
+ | Jacaranda/kiswallama-pretrained | Hugging Face | 321M Swahili tokens | Base pretraining |
+ | Google Gemma Swahili Fine-tuning | Internal dataset | 20+ prompt-response pairs | Instruction tuning |
+ | Custom Swahili QA corpus | Local compilation | 50K examples | Conversational fine-tuning |
+
+ ---
+
+ ## ⚙️ Training Details
+
+ - **Technique:** QLoRA Fine-tuning
+ - **Precision:** 4-bit quantization
+ - **Optimizer:** AdamW
+ - **Learning Rate:** 2e-5
+ - **Batch Size:** 8
+ - **Epochs:** 3–5
+ - **Hardware:** 1x A100 (24GB)
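
As a rough sanity check on the settings above, the sketch below works out optimizer steps and the approximate size of 4-bit weights. It assumes only the 50K-example QA corpus is used, no gradient accumulation, a single device, and a 7-billion-parameter base. The memory figure covers quantized weights only and ignores quantization constants, adapters, activations, and optimizer state.

```python
import math

BATCH_SIZE = 8            # from the list above
EPOCH_RANGE = (3, 5)      # from the list above
NUM_EXAMPLES = 50_000     # assumption: only the 50K-example QA corpus
NUM_PARAMS = 7e9          # assumption: 7B-parameter base model
BITS_PER_WEIGHT = 4       # from the 4-bit quantization setting

# One optimizer step per batch, assuming no gradient accumulation.
steps_per_epoch = math.ceil(NUM_EXAMPLES / BATCH_SIZE)
total_steps = tuple(steps_per_epoch * epochs for epochs in EPOCH_RANGE)

# Quantized weights only, in gigabytes (1 GB = 1e9 bytes).
weights_gb = NUM_PARAMS * BITS_PER_WEIGHT / 8 / 1e9

print(f"steps per epoch: {steps_per_epoch}")
print(f"total steps for {EPOCH_RANGE[0]} to {EPOCH_RANGE[1]} epochs: {total_steps}")
print(f"approx. 4-bit weight size: {weights_gb:.1f} GB")
```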
+
+ ---
+
+ ## 🧠 Capabilities
+
+ - Contextual understanding of Swahili and English queries
+ - Instruction following and summarization
+ - Question answering and translation
+ - Conversational generation
+ - Named entity recognition and sentiment analysis
+
+ ---
+
+ ## 📊 Evaluation Metrics
+
+ | Task | Precision | Recall | F1 | BLEU | ROUGE | Accuracy |
+ |------|-----------|--------|----|------|-------|----------|
+ | Question Answering | 0.955 | 0.782 | 0.879 | 0.50 | 0.61 | n/a |
+ | Translation | n/a | n/a | n/a | 0.49 | 0.59 | n/a |
+ | Sentiment Analysis | 0.968 | 0.943 | 0.954 | n/a | n/a | 0.979 |
+ | Entity Recognition | 0.853 | 0.847 | 0.887 | n/a | n/a | n/a |
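
For readers who want to relate the precision, recall, and F1 columns, the sketch below recomputes F1 as the harmonic mean of the precision and recall in each row. It is only a cross-check under the assumption that each row reports a single (micro-style) precision and recall pair. If the card's figures were averaged per class, which it does not state, the recomputed values need not match the F1 column.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the table above.
reported = {
    "Question Answering": (0.955, 0.782),
    "Sentiment Analysis": (0.968, 0.943),
    "Entity Recognition": (0.853, 0.847),
}

recomputed = {task: round(f1_score(p, r), 3) for task, (p, r) in reported.items()}

for task, value in recomputed.items():
    print(f"{task}: F1 from precision and recall = {value:.3f}")
```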
+
+ ---
+
+ ## 🚀 Applications
+
+ - Conversational voice assistants for Swahili
+ - Educational bots and content summarizers
+ - Low-resource multilingual chat systems
+ - Research in African LLM adaptation
+
+ ---
+
+ ## 🧩 Limitations
+
+ - Performance degrades on code-mixed (Swahili-English) slang
+ - May misinterpret rare dialectal expressions
+ - Depends on speech-to-text (STT) transcription accuracy when used in the full speech-to-speech (STS) pipeline
+
+ ---
+
+ ## 🤝 Citation
+
+ If you use this model, please cite:
+
+ > Adegoke Israel et al. (2025). *SALAMA: Scalable African Language Multimodal AI Framework*. Technical Report.
+
+ ---
+
+ ## 🔗 Related Models
+
+ - [`SALAMA-STT`](https://huggingface.co/yourname/salama-stt): Whisper fine-tuned for Swahili speech-to-text
+ - [`SALAMA-TTS`](https://huggingface.co/yourname/salama-tts): VITS-based Swahili text-to-speech