tawkeed-embedding
tawkeed-embedding is an Arabic-first text embedding model built by Tawkeed, fine-tuned for on-device and edge AI deployment.
Forked from BAAI/bge-m3 and fine-tuned on Arabic semantic similarity and retrieval data, this model powers Arabic search, RAG, and similarity tasks running natively on Tawkeed devices.
Highlights
- Arabic-first embeddings: fine-tuned and evaluated on Arabic text for semantic similarity and retrieval
- Edge-optimized: efficient enough to run embedding pipelines on Tawkeed edge hardware
- Production-ready: validated on Arabic retrieval and similarity benchmarks
- Multilingual: retains the multilingual capability of the BGE-M3 base model
Model Details
| Property | Value |
|---|---|
| Base Model | BAAI/bge-m3 |
| Language | Arabic (ar), English (en), + multilingual |
| License | MIT |
| Task | Text Embedding / Retrieval / Similarity |
| Fine-tuning | Arabic semantic similarity & retrieval data |
| Deployment | On-device / Edge / Cloud |
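For on-device deployment, the memory taken by stored vectors can matter as much as the model itself. The sketch below works through that arithmetic for a flat, uncompressed index. It assumes the fine-tuned model keeps the 1024-dimensional dense output of BGE-M3, which this card does not state; `model.get_sentence_embedding_dimension()` reports the actual value. The function name `index_size_bytes` and the int8 storage option are illustrative, not part of the model.

```python
def index_size_bytes(num_vectors: int, dim: int = 1024, bytes_per_value: int = 4) -> int:
    """Raw storage for a flat vector index, excluding IDs and metadata."""
    return num_vectors * dim * bytes_per_value


# Assumption: dim=1024 is the dense embedding size of BGE-M3;
# the fine-tuned model may differ, so verify before sizing hardware.
float32_size = index_size_bytes(100_000)                    # 4 bytes per value
int8_size = index_size_bytes(100_000, bytes_per_value=1)    # hypothetical int8 storage

print(f"float32: {float32_size / 2**20:.1f} MiB")
print(f"int8:    {int8_size / 2**20:.1f} MiB")
```

Under these assumptions, 100,000 float32 vectors need roughly four times the space of the same vectors stored as int8, which is the usual trade-off to weigh against any loss in retrieval quality.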
Training
The model was adapted for Arabic embeddings in three steps:
- Start from the BGE-M3 multilingual embedding model
- Fine-tune on Arabic semantic similarity and retrieval datasets
- Evaluate on Arabic retrieval benchmarks
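The card does not say which metrics were used in the evaluation step. As one common choice for retrieval benchmarks, recall@k can be sketched as follows. The function names and the toy document IDs are hypothetical and are not results from this model.

```python
from typing import Iterable, Sequence, Set, Tuple


def recall_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)


def mean_recall_at_k(runs: Iterable[Tuple[Sequence[str], Set[str]]], k: int) -> float:
    """Average recall@k over (ranked_ids, relevant_ids) pairs, one pair per query."""
    scores = [recall_at_k(ranked, relevant, k) for ranked, relevant in runs]
    return sum(scores) / len(scores) if scores else 0.0


# Toy data with made-up document IDs (assumes each ranked list has no duplicates).
runs = [
    (["d3", "d1", "d7"], {"d1", "d9"}),  # one of two relevant documents retrieved
    (["d2", "d5", "d4"], {"d2"}),        # the only relevant document is ranked first
]
print(mean_recall_at_k(runs, k=3))
```

In a real evaluation, `ranked_ids` would come from ranking a corpus by embedding similarity to each query, and `relevant_ids` from the benchmark's relevance labels.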
Usage
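```python
from sentence_transformers import SentenceTransformer

# Load the model from the Hugging Face Hub
model = SentenceTransformer("tawkeed-sa/tawkeed-embedding")

sentences = [
    "الذكاء الاصطناعي يغير العالم",      # "Artificial intelligence is changing the world"
    "تقنيات التعلم العميق تتطور بسرعة",  # "Deep learning techniques are evolving rapidly"
    "الطقس جميل اليوم",                  # "The weather is nice today"
]

# Encode each sentence into one dense vector
embeddings = model.encode(sentences)
print(embeddings.shape)  # (number of sentences, embedding dimension)
```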
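The vectors returned by `model.encode` are typically compared with cosine similarity for search and RAG. The sketch below shows that ranking step with only the standard library, using toy 3-dimensional vectors in place of real embeddings so it runs without downloading the model. The helper names `cosine_similarity` and `rank_by_similarity` are illustrative.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs for the top_k most similar documents."""
    scored = [(i, cosine_similarity(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors standing in for real embeddings (assumption: not model output).
docs = [[1.0, 0.0, 0.0], [0.7, 0.7, 0.0], [0.0, 0.0, 1.0]]
query = [0.9, 0.1, 0.0]
print(rank_by_similarity(query, docs, top_k=2))
```

With the real model, the rows of `model.encode(sentences)` would serve as `doc_vecs` and the encoding of a query string as `query_vec`. For larger batches, `sentence_transformers.util.cos_sim` computes the same similarity on tensors.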
Tawkeed Model Family
The Tawkeed family is a suite of Arabic AI models, ranging from compact edge models to large-scale MoE models, all fine-tuned and tested for Arabic.
| Model | Size | Type |
|---|---|---|
| tawkeed-sa/tawkeed-0.8b | 0.8B | Arabic LLM |
| tawkeed-sa/tawkeed-2b | 2B | Arabic LLM |
| tawkeed-sa/tawkeed-4b | 4B | Arabic LLM |
| tawkeed-sa/tawkeed-9b | 9B | Arabic LLM |
| tawkeed-sa/tawkeed-27b | 27B | Arabic LLM |
| tawkeed-sa/tawkeed-40b | 40B | Arabic LLM |
| tawkeed-sa/tawkeed-27b-MLX | 27B, 8-bit | LLM for Apple Silicon (MLX) |
| tawkeed-sa/tawkeed-27b-GGUF | 27B, Q8_0 | LLM for Ollama / llama.cpp |
| tawkeed-sa/tawkeed-ocr | — | OCR |
| tawkeed-sa/tawkeed-embedding | — | Embedding |
About Tawkeed
Tawkeed builds Arabic-native AI that runs on the edge. Every model in the family is fine-tuned for Arabic, tested on Arabic benchmarks, and optimized for deployment on Tawkeed devices.
Built by Tawkeed.