tawkeed-embedding

tawkeed-embedding is an Arabic-first text embedding model built by Tawkeed, fine-tuned for on-device and edge AI deployment.

Forked from BAAI/bge-m3 and fine-tuned on Arabic semantic similarity and retrieval data, this model powers Arabic search, RAG, and similarity tasks running natively on Tawkeed devices.

Highlights

  • Arabic-first embeddings: fine-tuned and tested on Arabic text for semantic similarity and retrieval
  • Edge-optimized: efficient enough to run embedding pipelines on Tawkeed edge hardware
  • Production-ready: validated on Arabic retrieval and similarity benchmarks
  • Multilingual: retains strong multilingual capability from BGE-M3

Model Details

Property      Value
Base Model    BAAI/bge-m3
Language      Arabic (ar), English (en), + multilingual
License       MIT
Task          Text Embedding / Retrieval / Similarity
Fine-tuning   Arabic semantic similarity & retrieval data
Deployment    On-device / Edge / Cloud

Training

The model was produced in three steps:

  1. Fork of the BGE-M3 multilingual embedding model
  2. Fine-tuning on Arabic semantic similarity and retrieval datasets
  3. Evaluation on Arabic retrieval benchmarks
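
The card does not say which benchmarks or metrics were used in step 3. Purely as an illustration, retrieval quality is commonly summarized with recall@k and mean reciprocal rank (MRR). The sketch below assumes each query comes with a ranked list of retrieved document IDs and a set of relevant IDs; the function names and the toy IDs in the usage note are hypothetical and are not taken from Tawkeed's evaluation setup.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    found = set(ranked_ids[:k]) & relevant
    return len(found) / len(relevant)


def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / rank of the first relevant document, or 0.0 if none is retrieved."""
    relevant = set(relevant_ids)
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (ranked_ids, relevant_ids) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ranked, rel) for ranked, rel in runs) / len(runs)
```

For example, with the ranking `["d3", "d1", "d7"]` and relevant set `{"d1", "d2"}`, `recall_at_k(..., k=2)` is 0.5 (one of the two relevant documents is in the top two) and `reciprocal_rank` is 0.5 (the first relevant hit is at rank 2).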

Usage

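from sentence_transformers import SentenceTransformer

# Load the model from the Hugging Face Hub (downloaded on first use)
model = SentenceTransformer("tawkeed-sa/tawkeed-embedding")

# Two sentences about AI and one unrelated sentence about the weather
sentences = [
    "الذكاء الاصطناعي يغير العالم",      # "Artificial intelligence is changing the world"
    "تقنيات التعلم العميق تتطور بسرعة",  # "Deep learning techniques are evolving rapidly"
    "الطقس جميل اليوم"                   # "The weather is nice today"
]

# encode() returns one embedding vector per input sentence
embeddings = model.encode(sentences)
print(embeddings.shape)  # (3, embedding_dim)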
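
For the search and similarity tasks this model targets, embeddings are typically compared with cosine similarity. The following is a minimal sketch rather than Tawkeed's own pipeline: `cosine_similarity`, `rank_documents`, and `search` are hypothetical helper names. The ranking helpers use only the standard library, so they accept any sequence of floats, including rows of the array returned by `model.encode`.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs, top_k=3):
    """Return (index, score) pairs for the top_k most similar documents."""
    scored = [(i, cosine_similarity(query_vec, d)) for i, d in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def search(query, documents, top_k=3):
    """Hypothetical helper: embed with tawkeed-embedding, then rank by cosine.

    Needs the sentence-transformers package; the import is kept local so the
    ranking helpers above can be used without it.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("tawkeed-sa/tawkeed-embedding")
    doc_vecs = model.encode(documents)
    query_vec = model.encode([query])[0]
    ranked = rank_documents(query_vec, doc_vecs, top_k)
    return [(documents[i], score) for i, score in ranked]
```

Calling `search("ما هو الذكاء الاصطناعي؟", sentences, top_k=2)` with the three sentences above would be expected to favor the two AI-related sentences over the weather one, although the actual scores depend on the model. In a real deployment the model would be loaded once and the document embeddings cached instead of being recomputed per query.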

Tawkeed Model Family

The family ranges from compact edge models to large-scale MoE models, all fine-tuned and tested for Arabic.

Model                          Size   Type
tawkeed-sa/tawkeed-0.8b        0.8b   Arabic LLM
tawkeed-sa/tawkeed-2b          2b     Arabic LLM
tawkeed-sa/tawkeed-4b          4b     Arabic LLM
tawkeed-sa/tawkeed-9b          9b     Arabic LLM
tawkeed-sa/tawkeed-27b         27b    Arabic LLM
tawkeed-sa/tawkeed-40b         40b    Arabic LLM
tawkeed-sa/tawkeed-27b-MLX     27b    8-bit LLM, Apple Silicon (MLX)
tawkeed-sa/tawkeed-27b-GGUF    27b    Q8_0 LLM, Ollama / llama.cpp
tawkeed-sa/tawkeed-ocr         -      OCR
tawkeed-sa/tawkeed-embedding   -      Embedding

About Tawkeed

Tawkeed builds Arabic-native AI that runs on the edge. Every model in the family is fine-tuned for Arabic, tested on Arabic benchmarks, and optimized for deployment on Tawkeed devices.

Built by Tawkeed.
