🕌 Whisper-Small-Quran

Fine-tuned Whisper model for Quranic Recitation ASR and Verse Localization

📖 Overview

Whisper-Small-Quran is a fine-tuned version of OpenAI’s Whisper-Small model specialized for Quranic recitations in Arabic.
It’s part of the QV Finder project — a research system for AI-powered Quran verse transcription and retrieval.

Unlike general-purpose ASR systems, this model is tuned to handle Tajweed-influenced pronunciation, regional recitation styles, and noisy real-world recordings. On segments under 30 s it reaches a WER of roughly 10 for both professional and normal reciters.

📘 Reference Paper:

Bashar Al-Rfooh, Mohammad Abdel-Majeed, Shorouq Al-Awawdeh, and Khaled Darabkeh,
“QV Finder: An Accurate Quran Verse Finder System,” Indonesian Journal of Electrical Engineering and Computer Science, Vol. 99, No. 1, 2099

✨ Key Features

🎧 Arabic ASR tuned for Quranic Recitation
🕋 Verse localization via tokenization + FuzzyWuzzy two-stage string matching
📚 Dataset: 64 reciters · 399k verses (~2023 h total)
🧠 Optimized for ≤ 30 s segments, with silence-based segmentation for longer ones
📈 High accuracy:
- Normal Reciters → WER 10.1 · CER 3.3
- Professional Reciters → WER 9.9 · CER 4.1
🔍 100 % retrieval @ 85 % threshold with ≈ 2.5 % false positives

🧩 Intended Use

Task	Description
Speech-to-Text (ASR)	Converts Quranic recitations into Arabic text
Verse Search	Identifies Surah + Ayah from partial recitations
Education / Annotation	Useful for Tajweed feedback, transcription, and archiving

⚠️ Not a general-purpose Arabic ASR model; it is tailored to Quranic speech and Tajweed patterns.

🧠 Model Architecture & Training

Parameter	Value
Base Model	Whisper-Small (244 M params)
Epochs	11
Batch Size	32
Optimizer	Adam (lr 5e-4 · weight decay 0.01)
Seed	3407
Data Augmentation	Noise · Pitch · Tempo
Hardware	NVIDIA A100 (80 GB)

Trained on the EveryAyah dataset, which includes both professional and normal recitations, for robustness across varied Tajweed styles and noise levels.

🔍 Verse Localization Pipeline

After ASR transcription:

Tokenization: Sliding-window n-grams (1–10 words + full verse).
Matching: FuzzyWuzzy ratio + Levenshtein distance.
Threshold: ≥ 85 % similarity → correct match.
Two-Stage Search: Fast coarse filter → accurate rerank → false positives ↓ to ≈ 2.5 %.
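
The steps above can be sketched roughly as follows. This is an illustrative sketch, not the QV Finder implementation: the standard-library `difflib.SequenceMatcher` stands in for FuzzyWuzzy's ratio, the coarse filter (shared-word overlap) is an assumption because the card does not say how the first stage works, and the `VERSES` dictionary, `ngrams`, `similarity`, and `locate` are hypothetical names. The verse texts are assumed to be already normalized (no diacritics).

```python
from difflib import SequenceMatcher

THRESHOLD = 85  # similarity cut-off in percent, as stated in the pipeline

# Hypothetical mini-corpus: (surah, ayah) -> normalized verse text.
VERSES = {
    (1, 2): "الحمد لله رب العالمين",
    (1, 4): "مالك يوم الدين",
    (112, 1): "قل هو الله احد",
}

def ngrams(text, max_n=10):
    """All contiguous word windows of 1..max_n words, plus the full text."""
    words = text.split()
    grams = {text}
    for n in range(1, min(max_n, len(words)) + 1):
        for i in range(len(words) - n + 1):
            grams.add(" ".join(words[i:i + n]))
    return grams

def similarity(a, b):
    """Similarity in percent (difflib used here in place of FuzzyWuzzy)."""
    return 100.0 * SequenceMatcher(None, a, b).ratio()

def locate(query, verses, threshold=THRESHOLD):
    """Return [((surah, ayah), score), ...] sorted by descending score."""
    query_words = set(query.split())
    # Stage 1 (assumed coarse filter): keep verses sharing at least one word.
    candidates = [k for k, v in verses.items() if query_words & set(v.split())]
    # Stage 2 (rerank): best fuzzy score over the verse's n-grams.
    scored = []
    for key in candidates:
        best = max(similarity(query, g) for g in ngrams(verses[key]))
        if best >= threshold:
            scored.append((best, key))
    return [(key, score) for score, key in sorted(scored, reverse=True)]
```

Calling `locate("رب العالمين", VERSES)` matches the partial recitation against the n-grams of each candidate verse, so a fragment can be located without reciting the whole ayah. A word-overlap filter would miss transcripts in which every word is misrecognized; a real coarse stage would likely need to be more tolerant.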

📊 Evaluation Results

Reciter Type	Segment Length	CER	WER	Retrieval Accuracy	False Positives
Professional	< 30 s	4.1	9.9	100 % @ 85 %	2.5 %
Normal	< 30 s	3.3	10.1	100 % @ 85 %	4.1 %
Professional	> 30 s	12.5	24.5	92 %	4.6 %
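
For readers who want to reproduce comparable numbers, the sketch below shows the standard definitions of WER and CER as edit distance divided by reference length. It assumes the scores in the table are percentages and that CER counts spaces as characters; the card does not state either detail, nor the text normalization applied before scoring. The function names are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences
    (substitutions, insertions, and deletions each cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def wer(reference, hypothesis):
    """Word error rate in percent; the reference must be non-empty."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)

def cer(reference, hypothesis):
    """Character error rate in percent; the reference must be non-empty."""
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)
```

Both metrics can exceed 100 when the hypothesis contains many insertions, which is why they are usually reported alongside the segment length.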

🚀 Quick Start

from transformers import WhisperProcessor, WhisperForConditionalGeneration
import librosa
import torch

# Load the fine-tuned processor and model from the Hugging Face Hub
model_name = "basharalrfooh/whisper-small-quran"
processor = WhisperProcessor.from_pretrained(model_name)
model = WhisperForConditionalGeneration.from_pretrained(model_name)
model.config.forced_decoder_ids = None

# Placeholder: replace with the path to your own .wav file
audio_file = "Your .wav file"

# Whisper expects 16 kHz audio; librosa resamples on load
speech_array, sampling_rate = librosa.load(audio_file, sr=16000)

# Convert the waveform to log-mel input features
inputs = processor(speech_array, return_tensors="pt", sampling_rate=sampling_rate)

with torch.no_grad():
    predicted_ids = model.generate(inputs["input_features"])

# Decode token IDs to text (batch_decode returns one string per input)
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]

print(f"Transcription: {transcription}")
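
For recordings longer than 30 s, the card mentions silence-based segmentation. The sketch below shows one possible way to do it, not necessarily the authors' method: given non-silent `(start, end)` sample intervals, for example from `librosa.effects.split`, it greedily packs consecutive intervals into chunks of at most 30 s so that cuts fall on silences. The helper name `pack_intervals`, the hard split of over-long intervals, and the `top_db=30` value in the usage comment are assumptions.

```python
def pack_intervals(intervals, sampling_rate=16000, max_seconds=30.0):
    """Greedily merge consecutive non-silent (start, end) sample intervals
    into chunks of at most max_seconds, cutting only at silences."""
    max_len = int(max_seconds * sampling_rate)
    chunks = []
    current = None  # (start, end) of the chunk being built
    for start, end in intervals:
        start, end = int(start), int(end)
        # A single interval longer than the limit is hard-split (assumption).
        while end - start > max_len:
            if current is not None:
                chunks.append(current)
                current = None
            chunks.append((start, start + max_len))
            start += max_len
        if current is None:
            current = (start, end)
        elif end - current[0] <= max_len:
            current = (current[0], end)  # extend across the silence
        else:
            chunks.append(current)
            current = (start, end)
    if current is not None:
        chunks.append(current)
    return chunks

# Usage with the Quick Start variables (top_db=30 is an assumed setting):
# intervals = librosa.effects.split(speech_array, top_db=30)
# for start, end in pack_intervals(intervals, sampling_rate):
#     chunk = speech_array[start:end]  # transcribe each chunk separately
```

Each chunk can then be passed through the processor and `model.generate` exactly as in the Quick Start, and the per-chunk transcriptions concatenated in order.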

⚠️ Limitations

Optimized for ≤ 30 s segments (aligned with Whisper’s training scheme).
Domain-specific — intended for Quranic recitations only.
Requires proper text normalization for accurate verse retrieval.
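
As an illustration of the normalization point, the sketch below strips diacritics and unifies alef variants so that an ASR transcript and a reference verse can be compared on the same footing. The exact normalization rules used in QV Finder are not stated in this card, so the character ranges and the function name `normalize_arabic` are assumptions.

```python
import re

# Combining marks: honorific/small high signs (U+0610-U+061A), tanween,
# harakat, shadda, sukun and related marks (U+064B-U+065F), superscript
# alef (U+0670), and Quranic annotation signs (U+06D6-U+06ED).
_DIACRITICS = re.compile("[\u0610-\u061A\u064B-\u065F\u0670\u06D6-\u06ED]")
_TATWEEL = "\u0640"  # elongation character
# Alef with madda, hamza above, hamza below, and alef wasla.
_ALEF_VARIANTS = re.compile("[\u0622\u0623\u0625\u0671]")

def normalize_arabic(text):
    """Remove diacritics and tatweel, map alef variants to bare alef,
    and collapse runs of whitespace."""
    text = _DIACRITICS.sub("", text)
    text = text.replace(_TATWEEL, "")
    text = _ALEF_VARIANTS.sub("\u0627", text)
    return " ".join(text.split())
```

Applying the same function to both the transcript and the verse corpus before matching keeps purely orthographic differences from lowering the similarity score.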

📜 Citation

If you use this model, please cite:

Bashar Al-Rfooh, Mohammad Abdel-Majeed, Shorouq Al-Awawdeh, and Khaled Darabkeh.
“QV Finder: An Accurate Quran Verse Finder System.”
Indonesian Journal of Electrical Engineering and Computer Science, Vol. 99, No. 1, 2099.
https://huggingface.co/basharalrfooh/whisper-small-quran

🙏 Acknowledgements

This work was supported by Maqsam.
