Whisper-Small-Quran
Fine-tuned Whisper model for Quranic Recitation ASR and Verse Localization
Overview
Whisper-Small-Quran is a fine-tuned version of OpenAI's Whisper-Small model specialized for Quranic recitation in Arabic.
It is part of the QV Finder project, a research system for AI-powered Quran verse transcription and retrieval.
Unlike general-purpose ASR systems, this model is tuned to handle Tajweed-influenced pronunciation, regional recitation styles, and noisy real-world recordings, and it reaches low error rates for both professional and normal reciters.
Reference Paper:
Bashar Al-Rfooh, Mohammad Abdel-Majeed, Shorouq Al-Awawdeh, and Khaled Darabkeh,
"QV Finder: An Accurate Quran Verse Finder System," Indonesian Journal of Electrical Engineering and Computer Science, Vol. 99, No. 1, 2099.
Key Features
- Arabic ASR tuned for Quranic recitation
- Verse localization via tokenization + FuzzyWuzzy two-stage string matching
- Dataset: 64 reciters · 399k verses (~2023 h total)
- Optimized for ≤ 30 s segments, with silence-based segmentation for longer recordings
- High accuracy:
  - Normal reciters: WER 10.1 · CER 3.3
  - Professional reciters: WER 9.9 · CER 4.1
- 100 % retrieval @ 85 % threshold with ≈ 2.5 % false positives
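This card does not describe how the silence-based segmentation works. The following is a minimal sketch, assuming a plain frame-energy threshold: the function names, the 20 ms frame size, and the `silence_thresh` value are all illustrative assumptions, not the authors' implementation. It cuts a long recording at the most recent quiet frame so that each piece stays within the 30 s limit.

```python
import math

def frame_rms(samples, frame_len):
    """RMS energy of consecutive non-overlapping full frames."""
    return [
        math.sqrt(sum(x * x for x in samples[i:i + frame_len]) / frame_len)
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def split_on_silence(samples, sr=16000, max_seconds=30.0,
                     frame_seconds=0.02, silence_thresh=0.01):
    """Return (start, end) sample indices of consecutive segments.

    Each segment spans at most max_seconds worth of full frames (a trailing
    partial frame is attached to the last segment). All thresholds are
    assumed values that would need tuning on real recordings.
    """
    frame_len = round(sr * frame_seconds)
    max_frames = round(max_seconds / frame_seconds)
    energies = frame_rms(samples, frame_len)

    segments = []
    start = 0            # first frame of the current segment
    last_silence = None  # most recent silent frame inside the current segment
    for i, energy in enumerate(energies):
        if energy < silence_thresh and i > start:
            last_silence = i
        if i - start + 1 > max_frames:
            # Prefer cutting at a pause; otherwise fall back to a hard cut.
            cut = last_silence if last_silence is not None else i
            segments.append((start * frame_len, cut * frame_len))
            start = cut
            last_silence = None
    segments.append((start * frame_len, len(samples)))
    return segments
```

Each returned pair can be used to slice the waveform before passing the slice to the model. A library routine such as a voice-activity detector would likely be more robust than this fixed threshold.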
Intended Use
| Task | Description |
|---|---|
| Speech-to-Text (ASR) | Converts Quranic recitations into Arabic text |
| Verse Search | Identifies Surah + Ayah from partial recitations |
| Education / Annotation | Useful for Tajweed feedback, transcription, and archiving |
Warning: this is not a general-purpose Arabic ASR model; it is tailored for Quranic speech and Tajweed patterns.
Model Architecture & Training
| Parameter | Value |
|---|---|
| Base Model | Whisper-Small (244 M params) |
| Epochs | 11 |
| Batch Size | 32 |
| Optimizer | Adam (lr 5e-4 · weight decay 0.01) |
| Seed | 3407 |
| Data Augmentation | Noise · Pitch · Tempo |
| Hardware | NVIDIA A100 (80 GB) |
Trained on the EveryAyah dataset, which includes both professional and normal recitations, for robustness across varied Tajweed styles and noise levels.
Verse Localization Pipeline
After ASR transcription:
- Tokenization: Sliding-window n-grams (1 to 10 words, plus the full verse).
- Matching: FuzzyWuzzy ratio + Levenshtein distance.
- Threshold: ≥ 85 % similarity → accepted as a match.
- Two-Stage Search: Fast coarse filter → accurate rerank, which reduces false positives to ≈ 2.5 %.
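The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the QV Finder implementation: it uses the standard library's `difflib.SequenceMatcher` as a stand-in for FuzzyWuzzy's ratio, the coarse filter (shared-word overlap) is a guess at what a fast first stage might look like, and the four-verse index and all function names are hypothetical.

```python
from difflib import SequenceMatcher

# Toy index: (surah, ayah) -> undiacritized verse text (Al-Fatiha 1:1 to 1:4).
VERSES = {
    (1, 1): "بسم الله الرحمن الرحيم",
    (1, 2): "الحمد لله رب العالمين",
    (1, 3): "الرحمن الرحيم",
    (1, 4): "مالك يوم الدين",
}

def ngrams(text, max_n=10):
    """All contiguous word windows of 1..max_n words, plus the full text."""
    words = text.split()
    grams = {
        " ".join(words[i:i + n])
        for n in range(1, min(max_n, len(words)) + 1)
        for i in range(len(words) - n + 1)
    }
    grams.add(" ".join(words))
    return grams

def similarity(a, b):
    """Similarity on a 0-100 scale (stand-in for a FuzzyWuzzy-style ratio)."""
    return 100.0 * SequenceMatcher(None, a, b).ratio()

def find_verses(transcript, verses=VERSES, threshold=85.0):
    """Return [((surah, ayah), score), ...] sorted by descending score."""
    query_words = set(transcript.split())
    # Stage 1 (assumed coarse filter): keep verses sharing at least one word.
    candidates = [k for k, v in verses.items() if query_words & set(v.split())]
    # Stage 2 (rerank): best n-gram similarity per candidate, then threshold.
    scored = []
    for key in candidates:
        best = max(similarity(transcript, g) for g in ngrams(verses[key]))
        if best >= threshold:
            scored.append((key, best))
    return sorted(scored, key=lambda item: item[1], reverse=True)

print(find_verses("مالك يوم الدين"))
```

In this toy index, a short query such as "الرحمن الرحيم" matches both 1:1 and 1:3, which shows why very short partial recitations can be ambiguous and why a threshold alone does not remove every false positive.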
Evaluation Results
| Reciter Type | Segment Length | CER (%) | WER (%) | Retrieval Accuracy | False Positives |
|---|---|---|---|---|---|
| Professional | < 30 s | 4.1 | 9.9 | 100 % @ 85 % | 2.5 % |
| Normal | < 30 s | 3.3 | 10.1 | 100 % @ 85 % | 4.1 % |
| Professional | > 30 s | 12.5 | 24.5 | 92 % | 4.6 % |
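For readers unfamiliar with the metrics in the table, the sketch below shows how WER and CER are conventionally computed: the Levenshtein distance between reference and hypothesis, divided by the reference length, at word level and character level respectively. It assumes this standard definition and is not the authors' evaluation script; the function names are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution (or match, cost 0)
            ))
        prev = curr
    return prev[-1]

def wer(reference, hypothesis):
    """Word error rate in percent; assumes a non-empty reference."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)

def cer(reference, hypothesis):
    """Character error rate in percent; assumes a non-empty reference."""
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)

reference = "الحمد لله رب العالمين"
hypothesis = "الحمد لله رب العالمي"  # final letter dropped
print(wer(reference, hypothesis), cer(reference, hypothesis))
```

A single dropped letter changes one word out of four but only one character out of twenty-one, which is why WER is usually several times larger than CER on the same output.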
Quick Start
```python
from transformers import WhisperProcessor, WhisperForConditionalGeneration
import librosa
import torch

model_name = "basharalrfooh/whisper-small-quran"
processor = WhisperProcessor.from_pretrained(model_name)
model = WhisperForConditionalGeneration.from_pretrained(model_name)
model.config.forced_decoder_ids = None

# Replace with the path to your recording; librosa resamples it to 16 kHz mono
audio_file = "Your .wav file"
speech_array, sampling_rate = librosa.load(audio_file, sr=16000)
inputs = processor(speech_array, return_tensors="pt", sampling_rate=sampling_rate)

# Inference only, so gradient tracking is disabled
with torch.no_grad():
    predicted_ids = model.generate(inputs["input_features"])

# Decode token IDs to text
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
print(f"Transcription: {transcription}")
```
Limitations
- Optimized for ≤ 30 s segments (aligned with Whisper's training scheme).
- Domain-specific: intended for Quranic recitations only.
- Requires proper text normalization for accurate verse retrieval.
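The card does not state which normalization rules are required. As a hedged illustration, the sketch below applies a few common Arabic normalization steps before matching: removing short-vowel marks and Quranic annotation signs, removing tatweel, and unifying alef variants. The specific rule set and the name `normalize_arabic` are assumptions; a real system would need rules that match the orthography of its reference text (for example, Uthmani versus simplified spelling).

```python
import re

# Harakat (U+064B to U+0652), superscript alef (U+0670), Quranic annotation
# signs (U+06D6 to U+06ED), and tatweel (U+0640). Assumed rule set.
_MARKS = re.compile(r"[\u064B-\u0652\u0670\u06D6-\u06ED\u0640]")
# Alef with madda, hamza above, hamza below, and alef wasla.
_ALEF_VARIANTS = re.compile(r"[\u0622\u0623\u0625\u0671]")

def normalize_arabic(text):
    """Strip diacritics and unify alef forms, then collapse whitespace."""
    text = _MARKS.sub("", text)
    text = _ALEF_VARIANTS.sub("\u0627", text)  # map every variant to bare alef
    return " ".join(text.split())

print(normalize_arabic("بِسْمِ اللَّهِ"))
```

Applying the same function to both the ASR output and the reference verses keeps the similarity score from being penalized by differences in diacritics alone.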
Citation
If you use this model, please cite:
Bashar Al-Rfooh, Mohammad Abdel-Majeed, Shorouq Al-Awawdeh, and Khaled Darabkeh.
"QV Finder: An Accurate Quran Verse Finder System."
Indonesian Journal of Electrical Engineering and Computer Science, Vol. 99, No. 1, 2099.
https://huggingface.co/basharalrfooh/whisper-small-quran
Acknowledgements
This work was supported by Maqsam.