🕌 Whisper-Small-Quran

Fine-tuned Whisper model for Quranic Recitation ASR and Verse Localization

License: MIT


📖 Overview

Whisper-Small-Quran is a fine-tuned version of OpenAI’s Whisper-Small model specialized for Quranic recitations in Arabic.
It’s part of the QV Finder project, a research system for AI-powered Quran verse transcription and retrieval.

Unlike general-purpose ASR systems, this model is tuned to handle Tajweed-influenced pronunciation, regional recitation styles, and noisy real-world recordings. On segments under 30 s it reaches a WER of about 10 for both professional and normal reciters (see Evaluation Results).

📘 Reference Paper:

Bashar Al-Rfooh, Mohammad Abdel-Majeed, Shorouq Al-Awawdeh, and Khaled Darabkeh,
“QV Finder: An Accurate Quran Verse Finder System,” Indonesian Journal of Electrical Engineering and Computer Science, Vol. 99, No. 1, 2099.


✨ Key Features

  • 🎧 Arabic ASR tuned for Quranic Recitation
  • 🕋 Verse localization via tokenization + FuzzyWuzzy two-stage string matching
  • 📚 Dataset: 64 reciters · 399k verses (~2023 h total)
  • 🧠 Optimized for ≤ 30 s segments, with silence-based segmentation for longer ones
  • 📈 High accuracy:
    • Normal Reciters → WER 10.1 · CER 3.3
    • Professional Reciters → WER 9.9 · CER 4.1
  • 🔍 100 % retrieval @ 85 % threshold with ≈ 2.5 % false positives

🧩 Intended Use

| Task | Description |
| --- | --- |
| Speech-to-Text (ASR) | Converts Quranic recitations into Arabic text |
| Verse Search | Identifies Surah + Ayah from partial recitations |
| Education / Annotation | Useful for Tajweed feedback, transcription, and archiving |

⚠️ Not a general-purpose Arabic ASR model; it is tailored to Quranic speech and Tajweed patterns.


🧠 Model Architecture & Training

| Parameter | Value |
| --- | --- |
| Base Model | Whisper-Small (244 M params) |
| Epochs | 11 |
| Batch Size | 32 |
| Optimizer | Adam (lr 5e-4 · weight decay 0.01) |
| Seed | 3407 |
| Data Augmentation | Noise · Pitch · Tempo |
| Hardware | NVIDIA A100 (80 GB) |

Trained on the EveryAyah dataset, which includes both professional and normal recitations, to improve robustness under varied Tajweed styles and noise levels.


πŸ” Verse Localization Pipeline

After ASR transcription:

  1. Tokenization: Sliding-window n-grams (1–10 words + full verse).
  2. Matching: FuzzyWuzzy ratio + Levenshtein distance.
  3. Threshold: ≥ 85 % similarity → correct match.
  4. Two-Stage Search: Fast coarse filter → accurate rerank → false positives ↓ to ≈ 2.5 %.
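
The steps above can be sketched roughly as follows. This is a minimal illustration, not the QV Finder implementation. It assumes the n-grams are taken from the reference verses, uses the standard-library `difflib.SequenceMatcher` as a stand-in for the FuzzyWuzzy ratio, and uses a shared-word check as a hypothetical coarse filter. The `VERSES` dictionary and all function names are made up for this example.

```python
from difflib import SequenceMatcher

# Tiny placeholder corpus: (surah, ayah) -> verse text without diacritics.
VERSES = {
    (1, 1): "بسم الله الرحمن الرحيم",
    (1, 2): "الحمد لله رب العالمين",
    (1, 3): "الرحمن الرحيم",
    (1, 4): "مالك يوم الدين",
}

def verse_ngrams(text, max_n=10):
    """Return all 1..max_n-word windows of a verse, plus the full verse."""
    words = text.split()
    grams = {" ".join(words)}
    for n in range(1, min(max_n, len(words)) + 1):
        for i in range(len(words) - n + 1):
            grams.add(" ".join(words[i:i + n]))
    return grams

def similarity(a, b):
    """Similarity in [0, 100]; a stand-in for a FuzzyWuzzy-style ratio."""
    return 100.0 * SequenceMatcher(None, a, b).ratio()

def find_verses(transcript, verses=VERSES, threshold=85.0):
    """Return [(score, (surah, ayah))] for verses scoring at or above threshold."""
    query_words = set(transcript.split())
    hits = []
    for ref, text in verses.items():
        # Stage 1 (hypothetical coarse filter): skip verses that share
        # no word with the transcript.
        if not query_words & set(text.split()):
            continue
        # Stage 2 (rerank): best fuzzy score against the verse's n-grams.
        best = max(similarity(transcript, g) for g in verse_ngrams(text))
        if best >= threshold:
            hits.append((best, ref))
    return sorted(hits, reverse=True)

print(find_verses("الحمد لله رب"))  # partial recitation of verse (1, 2)
```

A short query such as "الرحمن الرحيم" matches both (1, 1) and (1, 3) in this toy corpus, which shows why short n-grams can produce false positives and why a rerank stage is useful.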

📊 Evaluation Results

| Reciter Type | Segment Length | CER | WER | Retrieval Accuracy | False Positives |
| --- | --- | --- | --- | --- | --- |
| Professional | < 30 s | 4.1 | 9.9 | 100 % @ 85 % | 2.5 % |
| Normal | < 30 s | 3.3 | 10.1 | 100 % @ 85 % | 4.1 % |
| Professional | > 30 s | 12.5 | 24.5 | 92 % | 4.6 % |
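
WER and CER are conventionally the word-level and character-level edit distance between reference and hypothesis, divided by the reference length. The sketch below illustrates that standard definition in plain Python. It is not the evaluation script behind the table, the function names are chosen for this example, and it assumes the table values are percentages.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (each edit costs 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution (or match, cost 0)
            ))
        prev = curr
    return prev[-1]

def wer(reference, hypothesis):
    """Word error rate in percent; assumes a non-empty reference."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)

def cer(reference, hypothesis):
    """Character error rate in percent, counting spaces as characters."""
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)

# One wrong word out of four in the reference.
print(wer("الحمد لله رب العالمين", "الحمد الله رب العالمين"))  # 25.0
```

Whether spaces and diacritics count toward CER varies between toolkits, so scores are only comparable when both sides use the same normalization.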

🚀 Quick Start

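from transformers import WhisperProcessor, WhisperForConditionalGeneration
import librosa
import torch

# Load the fine-tuned processor (feature extractor + tokenizer) and model
model_name = "basharalrfooh/whisper-small-quran"
processor = WhisperProcessor.from_pretrained(model_name)
model = WhisperForConditionalGeneration.from_pretrained(model_name)
# Do not force language/task decoder tokens during generation
model.config.forced_decoder_ids = None

# Placeholder path: replace with your own .wav file
audio_file = "path/to/recitation.wav"

# Load the audio and resample it to the 16 kHz rate Whisper expects
speech_array, sampling_rate = librosa.load(audio_file, sr=16000)

# Convert the waveform into model input features
inputs = processor(speech_array, sampling_rate=sampling_rate, return_tensors="pt")

with torch.no_grad():
    predicted_ids = model.generate(inputs["input_features"])

# Decode token IDs to text; batch_decode returns one string per input clip
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]

print(f"Transcription: {transcription}")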
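
The snippet above transcribes a single clip. For recordings longer than 30 s, this card mentions silence-based segmentation but does not describe it in detail. The sketch below is one hypothetical way to do it in plain Python: it looks backwards from the 30 s limit for a quiet frame and cuts there, falling back to a hard cut if none is found. The function name, frame size, and RMS threshold are assumptions for illustration.

```python
def split_on_silence(samples, sr, max_len_s=30, frame_ms=20, silence_rms=0.01):
    """Return (start, end) sample indices of chunks no longer than max_len_s.

    Cuts are placed in the last quiet frame before the length limit when one
    exists; otherwise the chunk is cut exactly at the limit.
    """
    frame = max(1, sr * frame_ms // 1000)
    max_len = int(max_len_s * sr)
    n = len(samples)
    chunks = []
    start = 0
    while n - start > max_len:
        cut = start + max_len          # fallback: hard cut at the limit
        pos = start + max_len - frame
        while pos > start:
            window = samples[pos:pos + frame]
            rms = (sum(x * x for x in window) / len(window)) ** 0.5
            if rms < silence_rms:
                cut = pos + frame // 2  # cut in the middle of the quiet frame
                break
            pos -= frame
        chunks.append((start, cut))
        start = cut
    if start < n:
        chunks.append((start, n))
    return chunks

# Synthetic example at a toy 100 Hz rate: 25 s loud, 1 s silent, 20 s loud.
demo = [0.5] * 2500 + [0.0] * 100 + [0.5] * 2000
print(split_on_silence(demo, sr=100))
```

With the Quick Start variables, each `(start, end)` pair can be sliced as `speech_array[start:end]` and passed through `processor` and `model.generate` in turn. `librosa.effects.split`, which returns non-silent intervals, may be a faster starting point for real recordings.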

⚠️ Limitations

  • Optimized for ≤ 30 s segments (aligned with Whisper’s training scheme).
  • Domain-specific: intended for Quranic recitations only.
  • Requires proper text normalization for accurate verse retrieval.
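
This card does not say which normalization the retrieval step expects. The sketch below shows one common approach for Arabic text, offered as an assumption rather than the project's actual rules: strip diacritics (tashkeel), Quranic annotation marks, and tatweel, then map alef variants to a bare alef. The function name and the exact character ranges are choices made for this example.

```python
import re
import unicodedata

# Combining marks U+064B-U+065F (harakat, tanwin, shadda, sukun, ...),
# superscript alef U+0670, Quranic annotation signs U+06D6-U+06ED,
# and tatweel U+0640.
_MARKS = re.compile("[\u064B-\u065F\u0670\u06D6-\u06ED\u0640]")
# Alef with madda, hamza above, hamza below, and alef wasla.
_ALEF_VARIANTS = re.compile("[\u0622\u0623\u0625\u0671]")

def normalize_arabic(text):
    """Return text with marks removed, alef unified, and spaces collapsed."""
    text = unicodedata.normalize("NFC", text)
    text = _MARKS.sub("", text)
    text = _ALEF_VARIANTS.sub("\u0627", text)
    return " ".join(text.split())

print(normalize_arabic("بِسْمِ ٱللَّهِ"))  # بسم الله
```

Applying the same function to both the ASR output and the reference verses before matching keeps the similarity scores comparable.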

📜 Citation

If you use this model, please cite:

Bashar Al-Rfooh, Mohammad Abdel-Majeed, Shorouq Al-Awawdeh, and Khaled Darabkeh.
“QV Finder: An Accurate Quran Verse Finder System.”
Indonesian Journal of Electrical Engineering and Computer Science, Vol. 99, No. 1, 2099.
https://huggingface.co/basharalrfooh/whisper-small-quran


πŸ™ Acknowledgements

This work was supported by Maqsam.
