| title: English Accent Detector | |
| emoji: 💻 | |
| colorFrom: blue | |
| colorTo: green | |
| sdk: gradio | |
| sdk_version: 5.30.0 | |
| app_file: app.py | |
| pinned: false | |
| short_description: Lightweight Gradio demo that detects whether a video’s speec | |
| # Random‑slice English Accent Classifier | |
| A Gradio-based tool for detecting and classifying English accents from public video URLs (e.g., YouTube, Loom). It samples multiple 8‑second clips, filters for English speech, then predicts accents (American, British, Australian, etc.) with confidence scores. | |
| --- | |
| ## 🚀 Features | |
| * **Public URL support**: Download audio from YouTube, Loom, or direct MP4 links via `yt_dlp`. | |
| * **Language filtering**: Uses SpeechBrain’s language-ID model to skip non-English content. | |
| * **Random‑slice sampling**: Analyzes *N* random 8‑second windows to avoid full‑audio processing. | |
| * **Accent classification**: Classifies each slice using a pretrained ECAPA model and aggregates via majority vote. | |
| * **Confidence scores**: Returns confidence percentages for language detection, per‑slice accent, and overall accent. | |
| * **Interactive UI**: Simple Gradio interface—paste URL, choose sample count, click *Analyse*. | |
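The filtering and classification features above can be sketched as a small loop. This is only an illustration: it assumes language filtering is applied per slice, and that `classify_language` and `classify_accent` each take a clip path and return a `(label, confidence)` pair. Neither assumption is confirmed by this README, and `classify_english_slices` is a hypothetical name.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Assumed return shape for both classifiers: (label, confidence in percent).
Prediction = Tuple[str, float]


def classify_english_slices(
    clips: Iterable[str],
    classify_language: Callable[[str], Prediction],
    classify_accent: Callable[[str], Prediction],
) -> List[Dict]:
    """Run accent classification only on slices detected as English."""
    results = []
    for index, clip in enumerate(clips):
        language, _ = classify_language(clip)
        if language != "English":
            continue  # skip non-English slices entirely
        accent, confidence = classify_accent(clip)
        results.append({"clip": index, "accent": accent, "confidence": confidence})
    return results
```

Passing the classifiers in as arguments keeps the loop independent of SpeechBrain, so it can be exercised with stub functions.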
| --- | |
| ## 🛠️ Requirements | |
| * Python 3.8+ (tested on 3.10) | |
| * `yt_dlp` | |
| * `torch`, `torchaudio` | |
| * `gradio` | |
| * `speechbrain` | |
| You can install all dependencies via: | |
| ```bash | |
| pip install -r requirements.txt | |
| ``` | |
| **`requirements.txt`** should include: | |
| ```text | |
| yt_dlp | |
| torch | |
| torchaudio | |
| gradio | |
| speechbrain | |
| ``` | |
| --- | |
| ## 📦 Installation | |
| 1. **Clone the repo**: | |
| ```bash | |
| git clone | |
| cd english-accent-classifier | |
| ``` | |
| 2. **Install dependencies**: | |
| ```bash | |
| pip install -r requirements.txt | |
| ``` | |
| --- | |
| ## ▶️ Usage | |
| ### CLI Mode | |
| Run the script directly: | |
| ```bash | |
| python3 main.py --share | |
| ``` | |
| * The `--share` flag enables a public Gradio link for easy testing. | |
| * By default, the app runs on `http://localhost:7860`. | |
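One way a `--share` flag could be wired to Gradio is sketched below. The actual contents of `main.py` are not shown in this README, so `build_parser`, `main`, the placeholder `analyse` function, and the slider default of 4 are all assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Command-line options for the demo (only --share is described in this README)."""
    parser = argparse.ArgumentParser(description="English accent detector demo")
    parser.add_argument(
        "--share",
        action="store_true",
        help="Create a temporary public Gradio link",
    )
    return parser


def main(argv=None) -> None:
    args = build_parser().parse_args(argv)

    # Imported lazily so the parser can be used without Gradio installed.
    import gradio as gr

    def analyse(url: str, n_samples: float) -> dict:
        # Placeholder: the real app would download, slice, and classify here.
        return {"url": url, "samples": int(n_samples)}

    demo = gr.Interface(
        fn=analyse,
        inputs=[
            gr.Textbox(label="Video URL"),
            gr.Slider(minimum=1, maximum=10, value=4, step=1, label="Samples"),
        ],
        outputs=gr.JSON(),
    )
    # share=True asks Gradio for a public link; otherwise it serves locally only.
    demo.launch(share=args.share)
```

Calling `main()` starts the server. Calling `main(["--share"])` does the same and also requests a public link.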
| ### Gradio UI | |
| 1. Open the link in your browser (e.g., `http://localhost:7860`). | |
| 2. Paste a **public** video URL in the *Video URL* field. | |
| 3. Choose the number of random 8‑second samples (1–10). | |
| 4. Click **Analyse**. | |
| 5. Inspect the JSON output for: | |
|    * `language` & `language_confidence` | |
|    * `accent_overall` & `overall_confidence_avg` | |
|    * `per_clip` array with individual slice details | |
|    * `summary` string | |
| #### Example Output | |
| ```json | |
| { | |
| "language": "English", | |
| "language_confidence": 98.7, | |
| "accent_overall": "British", | |
| "overall_confidence_avg": 87.5, | |
| "per_clip": [ | |
| {"clip": 0, "start": "00:01:23", "end": "00:01:31", "accent": "British", "confidence": 89.3}, | |
| ... | |
| ], | |
| "summary": "English detected. Overall accent = British (≈87.5% on 3/4 slices)." | |
| } | |
| ``` | |
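The overall fields in this output could be derived from `per_clip` by a majority vote, roughly as sketched here. It assumes `overall_confidence_avg` is the mean confidence of the slices that voted for the winning accent, which this README does not spell out. `aggregate_slices` is a hypothetical helper name.

```python
from collections import Counter
from typing import Dict, List


def aggregate_slices(per_clip: List[Dict]) -> Dict:
    """Majority vote over per-slice accent predictions."""
    if not per_clip:
        raise ValueError("per_clip must contain at least one slice")

    votes = Counter(clip["accent"] for clip in per_clip)
    winner, n_votes = votes.most_common(1)[0]

    # Assumption: average only the slices that agree with the winning accent.
    winning = [c["confidence"] for c in per_clip if c["accent"] == winner]
    avg = round(sum(winning) / len(winning), 1)

    return {
        "accent_overall": winner,
        "overall_confidence_avg": avg,
        "summary": (
            f"English detected. Overall accent = {winner} "
            f"(≈{avg}% on {n_votes}/{len(per_clip)} slices)."
        ),
    }
```

On a tie, `Counter.most_common` returns the accent that appeared first in `per_clip`, so a real implementation may want an explicit tie-break rule.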
| --- | |
| ## 📝 Code Structure | |
| * `main.py`: Core logic, Gradio UI, and entry point. | |
| * **Helpers**: | |
|   * `download_audio`: Fetches best audio track via `yt_dlp`. | |
|   * `extract_wav`: Cuts 8‑second WAV clips with `torchaudio`. | |
|   * `classify_language` / `classify_accent`: Run SpeechBrain models. | |
|   * `pick_random_offsets`: Selects random start times. | |
| * **Models**: | |
|   * `speechbrain/lang-id-voxlingua107-ecapa` (language detection) | |
|   * `Jzuluaga/accent-id-commonaccent_ecapa` (accent classification) | |
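As an illustration of the random-slice idea, a helper like `pick_random_offsets` might look like the sketch below. The signature, the `seed` parameter, and the `CLIP_SECONDS` constant are assumptions; only the helper name and the 8-second window come from this README.

```python
import random
from typing import List, Optional

CLIP_SECONDS = 8.0  # slice length described in this README


def pick_random_offsets(
    total_seconds: float,
    n: int,
    clip_seconds: float = CLIP_SECONDS,
    seed: Optional[int] = None,
) -> List[float]:
    """Return n sorted start times such that each clip fits inside the audio."""
    if total_seconds < clip_seconds:
        raise ValueError(
            f"audio is {total_seconds:.1f}s long, shorter than one {clip_seconds:.0f}s clip"
        )
    rng = random.Random(seed)  # a seed makes the sampling reproducible
    latest_start = total_seconds - clip_seconds
    return sorted(rng.uniform(0.0, latest_start) for _ in range(n))
```

Offsets are drawn independently, so windows may overlap. The `ValueError` mirrors the short-audio case listed under Troubleshooting.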
| --- | |
| ## ⚙️ Configuration | |
| * **DEVICE**: Change to `"cuda"` in `main.py` if you have a GPU. | |
| * **Sample length**: Default is 8 seconds—adjust `dur` in `extract_wav` if desired. | |
| * **Model IDs**: Swap out for custom models by updating `ACCENT_MODEL_ID` and `LANG_MODEL_ID`. | |
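The settings above might sit together near the top of `main.py`, along these lines. The constant names and model IDs are taken from this README. `pick_device` is a hypothetical helper that chooses the GPU automatically instead of requiring a manual edit.

```python
# Model identifiers as listed in this README.
LANG_MODEL_ID = "speechbrain/lang-id-voxlingua107-ecapa"
ACCENT_MODEL_ID = "Jzuluaga/accent-id-commonaccent_ecapa"


def pick_device(prefer_gpu: bool = True) -> str:
    """Return "cuda" when a GPU is usable and wanted, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # without PyTorch there is nothing to run on a GPU
    if prefer_gpu and torch.cuda.is_available():
        return "cuda"
    return "cpu"


DEVICE = pick_device()
```

Setting `DEVICE = "cuda"` by hand, as described above, works equally well when a GPU is known to be present.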
| --- | |
| ## 🐛 Troubleshooting | |
| * **Private or invalid URLs**: Ensure the video is publicly accessible. | |
| * **Short audio**: Audio shorter than 8 seconds will trigger an error. | |
| * **Missing dependencies**: Re-run `pip install -r requirements.txt` and check its output for failed installs. | |
| * **Slow startup**: Model downloads occur on first run—expect \~10–20 s delay. | |
| * **YouTube bot detection on hosted Spaces**: On Hugging Face Spaces, direct YouTube downloads may be blocked by bot detection. Loom links still work. | |
| --- | |