	---
	title: English Accent Detector
	emoji: 💻
	colorFrom: blue
	colorTo: green
	sdk: gradio
	sdk_version: 5.30.0
	app_file: app.py
	pinned: false
	short_description: Lightweight Gradio demo that detects whether a video’s speec
	---
	# Random‑slice English Accent Classifier

	A Gradio-based tool for detecting and classifying English accents from public video URLs (e.g., YouTube, Loom). It samples multiple 8‑second clips, filters for English speech, then predicts accents (American, British, Australian, etc.) with confidence scores.

	---

	## 🚀 Features

	* Public URL support: Download audio from YouTube, Loom, or direct MP4 links via `yt_dlp`.
	* Language filtering: Uses SpeechBrain’s language-ID model to skip non-English content.
	* Random‑slice sampling: Analyzes N random 8‑second windows to avoid full‑audio processing.
	* Accent classification: Classifies each slice using a pretrained ECAPA model and aggregates via majority vote.
	* Confidence scores: Returns confidence percentages for language detection, per‑slice accent, and overall accent.
	* Interactive UI: Simple Gradio interface—paste URL, choose sample count, click Analyse.
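
The majority-vote aggregation mentioned above can be sketched in a few lines. This is only an illustration, not the app's actual code: the function name `aggregate_accents`, the `(label, confidence)` pair format, and the choice to average confidence over the winning slices only are all assumptions.

```python
from collections import Counter
from statistics import mean


def aggregate_accents(per_clip):
    """Majority vote over per-slice predictions (hypothetical helper).

    `per_clip` is a list of (accent_label, confidence_percent) pairs.
    Returns (winning_label, mean confidence of winning slices, votes, total).
    """
    if not per_clip:
        raise ValueError("no slices to aggregate")
    counts = Counter(label for label, _ in per_clip)
    # On a tie, most_common keeps the label that was seen first.
    winner, votes = counts.most_common(1)[0]
    avg_conf = mean(conf for label, conf in per_clip if label == winner)
    return winner, round(avg_conf, 1), votes, len(per_clip)


# Illustrative values only.
clips = [("British", 89.3), ("British", 90.0), ("American", 61.2), ("British", 83.2)]
print(aggregate_accents(clips))  # ('British', 87.5, 3, 4)
```

Averaging only the winning slices keeps a single outlier slice from dragging down the reported confidence; averaging over all slices would be an equally reasonable choice.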

	---

	## 🛠️ Requirements

	* Python 3.8+ (tested on 3.10)
	* `yt_dlp`
	* `torch`, `torchaudio`
	* `gradio`
	* `speechbrain`

	You can install all dependencies via:

	```bash
	pip install -r requirements.txt
	```

	`requirements.txt` should include:

	```text
	yt_dlp
	torch
	torchaudio
	gradio
	speechbrain
	```

	---

	## 📦 Installation

	1. Clone the repo:

	```bash
	git clone
	cd english-accent-classifier
	```

	2. Install dependencies:
	```bash
	pip install -r requirements.txt
	```

	---

	## ▶️ Usage

	### CLI Mode

	Run the script directly:

	```bash
	python3 main.py --share
	```

	* The `--share` flag enables a public Gradio link for easy testing.
	* By default, the app runs on `http://localhost:7860`.
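
How `main.py` wires up the `--share` flag is not shown here. One plausible way, assuming the script uses `argparse` and holds the interface in a Gradio `Blocks` or `Interface` object (the names `parse_args`, `launch`, and `demo` are hypothetical), looks like this:

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; only --share is assumed to exist."""
    parser = argparse.ArgumentParser(description="English accent detector demo")
    parser.add_argument(
        "--share",
        action="store_true",
        help="create a temporary public Gradio link",
    )
    return parser.parse_args(argv)


def launch(demo, argv=None):
    """`demo` is assumed to be a gradio.Blocks or gradio.Interface instance."""
    args = parse_args(argv)
    # Gradio serves on port 7860 by default; share=True adds a public URL.
    demo.launch(share=args.share)
```

Keeping the parsing in its own function makes the flag easy to check without starting a server.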

	### Gradio UI

	1. Open the link in your browser (e.g., `http://localhost:7860`).
	2. Paste a public video URL in the Video URL field.
	3. Choose the number of random 8‑second samples (1–10).
	4. Click Analyse.
	5. Inspect the JSON output for:

	   * `language` & `language_confidence`
	   * `accent_overall` & `overall_confidence_avg`
	   * `per_clip` array with individual slice details
	   * `summary` string

	#### Example Output

	```json
	{
	  "language": "English",
	  "language_confidence": 98.7,
	  "accent_overall": "British",
	  "overall_confidence_avg": 87.5,
	  "per_clip": [
	    {"clip": 0, "start": "00:01:23", "end": "00:01:31", "accent": "British", "confidence": 89.3},
	    ...
	  ],
	  "summary": "English detected. Overall accent = British (≈87.5% on 3/4 slices)."
	}
	```
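
If you copy the JSON out of the UI, a short script can pick out the slices that disagree with the overall verdict. The sketch below assumes the field names shown in the example output; the second clip and the helper name `dissenting_clips` are made up for illustration.

```python
import json

# A result shaped like the example output; clip 1 is invented test data.
raw = """
{
  "language": "English",
  "language_confidence": 98.7,
  "accent_overall": "British",
  "overall_confidence_avg": 87.5,
  "per_clip": [
    {"clip": 0, "start": "00:01:23", "end": "00:01:31", "accent": "British", "confidence": 89.3},
    {"clip": 1, "start": "00:04:02", "end": "00:04:10", "accent": "American", "confidence": 61.2}
  ],
  "summary": "..."
}
"""


def dissenting_clips(result):
    """Return the slices whose accent differs from the overall accent."""
    overall = result["accent_overall"]
    return [c for c in result["per_clip"] if c["accent"] != overall]


result = json.loads(raw)
for clip in dissenting_clips(result):
    print(f'{clip["start"]}-{clip["end"]}: {clip["accent"]} ({clip["confidence"]}%)')
```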

	---

	## 📝 Code Structure

	* `main.py`: Core logic, Gradio UI, and entry point.
	* Helpers:
	  * `download_audio`: Fetches best audio track via `yt_dlp`.
	  * `extract_wav`: Cuts 8‑second WAV clips with `torchaudio`.
	  * `classify_language` / `classify_accent`: Run SpeechBrain models.
	  * `pick_random_offsets`: Selects random start times.
	* Models:
	  * `speechbrain/lang-id-voxlingua107-ecapa` (language detection)
	  * `Jzuluaga/accent-id-commonaccent_ecapa` (accent classification)
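
As a rough idea of what a helper like `pick_random_offsets` might do, the sketch below draws uniformly random start times so that every 8-second window fits inside the audio. The signature, the `seed` parameter, and the `hhmmss` formatter are assumptions; the real helper may differ, for example by preventing overlapping windows.

```python
import random

CLIP_SECONDS = 8.0  # slice length used throughout this README


def pick_random_offsets(total_seconds, n, clip_seconds=CLIP_SECONDS, seed=None):
    """Return n sorted start times (seconds) for windows that fit in the audio.

    Windows may overlap; this sketch makes no attempt to space them apart.
    """
    if total_seconds < clip_seconds:
        raise ValueError(
            f"audio is {total_seconds:.1f} s, shorter than one {clip_seconds:.0f} s slice"
        )
    rng = random.Random(seed)
    latest_start = total_seconds - clip_seconds
    return sorted(rng.uniform(0.0, latest_start) for _ in range(n))


def hhmmss(seconds):
    """Format a second count as HH:MM:SS, as in the example output."""
    s = int(seconds)
    return f"{s // 3600:02d}:{s % 3600 // 60:02d}:{s % 60:02d}"


offsets = pick_random_offsets(120.0, 4, seed=0)
print([hhmmss(o) for o in offsets])
```

Raising early on audio shorter than one slice matches the short-audio error described under Troubleshooting.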

	---

	## ⚙️ Configuration

	* DEVICE: Change to `"cuda"` in `main.py` if you have a GPU.
	* Sample length: Default is 8 seconds—adjust `dur` in `extract_wav` if desired.
	* Model IDs: Swap out for custom models by updating `ACCENT_MODEL_ID` and `LANG_MODEL_ID`.
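
Instead of editing `DEVICE` by hand, the device could be chosen automatically and passed to SpeechBrain when the models load. This is a sketch under assumptions: `pick_device` and `load_classifier` are hypothetical names, and it presumes both models load through SpeechBrain's `EncoderClassifier`, which may not match how `main.py` does it.

```python
LANG_MODEL_ID = "speechbrain/lang-id-voxlingua107-ecapa"
ACCENT_MODEL_ID = "Jzuluaga/accent-id-commonaccent_ecapa"


def pick_device():
    """Use the GPU when PyTorch can see one, otherwise fall back to the CPU."""
    try:
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


DEVICE = pick_device()


def load_classifier(model_id, device=DEVICE):
    """Load a pretrained SpeechBrain classifier onto the chosen device.

    The first call downloads the model weights, so it needs network access.
    """
    try:
        # SpeechBrain 1.0 and later
        from speechbrain.inference.classifiers import EncoderClassifier
    except ImportError:
        # Older SpeechBrain releases
        from speechbrain.pretrained import EncoderClassifier
    return EncoderClassifier.from_hparams(
        source=model_id, run_opts={"device": device}
    )
```

Swapping in a custom model then only requires changing the two ID constants, provided the replacement is compatible with the same classifier interface.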

	---

	## 🐛 Troubleshooting

	* Private or invalid URLs: Ensure the video is publicly accessible.
	* Short audio: Audio shorter than 8 seconds will trigger an error.
	* Missing dependencies: Double-check `pip install -r requirements.txt`.
	* Slow startup: Model weights are downloaded on first run, so expect a delay of roughly 10–20 s.
	* YouTube bot detection on hosted Spaces: On Hugging Face Spaces, direct YouTube downloads may be blocked, but Loom links work.
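
Because YouTube downloads may be blocked on hosted Spaces, a hosted deployment could warn the user before attempting one. The helper below is a hypothetical addition, not part of the app; the host list is an assumption and can be extended.

```python
from urllib.parse import urlparse

# Hosts treated as YouTube for the purpose of the warning (assumed list).
YOUTUBE_HOSTS = {"youtube.com", "youtu.be", "youtube-nocookie.com"}


def is_youtube_url(url):
    """Return True if the URL's host is a YouTube domain or subdomain."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in YOUTUBE_HOSTS)


for url in ("https://www.youtube.com/watch?v=abc", "https://www.loom.com/share/abc"):
    if is_youtube_url(url):
        print(f"{url}: may be blocked on hosted Spaces; try a Loom or direct MP4 link")
    else:
        print(f"{url}: ok")
```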

	---