---
title: NVIDIA Parakeet TDT 0.6B V2 Real Time Mic Transcription ASR STT
emoji: 📊
colorFrom: purple
colorTo: blue
sdk: gradio
sdk_version: 5.31.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: Real-Time, Speak to Mic, NO MODEL DOWNLOAD NEEDED!!
language: en
inference: true
tags:
- audio
- speech-recognition
- asr
- real-time
- cpu
- nvidia
- parakeet
- microphone
- voice
- speech
- browser
- gradio
- nemo
- huggingface
---
**Real-time English speech-to-text in your browser — no GPU required.**

This Space runs the 600M-parameter [`nvidia/parakeet-tdt-0.6b-v2`](https://huggingface.co/nvidia/parakeet-tdt-0.6b-v2) model, which fits comfortably on the **CPU Basic (2 vCPU)** tier.

1. Click **“Record”**.
2. **Allow microphone** access and start speaking.
3. Watch live text appear in the **Transcription** box.

**Stalled UI?** Refresh the browser tab — this fully restarts the Space and clears any stuck threads.


| Technique | Why it matters |
|-----------|----------------|
| **`OMP_NUM_THREADS=2` & `torch.set_num_threads(2)`** | Matches the 2 vCPUs for optimal throughput |
| **FBGEMM backend** | Fastest kernels on x86 |
| **4-second streaming window** | Low latency & small memory footprint |
| **Gradio `stream_every=0.5`** | Updates the transcript twice per second for real-time feel |

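To make the window and update-rate figures in the table concrete, here is a minimal pure-Python sketch of a rolling 4-second buffer fed by half-second chunks. It is an illustration only, not the code in `app.py`: the `RollingWindow` class, the 16 kHz sample rate, and the idea of re-transcribing the whole window on each update are assumptions made for the example.

```python
from collections import deque

SAMPLE_RATE = 16_000   # assumption: 16 kHz mono audio
WINDOW_SECONDS = 4     # streaming window from the table above
STREAM_EVERY = 0.5     # Gradio update interval from the table above


class RollingWindow:
    """Hypothetical helper: keep only the most recent few seconds of audio."""

    def __init__(self, sample_rate=SAMPLE_RATE, seconds=WINDOW_SECONDS):
        # A bounded deque silently drops the oldest samples once it is full,
        # so memory use stays constant however long the user speaks.
        self.samples = deque(maxlen=int(sample_rate * seconds))

    def push(self, chunk):
        """Append a new chunk and return the current window as a list."""
        self.samples.extend(chunk)
        return list(self.samples)


window = RollingWindow()
chunk_len = int(SAMPLE_RATE * STREAM_EVERY)  # samples delivered per update

# Simulate 5 seconds of speech arriving as ten half-second chunks.
# Each chunk is filled with its own index so the eviction is easy to see.
for i in range(10):
    current = window.push([float(i)] * chunk_len)

# `current` now holds only the last 4 seconds (chunks 2 through 9);
# in a real app this is what would be handed to the ASR model.
```

Under these assumptions each update carries 8,000 samples and the window holds eight such chunks, so the model never sees more than 64,000 samples per call regardless of session length.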

| Item | Licence |
|------|---------|
| **Demo code (this repo)** | Apache-2.0 |
| **Model weights** – `nvidia/parakeet-tdt-0.6b-v2` | CC-BY-4.0 (© NVIDIA) |

**If you redistribute transcripts or fine-tuned weights, please retain the CC-BY-4.0 attribution notice.**