Removed demo file from README and removed INT8 references from README for now
The INT8 feature was not working fully; it needs more investigation before being applied to this repository again.
README.md
CHANGED
@@ -1,5 +1,5 @@
 ---
-title: NVIDIA Parakeet TDT 0.6B V2
+title: NVIDIA Parakeet TDT 0.6B V2 Real Time Mic Transcription
 emoji: π
 colorFrom: purple
 colorTo: blue
@@ -17,7 +17,6 @@ tags:
 - speech-recognition
 - asr
 - real-time
-- int8
 - cpu
 - nvidia
 - parakeet
@@ -30,10 +29,10 @@ tags:
 - huggingface
 ---
 
-# NVIDIA Parakeet-TDT-0.6B-v2
+# NVIDIA Parakeet-TDT-0.6B-v2 - CPU-Only Streaming ASR
 
 **Real-time English speech-to-text in your browser - no GPU required.**
-This Space runs the 600 M-parameter [`nvidia/parakeet-tdt-0.6b-v2`](https://huggingface.co/nvidia/parakeet-tdt-0.6b-v2) model
+This Space runs the 600 M-parameter [`nvidia/parakeet-tdt-0.6b-v2`](https://huggingface.co/nvidia/parakeet-tdt-0.6b-v2) model that fits comfortably on the **CPU Basic (2 vCPU)** tier.
 
 ## Quick Start
 1. Click **"Record"**
@@ -42,10 +41,6 @@ This Space runs the 600 M-parameter [`nvidia/parakeet-tdt-0.6b-v2`](https://hugg
 
 > **Stalled UI?** Refresh the browser tab - this fully restarts the Space and clears any stuck threads.
 
-<video src="https://huggingface.co/spaces/WJ88/NVIDIA-Parakeet-TDT-0.6B-v2-INT8-Real-Time-Mic-Transcription/resolve/main/demo0__5-24-2025.mp4" controls style="max-width: 100%; height: auto;">
-Your browser does not support the video tag.
-</video>
-
 ## Build on This
 - **Duplicate** the Space (button at the top-right) to kick-start your own ASR ideas.
 - Swap in another NeMo/HF model - the quantization + streaming scaffold is ready.
@@ -54,9 +49,8 @@ This Space runs the 600 M-parameter [`nvidia/parakeet-tdt-0.6b-v2`](https://hugg
 ## Under the Hood
 | Technique | Why it matters |
 |-----------|----------------|
-| **Dynamic INT8 quantization** (`torch.quantization.quantize_dynamic`) | ~4× smaller, faster CPU inference with minimal accuracy loss |
 | **`OMP_NUM_THREADS=2` & `torch.set_num_threads(2)`** | Matches the 2 vCPUs for optimal throughput |
-| **FBGEMM backend** | Fastest
+| **FBGEMM backend** | Fastest kernels on x86 |
 | **4-second streaming window** | Low latency & small memory footprint |
 | **Gradio `stream_every=0.5`** | Updates the transcript twice per second for real-time feel |
 
@@ -70,4 +64,4 @@ Feel free to browse `app.py` for the full implementation.
 
 If you redistribute transcripts or fine-tuned weights, please retain the CC-BY-4.0 attribution notice.
 
-**If this Space helps you, please give it a like and share your feedback!**
+**If this Space helps you, please give it a like and share your feedback!**
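
The README's "Under the Hood" table pairs a 4-second streaming window with transcript updates every 0.5 seconds. The sketch below shows one way such a rolling window could be kept, using only the standard library. It is not taken from `app.py`: the class name `RollingAudioWindow` is hypothetical, and the 16 kHz sample rate is an assumption that the README does not state.

```python
from collections import deque

SAMPLE_RATE = 16_000    # assumption: not stated in the README
WINDOW_SECONDS = 4.0    # from the "4-second streaming window" row
STREAM_EVERY = 0.5      # from the Gradio `stream_every=0.5` row


class RollingAudioWindow:
    """Hypothetical helper: keep only the most recent window of samples."""

    def __init__(self, sample_rate=SAMPLE_RATE, window_seconds=WINDOW_SECONDS):
        self.max_samples = int(sample_rate * window_seconds)
        # A deque with maxlen drops the oldest samples once it is full,
        # so memory use stays bounded no matter how long the stream runs.
        self._buffer = deque(maxlen=self.max_samples)

    def push(self, chunk):
        """Append a chunk of samples and return the current window as a list."""
        self._buffer.extend(chunk)
        return list(self._buffer)


if __name__ == "__main__":
    window = RollingAudioWindow()
    chunk_len = int(SAMPLE_RATE * STREAM_EVERY)  # samples per 0.5 s update
    current = []
    for i in range(10):  # ten updates, i.e. 5 seconds of audio
        # Each chunk is filled with its own index so the oldest
        # surviving chunk is easy to identify.
        current = window.push([i] * chunk_len)
    print(len(current), current[0], current[-1])
```

In a real app, each returned window would be handed to the ASR model on every update. Re-transcribing the last 4 seconds each time trades some redundant compute for a bounded memory footprint.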