FastWhisper: whisper-large-v3-edacc-commonvoice-l2arctic-v3 (CTranslate2)
This is a CTranslate2-optimized version of 2snem6/whisper-large-v3-edacc-commonvoice-l2arctic-v3 for use with the faster-whisper library.
Performance Benefits
- Faster inference: Up to 4x speed improvement over standard Transformers
- Lower memory usage: Reduced VRAM requirements
- Optimized for production: Built for real-time applications
- Quantization: FLOAT16 precision for optimal speed/quality balance (see the loading sketch below)
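To realize these gains, load the model through faster-whisper with a compute type that matches your hardware. A minimal sketch (the CUDA device choice is an assumption about your setup; CPU loading works the same way):

from faster_whisper import WhisperModel

# Load on GPU with float16, matching the precision this model was converted with
model = WhisperModel(
    "2snem6/faster-whisper-large-v3-edacc-commonvoice-l2arctic-v3",
    device="cuda",
    compute_type="float16",
)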
Model Details
- Original Model: 2snem6/whisper-large-v3-edacc-commonvoice-l2arctic-v3
- Conversion Date: 2025-09-25
- Quantization: float16
- Format: CTranslate2
- Library: faster-whisper
- Type: Fine-tuned model
Model Size
| File | Size |
|---|---|
| added_tokens.json | 0.0 MB |
| tokenizer_config.json | 0.3 MB |
| special_tokens_map.json | 0.0 MB |
| normalizer.json | 0.1 MB |
| preprocessor_config.json | 0.0 MB |
| config.json | 0.0 MB |
| vocab.json | 1.0 MB |
| vocabulary.json | 1.0 MB |
| model.bin | 2944.3 MB |
| merges.txt | 0.5 MB |
Installation & Usage
Installation
pip install faster-whisper
Basic Usage
from faster_whisper import WhisperModel
# Load the model
model = WhisperModel("2snem6/faster-whisper-large-v3-edacc-commonvoice-l2arctic-v3")
# Transcribe audio
segments, info = model.transcribe("audio.wav")
print(f"Detected language: {info.language} (probability: {info.language_probability:.2f})")
for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
Advanced Usage
# With custom parameters
segments, info = model.transcribe(
    "audio.wav",
    beam_size=5,
    language="en",  # Force English
    condition_on_previous_text=False,
    temperature=0.0
)
# Batch processing
audio_files = ["file1.wav", "file2.wav", "file3.wav"]
for audio_file in audio_files:
    segments, info = model.transcribe(audio_file)
    # Process segments...
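Beyond the decoding parameters above, faster-whisper also provides built-in voice activity detection and word-level timestamps, which can help with long or noisy recordings. A short sketch, reusing the model loaded earlier (audio.wav is a placeholder file name):

# Skip silent regions and return per-word timing information
segments, info = model.transcribe(
    "audio.wav",
    vad_filter=True,        # filter out long stretches of silence before decoding
    word_timestamps=True,   # attach start/end times to each word
)
for segment in segments:
    for word in segment.words:
        print(f"[{word.start:.2f}s -> {word.end:.2f}s] {word.word}")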
Loading from Local Path
# If you've downloaded the model locally
model = WhisperModel("/path/to/downloaded/model")
Performance Comparison
FastWhisper (CTranslate2) vs Standard Transformers:
| Metric | Standard Transformers | FastWhisper (CT2) | Improvement |
|---|---|---|---|
| Speed | 1x | 2-4x | 2-4x faster |
| Memory | 1x | 0.5-0.8x | 20-50% less |
| Model Size | 1x | 0.5-0.8x | 20-50% smaller |
Performance may vary depending on hardware and audio length.
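Because the exact speed-up depends on your hardware and audio, the simplest check is to time a run yourself. A rough sketch (audio.wav is a placeholder, and model loading time is excluded):

import time
from faster_whisper import WhisperModel

model = WhisperModel("2snem6/faster-whisper-large-v3-edacc-commonvoice-l2arctic-v3")

start = time.perf_counter()
segments, info = model.transcribe("audio.wav")
text = " ".join(segment.text for segment in segments)  # decoding happens while iterating
elapsed = time.perf_counter() - start

print(f"Transcribed {info.duration:.1f}s of audio in {elapsed:.1f}s "
      f"({info.duration / elapsed:.1f}x real time)")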
Use Cases
This optimized model is perfect for:
- Real-time transcription applications
- Production deployments requiring fast inference
- Resource-constrained environments
- Batch processing of audio files
- API services with high throughput requirements
Technical Details
Conversion Process
This model was converted using the ct2-transformers-converter tool:
ct2-transformers-converter \
    --model 2snem6/whisper-large-v3-edacc-commonvoice-l2arctic-v3 \
    --output_dir faster-whisper-large-v3-edacc-commonvoice-l2arctic-v3 \
    --quantization float16 \
    --copy_files tokenizer.json preprocessor_config.json
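A quick way to sanity-check the conversion is to load the output directory directly and transcribe a short clip (sample.wav is a placeholder file):

from faster_whisper import WhisperModel

# Point WhisperModel at the freshly converted directory instead of the Hub repo
model = WhisperModel("faster-whisper-large-v3-edacc-commonvoice-l2arctic-v3")
segments, _ = model.transcribe("sample.wav")
print(" ".join(segment.text for segment in segments))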
Quantization
- FLOAT16: Half-precision floating point for optimal speed/quality balance (alternative compute types are sketched below)
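Although the weights are stored in float16, CTranslate2 can re-quantize them when the model is loaded by passing a different compute_type. The options below are standard faster-whisper/CTranslate2 values; which one performs best depends on your hardware, so treat this as a sketch rather than a recommendation:

from faster_whisper import WhisperModel

repo = "2snem6/faster-whisper-large-v3-edacc-commonvoice-l2arctic-v3"

# Default: keep the stored float16 weights (typical GPU setting)
model = WhisperModel(repo, device="cuda", compute_type="float16")

# Or trade a little accuracy for lower memory use by re-quantizing at load time:
# model = WhisperModel(repo, device="cuda", compute_type="int8_float16")
# model = WhisperModel(repo, device="cpu", compute_type="int8")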
Original Model
This is a converted version of a fine-tuned Whisper model. The original model 2snem6/whisper-large-v3-edacc-commonvoice-l2arctic-v3 was likely fine-tuned (judging by its name, on the EdAcc, Common Voice, and L2-ARCTIC datasets) for:
- Accents or dialects
- Domain-specific vocabulary
- Improved accuracy on certain audio types
Please refer to the original model card for training details and performance metrics.
Citation
If you use this converted model, please cite both the original Whisper paper and the CTranslate2 library:
Original Whisper
@misc{radford2022whisper,
  title={Robust Speech Recognition via Large-Scale Weak Supervision},
  author={Alec Radford and Jong Wook Kim and Tao Xu and Greg Brockman and Christine McLeavey and Ilya Sutskever},
  year={2022},
  eprint={2212.04356},
  archivePrefix={arXiv},
  primaryClass={eess.AS}
}
CTranslate2
@misc{ctranslate2,
  title={CTranslate2: Fast inference with Transformers and OpenNMT models},
  author={Guillaume Klein},
  year={2020},
  url={https://github.com/OpenNMT/CTranslate2}
}
Contributing
Found an issue or want to improve this model?
- Original model issues: Report to 2snem6/whisper-large-v3-edacc-commonvoice-l2arctic-v3
- Conversion issues: Open an issue with details about the conversion process
- Performance issues: Check the faster-whisper documentation
License
This model inherits the license from the original model: Apache 2.0
Converted with ❤️ using CTranslate2 and faster-whisper