FastWhisper: whisper-large-v3-edacc-commonvoice-l2arctic-v3 (CTranslate2)

This is a CTranslate2-optimized version of 2snem6/whisper-large-v3-edacc-commonvoice-l2arctic-v3 for use with the faster-whisper library.

πŸš€ Performance Benefits

  • Faster inference: Up to 4x speed improvement over standard Transformers
  • Lower memory usage: Reduced VRAM requirements
  • Optimized for production: Built for real-time applications
  • Quantization: FLOAT16 precision for optimal speed/quality balance

πŸ“‹ Model Details

Model Size

| File | Size |
|---|---|
| added_tokens.json | 0.0 MB |
| tokenizer_config.json | 0.3 MB |
| special_tokens_map.json | 0.0 MB |
| normalizer.json | 0.1 MB |
| preprocessor_config.json | 0.0 MB |
| config.json | 0.0 MB |
| vocab.json | 1.0 MB |
| vocabulary.json | 1.0 MB |
| model.bin | 2944.3 MB |
| merges.txt | 0.5 MB |

πŸ”§ Installation & Usage

Installation

pip install faster-whisper

Basic Usage

from faster_whisper import WhisperModel

# Load the model (downloaded from the Hugging Face Hub on first use).
# On GPU, pass device="cuda" and compute_type="float16" to match the model's
# quantization; on CPU, compute_type="int8" is a common choice.
model = WhisperModel("2snem6/faster-whisper-large-v3-edacc-commonvoice-l2arctic-v3")

# Transcribe audio
segments, info = model.transcribe("audio.wav")

print(f"Detected language: {info.language} (probability: {info.language_probability:.2f})")

for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")

Advanced Usage

# With custom parameters
segments, info = model.transcribe(
    "audio.wav",
    beam_size=5,
    language="en",  # Force English
    condition_on_previous_text=False,
    temperature=0.0
)

# Batch processing
audio_files = ["file1.wav", "file2.wav", "file3.wav"]
for audio_file in audio_files:
    segments, info = model.transcribe(audio_file)
    # segments is a lazy generator; iterating it runs the transcription
    text = " ".join(segment.text.strip() for segment in segments)
    print(f"{audio_file}: {text}")

Loading from Local Path

# If you've downloaded the model locally
model = WhisperModel("/path/to/downloaded/model")
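One way to fetch the files ahead of time is the huggingface_hub CLI (a sketch; the target directory below is arbitrary, and this assumes `huggingface_hub` with its CLI extra is installed):

```shell
pip install "huggingface_hub[cli]"
huggingface-cli download 2snem6/faster-whisper-large-v3-edacc-commonvoice-l2arctic-v3 \
    --local-dir ./faster-whisper-large-v3-edacc-commonvoice-l2arctic-v3
```

The downloaded directory can then be passed to `WhisperModel` as shown above.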

⚑ Performance Comparison

FastWhisper (CTranslate2) vs Standard Transformers:

| Metric | Standard Transformers | FastWhisper (CT2) | Improvement |
|---|---|---|---|
| Speed | 1x | 2-4x | 2-4x faster |
| Memory | 1x | 0.5-0.8x | 20-50% less |
| Model Size | 1x | 0.5-0.8x | 20-50% smaller |

Performance may vary depending on hardware and audio length.

🎯 Use Cases

This optimized model is well suited to:

  • Real-time transcription applications
  • Production deployments requiring fast inference
  • Resource-constrained environments
  • Batch processing of audio files
  • API services with high throughput requirements

πŸ“ Technical Details

Conversion Process

This model was converted using the ct2-transformers-converter tool:

ct2-transformers-converter \
    --model 2snem6/whisper-large-v3-edacc-commonvoice-l2arctic-v3 \
    --output_dir faster-whisper-large-v3-edacc-commonvoice-l2arctic-v3 \
    --quantization float16 \
    --copy_files tokenizer.json preprocessor_config.json

Quantization

  • FLOAT16: Half-precision floating point for optimal speed/quality balance

πŸ”„ Original Model

This is a converted version of a fine-tuned Whisper model. Judging by its name, the original model 2snem6/whisper-large-v3-edacc-commonvoice-l2arctic-v3 was likely fine-tuned on accented English speech (EdAcc, Common Voice, and L2-ARCTIC), targeting:

  • Specific accents or dialects
  • Domain-specific vocabulary
  • Improved accuracy on certain audio types

Please refer to the original model card for training details and performance metrics.

πŸ“š Citation

If you use this converted model, please cite both the original Whisper paper and the CTranslate2 library:

Original Whisper

@misc{radford2022whisper,
  title={Robust Speech Recognition via Large-Scale Weak Supervision},
  author={Alec Radford and Jong Wook Kim and Tao Xu and Greg Brockman and Christine McLeavey and Ilya Sutskever},
  year={2022},
  eprint={2212.04356},
  archivePrefix={arXiv},
  primaryClass={eess.AS}
}

CTranslate2

@misc{ctranslate2,
  title={CTranslate2: Fast inference with Transformers and OpenNMT models},
  author={Guillaume Klein},
  year={2020},
  url={https://github.com/OpenNMT/CTranslate2}
}

🀝 Contributing

Found an issue or want to improve this model? Please open a discussion on the model's Hugging Face page.

πŸ“„ License

This model inherits the license from the original model: Apache 2.0


Converted with ❀️ using CTranslate2 and faster-whisper
