XTTS v2 - Setswana Fine-Tune
This is a fine-tuned version of Coqui XTTS v2 for Setswana (Tswana).
It was trained on 3,428 high-quality clips from the Mozilla Common Voice 17.0 dataset to capture authentic Setswana prosody, rhythm, and intonation.
Model Capabilities
- Authentic Prosody: Captures the melodic flow and stress patterns of native Setswana speech.
- Native Pronunciation: Improved handling of specific Setswana phonemes compared to the base model.
- Cross-Lingual Inference: Can transfer the Setswana voice style to other languages supported by XTTS.
Training Metrics
- Base Model: XTTS v2.0.2
- Dataset: Common Voice Setswana (Validated, >2 upvotes)
- Training Steps: 850+ (epoch 1 complete)
- Initial Loss: 3.46
- Final Eval Loss: 2.22
- Current Training Loss: ~1.83
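The dataset filter listed above (validated clips with more than 2 upvotes) could be reproduced along the following lines. This is a minimal sketch, not the author's actual preprocessing script. It assumes the usual Common Voice `validated.tsv` layout with `path`, `sentence`, and `up_votes` columns. The function name `filter_clips` and the sample rows are hypothetical placeholders.

```python
import csv
import io


def filter_clips(tsv_text, more_than=2):
    """Return (path, sentence) pairs whose up_votes count exceeds `more_than`."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    kept = []
    for row in reader:
        # Common Voice stores vote counts as text, so convert before comparing.
        if int(row["up_votes"]) > more_than:
            kept.append((row["path"], row["sentence"]))
    return kept


# Hypothetical sample rows standing in for a real validated.tsv file.
SAMPLE_TSV = (
    "client_id\tpath\tsentence\tup_votes\tdown_votes\n"
    "a1\tclip_0001.mp3\tDumêla rra, o tsogile jang?\t4\t0\n"
    "a2\tclip_0002.mp3\tplaceholder sentence two\t2\t0\n"
    "a3\tclip_0003.mp3\tplaceholder sentence three\t3\t1\n"
)

selected = filter_clips(SAMPLE_TSV)
for path, sentence in selected:
    print(path, sentence)
```

For a real release, the same function can be fed the contents of the downloaded `validated.tsv` file. The threshold is a parameter so that a stricter or looser cut can be tried.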
Usage
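from TTS.api import TTS

# Load the fine-tuned model (gpu=True requires a CUDA-capable device)
tts = TTS("ogaufi/xtts-v2-setswana", gpu=True)

# Generate speech, cloning the voice in reference_speaker.wav
tts.tts_to_file(
    text="Dumêla rra, o tsogile jang?",
    file_path="output.wav",
    speaker_wav="reference_speaker.wav",
    language="en",  # Use 'en' as the base language code
)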
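The usage snippet passes `language="en"` even though the text is Setswana, and the capabilities list mentions cross-lingual inference. The sketch below shows one way to prepare the arguments for other target languages before calling `tts_to_file`. The helper `build_tts_args` is hypothetical. The language set is an assumption based on the codes commonly documented for XTTS v2, so check it against your installed version. Setswana (`tn`) is not among those codes.

```python
# Assumed set of XTTS v2 language codes; verify against your installed TTS version.
XTTS_LANGUAGES = {
    "en", "es", "fr", "de", "it", "pt", "pl", "tr", "ru",
    "nl", "cs", "ar", "zh-cn", "ja", "hu", "ko", "hi",
}


def build_tts_args(text, speaker_wav, language="en", file_path="output.wav"):
    """Build keyword arguments for tts.tts_to_file, rejecting unknown language codes."""
    if language not in XTTS_LANGUAGES:
        raise ValueError(
            f"Unsupported language code {language!r}; "
            "use 'en' as the base language for Setswana text."
        )
    return {
        "text": text,
        "file_path": file_path,
        "speaker_wav": speaker_wav,
        "language": language,
    }


# Cross-lingual example: French text spoken in the reference speaker's voice.
args = build_tts_args(
    "Bonjour, comment allez-vous ?",
    speaker_wav="reference_speaker.wav",
    language="fr",
    file_path="output_fr.wav",
)
print(args)
```

With a model loaded as in the snippet above, the result can be passed on as `tts.tts_to_file(**args)`. Validating the code up front gives a clear error before any audio is generated.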