ChunkFormer Classification Model
This model performs four speech classification tasks: gender recognition, dialect identification, emotion detection, and age classification.
Classification Tasks
- Age: 5 classes
- Dialect: 5 classes
- Emotion: 8 classes
- Gender: 2 classes
Usage
Install the package:
pip install chunkformer
Single Audio Classification
from chunkformer import ChunkFormerModel
# Load the model
model = ChunkFormerModel.from_pretrained("khanhld/chunkformer-gender-emotion-dialect-age-classification")
# Classify a single audio file
result = model.classify_audio(
    audio_path="path/to/your/audio.wav",
    chunk_size=-1,  # -1 for full attention
    left_context_size=-1,
    right_context_size=-1,
)
print(result)
# Output example:
# {
#     'gender': {
#         'label': 'female',
#         'label_id': 0,
#         'prob': 0.95
#     },
#     'dialect': {
#         'label': 'northern dialect',
#         'label_id': 3,
#         'prob': 0.70
#     },
#     'emotion': {
#         'label': 'neutral',
#         'label_id': 5,
#         'prob': 0.80
#     }
# }
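Since each task in the result carries its own probability, downstream code may want to keep only the confident predictions. The following is a minimal sketch of that idea, assuming the result has the dictionary shape shown in the output example above. The helper name `summarize` and the `min_prob` threshold are hypothetical and not part of the chunkformer package; the sample values are copied from the example rather than produced by the model.

```python
from typing import Any, Dict


def summarize(result: Dict[str, Dict[str, Any]], min_prob: float = 0.75) -> Dict[str, str]:
    """Return {task: label} for every task whose probability is at least min_prob.

    Hypothetical helper: it only assumes each task maps to a dict with
    'label' and 'prob' keys, as in the output example.
    """
    summary = {}
    for task, prediction in result.items():
        if prediction["prob"] >= min_prob:
            summary[task] = prediction["label"]
    return summary


# Sample copied from the output example (not real model output)
sample = {
    "gender": {"label": "female", "label_id": 0, "prob": 0.95},
    "dialect": {"label": "northern dialect", "label_id": 3, "prob": 0.70},
    "emotion": {"label": "neutral", "label_id": 5, "prob": 0.80},
}

print(summarize(sample))
# {'gender': 'female', 'emotion': 'neutral'}
```

The dialect prediction is dropped here because its probability (0.70) falls below the assumed threshold of 0.75; a suitable threshold depends on the application.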
Command Line Usage
chunkformer-decode \
    --model_checkpoint khanhld/chunkformer-gender-emotion-dialect-age-classification \
    --audio_file path/to/audio.wav
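To run the same command over several files, one option is a small Python wrapper that invokes the CLI once per file. This is a sketch under stated assumptions: it uses only the two flags shown above (`--model_checkpoint` and `--audio_file`), and the helper names `build_command` and `decode_folder` are hypothetical. Whether the CLI offers a built-in batch mode is not covered here.

```python
import subprocess
from pathlib import Path
from typing import List

MODEL = "khanhld/chunkformer-gender-emotion-dialect-age-classification"


def build_command(audio_file: Path, model: str = MODEL) -> List[str]:
    """Build the chunkformer-decode invocation for one file (flags as shown above)."""
    return [
        "chunkformer-decode",
        "--model_checkpoint", model,
        "--audio_file", str(audio_file),
    ]


def decode_folder(folder: str) -> None:
    """Run the CLI once per .wav file in `folder`, in sorted order.

    Requires the chunkformer package to be installed so that the
    chunkformer-decode executable is on PATH.
    """
    for wav in sorted(Path(folder).glob("*.wav")):
        subprocess.run(build_command(wav), check=True)
```

Calling `decode_folder("path/to/folder")` processes the files sequentially. Each invocation starts a new process, so for many files the Python API shown earlier, which loads the model once, is likely the more efficient route.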
Training
This model was trained using the ChunkFormer framework. For more details about the training process and to access the source code, please visit: https://github.com/khanld/chunkformer
Paper: https://arxiv.org/abs/2502.14673
Citation
If you use this work in your research, please cite:
@INPROCEEDINGS{10888640,
author={Le, Khanh and Ho, Tuan Vu and Tran, Dung and Chau, Duc Thanh},
booktitle={ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
title={ChunkFormer: Masked Chunking Conformer For Long-Form Speech Transcription},
year={2025},
volume={},
number={},
pages={1-5},
keywords={Scalability;Memory management;Graphics processing units;Signal processing;Performance gain;Hardware;Resource management;Speech processing;Standards;Context modeling;chunkformer;masked batch;long-form transcription},
doi={10.1109/ICASSP49660.2025.10888640}}