Dataset: xbgoose/ravdess
How to use LincolnD/wav2vec2-base-finetuned-ravdess-personalization with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("audio-classification", model="LincolnD/wav2vec2-base-finetuned-ravdess-personalization")
# Load model directly
from transformers import AutoProcessor, AutoModelForAudioClassification
processor = AutoProcessor.from_pretrained("LincolnD/wav2vec2-base-finetuned-ravdess-personalization")
model = AutoModelForAudioClassification.from_pretrained("LincolnD/wav2vec2-base-finetuned-ravdess-personalization")

This model is part of our ICASSP 2026 paper: Test-Time Adaptation Methods for Speech Emotion Recognition.
Wav2Vec2 fine-tuned on RAVDESS for Task1 (intra-corpus personalization)
from transformers import AutoModelForAudioClassification, AutoFeatureExtractor
import torch
import torchaudio
# Load model and processor
model_checkpoint = "LincolnD/wav2vec2-base-finetuned-ravdess-personalization"
processor = AutoFeatureExtractor.from_pretrained(model_checkpoint)
model = AutoModelForAudioClassification.from_pretrained(model_checkpoint)
# Load and process audio
audio_path = "path/to/your/audio.wav"
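waveform, sample_rate = torchaudio.load(audio_path)  # shape: (channels, samples)
# Downmix to mono: a multi-channel array would otherwise be treated as a batch
if waveform.shape[0] > 1:
    waveform = waveform.mean(dim=0, keepdim=True)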
# Resample to 16kHz if needed
if sample_rate != 16000:
    resampler = torchaudio.transforms.Resample(sample_rate, 16000)
    waveform = resampler(waveform)
# Process and predict
inputs = processor(waveform.squeeze().numpy(), sampling_rate=16000, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)
predictions = torch.argmax(outputs.logits, dim=-1)
emotion_id = predictions.item()
emotion = model.config.id2label[emotion_id]
print(f"Predicted emotion: {emotion}")
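The snippet above reports only the top-scoring class. When the full distribution over emotions is of interest, the logits can be converted to probabilities with a softmax and ranked. The following is a minimal sketch using only the standard library; the function name `rank_emotions` and the example logits and label map are hypothetical and are not taken from the model's actual configuration.

```python
import math


def rank_emotions(logits, id2label):
    """Return (label, probability) pairs sorted from most to least likely."""
    # Subtract the maximum logit before exponentiating for numerical stability
    max_logit = max(logits)
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return [(id2label[idx], prob) for idx, prob in ranked]


# Hypothetical logits and label map, for illustration only
example_logits = [0.2, 2.5, -1.0]
example_labels = {0: "neutral", 1: "happy", 2: "sad"}
for label, prob in rank_emotions(example_logits, example_labels):
    print(f"{label}: {prob:.3f}")
```

With the model loaded as above, the same helper could be called as `rank_emotions(outputs.logits[0].tolist(), model.config.id2label)`, assuming a single audio clip was passed to the processor.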
This model was fine-tuned from the pre-trained Wav2Vec2 base model on the RAVDESS dataset.
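For per-speaker evaluation it can help to recover the emotion label and actor ID from each file. The sketch below assumes the files follow the original RAVDESS naming convention of seven hyphen-separated fields (modality, vocal channel, emotion, intensity, statement, repetition, actor). The copy of the dataset on the Hub may store labels differently, and the label set in this model's configuration may not match these eight names, so treat the mapping and the helper names `parse_ravdess_filename` and `group_by_actor` as illustrative assumptions.

```python
from collections import defaultdict
from pathlib import Path

# Emotion codes from the original RAVDESS naming convention (third field)
RAVDESS_EMOTIONS = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}


def parse_ravdess_filename(path):
    """Extract the emotion label and actor ID from a RAVDESS-style filename."""
    fields = Path(path).stem.split("-")
    if len(fields) != 7:
        raise ValueError(f"Expected 7 hyphen-separated fields, got {len(fields)}: {path}")
    return {"emotion": RAVDESS_EMOTIONS[fields[2]], "actor": int(fields[6])}


def group_by_actor(paths):
    """Group file paths by actor ID, e.g. to evaluate each speaker separately."""
    groups = defaultdict(list)
    for path in paths:
        groups[parse_ravdess_filename(path)["actor"]].append(path)
    return dict(groups)


# Hypothetical paths, for illustration only
files = ["Actor_12/03-01-06-01-02-01-12.wav", "Actor_01/03-01-03-01-01-01-01.wav"]
for actor, actor_files in sorted(group_by_actor(files).items()):
    print(actor, [parse_ravdess_filename(f)["emotion"] for f in actor_files])
```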
This model is designed for use in Test-Time Adaptation (TTA) experiments as part of our research on adapting speech emotion recognition systems to new domains and speakers.
For detailed evaluation results and comparison with various TTA methods, please refer to our paper.
LincolnD
MIT License - See repository for details.