Model Card for OpenEmotions-Mistral-24B
This model fine-tunes a Mistral-24B generative language model to evaluate the intensity of multiple emotions expressed in text, including broader affective dimensions such as Valence and Arousal. Rather than classifying whether emotions are present, the model predicts continuous 0–100 scores for each emotion, enabling fine-grained emotional profiling useful for downstream analysis in behavioral science, finance, and psychology.
Model Details
Model Description
- Developed by: Francesco A. Fabozzi, Dasol Kim, and William Goetzmann
- Shared by: Francesco A. Fabozzi
- Model type: Decoder-only LLM (fine-tuned for multi-emotion intensity regression)
- Language(s): English
- License: apache-2.0
- Finetuned from model: mistralai/Mistral-Small-24B-Instruct-2501
Model Sources
- Repository: [Link to GitHub repository if applicable]
- Paper: [Link to paper or arXiv when posted]
- Demo: [Optional HuggingFace Space link if available]
Uses
Direct Use
This model can be used to automatically score the intensity of a fixed list of emotions in text. It outputs a JSON object mapping each emotion to a score between 0 and 100, along with Valence and Arousal scores.
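For illustration, the output for a single text might look like the following (the emotion names shown here are hypothetical; the model scores the fixed emotion list it was trained on):

```json
{
  "Joy": 72,
  "Sadness": 0,
  "Anger": 5,
  "Fear": 10,
  "Valence": 68,
  "Arousal": 55
}
```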
Downstream Use
Ideal for research settings where emotional variables are inputs to regression or forecasting models—such as behavioral finance, psychology studies, and social science research.
Out-of-Scope Use
This model is not designed for empathetic dialogue generation, emotion classification for chatbots, clinical diagnosis, or high-stakes decision-making without human validation.
Bias, Risks, and Limitations
As with all emotion modeling systems, outputs may reflect biases present in the annotation guidelines and the underlying data. Emotional intensity is inherently subjective, and model predictions should be used carefully, particularly in sensitive domains.
Recommendations
We recommend thorough evaluation before using this model in applications where misinterpretation of emotional signals could have serious consequences.
How to Get Started with the Model
Load the model and tokenizer using the Hugging Face transformers library to generate emotion intensity scores for any input text.
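A minimal sketch is shown below. It assumes the checkpoint is hosted as fabozzi/OpenEmotions-Mistral-24B-Instruct and uses a generic instruction prompt; the exact prompt template used during fine-tuning is not documented here, so adapt both to your setup.

```python
import json
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "fabozzi/OpenEmotions-Mistral-24B-Instruct"  # assumed repository name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Hypothetical instruction prompt; replace with the template used during fine-tuning.
text = "The market rallied today and investors were thrilled."
prompt = (
    "Rate the intensity of each emotion in the following text on a 0-100 scale "
    "and return a JSON object.\n\nText: " + text
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens and parse the JSON scores.
completion = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
scores = json.loads(completion)
print(scores)
```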
Training Details
Training Data
- Fine-tuned on the OpenEmotions dataset: 1,177 texts labeled by expert annotators.
- Each text is scored for a fixed list of emotions on a 0–100 scale, along with Valence and Arousal.
Training Procedure
- Finetuning technique: LoRA (Low-Rank Adaptation)
- LoRA configuration: rank=16, alpha=32, dropout=0.1, attached to the q_proj and v_proj modules (see the configuration sketch after this list)
- Quantization: 4-bit precision during training
- Training precision: mixed precision (fp16)
- Optimizer: AdamW, learning rate 5e-5
- Training epochs: 10
- Batch size: 16 (training), 32 (inference)
- Training hardware: Three NVIDIA A100 80GB GPUs
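The sketch below restates the reported configuration with PEFT and BitsAndBytes. The hyperparameters follow the values listed above; everything else (dataset handling, the trainer loop, and whether the batch size is per device or global) is an assumption, not the authors' exact training script.

```python
import torch
from transformers import (
    AutoModelForCausalLM,
    BitsAndBytesConfig,
    TrainingArguments,
)
from peft import LoraConfig, get_peft_model

base_model = "mistralai/Mistral-Small-24B-Instruct-2501"

# 4-bit quantization during training, as reported above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA: rank=16, alpha=32, dropout=0.1 on the q_proj and v_proj modules.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Reported optimization settings: AdamW, lr 5e-5, 10 epochs, batch size 16, fp16.
training_args = TrainingArguments(
    output_dir="openemotions-mistral-24b-lora",
    num_train_epochs=10,
    per_device_train_batch_size=16,  # assumed per-device; the card reports batch size 16
    learning_rate=5e-5,
    optim="adamw_torch",
    fp16=True,
)
```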
Evaluation
Testing Data
- Evaluated on a held-out test set (approximately 25% of the OpenEmotions dataset).
Factors
- Disaggregated results by each emotion category, including Valence and Arousal.
Metrics
- Concordance Correlation Coefficient (CCC): Measures ranking and scale alignment with human annotators.
- Zero-Match F1 Score: Evaluates whether the model correctly predicts zero intensity when an emotion is absent (both metrics are sketched below).
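A minimal sketch of both metrics, assuming standard definitions (CCC as Lin's concordance correlation coefficient, and zero-match F1 as a binary F1 over zero versus non-zero intensities); the paper may use slightly different conventions.

```python
import numpy as np
from sklearn.metrics import f1_score

def concordance_ccc(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Lin's concordance correlation coefficient between human and model scores."""
    mean_t, mean_p = y_true.mean(), y_pred.mean()
    var_t, var_p = y_true.var(), y_pred.var()
    cov = np.mean((y_true - mean_t) * (y_pred - mean_p))
    return 2 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)

def zero_match_f1(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """F1 for predicting exactly zero intensity when the emotion is absent."""
    return f1_score(y_true == 0, y_pred == 0)

# Toy usage with 0-100 intensity scores for one emotion across five texts.
human = np.array([0, 10, 75, 0, 40], dtype=float)
model = np.array([0, 15, 70, 5, 35], dtype=float)
print(concordance_ccc(human, model), zero_match_f1(human, model))
```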
Results
- Fine-tuned Mistral-24B achieves an average CCC of over 80% across emotions.
- Dramatic improvement over pretrained LLMs and RoBERTa classification baselines.
- Even without direct supervision, the model accurately infers Valence and Arousal, demonstrating latent affective structure learning.
- Generalizes to unseen emotions and maintains reliable prediction consistency.
Summary
The model produces human-aligned, interpretable emotion intensity scores across a wide range of emotions and affective dimensions. Its capacity to generalize to Valence and Arousal without direct labels highlights its strength in capturing richer emotional representations, including underexplored dimensions like Arousal.
Model Examination
Future work includes cross-lingual evaluation, robustness testing across domains, and expansion to more nuanced emotional states.
Environmental Impact
- Hardware Type: 3x NVIDIA A100 80GB GPUs
- Hours used: Approximately 1–2 hours
Technical Specifications
Model Architecture and Objective
- Decoder-only generative model.
- Fine-tuned via instruction-following to output JSON-formatted emotion intensity scores.
Compute Infrastructure
- Hardware: 3x A100 80GB GPUs
- Software: Hugging Face Transformers, BitsAndBytes, PEFT
Citation
BibTeX:
@misc{fabozzi2025openemotions,
  author = {Francesco A. Fabozzi and Dasol Kim and William Goetzmann},
  title = {OpenEmotions: Fine-Grained Emotion Intensity Evaluation with Generative Language Models},
  year = {2025},
  howpublished = {arXiv preprint},
  note = {EMNLP 2025 Submission}
}
APA
Fabozzi, F. A., Kim, D., & Goetzmann, W. (2025). OpenEmotions: Fine-Grained Emotion Intensity Evaluation with Generative Language Models. arXiv preprint. Submission to EMNLP 2025.
Model Card Authors
- Francesco A. Fabozzi
Model Card Contact