	---
	base_model:
	- microsoft/mdeberta-v3-base
	language:
	- bg
	library_name: transformers
	license: cc-by-4.0
	metrics:
	- accuracy
	- f1
	tags:
	- generated_from_trainer
	- text-classification
	pipeline_tag: text-classification
	model-index:
	- name: mdeberta-v3-base-subjectivity-sentiment-bulgarian
	  results: []
	datasets:
	- MatteoFasulo/clef2025_checkthat_task1_subjectivity
	---


	# mdeberta-v3-base-subjectivity-sentiment-bulgarian

	This model is a fine-tuned version of [microsoft/mdeberta-v3-base](https://huggingface.co/microsoft/mdeberta-v3-base) on the dataset from [CheckThat! Lab Task 1 Subjectivity Detection at CLEF 2025](https://arxiv.org/abs/2507.11764).
	It achieves the following results on the evaluation set:
	- Loss: 0.4555
	- Macro F1: 0.8291
	- Macro P: 0.8289
	- Macro R: 0.8293
	- Subj F1: 0.8100
	- Subj P: 0.8071
	- Subj R: 0.8129
	- Accuracy: 0.8312
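For readers unfamiliar with the metric names, the sketch below shows how subjective-class and macro-averaged scores are conventionally computed from predictions. The helper names and the toy labels are illustrative assumptions; this is not the evaluation script used for this model.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall and F1 with `positive` treated as the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def macro_f1(y_true, y_pred, labels=("OBJ", "SUBJ")):
    """Unweighted mean of the per-class F1 scores."""
    return sum(precision_recall_f1(y_true, y_pred, c)[2] for c in labels) / len(labels)


# Toy labels, made up purely for illustration.
y_true = ["SUBJ", "OBJ", "OBJ", "SUBJ", "OBJ", "SUBJ"]
y_pred = ["SUBJ", "OBJ", "SUBJ", "SUBJ", "OBJ", "OBJ"]

subj_p, subj_r, subj_f1 = precision_recall_f1(y_true, y_pred, "SUBJ")
print(f"Subj P={subj_p:.4f}  Subj R={subj_r:.4f}  Subj F1={subj_f1:.4f}")
print(f"Macro F1={macro_f1(y_true, y_pred):.4f}")
```

Applying the same F1 formula to the reported Subj P (0.8071) and Subj R (0.8129) gives roughly 0.8100, in line with the reported Subj F1.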

	## Model description

	This model is a fine-tuned `mDeBERTaV3-base` classifier designed for subjectivity detection in news articles, specifically for the Bulgarian language. It classifies sentences as either subjective (opinion-laden) or objective.

	The core idea, detailed in the associated paper, is to augment the transformer embeddings with sentiment scores from an auxiliary model: the three scores (positive, neutral, negative) are concatenated with the pooled encoder output before the classification layer. The paper reports that this sentiment-augmented architecture improves on standard fine-tuning, particularly in subjective-class F1. Decision threshold calibration is also applied to address class imbalance.

	This model is part of the AI Wizards' participation in the CLEF 2025 CheckThat! Lab Task 1: Subjectivity Detection in News Articles. The broader research explored its application across monolingual, multilingual, and zero-shot settings.

	## Intended uses & limitations

	This model is intended for classifying sentences as subjective or objective within news articles, which is a key component in combating misinformation, improving fact-checking pipelines, and supporting journalists. This specific checkpoint is optimized for the Bulgarian language.

	Limitations:
	* While the sentiment-augmented approach showed consistent performance gains, the paper notes that BERT-like models still surpassed LLM baselines in various scenarios, indicating areas for further research.
	* The model's effectiveness may vary for domains or linguistic styles significantly different from the news articles it was trained on.
	* Generalization to other languages beyond those explicitly evaluated (Arabic, German, English, Italian, Bulgarian, Greek, Romanian, Polish, Ukrainian) is not guaranteed without further fine-tuning.

	## Training and evaluation data

	This model was fine-tuned on the data provided for the CLEF 2025 CheckThat! Lab Task 1: Subjectivity Detection in News Articles. Training and development datasets were provided for various languages, including Bulgarian. The final evaluation included additional unseen languages to assess generalization capabilities.

	To address class imbalance, a common issue across languages, decision threshold calibration was employed during training and optimized on the development set. The base model, `microsoft/mdeberta-v3-base`, was pretrained on a diverse multilingual corpus.
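The card does not spell out the calibration procedure. One common approach is to sweep candidate thresholds over the predicted SUBJ probability on the development set and keep the one that maximizes macro F1. The following is a minimal sketch under that assumption; the function names, the threshold grid, and the toy probabilities are hypothetical and not taken from the project code.

```python
def f1_for_class(y_true, y_pred, positive):
    """F1 for one class, computed as 2TP / (2TP + FP + FN)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Mean of the OBJ (0) and SUBJ (1) F1 scores."""
    return (f1_for_class(y_true, y_pred, 0) + f1_for_class(y_true, y_pred, 1)) / 2


def calibrate_threshold(subj_probs, y_true, grid=None):
    """Return the (threshold, macro F1) pair that maximizes macro F1 on dev data."""
    if grid is None:
        grid = [i / 100 for i in range(5, 96)]  # 0.05 .. 0.95, an assumed grid
    best_t, best_score = 0.5, -1.0
    for t in grid:
        y_pred = [int(p >= t) for p in subj_probs]  # SUBJ if probability >= t
        score = macro_f1(y_true, y_pred)
        if score > best_score:  # keep the first threshold reaching the best score
            best_t, best_score = t, score
    return best_t, best_score


# Toy development set: 1 = SUBJ, 0 = OBJ (values invented for illustration).
dev_probs = [0.10, 0.20, 0.30, 0.35, 0.40, 0.45, 0.80, 0.15]
dev_labels = [0, 0, 0, 1, 1, 1, 1, 0]

threshold, score = calibrate_threshold(dev_probs, dev_labels)
print(f"best threshold={threshold:.2f}, macro F1={score:.4f}")
```

In this toy set most SUBJ examples score below 0.5, so the default threshold misses them, while the calibrated threshold separates the two classes. This mirrors how calibration can help when one class is under-predicted.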

	## How to use

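	You can use this model with the Hugging Face `transformers` library. Because the classifier takes sentiment scores as extra inputs, the example defines a custom model class and uses `cardiffnlp/twitter-xlm-roberta-base-sentiment` as the auxiliary sentiment model: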

	```python
	import torch
	import torch.nn as nn
	from transformers import AutoTokenizer, DebertaV2Config, DebertaV2Model, PreTrainedModel, pipeline
	from transformers.models.deberta.modeling_deberta import ContextPooler

	# Auxiliary sentiment model that supplies the three sentiment scores
	sent_pipe = pipeline(
	    "sentiment-analysis",
	    model="cardiffnlp/twitter-xlm-roberta-base-sentiment",
	    tokenizer="cardiffnlp/twitter-xlm-roberta-base-sentiment",
	    top_k=None,  # return all 3 sentiment scores
	)

	class CustomModel(PreTrainedModel):
	    config_class = DebertaV2Config

	    def __init__(self, config, sentiment_dim=3, num_labels=2, *args, **kwargs):
	        super().__init__(config, *args, **kwargs)
	        self.deberta = DebertaV2Model(config)
	        self.pooler = ContextPooler(config)
	        output_dim = self.pooler.output_dim
	        self.dropout = nn.Dropout(0.1)
	        # The classifier sees the pooled embedding plus the sentiment scores
	        self.classifier = nn.Linear(output_dim + sentiment_dim, num_labels)

	    def forward(self, input_ids, positive, neutral, negative, token_type_ids=None, attention_mask=None, labels=None):
	        outputs = self.deberta(input_ids=input_ids, attention_mask=attention_mask)
	        encoder_layer = outputs[0]
	        pooled_output = self.pooler(encoder_layer)
	        # Shape (batch, 3): one column per sentiment score
	        sentiment_features = torch.stack((positive, neutral, negative), dim=1).to(pooled_output.dtype)
	        combined_features = torch.cat((pooled_output, sentiment_features), dim=1)
	        logits = self.classifier(self.dropout(combined_features))
	        return {'logits': logits}

	model_name = "MatteoFasulo/mdeberta-v3-base-subjectivity-sentiment-bulgarian"
	tokenizer = AutoTokenizer.from_pretrained("microsoft/mdeberta-v3-base")
	config = DebertaV2Config.from_pretrained(
	    model_name,
	    num_labels=2,
	    id2label={0: 'OBJ', 1: 'SUBJ'},
	    label2id={'OBJ': 0, 'SUBJ': 1},
	    output_attentions=False,
	    output_hidden_states=False,
	)
	# from_pretrained is a classmethod, so call it on the class and pass the config
	model = CustomModel.from_pretrained(model_name, config=config)
	model.eval()  # inference mode (disables dropout)

	def classify_subjectivity(text: str):
	    # get full sentiment distribution
	    dist = sent_pipe(text)[0]
	    pos = next(d["score"] for d in dist if d["label"] == "positive")
	    neu = next(d["score"] for d in dist if d["label"] == "neutral")
	    neg = next(d["score"] for d in dist if d["label"] == "negative")

	    # tokenize the text
	    inputs = tokenizer(text, padding=True, truncation=True, max_length=256, return_tensors='pt')

	    # feed in the token ids together with the three sentiment scores
	    with torch.no_grad():
	        outputs = model(
	            input_ids=inputs["input_ids"],
	            attention_mask=inputs["attention_mask"],
	            positive=torch.tensor(pos).unsqueeze(0).float(),
	            neutral=torch.tensor(neu).unsqueeze(0).float(),
	            negative=torch.tensor(neg).unsqueeze(0).float(),
	        )

	    # compute probabilities and pick the top label
	    probs = torch.softmax(outputs.get('logits')[0], dim=-1)
	    label = model.config.id2label[int(probs.argmax())]
	    score = probs.max().item()

	    return {"label": label, "score": score}

	examples = [
	    "По принцип никой не иска войни, но за нещастие те се случват.",
	    "В един момент започнал сам да търси изход за своето спасение и здраве",
	]
	for text in examples:
	    result = classify_subjectivity(text)
	    print(f"Text: {text}")
	    print(f"→ Subjectivity: {result['label']} (score={result['score']:.2f})\n")
	```
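`classify_subjectivity` handles one sentence at a time. For batched inference, the sentiment distributions have to be arranged in the same (positive, neutral, negative) order that `CustomModel.forward` stacks them. A small helper for this could look like the sketch below. It is hypothetical and not part of the released code, and it uses plain Python lists and mocked scores so the ordering logic can be checked without downloading either model.

```python
SENTIMENT_ORDER = ("positive", "neutral", "negative")


def sentiment_rows(distributions):
    """Turn per-sentence lists of {'label', 'score'} dicts into [pos, neu, neg] rows."""
    rows = []
    for dist in distributions:
        scores = {d["label"].lower(): float(d["score"]) for d in dist}
        missing = [name for name in SENTIMENT_ORDER if name not in scores]
        if missing:
            raise ValueError(f"missing sentiment labels: {missing}")
        rows.append([scores[name] for name in SENTIMENT_ORDER])
    return rows


# Mocked scores in the nested-list shape assumed for a pipeline called on a list
# of texts with top_k=None; the numbers are invented for illustration.
mock_output = [
    [{"label": "negative", "score": 0.7}, {"label": "neutral", "score": 0.2}, {"label": "positive", "score": 0.1}],
    [{"label": "neutral", "score": 0.6}, {"label": "positive", "score": 0.3}, {"label": "negative", "score": 0.1}],
]

rows = sentiment_rows(mock_output)
print(rows)
```

With the real models loaded, `rows` could be wrapped in `torch.tensor(rows)` and its three columns passed as the `positive`, `neutral`, and `negative` arguments, alongside a tokenized batch of the same size.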

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 1e-05
	- train_batch_size: 16
	- eval_batch_size: 16
	- seed: 42
	- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
	- lr_scheduler_type: linear
	- num_epochs: 6
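To make the `linear` scheduler setting concrete, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps, since the card lists no warmup setting, and a total of 276 steps (6 epochs of 46 steps each, as logged in the training results). The function name is illustrative.

```python
def linear_lr(step, base_lr=1e-5, total_steps=276, warmup_steps=0):
    """Optional linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start, after epoch 1, at the midpoint, and at the end.
for step in (0, 46, 138, 276):
    print(step, f"{linear_lr(step):.2e}")
```

Under these assumptions the rate starts at the configured 1e-05, is halved by the end of epoch 3, and reaches zero at the final step.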

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Macro F1 | Macro P | Macro R | Subj F1 | Subj P | Subj R | Accuracy |
	|:-------------:|:-----:|:----:|:---------------:|:--------:|:-------:|:-------:|:-------:|:------:|:------:|:--------:|
	| No log | 1.0 | 46 | 0.6893 | 0.4013 | 0.6563 | 0.5378 | 0.6282 | 0.4626 | 0.9784 | 0.4873 |
	| No log | 2.0 | 92 | 0.5960 | 0.7650 | 0.7791 | 0.7616 | 0.7194 | 0.7982 | 0.6547 | 0.7739 |
	| No log | 3.0 | 138 | 0.5138 | 0.7928 | 0.7939 | 0.7920 | 0.7664 | 0.7778 | 0.7554 | 0.7962 |
	| No log | 4.0 | 184 | 0.4831 | 0.8054 | 0.8073 | 0.8042 | 0.7794 | 0.7970 | 0.7626 | 0.8089 |
	| No log | 5.0 | 230 | 0.4792 | 0.8082 | 0.8115 | 0.8063 | 0.7807 | 0.8077 | 0.7554 | 0.8121 |
	| No log | 6.0 | 276 | 0.4555 | 0.8291 | 0.8289 | 0.8293 | 0.8100 | 0.8071 | 0.8129 | 0.8312 |


	### Framework versions

	- Transformers 4.49.0
	- Pytorch 2.5.1+cu121
	- Datasets 3.3.1
	- Tokenizers 0.21.0

	## Code

	The official code and materials for this project are available on GitHub:
	[https://github.com/MatteoFasulo/clef2025-checkthat](https://github.com/MatteoFasulo/clef2025-checkthat)

	## Project Page

	Explore the collection of models and interactive results on the Hugging Face Hub:
	[AI Wizards @ CLEF 2025 - CheckThat! Lab - Task 1 Subjectivity](https://huggingface.co/collections/MatteoFasulo/clef-2025-checkthat-lab-task-1-subjectivity-6878f0199d302acdfe2ceddb)

	## Citation

	If you find our work helpful or inspiring, please feel free to cite it:

	```bibtex
	@misc{fasulo2025aiwizardscheckthat2025,
	title={AI Wizards at CheckThat! 2025: Enhancing Transformer-Based Embeddings with Sentiment for Subjectivity Detection in News Articles},
	author={Matteo Fasulo and Luca Babboni and Luca Tedeschini},
	year={2025},
	eprint={2507.11764},
	archivePrefix={arXiv},
	primaryClass={cs.CL},
	url={https://arxiv.org/abs/2507.11764},
	}
	```