Ukrainian Manipulation Detector - ModernBERT

Model Description

This model detects propaganda and manipulation techniques in Ukrainian text. It is Goader/modern-liberta-large fine-tuned on the UNLP 2025 Shared Task dataset for multi-label classification of manipulation techniques.

Task: Manipulation Technique Classification

The model performs multi-label text classification, identifying 5 major manipulation categories consolidated from 10 original techniques. A single text can contain multiple techniques.

Manipulation Categories

Category	Label Name	Description & Consolidated Techniques
Emotional Manipulation	`emotional_manipulation`	Involves using loaded language with strong emotional connotations or a euphoric tone to boost morale and sway opinion.
Fear Appeals	`fear_appeals`	Preys on fears, stereotypes, or prejudices. Includes Fear, Uncertainty, and Doubt (FUD) tactics.
Bandwagon Effect	`bandwagon_effect`	Uses glittering generalities (abstract, positive concepts) or appeals to the masses ("everyone thinks this") to encourage agreement.
Selective Truth	`selective_truth`	Relies on logical fallacies such as cherry-picking facts, whataboutism that deflects criticism, or straw man arguments that distort an opponent's position.
Thought-Terminating Cliché	`cliche`	Uses formulaic phrases designed to shut down critical thinking and end a discussion. Examples: "Все не так однозначно" ("It's not that clear-cut"), "Де ви були 8 років?" ("Where were you for 8 years?")

Training Data

The model was trained on the dataset from the UNLP 2025 Shared Task on manipulation technique classification.

Dataset: UNLP 2025 Techniques Classification
Source Texts: Ukrainian texts filtered from a larger multilingual dataset.
Size: 2,147 training examples.
Task: Multi-label classification.

Label Distribution in Training Data

Technique	Count	Percentage
Emotional Manipulation	1,094	50.9%
Bandwagon Effect	451	21.0%
Thought-Terminating Cliché	240	11.2%
Fear Appeals	198	9.2%
Selective Truth	187	8.7%

Training Configuration

The model was fine-tuned using the following hyperparameters:

Parameter	Value
Base Model	`Goader/modern-liberta-large`
Learning Rate	`2e-5`
Train Batch Size	`16`
Eval Batch Size	`32`
Epochs	`10`
Max Sequence Length	`512`
Optimizer	AdamW
Loss Function	`BCEWithLogitsLoss` (with class weights)
Train/Val Split	90% / 10%
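
The table lists a class-weighted `BCEWithLogitsLoss` but does not say how the weights were derived. Below is a minimal sketch of one common choice, the negative-to-positive ratio per label, computed from the label counts reported in the training data table. The weighting formula and the helper names `pos_weights` and `build_loss` are assumptions for illustration, not the confirmed training setup.

```python
# Label counts and training-set size as reported in this model card.
# The label order follows the `labels` list used in the Usage section.
TRAIN_SIZE = 2147
LABEL_COUNTS = {
    "emotional_manipulation": 1094,
    "fear_appeals": 198,
    "bandwagon_effect": 451,
    "selective_truth": 187,
    "cliche": 240,
}


def pos_weights(counts, n_examples):
    """Negative-to-positive ratio per label (an assumed weighting scheme)."""
    return {name: (n_examples - c) / c for name, c in counts.items()}


def build_loss(weights):
    """Build a weighted BCEWithLogitsLoss (hypothetical helper).

    torch is imported here so the weight computation above runs without it.
    """
    import torch

    pos_weight = torch.tensor(list(weights.values()), dtype=torch.float32)
    return torch.nn.BCEWithLogitsLoss(pos_weight=pos_weight)


weights = pos_weights(LABEL_COUNTS, TRAIN_SIZE)
for name, w in weights.items():
    print(f"{name}: {w:.2f}")
```

Under this scheme the rarest label, `selective_truth`, gets a weight above 10, while `emotional_manipulation` gets a weight just below 1, so errors on rare labels count more in the loss.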

Usage

Installation

First, install the necessary libraries:

pip install transformers torch

Quick Start

Here is how to use the model to classify a single piece of text:

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Define model and label names
model_name = "olehmell/ukr-manipulation-detector-modern-bert"
labels = [
    'emotional_manipulation', 
    'fear_appeals', 
    'bandwagon_effect', 
    'selective_truth', 
    'cliche'
]

# Load pretrained model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

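# Prepare text (Ukrainian input; English gloss: "All the experts confirmed this
# long ago; only you don't understand what is really going on.")
text = "Всі експерти вже давно це підтвердили, тільки ви не розумієте, що відбувається насправді."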

# Tokenize and predict
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    outputs = model(**inputs)
    # Apply sigmoid to convert logits to probabilities
    predictions = torch.sigmoid(outputs.logits)

# Get detected techniques (using a threshold of 0.5)
threshold = 0.5
detected_techniques = {}
for i, score in enumerate(predictions[0]):
    if score > threshold:
        detected_techniques[labels[i]] = f"{score:.2f}"

if detected_techniques:
    print("Detected techniques:")
    for technique, score in detected_techniques.items():
        print(f"- {technique} (Score: {score})")
else:
    print("No manipulation techniques detected.")

Batch Processing

For processing multiple texts efficiently, use batching:

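def detect_manipulation_batch(texts, batch_size=32):
    """Return one array of per-label probabilities for each text.

    Texts are processed in batches using the `model` and `tokenizer`
    loaded in the Quick Start.
    """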
    results = []
    for i in range(0, len(texts), batch_size):
        batch = texts[i:i+batch_size]
        inputs = tokenizer(
            batch, 
            return_tensors="pt", 
            padding=True, 
            truncation=True, 
            max_length=512
        )
        
        with torch.no_grad():
            outputs = model(**inputs)
            predictions = torch.sigmoid(outputs.logits)
            results.extend(predictions.cpu().numpy())
            
    return results

# Example usage
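# English glosses are given in the comments; the model expects the Ukrainian text.
corpus = [
    # "This is terrible, they want to destroy us all!"
    "Це жахливо, вони хочуть нас усіх знищити!",
    # "The whole world supports this decision, and only traitors are against it."
    "Весь світ підтримує це рішення, і тільки зрадники проти.",
    # "Just do what you're told and don't ask unnecessary questions."
    "Просто роби, що тобі кажуть, і не став зайвих питань."
]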

batch_results = detect_manipulation_batch(corpus)

# Print results for the batch
for i, text in enumerate(corpus):
    print(f"\nText: \"{text}\"")
    detected_batch = {}
    for j, score in enumerate(batch_results[i]):
        if score > threshold:
            detected_batch[labels[j]] = f"{score:.2f}"
    
    if detected_batch:
        print("Detected techniques:")
        for technique, score in detected_batch.items():
            print(f"- {technique} (Score: {score})")
    else:
        print("No manipulation techniques detected.")

Performance

Note: Metrics will be updated after the final evaluation.

Metric	Value
F1 Macro	0.46
F1 Micro	0.68
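
A gap like this between the two scores is typical for imbalanced multi-label data. Micro F1 pools true and false positives over all labels, so frequent labels dominate, while macro F1 averages the per-label F1 scores with equal weight. Below is a minimal pure-Python sketch of both definitions on toy data. The function names and the toy matrices are illustrative assumptions, not the evaluation code or data behind the reported metrics.

```python
def _f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when all three counts are zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_scores(y_true, y_pred):
    """Return (macro_f1, micro_f1) for binary multi-label matrices.

    y_true and y_pred are lists of rows, one row per text and one
    0/1 column per label.
    """
    n_labels = len(y_true[0])
    per_label = []
    total_tp = total_fp = total_fn = 0
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        per_label.append(_f1(tp, fp, fn))
        total_tp += tp
        total_fp += fp
        total_fn += fn
    macro = sum(per_label) / n_labels
    micro = _f1(total_tp, total_fp, total_fn)
    return macro, micro


# Toy data: label 0 is frequent and predicted well, label 1 is rare and missed.
y_true = [[1, 0], [1, 0], [1, 0], [1, 1]]
y_pred = [[1, 0], [1, 0], [1, 0], [1, 0]]
macro, micro = f1_scores(y_true, y_pred)
print(f"macro={macro:.2f} micro={micro:.2f}")
```

In the toy data, missing the single rare-label example halves macro F1 (0.50) while micro F1 stays near 0.89, the same direction of gap as in the reported scores.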

Limitations

Language Specificity: The model is optimized for Ukrainian and may not perform well on other languages.
Domain Sensitivity: The model was trained primarily on political and social media discourse, so its performance may vary on other text domains.
Context Length: Inputs are limited to 512 tokens, the maximum sequence length used in training. Longer documents must be split into chunks or they will be truncated.
Class Imbalance: Fear Appeals (9.2%) and Selective Truth (8.7%) appear in far fewer training examples than Emotional Manipulation (50.9%), which may lower detection accuracy for the rarer techniques.
Mixed Language: Accuracy may be reduced on text with heavy code-mixing of Ukrainian and Russian.
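
One way to handle documents beyond the context limit is to split the token sequence into overlapping windows, score each window, and keep the highest score per label. Below is a minimal sketch of that idea on plain lists. The window size, the overlap, the max-aggregation, and the names `chunk_token_ids` and `aggregate_max` are assumptions for illustration, not part of the released model or its training setup.

```python
def chunk_token_ids(token_ids, window=510, overlap=64):
    """Split a token id list into overlapping windows.

    window=510 is an assumed value that leaves room for two special
    tokens within a 512-token limit.
    """
    if not 0 <= overlap < window:
        raise ValueError("overlap must be in [0, window)")
    step = window - overlap
    chunks = []
    for start in range(0, len(token_ids), step):
        chunks.append(token_ids[start:start + window])
        if start + window >= len(token_ids):
            break  # this window already reaches the end of the text
    return chunks


def aggregate_max(chunk_scores):
    """Combine per-chunk label probabilities by taking the max per label."""
    return [max(column) for column in zip(*chunk_scores)]


# Toy usage with integer stand-ins for token ids and made-up scores.
ids = list(range(1000))
chunks = chunk_token_ids(ids)
print([len(c) for c in chunks])

scores = aggregate_max([[0.1, 0.7], [0.6, 0.2]])
print(scores)
```

Each window would still need the tokenizer's special tokens added before it is passed to the model. Max-aggregation flags a technique if any window contains it, which suits detection but may raise the false-positive rate on long documents.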

Ethical Considerations

Purpose: This model is intended as a tool to support media literacy and critical thinking, not as an arbiter of truth.
Human Oversight: Model outputs should be interpreted with human judgment and a full understanding of the context. The model should not be used to automatically censor content.
Potential Biases: The model may reflect biases present in the training data.

Citation

If you use this model in your research, please cite the following:

@misc{ukrainian-manipulation-modernbert-2025,
  author = {Oleh Mell},
  title = {Ukrainian Manipulation Detector - ModernBERT},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/olehmell/ukr-manipulation-detector-modern-bert}
}

@inproceedings{unlp2025shared,
  title={UNLP 2025 Shared Task on Techniques Classification},
  author={UNLP Workshop Organizers},
  booktitle={UNLP 2025 Workshop},
  year={2025},
  url={https://github.com/unlp-workshop/unlp-2025-shared-task}
}

License

This model is licensed under the Apache 2.0 License.

Contact

For questions or feedback, please open an issue on the model's Hugging Face repository.

Acknowledgments

The organizers of the UNLP 2025 Workshop for providing the dataset.
Goader for creating and sharing the base modern-liberta-large model for Ukrainian.
