Ukrainian Manipulation Detector - ModernBERT

Model Description

This model detects propaganda and manipulation techniques in Ukrainian text. It is a fine-tuned version of Goader/modern-liberta-large trained on the UNLP 2025 Shared Task dataset for multi-label classification of manipulation techniques.

Task: Manipulation Technique Classification

The model performs multi-label text classification, identifying 5 major manipulation categories consolidated from 10 original techniques. A single text can contain multiple techniques.
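
As a minimal sketch of what "multi-label" means here: each of the five outputs is passed through its own sigmoid, so the probabilities are independent and do not have to sum to 1. The logits below are hypothetical values chosen for illustration, not real model output; the label order is the one used in the Quick Start code of this card.

```python
import math

def sigmoid(x: float) -> float:
    """Map a single logit to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Label order as listed in the Quick Start section of this card.
labels = [
    "emotional_manipulation",
    "fear_appeals",
    "bandwagon_effect",
    "selective_truth",
    "cliche",
]

# Hypothetical logits for one text (illustration only, not real model output).
logits = [2.0, -1.5, 0.4, -3.0, 0.0]

# Each label gets its own probability; there is no softmax across labels.
probs = [sigmoid(z) for z in logits]

# Any number of labels (including zero) may exceed the threshold.
active = [name for name, p in zip(labels, probs) if p > 0.5]

print([round(p, 2) for p in probs])
print(active)
```

With these illustrative logits, two labels clear the 0.5 threshold at once, which a softmax-based single-label classifier could not express.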

Manipulation Categories

| Category | Label Name | Description & Consolidated Techniques |
| --- | --- | --- |
| Emotional Manipulation | emotional_manipulation | Uses loaded language with strong emotional connotations or a euphoric tone to boost morale and sway opinion. |
| Fear Appeals | fear_appeals | Preys on fears, stereotypes, or prejudices. Includes Fear, Uncertainty, and Doubt (FUD) tactics. |
| Bandwagon Effect | bandwagon_effect | Uses glittering generalities (abstract, positive concepts) or appeals to the masses ("everyone thinks this") to encourage agreement. |
| Selective Truth | selective_truth | Employs logical fallacies such as cherry-picking facts, whataboutism to deflect criticism, or straw man arguments that distort an opponent's position. |
| Thought-Terminating Cliché | cliche | Uses formulaic phrases designed to shut down critical thinking and end a discussion. Examples: "Все не так однозначно" ("It's not all so clear-cut"), "Де ви були 8 років?" ("Where were you for 8 years?") |

Training Data

The model was trained on the dataset from the UNLP 2025 Shared Task on manipulation technique classification.

  • Dataset: UNLP 2025 Techniques Classification
  • Source Texts: Ukrainian texts filtered from a larger multilingual dataset.
  • Size: 2,147 training examples.
  • Task: Multi-label classification.

Label Distribution in Training Data

| Technique | Count | Percentage |
| --- | --- | --- |
| Emotional Manipulation | 1,094 | 50.9% |
| Bandwagon Effect | 451 | 21.0% |
| Thought-Terminating Cliché | 240 | 11.2% |
| Fear Appeals | 198 | 9.2% |
| Selective Truth | 187 | 8.7% |

Training Configuration

The model was fine-tuned using the following hyperparameters:

| Parameter | Value |
| --- | --- |
| Base Model | Goader/modern-liberta-large |
| Learning Rate | 2e-5 |
| Train Batch Size | 16 |
| Eval Batch Size | 32 |
| Epochs | 10 |
| Max Sequence Length | 512 |
| Optimizer | AdamW |
| Loss Function | BCEWithLogitsLoss (with class weights) |
| Train/Val Split | 90% / 10% |
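
The card does not say how the class weights for the loss were derived. One common choice, shown here purely as an assumption, is to weight each label's positive term by the ratio of negative to positive examples. The sketch below computes those ratios from the label counts reported in this card; the function name `positive_weights` is hypothetical, and the counts cover the full 2,147 examples rather than the 90% training portion.

```python
# Label counts and dataset size as reported in this card.
n_examples = 2147
label_counts = {
    "emotional_manipulation": 1094,
    "fear_appeals": 198,
    "bandwagon_effect": 451,
    "selective_truth": 187,
    "cliche": 240,
}

def positive_weights(counts, total):
    """Return negatives / positives per label.

    This is an assumed weighting scheme, not necessarily the one
    used to train the model.
    """
    return {name: (total - pos) / pos for name, pos in counts.items()}

pos_weight = positive_weights(label_counts, n_examples)

# Rare labels receive larger weights than frequent ones.
for name, weight in pos_weight.items():
    print(f"{name}: {weight:.2f}")
```

In PyTorch, values like these could be converted to a tensor (in the model's label order) and passed as the `pos_weight` argument of `torch.nn.BCEWithLogitsLoss`, so that missed positives for rare labels cost more than missed positives for frequent ones.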

Usage

Installation

First, install the necessary libraries:

pip install transformers torch

Quick Start

Here is how to use the model to classify a single piece of text:

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Define model and label names
model_name = "olehmell/ukr-manipulation-detector-modern-bert"
labels = [
    'emotional_manipulation', 
    'fear_appeals', 
    'bandwagon_effect', 
    'selective_truth', 
    'cliche'
]

# Load pretrained model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Prepare text
text = "Всі експерти вже давно це підтвердили, тільки ви не розумієте, що відбувається насправді."

# Tokenize and predict
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    outputs = model(**inputs)
    # Apply sigmoid to convert logits to probabilities
    predictions = torch.sigmoid(outputs.logits)

# Get detected techniques (using a threshold of 0.5)
threshold = 0.5
detected_techniques = {}
for i, score in enumerate(predictions[0]):
    if score > threshold:
        detected_techniques[labels[i]] = f"{score:.2f}"

if detected_techniques:
    print("Detected techniques:")
    for technique, score in detected_techniques.items():
        print(f"- {technique} (Score: {score})")
else:
    print("No manipulation techniques detected.")

Batch Processing

For processing multiple texts efficiently, use batching:

def detect_manipulation_batch(texts, batch_size=32):
    """Processes a list of texts in batches."""
    results = []
    for i in range(0, len(texts), batch_size):
        batch = texts[i:i+batch_size]
        inputs = tokenizer(
            batch, 
            return_tensors="pt", 
            padding=True, 
            truncation=True, 
            max_length=512
        )
        
        with torch.no_grad():
            outputs = model(**inputs)
            predictions = torch.sigmoid(outputs.logits)
            results.extend(predictions.cpu().numpy())
            
    return results

# Example usage
corpus = [
    "Це жахливо, вони хочуть нас усіх знищити!",
    "Весь світ підтримує це рішення, і тільки зрадники проти.",
    "Просто роби, що тобі кажуть, і не став зайвих питань."
]

batch_results = detect_manipulation_batch(corpus)

# Print results for the batch
for i, text in enumerate(corpus):
    print(f"\nText: \"{text}\"")
    detected_batch = {}
    for j, score in enumerate(batch_results[i]):
        if score > threshold:
            detected_batch[labels[j]] = f"{score:.2f}"
    
    if detected_batch:
        print("Detected techniques:")
        for technique, score in detected_batch.items():
            print(f"- {technique} (Score: {score})")
    else:
        print("No manipulation techniques detected.")
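
The thresholding loop appears twice above, once for a single text and once per batch item. As a small sketch, it could be factored into one helper; `decode_predictions` is a hypothetical name, and the scores in the example are made up for illustration. The helper accepts any sequence of per-label probabilities, such as one row of the batch output.

```python
def decode_predictions(probs, labels, threshold=0.5):
    """Return {label: probability} for scores above threshold, highest first."""
    hits = [(name, float(p)) for name, p in zip(labels, probs) if p > threshold]
    hits.sort(key=lambda item: item[1], reverse=True)
    return dict(hits)

# Label order as used in the Quick Start code.
labels = [
    "emotional_manipulation",
    "fear_appeals",
    "bandwagon_effect",
    "selective_truth",
    "cliche",
]

# Hypothetical probabilities for one text (not real model output).
example_probs = [0.91, 0.12, 0.64, 0.08, 0.50]

detected = decode_predictions(example_probs, labels)
if detected:
    for technique, score in detected.items():
        print(f"- {technique} (Score: {score:.2f})")
else:
    print("No manipulation techniques detected.")
```

Keeping the scores as floats until print time, rather than formatting them to strings early, leaves them usable for sorting or further filtering.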

Performance

Note: Metrics will be updated after the final evaluation.

| Metric | Value |
| --- | --- |
| F1 Macro | 0.46 |
| F1 Micro | 0.68 |
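
To illustrate why macro F1 can sit well below micro F1 on imbalanced multi-label data, here is a minimal sketch of the two averages. Micro F1 pools true positives, false positives, and false negatives over all labels, so frequent labels dominate; macro F1 averages per-label F1, so a poorly detected rare label pulls it down. The toy labels and predictions are invented for illustration and are unrelated to the reported scores; this is not necessarily the evaluation code behind them.

```python
def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def micro_macro_f1(y_true, y_pred):
    """Compute (micro, macro) F1 for binary multi-label indicator rows."""
    n_labels = len(y_true[0])
    tp = [0] * n_labels
    fp = [0] * n_labels
    fn = [0] * n_labels
    for true_row, pred_row in zip(y_true, y_pred):
        for j, (t, p) in enumerate(zip(true_row, pred_row)):
            if t and p:
                tp[j] += 1
            elif p:
                fp[j] += 1
            elif t:
                fn[j] += 1
    # Macro: unweighted mean of per-label F1.
    macro = sum(f1(tp[j], fp[j], fn[j]) for j in range(n_labels)) / n_labels
    # Micro: F1 over counts pooled across all labels.
    micro = f1(sum(tp), sum(fp), sum(fn))
    return micro, macro

# Toy data: column 0 is a frequent label, column 1 a rare one.
y_true = [[1, 0], [1, 0], [1, 0], [1, 1], [0, 1]]
# The frequent label is always found; the rare one is always missed.
y_pred = [[1, 0], [1, 0], [1, 0], [1, 0], [0, 0]]

micro, macro = micro_macro_f1(y_true, y_pred)
print(f"micro={micro:.2f} macro={macro:.2f}")
```

In this toy case the frequent label is predicted perfectly and the rare label is never found, so micro F1 stays high while macro F1 is halved.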

Limitations

  • Language Specificity: The model is optimized for Ukrainian and may not perform well on other languages.
  • Domain Sensitivity: Trained primarily on political and social media discourse, its performance may vary on other text domains.
  • Context Length: The model is limited to short texts (up to 512 tokens). Longer documents must be chunked or truncated.
  • Class Imbalance: Some manipulation techniques are underrepresented in the training data, which may affect their detection accuracy.
  • Mixed Language: Accuracy may be reduced on text with heavy code-mixing of Ukrainian and Russian.
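
For the context-length limitation, one possible way to chunk a long document is to slide an overlapping window over its token ids, score each window, and keep the highest score per label. The sketch below is an assumption about how this could be done, not a procedure described by the author; `chunk_token_ids`, `aggregate_max`, and the `stride` default are hypothetical. It operates on plain lists so it stays independent of any tokenizer.

```python
def chunk_token_ids(token_ids, max_length=512, stride=128):
    """Split token ids into windows of max_length that overlap by stride tokens.

    max_length counts raw tokens; leave room for special tokens if the
    tokenizer adds them to each window afterwards.
    """
    if not 0 <= stride < max_length:
        raise ValueError("stride must satisfy 0 <= stride < max_length")
    if not token_ids:
        return []
    step = max_length - stride
    chunks = []
    for start in range(0, len(token_ids), step):
        chunks.append(token_ids[start:start + max_length])
        # Stop once a window reaches the end of the document.
        if start + max_length >= len(token_ids):
            break
    return chunks

def aggregate_max(chunk_scores):
    """Combine per-chunk label scores by taking the per-label maximum."""
    return [max(column) for column in zip(*chunk_scores)]

# Example: a hypothetical 1,000-token document.
token_ids = list(range(1000))
chunks = chunk_token_ids(token_ids)
print([len(c) for c in chunks])

# Hypothetical per-chunk scores for two labels.
print(aggregate_max([[0.1, 0.7], [0.6, 0.2]]))
```

Taking the maximum treats a technique as present if any window contains it; averaging across windows would be a more conservative alternative.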

Ethical Considerations

  • Purpose: This model is intended as a tool to support media literacy and critical thinking, not as an arbiter of truth.
  • Human Oversight: Model outputs should be interpreted with human judgment and a full understanding of the context. The model should not be used to automatically censor content.
  • Potential Biases: The model may reflect biases present in the training data.

Citation

If you use this model in your research, please cite the following:

@misc{ukrainian-manipulation-modernbert-2025,
  author = {Oleh Mell},
  title = {Ukrainian Manipulation Detector - ModernBERT},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/olehmell/ukr-manipulation-detector-modern-bert}
}
@inproceedings{unlp2025shared,
  title={UNLP 2025 Shared Task on Techniques Classification},
  author={UNLP Workshop Organizers},
  booktitle={UNLP 2025 Workshop},
  year={2025},
  url={https://github.com/unlp-workshop/unlp-2025-shared-task}
}

License

This model is licensed under the Apache 2.0 License.

Contact

For questions or feedback, please open an issue on the model's Hugging Face repository.

Acknowledgments

  • The organizers of the UNLP 2025 Workshop for providing the dataset.
  • Goader for creating and sharing the base modern-liberta-large model for Ukrainian.