olehmell committed · Commit a61d83a · verified · 1 Parent(s): 761cab8

Update README.md

Files changed (1): README.md (+135, -99)
README.md CHANGED
@@ -18,107 +18,130 @@ pipeline_tag: text-classification
## Model Description

- This model detects propaganda and manipulation techniques in Ukrainian text. It's a fine-tuned version of [Goader/modern-liberta-large](https://huggingface.co/Goader/modern-liberta-large) trained on the UNLP 2025 Shared Task dataset for multi-label classification of manipulation techniques.

- ## Task Description

- The model identifies 5 major manipulation categories (consolidated from 10 original techniques):

- ### 1. **Emotional Manipulation** (`emotional_manipulation`)
- * Loaded language with strong emotional connotation
- * Euphoria and celebratory tone to boost morale

- ### 2. **Fear Appeals** (`fear_appeals`)
- * Appeal to fear through stereotypes or prejudices
- * FUD (Fear, Uncertainty, Doubt) tactics

- ### 3. **Bandwagon Effect** (`bandwagon_effect`)
- * Glittering generalities using abstract positive concepts
- * Appeal to people/masses ("everyone thinks this")

- ### 4. **Selective Truth** (`selective_truth`)
- * Cherry picking facts to support arguments
- * Whataboutism to deflect criticism
- * Straw man arguments distorting opponent's position

- ### 5. **Thought-Terminating Cliché** (`cliche`)
- * Phrases that block critical thinking
- * Examples: "Все не так однозначно", "Де ви були 8 років?"

- ## Dataset

- * **Training Data**: 2,147 Ukrainian texts from UNLP 2025 Shared Task
- * **Source**: [UNLP 2025 Techniques Classification](https://www.google.com/search?q=https://github.com/unlp-workshop/unlp-2025-shared-task/tree/main/data/techniques_classification)
- * **Language**: Ukrainian (filtered from multilingual dataset)
- * **Task Type**: Multi-label classification (texts can contain multiple techniques)
- Training Configuration
- Base Model: Goader/modern-liberta-large
- Learning Rate: 2e-5
- Batch Size: 16 (train), 32 (eval)
- Epochs: 10
- Max Sequence Length: 512
- Optimizer: AdamW
- Loss Function: BCEWithLogitsLoss with class weights
- Train/Val Split: 90/10

- Usage
- Installation
pip install transformers torch

- Quick Start
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

- # Load model
model_name = "olehmell/ukr-manipulation-detector-modern-bert"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Prepare text
- text = "Всі експерти вже давно це підтвердили, тільки ви не розумієте"

# Tokenize and predict
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    outputs = model(**inputs)
predictions = torch.sigmoid(outputs.logits)

- # Get detected techniques (threshold = 0.5)
threshold = 0.5
- labels = ['emotional_manipulation', 'fear_appeals', 'bandwagon_effect',
-           'selective_truth', 'cliche']
- detected = []
for i, score in enumerate(predictions[0]):
    if score > threshold:
-         detected.append(f"{labels[i]}: {score:.2f}")

- print(f"Detected techniques: {detected}")

- Batch Processing
def detect_manipulation_batch(texts, batch_size=32):
    results = []
    for i in range(0, len(texts), batch_size):
        batch = texts[i:i+batch_size]
-         inputs = tokenizer(batch, return_tensors="pt", padding=True,
-                            truncation=True, max_length=512)

        with torch.no_grad():
            outputs = model(**inputs)
@@ -127,48 +150,59 @@ def detect_manipulation_batch(texts, batch_size=32):
    return results

- Performance Metrics
| Metric | Value |
- | F1 Macro | TBD |
- | F1 Micro | TBD |
- | Hamming Loss | TBD |

- Note: Metrics to be updated after final evaluation

- Label Distribution in Training Data
- | Technique | Count | Percentage |
- | Emotional Manipulation | 1,094 | 50.9% |
- | Bandwagon Effect | 451 | 21.0% |
- | Thought-Terminating Cliché | 240 | 11.2% |
- | Fear Appeals | 198 | 9.2% |
- | Selective Truth | 187 | 8.7% |

- Limitations
- Language: Optimized for Ukrainian; may not perform well on other languages
- Domain: Trained primarily on political and social media discourse
- Context: Short texts (up to 512 tokens); longer documents need to be truncated
- Class Imbalance: Some techniques are underrepresented in training data
- Mixed Language: May have reduced accuracy on heavily code-mixed Ukrainian-Russian text

- Ethical Considerations
- This model is intended as a tool to support media literacy and critical thinking
- Results should be interpreted with human judgment and contextual understanding
- Should not be used as the sole arbiter of truth or to censor content
- May reflect biases present in the training data

- Citation
- If you use this model, please cite:

@misc{ukrainian-manipulation-modernbert-2025,
  author = {Oleh Mell},
  title = {Ukrainian Manipulation Detector - ModernBERT},
@@ -176,7 +210,8 @@ If you use this model, please cite:
  publisher = {Hugging Face},
  url = {[https://huggingface.co/olehmell/ukr-manipulation-detector-modern-bert](https://huggingface.co/olehmell/ukr-manipulation-detector-modern-bert)}
}

@inproceedings{unlp2025shared,
  title={UNLP 2025 Shared Task on Techniques Classification},
  author={UNLP Workshop Organizers},
@@ -184,16 +219,17 @@ If you use this model, please cite:
  year={2025},
  url={[https://github.com/unlp-workshop/unlp-2025-shared-task](https://github.com/unlp-workshop/unlp-2025-shared-task)}
}

- License
- MIT

- Contact
- For questions or feedback, please open an issue on the model repository.

- Acknowledgments
- UNLP 2025 Workshop organizers for providing the dataset
- Goader for the base ModernBERT Ukrainian model
## Model Description

+ This model detects propaganda and manipulation techniques in Ukrainian text. It is a fine-tuned version of [Goader/modern-liberta-large](https://huggingface.co/Goader/modern-liberta-large) trained on the UNLP 2025 Shared Task dataset for multi-label classification of manipulation techniques.
+
+ ## Task: Manipulation Technique Classification
+
+ The model performs multi-label text classification, identifying 5 major manipulation categories consolidated from 10 original techniques. A single text can contain multiple techniques.
+
+ ### Manipulation Categories
+
+ | Category | Label Name | Description & Consolidated Techniques |
+ | :--- | :--- | :--- |
+ | **Emotional Manipulation** | `emotional_manipulation` | Involves using loaded language with strong emotional connotations or a euphoric tone to boost morale and sway opinion. |
+ | **Fear Appeals** | `fear_appeals` | Preys on fears, stereotypes, or prejudices. Includes Fear, Uncertainty, and Doubt (FUD) tactics. |
+ | **Bandwagon Effect** | `bandwagon_effect` | Uses glittering generalities (abstract, positive concepts) or appeals to the masses ("everyone thinks this") to encourage agreement. |
+ | **Selective Truth** | `selective_truth` | Employs logical fallacies like cherry-picking facts, whataboutism to deflect criticism, or creating straw man arguments to distort an opponent's position. |
+ | **Thought-Terminating Cliché** | `cliche` | Uses formulaic phrases designed to shut down critical thinking and end a discussion. *Examples: "Все не так однозначно", "Де ви були 8 років?"* |
+
+ ## Training Data
+
+ The model was trained on the dataset from the UNLP 2025 Shared Task on manipulation technique classification.
+
+ * **Dataset:** [UNLP 2025 Techniques Classification](https://github.com/unlp-workshop/unlp-2025-shared-task/tree/main/data/techniques_classification)
+ * **Source Texts:** Ukrainian texts filtered from a larger multilingual dataset.
+ * **Size:** 2,147 training examples.
+ * **Task:** Multi-label classification.
+
+ ### Label Distribution in Training Data
+
+ | Technique | Count | Percentage |
+ | :--- | :--- | :--- |
+ | Emotional Manipulation | 1,094 | 50.9% |
+ | Bandwagon Effect | 451 | 21.0% |
+ | Thought-Terminating Cliché | 240 | 11.2% |
+ | Fear Appeals | 198 | 9.2% |
+ | Selective Truth | 187 | 8.7% |
+ ## Training Configuration
+
+ The model was fine-tuned using the following hyperparameters:
+
+ | Parameter | Value |
+ | :--- | :--- |
+ | **Base Model** | `Goader/modern-liberta-large` |
+ | **Learning Rate** | `2e-5` |
+ | **Train Batch Size** | `16` |
+ | **Eval Batch Size** | `32` |
+ | **Epochs** | `10` |
+ | **Max Sequence Length** | `512` |
+ | **Optimizer** | AdamW |
+ | **Loss Function** | `BCEWithLogitsLoss` (with class weights) |
+ | **Train/Val Split** | 90% / 10% |
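
The card does not include the training script itself; the following is only a minimal sketch of how a class-weighted `BCEWithLogitsLoss` can be combined with the hyperparameters above in a Hugging Face `Trainer` for multi-label fine-tuning. The `pos_weight` formula, the `problem_type` setting, and the commented-out dataset variables are illustrative assumptions, not the exact setup used for this model.

```python
# Minimal sketch (assumptions noted inline) of class-weighted multi-label fine-tuning.
import torch
from torch import nn
from transformers import (AutoModelForSequenceClassification, Trainer,
                          TrainingArguments)

label_names = ['emotional_manipulation', 'fear_appeals', 'bandwagon_effect',
               'selective_truth', 'cliche']

# Assumption: weight rarer labels more heavily, using the label counts listed above.
counts = torch.tensor([1094.0, 451.0, 240.0, 198.0, 187.0])
pos_weight = counts.max() / counts

class WeightedTrainer(Trainer):
    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        # Pop the multi-hot label matrix so the model does not compute its own loss.
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        loss_fct = nn.BCEWithLogitsLoss(pos_weight=pos_weight.to(outputs.logits.device))
        loss = loss_fct(outputs.logits, labels.float())
        return (loss, outputs) if return_outputs else loss

model = AutoModelForSequenceClassification.from_pretrained(
    "Goader/modern-liberta-large",
    num_labels=len(label_names),
    problem_type="multi_label_classification",  # assumption: multi-label head
)

args = TrainingArguments(
    output_dir="ukr-manipulation-detector",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    num_train_epochs=10,
)

# trainer = WeightedTrainer(model=model, args=args,
#                           train_dataset=train_ds,   # hypothetical tokenized 90% split
#                           eval_dataset=val_ds)      # hypothetical tokenized 10% split
# trainer.train()
```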
 
+ ## Usage
+
+ ### Installation
+
+ First, install the necessary libraries:
+
+ ```bash
pip install transformers torch
+ ```
+ ### Quick Start
+
+ Here is how to use the model to classify a single piece of text:
+
+ ```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

+ # Define model and label names
model_name = "olehmell/ukr-manipulation-detector-modern-bert"
+ labels = [
+     'emotional_manipulation',
+     'fear_appeals',
+     'bandwagon_effect',
+     'selective_truth',
+     'cliche'
+ ]
+
+ # Load pretrained model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Prepare text
+ text = "Всі експерти вже давно це підтвердили, тільки ви не розумієте, що відбувається насправді."

# Tokenize and predict
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    outputs = model(**inputs)
+ # Apply sigmoid to convert logits to probabilities
predictions = torch.sigmoid(outputs.logits)

+ # Get detected techniques (using a threshold of 0.5)
threshold = 0.5
+ detected_techniques = {}
for i, score in enumerate(predictions[0]):
    if score > threshold:
+         detected_techniques[labels[i]] = f"{score:.2f}"
+
+ if detected_techniques:
+     print("Detected techniques:")
+     for technique, score in detected_techniques.items():
+         print(f"- {technique} (Score: {score})")
+ else:
+     print("No manipulation techniques detected.")
+ ```
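
For quick experiments, the high-level `pipeline` API can be used instead. This is a sketch rather than part of the original card: `top_k=None` and `function_to_apply="sigmoid"` request independent per-label scores, and the label names in the output come from the model config (they may appear as `LABEL_0`–`LABEL_4`, in the order of the `labels` list above, if `id2label` is not set).

```python
# Sketch: the same multi-label inference through the pipeline API.
from transformers import pipeline

clf = pipeline(
    "text-classification",
    model="olehmell/ukr-manipulation-detector-modern-bert",
)

texts = ["Всі експерти вже давно це підтвердили, тільки ви не розумієте."]
all_scores = clf(texts, top_k=None, function_to_apply="sigmoid")

for text, scores in zip(texts, all_scores):
    detected = [s for s in scores if s["score"] > 0.5]
    print(text, detected)
```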
+ ### Batch Processing
+
+ For processing multiple texts efficiently, use batching:
+
+ ```python
def detect_manipulation_batch(texts, batch_size=32):
+     """Processes a list of texts in batches."""
    results = []
    for i in range(0, len(texts), batch_size):
        batch = texts[i:i+batch_size]
+         inputs = tokenizer(
+             batch,
+             return_tensors="pt",
+             padding=True,
+             truncation=True,
+             max_length=512
+         )

        with torch.no_grad():
            outputs = model(**inputs)

    return results

+ # Example usage
+ corpus = [
+     "Це жахливо, вони хочуть нас усіх знищити!",
+     "Весь світ підтримує це рішення, і тільки зрадники проти.",
+     "Просто роби, що тобі кажуть, і не став зайвих питань."
+ ]
+
+ batch_results = detect_manipulation_batch(corpus)
+
+ # Print results for the batch
+ for i, text in enumerate(corpus):
+     print(f"\nText: \"{text}\"")
+     detected_batch = {}
+     for j, score in enumerate(batch_results[i]):
+         if score > threshold:
+             detected_batch[labels[j]] = f"{score:.2f}"
+
+     if detected_batch:
+         print("Detected techniques:")
+         for technique, score in detected_batch.items():
+             print(f"- {technique} (Score: {score})")
+     else:
+         print("No manipulation techniques detected.")
+ ```
+
+ ## Performance
+
+ *Note: These are preliminary results; metrics will be updated after the final evaluation.*

| Metric | Value |
+ | :--- | :--- |
+ | F1 Macro | 0.46 |
+ | F1 Micro | 0.68 |
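
For reference, here is a minimal sketch (not the official evaluation script) of how F1 macro, F1 micro, and the Hamming loss mentioned in the earlier version of this card can be computed for multi-label predictions with scikit-learn; the arrays below are toy values, not real model outputs:

```python
# Sketch: multi-label metrics on thresholded sigmoid scores with scikit-learn.
import numpy as np
from sklearn.metrics import f1_score, hamming_loss

# y_true: gold multi-hot labels, y_prob: sigmoid scores, both shaped (num_examples, 5).
y_true = np.array([[1, 0, 1, 0, 0],
                   [0, 1, 0, 0, 0]])
y_prob = np.array([[0.81, 0.12, 0.55, 0.20, 0.05],
                   [0.30, 0.76, 0.10, 0.44, 0.08]])
y_pred = (y_prob > 0.5).astype(int)  # same 0.5 threshold as in Quick Start

print("F1 macro:", f1_score(y_true, y_pred, average="macro"))
print("F1 micro:", f1_score(y_true, y_pred, average="micro"))
print("Hamming loss:", hamming_loss(y_true, y_pred))
```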
 
+ ## Limitations
+
+ * **Language Specificity:** The model is optimized for Ukrainian and may not perform well on other languages.
+ * **Domain Sensitivity:** Trained primarily on political and social media discourse, its performance may vary on other text domains.
+ * **Context Length:** The model is limited to short texts (up to 512 tokens). Longer documents must be chunked or truncated; a chunking sketch follows this list.
+ * **Class Imbalance:** Some manipulation techniques are underrepresented in the training data, which may affect their detection accuracy.
+ * **Mixed Language:** Accuracy may be reduced on text with heavy code-mixing of Ukrainian and Russian.
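
The card does not prescribe a specific chunking strategy; the following is only a minimal sketch of one option. It reuses `model`, `tokenizer`, and `labels` from the Quick Start snippet, assumes a fast tokenizer, and the 128-token stride and max-pooling over windows are illustrative choices, not part of the original evaluation.

```python
# Sketch: score a document longer than 512 tokens by splitting it into
# overlapping windows and max-pooling the per-label scores across windows.
import torch

def detect_manipulation_long(text, stride=128, threshold=0.5):
    # return_overflowing_tokens makes the tokenizer emit one encoding per
    # 512-token window instead of keeping only the first 512 tokens.
    enc = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=512,
        stride=stride,
        padding=True,
        return_overflowing_tokens=True,
    )
    enc.pop("overflow_to_sample_mapping", None)  # tokenizer metadata, not a model input
    with torch.no_grad():
        logits = model(**enc).logits             # (num_windows, num_labels)
    scores = torch.sigmoid(logits).max(dim=0).values
    return {labels[i]: float(s) for i, s in enumerate(scores) if s > threshold}
```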
 
+ ## Ethical Considerations
+
+ * **Purpose:** This model is intended as a tool to support media literacy and critical thinking, not as an arbiter of truth.
+ * **Human Oversight:** Model outputs should be interpreted with human judgment and a full understanding of the context. It should not be used to automatically censor content.
+ * **Potential Biases:** The model may reflect biases present in the training data.
+ ## Citation
+
+ If you use this model in your research, please cite the following:
+
+ ```bibtex
@misc{ukrainian-manipulation-modernbert-2025,
  author = {Oleh Mell},
  title = {Ukrainian Manipulation Detector - ModernBERT},

  publisher = {Hugging Face},
  url = {https://huggingface.co/olehmell/ukr-manipulation-detector-modern-bert}
}
+ ```
+ ```bibtex
@inproceedings{unlp2025shared,
  title={UNLP 2025 Shared Task on Techniques Classification},
  author={UNLP Workshop Organizers},

  year={2025},
  url={https://github.com/unlp-workshop/unlp-2025-shared-task}
}
+ ```
 
+ ## License
+
+ This model is licensed under the **Apache 2.0 License**.
+
+ ## Contact
+
+ For questions or feedback, please open an issue on the model's Hugging Face repository.
+
+ ## Acknowledgments
+
+ * The organizers of the UNLP 2025 Workshop for providing the dataset.
+ * Goader for creating and sharing the base `modern-liberta-large` model for Ukrainian.