---
library_name: transformers
license: mit
base_model: intfloat/multilingual-e5-base
tags:
- generated_from_trainer
metrics:
- precision
- recall
- accuracy
model-index:
- name: manipulative-score-model
  results: []
---

# manipulative-score-model

This model is a fine-tuned version of [intfloat/multilingual-e5-base](https://huggingface.co/intfloat/multilingual-e5-base) on the UNLP 2025 Shared Task dataset.
It achieves the following results on the evaluation set:
- Loss: 0.1483
- Precision: 0.7768
- Recall: 0.7324
- F1 Macro: 0.7468
- Accuracy: 0.7962

## Model description

This model scores how likely a given text is to be manipulative.

## Intended uses & limitations

Filtering and quality evaluation of pretraining data at scale.

## Training and evaluation data

See the training script at https://github.com/lapa-llm/lapa-llm/blob/main/pretraining/quality-classifiers/manipulative_detector.py

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 32
- eval_batch_size: 128
- seed: 0
- distributed_type: multi-GPU
- num_devices: 8
- total_train_batch_size: 256
- total_eval_batch_size: 1024
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 60
- num_epochs: 20

### Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 Macro | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:--------:|:--------:|
| No log        | 0     | 0    | 0.2718          | 0.3406    | 0.5    | 0.4052   | 0.6813   |
| No log        | 1.0   | 15   | 0.2254          | 0.5393    | 0.5041 | 0.4294   | 0.6764   |
| No log        | 2.0   | 30   | 0.2176          | 0.6641    | 0.5041 | 0.4161   | 0.6828   |
| No log        | 3.0   | 45   | 0.2020          | 0.7742    | 0.5255 | 0.4603   | 0.6959   |
| No log        | 4.0   | 60   | 0.1851          | 0.7603    | 0.5817 | 0.5649   | 0.7255   |
| No log        | 5.0   | 75   | 0.1731          | 0.7466    | 0.6506 | 0.6624   | 0.7550   |
| No log        | 6.0   | 90   | 0.1654          | 0.7556    | 0.6772 | 0.6924   | 0.7683   |
| 0.203         | 7.0   | 105  | 0.1606          | 0.7602    | 0.6873 | 0.7032   | 0.7737   |
| 0.203         | 8.0   | 120  | 0.1574          | 0.7701    | 0.6920 | 0.7089   | 0.7789   |
| 0.203         | 9.0   | 135  | 0.1557          | 0.7798    | 0.6845 | 0.7017   | 0.7787   |
| 0.203         | 10.0  | 150  | 0.1548          | 0.7858    | 0.6824 | 0.6998   | 0.7794   |
| 0.203         | 11.0  | 165  | 0.1525          | 0.7812    | 0.7002 | 0.7182   | 0.7859   |
| 0.203         | 12.0  | 180  | 0.1517          | 0.7862    | 0.7023 | 0.7208   | 0.7883   |
| 0.203         | 13.0  | 195  | 0.1515          | 0.7895    | 0.6991 | 0.7178   | 0.7880   |
| 0.1516        | 14.0  | 210  | 0.1502          | 0.7724    | 0.7295 | 0.7435   | 0.7932   |
| 0.1516        | 15.0  | 225  | 0.1495          | 0.7751    | 0.7293 | 0.7440   | 0.7944   |
| 0.1516        | 16.0  | 240  | 0.1489          | 0.7763    | 0.7277 | 0.7429   | 0.7944   |
| 0.1516        | 17.0  | 255  | 0.1485          | 0.7781    | 0.7230 | 0.7391   | 0.7935   |
| 0.1516        | 18.0  | 270  | 0.1483          | 0.7781    | 0.7275 | 0.7430   | 0.7951   |
| 0.1516        | 19.0  | 285  | 0.1482          | 0.7787    | 0.7258 | 0.7417   | 0.7948   |
| 0.1421        | 20.0  | 300  | 0.1483          | 0.7768    | 0.7324 | 0.7468   | 0.7962   |

### Framework versions

- Transformers 4.56.1
- Pytorch 2.6.0a0+ecf3bae40a.nv25.01
- Datasets 4.0.0
- Tokenizers 0.22.0
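
## How to use

A minimal inference sketch, assuming the model is a standard binary sequence classifier as produced by the Trainer run above. The repo id below and the label mapping (index 1 = "manipulative") are assumptions; verify both against the training script linked above before relying on the scores.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumed Hub path; replace with this model's actual repo id.
model_id = "manipulative-score-model"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

texts = [
    "Everyone knows only a fool would disagree with this.",
    "The report was published on Tuesday.",
]

# Truncate to the encoder's maximum context (512 tokens for E5-base).
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Assumption: class index 1 is the "manipulative" label.
scores = torch.softmax(logits, dim=-1)[:, 1]
for text, score in zip(texts, scores):
    print(f"{score.item():.3f}  {text}")
```

For the intended use case of filtering pretraining data at scale, the same scoring can be wrapped in a batched `datasets` pass; the 0.5 cutoff below is illustrative, not a calibrated threshold:

```python
from datasets import Dataset

# Toy corpus; in practice this would be a large pretraining shard.
ds = Dataset.from_dict({"text": ["example document one", "example document two"]})

def add_scores(batch):
    enc = tokenizer(batch["text"], padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits
    batch["manipulative_score"] = torch.softmax(logits, dim=-1)[:, 1].tolist()
    return batch

ds = ds.map(add_scores, batched=True, batch_size=128)
ds = ds.filter(lambda ex: ex["manipulative_score"] < 0.5)
```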