# fasttext-quality-score

This model is a fine-tuned version of intfloat/multilingual-e5-base on a dataset transferred from English. It achieves the following results on the evaluation set:
- Loss: 0.1726
- Precision: 0.7268
- Recall: 0.6680
- F1 Macro: 0.6791
- Accuracy: 0.7465
## Model description

This model measures the coherence of a given text, defined as similarity to ELI5 texts from Reddit.
## Intended uses & limitations
Data filtering and evaluation of pretraining data at scale.
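
For filtering at scale, the checkpoint can be scored through the standard transformers sequence-classification API. The sketch below is illustrative only: the label mapping (index 1 = coherent/ELI5-like) and the absence of an E5-style "query: " prefix are assumptions, not confirmed by this card.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "lapa-llm/fasttext-quality-score"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

texts = [
    "Photosynthesis is the process plants use to turn sunlight into chemical energy.",
    "click here best price buy now!!!",
]

with torch.no_grad():
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    probs = torch.softmax(model(**batch).logits, dim=-1)

# Assumed label mapping: index 1 = coherent / high quality.
for text, score in zip(texts, probs[:, 1].tolist()):
    print(f"{score:.3f}  {text[:50]}")
```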
## Training and evaluation data

See the training script at https://github.com/lapa-llm/lapa-llm/blob/main/pretraining/quality-classifiers/fasttext_classifier.py
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):
- learning_rate: 8e-05
- train_batch_size: 32
- eval_batch_size: 128
- seed: 0
- distributed_type: multi-GPU
- num_devices: 8
- total_train_batch_size: 256
- total_eval_batch_size: 1024
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 200
- num_epochs: 20
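
Assuming the standard Hugging Face Trainer was used, these settings map roughly onto the TrainingArguments below. This is a hedged reconstruction, not the authors' script: output_dir is a placeholder, and the per-device batch sizes times 8 GPUs give the reported totals of 256 and 1024.

```python
from transformers import TrainingArguments

# Hedged reconstruction of the reported hyperparameters; output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="fasttext-quality-score",
    learning_rate=8e-5,
    per_device_train_batch_size=32,   # x 8 GPUs -> total train batch size 256
    per_device_eval_batch_size=128,   # x 8 GPUs -> total eval batch size 1024
    seed=0,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=200,
    num_train_epochs=20,
)
```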
### Training results
| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 Macro | Accuracy |
|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 0.2774 | 0.3331 | 0.5 | 0.3998 | 0.6662 |
| 0.1863 | 0.7895 | 300 | 0.1846 | 0.7007 | 0.6493 | 0.6580 | 0.7295 |
| 0.1804 | 1.5789 | 600 | 0.1883 | 0.6808 | 0.6817 | 0.6812 | 0.7157 |
| 0.1804 | 2.3684 | 900 | 0.1785 | 0.7175 | 0.6490 | 0.6581 | 0.7364 |
| 0.1781 | 3.1579 | 1200 | 0.1774 | 0.7201 | 0.6597 | 0.6700 | 0.7410 |
| 0.1765 | 3.9474 | 1500 | 0.1795 | 0.6990 | 0.6816 | 0.6878 | 0.7336 |
| 0.174 | 4.7368 | 1800 | 0.1768 | 0.7214 | 0.6531 | 0.6628 | 0.7393 |
| 0.1777 | 5.5263 | 2100 | 0.1838 | 0.6943 | 0.6920 | 0.6931 | 0.7286 |
| 0.1758 | 6.3158 | 2400 | 0.1950 | 0.7731 | 0.6021 | 0.5918 | 0.7266 |
| 0.1749 | 7.1053 | 2700 | 0.1753 | 0.7147 | 0.6729 | 0.6830 | 0.7423 |
| 0.1733 | 7.8947 | 3000 | 0.1748 | 0.7304 | 0.6525 | 0.6621 | 0.7422 |
| 0.1696 | 8.6842 | 3300 | 0.1758 | 0.7125 | 0.6767 | 0.6863 | 0.7420 |
| 0.1723 | 9.4737 | 3600 | 0.1743 | 0.7243 | 0.6627 | 0.6734 | 0.7437 |
| 0.1705 | 10.2632 | 3900 | 0.1740 | 0.7261 | 0.6601 | 0.6706 | 0.7435 |
| 0.1682 | 11.0526 | 4200 | 0.1756 | 0.7316 | 0.6481 | 0.6569 | 0.7408 |
| 0.171 | 11.8421 | 4500 | 0.1734 | 0.7242 | 0.6647 | 0.6756 | 0.7444 |
| 0.1699 | 12.6316 | 4800 | 0.1748 | 0.7351 | 0.6473 | 0.6560 | 0.7416 |
| 0.1696 | 13.4211 | 5100 | 0.1731 | 0.7235 | 0.6723 | 0.6833 | 0.7464 |
| 0.1705 | 14.2105 | 5400 | 0.1738 | 0.7322 | 0.6557 | 0.6659 | 0.7441 |
| 0.1697 | 15.0 | 5700 | 0.1729 | 0.7205 | 0.6681 | 0.6788 | 0.7438 |
| 0.1686 | 15.7895 | 6000 | 0.1726 | 0.7227 | 0.6710 | 0.6819 | 0.7457 |
| 0.1663 | 16.5789 | 6300 | 0.1726 | 0.7229 | 0.6707 | 0.6816 | 0.7457 |
| 0.1684 | 17.3684 | 6600 | 0.1727 | 0.7213 | 0.6709 | 0.6817 | 0.7450 |
| 0.1667 | 18.1579 | 6900 | 0.1726 | 0.7224 | 0.6704 | 0.6813 | 0.7454 |
| 0.1687 | 18.9474 | 7200 | 0.1726 | 0.7288 | 0.6656 | 0.6767 | 0.7465 |
| 0.1675 | 19.7368 | 7500 | 0.1726 | 0.7268 | 0.6680 | 0.6791 | 0.7465 |
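
The card reports F1 as a macro average; assuming precision and recall are macro-averaged as well, a Trainer compute_metrics hook producing metrics of this shape could look like the following sketch (scikit-learn based, illustrative only).

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    """Hedged sketch of a compute_metrics hook; macro averaging is assumed."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="macro", zero_division=0
    )
    return {
        "precision": precision,
        "recall": recall,
        "f1_macro": f1,
        "accuracy": accuracy_score(labels, preds),
    }
```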
### Framework versions
- Transformers 4.56.1
- Pytorch 2.6.0a0+ecf3bae40a.nv25.01
- Datasets 4.0.0
- Tokenizers 0.22.0