4d18f0b7b0fc742690ac565165d0438f
This model is a fine-tuned version of google/long-t5-local-large on the Helsinki-NLP/opus_books [de-es] dataset. It achieves the following results on the evaluation set:
- Loss: 2.2221
- Data Size: 1.0
- Epoch Runtime: 301.5996
- Bleu: 1.1422
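The figures above are those logged at the final epoch (50). As a rough check against the training results table further down, the sketch below copies the last five rows of that table by hand and compares the final epoch with the best epoch for each metric. The variable names are illustrative only. Nothing in this card says which checkpoint was actually saved or uploaded.

```python
# (epoch, validation_loss, bleu), copied from the last five rows of the
# training results table in this card.
rows = [
    (46, 2.2408, 1.1307),
    (47, 2.2337, 1.1571),
    (48, 2.2205, 1.0888),
    (49, 2.2232, 1.1725),
    (50, 2.2221, 1.1422),
]

best_loss = min(rows, key=lambda r: r[1])  # row with the lowest validation loss
best_bleu = max(rows, key=lambda r: r[2])  # row with the highest BLEU
final = rows[-1]                           # epoch 50, the values reported above

print("lowest validation loss:", best_loss)
print("highest BLEU:", best_bleu)
print("final epoch:", final)
```

These five rows contain both the lowest validation loss (2.2205, epoch 48) and the highest BLEU (1.1725, epoch 49) in the whole table, so the final-epoch figures are close to, but not exactly, the best logged values.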
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: constant
- num_epochs: 50
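The totals in this list follow from the per-device values. A minimal sketch of the arithmetic, assuming no gradient accumulation (none is listed) and taking the 688 steps per epoch from the step counter in the training results table:

```python
# Values copied from the hyperparameter list and the results table.
train_batch_size = 8    # per device
eval_batch_size = 8     # per device
num_devices = 4
num_epochs = 50
steps_per_epoch = 688   # step counter at epoch 1 in the results table

# Assumption: no gradient accumulation, so the effective batch size is
# simply per-device batch size times the number of devices.
total_train_batch_size = train_batch_size * num_devices
total_eval_batch_size = eval_batch_size * num_devices
total_steps = steps_per_epoch * num_epochs

print(total_train_batch_size, total_eval_batch_size, total_steps)
# → 32 32 34400
```

The result matches the listed total batch sizes of 32 and the final step count of 34400 at epoch 50.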
Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 235.5920 | 0 | 21.6725 | 0.0151 |
| No log | 1 | 688 | 158.6363 | 0.0078 | 25.2709 | 0.0065 |
| No log | 2 | 1376 | 85.2046 | 0.0156 | 27.3339 | 0.0036 |
| No log | 3 | 2064 | 30.0130 | 0.0312 | 33.1372 | 0.0016 |
| 2.6754 | 4 | 2752 | 16.1713 | 0.0625 | 41.7777 | 0.0725 |
| 1.863 | 5 | 3440 | 12.0949 | 0.125 | 58.9698 | 0.1161 |
| 14.77 | 6 | 4128 | 9.1553 | 0.25 | 94.4603 | 0.1050 |
| 10.7326 | 7 | 4816 | 7.1104 | 0.5 | 162.4500 | 0.0777 |
| 7.1993 | 8.0 | 5504 | 4.8120 | 1.0 | 296.1590 | 0.0317 |
| 5.7684 | 9.0 | 6192 | 4.0821 | 1.0 | 296.1616 | 0.0404 |
| 4.8521 | 10.0 | 6880 | 3.5462 | 1.0 | 295.2833 | 0.1019 |
| 4.3215 | 11.0 | 7568 | 3.3064 | 1.0 | 293.7784 | 0.0986 |
| 3.9813 | 12.0 | 8256 | 3.1607 | 1.0 | 294.3076 | 0.1001 |
| 3.7181 | 13.0 | 8944 | 2.9624 | 1.0 | 293.7822 | 0.1871 |
| 3.5393 | 14.0 | 9632 | 2.9189 | 1.0 | 294.3335 | 0.2653 |
| 3.3765 | 15.0 | 10320 | 2.8352 | 1.0 | 297.6367 | 0.3737 |
| 3.2647 | 16.0 | 11008 | 2.7518 | 1.0 | 295.5343 | 0.2972 |
| 3.1507 | 17.0 | 11696 | 2.7165 | 1.0 | 293.9573 | 0.3930 |
| 3.0727 | 18.0 | 12384 | 2.6494 | 1.0 | 294.2747 | 0.3934 |
| 2.9689 | 19.0 | 13072 | 2.6156 | 1.0 | 293.8179 | 0.4885 |
| 2.9318 | 20.0 | 13760 | 2.5777 | 1.0 | 296.2244 | 0.4388 |
| 2.8625 | 21.0 | 14448 | 2.5504 | 1.0 | 292.5996 | 0.6039 |
| 2.8148 | 22.0 | 15136 | 2.5291 | 1.0 | 294.3252 | 0.5524 |
| 2.7718 | 23.0 | 15824 | 2.4954 | 1.0 | 295.4557 | 0.6300 |
| 2.7298 | 24.0 | 16512 | 2.4774 | 1.0 | 296.3096 | 0.6497 |
| 2.6893 | 25.0 | 17200 | 2.4443 | 1.0 | 295.0107 | 0.7417 |
| 2.6229 | 26.0 | 17888 | 2.4218 | 1.0 | 292.1687 | 0.6751 |
| 2.582 | 27.0 | 18576 | 2.3986 | 1.0 | 297.3084 | 0.6930 |
| 2.5685 | 28.0 | 19264 | 2.3927 | 1.0 | 300.0445 | 0.7737 |
| 2.5398 | 29.0 | 19952 | 2.3765 | 1.0 | 300.2355 | 0.7913 |
| 2.5014 | 30.0 | 20640 | 2.3554 | 1.0 | 295.5488 | 0.8455 |
| 2.4913 | 31.0 | 21328 | 2.3539 | 1.0 | 293.5029 | 0.7853 |
| 2.4406 | 32.0 | 22016 | 2.3387 | 1.0 | 297.9184 | 0.8931 |
| 2.4185 | 33.0 | 22704 | 2.3182 | 1.0 | 298.8404 | 0.9088 |
| 2.398 | 34.0 | 23392 | 2.3152 | 1.0 | 296.9342 | 0.7916 |
| 2.3568 | 35.0 | 24080 | 2.3110 | 1.0 | 298.0802 | 0.9005 |
| 2.3569 | 36.0 | 24768 | 2.2831 | 1.0 | 298.2648 | 0.9909 |
| 2.3031 | 37.0 | 25456 | 2.2923 | 1.0 | 299.1979 | 0.9535 |
| 2.2997 | 38.0 | 26144 | 2.2674 | 1.0 | 297.0862 | 0.9699 |
| 2.258 | 39.0 | 26832 | 2.2698 | 1.0 | 300.9418 | 1.1056 |
| 2.2468 | 40.0 | 27520 | 2.2597 | 1.0 | 294.5861 | 1.0727 |
| 2.227 | 41.0 | 28208 | 2.2564 | 1.0 | 298.8053 | 1.0456 |
| 2.1836 | 42.0 | 28896 | 2.2530 | 1.0 | 300.2710 | 1.1390 |
| 2.1922 | 43.0 | 29584 | 2.2477 | 1.0 | 302.2907 | 0.9623 |
| 2.1422 | 44.0 | 30272 | 2.2420 | 1.0 | 303.2315 | 1.0065 |
| 2.1354 | 45.0 | 30960 | 2.2412 | 1.0 | 302.1064 | 1.1697 |
| 2.1167 | 46.0 | 31648 | 2.2408 | 1.0 | 300.0712 | 1.1307 |
| 2.0993 | 47.0 | 32336 | 2.2337 | 1.0 | 300.4522 | 1.1571 |
| 2.0814 | 48.0 | 33024 | 2.2205 | 1.0 | 299.3218 | 1.0888 |
| 2.0697 | 49.0 | 33712 | 2.2232 | 1.0 | 302.0744 | 1.1725 |
| 2.0495 | 50.0 | 34400 | 2.2221 | 1.0 | 301.5996 | 1.1422 |
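The Data Size column appears to follow a doubling schedule: it starts at 0 for the initial evaluation, is roughly 1/128 at epoch 1, and doubles each epoch until it reaches 1.0 at epoch 8. The card does not document this schedule, so the function below is only an inferred reconstruction, and `data_fraction` and `full_epoch` are hypothetical names.

```python
def data_fraction(epoch: int, full_epoch: int = 8) -> float:
    """Inferred fraction of the training data used at a given epoch.

    Assumes the fraction doubles every epoch and reaches 1.0 at `full_epoch`.
    This is a guess fitted to the logged table, not a documented schedule.
    """
    if epoch <= 0:
        return 0.0
    return min(1.0, 2.0 ** (epoch - full_epoch))


# Data Size values logged for epochs 0 through 9 in the table above.
logged = [0, 0.0078, 0.0156, 0.0312, 0.0625, 0.125, 0.25, 0.5, 1.0, 1.0]

for epoch, value in enumerate(logged):
    # The table rounds to four decimals, so compare with a small tolerance.
    assert abs(data_fraction(epoch) - value) < 1e-4
```

Epoch Runtime grows over the same epochs and levels off once the fraction reaches 1.0, which is consistent with this reading.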
Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1
Model tree for contemmcm/4d18f0b7b0fc742690ac565165d0438f
Base model
google/long-t5-local-large