# 4c71ce1eafdb5d36bfec031e622c7637
This model is a fine-tuned version of google/long-t5-local-large on the Helsinki-NLP/opus_books [de-it] dataset. It achieves the following results on the evaluation set:
- Loss: 2.2967
- Data Size: 1.0
- Epoch Runtime: 296.9825
- Bleu: 1.5312
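To try the checkpoint for German-to-Italian translation, a minimal sketch using the `transformers` translation pipeline (the repo id is taken from this card; the example input sentence is an assumption, and with a BLEU score near 1.5 the output quality will be rough):

```python
# Minimal inference sketch; assumes the checkpoint can be downloaded from the Hub.
from transformers import pipeline

model_id = "contemmcm/4c71ce1eafdb5d36bfec031e622c7637"

# LongT5 is a seq2seq model, so the generic translation pipeline applies.
translator = pipeline("translation_de_to_it", model=model_id)

result = translator("Das Buch liegt auf dem Tisch.", max_length=64)
print(result[0]["translation_text"])
```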
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
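The total batch sizes follow directly from the per-device settings and the device count; a quick check using the values listed above:

```python
# Effective batch size check for the 4-GPU run described above.
train_batch_size = 8   # per device
eval_batch_size = 8    # per device
num_devices = 4

total_train_batch_size = train_batch_size * num_devices
total_eval_batch_size = eval_batch_size * num_devices

print(total_train_batch_size, total_eval_batch_size)  # 32 32
```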
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 229.1420 | 0 | 21.3235 | 0.0102 |
| No log | 1 | 684 | 169.5335 | 0.0078 | 23.9654 | 0.0061 |
| No log | 2 | 1368 | 105.1773 | 0.0156 | 27.4962 | 0.0041 |
| No log | 3 | 2052 | 37.4363 | 0.0312 | 33.8458 | 0.0015 |
| No log | 4 | 2736 | 17.6935 | 0.0625 | 42.6523 | 0.0095 |
| 22.5653 | 5 | 3420 | 12.8546 | 0.125 | 61.7509 | 0.2321 |
| 15.8601 | 6 | 4104 | 9.7817 | 0.25 | 97.5144 | 0.0232 |
| 11.2192 | 7 | 4788 | 7.5349 | 0.5 | 167.9222 | 0.0168 |
| 7.604 | 8.0 | 5472 | 5.1665 | 1.0 | 311.9814 | 0.1142 |
| 5.8428 | 9.0 | 6156 | 4.0933 | 1.0 | 306.4764 | 0.1187 |
| 5.0192 | 10.0 | 6840 | 3.7044 | 1.0 | 308.3549 | 0.1499 |
| 4.485 | 11.0 | 7524 | 3.3896 | 1.0 | 300.2145 | 0.1799 |
| 4.0706 | 12.0 | 8208 | 3.1497 | 1.0 | 298.1315 | 0.3534 |
| 3.8092 | 13.0 | 8892 | 3.0523 | 1.0 | 300.3850 | 0.3495 |
| 3.6472 | 14.0 | 9576 | 2.9593 | 1.0 | 298.2718 | 0.4008 |
| 3.4461 | 15.0 | 10260 | 2.8402 | 1.0 | 297.2106 | 0.4580 |
| 3.3258 | 16.0 | 10944 | 2.7898 | 1.0 | 301.6928 | 0.4834 |
| 3.2073 | 17.0 | 11628 | 2.7334 | 1.0 | 300.2841 | 0.6354 |
| 3.1163 | 18.0 | 12312 | 2.6852 | 1.0 | 298.0317 | 0.6447 |
| 3.0383 | 19.0 | 12996 | 2.6541 | 1.0 | 297.7672 | 0.6590 |
| 2.9504 | 20.0 | 13680 | 2.6051 | 1.0 | 300.4691 | 0.7512 |
| 2.8632 | 21.0 | 14364 | 2.5677 | 1.0 | 297.4315 | 0.8523 |
| 2.8375 | 22.0 | 15048 | 2.5375 | 1.0 | 296.9936 | 0.8462 |
| 2.7784 | 23.0 | 15732 | 2.5039 | 1.0 | 299.7437 | 0.9535 |
| 2.7273 | 24.0 | 16416 | 2.4831 | 1.0 | 299.1126 | 0.9218 |
| 2.7019 | 25.0 | 17100 | 2.4658 | 1.0 | 307.1864 | 0.9450 |
| 2.6508 | 26.0 | 17784 | 2.4403 | 1.0 | 298.0001 | 1.0115 |
| 2.5866 | 27.0 | 18468 | 2.4429 | 1.0 | 300.0931 | 1.0111 |
| 2.5655 | 28.0 | 19152 | 2.4098 | 1.0 | 298.6058 | 1.0417 |
| 2.5107 | 29.0 | 19836 | 2.3942 | 1.0 | 297.7683 | 1.2072 |
| 2.4684 | 30.0 | 20520 | 2.3807 | 1.0 | 297.4598 | 1.1409 |
| 2.4466 | 31.0 | 21204 | 2.3601 | 1.0 | 298.9890 | 1.1560 |
| 2.4378 | 32.0 | 21888 | 2.3492 | 1.0 | 298.1015 | 1.2198 |
| 2.375 | 33.0 | 22572 | 2.3348 | 1.0 | 298.1212 | 1.2304 |
| 2.3469 | 34.0 | 23256 | 2.3288 | 1.0 | 296.2021 | 1.3018 |
| 2.3185 | 35.0 | 23940 | 2.3205 | 1.0 | 298.5160 | 1.3001 |
| 2.2792 | 36.0 | 24624 | 2.3057 | 1.0 | 298.3413 | 1.3007 |
| 2.2462 | 37.0 | 25308 | 2.3119 | 1.0 | 297.9856 | 1.3171 |
| 2.2336 | 38.0 | 25992 | 2.3040 | 1.0 | 300.3649 | 1.4047 |
| 2.178 | 39.0 | 26676 | 2.2890 | 1.0 | 298.0711 | 1.3388 |
| 2.1507 | 40.0 | 27360 | 2.2871 | 1.0 | 299.2080 | 1.4162 |
| 2.1355 | 41.0 | 28044 | 2.2830 | 1.0 | 296.7871 | 1.4717 |
| 2.113 | 42.0 | 28728 | 2.2790 | 1.0 | 297.9198 | 1.4811 |
| 2.0881 | 43.0 | 29412 | 2.2894 | 1.0 | 297.2793 | 1.5521 |
| 2.0622 | 44.0 | 30096 | 2.2832 | 1.0 | 298.7365 | 1.4748 |
| 2.0253 | 45.0 | 30780 | 2.2811 | 1.0 | 295.0279 | 1.5263 |
| 2.0143 | 46.0 | 31464 | 2.2967 | 1.0 | 296.9825 | 1.5312 |
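The cumulative step column in the table advances by a constant 684 optimizer steps per epoch. A sanity check on that arithmetic (values copied from the table; the per-epoch example count is an upper bound, since the final batch of an epoch may be smaller):

```python
# The cumulative step column grows linearly with the epoch number.
final_step = 31464   # cumulative steps after epoch 46
num_epochs = 46

steps_per_epoch, remainder = divmod(final_step, num_epochs)
print(steps_per_epoch, remainder)  # 684 0

# With a total train batch size of 32, one epoch covers at most
# 684 * 32 examples.
print(steps_per_epoch * 32)  # 21888
```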
### Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1