# c45e5d95c04fb6c57a4c8033d39fca63

This model is a fine-tuned version of google/long-t5-local-large on the Helsinki-NLP/opus_books [es-fi] dataset. It achieves the following results on the evaluation set:
- Loss: 2.8312
- Data Size: 1.0
- Epoch Runtime: 40.7791
- Bleu: 0.2250
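The card does not include a usage example, so the following is a minimal, hedged sketch of loading this checkpoint for Spanish to Finnish translation with the `transformers` auto classes. The task prefix is an assumption; the card does not state how source sentences were formatted during fine-tuning.

```python
# Minimal usage sketch (not from the original card): Spanish -> Finnish translation.
# Assumptions: the repo id matches this card, and a T5-style task prefix is used;
# if the model was fine-tuned without a prefix, drop it.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "contemmcm/c45e5d95c04fb6c57a4c8033d39fca63"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "translate Spanish to Finnish: ¿Dónde está la biblioteca?"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```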
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
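Although this section is left as "More information needed", the summary above names the Helsinki-NLP/opus_books [es-fi] pair. As a hedged sketch, it can be loaded with the `datasets` library; the 90/10 validation split below is an assumption for illustration, since the card does not describe how evaluation data was held out.

```python
# Hedged sketch: loading the es-fi pair of opus_books with the datasets library.
# The train_test_split call is an assumption, not the split actually used for this card.
from datasets import load_dataset

raw = load_dataset("Helsinki-NLP/opus_books", "es-fi")
splits = raw["train"].train_test_split(test_size=0.1, seed=42)
train_ds, eval_ds = splits["train"], splits["test"]

print(train_ds[0]["translation"])  # e.g. {"es": "...", "fi": "..."}
```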
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
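As a rough, hedged reconstruction, the hyperparameters above map onto `Seq2SeqTrainingArguments` as sketched below; the output directory and the `predict_with_generate` flag are assumptions, and anything not listed in the card is left at its default.

```python
# Hedged reconstruction of the hyperparameters listed above (transformers 4.57 argument names).
# Per-device batch size 8 on 4 GPUs yields the reported total batch size of 32.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="long-t5-local-large-opus-books-es-fi",  # hypothetical path, not from the card
    learning_rate=5e-05,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,  # assumption: the BLEU column in the results table implies generation during eval
)
```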
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0.0 | 0 | 209.7094 | 0 | 3.3267 | 0.0010 |
| No log | 1.0 | 83 | 184.2445 | 0.0078 | 3.9533 | 0.0012 |
| No log | 2.0 | 166 | 161.0579 | 0.0156 | 5.0809 | 0.0013 |
| No log | 3.0 | 249 | 144.9138 | 0.0312 | 6.7955 | 0.0013 |
| 5.3868 | 4.0 | 332 | 111.1862 | 0.0625 | 9.0224 | 0.0011 |
| 5.3868 | 5.0 | 415 | 65.2582 | 0.125 | 12.1194 | 0.0011 |
| 5.3868 | 6.0 | 498 | 25.2881 | 0.25 | 16.3851 | 0.0011 |
| 13.292 | 7.0 | 581 | 13.2800 | 0.5 | 25.3184 | 0.0035 |
| 18.2018 | 8.0 | 664 | 9.5933 | 1.0 | 42.7601 | 0.0043 |
| 15.789 | 9.0 | 747 | 8.5477 | 1.0 | 41.0928 | 0.0045 |
| 13.0828 | 10.0 | 830 | 7.7491 | 1.0 | 41.1897 | 0.0043 |
| 11.4015 | 11.0 | 913 | 7.6218 | 1.0 | 41.2410 | 0.0044 |
| 10.7068 | 12.0 | 996 | 6.7302 | 1.0 | 41.2708 | 0.0077 |
| 9.7104 | 13.0 | 1079 | 6.2087 | 1.0 | 40.5168 | 0.0122 |
| 8.8903 | 14.0 | 1162 | 5.9130 | 1.0 | 41.7129 | 0.0248 |
| 8.6231 | 15.0 | 1245 | 5.1895 | 1.0 | 41.0517 | 0.0652 |
| 7.9684 | 16.0 | 1328 | 5.2467 | 1.0 | 41.2612 | 0.0485 |
| 7.5212 | 17.0 | 1411 | 4.8904 | 1.0 | 41.8618 | 0.0594 |
| 7.3085 | 18.0 | 1494 | 4.9712 | 1.0 | 41.2331 | 0.0317 |
| 6.8657 | 19.0 | 1577 | 4.5537 | 1.0 | 41.8624 | 0.0458 |
| 6.5853 | 20.0 | 1660 | 4.4626 | 1.0 | 41.9728 | 0.0634 |
| 6.3659 | 21.0 | 1743 | 4.3185 | 1.0 | 41.6910 | 0.0857 |
| 6.1174 | 22.0 | 1826 | 4.2137 | 1.0 | 41.7061 | 0.0646 |
| 5.8922 | 23.0 | 1909 | 4.1079 | 1.0 | 41.7720 | 0.0545 |
| 5.747 | 24.0 | 1992 | 3.9319 | 1.0 | 41.2620 | 0.1931 |
| 5.5505 | 25.0 | 2075 | 3.8795 | 1.0 | 41.2630 | 0.1169 |
| 5.3418 | 26.0 | 2158 | 3.8160 | 1.0 | 41.6766 | 0.0979 |
| 5.1912 | 27.0 | 2241 | 3.7565 | 1.0 | 41.6077 | 0.0972 |
| 5.0714 | 28.0 | 2324 | 3.6059 | 1.0 | 41.8966 | 0.1575 |
| 4.8811 | 29.0 | 2407 | 3.5714 | 1.0 | 41.3058 | 0.1184 |
| 4.8396 | 30.0 | 2490 | 3.6037 | 1.0 | 41.7404 | 0.0959 |
| 4.6897 | 31.0 | 2573 | 3.5540 | 1.0 | 40.8913 | 0.0918 |
| 4.5818 | 32.0 | 2656 | 3.4263 | 1.0 | 41.7674 | 0.1365 |
| 4.4231 | 33.0 | 2739 | 3.2331 | 1.0 | 40.9916 | 0.1768 |
| 4.345 | 34.0 | 2822 | 3.3233 | 1.0 | 41.6434 | 0.1605 |
| 4.2561 | 35.0 | 2905 | 3.2043 | 1.0 | 41.8876 | 0.1920 |
| 4.192 | 36.0 | 2988 | 3.2323 | 1.0 | 42.0650 | 0.1436 |
| 4.1131 | 37.0 | 3071 | 3.1558 | 1.0 | 41.6571 | 0.1903 |
| 4.0248 | 38.0 | 3154 | 3.0953 | 1.0 | 41.1832 | 0.2231 |
| 3.9353 | 39.0 | 3237 | 3.0665 | 1.0 | 40.7884 | 0.2473 |
| 3.8715 | 40.0 | 3320 | 3.0905 | 1.0 | 41.5836 | 0.1599 |
| 3.7659 | 41.0 | 3403 | 3.0597 | 1.0 | 41.5660 | 0.1769 |
| 3.7303 | 42.0 | 3486 | 2.9279 | 1.0 | 41.5299 | 0.2641 |
| 3.6671 | 43.0 | 3569 | 2.9700 | 1.0 | 41.6461 | 0.2292 |
| 3.5778 | 44.0 | 3652 | 2.9280 | 1.0 | 41.0377 | 0.3111 |
| 3.5576 | 45.0 | 3735 | 2.9184 | 1.0 | 42.2988 | 0.2039 |
| 3.4826 | 46.0 | 3818 | 2.9182 | 1.0 | 40.8568 | 0.2763 |
| 3.4323 | 47.0 | 3901 | 2.8435 | 1.0 | 40.8695 | 0.2529 |
| 3.3986 | 48.0 | 3984 | 2.8897 | 1.0 | 42.0452 | 0.2206 |
| 3.346 | 49.0 | 4067 | 2.8105 | 1.0 | 41.3435 | 0.2463 |
| 3.2765 | 50.0 | 4150 | 2.8312 | 1.0 | 40.7791 | 0.2250 |
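The Bleu column above is reported without a stated implementation or scale. A common way to score translations is the `evaluate` library's sacrebleu wrapper, sketched below as an illustration only; note that sacrebleu reports scores on a 0 to 100 scale, so it may not match the scale of the values in the table.

```python
# Hedged sketch: scoring generated translations with BLEU via the evaluate library.
# Illustration only; the card does not state which BLEU implementation produced the table above.
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["Missä kirjasto on?"]          # example model outputs
references = [["Missä kirjasto sijaitsee?"]]  # one list of reference translations per prediction
result = bleu.compute(predictions=predictions, references=references)
print(result["score"])  # sacrebleu score on a 0-100 scale
```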
### Framework versions
- Transformers 4.57.0
- PyTorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1