970d213fd939ababbf9e70ddf0dabd99
This model is a fine-tuned version of google/long-t5-local-large on the es-no configuration of the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:
- Loss: 2.8375
- Data Size: 1.0
- Epoch Runtime: 44.8970
- Bleu: 0.2456
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: constant
- num_epochs: 50
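The total batch sizes listed above follow from the per-device sizes and the device count. A minimal sketch of that arithmetic, assuming no gradient accumulation (the card does not list a gradient accumulation setting, so a factor of 1 is an assumption here):

```python
# Values copied from the hyperparameter list above.
train_batch_size = 8   # per-device training batch size
eval_batch_size = 8    # per-device evaluation batch size
num_devices = 4

# Assumption: gradient accumulation is 1 (not stated in the card).
gradient_accumulation_steps = 1

total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices

print(total_train_batch_size, total_eval_batch_size)  # 32 32
```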
Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 221.3844 | 0 | 3.6443 | 0.0043 |
| No log | 1 | 89 | 197.2671 | 0.0078 | 4.7221 | 0.0040 |
| No log | 2 | 178 | 168.6531 | 0.0156 | 6.0983 | 0.0041 |
| No log | 3 | 267 | 144.2118 | 0.0312 | 7.7860 | 0.0049 |
| No log | 4 | 356 | 105.7805 | 0.0625 | 11.2901 | 0.0038 |
| No log | 5 | 445 | 53.3554 | 0.125 | 13.4013 | 0.0040 |
| 8.0695 | 6 | 534 | 22.2945 | 0.25 | 17.7091 | 0.0047 |
| 13.7423 | 7 | 623 | 13.4378 | 0.5 | 26.3652 | 0.0147 |
| 17.3418 | 8.0 | 712 | 9.8752 | 1.0 | 44.4832 | 0.0184 |
| 13.432 | 9.0 | 801 | 8.4544 | 1.0 | 43.5700 | 0.0224 |
| 12.4192 | 10.0 | 890 | 8.5307 | 1.0 | 42.9915 | 0.0163 |
| 11.1762 | 11.0 | 979 | 7.0403 | 1.0 | 43.5339 | 0.0305 |
| 10.0983 | 12.0 | 1068 | 6.7428 | 1.0 | 43.5426 | 0.0467 |
| 9.197 | 13.0 | 1157 | 6.3366 | 1.0 | 43.5504 | 0.0322 |
| 8.8429 | 14.0 | 1246 | 5.7970 | 1.0 | 43.4010 | 0.0517 |
| 8.2187 | 15.0 | 1335 | 5.7831 | 1.0 | 43.4180 | 0.0518 |
| 7.7566 | 16.0 | 1424 | 5.1652 | 1.0 | 44.4639 | 0.0629 |
| 7.3222 | 17.0 | 1513 | 4.9510 | 1.0 | 43.4638 | 0.0519 |
| 6.9894 | 18.0 | 1602 | 4.8395 | 1.0 | 44.4988 | 0.1446 |
| 6.6839 | 19.0 | 1691 | 4.8684 | 1.0 | 42.7534 | 0.1052 |
| 6.3975 | 20.0 | 1780 | 4.6020 | 1.0 | 43.2935 | 0.0793 |
| 6.0659 | 21.0 | 1869 | 4.3094 | 1.0 | 44.0930 | 0.0938 |
| 5.8541 | 22.0 | 1958 | 4.2773 | 1.0 | 43.8304 | 0.1281 |
| 5.705 | 23.0 | 2047 | 4.1064 | 1.0 | 44.2400 | 0.1201 |
| 5.4843 | 24.0 | 2136 | 3.9477 | 1.0 | 44.2527 | 0.1240 |
| 5.2874 | 25.0 | 2225 | 3.9733 | 1.0 | 44.9309 | 0.1097 |
| 5.1221 | 26.0 | 2314 | 3.7253 | 1.0 | 44.9021 | 0.1569 |
| 5.0033 | 27.0 | 2403 | 3.6918 | 1.0 | 45.3199 | 0.1699 |
| 4.8753 | 28.0 | 2492 | 3.7214 | 1.0 | 44.7313 | 0.1219 |
| 4.703 | 29.0 | 2581 | 3.5098 | 1.0 | 45.1712 | 0.1493 |
| 4.5988 | 30.0 | 2670 | 3.4984 | 1.0 | 44.9037 | 0.1332 |
| 4.4454 | 31.0 | 2759 | 3.4042 | 1.0 | 44.9299 | 0.1913 |
| 4.3765 | 32.0 | 2848 | 3.3953 | 1.0 | 44.7618 | 0.1557 |
| 4.2578 | 33.0 | 2937 | 3.2852 | 1.0 | 44.7162 | 0.2015 |
| 4.1686 | 34.0 | 3026 | 3.3326 | 1.0 | 45.8108 | 0.1795 |
| 4.054 | 35.0 | 3115 | 3.2380 | 1.0 | 44.9969 | 0.1693 |
| 3.9608 | 36.0 | 3204 | 3.2087 | 1.0 | 45.1706 | 0.1048 |
| 3.9334 | 37.0 | 3293 | 3.2222 | 1.0 | 44.9701 | 0.1708 |
| 3.8444 | 38.0 | 3382 | 3.1124 | 1.0 | 45.3180 | 0.2026 |
| 3.7511 | 39.0 | 3471 | 3.1404 | 1.0 | 45.2833 | 0.1787 |
| 3.6928 | 40.0 | 3560 | 3.1486 | 1.0 | 44.9131 | 0.1809 |
| 3.6322 | 41.0 | 3649 | 3.0144 | 1.0 | 46.0521 | 0.2648 |
| 3.5504 | 42.0 | 3738 | 3.0636 | 1.0 | 45.6709 | 0.1582 |
| 3.5065 | 43.0 | 3827 | 3.0089 | 1.0 | 44.2839 | 0.2405 |
| 3.4138 | 44.0 | 3916 | 2.9631 | 1.0 | 45.5604 | 0.1989 |
| 3.3972 | 45.0 | 4005 | 2.9918 | 1.0 | 44.8559 | 0.1767 |
| 3.3264 | 46.0 | 4094 | 2.9466 | 1.0 | 45.2181 | 0.2354 |
| 3.306 | 47.0 | 4183 | 2.8690 | 1.0 | 45.7905 | 0.2823 |
| 3.2279 | 48.0 | 4272 | 2.8803 | 1.0 | 44.6883 | 0.2865 |
| 3.1756 | 49.0 | 4361 | 2.8623 | 1.0 | 45.3366 | 0.2921 |
| 3.1417 | 50.0 | 4450 | 2.8375 | 1.0 | 44.8970 | 0.2456 |
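The headline metrics at the top of the card are those of the final epoch, which has the lowest validation loss but not the highest BLEU. A small sketch that reads this off a handful of rows copied from the table (the tuple layout and variable names are illustrative, not part of the training code):

```python
# (epoch, step, validation_loss, bleu), copied from selected rows of the table above.
rows = [
    (41, 3649, 3.0144, 0.2648),
    (47, 4183, 2.8690, 0.2823),
    (48, 4272, 2.8803, 0.2865),
    (49, 4361, 2.8623, 0.2921),
    (50, 4450, 2.8375, 0.2456),
]

best_by_bleu = max(rows, key=lambda r: r[3])
best_by_loss = min(rows, key=lambda r: r[2])

# The step counter advances by the same amount every epoch.
final_epoch, final_step = rows[-1][0], rows[-1][1]
steps_per_epoch = final_step // final_epoch

print("best BLEU:", best_by_bleu[3], "at epoch", best_by_bleu[0])
print("lowest validation loss:", best_by_loss[2], "at epoch", best_by_loss[0])
print("steps per epoch:", steps_per_epoch)
```

On these rows, BLEU peaks at epoch 49 (0.2921) while validation loss is lowest at epoch 50 (2.8375); both also hold across the full table.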
Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1