93beed4cf0a292354d00816feccfa413
This model is a fine-tuned version of google/long-t5-local-large on the es-pt (Spanish-Portuguese) configuration of the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:
- Loss: 4.2710
- Data Size: 1.0
- Epoch Runtime: 21.8637
- Bleu: 0.9816
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
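The batch-size figures above are related by simple arithmetic: 8 examples per device across 4 devices gives the reported total of 32. The sketch below illustrates this, assuming the run used the Hugging Face `Trainer` API with no gradient accumulation (the card states neither explicitly). The dictionary keys follow `TrainingArguments` parameter names as an assumed mapping, and `effective_batch_size` is a hypothetical helper, not part of any library.

```python
# Hyperparameters from the list above, keyed by the matching
# transformers.TrainingArguments parameter names (assumed mapping).
TRAINING_KWARGS = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 50,
}

NUM_DEVICES = 4  # num_devices from the list above


def effective_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Examples processed per step across all devices (hypothetical helper)."""
    return per_device * num_devices * grad_accum_steps


total_train = effective_batch_size(TRAINING_KWARGS["per_device_train_batch_size"], NUM_DEVICES)
total_eval = effective_batch_size(TRAINING_KWARGS["per_device_eval_batch_size"], NUM_DEVICES)
print(total_train, total_eval)
```

With `transformers` installed, a dictionary like this could be unpacked into `Seq2SeqTrainingArguments` together with an `output_dir` of your choosing. The multi-GPU setup itself comes from the launcher rather than from these arguments.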
Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 236.3281 | 0 | 1.8417 | 0.0218 |
| No log | 1 | 33 | 220.4164 | 0.0078 | 2.4454 | 0.0215 |
| No log | 2 | 66 | 204.8405 | 0.0156 | 3.2886 | 0.0205 |
| No log | 3 | 99 | 178.9747 | 0.0312 | 5.1289 | 0.0170 |
| 10.8048 | 4 | 132 | 149.1265 | 0.0625 | 7.5365 | 0.0163 |
| 10.8048 | 5 | 165 | 115.3729 | 0.125 | 9.7194 | 0.0064 |
| 10.8048 | 6 | 198 | 73.5350 | 0.25 | 11.7979 | 0.0057 |
| 23.6174 | 7 | 231 | 35.5477 | 0.5 | 14.7557 | 0.0068 |
| 36.4933 | 8.0 | 264 | 17.0832 | 1.0 | 22.0719 | 0.0452 |
| 36.4933 | 9.0 | 297 | 14.2073 | 1.0 | 21.0046 | 0.0598 |
| 26.2706 | 10.0 | 330 | 12.5317 | 1.0 | 21.5360 | 0.0276 |
| 19.6304 | 11.0 | 363 | 11.6492 | 1.0 | 20.4911 | 0.1435 |
| 19.6304 | 12.0 | 396 | 10.6991 | 1.0 | 20.8877 | 0.0648 |
| 16.9168 | 13.0 | 429 | 10.2809 | 1.0 | 20.8863 | 0.0476 |
| 15.2537 | 14.0 | 462 | 9.7464 | 1.0 | 20.6664 | 0.0335 |
| 15.2537 | 15.0 | 495 | 9.8049 | 1.0 | 20.8270 | 0.0558 |
| 13.8518 | 16.0 | 528 | 8.2535 | 1.0 | 20.6819 | 0.1067 |
| 12.748 | 17.0 | 561 | 8.4780 | 1.0 | 21.5175 | 0.0440 |
| 12.748 | 18.0 | 594 | 8.1782 | 1.0 | 20.7945 | 0.0967 |
| 11.8111 | 19.0 | 627 | 7.4566 | 1.0 | 20.7800 | 0.2006 |
| 11.0365 | 20.0 | 660 | 7.3057 | 1.0 | 21.1041 | 0.1273 |
| 11.0365 | 21.0 | 693 | 6.7133 | 1.0 | 21.0027 | 0.3471 |
| 10.5073 | 22.0 | 726 | 6.9228 | 1.0 | 20.9700 | 0.3550 |
| 9.8583 | 23.0 | 759 | 7.0375 | 1.0 | 21.6299 | 0.2705 |
| 9.8583 | 24.0 | 792 | 6.7037 | 1.0 | 21.2713 | 0.4168 |
| 9.3498 | 25.0 | 825 | 6.2121 | 1.0 | 20.5182 | 0.5571 |
| 8.9215 | 26.0 | 858 | 5.9844 | 1.0 | 20.7437 | 0.5474 |
| 8.9215 | 27.0 | 891 | 6.0323 | 1.0 | 20.7299 | 0.6197 |
| 8.5465 | 28.0 | 924 | 5.6314 | 1.0 | 20.8857 | 0.5808 |
| 8.1801 | 29.0 | 957 | 5.5487 | 1.0 | 21.6938 | 0.5568 |
| 8.1801 | 30.0 | 990 | 5.6767 | 1.0 | 20.6651 | 0.4915 |
| 7.8944 | 31.0 | 1023 | 5.3007 | 1.0 | 20.7470 | 0.6073 |
| 7.6164 | 32.0 | 1056 | 5.4566 | 1.0 | 20.6885 | 0.5843 |
| 7.6164 | 33.0 | 1089 | 5.2941 | 1.0 | 21.0663 | 0.4686 |
| 7.3031 | 34.0 | 1122 | 5.2816 | 1.0 | 21.5694 | 0.7061 |
| 7.101 | 35.0 | 1155 | 5.2643 | 1.0 | 20.7937 | 0.6266 |
| 7.101 | 36.0 | 1188 | 5.0665 | 1.0 | 20.9322 | 0.5939 |
| 6.8672 | 37.0 | 1221 | 4.9107 | 1.0 | 21.7358 | 0.6661 |
| 6.6882 | 38.0 | 1254 | 4.9897 | 1.0 | 21.7348 | 0.7313 |
| 6.6882 | 39.0 | 1287 | 5.2259 | 1.0 | 22.1833 | 0.5689 |
| 6.48 | 40.0 | 1320 | 4.8984 | 1.0 | 22.8371 | 0.7928 |
| 6.3136 | 41.0 | 1353 | 4.7463 | 1.0 | 21.5830 | 0.7910 |
| 6.3136 | 42.0 | 1386 | 4.6112 | 1.0 | 21.7582 | 0.7291 |
| 6.1308 | 43.0 | 1419 | 4.8187 | 1.0 | 22.0048 | 0.6860 |
| 5.9509 | 44.0 | 1452 | 4.6719 | 1.0 | 22.0582 | 0.7044 |
| 5.9509 | 45.0 | 1485 | 4.4241 | 1.0 | 21.8078 | 0.9858 |
| 5.8166 | 46.0 | 1518 | 4.5210 | 1.0 | 21.7949 | 0.8620 |
| 5.6598 | 47.0 | 1551 | 4.5541 | 1.0 | 22.3571 | 0.7085 |
| 5.6598 | 48.0 | 1584 | 4.3421 | 1.0 | 21.4648 | 0.7746 |
| 5.4744 | 49.0 | 1617 | 4.2760 | 1.0 | 21.7306 | 0.8698 |
| 5.4035 | 50.0 | 1650 | 4.2710 | 1.0 | 21.8637 | 0.9816 |
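The final epoch has the lowest validation loss in the table, but not the highest Bleu: epoch 45 scores slightly higher (0.9858 versus 0.9816). A minimal sketch of that check follows, using only the last seven rows copied from the table above. `best_row` is a hypothetical helper introduced for illustration.

```python
# (epoch, validation_loss, bleu) for epochs 44-50, copied from the table above.
ROWS = [
    (44, 4.6719, 0.7044),
    (45, 4.4241, 0.9858),
    (46, 4.5210, 0.8620),
    (47, 4.5541, 0.7085),
    (48, 4.3421, 0.7746),
    (49, 4.2760, 0.8698),
    (50, 4.2710, 0.9816),
]


def best_row(rows, column, higher_is_better):
    """Return the row with the best value in the given column index."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[column])


best_bleu = best_row(ROWS, column=2, higher_is_better=True)
lowest_loss = best_row(ROWS, column=1, higher_is_better=False)
print("best Bleu:", best_bleu)
print("lowest validation loss:", lowest_loss)
```

Which checkpoint to prefer depends on whether validation loss or Bleu matters more for the intended use. The card does not say which criterion, if any, was used to select the published weights.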
Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1
Model repository: contemmcm/93beed4cf0a292354d00816feccfa413 (base model: google/long-t5-local-large)