# 0ddb16dd8694cc9f14ef15cd3c9b0f99

This model is a fine-tuned version of google/long-t5-local-large on the es-fr portion of the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:
- Loss: 1.6883
- Data Size: 1.0
- Epoch Runtime: 606.7991
- BLEU: 9.8666
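A minimal inference sketch, assuming the checkpoint is published on the Hub as contemmcm/0ddb16dd8694cc9f14ef15cd3c9b0f99 and loads with the standard seq2seq auto classes. Whether a task prefix (e.g. `"translate Spanish to French: "`) is required depends on how the fine-tuning preprocessed its inputs, which is not documented here:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hub id taken from this model card; adjust if you host the weights elsewhere.
model_id = "contemmcm/0ddb16dd8694cc9f14ef15cd3c9b0f99"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Translate a Spanish sentence into French.
# NOTE: add a task prefix here if the fine-tuning used one.
inputs = tokenizer("La casa es grande.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```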
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
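The total batch sizes listed above are the per-device sizes multiplied across the 4 GPUs. A quick sanity check of that arithmetic:

```python
# Effective batch sizes implied by the hyperparameters above.
train_batch_size = 8   # per device
eval_batch_size = 8    # per device
num_devices = 4        # multi-GPU setup

total_train_batch_size = train_batch_size * num_devices
total_eval_batch_size = eval_batch_size * num_devices

print(total_train_batch_size, total_eval_batch_size)  # 32 32
```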
### Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | BLEU |
|---|---|---|---|---|---|---|
| No log | 0.0 | 0 | 211.2462 | 0.0 | 43.1118 | 0.0094 |
| No log | 1.0 | 1407 | 116.7262 | 0.0078 | 48.9993 | 0.0061 |
| No log | 2.0 | 2814 | 43.9352 | 0.0156 | 53.4298 | 0.0089 |
| 1.9756 | 3.0 | 4221 | 20.3504 | 0.0312 | 62.9143 | 0.0526 |
| 25.9311 | 4.0 | 5628 | 14.8927 | 0.0625 | 81.5991 | 0.0315 |
| 18.2335 | 5.0 | 7035 | 12.3942 | 0.125 | 114.9738 | 0.0366 |
| 13.6658 | 6.0 | 8442 | 8.6721 | 0.25 | 186.0882 | 0.2032 |
| 9.2752 | 7.0 | 9849 | 6.5468 | 0.5 | 322.8414 | 0.1898 |
| 6.0579 | 8.0 | 11256 | 4.5514 | 1.0 | 601.7147 | 0.4214 |
| 4.963 | 9.0 | 12663 | 3.9254 | 1.0 | 600.3578 | 0.5662 |
| 4.4639 | 10.0 | 14070 | 3.6574 | 1.0 | 596.9364 | 0.7584 |
| 4.1302 | 11.0 | 15477 | 3.4710 | 1.0 | 601.6454 | 0.7089 |
| 3.9235 | 12.0 | 16884 | 3.3633 | 1.0 | 601.1381 | 0.9264 |
| 3.7208 | 13.0 | 18291 | 3.2380 | 1.0 | 596.8704 | 0.9029 |
| 3.5889 | 14.0 | 19698 | 3.1404 | 1.0 | 598.6068 | 1.0998 |
| 3.4594 | 15.0 | 21105 | 3.0636 | 1.0 | 606.0485 | 1.1165 |
| 3.4105 | 16.0 | 22512 | 2.9922 | 1.0 | 609.0398 | 1.2355 |
| 3.282 | 17.0 | 23919 | 2.9320 | 1.0 | 617.0727 | 1.3752 |
| 3.1533 | 18.0 | 25326 | 2.8606 | 1.0 | 608.2587 | 1.6477 |
| 3.1363 | 19.0 | 26733 | 2.8001 | 1.0 | 604.8040 | 1.6915 |
| 2.9962 | 20.0 | 28140 | 2.7005 | 1.0 | 610.7178 | 2.4405 |
| 2.8654 | 21.0 | 29547 | 2.5369 | 1.0 | 611.4810 | 3.1834 |
| 2.7104 | 22.0 | 30954 | 2.3868 | 1.0 | 609.9735 | 4.4239 |
| 2.562 | 23.0 | 32361 | 2.2705 | 1.0 | 605.8442 | 5.1809 |
| 2.4284 | 24.0 | 33768 | 2.1780 | 1.0 | 605.7224 | 6.0046 |
| 2.3612 | 25.0 | 35175 | 2.1002 | 1.0 | 604.3007 | 6.5047 |
| 2.2352 | 26.0 | 36582 | 2.0346 | 1.0 | 608.0183 | 7.3406 |
| 2.1532 | 27.0 | 37989 | 1.9861 | 1.0 | 609.8636 | 7.1664 |
| 2.0999 | 28.0 | 39396 | 1.9416 | 1.0 | 609.9335 | 7.5815 |
| 2.0621 | 29.0 | 40803 | 1.9056 | 1.0 | 617.9091 | 8.5064 |
| 1.9897 | 30.0 | 42210 | 1.8810 | 1.0 | 609.4563 | 8.4202 |
| 1.9056 | 31.0 | 43617 | 1.8452 | 1.0 | 611.9425 | 8.0794 |
| 1.8396 | 32.0 | 45024 | 1.8380 | 1.0 | 610.2999 | 8.4263 |
| 1.8326 | 33.0 | 46431 | 1.8088 | 1.0 | 608.1780 | 8.6867 |
| 1.7374 | 34.0 | 47838 | 1.7780 | 1.0 | 608.5015 | 9.3405 |
| 1.7379 | 35.0 | 49245 | 1.7702 | 1.0 | 607.3163 | 9.4303 |
| 1.6659 | 36.0 | 50652 | 1.7438 | 1.0 | 610.6225 | 9.3951 |
| 1.6528 | 37.0 | 52059 | 1.7336 | 1.0 | 614.5718 | 9.3208 |
| 1.5992 | 38.0 | 53466 | 1.7224 | 1.0 | 611.7702 | 9.3421 |
| 1.517 | 39.0 | 54873 | 1.7274 | 1.0 | 610.9096 | 9.2879 |
| 1.5352 | 40.0 | 56280 | 1.7126 | 1.0 | 607.9050 | 9.7942 |
| 1.5186 | 41.0 | 57687 | 1.6964 | 1.0 | 607.0325 | 9.3278 |
| 1.5035 | 42.0 | 59094 | 1.6927 | 1.0 | 608.1321 | 9.7688 |
| 1.4307 | 43.0 | 60501 | 1.6996 | 1.0 | 609.1138 | 9.7955 |
| 1.3876 | 44.0 | 61908 | 1.6807 | 1.0 | 608.1156 | 9.8583 |
| 1.3587 | 45.0 | 63315 | 1.6871 | 1.0 | 612.6054 | 9.9905 |
| 1.3011 | 46.0 | 64722 | 1.6893 | 1.0 | 614.4003 | 9.9209 |
| 1.3287 | 47.0 | 66129 | 1.6864 | 1.0 | 608.6889 | 9.6559 |
| 1.2669 | 48.0 | 67536 | 1.6883 | 1.0 | 606.7991 | 9.8666 |
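The Data Size column appears to follow a doubling curriculum: the fraction of the training set doubles each epoch, starting from 1/128, and saturates at the full dataset from epoch 8 onward. A sketch of that schedule (my reading of the table, not the actual training script):

```python
def data_fraction(epoch: int) -> float:
    """Fraction of the training set used at a given epoch,
    reconstructed from the Data Size column above: doubles
    each epoch from 1/128 and saturates at the full dataset."""
    if epoch == 0:
        return 0.0
    return min(1.0, 2.0 ** (epoch - 8))

# Reproduce the first few Data Size values from the table.
for epoch in range(9):
    print(epoch, data_fraction(epoch))
```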
### Framework versions
- Transformers 4.57.0
- PyTorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1