0ddb16dd8694cc9f14ef15cd3c9b0f99

This model is a fine-tuned version of google/long-t5-local-large on the Helsinki-NLP/opus_books [es-fr] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6883
  • Data Size: 1.0
  • Epoch Runtime: 606.7991
  • Bleu: 9.8666
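
Since the card gives no usage snippet, below is a minimal inference sketch with the transformers library, loading this repository (contemmcm/0ddb16dd8694cc9f14ef15cd3c9b0f99) for Spanish-to-French translation. The task prefix and generation settings are assumptions, since the preprocessing used during fine-tuning is not documented here.

```python
# Minimal usage sketch. Assumption: the "translate Spanish to French:" prefix
# and the generation settings are illustrative; the card does not document
# how inputs were formatted during fine-tuning.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "contemmcm/0ddb16dd8694cc9f14ef15cd3c9b0f99"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "translate Spanish to French: La vida es un sueño."  # assumed prefix
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```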

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
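
As a rough translation of the list above into code, the following is a sketch of the corresponding Seq2SeqTrainingArguments. It assumes the run used the standard transformers Seq2SeqTrainer, which the card does not state; the output directory name is hypothetical.

```python
# Sketch of the training configuration implied by the hyperparameter list.
# Assumption: a standard Seq2SeqTrainer run; only values shown in the card
# are reproduced here, everything else is left at library defaults.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="long-t5-local-large-opus-books-es-fr",  # hypothetical name
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # 8 per device x 4 GPUs = 32 total
    per_device_eval_batch_size=8,    # 8 per device x 4 GPUs = 32 total
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
)
```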

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|:---|:---|:---|:---|:---|:---|:---|
| No log | 0 | 0 | 211.2462 | 0 | 43.1118 | 0.0094 |
| No log | 1 | 1407 | 116.7262 | 0.0078 | 48.9993 | 0.0061 |
| No log | 2 | 2814 | 43.9352 | 0.0156 | 53.4298 | 0.0089 |
| 1.9756 | 3 | 4221 | 20.3504 | 0.0312 | 62.9143 | 0.0526 |
| 25.9311 | 4 | 5628 | 14.8927 | 0.0625 | 81.5991 | 0.0315 |
| 18.2335 | 5 | 7035 | 12.3942 | 0.125 | 114.9738 | 0.0366 |
| 13.6658 | 6 | 8442 | 8.6721 | 0.25 | 186.0882 | 0.2032 |
| 9.2752 | 7 | 9849 | 6.5468 | 0.5 | 322.8414 | 0.1898 |
| 6.0579 | 8.0 | 11256 | 4.5514 | 1.0 | 601.7147 | 0.4214 |
| 4.963 | 9.0 | 12663 | 3.9254 | 1.0 | 600.3578 | 0.5662 |
| 4.4639 | 10.0 | 14070 | 3.6574 | 1.0 | 596.9364 | 0.7584 |
| 4.1302 | 11.0 | 15477 | 3.4710 | 1.0 | 601.6454 | 0.7089 |
| 3.9235 | 12.0 | 16884 | 3.3633 | 1.0 | 601.1381 | 0.9264 |
| 3.7208 | 13.0 | 18291 | 3.2380 | 1.0 | 596.8704 | 0.9029 |
| 3.5889 | 14.0 | 19698 | 3.1404 | 1.0 | 598.6068 | 1.0998 |
| 3.4594 | 15.0 | 21105 | 3.0636 | 1.0 | 606.0485 | 1.1165 |
| 3.4105 | 16.0 | 22512 | 2.9922 | 1.0 | 609.0398 | 1.2355 |
| 3.282 | 17.0 | 23919 | 2.9320 | 1.0 | 617.0727 | 1.3752 |
| 3.1533 | 18.0 | 25326 | 2.8606 | 1.0 | 608.2587 | 1.6477 |
| 3.1363 | 19.0 | 26733 | 2.8001 | 1.0 | 604.8040 | 1.6915 |
| 2.9962 | 20.0 | 28140 | 2.7005 | 1.0 | 610.7178 | 2.4405 |
| 2.8654 | 21.0 | 29547 | 2.5369 | 1.0 | 611.4810 | 3.1834 |
| 2.7104 | 22.0 | 30954 | 2.3868 | 1.0 | 609.9735 | 4.4239 |
| 2.562 | 23.0 | 32361 | 2.2705 | 1.0 | 605.8442 | 5.1809 |
| 2.4284 | 24.0 | 33768 | 2.1780 | 1.0 | 605.7224 | 6.0046 |
| 2.3612 | 25.0 | 35175 | 2.1002 | 1.0 | 604.3007 | 6.5047 |
| 2.2352 | 26.0 | 36582 | 2.0346 | 1.0 | 608.0183 | 7.3406 |
| 2.1532 | 27.0 | 37989 | 1.9861 | 1.0 | 609.8636 | 7.1664 |
| 2.0999 | 28.0 | 39396 | 1.9416 | 1.0 | 609.9335 | 7.5815 |
| 2.0621 | 29.0 | 40803 | 1.9056 | 1.0 | 617.9091 | 8.5064 |
| 1.9897 | 30.0 | 42210 | 1.8810 | 1.0 | 609.4563 | 8.4202 |
| 1.9056 | 31.0 | 43617 | 1.8452 | 1.0 | 611.9425 | 8.0794 |
| 1.8396 | 32.0 | 45024 | 1.8380 | 1.0 | 610.2999 | 8.4263 |
| 1.8326 | 33.0 | 46431 | 1.8088 | 1.0 | 608.1780 | 8.6867 |
| 1.7374 | 34.0 | 47838 | 1.7780 | 1.0 | 608.5015 | 9.3405 |
| 1.7379 | 35.0 | 49245 | 1.7702 | 1.0 | 607.3163 | 9.4303 |
| 1.6659 | 36.0 | 50652 | 1.7438 | 1.0 | 610.6225 | 9.3951 |
| 1.6528 | 37.0 | 52059 | 1.7336 | 1.0 | 614.5718 | 9.3208 |
| 1.5992 | 38.0 | 53466 | 1.7224 | 1.0 | 611.7702 | 9.3421 |
| 1.517 | 39.0 | 54873 | 1.7274 | 1.0 | 610.9096 | 9.2879 |
| 1.5352 | 40.0 | 56280 | 1.7126 | 1.0 | 607.9050 | 9.7942 |
| 1.5186 | 41.0 | 57687 | 1.6964 | 1.0 | 607.0325 | 9.3278 |
| 1.5035 | 42.0 | 59094 | 1.6927 | 1.0 | 608.1321 | 9.7688 |
| 1.4307 | 43.0 | 60501 | 1.6996 | 1.0 | 609.1138 | 9.7955 |
| 1.3876 | 44.0 | 61908 | 1.6807 | 1.0 | 608.1156 | 9.8583 |
| 1.3587 | 45.0 | 63315 | 1.6871 | 1.0 | 612.6054 | 9.9905 |
| 1.3011 | 46.0 | 64722 | 1.6893 | 1.0 | 614.4003 | 9.9209 |
| 1.3287 | 47.0 | 66129 | 1.6864 | 1.0 | 608.6889 | 9.6559 |
| 1.2669 | 48.0 | 67536 | 1.6883 | 1.0 | 606.7991 | 9.8666 |
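
The Bleu column tracks translation quality on the evaluation set at each epoch. As an illustration of how such a score can be reproduced (the card does not state which BLEU implementation produced these numbers), corpus BLEU can be computed with the evaluate library:

```python
# Illustrative BLEU computation with the evaluate library.
# Assumption: the exact metric implementation used for this run is not
# documented; sacreBLEU is a common choice for scores on a 0-100 scale.
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["Il est parti tôt ce matin."]          # model outputs
references = [["Il est parti tôt ce matin."]]          # one list of refs per prediction
print(bleu.compute(predictions=predictions, references=references)["score"])
```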

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1
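
To approximate this environment, a pinned install along the following lines should work; the CUDA 12.8 wheel index for the PyTorch build is an assumption about how that version was installed.

```
pip install "transformers==4.57.0" "datasets==4.2.0" "tokenizers==0.22.1"
pip install "torch==2.8.0" --index-url https://download.pytorch.org/whl/cu128
```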