970d213fd939ababbf9e70ddf0dabd99

This model is a fine-tuned version of google/long-t5-local-large on the Helsinki-NLP/opus_books [es-no] dataset. It achieves the following results on the evaluation set:

  • Loss: 2.8375
  • Data Size: 1.0
  • Epoch Runtime: 44.8970
  • Bleu: 0.2456
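
Below is a minimal inference sketch using the Transformers API. The repo id is the one this card is published under; the Spanish input sentence is illustrative only, and whether the model expects a T5-style task prefix is not documented here:

```python
# Minimal inference sketch; not an official usage example from the model author.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "contemmcm/970d213fd939ababbf9e70ddf0dabd99"  # repo id of this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Translate a Spanish sentence to Norwegian (example input, not from the card).
inputs = tokenizer("Era el mejor de los tiempos.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```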

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
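
The card names the corpus as the es-no pairing of Helsinki-NLP/opus_books. A minimal loading sketch with the `datasets` library, assuming a config named "es-no" exists for that dataset:

```python
# Sketch of loading the dataset named in this card; the "es-no" config name
# is inferred from the card header and may need adjusting.
from datasets import load_dataset

dataset = load_dataset("Helsinki-NLP/opus_books", "es-no")
print(dataset)              # available splits and sizes
print(dataset["train"][0])  # e.g. {'id': ..., 'translation': {'es': ..., 'no': ...}}
```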

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: constant
  • num_epochs: 50
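
As referenced above, these values map onto a `Seq2SeqTrainingArguments` configuration roughly as follows. This is a sketch, not the author's training script; the output directory name is hypothetical, and the 4-GPU distributed launch would be handled externally (e.g. `torchrun --nproc_per_node=4`):

```python
# Sketch reconstructing the listed hyperparameters; not the original script.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="long-t5-local-large-opus-books-es-no",  # hypothetical name
    learning_rate=5e-5,
    per_device_train_batch_size=8,  # x 4 GPUs = total train batch size 32
    per_device_eval_batch_size=8,   # x 4 GPUs = total eval batch size 32
    seed=42,
    optim="adamw_torch",            # betas=(0.9, 0.999), eps=1e-08 are the defaults
    lr_scheduler_type="constant",
    num_train_epochs=50,
)
```

With four devices at a per-device batch size of 8, the effective train and eval batch sizes come out to the 32 reported in the list.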

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:----:|
| No log | 0 | 0 | 221.3844 | 0 | 3.6443 | 0.0043 |
| No log | 1 | 89 | 197.2671 | 0.0078 | 4.7221 | 0.0040 |
| No log | 2 | 178 | 168.6531 | 0.0156 | 6.0983 | 0.0041 |
| No log | 3 | 267 | 144.2118 | 0.0312 | 7.7860 | 0.0049 |
| No log | 4 | 356 | 105.7805 | 0.0625 | 11.2901 | 0.0038 |
| No log | 5 | 445 | 53.3554 | 0.125 | 13.4013 | 0.0040 |
| 8.0695 | 6 | 534 | 22.2945 | 0.25 | 17.7091 | 0.0047 |
| 13.7423 | 7 | 623 | 13.4378 | 0.5 | 26.3652 | 0.0147 |
| 17.3418 | 8.0 | 712 | 9.8752 | 1.0 | 44.4832 | 0.0184 |
| 13.432 | 9.0 | 801 | 8.4544 | 1.0 | 43.5700 | 0.0224 |
| 12.4192 | 10.0 | 890 | 8.5307 | 1.0 | 42.9915 | 0.0163 |
| 11.1762 | 11.0 | 979 | 7.0403 | 1.0 | 43.5339 | 0.0305 |
| 10.0983 | 12.0 | 1068 | 6.7428 | 1.0 | 43.5426 | 0.0467 |
| 9.197 | 13.0 | 1157 | 6.3366 | 1.0 | 43.5504 | 0.0322 |
| 8.8429 | 14.0 | 1246 | 5.7970 | 1.0 | 43.4010 | 0.0517 |
| 8.2187 | 15.0 | 1335 | 5.7831 | 1.0 | 43.4180 | 0.0518 |
| 7.7566 | 16.0 | 1424 | 5.1652 | 1.0 | 44.4639 | 0.0629 |
| 7.3222 | 17.0 | 1513 | 4.9510 | 1.0 | 43.4638 | 0.0519 |
| 6.9894 | 18.0 | 1602 | 4.8395 | 1.0 | 44.4988 | 0.1446 |
| 6.6839 | 19.0 | 1691 | 4.8684 | 1.0 | 42.7534 | 0.1052 |
| 6.3975 | 20.0 | 1780 | 4.6020 | 1.0 | 43.2935 | 0.0793 |
| 6.0659 | 21.0 | 1869 | 4.3094 | 1.0 | 44.0930 | 0.0938 |
| 5.8541 | 22.0 | 1958 | 4.2773 | 1.0 | 43.8304 | 0.1281 |
| 5.705 | 23.0 | 2047 | 4.1064 | 1.0 | 44.2400 | 0.1201 |
| 5.4843 | 24.0 | 2136 | 3.9477 | 1.0 | 44.2527 | 0.1240 |
| 5.2874 | 25.0 | 2225 | 3.9733 | 1.0 | 44.9309 | 0.1097 |
| 5.1221 | 26.0 | 2314 | 3.7253 | 1.0 | 44.9021 | 0.1569 |
| 5.0033 | 27.0 | 2403 | 3.6918 | 1.0 | 45.3199 | 0.1699 |
| 4.8753 | 28.0 | 2492 | 3.7214 | 1.0 | 44.7313 | 0.1219 |
| 4.703 | 29.0 | 2581 | 3.5098 | 1.0 | 45.1712 | 0.1493 |
| 4.5988 | 30.0 | 2670 | 3.4984 | 1.0 | 44.9037 | 0.1332 |
| 4.4454 | 31.0 | 2759 | 3.4042 | 1.0 | 44.9299 | 0.1913 |
| 4.3765 | 32.0 | 2848 | 3.3953 | 1.0 | 44.7618 | 0.1557 |
| 4.2578 | 33.0 | 2937 | 3.2852 | 1.0 | 44.7162 | 0.2015 |
| 4.1686 | 34.0 | 3026 | 3.3326 | 1.0 | 45.8108 | 0.1795 |
| 4.054 | 35.0 | 3115 | 3.2380 | 1.0 | 44.9969 | 0.1693 |
| 3.9608 | 36.0 | 3204 | 3.2087 | 1.0 | 45.1706 | 0.1048 |
| 3.9334 | 37.0 | 3293 | 3.2222 | 1.0 | 44.9701 | 0.1708 |
| 3.8444 | 38.0 | 3382 | 3.1124 | 1.0 | 45.3180 | 0.2026 |
| 3.7511 | 39.0 | 3471 | 3.1404 | 1.0 | 45.2833 | 0.1787 |
| 3.6928 | 40.0 | 3560 | 3.1486 | 1.0 | 44.9131 | 0.1809 |
| 3.6322 | 41.0 | 3649 | 3.0144 | 1.0 | 46.0521 | 0.2648 |
| 3.5504 | 42.0 | 3738 | 3.0636 | 1.0 | 45.6709 | 0.1582 |
| 3.5065 | 43.0 | 3827 | 3.0089 | 1.0 | 44.2839 | 0.2405 |
| 3.4138 | 44.0 | 3916 | 2.9631 | 1.0 | 45.5604 | 0.1989 |
| 3.3972 | 45.0 | 4005 | 2.9918 | 1.0 | 44.8559 | 0.1767 |
| 3.3264 | 46.0 | 4094 | 2.9466 | 1.0 | 45.2181 | 0.2354 |
| 3.306 | 47.0 | 4183 | 2.8690 | 1.0 | 45.7905 | 0.2823 |
| 3.2279 | 48.0 | 4272 | 2.8803 | 1.0 | 44.6883 | 0.2865 |
| 3.1756 | 49.0 | 4361 | 2.8623 | 1.0 | 45.3366 | 0.2921 |
| 3.1417 | 50.0 | 4450 | 2.8375 | 1.0 | 44.8970 | 0.2456 |
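
The Bleu column could be reproduced at evaluation time with a metric such as sacreBLEU; the exact metric implementation is not stated in the card, so the sketch below (using the `evaluate` library, with placeholder strings) is an assumption. Note that sacreBLEU reports scores on a 0-100 scale, while the values above appear to be on a 0-1 scale.

```python
# Sketch of scoring predictions with sacreBLEU via the `evaluate` library;
# the prediction/reference strings are placeholders, not data from the card.
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["Det var den beste av tider."]    # model outputs (Norwegian)
references = [["Det var den beste av tider."]]   # one list of references per prediction
result = bleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))
```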

Framework versions

  • Transformers 4.57.0
  • PyTorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1