db1020b67a2da5dbf95757572e7274b5

This model is a fine-tuned version of google/long-t5-tglobal-xl on the Helsinki-NLP/opus_books [it-ru] dataset. It achieves the following results on the evaluation set:

  • Loss: 0.8862
  • Data Size: 1.0
  • Epoch Runtime: 262.9328 seconds
  • Bleu: 12.5863
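
Since the card does not document usage, here is a minimal inference sketch with the transformers API; the repo id (taken from this card's page) and the absence of a T5-style task prefix are assumptions.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Repo id as listed on this card's page; treat it as an assumption.
model_id = "contemmcm/db1020b67a2da5dbf95757572e7274b5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Italian source sentence; whether the fine-tuning inputs used a task
# prefix is not documented, so the raw sentence is passed as-is.
inputs = tokenizer("Il gatto dorme sul divano.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```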

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
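
Although the split details are not documented, the named dataset can be inspected with the datasets library; a minimal sketch:

```python
from datasets import load_dataset

# Load the it-ru configuration named in this card; the train/eval split
# actually used for this run is not documented.
dataset = load_dataset("Helsinki-NLP/opus_books", "it-ru")
example = dataset["train"][0]["translation"]
print(example["it"])  # Italian source
print(example["ru"])  # Russian target
```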

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: constant
  • num_epochs: 50
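
For reference, the list above maps onto the standard transformers training arguments roughly as follows; this is a sketch, not the actual training script, and the output path is hypothetical.

```python
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="long-t5-tglobal-xl-opus-books-it-ru",  # hypothetical path
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # 4 GPUs -> total train batch size 32
    per_device_eval_batch_size=8,    # 4 GPUs -> total eval batch size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,      # required to compute BLEU during eval
)
```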

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime (s) | Bleu |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-----------------:|:----:|
| No log        | 0     | 0    | 2.7129          | 0         | 18.8115           | 0.1735 |
| No log        | 1     | 447  | 2.2923          | 0.0078    | 21.4956           | 0.7406 |
| 0.0432        | 2     | 894  | 2.0834          | 0.0156    | 26.2604           | 1.3894 |
| 0.0553        | 3     | 1341 | 1.9620          | 0.0312    | 32.2765           | 0.9524 |
| 0.0865        | 4     | 1788 | 1.8338          | 0.0625    | 41.8070           | 1.4238 |
| 0.1473        | 5     | 2235 | 1.6963          | 0.125     | 58.7972           | 2.3192 |
| 1.8625        | 6     | 2682 | 1.5436          | 0.25      | 87.9583           | 3.1803 |
| 1.592         | 7     | 3129 | 1.3679          | 0.5       | 144.7514          | 4.6704 |
| 1.3172        | 8     | 3576 | 1.1527          | 1.0       | 262.7576          | 6.4959 |
| 1.1637        | 9     | 4023 | 1.0329          | 1.0       | 260.6613          | 7.8519 |
| 1.0307        | 10    | 4470 | 0.9629          | 1.0       | 261.6965          | 8.9914 |
| 0.934         | 11    | 4917 | 0.9085          | 1.0       | 261.6134          | 9.6975 |
| 0.8688        | 12    | 5364 | 0.8760          | 1.0       | 261.6942          | 10.3006 |
| 0.7792        | 13    | 5811 | 0.8555          | 1.0       | 262.2055          | 10.8635 |
| 0.7232        | 14    | 6258 | 0.8461          | 1.0       | 260.3408          | 11.1665 |
| 0.678         | 15    | 6705 | 0.8286          | 1.0       | 261.0828          | 11.5097 |
| 0.6177        | 16    | 7152 | 0.8356          | 1.0       | 260.9886          | 11.7365 |
| 0.574         | 17    | 7599 | 0.8305          | 1.0       | 259.4559          | 12.0655 |
| 0.5306        | 18    | 8046 | 0.8270          | 1.0       | 261.9602          | 12.1897 |
| 0.4945        | 19    | 8493 | 0.8496          | 1.0       | 260.2814          | 12.2705 |
| 0.4503        | 20    | 8940 | 0.8637          | 1.0       | 259.6172          | 12.4815 |
| 0.4242        | 21    | 9387 | 0.8771          | 1.0       | 260.4173          | 12.5712 |
| 0.37          | 22    | 9834 | 0.8862          | 1.0       | 262.9328          | 12.5863 |
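
The Bleu values are consistent with the usual 0-100 sacreBLEU scale. A hedged sketch of how such scores can be computed with the evaluate library (the exact metric implementation used for this run is not documented):

```python
import evaluate

bleu = evaluate.load("sacrebleu")

# Toy example: one model output and one list of references per sample.
predictions = ["Кошка спит на диване."]
references = [["Кошка спит на диване."]]

result = bleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))  # sacreBLEU score on a 0-100 scale
```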

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1