0f40a08bd56f064bb06e316611011c6f

This model is a fine-tuned version of google/long-t5-tglobal-xl on the Helsinki-NLP/opus_books [de-fr] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.5367
  • Data Size: 1.0
  • Epoch Runtime: 479.1850
  • Bleu: 8.0936

Model description

More information needed

Intended uses & limitations

More information needed
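Pending fuller documentation, here is a minimal, untested inference sketch. It assumes the checkpoint id shown on this page and a T5-style task prefix; the exact input format used during fine-tuning is not documented in this card, so the prefix is a hypothetical placeholder.

```python
def build_input(text: str) -> str:
    # Hypothetical T5-style task prefix; the prefix (if any) actually used
    # during fine-tuning is not documented in this card.
    return f"translate German to French: {text}"


def translate(text: str) -> str:
    # Import kept local so the sketch can be read without transformers installed.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    checkpoint = "contemmcm/0f40a08bd56f064bb06e316611011c6f"
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

    inputs = tokenizer(build_input(text), return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=128)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Note that calling `translate("Der Himmel ist blau.")` downloads the full F32 checkpoint on first use.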

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
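The per-device and total batch sizes above are consistent with the reported device count; a quick arithmetic check (plain Python, values copied from the list above):

```python
train_batch_size = 8   # per device
num_devices = 4

# Effective batch size under multi-GPU data parallelism
total_train_batch_size = train_batch_size * num_devices
print(total_train_batch_size)  # 32, matching total_train_batch_size above
```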

Training results

| Training Loss | Epoch | Step  | Validation Loss | Data Size | Epoch Runtime | Bleu   |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:-------------:|:------:|
| No log        | 0     | 0     | 3.6991          | 0         | 36.2742       | 0.5984 |
| No log        | 1     | 872   | 2.5391          | 0.0078    | 39.1399       | 3.6394 |
| No log        | 2     | 1744  | 2.4010          | 0.0156    | 46.0979       | 3.5846 |
| 0.0492        | 3     | 2616  | 2.2881          | 0.0312    | 55.9255       | 2.7109 |
| 0.174         | 4     | 3488  | 2.1849          | 0.0625    | 71.0811       | 3.3386 |
| 2.5406        | 5     | 4360  | 2.0689          | 0.125     | 98.1880       | 3.9442 |
| 2.3052        | 6     | 5232  | 1.9422          | 0.25      | 155.0986      | 4.6396 |
| 2.0354        | 7     | 6104  | 1.7929          | 0.5       | 258.4892      | 5.5812 |
| 1.835         | 8     | 6976  | 1.6543          | 1.0       | 475.3948      | 6.2439 |
| 1.6452        | 9     | 7848  | 1.5737          | 1.0       | 471.5503      | 7.0059 |
| 1.5284        | 10    | 8720  | 1.5231          | 1.0       | 472.1171      | 7.1486 |
| 1.4013        | 11    | 9592  | 1.4923          | 1.0       | 473.8437      | 7.4109 |
| 1.267         | 12    | 10464 | 1.4667          | 1.0       | 474.7735      | 7.5972 |
| 1.1788        | 13    | 11336 | 1.4660          | 1.0       | 471.5183      | 7.8321 |
| 1.0611        | 14    | 12208 | 1.4696          | 1.0       | 477.0063      | 7.8760 |
| 0.99          | 15    | 13080 | 1.4834          | 1.0       | 472.0804      | 8.0243 |
| 0.9122        | 16    | 13952 | 1.5032          | 1.0       | 476.3341      | 8.1620 |
| 0.8147        | 17    | 14824 | 1.5367          | 1.0       | 479.1850      | 8.0936 |
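Validation loss reaches its minimum at epoch 13 (1.4660) and rises afterwards, while BLEU keeps improving through epoch 16. A small sketch of selecting a best checkpoint from these logs (values copied from the full-data epochs in the table above):

```python
# (validation_loss, epoch, bleu) triples for the full-data epochs above
logs = [
    (1.6543, 8, 6.2439),
    (1.5737, 9, 7.0059),
    (1.5231, 10, 7.1486),
    (1.4923, 11, 7.4109),
    (1.4667, 12, 7.5972),
    (1.4660, 13, 7.8321),
    (1.4696, 14, 7.8760),
    (1.4834, 15, 8.0243),
    (1.5032, 16, 8.1620),
    (1.5367, 17, 8.0936),
]

best_by_loss = min(logs, key=lambda row: row[0])  # lowest validation loss
best_by_bleu = max(logs, key=lambda row: row[2])  # highest BLEU
print(best_by_loss[1], best_by_bleu[1])  # 13 16
```

Which checkpoint to keep depends on the criterion: epoch 13 by validation loss, epoch 16 by BLEU. The final checkpoint (epoch 17) is best by neither.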

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1
Safetensors

  • Model size: 0.7B params
  • Tensor type: F32