d97211242dece02bd8b2d809a3a38bf1

This model is a fine-tuned version of google-t5/t5-base on the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:

  • Loss: 1.0666
  • Data Size: 1.0
  • Epoch Runtime: 736.0040
  • Bleu: 15.4028

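As a quick sanity check, the checkpoint loads with the standard transformers seq2seq classes. A minimal sketch follows; the opus_books language pair and the T5-style `translate English to French:` task prefix are assumptions for illustration, since this card does not state them.

```python
# Minimal inference sketch. ASSUMPTIONS: the source/target languages and the
# T5-style task prefix are guesses -- this card does not state which
# opus_books language pair the model was trained on.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "contemmcm/d97211242dece02bd8b2d809a3a38bf1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "translate English to French: The book is on the table."  # assumed prefix
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
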
Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
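
The summary above does name the corpus, though: Helsinki-NLP/opus_books, which the Hub distributes one config per language pair. A minimal loading sketch is below; the `en-fr` config is an assumed example, not something stated in this card.

```python
# Loading sketch for the training corpus. ASSUMPTION: the "en-fr" config is
# an illustrative example -- opus_books ships one config per language pair.
from datasets import load_dataset

dataset = load_dataset("Helsinki-NLP/opus_books", "en-fr")
print(dataset["train"][0])  # {'id': ..., 'translation': {'en': ..., 'fr': ...}}
```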

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
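
For reference, these settings map onto transformers' Seq2SeqTrainingArguments as sketched below. Only the values listed above are grounded; the output_dir is a placeholder, and the 4-GPU distributed launch is handled outside this object (e.g. via torchrun).

```python
# Sketch of the listed hyperparameters as Seq2SeqTrainingArguments.
# ASSUMPTION: output_dir is a placeholder; any argument not named in the
# list above is left at its default.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./t5-base-opus-books",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=8,      # x4 GPUs -> total train batch size 32
    per_device_eval_batch_size=8,       # x4 GPUs -> total eval batch size 32
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
)
```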

Training results

| Training Loss | Epoch | Step  | Validation Loss | Data Size | Epoch Runtime | Bleu    |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:-------------:|:-------:|
| No log        | 0     | 0     | 1.6684          | 0         | 74.0586       | 10.0814 |
| No log        | 1     | 3177  | 1.5800          | 0.0078    | 68.1763       | 10.9074 |
| 0.027         | 2     | 6354  | 1.5189          | 0.0156    | 76.0990       | 11.3439 |
| 1.6491        | 3     | 9531  | 1.4607          | 0.0312    | 110.2664      | 11.3189 |
| 1.6056        | 4     | 12708 | 1.4157          | 0.0625    | 110.4240      | 11.4732 |
| 1.52          | 5     | 15885 | 1.3731          | 0.125     | 194.1773      | 12.1400 |
| 1.4766        | 6     | 19062 | 1.3199          | 0.25      | 281.8424      | 12.4997 |
| 1.4006        | 7     | 22239 | 1.2583          | 0.5       | 460.6353      | 13.0648 |
| 1.3184        | 8     | 25416 | 1.1954          | 1.0       | 798.1731      | 13.6015 |
| 1.2527        | 9     | 28593 | 1.1609          | 1.0       | 765.5459      | 13.8703 |
| 1.2303        | 10    | 31770 | 1.1324          | 1.0       | 730.7526      | 14.1380 |
| 1.168         | 11    | 34947 | 1.1149          | 1.0       | 741.3822      | 14.6429 |
| 1.1215        | 12    | 38124 | 1.1003          | 1.0       | 759.3569      | 14.5906 |
| 1.0906        | 13    | 41301 | 1.0891          | 1.0       | 748.3090      | 14.9765 |
| 1.0568        | 14    | 44478 | 1.0827          | 1.0       | 750.4526      | 14.9267 |
| 1.0492        | 15    | 47655 | 1.0760          | 1.0       | 758.3424      | 14.9931 |
| 0.9851        | 16    | 50832 | 1.0698          | 1.0       | 780.3965      | 15.1991 |
| 0.9941        | 17    | 54009 | 1.0655          | 1.0       | 754.5295      | 15.1713 |
| 0.984         | 18    | 57186 | 1.0632          | 1.0       | 724.0579      | 15.3117 |
| 0.9459        | 19    | 60363 | 1.0621          | 1.0       | 721.8757      | 15.2825 |
| 0.9163        | 20    | 63540 | 1.0592          | 1.0       | 723.9400      | 15.3073 |
| 0.9086        | 21    | 66717 | 1.0607          | 1.0       | 725.7311      | 15.3085 |
| 0.8928        | 22    | 69894 | 1.0615          | 1.0       | 726.2045      | 15.2994 |
| 0.8819        | 23    | 73071 | 1.0590          | 1.0       | 739.2236      | 15.5599 |
| 0.8871        | 24    | 76248 | 1.0584          | 1.0       | 744.4550      | 15.3351 |
| 0.8469        | 25    | 79425 | 1.0609          | 1.0       | 724.7774      | 15.4515 |
| 0.8415        | 26    | 82602 | 1.0607          | 1.0       | 729.5361      | 15.4759 |
| 0.8151        | 27    | 85779 | 1.0664          | 1.0       | 724.0018      | 15.5203 |
| 0.8087        | 28    | 88956 | 1.0666          | 1.0       | 736.0040      | 15.4028 |
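
The Bleu column is on a 0-100 scale and can be reproduced with the evaluate library's sacrebleu wrapper, as sketched below. The example strings are placeholders, and whether the original run used sacrebleu specifically (rather than another BLEU implementation) is an assumption.

```python
# BLEU scoring sketch. ASSUMPTION: the original run's exact BLEU
# implementation is not stated in this card; sacrebleu is a common default.
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["Le livre est sur la table."]        # placeholder model output
references = [["Le livre est pose sur la table."]]  # placeholder reference(s)
result = bleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))  # sacrebleu reports BLEU on a 0-100 scale
```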

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1