d5ee92af64f5995dfb2cff04cdd13fc7

This model is a fine-tuned version of google-t5/t5-base on the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set (a usage sketch follows the list):

  • Loss: 1.4281
  • Data Size: 1.0 (fraction of the training data used)
  • Epoch Runtime: 198.4646
  • Bleu: 9.9183
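
As a hedged illustration of how a fine-tuned T5 translation checkpoint like this one is typically loaded for inference: the repository id below and the language pair are assumptions, since opus_books covers several pairs and the card does not state which one was used.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Assumed repository id; replace with the actual checkpoint path if different.
checkpoint = "contemmcm/d5ee92af64f5995dfb2cff04cdd13fc7"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# T5 expects a task prefix; the card does not state the language pair,
# so "translate English to French:" is only an assumed example.
text = "translate English to French: The cat sat on the windowsill."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```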

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
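
The hyperparameters above map directly onto the standard Seq2SeqTrainingArguments of the Transformers Trainer. Below is a minimal sketch of an equivalent configuration; the output directory and the surrounding training script are assumptions, and the multi-GPU setup would come from the launcher (e.g. torchrun or accelerate) rather than these arguments.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the configuration implied by the card; output_dir is an assumption.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-base-opus-books",
    learning_rate=5e-5,
    per_device_train_batch_size=8,  # 4 GPUs -> total train batch size 32
    per_device_eval_batch_size=8,   # 4 GPUs -> total eval batch size 32
    seed=42,
    optim="adamw_torch",            # betas=(0.9, 0.999), epsilon=1e-08 are the defaults
    lr_scheduler_type="constant",
    num_train_epochs=50,
    predict_with_generate=True,     # generate translations during eval so BLEU can be computed
)
```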

Training results

| Training Loss | Epoch | Step  | Validation Loss | Data Size | Epoch Runtime | Bleu   |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:-------------:|:------:|
| No log        | 0     | 0     | 3.3889          | 0         | 19.5402       | 0.5448 |
| No log        | 1     | 808   | 3.2189          | 0.0078    | 23.3609       | 0.7044 |
| No log        | 2     | 1616  | 3.0458          | 0.0156    | 29.5394       | 1.1133 |
| No log        | 3     | 2424  | 2.8911          | 0.0312    | 25.6167       | 1.7552 |
| 0.0977        | 4     | 3232  | 2.7318          | 0.0625    | 32.6952       | 2.7459 |
| 2.9899        | 5     | 4040  | 2.5805          | 0.125     | 42.8306       | 3.3725 |
| 2.7465        | 6     | 4848  | 2.4096          | 0.25      | 78.0312       | 3.9332 |
| 2.4591        | 7     | 5656  | 2.2130          | 0.5       | 108.7962      | 4.8950 |
| 2.2598        | 8     | 6464  | 2.0027          | 1.0       | 204.9841      | 5.9369 |
| 2.0827        | 9     | 7272  | 1.8715          | 1.0       | 201.9693      | 6.6774 |
| 1.9699        | 10    | 8080  | 1.7852          | 1.0       | 202.4802      | 7.3452 |
| 1.8979        | 11    | 8888  | 1.7264          | 1.0       | 200.1101      | 7.4456 |
| 1.8060        | 12    | 9696  | 1.6791          | 1.0       | 197.1412      | 7.8435 |
| 1.7310        | 13    | 10504 | 1.6473          | 1.0       | 190.5002      | 8.0464 |
| 1.6860        | 14    | 11312 | 1.6093          | 1.0       | 188.9600      | 8.2376 |
| 1.6499        | 15    | 12120 | 1.5840          | 1.0       | 188.9396      | 8.4719 |
| 1.6360        | 16    | 12928 | 1.5622          | 1.0       | 189.0705      | 8.6122 |
| 1.5533        | 17    | 13736 | 1.5403          | 1.0       | 183.1241      | 8.7637 |
| 1.5276        | 18    | 14544 | 1.5243          | 1.0       | 183.3175      | 8.8833 |
| 1.4875        | 19    | 15352 | 1.5121          | 1.0       | 207.7781      | 9.1169 |
| 1.4468        | 20    | 16160 | 1.4963          | 1.0       | 200.3920      | 9.1539 |
| 1.4129        | 21    | 16968 | 1.4895          | 1.0       | 187.3135      | 9.3170 |
| 1.3828        | 22    | 17776 | 1.4767          | 1.0       | 194.1532      | 9.4011 |
| 1.3570        | 23    | 18584 | 1.4679          | 1.0       | 193.1984      | 9.4004 |
| 1.3232        | 24    | 19392 | 1.4631          | 1.0       | 206.8371      | 9.5002 |
| 1.3030        | 25    | 20200 | 1.4546          | 1.0       | 189.9689      | 9.5300 |
| 1.2748        | 26    | 21008 | 1.4482          | 1.0       | 205.0709      | 9.7769 |
| 1.2569        | 27    | 21816 | 1.4486          | 1.0       | 200.3493      | 9.6726 |
| 1.2345        | 28    | 22624 | 1.4458          | 1.0       | 189.3574      | 9.6684 |
| 1.2344        | 29    | 23432 | 1.4407          | 1.0       | 192.3539      | 9.7323 |
| 1.1679        | 30    | 24240 | 1.4383          | 1.0       | 210.2699      | 9.7481 |
| 1.1749        | 31    | 25048 | 1.4340          | 1.0       | 192.7680      | 9.6688 |
| 1.1594        | 32    | 25856 | 1.4361          | 1.0       | 221.8852      | 9.7711 |
| 1.1395        | 33    | 26664 | 1.4381          | 1.0       | 192.2963      | 9.7037 |
| 1.1177        | 34    | 27472 | 1.4270          | 1.0       | 190.7512      | 9.7191 |
| 1.0783        | 35    | 28280 | 1.4329          | 1.0       | 189.8960      | 9.7003 |
| 1.1051        | 36    | 29088 | 1.4316          | 1.0       | 216.8254      | 9.6882 |
| 1.0664        | 37    | 29896 | 1.4326          | 1.0       | 194.1363      | 9.7428 |
| 1.0560        | 38    | 30704 | 1.4281          | 1.0       | 198.4646      | 9.9183 |
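
The Bleu column is presumably computed from generated translations against the reference texts. Below is a minimal sketch using the evaluate library's sacrebleu metric; the choice of scorer is an assumption, since the card only reports the resulting scores, not how they were computed.

```python
import evaluate

# Assumed scorer: sacrebleu via the evaluate library.
bleu = evaluate.load("sacrebleu")
predictions = ["The cat sat on the windowsill."]
references = [["The cat was sitting on the windowsill."]]  # one list of references per prediction
result = bleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))
```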

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1