fbcc96e0819c6f1e6436a26174b44c4b

This model is a fine-tuned version of google/long-t5-local-large on the Helsinki-NLP/opus_books [en-fr] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.4348
  • Data Size: 1.0
  • Epoch Runtime: 1362.9318
  • BLEU: 12.8020
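
For reference, a minimal inference sketch is shown below. It assumes the checkpoint is published under this repository id and loads it with the standard Transformers seq2seq classes. The "translate English to French:" task prefix is an assumption borrowed from the original T5 setup; the card does not document how inputs were formatted during training.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Repo id taken from this card; adjust if the checkpoint lives elsewhere.
model_id = "contemmcm/fbcc96e0819c6f1e6436a26174b44c4b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# The task prefix is an assumption; the card does not specify the input format.
text = "translate English to French: The cat sat on the mat."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```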

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
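
Although details are missing here, the summary above names Helsinki-NLP/opus_books (en-fr) as the dataset. A minimal sketch of loading it with the datasets library follows; note that opus_books ships only a train split, so the evaluation set used for the results below must have been carved out by the training script. The 90/10 split shown is an assumption, not the documented procedure.

```python
from datasets import load_dataset

# en-fr configuration of the dataset named in the card.
raw = load_dataset("Helsinki-NLP/opus_books", "en-fr")

# Each example holds a translation pair: {"en": ..., "fr": ...}.
print(raw["train"][0]["translation"])

# The card does not document the eval split; a 90/10 split is one plausible choice.
splits = raw["train"].train_test_split(test_size=0.1, seed=42)
```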

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: constant
  • num_epochs: 50
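
As a rough guide, the sketch below maps these values onto Transformers' Seq2SeqTrainingArguments; the per-device batch size of 8 across 4 GPUs reproduces the total batch size of 32 listed above. The output_dir name is hypothetical, and the actual training script is not included in this card.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="long-t5-local-large-opus-books-en-fr",  # hypothetical name
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # x 4 GPUs = total train batch size 32
    per_device_eval_batch_size=8,    # x 4 GPUs = total eval batch size 32
    seed=42,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    predict_with_generate=True,      # required to compute BLEU at eval time
)
```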

Training results

| Training Loss | Epoch | Step   | Validation Loss | Data Size | Epoch Runtime | BLEU    |
|:-------------:|:-----:|:------:|:---------------:|:---------:|:-------------:|:-------:|
| No log        | 0     | 0      | 224.5627        | 0         | 94.6316       | 0.0156  |
| No log        | 1     | 3177   | 67.4081         | 0.0078    | 106.8302      | 0.0026  |
| 1.6781        | 2     | 6354   | 21.4840         | 0.0156    | 118.5835      | 0.0531  |
| 25.177        | 3     | 9531   | 15.2979         | 0.0312    | 138.9119      | 0.0195  |
| 18.3362       | 4     | 12708  | 11.5978         | 0.0625    | 177.0333      | 0.1558  |
| 12.8368       | 5     | 15885  | 8.4458          | 0.125     | 259.0782      | 0.3659  |
| 9.0567        | 6     | 19062  | 6.2391          | 0.25      | 415.7677      | 0.2387  |
| 6.0024        | 7     | 22239  | 4.4481          | 0.5       | 728.3800      | 0.2294  |
| 4.4559        | 8     | 25416  | 3.6644          | 1.0       | 1357.1125     | 0.5695  |
| 3.9099        | 9     | 28593  | 3.3518          | 1.0       | 1357.7848     | 0.7819  |
| 3.6106        | 10    | 31770  | 3.1036          | 1.0       | 1358.3411     | 1.2742  |
| 3.2855        | 11    | 34947  | 2.8243          | 1.0       | 1353.2540     | 2.0298  |
| 2.8816        | 12    | 38124  | 2.4093          | 1.0       | 1370.7467     | 4.9266  |
| 2.5902        | 13    | 41301  | 2.1679          | 1.0       | 1374.0909     | 6.5066  |
| 2.3623        | 14    | 44478  | 2.0081          | 1.0       | 1352.3307     | 7.1883  |
| 2.1873        | 15    | 47655  | 1.8939          | 1.0       | 1360.8891     | 8.0905  |
| 2.0319        | 16    | 50832  | 1.8014          | 1.0       | 1357.6523     | 9.1352  |
| 1.9414        | 17    | 54009  | 1.7226          | 1.0       | 1353.1928     | 9.4290  |
| 1.8693        | 18    | 57186  | 1.6697          | 1.0       | 1360.2523     | 10.1653 |
| 1.7658        | 19    | 60363  | 1.6275          | 1.0       | 1366.9805     | 10.5411 |
| 1.6635        | 20    | 63540  | 1.5972          | 1.0       | 1350.2265     | 10.3939 |
| 1.6254        | 21    | 66717  | 1.5613          | 1.0       | 1367.8792     | 10.9163 |
| 1.5495        | 22    | 69894  | 1.5264          | 1.0       | 1366.2311     | 11.1095 |
| 1.5219        | 23    | 73071  | 1.5139          | 1.0       | 1357.9950     | 11.4736 |
| 1.4926        | 24    | 76248  | 1.4879          | 1.0       | 1368.5897     | 11.4122 |
| 1.4216        | 25    | 79425  | 1.4841          | 1.0       | 1375.2616     | 11.9370 |
| 1.3864        | 26    | 82602  | 1.4552          | 1.0       | 1388.4205     | 12.1862 |
| 1.3473        | 27    | 85779  | 1.4498          | 1.0       | 1385.9928     | 11.9313 |
| 1.3051        | 28    | 88956  | 1.4359          | 1.0       | 1397.6203     | 11.8834 |
| 1.267         | 29    | 92133  | 1.4410          | 1.0       | 1402.0469     | 12.2084 |
| 1.2332        | 30    | 95310  | 1.4445          | 1.0       | 1388.6106     | 12.1456 |
| 1.1924        | 31    | 98487  | 1.4315          | 1.0       | 1398.1549     | 12.3652 |
| 1.152         | 32    | 101664 | 1.4226          | 1.0       | 1402.4454     | 12.3847 |
| 1.1319        | 33    | 104841 | 1.4414          | 1.0       | 1362.6028     | 12.4711 |
| 1.136         | 34    | 108018 | 1.4304          | 1.0       | 1367.1234     | 12.6131 |
| 1.0596        | 35    | 111195 | 1.4210          | 1.0       | 1382.0640     | 12.4171 |
| 1.0342        | 36    | 114372 | 1.4394          | 1.0       | 1369.3498     | 12.5422 |
| 1.0277        | 37    | 117549 | 1.4366          | 1.0       | 1370.2751     | 12.5769 |
| 0.9857        | 38    | 120726 | 1.4444          | 1.0       | 1378.6863     | 12.5033 |
| 0.9636        | 39    | 123903 | 1.4348          | 1.0       | 1362.9318     | 12.8020 |
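
The BLEU column is the corpus-level score on the evaluation set at the end of each epoch. The card does not name the implementation used; a minimal sketch with the evaluate library's sacrebleu metric, a common choice in Transformers translation examples, would look like this:

```python
import evaluate

# Assumption: sacrebleu via the evaluate library; the card does not say
# which BLEU implementation produced the scores above.
bleu = evaluate.load("sacrebleu")

predictions = ["Le chat est assis sur le tapis."]
references = [["Le chat s'est assis sur le tapis."]]  # one list of references per prediction
print(bleu.compute(predictions=predictions, references=references)["score"])
```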

Framework versions

  • Transformers 4.57.0
  • PyTorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1