# 41e459012ec40f55b527b839f2fe4dd3
This model is a fine-tuned version of google-t5/t5-base on the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:
- Loss: 2.2252
- Data Size: 1.0
- Epoch Runtime: 23.1403
- Bleu: 3.0580
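For scale: BLEU is a modified n-gram precision combined with a brevity penalty, reported here on a 0–100 scale, so a score around 3 means the model's translations share few n-grams with the references. The evaluation most likely used a corpus-level implementation (e.g. sacrebleu); the following simplified sentence-level sketch is for illustration only:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU: geometric mean of modified
    1..max_n-gram precisions, scaled by a brevity penalty, on 0-100."""
    cand, ref = candidate.split(), reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clip each n-gram's count by its count in the reference
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        # Floor at a tiny value so a zero precision doesn't make log() blow up
        log_precisions.append(math.log(max(overlap, 1e-9) / total))
    # Brevity penalty: punish candidates shorter than the reference
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return 100 * bp * math.exp(sum(log_precisions) / max_n)
```

A perfect match scores 100; partial overlaps fall off quickly because all n-gram orders are averaged geometrically.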
## Model description

More information needed
## Intended uses & limitations

More information needed
## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
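The per-device and total batch sizes above are consistent with simple arithmetic; a quick sanity check (the 91-steps-per-epoch figure comes from the training log below):

```python
# Sanity-check the reported batch sizes (values from the hyperparameter list above).
train_batch_size = 8       # per-device train batch size
num_devices = 4            # multi-GPU setup
total_train_batch_size = train_batch_size * num_devices  # 32, matching the report

# The log shows 91 optimizer steps per epoch at full data size, implying
# roughly 91 * 32 = 2912 training examples per pass over the data
# (the last batch may be smaller, so this is an upper bound).
steps_per_epoch = 91
approx_train_examples = steps_per_epoch * total_train_batch_size  # 2912
```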
### Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0.0 | 0 | 4.1779 | 0 | 2.4017 | 0.3270 |
| No log | 1.0 | 91 | 4.1610 | 0.0078 | 2.9235 | 0.3320 |
| No log | 2.0 | 182 | 4.0748 | 0.0156 | 3.2918 | 0.3121 |
| No log | 3.0 | 273 | 4.0013 | 0.0312 | 3.5204 | 0.3349 |
| No log | 4.0 | 364 | 3.8587 | 0.0625 | 4.4063 | 0.3405 |
| No log | 5.0 | 455 | 3.6791 | 0.125 | 6.4133 | 0.4952 |
| No log | 6.0 | 546 | 3.4888 | 0.25 | 9.4701 | 0.5835 |
| 0.3701 | 7.0 | 637 | 3.2777 | 0.5 | 14.2402 | 0.8754 |
| 3.409 | 8.0 | 728 | 3.0421 | 1.0 | 25.1413 | 1.5125 |
| 3.168 | 9.0 | 819 | 2.8988 | 1.0 | 24.1236 | 1.5792 |
| 3.0239 | 10.0 | 910 | 2.7973 | 1.0 | 23.2475 | 1.7011 |
| 2.8881 | 11.0 | 1001 | 2.7269 | 1.0 | 22.9601 | 1.7113 |
| 2.8542 | 12.0 | 1092 | 2.6635 | 1.0 | 22.4017 | 1.8997 |
| 2.7874 | 13.0 | 1183 | 2.6084 | 1.0 | 22.6214 | 2.0116 |
| 2.6943 | 14.0 | 1274 | 2.5658 | 1.0 | 23.3256 | 2.2049 |
| 2.618 | 15.0 | 1365 | 2.5271 | 1.0 | 22.3740 | 2.3431 |
| 2.5771 | 16.0 | 1456 | 2.5020 | 1.0 | 22.0935 | 2.4061 |
| 2.5363 | 17.0 | 1547 | 2.4780 | 1.0 | 22.8189 | 2.4147 |
| 2.4803 | 18.0 | 1638 | 2.4419 | 1.0 | 23.5236 | 2.5185 |
| 2.4168 | 19.0 | 1729 | 2.4226 | 1.0 | 23.6594 | 2.5646 |
| 2.3801 | 20.0 | 1820 | 2.4085 | 1.0 | 22.1487 | 2.6111 |
| 2.3458 | 21.0 | 1911 | 2.3754 | 1.0 | 23.3162 | 2.6308 |
| 2.2813 | 22.0 | 2002 | 2.3594 | 1.0 | 22.4091 | 2.6212 |
| 2.2783 | 23.0 | 2093 | 2.3446 | 1.0 | 22.4703 | 2.7117 |
| 2.2301 | 24.0 | 2184 | 2.3268 | 1.0 | 22.6361 | 2.6914 |
| 2.205 | 25.0 | 2275 | 2.3158 | 1.0 | 23.3835 | 2.7437 |
| 2.166 | 26.0 | 2366 | 2.2981 | 1.0 | 23.7972 | 2.8446 |
| 2.1408 | 27.0 | 2457 | 2.2993 | 1.0 | 23.0056 | 2.8545 |
| 2.1104 | 28.0 | 2548 | 2.2829 | 1.0 | 22.5164 | 2.8376 |
| 2.0603 | 29.0 | 2639 | 2.2716 | 1.0 | 22.5489 | 2.9210 |
| 2.0381 | 30.0 | 2730 | 2.2589 | 1.0 | 23.0023 | 2.8289 |
| 1.9781 | 31.0 | 2821 | 2.2622 | 1.0 | 23.2964 | 2.8548 |
| 2.0004 | 32.0 | 2912 | 2.2412 | 1.0 | 24.5081 | 3.0175 |
| 1.9452 | 33.0 | 3003 | 2.2556 | 1.0 | 24.4450 | 2.9590 |
| 1.8851 | 34.0 | 3094 | 2.2386 | 1.0 | 25.1583 | 3.0262 |
| 1.8973 | 35.0 | 3185 | 2.2518 | 1.0 | 23.8390 | 2.9766 |
| 1.8503 | 36.0 | 3276 | 2.2389 | 1.0 | 23.7478 | 3.0425 |
| 1.8434 | 37.0 | 3367 | 2.2250 | 1.0 | 23.3950 | 3.1194 |
| 1.8245 | 38.0 | 3458 | 2.2336 | 1.0 | 24.4400 | 3.1082 |
| 1.7868 | 39.0 | 3549 | 2.2098 | 1.0 | 24.7132 | 3.0051 |
| 1.7714 | 40.0 | 3640 | 2.2125 | 1.0 | 24.2303 | 3.0249 |
| 1.7089 | 41.0 | 3731 | 2.2155 | 1.0 | 24.2245 | 3.0096 |
| 1.7064 | 42.0 | 3822 | 2.2224 | 1.0 | 24.9689 | 2.9906 |
| 1.689 | 43.0 | 3913 | 2.2252 | 1.0 | 23.1403 | 3.0580 |
### Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1
## Model tree for contemmcm/41e459012ec40f55b527b839f2fe4dd3

Base model: google-t5/t5-base