cc6530f988294864455773617553a4e2
This model is a fine-tuned version of google/long-t5-local-large on the Helsinki-NLP/opus_books [fi-fr] dataset. It achieves the following results on the evaluation set:
- Loss: 3.9504
- Data Size: 1.0
- Epoch Runtime: 43.6482
- Bleu: 0.6960
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: constant
- num_epochs: 50
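The derived values in this list can be checked with a little arithmetic. The sketch below mirrors the hyperparameters in a plain dict. The key names follow the keyword names of `transformers.TrainingArguments`, which is an assumption about how the run was configured, not something this card confirms. The figure of 88 optimizer steps per epoch is read from the Step column of the training results table.

```python
# Hypothetical reconstruction of the training configuration as a plain dict.
# Key names mirror transformers.TrainingArguments keywords (an assumption);
# the values are copied from the hyperparameter list in this card.
hyperparameters = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 50,
}

NUM_DEVICES = 4       # num_devices from the card
STEPS_PER_EPOCH = 88  # step increment per epoch in the results table

# Effective batch sizes across all devices.
total_train_batch_size = hyperparameters["per_device_train_batch_size"] * NUM_DEVICES
total_eval_batch_size = hyperparameters["per_device_eval_batch_size"] * NUM_DEVICES

# Total optimizer steps over the whole run.
total_steps = STEPS_PER_EPOCH * hyperparameters["num_train_epochs"]

print(total_train_batch_size, total_eval_batch_size, total_steps)
```

The results agree with the reported total_train_batch_size and total_eval_batch_size of 32 and with the final step count of 4400 in the results table.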
Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 210.3118 | 0 | 3.5214 | 0.0339 |
| No log | 1 | 88 | 194.9818 | 0.0078 | 4.3850 | 0.0391 |
| No log | 2 | 176 | 169.7989 | 0.0156 | 5.5429 | 0.0418 |
| No log | 3 | 264 | 138.3005 | 0.0312 | 7.4443 | 0.0376 |
| No log | 4 | 352 | 98.2089 | 0.0625 | 10.0366 | 0.0106 |
| No log | 5 | 440 | 57.4321 | 0.125 | 13.4255 | 0.0171 |
| 8.5688 | 6 | 528 | 29.0110 | 0.25 | 17.6872 | 0.0283 |
| 16.0629 | 7 | 616 | 18.4866 | 0.5 | 26.4004 | 0.2147 |
| 24.2236 | 8.0 | 704 | 14.5337 | 1.0 | 44.8358 | 0.3396 |
| 21.6713 | 9.0 | 792 | 12.8300 | 1.0 | 44.1918 | 0.1911 |
| 18.5801 | 10.0 | 880 | 11.2006 | 1.0 | 42.8811 | 0.0965 |
| 16.5609 | 11.0 | 968 | 10.4868 | 1.0 | 43.8189 | 0.1437 |
| 15.0038 | 12.0 | 1056 | 9.2199 | 1.0 | 43.4298 | 0.1613 |
| 14.2505 | 13.0 | 1144 | 9.2859 | 1.0 | 42.9938 | 0.2398 |
| 13.3176 | 14.0 | 1232 | 8.5574 | 1.0 | 44.2948 | 0.2922 |
| 12.514 | 15.0 | 1320 | 7.9899 | 1.0 | 43.3043 | 0.2558 |
| 11.7392 | 16.0 | 1408 | 7.7368 | 1.0 | 43.6196 | 0.2846 |
| 11.3779 | 17.0 | 1496 | 7.9881 | 1.0 | 44.4180 | 0.2235 |
| 10.8107 | 18.0 | 1584 | 7.3984 | 1.0 | 42.9446 | 0.3104 |
| 10.3264 | 19.0 | 1672 | 6.7918 | 1.0 | 43.6092 | 0.3328 |
| 9.7584 | 20.0 | 1760 | 6.9432 | 1.0 | 42.9454 | 0.2219 |
| 9.5394 | 21.0 | 1848 | 6.6225 | 1.0 | 43.8095 | 0.2157 |
| 9.1196 | 22.0 | 1936 | 5.9313 | 1.0 | 42.8700 | 0.2882 |
| 8.6616 | 23.0 | 2024 | 5.8392 | 1.0 | 44.2303 | 0.3488 |
| 8.3646 | 24.0 | 2112 | 5.9353 | 1.0 | 44.0809 | 0.3769 |
| 8.0709 | 25.0 | 2200 | 5.5733 | 1.0 | 43.3977 | 0.2680 |
| 7.8487 | 26.0 | 2288 | 5.7821 | 1.0 | 43.9842 | 0.5559 |
| 7.5668 | 27.0 | 2376 | 5.6761 | 1.0 | 43.7011 | 0.2355 |
| 7.2396 | 28.0 | 2464 | 5.2907 | 1.0 | 43.4624 | 0.3873 |
| 7.042 | 29.0 | 2552 | 5.2608 | 1.0 | 43.5775 | 0.3210 |
| 6.8641 | 30.0 | 2640 | 5.0165 | 1.0 | 43.7052 | 0.4909 |
| 6.5967 | 31.0 | 2728 | 5.0911 | 1.0 | 43.7801 | 0.3103 |
| 6.423 | 32.0 | 2816 | 4.7339 | 1.0 | 42.6348 | 0.3563 |
| 6.2719 | 33.0 | 2904 | 4.7812 | 1.0 | 44.5199 | 0.4468 |
| 6.1618 | 34.0 | 2992 | 4.8750 | 1.0 | 43.5037 | 0.4621 |
| 5.9636 | 35.0 | 3080 | 4.7315 | 1.0 | 43.6152 | 0.3522 |
| 5.8011 | 36.0 | 3168 | 4.5619 | 1.0 | 43.7941 | 0.5355 |
| 5.7031 | 37.0 | 3256 | 4.4944 | 1.0 | 43.8364 | 0.4846 |
| 5.5746 | 38.0 | 3344 | 4.4306 | 1.0 | 42.8775 | 0.5597 |
| 5.4236 | 39.0 | 3432 | 4.4541 | 1.0 | 44.3190 | 0.5422 |
| 5.2757 | 40.0 | 3520 | 4.4262 | 1.0 | 44.0665 | 0.6967 |
| 5.1584 | 41.0 | 3608 | 4.2310 | 1.0 | 44.1722 | 0.6211 |
| 5.0859 | 42.0 | 3696 | 4.2331 | 1.0 | 43.3939 | 0.9223 |
| 4.9627 | 43.0 | 3784 | 4.1307 | 1.0 | 42.9168 | 0.6862 |
| 4.881 | 44.0 | 3872 | 4.1794 | 1.0 | 43.3896 | 0.6229 |
| 4.7832 | 45.0 | 3960 | 4.0830 | 1.0 | 43.9060 | 0.6629 |
| 4.7222 | 46.0 | 4048 | 4.0359 | 1.0 | 43.2298 | 0.6652 |
| 4.6348 | 47.0 | 4136 | 4.0967 | 1.0 | 44.0093 | 0.6725 |
| 4.5194 | 48.0 | 4224 | 4.0618 | 1.0 | 43.0736 | 0.9749 |
| 4.453 | 49.0 | 4312 | 3.9898 | 1.0 | 43.0868 | 0.7547 |
| 4.3851 | 50.0 | 4400 | 3.9504 | 1.0 | 43.6482 | 0.6960 |
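BLEU fluctuates from epoch to epoch while validation loss falls fairly steadily, so the final checkpoint is not necessarily the best one by BLEU. The following is a minimal sketch of how the table rows could be parsed to compare the two criteria. It uses only the last five rows, copied verbatim from the table, and the helper name `parse_rows` is hypothetical.

```python
# Last five rows of the training results table, copied verbatim.
# Columns: Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu
ROWS = """
| 4.7222 | 46.0 | 4048 | 4.0359 | 1.0 | 43.2298 | 0.6652 |
| 4.6348 | 47.0 | 4136 | 4.0967 | 1.0 | 44.0093 | 0.6725 |
| 4.5194 | 48.0 | 4224 | 4.0618 | 1.0 | 43.0736 | 0.9749 |
| 4.453 | 49.0 | 4312 | 3.9898 | 1.0 | 43.0868 | 0.7547 |
| 4.3851 | 50.0 | 4400 | 3.9504 | 1.0 | 43.6482 | 0.6960 |
"""


def parse_rows(text):
    """Turn markdown table rows into dicts with the columns of interest."""
    records = []
    for line in text.strip().splitlines():
        # Drop the outer pipes, then split the remaining cells.
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        records.append(
            {
                "epoch": float(cells[1]),
                "step": int(cells[2]),
                "val_loss": float(cells[3]),
                "bleu": float(cells[6]),
            }
        )
    return records


records = parse_rows(ROWS)
best_by_bleu = max(records, key=lambda r: r["bleu"])
best_by_loss = min(records, key=lambda r: r["val_loss"])

print("best BLEU:", best_by_bleu["epoch"], best_by_bleu["bleu"])
print("lowest validation loss:", best_by_loss["epoch"], best_by_loss["val_loss"])
```

On these rows the highest BLEU (0.9749) falls at epoch 48, while the lowest validation loss (3.9504) falls at epoch 50, the checkpoint whose metrics are reported at the top of this card.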
Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1
Model tree for contemmcm/cc6530f988294864455773617553a4e2
Base model
google/long-t5-local-large