---
library_name: transformers
license: apache-2.0
base_model: google/long-t5-tglobal-xl
tags:
- generated_from_trainer
metrics:
- bleu
model-index:
- name: 9b119aadb047a403da687c37c5c13ece
  results: []
---


# 9b119aadb047a403da687c37c5c13ece

This model is a fine-tuned version of [google/long-t5-tglobal-xl](https://huggingface.co/google/long-t5-tglobal-xl) on the Helsinki-NLP/opus_books [es-fi] dataset.
It achieves the following results on the evaluation set (values from the last logged epoch, epoch 20 at step 1660):
- Loss: 2.1560
- Data Size: 1.0
- Epoch Runtime: 54.9502
- Bleu: 0.9746
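
A minimal inference sketch, assuming the checkpoint is published on the Hugging Face Hub. The repository id is a placeholder, because this card does not state the namespace. The Spanish-to-Finnish direction, the absence of a task prefix, and the generation settings are also assumptions rather than documented behavior of this checkpoint.

```python
# Hypothetical repo id: the namespace is not stated in this card.
MODEL_ID = "your-namespace/9b119aadb047a403da687c37c5c13ece"


def translate(text: str, model_id: str = MODEL_ID, max_new_tokens: int = 128) -> str:
    """Translate one sentence (assumed direction: Spanish in, Finnish out)."""
    # Imported lazily so that defining this helper does not require
    # transformers/torch to be installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Tokenize the source sentence and generate with beam search
    # (beam size 4 is an illustrative choice, not a documented setting).
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `translate("El libro está sobre la mesa.")` downloads the weights on first use. The base model is an XL-sized LongT5, so loading it needs a correspondingly large amount of memory.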

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

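The model was fine-tuned and evaluated on the `es-fi` (Spanish-Finnish) configuration of Helsinki-NLP/opus_books. Split sizes and preprocessing are not recorded in this card.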

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: constant
- num_epochs: 50 (the results table below ends at epoch 20)
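
The batch-size entries in the list above are related by simple arithmetic: per-device batch size times the number of devices. The sketch below checks that relation and maps the listed values onto keyword names used by `transformers.Seq2SeqTrainingArguments`. The mapping is an assumption about how the run was configured, since the training script is not part of this card, and a gradient accumulation of 1 is assumed because none is listed.

```python
# Values copied from the hyperparameter list in this card.
per_device_train_batch_size = 8
per_device_eval_batch_size = 8
num_devices = 4
gradient_accumulation_steps = 1  # assumption: not listed in the card

# Effective batch sizes across all devices.
total_train_batch_size = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)
total_eval_batch_size = per_device_eval_batch_size * num_devices

# Keyword names follow transformers.Seq2SeqTrainingArguments; whether the
# original run used exactly this class is an assumption.
training_kwargs = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": per_device_train_batch_size,
    "per_device_eval_batch_size": per_device_eval_batch_size,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 50,
}

print(total_train_batch_size, total_eval_batch_size)  # 32 32
```

With `transformers` installed, the dictionary could be passed as `Seq2SeqTrainingArguments(output_dir="out", **training_kwargs)`, where `out` is a placeholder directory.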

### Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu   |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:-------------:|:------:|
| No log        | 0.0   | 0    | 3.7815          | 0         | 4.0760        | 0.0666 |
| No log        | 1.0   | 83   | 3.3776          | 0.0078    | 4.6796        | 0.0690 |
| No log        | 2.0   | 166  | 3.1356          | 0.0156    | 10.4660       | 0.1529 |
| No log        | 3.0   | 249  | 3.0205          | 0.0312    | 16.8049       | 0.1821 |
| 0.1101        | 4.0   | 332  | 2.8977          | 0.0625    | 21.9118       | 0.1338 |
| 0.1101        | 5.0   | 415  | 2.7921          | 0.125     | 24.6784       | 0.1492 |
| 0.1101        | 6.0   | 498  | 2.6768          | 0.25      | 28.2243       | 0.1895 |
| 0.5116        | 7.0   | 581  | 2.5291          | 0.5       | 41.9877       | 0.3425 |
| 2.7389        | 8.0   | 664  | 2.3740          | 1.0       | 63.2199       | 0.6335 |
| 2.5966        | 9.0   | 747  | 2.2830          | 1.0       | 59.8315       | 0.6210 |
| 2.3991        | 10.0  | 830  | 2.2124          | 1.0       | 55.5377       | 0.7248 |
| 2.2481        | 11.0  | 913  | 2.1609          | 1.0       | 58.3829       | 0.7245 |
| 2.18          | 12.0  | 996  | 2.1386          | 1.0       | 56.0157       | 0.7657 |
| 2.0321        | 13.0  | 1079 | 2.1092          | 1.0       | 58.9364       | 0.7355 |
| 1.9481        | 14.0  | 1162 | 2.0960          | 1.0       | 61.8901       | 0.8235 |
| 1.8677        | 15.0  | 1245 | 2.0781          | 1.0       | 56.5170       | 0.8296 |
| 1.7545        | 16.0  | 1328 | 2.0780          | 1.0       | 55.1804       | 0.8630 |
| 1.6672        | 17.0  | 1411 | 2.1007          | 1.0       | 57.5731       | 0.9432 |
| 1.602         | 18.0  | 1494 | 2.0955          | 1.0       | 54.8193       | 0.8842 |
| 1.4741        | 19.0  | 1577 | 2.1109          | 1.0       | 57.8128       | 0.9085 |
| 1.4093        | 20.0  | 1660 | 2.1560          | 1.0       | 54.9502       | 0.9746 |
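
The table above can also be read programmatically. The sketch below copies the Epoch, Validation Loss, Data Size, and Bleu columns and reports which epoch has the lowest validation loss, which has the highest BLEU, and whether the Data Size column doubles each epoch until it reaches 1.0. The doubling pattern is an observation about the logged numbers; the mechanism behind it is not documented in this card.

```python
# (epoch, validation_loss, data_size, bleu), copied from the results table.
rows = [
    (0, 3.7815, 0.0, 0.0666),
    (1, 3.3776, 0.0078, 0.0690),
    (2, 3.1356, 0.0156, 0.1529),
    (3, 3.0205, 0.0312, 0.1821),
    (4, 2.8977, 0.0625, 0.1338),
    (5, 2.7921, 0.125, 0.1492),
    (6, 2.6768, 0.25, 0.1895),
    (7, 2.5291, 0.5, 0.3425),
    (8, 2.3740, 1.0, 0.6335),
    (9, 2.2830, 1.0, 0.6210),
    (10, 2.2124, 1.0, 0.7248),
    (11, 2.1609, 1.0, 0.7245),
    (12, 2.1386, 1.0, 0.7657),
    (13, 2.1092, 1.0, 0.7355),
    (14, 2.0960, 1.0, 0.8235),
    (15, 2.0781, 1.0, 0.8296),
    (16, 2.0780, 1.0, 0.8630),
    (17, 2.1007, 1.0, 0.9432),
    (18, 2.0955, 1.0, 0.8842),
    (19, 2.1109, 1.0, 0.9085),
    (20, 2.1560, 1.0, 0.9746),
]

# Epoch with the lowest validation loss and epoch with the highest BLEU.
best_loss_epoch = min(rows, key=lambda r: r[1])[0]
best_bleu_epoch = max(rows, key=lambda r: r[3])[0]

# Check that Data Size for epochs 1..8 matches 2**(epoch - 8), i.e. it
# doubles every epoch, up to the 4-decimal rounding used in the table.
doubles_until_full = all(
    abs(size - 2.0 ** (epoch - 8)) < 1e-4
    for epoch, _, size, _ in rows
    if 1 <= epoch <= 8
)

print(best_loss_epoch, best_bleu_epoch, doubles_until_full)  # 16 20 True
```

Validation loss bottoms out at epoch 16 while BLEU is highest at epoch 20, so which checkpoint to prefer depends on the metric of interest.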


### Framework versions

- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1