eac34b484a22a3628c2ccdb617917d69

This model is a fine-tuned version of google/long-t5-local-large on the Helsinki-NLP/opus_books [es-ru] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.5964
  • Data Size: 1.0
  • Epoch Runtime: 185.4283
  • Bleu: 1.3104
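
A minimal inference sketch, assuming the checkpoint is loaded by its Hub repo id (contemmcm/eac34b484a22a3628c2ccdb617917d69) with the standard transformers seq2seq API; the card does not document a task prefix or generation settings, so the values below are illustrative:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Repo id as listed on the Hub; substitute a local path if loading offline.
model_id = "contemmcm/eac34b484a22a3628c2ccdb617917d69"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# LongT5 is a T5-style encoder-decoder; a plain Spanish source sentence is a
# reasonable default input, since the card does not document a task prefix.
inputs = tokenizer("La casa estaba en silencio.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```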

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
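
The card only names the dataset; below is a minimal sketch of loading the Helsinki-NLP/opus_books es-ru pair with the datasets library (the author's actual preprocessing and splits are not documented):

```python
from datasets import load_dataset

# Dataset and language pair taken from the card; preprocessing is unspecified.
dataset = load_dataset("Helsinki-NLP/opus_books", "es-ru")
print(dataset["train"][0]["translation"])  # {'es': '...', 'ru': '...'}
```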

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
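
A hedged reconstruction of how these values could be expressed as Seq2SeqTrainingArguments; the actual training script is not part of this card, and the output_dir name below is a placeholder:

```python
from transformers import Seq2SeqTrainingArguments

# Per-device batch size 8 on 4 GPUs gives the reported total batch size of 32
# (launching across devices, e.g. with torchrun or accelerate, is assumed).
training_args = Seq2SeqTrainingArguments(
    output_dir="long-t5-local-large-opus-books-es-ru",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
)
```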

Training results

| Training Loss | Epoch | Step  | Validation Loss | Data Size | Epoch Runtime | Bleu   |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:-------------:|:------:|
| No log        | 0     | 0     | 201.7415        | 0         | 13.7183       | 0.0021 |
| No log        | 1     | 419   | 150.6333        | 0.0078    | 16.5532       | 0.0017 |
| No log        | 2     | 838   | 99.6125         | 0.0156    | 17.9770       | 0.0029 |
| 3.5686        | 3     | 1257  | 40.6733         | 0.0312    | 21.3690       | 0.0034 |
| 3.5686        | 4     | 1676  | 15.8695         | 0.0625    | 27.4520       | 0.0037 |
| 2.029         | 5     | 2095  | 8.8537          | 0.125     | 39.4947       | 0.0151 |
| 1.8451        | 6     | 2514  | 7.1642          | 0.25      | 60.9853       | 0.0610 |
| 10.1166       | 7     | 2933  | 5.6426          | 0.5       | 102.4794      | 0.0545 |
| 7.1548        | 8.0   | 3352  | 3.9471          | 1.0       | 187.1191      | 0.0995 |
| 5.8098        | 9.0   | 3771  | 3.2701          | 1.0       | 185.9093      | 0.1842 |
| 5.0287        | 10.0  | 4190  | 3.0513          | 1.0       | 185.4253      | 0.2690 |
| 4.305         | 11.0  | 4609  | 2.6723          | 1.0       | 184.4337      | 0.3414 |
| 3.8963        | 12.0  | 5028  | 2.6301          | 1.0       | 185.9095      | 0.3794 |
| 3.5101        | 13.0  | 5447  | 2.3203          | 1.0       | 185.7310      | 0.3516 |
| 3.294         | 14.0  | 5866  | 2.2275          | 1.0       | 186.2012      | 0.3561 |
| 3.056         | 15.0  | 6285  | 2.1389          | 1.0       | 186.7088      | 0.5289 |
| 2.8816        | 16.0  | 6704  | 2.0732          | 1.0       | 186.1090      | 0.5626 |
| 2.7164        | 17.0  | 7123  | 2.0461          | 1.0       | 186.8601      | 0.5161 |
| 2.6059        | 18.0  | 7542  | 1.9743          | 1.0       | 186.4966      | 0.5527 |
| 2.4976        | 19.0  | 7961  | 1.9548          | 1.0       | 184.8019      | 0.3953 |
| 2.4114        | 20.0  | 8380  | 1.9271          | 1.0       | 187.4944      | 0.4917 |
| 2.3611        | 21.0  | 8799  | 1.8831          | 1.0       | 186.7338      | 0.5610 |
| 2.3081        | 22.0  | 9218  | 1.8734          | 1.0       | 186.6796      | 0.5303 |
| 2.2358        | 23.0  | 9637  | 1.8519          | 1.0       | 187.8298      | 0.5170 |
| 2.1536        | 24.0  | 10056 | 1.8180          | 1.0       | 187.0982      | 0.4831 |
| 2.1465        | 25.0  | 10475 | 1.8126          | 1.0       | 185.7881      | 0.6376 |
| 2.1027        | 26.0  | 10894 | 1.7823          | 1.0       | 186.0421      | 0.7252 |
| 2.0634        | 27.0  | 11313 | 1.7695          | 1.0       | 186.5257      | 0.6960 |
| 2.0437        | 28.0  | 11732 | 1.7442          | 1.0       | 184.7985      | 0.7366 |
| 2.0119        | 29.0  | 12151 | 1.7512          | 1.0       | 184.7184      | 0.6206 |
| 1.9818        | 30.0  | 12570 | 1.7254          | 1.0       | 186.1011      | 0.7023 |
| 1.952         | 31.0  | 12989 | 1.7143          | 1.0       | 186.7252      | 0.7113 |
| 1.9279        | 32.0  | 13408 | 1.7260          | 1.0       | 187.3581      | 0.7404 |
| 1.9076        | 33.0  | 13827 | 1.7100          | 1.0       | 187.1126      | 0.8001 |
| 1.882         | 34.0  | 14246 | 1.7079          | 1.0       | 187.0736      | 0.9485 |
| 1.8713        | 35.0  | 14665 | 1.6857          | 1.0       | 185.7581      | 0.8636 |
| 1.8552        | 36.0  | 15084 | 1.6698          | 1.0       | 186.2798      | 0.9409 |
| 1.811         | 37.0  | 15503 | 1.6657          | 1.0       | 184.8279      | 0.9436 |
| 1.8074        | 38.0  | 15922 | 1.6667          | 1.0       | 185.9223      | 0.9371 |
| 1.7854        | 39.0  | 16341 | 1.6506          | 1.0       | 187.9232      | 1.0613 |
| 1.7659        | 40.0  | 16760 | 1.6429          | 1.0       | 186.2874      | 1.0016 |
| 1.7587        | 41.0  | 17179 | 1.6280          | 1.0       | 185.0036      | 0.9775 |
| 1.748         | 42.0  | 17598 | 1.6436          | 1.0       | 185.1594      | 1.1243 |
| 1.7297        | 43.0  | 18017 | 1.6248          | 1.0       | 185.3714      | 1.0467 |
| 1.7089        | 44.0  | 18436 | 1.6211          | 1.0       | 186.5728      | 1.1851 |
| 1.6861        | 45.0  | 18855 | 1.6184          | 1.0       | 184.8577      | 1.1488 |
| 1.6765        | 46.0  | 19274 | 1.6148          | 1.0       | 185.9029      | 1.0135 |
| 1.6652        | 47.0  | 19693 | 1.6117          | 1.0       | 187.1650      | 1.1559 |
| 1.6399        | 48.0  | 20112 | 1.5886          | 1.0       | 187.4137      | 1.1896 |
| 1.6529        | 49.0  | 20531 | 1.5901          | 1.0       | 185.5260      | 1.1178 |
| 1.6186        | 50.0  | 20950 | 1.5964          | 1.0       | 185.4283      | 1.3104 |
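
The Bleu column can in principle be reproduced with the evaluate library's sacrebleu metric; the exact metric configuration used for this card is not documented, so the snippet below is only a sketch with placeholder sentences:

```python
import evaluate

# Sketch of a BLEU computation (sacrebleu backend); the sentences here are
# placeholders, not actual model outputs from this training run.
bleu = evaluate.load("sacrebleu")
predictions = ["..."]        # decoded model outputs (Russian)
references = [["..."]]       # one list of reference translations per prediction
result = bleu.compute(predictions=predictions, references=references)
print(result["score"])
```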

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1