970d213fd939ababbf9e70ddf0dabd99

This model is a fine-tuned version of google/long-t5-local-large on the es-no (Spanish-Norwegian) configuration of the Helsinki-NLP/opus_books dataset. After the final training epoch (epoch 50), it achieves the following results on the evaluation set:

Loss: 2.8375
Data Size: 1.0
Epoch Runtime: 44.8970
Bleu: 0.2456

Model description

More information needed

Intended uses & limitations

More information needed
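
The card does not document an inference recipe, so the following is only a minimal sketch of loading the checkpoint with the Transformers Auto classes. The repository id is the one shown on this page. The translation direction (Spanish to Norwegian), the absence of a task prefix, and the generation settings are assumptions rather than documented facts, and the helper name `translate` is hypothetical.

```python
MODEL_ID = "contemmcm/970d213fd939ababbf9e70ddf0dabd99"


def translate(texts, model_id=MODEL_ID, max_new_tokens=128):
    """Translate a list of strings with the fine-tuned checkpoint (sketch)."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Assumption: the model expects raw source text with no task prefix.
    # The input format used during fine-tuning is not stated in this card.
    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)


if __name__ == "__main__":
    # Downloads the weights (about 0.8B parameters in F32) on first use.
    print(translate(["El libro está sobre la mesa."]))
```

Given the evaluation scores reported in this card, output quality should be checked on your own data before relying on it.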

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 4
total_train_batch_size: 32
total_eval_batch_size: 32
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: constant
num_epochs: 50
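
As a quick consistency check on the values above, the total batch sizes follow from the per-device batch size and the device count. The sketch below assumes one gradient accumulation step, since no accumulation setting is listed, and takes the 89 steps per epoch from the Step column of the training results.

```python
# Values copied from the hyperparameter list in this card.
per_device_train_batch_size = 8
num_devices = 4
num_epochs = 50

# Assumption: no gradient accumulation, since none is listed in the card.
gradient_accumulation_steps = 1

# Taken from the Step column of the training results (89 steps per epoch).
steps_per_epoch = 89

total_train_batch_size = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)
total_optimizer_steps = steps_per_epoch * num_epochs

print(f"total_train_batch_size = {total_train_batch_size}")
print(f"total_optimizer_steps  = {total_optimizer_steps}")
```

Both results agree with the card: a total train batch size of 32 and a final step count of 4450.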

Training results

Training Loss	Epoch	Step	Validation Loss	Data Size	Epoch Runtime	Bleu
No log	0	0	221.3844	0	3.6443	0.0043
No log	1	89	197.2671	0.0078	4.7221	0.0040
No log	2	178	168.6531	0.0156	6.0983	0.0041
No log	3	267	144.2118	0.0312	7.7860	0.0049
No log	4	356	105.7805	0.0625	11.2901	0.0038
No log	5	445	53.3554	0.125	13.4013	0.0040
8.0695	6	534	22.2945	0.25	17.7091	0.0047
13.7423	7	623	13.4378	0.5	26.3652	0.0147
17.3418	8.0	712	9.8752	1.0	44.4832	0.0184
13.432	9.0	801	8.4544	1.0	43.5700	0.0224
12.4192	10.0	890	8.5307	1.0	42.9915	0.0163
11.1762	11.0	979	7.0403	1.0	43.5339	0.0305
10.0983	12.0	1068	6.7428	1.0	43.5426	0.0467
9.197	13.0	1157	6.3366	1.0	43.5504	0.0322
8.8429	14.0	1246	5.7970	1.0	43.4010	0.0517
8.2187	15.0	1335	5.7831	1.0	43.4180	0.0518
7.7566	16.0	1424	5.1652	1.0	44.4639	0.0629
7.3222	17.0	1513	4.9510	1.0	43.4638	0.0519
6.9894	18.0	1602	4.8395	1.0	44.4988	0.1446
6.6839	19.0	1691	4.8684	1.0	42.7534	0.1052
6.3975	20.0	1780	4.6020	1.0	43.2935	0.0793
6.0659	21.0	1869	4.3094	1.0	44.0930	0.0938
5.8541	22.0	1958	4.2773	1.0	43.8304	0.1281
5.705	23.0	2047	4.1064	1.0	44.2400	0.1201
5.4843	24.0	2136	3.9477	1.0	44.2527	0.1240
5.2874	25.0	2225	3.9733	1.0	44.9309	0.1097
5.1221	26.0	2314	3.7253	1.0	44.9021	0.1569
5.0033	27.0	2403	3.6918	1.0	45.3199	0.1699
4.8753	28.0	2492	3.7214	1.0	44.7313	0.1219
4.703	29.0	2581	3.5098	1.0	45.1712	0.1493
4.5988	30.0	2670	3.4984	1.0	44.9037	0.1332
4.4454	31.0	2759	3.4042	1.0	44.9299	0.1913
4.3765	32.0	2848	3.3953	1.0	44.7618	0.1557
4.2578	33.0	2937	3.2852	1.0	44.7162	0.2015
4.1686	34.0	3026	3.3326	1.0	45.8108	0.1795
4.054	35.0	3115	3.2380	1.0	44.9969	0.1693
3.9608	36.0	3204	3.2087	1.0	45.1706	0.1048
3.9334	37.0	3293	3.2222	1.0	44.9701	0.1708
3.8444	38.0	3382	3.1124	1.0	45.3180	0.2026
3.7511	39.0	3471	3.1404	1.0	45.2833	0.1787
3.6928	40.0	3560	3.1486	1.0	44.9131	0.1809
3.6322	41.0	3649	3.0144	1.0	46.0521	0.2648
3.5504	42.0	3738	3.0636	1.0	45.6709	0.1582
3.5065	43.0	3827	3.0089	1.0	44.2839	0.2405
3.4138	44.0	3916	2.9631	1.0	45.5604	0.1989
3.3972	45.0	4005	2.9918	1.0	44.8559	0.1767
3.3264	46.0	4094	2.9466	1.0	45.2181	0.2354
3.306	47.0	4183	2.8690	1.0	45.7905	0.2823
3.2279	48.0	4272	2.8803	1.0	44.6883	0.2865
3.1756	49.0	4361	2.8623	1.0	45.3366	0.2921
3.1417	50.0	4450	2.8375	1.0	44.8970	0.2456
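
The headline results at the top of this card come from the final epoch, which has the lowest validation loss but not the highest BLEU. The sketch below, using only the last five rows of the table above, shows how the two selection criteria pick different epochs. The variable names are illustrative.

```python
# (epoch, validation_loss, bleu) for the last five rows of the results table.
rows = [
    (46, 2.9466, 0.2354),
    (47, 2.8690, 0.2823),
    (48, 2.8803, 0.2865),
    (49, 2.8623, 0.2921),
    (50, 2.8375, 0.2456),
]

# Lower validation loss is better; higher BLEU is better.
best_by_loss = min(rows, key=lambda r: r[1])
best_by_bleu = max(rows, key=lambda r: r[2])

print(f"lowest validation loss: epoch {best_by_loss[0]} (loss {best_by_loss[1]})")
print(f"highest BLEU:           epoch {best_by_bleu[0]} (BLEU {best_by_bleu[2]})")
```

If BLEU is the metric that matters for your use, an earlier checkpoint may be preferable to the final one. Whether intermediate checkpoints were saved for this run is not stated in the card.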

Framework versions

Transformers 4.57.0
Pytorch 2.8.0+cu128
Datasets 4.2.0
Tokenizers 0.22.1

Model size: 0.8B params (Safetensors)
Tensor type: F32

Model tree for contemmcm/970d213fd939ababbf9e70ddf0dabd99

Base model: google/long-t5-local-large
Finetuned: this model
