41e459012ec40f55b527b839f2fe4dd3

This model is a fine-tuned version of google-t5/t5-base on the Helsinki-NLP/opus_books dataset. It achieves the following results on the evaluation set:

Loss: 2.2252
Data Size: 1.0
Epoch Runtime: 23.1403
Bleu: 3.0580
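
A minimal loading sketch follows, for readers who want to try the checkpoint. Several things here are assumptions, not facts from this card: the repository id is taken from the hub path of this page, the language pair is not stated anywhere in the card (opus_books covers many pairs), and the `translate English to French: ` prefix is only the usual T5 convention, used here as a hypothetical placeholder. The helper names `build_prompt` and `translate` are illustrative.

```python
# Assumed repository id (taken from the hub path of this page).
MODEL_ID = "contemmcm/41e459012ec40f55b527b839f2fe4dd3"


def build_prompt(text, source_lang="English", target_lang="French"):
    """Build a T5-style task prefix. The language pair is an assumption:
    the card does not say which opus_books pair was used for fine-tuning."""
    return f"translate {source_lang} to {target_lang}: {text}"


def translate(text, max_new_tokens=64):
    """Load the checkpoint and translate one sentence.
    Requires `transformers` and `torch`; downloads the weights on first use."""
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(build_prompt(text), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `translate("The book is on the table.")` would return a single decoded string. Given the reported BLEU of about 3, output quality is likely to be low.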

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 4
total_train_batch_size: 32
total_eval_batch_size: 32
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: constant
num_epochs: 50
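
The hyperparameters above can be sketched as keyword arguments for `transformers.Seq2SeqTrainingArguments`. This is a reconstruction under assumptions, not the author's training script: it assumes the listed batch sizes are per-device values, which is consistent with the reported totals (8 per device on 4 devices gives 32).

```python
# Values copied from the hyperparameter list in this card.
NUM_DEVICES = 4

# Keyword names follow transformers.TrainingArguments; the mapping from the
# card's field names to these keywords is an assumption of this sketch.
training_kwargs = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 50,
}

# Effective batch sizes across all devices (no gradient accumulation listed).
total_train_batch_size = training_kwargs["per_device_train_batch_size"] * NUM_DEVICES
total_eval_batch_size = training_kwargs["per_device_eval_batch_size"] * NUM_DEVICES

print(total_train_batch_size, total_eval_batch_size)
```

With `transformers` installed, the dictionary could be passed as `Seq2SeqTrainingArguments(output_dir="out", **training_kwargs)`, where `output_dir` is a hypothetical path.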

Training results

Training Loss	Epoch	Step	Validation Loss	Data Size	Epoch Runtime	Bleu
No log	0	0	4.1779	0	2.4017	0.3270
No log	1	91	4.1610	0.0078	2.9235	0.3320
No log	2	182	4.0748	0.0156	3.2918	0.3121
No log	3	273	4.0013	0.0312	3.5204	0.3349
No log	4	364	3.8587	0.0625	4.4063	0.3405
No log	5	455	3.6791	0.125	6.4133	0.4952
No log	6	546	3.4888	0.25	9.4701	0.5835
0.3701	7	637	3.2777	0.5	14.2402	0.8754
3.409	8.0	728	3.0421	1.0	25.1413	1.5125
3.168	9.0	819	2.8988	1.0	24.1236	1.5792
3.0239	10.0	910	2.7973	1.0	23.2475	1.7011
2.8881	11.0	1001	2.7269	1.0	22.9601	1.7113
2.8542	12.0	1092	2.6635	1.0	22.4017	1.8997
2.7874	13.0	1183	2.6084	1.0	22.6214	2.0116
2.6943	14.0	1274	2.5658	1.0	23.3256	2.2049
2.618	15.0	1365	2.5271	1.0	22.3740	2.3431
2.5771	16.0	1456	2.5020	1.0	22.0935	2.4061
2.5363	17.0	1547	2.4780	1.0	22.8189	2.4147
2.4803	18.0	1638	2.4419	1.0	23.5236	2.5185
2.4168	19.0	1729	2.4226	1.0	23.6594	2.5646
2.3801	20.0	1820	2.4085	1.0	22.1487	2.6111
2.3458	21.0	1911	2.3754	1.0	23.3162	2.6308
2.2813	22.0	2002	2.3594	1.0	22.4091	2.6212
2.2783	23.0	2093	2.3446	1.0	22.4703	2.7117
2.2301	24.0	2184	2.3268	1.0	22.6361	2.6914
2.205	25.0	2275	2.3158	1.0	23.3835	2.7437
2.166	26.0	2366	2.2981	1.0	23.7972	2.8446
2.1408	27.0	2457	2.2993	1.0	23.0056	2.8545
2.1104	28.0	2548	2.2829	1.0	22.5164	2.8376
2.0603	29.0	2639	2.2716	1.0	22.5489	2.9210
2.0381	30.0	2730	2.2589	1.0	23.0023	2.8289
1.9781	31.0	2821	2.2622	1.0	23.2964	2.8548
2.0004	32.0	2912	2.2412	1.0	24.5081	3.0175
1.9452	33.0	3003	2.2556	1.0	24.4450	2.9590
1.8851	34.0	3094	2.2386	1.0	25.1583	3.0262
1.8973	35.0	3185	2.2518	1.0	23.8390	2.9766
1.8503	36.0	3276	2.2389	1.0	23.7478	3.0425
1.8434	37.0	3367	2.2250	1.0	23.3950	3.1194
1.8245	38.0	3458	2.2336	1.0	24.4400	3.1082
1.7868	39.0	3549	2.2098	1.0	24.7132	3.0051
1.7714	40.0	3640	2.2125	1.0	24.2303	3.0249
1.7089	41.0	3731	2.2155	1.0	24.2245	3.0096
1.7064	42.0	3822	2.2224	1.0	24.9689	2.9906
1.689	43.0	3913	2.2252	1.0	23.1403	3.0580
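
The headline metrics at the top of this card match the last logged row (epoch 43), which is not the best row by either metric. The sketch below copies the rows from epoch 36 onward and picks the best ones. All earlier rows in the table have higher validation loss and lower BLEU, so the result holds for the full table.

```python
# (epoch, validation_loss, bleu) copied from the final rows of the table above.
ROWS = [
    (36, 2.2389, 3.0425),
    (37, 2.2250, 3.1194),
    (38, 2.2336, 3.1082),
    (39, 2.2098, 3.0051),
    (40, 2.2125, 3.0249),
    (41, 2.2155, 3.0096),
    (42, 2.2224, 2.9906),
    (43, 2.2252, 3.0580),
]

# Highest BLEU and lowest validation loss fall on different epochs.
best_bleu_row = max(ROWS, key=lambda row: row[2])
best_loss_row = min(ROWS, key=lambda row: row[1])
final_row = ROWS[-1]

print("best BLEU at epoch", best_bleu_row[0])
print("lowest validation loss at epoch", best_loss_row[0])
print("final logged epoch", final_row[0])
```

Which epoch's weights were actually saved to the repository is not stated in the card.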

Framework versions

Transformers 4.57.0
Pytorch 2.8.0+cu128
Datasets 4.2.0
Tokenizers 0.22.1

Safetensors

Model size: 0.3B params
Tensor type: F32
