fbcc96e0819c6f1e6436a26174b44c4b

This model is a fine-tuned version of google/long-t5-local-large on the Helsinki-NLP/opus_books [en-fr] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.4348
  • Data Size: 1.0
  • Epoch Runtime: 1362.9318
  • BLEU: 12.8020
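
For reference, a minimal inference sketch is shown below. It assumes the checkpoint is published under this repository id and loads it with the standard Transformers seq2seq classes. The "translate English to French:" task prefix is an assumption borrowed from the original T5 setup; the card does not document how inputs were formatted during training.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Repo id taken from this card; adjust if the checkpoint lives elsewhere.
model_id = "contemmcm/fbcc96e0819c6f1e6436a26174b44c4b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# The task prefix is an assumption; the card does not specify the input format.
text = "translate English to French: The cat sat on the mat."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```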

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
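
Although details are missing here, the summary above names Helsinki-NLP/opus_books (en-fr) as the dataset. A minimal sketch of loading it with the datasets library follows; note that opus_books ships only a train split, so the evaluation set used for the results below must have been carved out by the training script. The 90/10 split shown is an assumption, not the documented procedure.

```python
from datasets import load_dataset

# en-fr configuration of the dataset named in the card.
raw = load_dataset("Helsinki-NLP/opus_books", "en-fr")

# Each example holds a translation pair: {"en": ..., "fr": ...}.
print(raw["train"][0]["translation"])

# The card does not document the eval split; a 90/10 split is one plausible choice.
splits = raw["train"].train_test_split(test_size=0.1, seed=42)
```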

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: constant
  • num_epochs: 50
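
As a rough guide, the sketch below maps these values onto Transformers' Seq2SeqTrainingArguments; the per-device batch size of 8 across 4 GPUs reproduces the total batch size of 32 listed above. The output_dir name is hypothetical, and the actual training script is not included in this card.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="long-t5-local-large-opus-books-en-fr",  # hypothetical name
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # x 4 GPUs = total train batch size 32
    per_device_eval_batch_size=8,    # x 4 GPUs = total eval batch size 32
    seed=42,
    lr_scheduler_type="constant",
    num_train_epochs=50,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    predict_with_generate=True,      # required to compute BLEU at eval time
)
```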

Training results

| Training Loss | Epoch | Step   | Validation Loss | Data Size | Epoch Runtime | BLEU    |
|:-------------:|:-----:|:------:|:---------------:|:---------:|:-------------:|:-------:|
| No log        | 0     | 0      | 224.5627        | 0         | 94.6316       | 0.0156  |
| No log        | 1     | 3177   | 67.4081         | 0.0078    | 106.8302      | 0.0026  |
| 1.6781        | 2     | 6354   | 21.4840         | 0.0156    | 118.5835      | 0.0531  |
| 25.177        | 3     | 9531   | 15.2979         | 0.0312    | 138.9119      | 0.0195  |
| 18.3362       | 4     | 12708  | 11.5978         | 0.0625    | 177.0333      | 0.1558  |
| 12.8368       | 5     | 15885  | 8.4458          | 0.125     | 259.0782      | 0.3659  |
| 9.0567        | 6     | 19062  | 6.2391          | 0.25      | 415.7677      | 0.2387  |
| 6.0024        | 7     | 22239  | 4.4481          | 0.5       | 728.3800      | 0.2294  |
| 4.4559        | 8     | 25416  | 3.6644          | 1.0       | 1357.1125     | 0.5695  |
| 3.9099        | 9     | 28593  | 3.3518          | 1.0       | 1357.7848     | 0.7819  |
| 3.6106        | 10    | 31770  | 3.1036          | 1.0       | 1358.3411     | 1.2742  |
| 3.2855        | 11    | 34947  | 2.8243          | 1.0       | 1353.2540     | 2.0298  |
| 2.8816        | 12    | 38124  | 2.4093          | 1.0       | 1370.7467     | 4.9266  |
| 2.5902        | 13    | 41301  | 2.1679          | 1.0       | 1374.0909     | 6.5066  |
| 2.3623        | 14    | 44478  | 2.0081          | 1.0       | 1352.3307     | 7.1883  |
| 2.1873        | 15    | 47655  | 1.8939          | 1.0       | 1360.8891     | 8.0905  |
| 2.0319        | 16    | 50832  | 1.8014          | 1.0       | 1357.6523     | 9.1352  |
| 1.9414        | 17    | 54009  | 1.7226          | 1.0       | 1353.1928     | 9.4290  |
| 1.8693        | 18    | 57186  | 1.6697          | 1.0       | 1360.2523     | 10.1653 |
| 1.7658        | 19    | 60363  | 1.6275          | 1.0       | 1366.9805     | 10.5411 |
| 1.6635        | 20    | 63540  | 1.5972          | 1.0       | 1350.2265     | 10.3939 |
| 1.6254        | 21    | 66717  | 1.5613          | 1.0       | 1367.8792     | 10.9163 |
| 1.5495        | 22    | 69894  | 1.5264          | 1.0       | 1366.2311     | 11.1095 |
| 1.5219        | 23    | 73071  | 1.5139          | 1.0       | 1357.9950     | 11.4736 |
| 1.4926        | 24    | 76248  | 1.4879          | 1.0       | 1368.5897     | 11.4122 |
| 1.4216        | 25    | 79425  | 1.4841          | 1.0       | 1375.2616     | 11.9370 |
| 1.3864        | 26    | 82602  | 1.4552          | 1.0       | 1388.4205     | 12.1862 |
| 1.3473        | 27    | 85779  | 1.4498          | 1.0       | 1385.9928     | 11.9313 |
| 1.3051        | 28    | 88956  | 1.4359          | 1.0       | 1397.6203     | 11.8834 |
| 1.267         | 29    | 92133  | 1.4410          | 1.0       | 1402.0469     | 12.2084 |
| 1.2332        | 30    | 95310  | 1.4445          | 1.0       | 1388.6106     | 12.1456 |
| 1.1924        | 31    | 98487  | 1.4315          | 1.0       | 1398.1549     | 12.3652 |
| 1.152         | 32    | 101664 | 1.4226          | 1.0       | 1402.4454     | 12.3847 |
| 1.1319        | 33    | 104841 | 1.4414          | 1.0       | 1362.6028     | 12.4711 |
| 1.136         | 34    | 108018 | 1.4304          | 1.0       | 1367.1234     | 12.6131 |
| 1.0596        | 35    | 111195 | 1.4210          | 1.0       | 1382.0640     | 12.4171 |
| 1.0342        | 36    | 114372 | 1.4394          | 1.0       | 1369.3498     | 12.5422 |
| 1.0277        | 37    | 117549 | 1.4366          | 1.0       | 1370.2751     | 12.5769 |
| 0.9857        | 38    | 120726 | 1.4444          | 1.0       | 1378.6863     | 12.5033 |
| 0.9636        | 39    | 123903 | 1.4348          | 1.0       | 1362.9318     | 12.8020 |
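
The BLEU column is the corpus-level score on the evaluation set at the end of each epoch. The card does not name the implementation used; a minimal sketch with the evaluate library's sacrebleu metric, a common choice in Transformers translation examples, would look like this:

```python
import evaluate

# Assumption: sacrebleu via the evaluate library; the card does not say
# which BLEU implementation produced the scores above.
bleu = evaluate.load("sacrebleu")

predictions = ["Le chat est assis sur le tapis."]
references = [["Le chat s'est assis sur le tapis."]]  # one list of references per prediction
print(bleu.compute(predictions=predictions, references=references)["score"])
```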

Framework versions

  • Transformers 4.57.0
  • PyTorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1