3bfb80dc6b557b7ad17dc411524e6618

This model is a fine-tuned version of google/long-t5-local-large on the Helsinki-NLP/opus_books [fr-no] dataset. It achieves the following results on the evaluation set:

Loss: 3.0881
Data Size: 1.0
Epoch Runtime: 42.1318
Bleu: 0.3133

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 4
total_train_batch_size: 32
total_eval_batch_size: 32
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: constant
num_epochs: 50
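
The batch-size entries above are linked by simple arithmetic: the total batch size is the per-device batch size multiplied by the number of devices. The sketch below restates the listed values as a plain dictionary and checks that relationship. The key names follow the `TrainingArguments` parameters in Transformers, but that mapping is an assumption, since the card does not include the training script. The `grad_accum_steps` argument is hypothetical; the card lists no gradient accumulation, so it defaults to 1.

```python
# Hyperparameters copied from the list above. The key names mirror the
# Transformers TrainingArguments parameters; this mapping is an assumption,
# because the original training script is not part of the card.
TRAINING_KWARGS = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 50,
}

NUM_DEVICES = 4  # num_devices in the card (multi-GPU run)


def total_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Return the number of examples consumed per optimizer step.

    grad_accum_steps is a hypothetical parameter: the card does not list
    gradient accumulation, so 1 is assumed.
    """
    return per_device * num_devices * grad_accum_steps


if __name__ == "__main__":
    train_total = total_batch_size(TRAINING_KWARGS["per_device_train_batch_size"], NUM_DEVICES)
    eval_total = total_batch_size(TRAINING_KWARGS["per_device_eval_batch_size"], NUM_DEVICES)
    print(f"total_train_batch_size: {train_total}")
    print(f"total_eval_batch_size: {eval_total}")
```

Both computed totals agree with the total_train_batch_size and total_eval_batch_size values of 32 reported in the list.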

Training results

Training Loss	Epoch	Step	Validation Loss	Data Size	Epoch Runtime	Bleu
No log	0	0	218.7346	0	3.4585	0.0045
No log	1	86	196.7582	0.0078	4.0391	0.0046
No log	2	172	166.4535	0.0156	5.3531	0.0047
No log	3	258	143.6958	0.0312	6.9691	0.0045
No log	4	344	111.5982	0.0625	9.0460	0.0038
7.9578	5	430	59.1877	0.125	12.0779	0.0042
16.6671	6	516	24.0459	0.25	16.9792	0.0074
10.9092	7	602	13.2290	0.5	25.8824	0.0208
11.4871	8.0	688	9.5083	1.0	43.4705	0.0270
14.4666	9.0	774	8.3197	1.0	42.8579	0.0173
12.1943	10.0	860	7.6958	1.0	42.5365	0.0183
11.4242	11.0	946	7.3058	1.0	42.1732	0.0203
10.1412	12.0	1032	6.6282	1.0	42.0994	0.0222
9.3585	13.0	1118	6.1211	1.0	42.2559	0.0461
8.7241	14.0	1204	6.1753	1.0	42.0939	0.0405
8.4273	15.0	1290	5.6167	1.0	41.5309	0.0672
7.943	16.0	1376	5.1733	1.0	42.5236	0.0932
7.5635	17.0	1462	5.2217	1.0	42.4462	0.0651
7.285	18.0	1548	4.8901	1.0	42.0186	0.0864
6.952	19.0	1634	4.8715	1.0	42.3183	0.1158
6.6396	20.0	1720	4.5846	1.0	43.0245	0.1185
6.3698	21.0	1806	4.4181	1.0	42.5789	0.1607
6.2024	22.0	1892	4.1829	1.0	43.1708	0.1274
5.9276	23.0	1978	4.2267	1.0	42.3384	0.1222
5.718	24.0	2064	4.0444	1.0	41.9655	0.1668
5.4917	25.0	2150	4.0988	1.0	42.1946	0.1844
5.3839	26.0	2236	3.9368	1.0	42.0030	0.1710
5.1638	27.0	2322	3.7155	1.0	42.3713	0.2712
5.0037	28.0	2408	3.6599	1.0	43.0146	0.1863
4.914	29.0	2494	3.7094	1.0	42.0320	0.1741
4.7427	30.0	2580	3.6827	1.0	42.2522	0.2002
4.6182	31.0	2666	3.4507	1.0	42.5265	0.3372
4.4541	32.0	2752	3.3562	1.0	41.7082	0.2701
4.4253	33.0	2838	3.4172	1.0	41.7462	0.2276
4.2938	34.0	2924	3.2388	1.0	41.7162	0.2991
4.2202	35.0	3010	3.3083	1.0	42.7872	0.2769
4.1193	36.0	3096	3.1712	1.0	42.8169	0.3446
4.0442	37.0	3182	3.1268	1.0	42.2866	0.3418
3.9479	38.0	3268	3.2402	1.0	42.3912	0.2641
3.8618	39.0	3354	3.0643	1.0	42.8963	0.4205
3.8026	40.0	3440	3.0979	1.0	42.4685	0.3313
3.7391	41.0	3526	3.0904	1.0	43.5967	0.2817
3.6892	42.0	3612	3.0695	1.0	43.1917	0.3698
3.6092	43.0	3698	3.0881	1.0	42.1318	0.3133
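
The results reported at the top of the card match the last logged row (epoch 43), which is not the row with the lowest validation loss or the highest Bleu. As a minimal sketch of how one might compare checkpoints, the snippet below copies the last five rows of the table and picks the best row under each criterion. The `EvalRow` type and helper names are illustrative only, and the card does not say which checkpoint was actually saved.

```python
from typing import NamedTuple, Sequence


class EvalRow(NamedTuple):
    """One row of the training results table (only the columns used here)."""
    epoch: int
    step: int
    validation_loss: float
    bleu: float


# Last five rows of the table above (epochs 39 to 43), values copied as logged.
ROWS = [
    EvalRow(39, 3354, 3.0643, 0.4205),
    EvalRow(40, 3440, 3.0979, 0.3313),
    EvalRow(41, 3526, 3.0904, 0.2817),
    EvalRow(42, 3612, 3.0695, 0.3698),
    EvalRow(43, 3698, 3.0881, 0.3133),
]


def best_by_loss(rows: Sequence[EvalRow]) -> EvalRow:
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r.validation_loss)


def best_by_bleu(rows: Sequence[EvalRow]) -> EvalRow:
    """Return the row with the highest Bleu score."""
    return max(rows, key=lambda r: r.bleu)


if __name__ == "__main__":
    final = ROWS[-1]
    print(f"final row:   epoch {final.epoch}, loss {final.validation_loss}, Bleu {final.bleu}")
    by_loss = best_by_loss(ROWS)
    print(f"lowest loss: epoch {by_loss.epoch}, loss {by_loss.validation_loss}")
    by_bleu = best_by_bleu(ROWS)
    print(f"highest Bleu: epoch {by_bleu.epoch}, Bleu {by_bleu.bleu}")
```

On these rows both criteria select epoch 39 (validation loss 3.0643, Bleu 0.4205), which is also the best row in the full table on either metric.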

Framework versions

Transformers 4.57.0
Pytorch 2.8.0+cu128
Datasets 4.2.0
Tokenizers 0.22.1

Model size: 0.8B params (Safetensors)
Tensor type: F32

Model: contemmcm/3bfb80dc6b557b7ad17dc411524e6618
Base model: google/long-t5-local-large
