0ddb16dd8694cc9f14ef15cd3c9b0f99

This model is a fine-tuned version of google/long-t5-local-large on the Helsinki-NLP/opus_books [es-fr] dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6883
  • Data Size: 1.0
  • Epoch Runtime: 606.7991
  • Bleu: 9.8666
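
Since the card gives no usage snippet, below is a minimal inference sketch with the transformers library, loading this repository (contemmcm/0ddb16dd8694cc9f14ef15cd3c9b0f99) for Spanish-to-French translation. The task prefix and generation settings are assumptions, since the preprocessing used during fine-tuning is not documented here.

```python
# Minimal usage sketch. Assumption: the "translate Spanish to French:" prefix
# and the generation settings are illustrative; the card does not document
# how inputs were formatted during fine-tuning.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "contemmcm/0ddb16dd8694cc9f14ef15cd3c9b0f99"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "translate Spanish to French: La vida es un sueño."  # assumed prefix
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```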

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
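
As a rough translation of the list above into code, the following is a sketch of the corresponding Seq2SeqTrainingArguments. It assumes the run used the standard transformers Seq2SeqTrainer, which the card does not state; the output directory name is hypothetical.

```python
# Sketch of the training configuration implied by the hyperparameter list.
# Assumption: a standard Seq2SeqTrainer run; only values shown in the card
# are reproduced here, everything else is left at library defaults.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="long-t5-local-large-opus-books-es-fr",  # hypothetical name
    learning_rate=5e-5,
    per_device_train_batch_size=8,   # 8 per device x 4 GPUs = 32 total
    per_device_eval_batch_size=8,    # 8 per device x 4 GPUs = 32 total
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    num_train_epochs=50,
)
```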

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|:---|:---|:---|:---|:---|:---|:---|
| No log | 0 | 0 | 211.2462 | 0 | 43.1118 | 0.0094 |
| No log | 1 | 1407 | 116.7262 | 0.0078 | 48.9993 | 0.0061 |
| No log | 2 | 2814 | 43.9352 | 0.0156 | 53.4298 | 0.0089 |
| 1.9756 | 3 | 4221 | 20.3504 | 0.0312 | 62.9143 | 0.0526 |
| 25.9311 | 4 | 5628 | 14.8927 | 0.0625 | 81.5991 | 0.0315 |
| 18.2335 | 5 | 7035 | 12.3942 | 0.125 | 114.9738 | 0.0366 |
| 13.6658 | 6 | 8442 | 8.6721 | 0.25 | 186.0882 | 0.2032 |
| 9.2752 | 7 | 9849 | 6.5468 | 0.5 | 322.8414 | 0.1898 |
| 6.0579 | 8.0 | 11256 | 4.5514 | 1.0 | 601.7147 | 0.4214 |
| 4.963 | 9.0 | 12663 | 3.9254 | 1.0 | 600.3578 | 0.5662 |
| 4.4639 | 10.0 | 14070 | 3.6574 | 1.0 | 596.9364 | 0.7584 |
| 4.1302 | 11.0 | 15477 | 3.4710 | 1.0 | 601.6454 | 0.7089 |
| 3.9235 | 12.0 | 16884 | 3.3633 | 1.0 | 601.1381 | 0.9264 |
| 3.7208 | 13.0 | 18291 | 3.2380 | 1.0 | 596.8704 | 0.9029 |
| 3.5889 | 14.0 | 19698 | 3.1404 | 1.0 | 598.6068 | 1.0998 |
| 3.4594 | 15.0 | 21105 | 3.0636 | 1.0 | 606.0485 | 1.1165 |
| 3.4105 | 16.0 | 22512 | 2.9922 | 1.0 | 609.0398 | 1.2355 |
| 3.282 | 17.0 | 23919 | 2.9320 | 1.0 | 617.0727 | 1.3752 |
| 3.1533 | 18.0 | 25326 | 2.8606 | 1.0 | 608.2587 | 1.6477 |
| 3.1363 | 19.0 | 26733 | 2.8001 | 1.0 | 604.8040 | 1.6915 |
| 2.9962 | 20.0 | 28140 | 2.7005 | 1.0 | 610.7178 | 2.4405 |
| 2.8654 | 21.0 | 29547 | 2.5369 | 1.0 | 611.4810 | 3.1834 |
| 2.7104 | 22.0 | 30954 | 2.3868 | 1.0 | 609.9735 | 4.4239 |
| 2.562 | 23.0 | 32361 | 2.2705 | 1.0 | 605.8442 | 5.1809 |
| 2.4284 | 24.0 | 33768 | 2.1780 | 1.0 | 605.7224 | 6.0046 |
| 2.3612 | 25.0 | 35175 | 2.1002 | 1.0 | 604.3007 | 6.5047 |
| 2.2352 | 26.0 | 36582 | 2.0346 | 1.0 | 608.0183 | 7.3406 |
| 2.1532 | 27.0 | 37989 | 1.9861 | 1.0 | 609.8636 | 7.1664 |
| 2.0999 | 28.0 | 39396 | 1.9416 | 1.0 | 609.9335 | 7.5815 |
| 2.0621 | 29.0 | 40803 | 1.9056 | 1.0 | 617.9091 | 8.5064 |
| 1.9897 | 30.0 | 42210 | 1.8810 | 1.0 | 609.4563 | 8.4202 |
| 1.9056 | 31.0 | 43617 | 1.8452 | 1.0 | 611.9425 | 8.0794 |
| 1.8396 | 32.0 | 45024 | 1.8380 | 1.0 | 610.2999 | 8.4263 |
| 1.8326 | 33.0 | 46431 | 1.8088 | 1.0 | 608.1780 | 8.6867 |
| 1.7374 | 34.0 | 47838 | 1.7780 | 1.0 | 608.5015 | 9.3405 |
| 1.7379 | 35.0 | 49245 | 1.7702 | 1.0 | 607.3163 | 9.4303 |
| 1.6659 | 36.0 | 50652 | 1.7438 | 1.0 | 610.6225 | 9.3951 |
| 1.6528 | 37.0 | 52059 | 1.7336 | 1.0 | 614.5718 | 9.3208 |
| 1.5992 | 38.0 | 53466 | 1.7224 | 1.0 | 611.7702 | 9.3421 |
| 1.517 | 39.0 | 54873 | 1.7274 | 1.0 | 610.9096 | 9.2879 |
| 1.5352 | 40.0 | 56280 | 1.7126 | 1.0 | 607.9050 | 9.7942 |
| 1.5186 | 41.0 | 57687 | 1.6964 | 1.0 | 607.0325 | 9.3278 |
| 1.5035 | 42.0 | 59094 | 1.6927 | 1.0 | 608.1321 | 9.7688 |
| 1.4307 | 43.0 | 60501 | 1.6996 | 1.0 | 609.1138 | 9.7955 |
| 1.3876 | 44.0 | 61908 | 1.6807 | 1.0 | 608.1156 | 9.8583 |
| 1.3587 | 45.0 | 63315 | 1.6871 | 1.0 | 612.6054 | 9.9905 |
| 1.3011 | 46.0 | 64722 | 1.6893 | 1.0 | 614.4003 | 9.9209 |
| 1.3287 | 47.0 | 66129 | 1.6864 | 1.0 | 608.6889 | 9.6559 |
| 1.2669 | 48.0 | 67536 | 1.6883 | 1.0 | 606.7991 | 9.8666 |
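
The Bleu column tracks translation quality on the evaluation set at each epoch. As an illustration of how such a score can be reproduced (the card does not state which BLEU implementation produced these numbers), corpus BLEU can be computed with the evaluate library:

```python
# Illustrative BLEU computation with the evaluate library.
# Assumption: the exact metric implementation used for this run is not
# documented; sacreBLEU is a common choice for scores on a 0-100 scale.
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["Il est parti tôt ce matin."]          # model outputs
references = [["Il est parti tôt ce matin."]]          # one list of refs per prediction
print(bleu.compute(predictions=predictions, references=references)["score"])
```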

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1
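
To approximate this environment, a pinned install along the following lines should work; the CUDA 12.8 wheel index for the PyTorch build is an assumption about how that version was installed.

```
pip install "transformers==4.57.0" "datasets==4.2.0" "tokenizers==0.22.1"
pip install "torch==2.8.0" --index-url https://download.pytorch.org/whl/cu128
```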