train_cola_1757340211

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the cola dataset. It achieves the following results on the evaluation set:

Loss: 0.2174
Num Input Tokens Seen: 3668312

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 456
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 10.0
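
To make the schedule implied by these settings concrete, here is a minimal sketch of a linear-warmup, cosine-decay learning rate using the values listed above and the final step count (19240) from the training results table. The card confirms only the scheduler type and warmup ratio; the exact formula used by the training framework, the rounding of the warmup step count, and the single-device, no-gradient-accumulation setup are assumptions. The function names are hypothetical.

```python
import math

# Values taken from this card.
LEARNING_RATE = 5e-05
WARMUP_RATIO = 0.1
NUM_EPOCHS = 10
TRAIN_BATCH_SIZE = 4
TOTAL_STEPS = 19240  # final Step value in the training results table


def warmup_steps(total_steps, ratio):
    """Number of warmup steps (assumption: rounded up)."""
    return math.ceil(total_steps * ratio)


def lr_at(step, base_lr, warmup, total):
    """Learning rate at a given step: linear warmup, then half-cosine decay to zero."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS
# Assumption: one device and no gradient accumulation, so one step = one batch.
examples_per_epoch = steps_per_epoch * TRAIN_BATCH_SIZE
n_warmup = warmup_steps(TOTAL_STEPS, WARMUP_RATIO)

if __name__ == "__main__":
    print("steps per epoch:", steps_per_epoch)
    print("examples per epoch (assumed):", examples_per_epoch)
    print("warmup steps (assumed rounding):", n_warmup)
    for step in (0, n_warmup, TOTAL_STEPS // 2, TOTAL_STEPS):
        print(step, lr_at(step, LEARNING_RATE, n_warmup, TOTAL_STEPS))
```

Under these assumptions the learning rate peaks at 5e-05 around the end of the first epoch and decays to zero by the final step.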

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.2364	0.5	962	0.1978	183008
0.3251	1.0	1924	0.1863	366712
0.1059	1.5	2886	0.1736	550360
0.207	2.0	3848	0.1481	734016
0.2568	2.5	4810	0.1479	917408
0.0476	3.0	5772	0.1609	1100824
0.0386	3.5	6734	0.1598	1283896
0.1213	4.0	7696	0.1442	1467248
0.0282	4.5	8658	0.1766	1651280
0.1311	5.0	9620	0.1388	1834568
0.0748	5.5	10582	0.1554	2017960
0.098	6.0	11544	0.1478	2201464
0.109	6.5	12506	0.1749	2384536
0.0855	7.0	13468	0.1513	2568040
0.0174	7.5	14430	0.1706	2750664
0.058	8.0	15392	0.1701	2934360
0.0563	8.5	16354	0.1804	3118424
0.2265	9.0	17316	0.1770	3301448
0.0602	9.5	18278	0.1786	3485512
0.1016	10.0	19240	0.1781	3668312
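
The table above can be scanned programmatically to find the evaluation point with the lowest validation loss. The following is a small sketch that parses the tab-separated rows as they appear in this card; the helper name `best_checkpoint` is hypothetical, and whether a checkpoint was actually saved at that step is not stated in the card.

```python
import csv
import io

# Training results copied from this card (tab-separated).
TABLE = (
    "Training Loss\tEpoch\tStep\tValidation Loss\tInput Tokens Seen\n"
    "0.2364\t0.5\t962\t0.1978\t183008\n"
    "0.3251\t1.0\t1924\t0.1863\t366712\n"
    "0.1059\t1.5\t2886\t0.1736\t550360\n"
    "0.207\t2.0\t3848\t0.1481\t734016\n"
    "0.2568\t2.5\t4810\t0.1479\t917408\n"
    "0.0476\t3.0\t5772\t0.1609\t1100824\n"
    "0.0386\t3.5\t6734\t0.1598\t1283896\n"
    "0.1213\t4.0\t7696\t0.1442\t1467248\n"
    "0.0282\t4.5\t8658\t0.1766\t1651280\n"
    "0.1311\t5.0\t9620\t0.1388\t1834568\n"
    "0.0748\t5.5\t10582\t0.1554\t2017960\n"
    "0.098\t6.0\t11544\t0.1478\t2201464\n"
    "0.109\t6.5\t12506\t0.1749\t2384536\n"
    "0.0855\t7.0\t13468\t0.1513\t2568040\n"
    "0.0174\t7.5\t14430\t0.1706\t2750664\n"
    "0.058\t8.0\t15392\t0.1701\t2934360\n"
    "0.0563\t8.5\t16354\t0.1804\t3118424\n"
    "0.2265\t9.0\t17316\t0.1770\t3301448\n"
    "0.0602\t9.5\t18278\t0.1786\t3485512\n"
    "0.1016\t10.0\t19240\t0.1781\t3668312\n"
)


def best_checkpoint(table_text):
    """Return the row (as a dict) with the lowest validation loss."""
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    rows = [
        {
            "epoch": float(r["Epoch"]),
            "step": int(r["Step"]),
            "val_loss": float(r["Validation Loss"]),
        }
        for r in reader
    ]
    return min(rows, key=lambda r: r["val_loss"])


best = best_checkpoint(TABLE)

if __name__ == "__main__":
    print(best)
```

On the logged numbers this selects step 9620 (epoch 5.0, validation loss 0.1388); every later evaluation is higher, ending at 0.1781 after epoch 10.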

Framework versions

PEFT 0.15.2
Transformers 4.51.3
Pytorch 2.8.0+cu128
Datasets 3.6.0
Tokenizers 0.21.1
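
Since the card lists PEFT among the framework versions and names a base model, the weights are presumably a PEFT adapter applied on top of that base. A minimal loading sketch follows, under several assumptions: the adapter is published under the repository ID `rbelanec/train_cola_1757340211`, the tokenizer is taken from the base model, and the caller has access to the base model weights. The helper name `load_adapter_model` is hypothetical, and the card does not specify a prompt format, so none is shown.

```python
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_ID = "rbelanec/train_cola_1757340211"  # assumption: adapter repository ID


def load_adapter_model(base_id=BASE_MODEL_ID, adapter_id=ADAPTER_ID):
    """Load the base model and attach the PEFT adapter (sketch, not verified against this repo)."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    # Wrap the base model with the adapter weights.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return tokenizer, model


if __name__ == "__main__":
    # Downloads several GB of weights; requires access to the base model.
    tokenizer, model = load_adapter_model()
    print(type(model).__name__)
```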

Model tree for rbelanec/train_cola_1757340211

Base model: meta-llama/Meta-Llama-3-8B-Instruct
Adapter: this model
