# train_copa_101112_1760637991
This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the copa dataset. It achieves the following results on the evaluation set:
- Loss: 0.0305
- Num Input Tokens Seen: 562848
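Since the framework versions below list PEFT 0.17.1, this checkpoint is presumably a PEFT adapter rather than a full set of model weights, so loading means attaching the adapter to the base model. A minimal loading sketch under that assumption (access to the gated Llama 3 base weights is required):

```python
# pip install "transformers==4.51.3" "peft==0.17.1" accelerate torch
# Sketch: assumes this repo hosts a PEFT adapter for the base model below.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_copa_101112_1760637991"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,   # half precision keeps the 8B model manageable
    device_map="auto",            # requires accelerate
)
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()
```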
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 101112
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 20
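For reference, here is one way these settings map onto `transformers.TrainingArguments`. This is a sketch only, since the actual training script is not included in the card; `output_dir` is a placeholder:

```python
from transformers import TrainingArguments

# Illustrative mapping of the listed hyperparameters onto TrainingArguments.
args = TrainingArguments(
    output_dir="train_copa_101112_1760637991",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=101112,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```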
### Training results
| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|---|---|---|---|---|
| 0.9281 | 1.0 | 90 | 0.5497 | 28192 |
| 0.068 | 2.0 | 180 | 0.0412 | 56256 |
| 0.0154 | 3.0 | 270 | 0.0333 | 84320 |
| 0.0329 | 4.0 | 360 | 0.0329 | 112416 |
| 0.0093 | 5.0 | 450 | 0.0326 | 140544 |
| 0.0023 | 6.0 | 540 | 0.0314 | 168768 |
| 0.0313 | 7.0 | 630 | 0.0322 | 196896 |
| 0.0137 | 8.0 | 720 | 0.0314 | 225024 |
| 0.0757 | 9.0 | 810 | 0.0309 | 253152 |
| 0.1181 | 10.0 | 900 | 0.0314 | 281312 |
| 0.0914 | 11.0 | 990 | 0.0323 | 309280 |
| 0.071 | 12.0 | 1080 | 0.0323 | 337536 |
| 0.0213 | 13.0 | 1170 | 0.0314 | 365632 |
| 0.0429 | 14.0 | 1260 | 0.0324 | 393632 |
| 0.191 | 15.0 | 1350 | 0.0314 | 421696 |
| 0.0206 | 16.0 | 1440 | 0.0305 | 449984 |
| 0.0522 | 17.0 | 1530 | 0.0313 | 478016 |
| 0.0282 | 18.0 | 1620 | 0.0316 | 506272 |
| 0.0144 | 19.0 | 1710 | 0.0313 | 534432 |
| 0.0631 | 20.0 | 1800 | 0.0312 | 562848 |
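Validation loss bottoms out at 0.0305 at epoch 16, matching the headline figure above, which suggests the best checkpoint by validation loss was kept rather than the final one. The card does not document the prompt format used for the copa examples, so any inference sketch has to assume the base model's chat template; with `model` and `tokenizer` loaded as above:

```python
# Hypothetical COPA-style prompt; the actual training prompt format is undocumented.
messages = [{
    "role": "user",
    "content": (
        "Premise: The man broke his toe.\n"
        "What was the cause?\n"
        "choice1: He got a hole in his sock.\n"
        "choice2: He dropped a hammer on his foot."
    ),
}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=16)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```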
### Framework versions
- PEFT 0.17.1
- Transformers 4.51.3
- PyTorch 2.9.0+cu128
- Datasets 4.0.0
- Tokenizers 0.21.4