train_qnli_101112_1760638087

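This model is a PEFT adapter fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct on the QNLI dataset. It achieves the following results on the evaluation set (the loss is the lowest validation loss in the training results, reached at epoch 8; the token count is the total after all 20 epochs):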

Loss: 0.0383
Num Input Tokens Seen: 207147488

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.03
train_batch_size: 4
eval_batch_size: 4
seed: 101112
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 20
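
To make the interaction between learning_rate, lr_scheduler_type and lr_scheduler_warmup_ratio more concrete, the following is a minimal sketch of a linear-warmup, cosine-decay schedule using the values listed above. It is an approximation written from the hyperparameters, not the training code itself: the total step count (471340) is taken from the last row of the training results, the warmup length is assumed to be the warmup ratio times that total, and the helper name lr_at is hypothetical.

```python
import math

# Values taken from the hyperparameter list and the final row of the results table.
PEAK_LR = 0.03
WARMUP_RATIO = 0.1
TOTAL_STEPS = 471340  # step count reported at epoch 20

# Assumption: warmup length is the warmup ratio times the total step count.
WARMUP_STEPS = round(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Approximate learning rate at a given optimizer step (hypothetical helper)."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the peak learning rate.
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    # Half-cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    for epoch in (1, 2, 8, 11, 20):
        step = epoch * 23567  # steps per epoch, from the results table
        print(f"epoch {epoch:2d}  step {step:6d}  lr ~ {lr_at(step):.5f}")
```

Under these assumptions the warmup covers 47134 steps, which coincides with the end of epoch 2 in the results table; the learning rate then decays toward zero by the final step.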

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.0516	1.0	23567	0.0472	10356896
0.0494	2.0	47134	0.0439	20715296
0.0298	3.0	70701	0.0406	31065184
0.0369	4.0	94268	0.0407	41428128
0.0546	5.0	117835	0.0401	51784320
0.0607	6.0	141402	0.0394	62144160
0.071	7.0	164969	0.0392	72511552
0.0365	8.0	188536	0.0383	82864256
0.0241	9.0	212103	0.0392	93220320
0.0532	10.0	235670	0.0430	103572992
0.0269	11.0	259237	0.0425	113924768
0.0115	12.0	282804	0.0423	124282240
0.01	13.0	306371	0.0438	134645600
0.0052	14.0	329938	0.0410	145000704
0.0046	15.0	353505	0.0426	155349152
0.0213	16.0	377072	0.0426	165706304
0.0239	17.0	400639	0.0422	176064704
0.0377	18.0	424206	0.0421	186423936
0.017	19.0	447773	0.0421	196786464
0.0164	20.0	471340	0.0421	207147488
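
As a quick way to read the table above, the sketch below picks the row with the lowest validation loss. The rows are copied from the table (epoch, step, validation loss); the helper name best_row is hypothetical and only illustrates the selection, it is not part of the training setup.

```python
# (epoch, step, validation_loss) rows copied from the training results table.
ROWS = [
    (1, 23567, 0.0472), (2, 47134, 0.0439), (3, 70701, 0.0406),
    (4, 94268, 0.0407), (5, 117835, 0.0401), (6, 141402, 0.0394),
    (7, 164969, 0.0392), (8, 188536, 0.0383), (9, 212103, 0.0392),
    (10, 235670, 0.0430), (11, 259237, 0.0425), (12, 282804, 0.0423),
    (13, 306371, 0.0438), (14, 329938, 0.0410), (15, 353505, 0.0426),
    (16, 377072, 0.0426), (17, 400639, 0.0422), (18, 424206, 0.0421),
    (19, 447773, 0.0421), (20, 471340, 0.0421),
]


def best_row(rows):
    """Return the row with the lowest validation loss (earliest row wins ties)."""
    return min(rows, key=lambda row: row[2])


best_epoch, best_step, best_loss = best_row(ROWS)
print(f"best epoch: {best_epoch}, step: {best_step}, validation loss: {best_loss}")
```

The selection lands on epoch 8 (step 188536), whose validation loss of 0.0383 equals the evaluation loss reported at the top of this card. Validation loss does not improve after that point while training loss keeps falling through epoch 15, which may indicate overfitting in the later epochs.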

Framework versions

PEFT 0.17.1
Transformers 4.51.3
Pytorch 2.9.0+cu128
Datasets 4.0.0
Tokenizers 0.21.4
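
Since the framework versions include PEFT, one plausible way to use this checkpoint is to load the base model and attach the adapter weights on top of it. The sketch below assumes the adapter was trained against the causal language modeling head and that the repository id is rbelanec/train_qnli_101112_1760638087; neither the task head nor the prompt format is documented in this card, and load_adapter is a hypothetical helper name.

```python
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_ID = "rbelanec/train_qnli_101112_1760638087"


def load_adapter(base_model_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the adapter weights (hypothetical helper)."""
    # Imported lazily so this file can be imported without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    # Assumption: the adapter was trained on top of the causal-LM model class.
    base_model = AutoModelForCausalLM.from_pretrained(base_model_id)
    # Wrap the base model with the adapter weights from the Hub.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return tokenizer, model
```

Calling load_adapter() downloads both the base model and the adapter, so it needs network access and enough memory for an 8B-parameter model; access to the base model may also require accepting its license on the Hugging Face Hub.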

Model tree for rbelanec/train_qnli_101112_1760638087

Base model

meta-llama/Meta-Llama-3-8B-Instruct