# train_qnli_101112_1760638088
This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the qnli dataset. It achieves the following results on the evaluation set:
- Loss: 0.7920
- Num Input Tokens Seen: 207147488
## Model description
More information needed
## Intended uses & limitations
More information needed
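Because this is a PEFT adapter (see the framework versions below), it is loaded on top of the base model rather than used standalone. The following is a minimal loading sketch, not taken from the card itself; the hub id is assumed from this repository's name:

```python
# Minimal sketch: attach the LoRA adapter to the base model with PEFT.
# The adapter hub id is assumed from this repository's name.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct", torch_dtype=torch.bfloat16
)
model = PeftModel.from_pretrained(base, "rbelanec/train_qnli_101112_1760638088")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
```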
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 4
- eval_batch_size: 4
- seed: 101112
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 20
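
The training script itself is not included in this card. As a rough guide, the hyperparameters above would map onto `transformers.TrainingArguments` as in the sketch below; the use of the standard `Trainer` and the `output_dir` name are assumptions:

```python
# Sketch only: maps the listed hyperparameters onto TrainingArguments.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_qnli_101112_1760638088",  # assumed name
    learning_rate=1e-3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=101112,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```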
### Training results
| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|---|---|---|---|---|
| 0.0538 | 1.0 | 23567 | 0.0475 | 10356896 |
| 0.0574 | 2.0 | 47134 | 0.0426 | 20715296 |
| 0.0403 | 3.0 | 70701 | 0.0402 | 31065184 |
| 0.0187 | 4.0 | 94268 | 0.0388 | 41428128 |
| 0.0737 | 5.0 | 117835 | 0.0379 | 51784320 |
| 0.0528 | 6.0 | 141402 | 0.0373 | 62144160 |
| 0.071 | 7.0 | 164969 | 0.0382 | 72511552 |
| 0.0308 | 8.0 | 188536 | 0.0402 | 82864256 |
| 0.0071 | 9.0 | 212103 | 0.0406 | 93220320 |
| 0.0266 | 10.0 | 235670 | 0.0433 | 103572992 |
| 0.0289 | 11.0 | 259237 | 0.0454 | 113924768 |
| 0.0188 | 12.0 | 282804 | 0.0473 | 124282240 |
| 0.0028 | 13.0 | 306371 | 0.0549 | 134645600 |
| 0.0028 | 14.0 | 329938 | 0.0558 | 145000704 |
| 0.0019 | 15.0 | 353505 | 0.0603 | 155349152 |
| 0.0011 | 16.0 | 377072 | 0.0657 | 165706304 |
| 0.0119 | 17.0 | 400639 | 0.0764 | 176064704 |
| 0.0059 | 18.0 | 424206 | 0.0818 | 186423936 |
| 0.004 | 19.0 | 447773 | 0.0868 | 196786464 |
| 0.0055 | 20.0 | 471340 | 0.0884 | 207147488 |
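
Validation loss reaches its minimum (0.0373) at epoch 6 and rises steadily afterwards while training loss trends toward zero, so later checkpoints are increasingly overfit; a checkpoint from around epoch 6 would likely generalize best.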
### Framework versions
- PEFT 0.17.1
- Transformers 4.51.3
- PyTorch 2.9.0+cu128
- Datasets 4.0.0
- Tokenizers 0.21.4