train_qnli_101112_1760638088

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the qnli dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7920
  • Num Input Tokens Seen: 207147488
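
This checkpoint is a PEFT adapter on top of meta-llama/Meta-Llama-3-8B-Instruct (see the framework versions and model tree below). The following is a minimal loading and inference sketch, assuming access to the gated base model and that the adapter is hosted at rbelanec/train_qnli_101112_1760638088; the prompt template used during training is not documented in this card, so the format shown is only an assumption.

```python
# Hedged sketch: load the PEFT adapter on top of the gated base model.
# Assumes the adapter repository id from this card and access to the base model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_qnli_101112_1760638088"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# QNLI asks whether a sentence answers a question. The exact prompt format used
# during training is not recorded here, so this template is an assumption.
prompt = (
    "question: What percentage of farmland grows wheat?\n"
    "sentence: More than 50% of this area is sown for wheat.\n"
    "answer (entailment or not_entailment):"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(
        **inputs, max_new_tokens=5, pad_token_id=tokenizer.eos_token_id
    )
print(tokenizer.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```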

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an equivalent configuration sketch follows the list):

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 101112
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
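
As a rough illustration, the settings above map onto the following transformers TrainingArguments. This is a sketch only, not the actual training script, and the PEFT/LoRA adapter configuration used for this run is not recorded in this card.

```python
# Hedged sketch of the listed hyperparameters expressed as TrainingArguments.
# The real training script and its PEFT settings are not included in this card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_qnli_101112_1760638088",
    learning_rate=1e-3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=101112,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```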

Training results

| Training Loss | Epoch | Step   | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:------:|:---------------:|:-----------------:|
| 0.0538        | 1.0   | 23567  | 0.0475          | 10356896          |
| 0.0574        | 2.0   | 47134  | 0.0426          | 20715296          |
| 0.0403        | 3.0   | 70701  | 0.0402          | 31065184          |
| 0.0187        | 4.0   | 94268  | 0.0388          | 41428128          |
| 0.0737        | 5.0   | 117835 | 0.0379          | 51784320          |
| 0.0528        | 6.0   | 141402 | 0.0373          | 62144160          |
| 0.071         | 7.0   | 164969 | 0.0382          | 72511552          |
| 0.0308        | 8.0   | 188536 | 0.0402          | 82864256          |
| 0.0071        | 9.0   | 212103 | 0.0406          | 93220320          |
| 0.0266        | 10.0  | 235670 | 0.0433          | 103572992         |
| 0.0289        | 11.0  | 259237 | 0.0454          | 113924768         |
| 0.0188        | 12.0  | 282804 | 0.0473          | 124282240         |
| 0.0028        | 13.0  | 306371 | 0.0549          | 134645600         |
| 0.0028        | 14.0  | 329938 | 0.0558          | 145000704         |
| 0.0019        | 15.0  | 353505 | 0.0603          | 155349152         |
| 0.0011        | 16.0  | 377072 | 0.0657          | 165706304         |
| 0.0119        | 17.0  | 400639 | 0.0764          | 176064704         |
| 0.0059        | 18.0  | 424206 | 0.0818          | 186423936         |
| 0.004         | 19.0  | 447773 | 0.0868          | 196786464         |
| 0.0055        | 20.0  | 471340 | 0.0884          | 207147488         |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4

Model tree for rbelanec/train_qnli_101112_1760638088

This model is one of 2009 adapters fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct.