train_sst2_101112_1760638078

This model is a PEFT adapter fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct on the sst2 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.8404
  • Num Input Tokens Seen: 67752208
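
For convenience, here is a minimal loading sketch, assuming this repo hosts a PEFT adapter for meta-llama/Meta-Llama-3-8B-Instruct; the dtype and device settings are illustrative choices, not documented by the card.

```python
# Minimal sketch: load the base model and attach this PEFT adapter.
# Assumptions: bfloat16 and device_map="auto" are illustrative choices,
# not settings documented by this card.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER = "rbelanec/train_sst2_101112_1760638078"

tokenizer = AutoTokenizer.from_pretrained(BASE)
base_model = AutoModelForCausalLM.from_pretrained(
    BASE, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, ADAPTER)
model.eval()
```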

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
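
The card only names sst2. Below is a minimal loading sketch, assuming this refers to the GLUE SST-2 sentiment task on the Hugging Face Hub; the prompt format and preprocessing used for fine-tuning are not documented.

```python
# Assumption: "sst2" is the GLUE SST-2 task hosted as stanfordnlp/sst2;
# the card does not document the actual preprocessing pipeline.
from datasets import load_dataset

sst2 = load_dataset("stanfordnlp/sst2")  # train / validation / test splits
print(sst2["train"][0])  # e.g. {'idx': 0, 'sentence': '...', 'label': 0}
```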

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 101112
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
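
A sketch of how these values map onto Hugging Face TrainingArguments; this is a reconstruction under stated assumptions, not the published training script, and output_dir is a hypothetical name.

```python
# Reconstruction sketch of the listed hyperparameters as TrainingArguments;
# output_dir is assumed, and any unlisted arguments keep library defaults.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_sst2_101112_1760638078",  # assumed output path
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=101112,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```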

Training results

Training Loss | Epoch | Step   | Validation Loss | Input Tokens Seen
:-----------: | :---: | :----: | :-------------: | :---------------:
0.9299        | 1.0   | 15154  | 0.9122          | 3387376
0.8209        | 2.0   | 30308  | 0.8524          | 6776272
1.0562        | 3.0   | 45462  | 0.8469          | 10162816
0.7067        | 4.0   | 60616  | 0.8411          | 13553408
0.8766        | 5.0   | 75770  | 0.8414          | 16938528
0.856         | 6.0   | 90924  | 0.8454          | 20326208
1.2089        | 7.0   | 106078 | 0.8450          | 23713072
0.8014        | 8.0   | 121232 | 0.8404          | 27100096
0.8887        | 9.0   | 136386 | 0.8420          | 30487024
0.9286        | 10.0  | 151540 | 0.8445          | 33871648
0.7568        | 11.0  | 166694 | 0.8455          | 37260608
0.8389        | 12.0  | 181848 | 0.8463          | 40646320
0.885         | 13.0  | 197002 | 0.8463          | 44035072
0.9316        | 14.0  | 212156 | 0.8463          | 47423936
1.0567        | 15.0  | 227310 | 0.8463          | 50810512
0.7944        | 16.0  | 242464 | 0.8463          | 54201824
1.0582        | 17.0  | 257618 | 0.8463          | 57588768
0.8071        | 18.0  | 272772 | 0.8463          | 60979040
0.8653        | 19.0  | 287926 | 0.8463          | 64364064
0.7521        | 20.0  | 303080 | 0.8463          | 67752208

The reported evaluation loss of 0.8404 matches the epoch-8 checkpoint rather than the final epoch (0.8463), which suggests the best checkpoint by validation loss was retained.

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4