train_sst2_101112_1760638077

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the sst2 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0642
  • Num Input Tokens Seen: 67752208

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of how they map onto TrainingArguments follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 101112
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
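
As a hedged illustration only (the original training script is not published with this card, and output_dir is a placeholder), the settings above translate to a transformers.TrainingArguments configuration roughly like this:

```python
# Illustrative sketch: how the hyperparameters listed above would map onto
# transformers.TrainingArguments (Transformers 4.51.x). The actual training
# script is not included with this card; output_dir is a placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_sst2_101112_1760638077",  # placeholder
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=101112,
    optim="adamw_torch",          # AdamW, PyTorch implementation
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",   # cosine decay after warmup
    warmup_ratio=0.1,             # first 10% of steps spent warming up
    num_train_epochs=20,
)
```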

Training results

Training Loss   Epoch   Step     Validation Loss   Input Tokens Seen
0.1865          1.0     15154    0.0877            3387376
0.0316          2.0     30308    0.0642            6776272
0.0042          3.0     45462    0.0712            10162816
0.0008          4.0     60616    0.0950            13553408
0.0003          5.0     75770    0.1063            16938528
0.001           6.0     90924    0.1091            20326208
0.0131          7.0     106078   0.1454            23713072
0.0008          8.0     121232   0.1714            27100096
0.0264          9.0     136386   0.1994            30487024
0.0043          10.0    151540   0.1653            33871648
0.0             11.0    166694   0.1943            37260608
0.0             12.0    181848   0.1818            40646320
0.0             13.0    197002   0.2055            44035072
0.0001          14.0    212156   0.2781            47423936
0.0             15.0    227310   0.2884            50810512
0.0             16.0    242464   0.3510            54201824
0.0             17.0    257618   0.3313            57588768
0.0             18.0    272772   0.3547            60979040
0.0             19.0    287926   0.3597            64364064
0.0             20.0    303080   0.3623            67752208

The evaluation loss reported at the top of this card (0.0642) matches the epoch-2 checkpoint; validation loss rises steadily after epoch 2 even as training loss approaches zero, a typical overfitting pattern.

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
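
Loading the adapter

Given the PEFT version listed above, this repository hosts a parameter-efficient adapter for meta-llama/Meta-Llama-3-8B-Instruct rather than full model weights. A minimal, hypothetical loading sketch follows; the repo id and dtype are assumptions, as the card itself ships no usage example:

```python
# Hypothetical usage sketch, assuming this repository hosts a PEFT adapter
# for meta-llama/Meta-Llama-3-8B-Instruct (as the card states). The repo id
# and dtype are assumptions; the card includes no official usage example.
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

model = AutoPeftModelForCausalLM.from_pretrained(
    "rbelanec/train_sst2_101112_1760638077",  # adapter repo id (assumed)
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
model.eval()
```

For SST-2 sentiment classification, prompts would need to follow whatever instruction format was used at training time, which this card does not document.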
