train_codealpacapy_123_1762506128

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the codealpacapy dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4728
  • Num Input Tokens Seen: 24941912
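
Since this checkpoint is a PEFT adapter on top of meta-llama/Meta-Llama-3-8B-Instruct, it can presumably be used by loading the base model and attaching the adapter. The sketch below is illustrative only: it assumes the adapter is hosted as rbelanec/train_codealpacapy_123_1762506128 and uses a generic chat prompt, since the exact instruction format used in training is not documented in this card.

```python
# Minimal loading sketch (assumptions noted above); adjust dtype/device_map for your hardware.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_codealpacapy_123_1762506128"  # assumed adapter repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)  # attach the fine-tuned adapter

# Illustrative prompt; the training prompt template is not specified here.
messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```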

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.03
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
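
As a rough guide, these settings would map onto transformers TrainingArguments roughly as sketched below. This is an assumption based on the framework versions listed later in the card; the actual training script, LoRA/PEFT configuration, and dataset preprocessing are not documented here.

```python
# Sketch only: mirrors the hyperparameters listed above; the real training script is not part of this card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_codealpacapy_123_1762506128",
    learning_rate=0.03,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=123,
    optim="adamw_torch",          # AdamW with betas=(0.9, 0.999), eps=1e-08
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```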

Training results

Training Loss   Epoch   Step    Validation Loss   Input Tokens Seen
0.5889          1.0     1908    0.4966            1248304
0.4331          2.0     3816    0.4844            2497016
0.4405          3.0     5724    0.4833            3742552
0.4097          4.0     7632    0.4753            4985200
0.4668          5.0     9540    0.4767            6233920
0.4781          6.0     11448   0.4731            7478504
0.4447          7.0     13356   0.4728            8722744
0.7575          8.0     15264   0.4755            9977520
0.4325          9.0     17172   0.4789            11225416
0.5884          10.0    19080   0.4832            12472912
0.5296          11.0    20988   0.4880            13721824
0.4288          12.0    22896   0.4906            14970528
0.5057          13.0    24804   0.4987            16220808
0.3346          14.0    26712   0.5085            17464792
0.328           15.0    28620   0.5197            18706976
0.2985          16.0    30528   0.5279            19956544
0.3145          17.0    32436   0.5337            21204416
0.3131          18.0    34344   0.5392            22451928
0.3053          19.0    36252   0.5407            23696296
0.5008          20.0    38160   0.5409            24941912
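
Note that the validation loss bottoms out at epoch 7 (0.4728, the figure reported at the top of this card) and drifts upward afterwards, which suggests the reported result corresponds to the best epoch rather than the final one. One common way to get this behaviour with the Trainer is best-checkpoint selection on eval loss, sketched below as an assumption; it is not confirmed by this card.

```python
# Assumed best-checkpoint selection on validation loss (not confirmed by this card).
from transformers import TrainingArguments

selection_args = TrainingArguments(
    output_dir="train_codealpacapy_123_1762506128",
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)
```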

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1