train_conala_789_1760637894

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the conala dataset. It achieves the following results on the evaluation set:

  • Loss: 1.2822
  • Num Input Tokens Seen: 3037136

Model description

This repository contains a PEFT adapter for meta-llama/Meta-Llama-3-8B-Instruct (the base weights are not included), trained with seed 789 on the conala dataset of natural-language programming intents paired with short Python snippets. Beyond what this card records, no further description is available.

Intended uses & limitations

The adapter is intended for generating short Python snippets from natural-language intents in the style of the conala dataset. Its behavior outside that domain has not been evaluated, and it inherits the limitations and license terms of the Llama 3 base model.

Training and evaluation data

The model was fine-tuned and evaluated on the conala dataset. The exact splits and preprocessing are not documented in this card; at a train batch size of 4 and 536 optimizer steps per epoch, the effective training set is roughly 2,100 examples, assuming no gradient accumulation.

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 789
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.6543        | 1.0   | 536   | 0.7184          | 152296            |
| 0.963         | 2.0   | 1072  | 0.6811          | 304440            |
| 0.6835        | 3.0   | 1608  | 0.6779          | 455928            |
| 0.6298        | 4.0   | 2144  | 0.6566          | 608072            |
| 0.6007        | 5.0   | 2680  | 0.6503          | 759296            |
| 0.4515        | 6.0   | 3216  | 0.6554          | 910984            |
| 0.354         | 7.0   | 3752  | 0.6558          | 1062816           |
| 0.5476        | 8.0   | 4288  | 0.6500          | 1214520           |
| 0.6967        | 9.0   | 4824  | 0.6582          | 1366480           |
| 0.4536        | 10.0  | 5360  | 0.6594          | 1518976           |
| 0.3455        | 11.0  | 5896  | 0.6764          | 1670320           |
| 0.3709        | 12.0  | 6432  | 0.6794          | 1822624           |
| 0.3378        | 13.0  | 6968  | 0.7134          | 1974336           |
| 0.3264        | 14.0  | 7504  | 0.7172          | 2126488           |
| 0.3556        | 15.0  | 8040  | 0.7683          | 2278280           |
| 0.2914        | 16.0  | 8576  | 0.7782          | 2430272           |
| 0.2339        | 17.0  | 9112  | 0.8055          | 2581848           |
| 0.2257        | 18.0  | 9648  | 0.8163          | 2733712           |
| 0.2782        | 19.0  | 10184 | 0.8254          | 2885208           |
| 0.2769        | 20.0  | 10720 | 0.8287          | 3037136           |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4