train_svamp_101112_1760638001

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the svamp dataset (a loading and inference sketch follows the metrics below). It achieves the following results on the evaluation set:

  • Loss: 0.2259
  • Num Input Tokens Seen: 1430592
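
Because this checkpoint is a PEFT adapter rather than a full set of model weights, it loads on top of the base model. Below is a minimal loading and inference sketch, assuming the adapter repo id rbelanec/train_svamp_101112_1760638001 and an illustrative SVAMP-style prompt; the prompt template actually used in training is not documented here.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_svamp_101112_1760638001"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
# Attach the fine-tuned adapter weights to the frozen base model.
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

# Hypothetical SVAMP-style math word problem.
messages = [{"role": "user", "content": (
    "Jack had 8 pens and Mary had 5 pens. Jack gave 3 pens to Mary. "
    "How many pens does Jack have now?"
)}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    out = model.generate(
        inputs,
        max_new_tokens=64,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
    )
print(tokenizer.decode(out[0, inputs.shape[-1]:], skip_special_tokens=True))
```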

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 101112
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
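
The training script itself is not published here; the following sketch shows how the listed settings map onto transformers TrainingArguments, with dataset preparation and the LoRA configuration omitted. Arguments not in the list above are assumptions.

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above; output_dir and the
# evaluation/logging strategies are assumptions.
training_args = TrainingArguments(
    output_dir="train_svamp_101112_1760638001",
    learning_rate=1e-3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=101112,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
    eval_strategy="epoch",   # per-epoch validation, matching the results table
    logging_strategy="epoch",
)
```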

Training results

Training Loss   Epoch   Step   Validation Loss   Input Tokens Seen
0.2239           1.0     158   0.2943               71552
0.2460           2.0     316   0.2062              142960
0.0528           3.0     474   0.1448              214432
0.0997           4.0     632   0.1533              286000
0.0227           5.0     790   0.1559              357888
0.0387           6.0     948   0.1550              429456
0.0222           7.0    1106   0.1580              501136
0.0233           8.0    1264   0.1943              573104
0.0281           9.0    1422   0.2004              644752
0.0360          10.0    1580   0.1924              716192
0.0096          11.0    1738   0.2382              787200
0.0017          12.0    1896   0.2249              858736
0.0010          13.0    2054   0.2267              930160
0.0012          14.0    2212   0.2471             1001792
0.0017          15.0    2370   0.2629             1073248
0.0002          16.0    2528   0.2719             1144672
0.0003          17.0    2686   0.2751             1216160
0.0003          18.0    2844   0.2769             1287728
0.0005          19.0    3002   0.2809             1359120
0.0001          20.0    3160   0.2799             1430592
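
Validation loss bottoms out near epoch 3 (0.1448) and trends upward afterward while training loss keeps shrinking, a typical overfitting pattern. If retraining, keeping the best checkpoint is one mitigation; a sketch, not the published recipe:

```python
from transformers import TrainingArguments, Trainer, EarlyStoppingCallback

# Assumed checkpoint-selection settings; combine with the hyperparameters above.
args = TrainingArguments(
    output_dir="train_svamp_101112_1760638001",
    eval_strategy="epoch",
    save_strategy="epoch",          # must match eval_strategy for best-model loading
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)
# Hypothetical Trainer wiring; model and datasets are assumed to exist.
# trainer = Trainer(model=model, args=args,
#                   train_dataset=train_ds, eval_dataset=eval_ds,
#                   callbacks=[EarlyStoppingCallback(early_stopping_patience=3)])
```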

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4