train_math_qa_101112_1760638066

This model is a PEFT adapter for meta-llama/Meta-Llama-3-8B-Instruct, fine-tuned on the math_qa dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6663
  • Num Input Tokens Seen: 77914328
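
Because this is a PEFT adapter rather than a full checkpoint (see Framework versions below), it is loaded on top of the base model. The following is a minimal loading sketch, assuming access to the gated meta-llama/Meta-Llama-3-8B-Instruct weights and that the adapter is hosted at rbelanec/train_math_qa_101112_1760638066; the prompt is purely illustrative:

```python
# Minimal loading sketch (assumed usage, not an official example).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_math_qa_101112_1760638066"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
# Attach the fine-tuned adapter weights to the frozen base model.
model = PeftModel.from_pretrained(base, adapter_id)

# Illustrative prompt only; format it with the Llama 3 chat template.
messages = [{"role": "user", "content": "A train travels 120 km in 2 hours. What is its average speed?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```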

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 101112
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
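
As a sketch of how these values map onto transformers.TrainingArguments (an assumed reconstruction, not the author's actual training script; output_dir is taken from the model name):

```python
from transformers import TrainingArguments

# Hyperparameters copied from the list above.
training_args = TrainingArguments(
    output_dir="train_math_qa_101112_1760638066",
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=101112,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```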

Training results

| Training Loss | Epoch | Step   | Validation Loss | Input Tokens Seen |
|---------------|-------|--------|-----------------|-------------------|
| 0.6321        | 1.0   | 6714   | 0.7315          | 3894384           |
| 0.6864        | 2.0   | 13428  | 0.7008          | 7788792           |
| 0.9075        | 3.0   | 20142  | 0.6857          | 11683344          |
| 0.7834        | 4.0   | 26856  | 0.6755          | 15578064          |
| 0.8392        | 5.0   | 33570  | 0.6742          | 19479304          |
| 0.5749        | 6.0   | 40284  | 0.6706          | 23378352          |
| 0.5827        | 7.0   | 46998  | 0.6673          | 27274568          |
| 0.7153        | 8.0   | 53712  | 0.6682          | 31172664          |
| 0.6194        | 9.0   | 60426  | 0.6663          | 35068368          |
| 0.7731        | 10.0  | 67140  | 0.6681          | 38966392          |
| 0.6639        | 11.0  | 73854  | 0.6683          | 42861936          |
| 0.575         | 12.0  | 80568  | 0.6695          | 46756048          |
| 0.4167        | 13.0  | 87282  | 0.6692          | 50652416          |
| 0.5908        | 14.0  | 93996  | 0.6714          | 54546936          |
| 0.5584        | 15.0  | 100710 | 0.6727          | 58442960          |
| 0.8154        | 16.0  | 107424 | 0.6729          | 62338944          |
| 0.5358        | 17.0  | 114138 | 0.6727          | 66231336          |
| 0.4849        | 18.0  | 120852 | 0.6730          | 70127040          |
| 0.6368        | 19.0  | 127566 | 0.6740          | 74021072          |
| 0.455         | 20.0  | 134280 | 0.6737          | 77914328          |
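
Validation loss reaches its minimum at epoch 9 (0.6663, matching the evaluation loss reported at the top of this card) and drifts slightly upward afterwards. A minimal sketch, assuming matplotlib is available, that plots the validation-loss column above:

```python
import matplotlib.pyplot as plt

# Validation losses for epochs 1-20, copied from the table above.
epochs = list(range(1, 21))
val_loss = [0.7315, 0.7008, 0.6857, 0.6755, 0.6742, 0.6706, 0.6673,
            0.6682, 0.6663, 0.6681, 0.6683, 0.6695, 0.6692, 0.6714,
            0.6727, 0.6729, 0.6727, 0.6730, 0.6740, 0.6737]

plt.plot(epochs, val_loss, marker="o")
plt.xlabel("Epoch")
plt.ylabel("Validation loss")
plt.title("math_qa validation loss per epoch")
plt.show()
```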

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4