train_wic_789_1760637922

This model is a PEFT adapter fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct on the WiC (Word-in-Context) dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2779 (best validation loss, reached at epoch 8; see the training results below)
  • Num Input Tokens Seen: 8431032
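
The framework versions below list PEFT, so this repository presumably contains adapter weights meant to be loaded on top of the base model. Below is a minimal loading and inference sketch, assuming a standard PEFT adapter layout; the WiC-style prompt is illustrative only, since the prompt template used in training is not documented here.

```python
# Minimal sketch: load the PEFT adapter on top of the (gated) base model.
# Assumes standard PEFT adapter files in this repo; the prompt below is
# illustrative, not the actual training template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_wic_789_1760637922"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# WiC is a binary word-in-context task: decide whether a target word is
# used with the same sense in two sentences.
prompt = (
    'Do the two sentences use the word "bank" in the same sense?\n'
    "1. She sat on the bank of the river.\n"
    "2. He deposited the check at the bank.\n"
    "Answer (yes or no):"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```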

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 789
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
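
For orientation, the list above maps onto Hugging Face TrainingArguments roughly as sketched below. This assumes the standard Trainer API was used; unlisted settings (precision, gradient accumulation, logging) are left at their defaults, since the actual training script is not included.

```python
# Hedged reconstruction of the reported hyperparameters as TrainingArguments.
# output_dir is a hypothetical placeholder; only the values listed in the
# card above are reproduced here.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_wic_789_1760637922",
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=789,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```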

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:--------------|:------|:------|:----------------|:------------------|
| 0.2279        | 1.0   | 1222  | 0.3245          | 421768            |
| 0.3553        | 2.0   | 2444  | 0.2967          | 843296            |
| 0.3132        | 3.0   | 3666  | 0.2887          | 1265072           |
| 0.2892        | 4.0   | 4888  | 0.2801          | 1687136           |
| 0.2072        | 5.0   | 6110  | 0.2847          | 2108680           |
| 0.3586        | 6.0   | 7332  | 0.2875          | 2530168           |
| 0.1343        | 7.0   | 8554  | 0.2790          | 2951208           |
| 0.1772        | 8.0   | 9776  | 0.2779          | 3372504           |
| 0.1823        | 9.0   | 10998 | 0.2828          | 3793768           |
| 0.4928        | 10.0  | 12220 | 0.2950          | 4214928           |
| 0.2494        | 11.0  | 13442 | 0.2860          | 4636520           |
| 0.179         | 12.0  | 14664 | 0.2868          | 5057560           |
| 0.296         | 13.0  | 15886 | 0.2865          | 5479248           |
| 0.1626        | 14.0  | 17108 | 0.2902          | 5901056           |
| 0.2611        | 15.0  | 18330 | 0.2939          | 6323016           |
| 0.2375        | 16.0  | 19552 | 0.2906          | 6744792           |
| 0.4483        | 17.0  | 20774 | 0.2937          | 7165960           |
| 0.3013        | 18.0  | 21996 | 0.2948          | 7587872           |
| 0.3182        | 19.0  | 23218 | 0.2945          | 8009040           |
| 0.0466        | 20.0  | 24440 | 0.2953          | 8431032           |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
