lemexp-task1-v2-template_small_nodefs-Llama-3.2-1B-8lr-24epochs

This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1427

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0008
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • total_train_batch_size: 16
  • total_eval_batch_size: 16
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 24
  • mixed_precision_training: Native AMP
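
As a rough guide, the hyperparameters above map onto a transformers.TrainingArguments configuration as sketched below. This is an illustrative reconstruction rather than the actual training script: the output directory is hypothetical, the model/dataset/Trainer wiring is omitted, and the per-device batch size of 2 across 8 GPUs accounts for the reported total batch size of 16.

```python
# Illustrative sketch only: the listed hyperparameters expressed as
# transformers.TrainingArguments. Model, dataset, and Trainer wiring omitted.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="lemexp-task1-v2-template_small_nodefs-Llama-3.2-1B-8lr-24epochs",  # hypothetical
    learning_rate=8e-4,             # 0.0008
    per_device_train_batch_size=2,  # x 8 GPUs = total train batch size 16
    per_device_eval_batch_size=2,   # x 8 GPUs = total eval batch size 16
    seed=42,
    num_train_epochs=24,
    lr_scheduler_type="linear",
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    fp16=True,                      # Native AMP mixed-precision training
)
```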

Training results

| Training Loss | Epoch   | Step  | Validation Loss |
|:-------------:|:-------:|:-----:|:---------------:|
| 0.324         | 0.4001  | 1440  | 0.3013          |
| 0.2795        | 0.8002  | 2880  | 0.2821          |
| 0.2577        | 1.2003  | 4320  | 0.2624          |
| 0.2509        | 1.6004  | 5760  | 0.2646          |
| 0.2453        | 2.0006  | 7200  | 0.2526          |
| 0.2364        | 2.4007  | 8640  | 0.2445          |
| 0.2318        | 2.8008  | 10080 | 0.2400          |
| 0.2243        | 3.2009  | 11520 | 0.2371          |
| 0.2218        | 3.6010  | 12960 | 0.2302          |
| 0.2179        | 4.0011  | 14400 | 0.2276          |
| 0.2139        | 4.4012  | 15840 | 0.2283          |
| 0.2138        | 4.8013  | 17280 | 0.2308          |
| 0.2048        | 5.2014  | 18720 | 0.2284          |
| 0.2046        | 5.6016  | 20160 | 0.2326          |
| 0.202         | 6.0017  | 21600 | 0.2073          |
| 0.1995        | 6.4018  | 23040 | 0.2234          |
| 0.1961        | 6.8019  | 24480 | 0.2126          |
| 0.1889        | 7.2020  | 25920 | 0.2105          |
| 0.1894        | 7.6021  | 27360 | 0.2030          |
| 0.1903        | 8.0022  | 28800 | 0.2038          |
| 0.1805        | 8.4023  | 30240 | 0.2017          |
| 0.1793        | 8.8024  | 31680 | 0.1982          |
| 0.176         | 9.2026  | 33120 | 0.1923          |
| 0.1756        | 9.6027  | 34560 | 0.1913          |
| 0.1718        | 10.0028 | 36000 | 0.1857          |
| 0.1669        | 10.4029 | 37440 | 0.1869          |
| 0.1672        | 10.8030 | 38880 | 0.1876          |
| 0.1586        | 11.2031 | 40320 | 0.1832          |
| 0.1604        | 11.6032 | 41760 | 0.1814          |
| 0.161         | 12.0033 | 43200 | 0.1755          |
| 0.1511        | 12.4034 | 44640 | 0.1779          |
| 0.1538        | 12.8036 | 46080 | 0.1710          |
| 0.1452        | 13.2037 | 47520 | 0.1745          |
| 0.1465        | 13.6038 | 48960 | 0.1708          |
| 0.1466        | 14.0039 | 50400 | 0.1675          |
| 0.1389        | 14.4040 | 51840 | 0.1675          |
| 0.1395        | 14.8041 | 53280 | 0.1668          |
| 0.129         | 15.2042 | 54720 | 0.1626          |
| 0.132         | 15.6043 | 56160 | 0.1613          |
| 0.1333        | 16.0044 | 57600 | 0.1642          |
| 0.1256        | 16.4046 | 59040 | 0.1572          |
| 0.1262        | 16.8047 | 60480 | 0.1540          |
| 0.1181        | 17.2048 | 61920 | 0.1577          |
| 0.1173        | 17.6049 | 63360 | 0.1532          |
| 0.1191        | 18.0050 | 64800 | 0.1537          |
| 0.1085        | 18.4051 | 66240 | 0.1500          |
| 0.1119        | 18.8052 | 67680 | 0.1515          |
| 0.1006        | 19.2053 | 69120 | 0.1514          |
| 0.1024        | 19.6054 | 70560 | 0.1501          |
| 0.1019        | 20.0056 | 72000 | 0.1474          |
| 0.0947        | 20.4057 | 73440 | 0.1466          |
| 0.0949        | 20.8058 | 74880 | 0.1433          |
| 0.0893        | 21.2059 | 76320 | 0.1446          |
| 0.0874        | 21.6060 | 77760 | 0.1429          |
| 0.0865        | 22.0061 | 79200 | 0.1411          |
| 0.0793        | 22.4062 | 80640 | 0.1451          |
| 0.0801        | 22.8063 | 82080 | 0.1425          |
| 0.0731        | 23.2064 | 83520 | 0.1432          |
| 0.0738        | 23.6066 | 84960 | 0.1427          |

Framework versions

  • PEFT 0.14.0
  • Transformers 4.47.0
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0
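
This checkpoint is a PEFT adapter rather than a full set of model weights, so it is loaded on top of the base model. Below is a minimal usage sketch, assuming the adapter is published under the repo id in the title (yalhessi/lemexp-task1-v2-template_small_nodefs-Llama-3.2-1B-8lr-24epochs) and that you have access to the meta-llama/Llama-3.2-1B base model; the prompt is a placeholder, since the task-specific input format is not documented in this card.

```python
# Minimal sketch, assuming this repo hosts a PEFT adapter for meta-llama/Llama-3.2-1B.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Llama-3.2-1B"
adapter_id = "yalhessi/lemexp-task1-v2-template_small_nodefs-Llama-3.2-1B-8lr-24epochs"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(base, adapter_id)  # apply the fine-tuned adapter
model.eval()

# Placeholder prompt; replace with the template the adapter was trained on.
inputs = tokenizer("lemma :", return_tensors="pt")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```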