lemexp-task1-v2-template_small_nodefs-Llama-3.2-1B-4lr-24epochs

This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1457

Model description

More information needed

Intended uses & limitations

More information needed
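
The framework versions below indicate this repository is a PEFT adapter trained on top of meta-llama/Llama-3.2-1B. As a minimal, non-authoritative sketch (the card itself does not include usage code), an adapter like this is typically loaded with transformers and peft roughly as follows; the repo id is taken from the model title and the exact intended usage may differ:

```python
# Sketch only (not from the model card): load the PEFT adapter on top of the base model.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.2-1B"
adapter_id = "yalhessi/lemexp-task1-v2-template_small_nodefs-Llama-3.2-1B-4lr-24epochs"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base_model, adapter_id)  # attach the fine-tuned adapter

inputs = tokenizer("Example prompt", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```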

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 0.0004
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • total_train_batch_size: 16
  • total_eval_batch_size: 16
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 24
  • mixed_precision_training: Native AMP
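
For reference, the snippet below is an illustrative sketch of how these hyperparameters might map onto transformers.TrainingArguments. The actual training script is not part of this card, so names such as the output directory are assumptions.

```python
# Sketch only: approximate mapping of the listed hyperparameters onto TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="lemexp-task1-v2-template_small_nodefs-Llama-3.2-1B-4lr-24epochs",  # assumed name
    learning_rate=4e-4,             # learning_rate: 0.0004
    per_device_train_batch_size=2,  # train_batch_size: 2 (x 8 GPUs = 16 total)
    per_device_eval_batch_size=2,   # eval_batch_size: 2 (x 8 GPUs = 16 total)
    seed=42,
    num_train_epochs=24,
    lr_scheduler_type="linear",
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    fp16=True,                      # mixed_precision_training: Native AMP
)
```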

Training results

| Training Loss | Epoch   | Step  | Validation Loss |
|:-------------:|:-------:|:-----:|:---------------:|
| 0.3043        | 0.4001  | 1440  | 0.2784          |
| 0.2499        | 0.8002  | 2880  | 0.2477          |
| 0.2217        | 1.2003  | 4320  | 0.2500          |
| 0.212         | 1.6004  | 5760  | 0.2242          |
| 0.204         | 2.0006  | 7200  | 0.2163          |
| 0.1934        | 2.4007  | 8640  | 0.2097          |
| 0.1907        | 2.8008  | 10080 | 0.1983          |
| 0.1791        | 3.2009  | 11520 | 0.1943          |
| 0.1771        | 3.6010  | 12960 | 0.1917          |
| 0.176         | 4.0011  | 14400 | 0.1886          |
| 0.1688        | 4.4012  | 15840 | 0.1918          |
| 0.1688        | 4.8013  | 17280 | 0.1820          |
| 0.1593        | 5.2014  | 18720 | 0.1825          |
| 0.1599        | 5.6016  | 20160 | 0.1843          |
| 0.1573        | 6.0017  | 21600 | 0.1765          |
| 0.1551        | 6.4018  | 23040 | 0.1774          |
| 0.1526        | 6.8019  | 24480 | 0.1734          |
| 0.1445        | 7.2020  | 25920 | 0.1763          |
| 0.1467        | 7.6021  | 27360 | 0.1782          |
| 0.1487        | 8.0022  | 28800 | 0.1711          |
| 0.1383        | 8.4023  | 30240 | 0.1756          |
| 0.1389        | 8.8024  | 31680 | 0.1660          |
| 0.1337        | 9.2026  | 33120 | 0.1637          |
| 0.1344        | 9.6027  | 34560 | 0.1637          |
| 0.1323        | 10.0028 | 36000 | 0.1611          |
| 0.1261        | 10.4029 | 37440 | 0.1630          |
| 0.1278        | 10.8030 | 38880 | 0.1620          |
| 0.12          | 11.2031 | 40320 | 0.1613          |
| 0.1221        | 11.6032 | 41760 | 0.1598          |
| 0.125         | 12.0033 | 43200 | 0.1580          |
| 0.1156        | 12.4034 | 44640 | 0.1599          |
| 0.1187        | 12.8036 | 46080 | 0.1540          |
| 0.1087        | 13.2037 | 47520 | 0.1546          |
| 0.1103        | 13.6038 | 48960 | 0.1567          |
| 0.1131        | 14.0039 | 50400 | 0.1557          |
| 0.1055        | 14.4040 | 51840 | 0.1524          |
| 0.1071        | 14.8041 | 53280 | 0.1504          |
| 0.0975        | 15.2042 | 54720 | 0.1499          |
| 0.1014        | 15.6043 | 56160 | 0.1468          |
| 0.1028        | 16.0044 | 57600 | 0.1501          |
| 0.0964        | 16.4046 | 59040 | 0.1509          |
| 0.0977        | 16.8047 | 60480 | 0.1451          |
| 0.0908        | 17.2048 | 61920 | 0.1481          |
| 0.0895        | 17.6049 | 63360 | 0.1466          |
| 0.0922        | 18.0050 | 64800 | 0.1446          |
| 0.0843        | 18.4051 | 66240 | 0.1448          |
| 0.0883        | 18.8052 | 67680 | 0.1442          |
| 0.0792        | 19.2053 | 69120 | 0.1471          |
| 0.0809        | 19.6054 | 70560 | 0.1464          |
| 0.0813        | 20.0056 | 72000 | 0.1423          |
| 0.0749        | 20.4057 | 73440 | 0.1451          |
| 0.0758        | 20.8058 | 74880 | 0.1428          |
| 0.0723        | 21.2059 | 76320 | 0.1488          |
| 0.0709        | 21.6060 | 77760 | 0.1458          |
| 0.0708        | 22.0061 | 79200 | 0.1441          |
| 0.0656        | 22.4062 | 80640 | 0.1471          |
| 0.0666        | 22.8063 | 82080 | 0.1450          |
| 0.0613        | 23.2064 | 83520 | 0.1475          |
| 0.0632        | 23.6066 | 84960 | 0.1457          |

Framework versions

  • PEFT 0.14.0
  • Transformers 4.47.0
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0