Llama-3.1-8B-Instruct-EI1-20ep-sft

This model is a fine-tuned version of meta-llama/Llama-3.1-8B-Instruct on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6633

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 6e-06
  • train_batch_size: 1
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 32
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 64
  • total_eval_batch_size: 256
  • optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • num_epochs: 20.0

Training results

Training Loss Epoch Step Validation Loss
No log 0.0562 100 0.5980
No log 0.1124 200 0.5611
No log 0.1685 300 0.5370
No log 0.2247 400 0.5158
0.5584 0.2809 500 0.4963
0.5584 0.3371 600 0.4806
0.5584 0.3933 700 0.4665
0.5584 0.4494 800 0.4543
0.5584 0.5056 900 0.4461
0.4498 0.5618 1000 0.4387
0.4498 0.6180 1100 0.4335
0.4498 0.6742 1200 0.4304
0.4498 0.7303 1300 0.4259
0.4498 0.7865 1400 0.4223
0.4174 0.8427 1500 0.4196
0.4174 0.8989 1600 0.4176
0.4174 0.9551 1700 0.4162
0.4174 1.0112 1800 0.4185
0.4174 1.0674 1900 0.4167
0.3894 1.1236 2000 0.4174
0.3894 1.1798 2100 0.4158
0.3894 1.2360 2200 0.4157
0.3894 1.2921 2300 0.4153
0.3894 1.3483 2400 0.4136
0.3666 1.4045 2500 0.4137
0.3666 1.4607 2600 0.4127
0.3666 1.5169 2700 0.4119
0.3666 1.5730 2800 0.4119
0.3666 1.6292 2900 0.4106
0.3687 1.6854 3000 0.4100
0.3687 1.7416 3100 0.4092
0.3687 1.7978 3200 0.4093
0.3687 1.8539 3300 0.4087
0.3687 1.9101 3400 0.4088
0.3682 1.9663 3500 0.4071
0.3682 2.0225 3600 0.4212
0.3682 2.0787 3700 0.4227
0.3682 2.1348 3800 0.4225
0.3682 2.1910 3900 0.4223
0.3202 2.2472 4000 0.4230
0.3202 2.3034 4100 0.4219
0.3202 2.3596 4200 0.4207
0.3202 2.4157 4300 0.4222
0.3202 2.4719 4400 0.4208
0.319 2.5281 4500 0.4207
0.319 2.5843 4600 0.4193
0.319 2.6404 4700 0.4218
0.319 2.6966 4800 0.4190
0.319 2.7528 4900 0.4193
0.3228 2.8090 5000 0.4200
0.3228 2.8652 5100 0.4190
0.3228 2.9213 5200 0.4198
0.3228 2.9775 5300 0.4188
0.3228 3.0337 5400 0.4582
0.3008 3.0899 5500 0.4559
0.3008 3.1461 5600 0.4573
0.3008 3.2022 5700 0.4597
0.3008 3.2584 5800 0.4616
0.3008 3.3146 5900 0.4591
0.2546 3.3708 6000 0.4575
0.2546 3.4270 6100 0.4589
0.2546 3.4831 6200 0.4578
0.2546 3.5393 6300 0.4588
0.2546 3.5955 6400 0.4577
0.2609 3.6517 6500 0.4553
0.2609 3.7079 6600 0.4567
0.2609 3.7640 6700 0.4564
0.2609 3.8202 6800 0.4536
0.2609 3.8764 6900 0.4558
0.2658 3.9326 7000 0.4534
0.2658 3.9888 7100 0.4561
0.2658 4.0449 7200 0.5239
0.2658 4.1011 7300 0.5263
0.2658 4.1573 7400 0.5315
0.207 4.2135 7500 0.5257
0.207 4.2697 7600 0.5211
0.207 4.3258 7700 0.5196
0.207 4.3820 7800 0.5264
0.207 4.4382 7900 0.5233
0.1916 4.4944 8000 0.5193
0.1916 4.5506 8100 0.5194
0.1916 4.6067 8200 0.5243
0.1916 4.6629 8300 0.5273
0.1916 4.7191 8400 0.5238
0.1966 4.7753 8500 0.5151
0.1966 4.8315 8600 0.5231
0.1966 4.8876 8700 0.5261
0.1966 4.9438 8800 0.5132
0.1966 5.0 8900 0.5170
0.1864 5.0562 9000 0.6101
0.1864 5.1124 9100 0.6017
0.1864 5.1685 9200 0.6207
0.1864 5.2247 9300 0.6162
0.1864 5.2809 9400 0.6106
0.1288 5.3371 9500 0.6041
0.1288 5.3933 9600 0.6113
0.1288 5.4494 9700 0.5965
0.1288 5.5056 9800 0.6054
0.1288 5.5618 9900 0.6042
0.1345 5.6180 10000 0.6132
0.1345 5.6742 10100 0.6155
0.1345 5.7303 10200 0.6126
0.1345 5.7865 10300 0.6092
0.1345 5.8427 10400 0.6130
0.1389 5.8989 10500 0.6022
0.1389 5.9551 10600 0.6034
0.1389 6.0112 10700 0.7321
0.1389 6.0674 10800 0.7067
0.1389 6.1236 10900 0.7035
0.102 6.1798 11000 0.7239
0.102 6.2360 11100 0.6955
0.102 6.2921 11200 0.7161
0.102 6.3483 11300 0.7145
0.102 6.4045 11400 0.7022
0.0851 6.4607 11500 0.7051
0.0851 6.5169 11600 0.7117
0.0851 6.5730 11700 0.7097
0.0851 6.6292 11800 0.7052
0.0851 6.6854 11900 0.7141
0.0889 6.7416 12000 0.7040
0.0889 6.7978 12100 0.6996
0.0889 6.8539 12200 0.7079
0.0889 6.9101 12300 0.7042
0.0889 6.9663 12400 0.7041
0.0882 7.0225 12500 0.8056
0.0882 7.0787 12600 0.7964
0.0882 7.1348 12700 0.8166
0.0882 7.1910 12800 0.7870
0.0882 7.2472 12900 0.7947
0.051 7.3034 13000 0.8039
0.051 7.3596 13100 0.8114
0.051 7.4157 13200 0.8023
0.051 7.4719 13300 0.7966
0.051 7.5281 13400 0.8093
0.0548 7.5843 13500 0.8105
0.0548 7.6404 13600 0.8086
0.0548 7.6966 13700 0.8034
0.0548 7.7528 13800 0.8062
0.0548 7.8090 13900 0.8047
0.0573 7.8652 14000 0.8120
0.0573 7.9213 14100 0.8071
0.0573 7.9775 14200 0.8157
0.0573 8.0337 14300 0.9198
0.0573 8.0899 14400 0.8966
0.0445 8.1461 14500 0.8894
0.0445 8.2022 14600 0.8998
0.0445 8.2584 14700 0.9129
0.0445 8.3146 14800 0.8927
0.0445 8.3708 14900 0.9086
0.0338 8.4270 15000 0.8949
0.0338 8.4831 15100 0.8939
0.0338 8.5393 15200 0.9140
0.0338 8.5955 15300 0.9153
0.0338 8.6517 15400 0.9109
0.0357 8.7079 15500 0.8986
0.0357 8.7640 15600 0.9024
0.0357 8.8202 15700 0.9103
0.0357 8.8764 15800 0.9079
0.0357 8.9326 15900 0.8986
0.0367 8.9888 16000 0.9032
0.0367 9.0449 16100 0.9997
0.0367 9.1011 16200 0.9937
0.0367 9.1573 16300 0.9871
0.0367 9.2135 16400 1.0131
0.0218 9.2697 16500 0.9899
0.0218 9.3258 16600 0.9927
0.0218 9.3820 16700 0.9814
0.0218 9.4382 16800 0.9863
0.0218 9.4944 16900 0.9788
0.023 9.5506 17000 0.9984
0.023 9.6067 17100 0.9826
0.023 9.6629 17200 0.9918
0.023 9.7191 17300 0.9791
0.023 9.7753 17400 0.9865
0.0239 9.8315 17500 0.9847
0.0239 9.8876 17600 0.9927
0.0239 9.9438 17700 0.9830
0.0239 10.0 17800 0.9828
0.0239 10.0562 17900 1.0673
0.0201 10.1124 18000 1.0707
0.0201 10.1685 18100 1.0607
0.0201 10.2247 18200 1.0597
0.0201 10.2809 18300 1.0609
0.0201 10.3371 18400 1.0578
0.0155 10.3933 18500 1.0559
0.0155 10.4494 18600 1.0686
0.0155 10.5056 18700 1.0563
0.0155 10.5618 18800 1.0568
0.0155 10.6180 18900 1.0559
0.0164 10.6742 19000 1.0602
0.0164 10.7303 19100 1.0516
0.0164 10.7865 19200 1.0678
0.0164 10.8427 19300 1.0675
0.0164 10.8989 19400 1.0661
0.0168 10.9551 19500 1.0672
0.0168 11.0112 19600 1.1573
0.0168 11.0674 19700 1.1417
0.0168 11.1236 19800 1.1281
0.0168 11.1798 19900 1.1526
0.0119 11.2360 20000 1.1403
0.0119 11.2921 20100 1.1421
0.0119 11.3483 20200 1.1262
0.0119 11.4045 20300 1.1282
0.0119 11.4607 20400 1.1331
0.0118 11.5169 20500 1.1305
0.0118 11.5730 20600 1.1328
0.0118 11.6292 20700 1.1477
0.0118 11.6854 20800 1.1300
0.0118 11.7416 20900 1.1331
0.0123 11.7978 21000 1.1293
0.0123 11.8539 21100 1.1361
0.0123 11.9101 21200 1.1398
0.0123 11.9663 21300 1.1356
0.0123 12.0225 21400 1.1983
0.0112 12.0787 21500 1.2027
0.0112 12.1348 21600 1.2202
0.0112 12.1910 21700 1.2014
0.0112 12.2472 21800 1.2050
0.0112 12.3034 21900 1.2062
0.0089 12.3596 22000 1.1954
0.0089 12.4157 22100 1.2088
0.0089 12.4719 22200 1.1850
0.0089 12.5281 22300 1.1852
0.0089 12.5843 22400 1.1928
0.0093 12.6404 22500 1.2075
0.0093 12.6966 22600 1.2002
0.0093 12.7528 22700 1.2016
0.0093 12.8090 22800 1.2004
0.0093 12.8652 22900 1.2044
0.0096 12.9213 23000 1.2016
0.0096 12.9775 23100 1.2074
0.0096 13.0337 23200 1.2598
0.0096 13.0899 23300 1.2811
0.0096 13.1461 23400 1.2721
0.0076 13.2022 23500 1.2587
0.0076 13.2584 23600 1.2612
0.0076 13.3146 23700 1.2798
0.0076 13.3708 23800 1.2744
0.0076 13.4270 23900 1.2729
0.0073 13.4831 24000 1.2650
0.0073 13.5393 24100 1.2714
0.0073 13.5955 24200 1.2492
0.0073 13.6517 24300 1.2649
0.0073 13.7079 24400 1.2716
0.0076 13.7640 24500 1.2730
0.0076 13.8202 24600 1.2645
0.0076 13.8764 24700 1.2671
0.0076 13.9326 24800 1.2744
0.0076 13.9888 24900 1.2682
0.0074 14.0449 25000 1.3352
0.0074 14.1011 25100 1.3497
0.0074 14.1573 25200 1.3443
0.0074 14.2135 25300 1.3336
0.0074 14.2697 25400 1.3409
0.006 14.3258 25500 1.3308
0.006 14.3820 25600 1.3369
0.006 14.4382 25700 1.3428
0.006 14.4944 25800 1.3338
0.006 14.5506 25900 1.3404
0.0064 14.6067 26000 1.3460
0.0064 14.6629 26100 1.3435
0.0064 14.7191 26200 1.3457
0.0064 14.7753 26300 1.3494
0.0064 14.8315 26400 1.3507
0.0064 14.8876 26500 1.3577
0.0064 14.9438 26600 1.3359
0.0064 15.0 26700 1.3506
0.0064 15.0562 26800 1.4156
0.0064 15.1124 26900 1.4040
0.0057 15.1685 27000 1.4108
0.0057 15.2247 27100 1.4026
0.0057 15.2809 27200 1.4221
0.0057 15.3371 27300 1.4160
0.0057 15.3933 27400 1.4161
0.0054 15.4494 27500 1.4347
0.0054 15.5056 27600 1.4196
0.0054 15.5618 27700 1.4249
0.0054 15.6180 27800 1.4280
0.0054 15.6742 27900 1.4200
0.0056 15.7303 28000 1.4192
0.0056 15.7865 28100 1.4202
0.0056 15.8427 28200 1.4284
0.0056 15.8989 28300 1.4307
0.0056 15.9551 28400 1.4259
0.0056 16.0112 28500 1.4654
0.0056 16.0674 28600 1.4845
0.0056 16.1236 28700 1.4806
0.0056 16.1798 28800 1.4831
0.0056 16.2360 28900 1.4913
0.0047 16.2921 29000 1.5039
0.0047 16.3483 29100 1.4987
0.0047 16.4045 29200 1.5099
0.0047 16.4607 29300 1.4961
0.0047 16.5169 29400 1.4937
0.005 16.5730 29500 1.4875
0.005 16.6292 29600 1.4967
0.005 16.6854 29700 1.4991
0.005 16.7416 29800 1.5045
0.005 16.7978 29900 1.5078
0.0051 16.8539 30000 1.5007
0.0051 16.9101 30100 1.5067
0.0051 16.9663 30200 1.5082
0.0051 17.0225 30300 1.5553
0.0051 17.0787 30400 1.5679
0.0048 17.1348 30500 1.5638
0.0048 17.1910 30600 1.5660
0.0048 17.2472 30700 1.5699
0.0048 17.3034 30800 1.5750
0.0048 17.3596 30900 1.5751
0.0046 17.4157 31000 1.5774
0.0046 17.4719 31100 1.5797
0.0046 17.5281 31200 1.5793
0.0046 17.5843 31300 1.5804
0.0046 17.6404 31400 1.5801
0.0047 17.6966 31500 1.5828
0.0047 17.7528 31600 1.5791
0.0047 17.8090 31700 1.5865
0.0047 17.8652 31800 1.5907
0.0047 17.9213 31900 1.5855
0.0048 17.9775 32000 1.5867
0.0048 18.0337 32100 1.6185
0.0048 18.0899 32200 1.6252
0.0048 18.1461 32300 1.6293
0.0048 18.2022 32400 1.6332
0.0043 18.2584 32500 1.6315
0.0043 18.3146 32600 1.6371
0.0043 18.3708 32700 1.6388
0.0043 18.4270 32800 1.6405
0.0043 18.4831 32900 1.6431
0.0043 18.5393 33000 1.6423
0.0043 18.5955 33100 1.6411
0.0043 18.6517 33200 1.6418
0.0043 18.7079 33300 1.6443
0.0043 18.7640 33400 1.6423
0.0044 18.8202 33500 1.6415
0.0044 18.8764 33600 1.6426
0.0044 18.9326 33700 1.6420
0.0044 18.9888 33800 1.6422
0.0044 19.0449 33900 1.6532
0.0043 19.1011 34000 1.6594
0.0043 19.1573 34100 1.6612
0.0043 19.2135 34200 1.6618
0.0043 19.2697 34300 1.6626
0.0043 19.3258 34400 1.6628
0.0042 19.3820 34500 1.6637
0.0042 19.4382 34600 1.6635
0.0042 19.4944 34700 1.6633
0.0042 19.5506 34800 1.6634
0.0042 19.6067 34900 1.6634
0.0042 19.6629 35000 1.6632
0.0042 19.7191 35100 1.6635
0.0042 19.7753 35200 1.6632
0.0042 19.8315 35300 1.6634
0.0042 19.8876 35400 1.6632
0.0042 19.9438 35500 1.6633
0.0042 20.0 35600 1.6633

Framework versions

  • Transformers 4.43.4
  • Pytorch 2.4.0+cu121
  • Datasets 3.0.1
  • Tokenizers 0.19.1
Downloads last month
1
Safetensors
Model size
8B params
Tensor type
F32
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for qfq/Llama-3.1-8B-Instruct-EI1-20ep-sft

Finetuned
(2584)
this model