Llama-3.1-8B-Instruct-EI1-20ep-sft
This model is a fine-tuned version of meta-llama/Llama-3.1-8B-Instruct on an unknown dataset. It achieves the following results on the evaluation set:
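- Loss: 1.6633 (final checkpoint, step 35600, epoch 20.0; the lowest validation loss in the training results table is 0.4071 at step 3500)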
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 6e-06
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 32
- gradient_accumulation_steps: 2
- total_train_batch_size: 64
- total_eval_batch_size: 256
- optimizer: Adam with betas=(0.9,0.95) and epsilon=1e-08
- lr_scheduler_type: cosine
- num_epochs: 20.0
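The batch-size and scheduler entries above can be checked with a little arithmetic. The following is a minimal sketch, not the training code: `total_steps` is taken from the last row of the training results table, the `cosine_lr` helper is hypothetical, and `warmup_steps=0` is an assumption because the card does not list any warmup setting.

```python
import math

# Values copied from the hyperparameter list in this card.
train_batch_size = 1             # per-device batch size
num_devices = 32
gradient_accumulation_steps = 2
learning_rate = 6e-06
num_epochs = 20
total_steps = 35600              # last step in the training results table

# Effective batch size per optimizer step.
total_train_batch_size = (
    train_batch_size * num_devices * gradient_accumulation_steps
)

# Optimizer steps per epoch implied by the table.
steps_per_epoch = total_steps // num_epochs


def cosine_lr(step, peak_lr=learning_rate, max_steps=total_steps, warmup_steps=0):
    """Half-cosine decay from peak_lr to 0 (hypothetical helper).

    warmup_steps=0 is an assumption; the card does not report warmup.
    """
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, max_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(total_train_batch_size)   # matches total_train_batch_size in the list
print(steps_per_epoch)
print(cosine_lr(0), cosine_lr(total_steps // 2), cosine_lr(total_steps))
```

Under these assumptions the learning rate starts at 6e-06, is about half that at the midpoint of training, and decays to roughly zero by the final step, which would be consistent with the nearly flat validation loss over the last epoch.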
Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| No log | 0.0562 | 100 | 0.5980 |
| No log | 0.1124 | 200 | 0.5611 |
| No log | 0.1685 | 300 | 0.5370 |
| No log | 0.2247 | 400 | 0.5158 |
| 0.5584 | 0.2809 | 500 | 0.4963 |
| 0.5584 | 0.3371 | 600 | 0.4806 |
| 0.5584 | 0.3933 | 700 | 0.4665 |
| 0.5584 | 0.4494 | 800 | 0.4543 |
| 0.5584 | 0.5056 | 900 | 0.4461 |
| 0.4498 | 0.5618 | 1000 | 0.4387 |
| 0.4498 | 0.6180 | 1100 | 0.4335 |
| 0.4498 | 0.6742 | 1200 | 0.4304 |
| 0.4498 | 0.7303 | 1300 | 0.4259 |
| 0.4498 | 0.7865 | 1400 | 0.4223 |
| 0.4174 | 0.8427 | 1500 | 0.4196 |
| 0.4174 | 0.8989 | 1600 | 0.4176 |
| 0.4174 | 0.9551 | 1700 | 0.4162 |
| 0.4174 | 1.0112 | 1800 | 0.4185 |
| 0.4174 | 1.0674 | 1900 | 0.4167 |
| 0.3894 | 1.1236 | 2000 | 0.4174 |
| 0.3894 | 1.1798 | 2100 | 0.4158 |
| 0.3894 | 1.2360 | 2200 | 0.4157 |
| 0.3894 | 1.2921 | 2300 | 0.4153 |
| 0.3894 | 1.3483 | 2400 | 0.4136 |
| 0.3666 | 1.4045 | 2500 | 0.4137 |
| 0.3666 | 1.4607 | 2600 | 0.4127 |
| 0.3666 | 1.5169 | 2700 | 0.4119 |
| 0.3666 | 1.5730 | 2800 | 0.4119 |
| 0.3666 | 1.6292 | 2900 | 0.4106 |
| 0.3687 | 1.6854 | 3000 | 0.4100 |
| 0.3687 | 1.7416 | 3100 | 0.4092 |
| 0.3687 | 1.7978 | 3200 | 0.4093 |
| 0.3687 | 1.8539 | 3300 | 0.4087 |
| 0.3687 | 1.9101 | 3400 | 0.4088 |
| 0.3682 | 1.9663 | 3500 | 0.4071 |
| 0.3682 | 2.0225 | 3600 | 0.4212 |
| 0.3682 | 2.0787 | 3700 | 0.4227 |
| 0.3682 | 2.1348 | 3800 | 0.4225 |
| 0.3682 | 2.1910 | 3900 | 0.4223 |
| 0.3202 | 2.2472 | 4000 | 0.4230 |
| 0.3202 | 2.3034 | 4100 | 0.4219 |
| 0.3202 | 2.3596 | 4200 | 0.4207 |
| 0.3202 | 2.4157 | 4300 | 0.4222 |
| 0.3202 | 2.4719 | 4400 | 0.4208 |
| 0.319 | 2.5281 | 4500 | 0.4207 |
| 0.319 | 2.5843 | 4600 | 0.4193 |
| 0.319 | 2.6404 | 4700 | 0.4218 |
| 0.319 | 2.6966 | 4800 | 0.4190 |
| 0.319 | 2.7528 | 4900 | 0.4193 |
| 0.3228 | 2.8090 | 5000 | 0.4200 |
| 0.3228 | 2.8652 | 5100 | 0.4190 |
| 0.3228 | 2.9213 | 5200 | 0.4198 |
| 0.3228 | 2.9775 | 5300 | 0.4188 |
| 0.3228 | 3.0337 | 5400 | 0.4582 |
| 0.3008 | 3.0899 | 5500 | 0.4559 |
| 0.3008 | 3.1461 | 5600 | 0.4573 |
| 0.3008 | 3.2022 | 5700 | 0.4597 |
| 0.3008 | 3.2584 | 5800 | 0.4616 |
| 0.3008 | 3.3146 | 5900 | 0.4591 |
| 0.2546 | 3.3708 | 6000 | 0.4575 |
| 0.2546 | 3.4270 | 6100 | 0.4589 |
| 0.2546 | 3.4831 | 6200 | 0.4578 |
| 0.2546 | 3.5393 | 6300 | 0.4588 |
| 0.2546 | 3.5955 | 6400 | 0.4577 |
| 0.2609 | 3.6517 | 6500 | 0.4553 |
| 0.2609 | 3.7079 | 6600 | 0.4567 |
| 0.2609 | 3.7640 | 6700 | 0.4564 |
| 0.2609 | 3.8202 | 6800 | 0.4536 |
| 0.2609 | 3.8764 | 6900 | 0.4558 |
| 0.2658 | 3.9326 | 7000 | 0.4534 |
| 0.2658 | 3.9888 | 7100 | 0.4561 |
| 0.2658 | 4.0449 | 7200 | 0.5239 |
| 0.2658 | 4.1011 | 7300 | 0.5263 |
| 0.2658 | 4.1573 | 7400 | 0.5315 |
| 0.207 | 4.2135 | 7500 | 0.5257 |
| 0.207 | 4.2697 | 7600 | 0.5211 |
| 0.207 | 4.3258 | 7700 | 0.5196 |
| 0.207 | 4.3820 | 7800 | 0.5264 |
| 0.207 | 4.4382 | 7900 | 0.5233 |
| 0.1916 | 4.4944 | 8000 | 0.5193 |
| 0.1916 | 4.5506 | 8100 | 0.5194 |
| 0.1916 | 4.6067 | 8200 | 0.5243 |
| 0.1916 | 4.6629 | 8300 | 0.5273 |
| 0.1916 | 4.7191 | 8400 | 0.5238 |
| 0.1966 | 4.7753 | 8500 | 0.5151 |
| 0.1966 | 4.8315 | 8600 | 0.5231 |
| 0.1966 | 4.8876 | 8700 | 0.5261 |
| 0.1966 | 4.9438 | 8800 | 0.5132 |
| 0.1966 | 5.0 | 8900 | 0.5170 |
| 0.1864 | 5.0562 | 9000 | 0.6101 |
| 0.1864 | 5.1124 | 9100 | 0.6017 |
| 0.1864 | 5.1685 | 9200 | 0.6207 |
| 0.1864 | 5.2247 | 9300 | 0.6162 |
| 0.1864 | 5.2809 | 9400 | 0.6106 |
| 0.1288 | 5.3371 | 9500 | 0.6041 |
| 0.1288 | 5.3933 | 9600 | 0.6113 |
| 0.1288 | 5.4494 | 9700 | 0.5965 |
| 0.1288 | 5.5056 | 9800 | 0.6054 |
| 0.1288 | 5.5618 | 9900 | 0.6042 |
| 0.1345 | 5.6180 | 10000 | 0.6132 |
| 0.1345 | 5.6742 | 10100 | 0.6155 |
| 0.1345 | 5.7303 | 10200 | 0.6126 |
| 0.1345 | 5.7865 | 10300 | 0.6092 |
| 0.1345 | 5.8427 | 10400 | 0.6130 |
| 0.1389 | 5.8989 | 10500 | 0.6022 |
| 0.1389 | 5.9551 | 10600 | 0.6034 |
| 0.1389 | 6.0112 | 10700 | 0.7321 |
| 0.1389 | 6.0674 | 10800 | 0.7067 |
| 0.1389 | 6.1236 | 10900 | 0.7035 |
| 0.102 | 6.1798 | 11000 | 0.7239 |
| 0.102 | 6.2360 | 11100 | 0.6955 |
| 0.102 | 6.2921 | 11200 | 0.7161 |
| 0.102 | 6.3483 | 11300 | 0.7145 |
| 0.102 | 6.4045 | 11400 | 0.7022 |
| 0.0851 | 6.4607 | 11500 | 0.7051 |
| 0.0851 | 6.5169 | 11600 | 0.7117 |
| 0.0851 | 6.5730 | 11700 | 0.7097 |
| 0.0851 | 6.6292 | 11800 | 0.7052 |
| 0.0851 | 6.6854 | 11900 | 0.7141 |
| 0.0889 | 6.7416 | 12000 | 0.7040 |
| 0.0889 | 6.7978 | 12100 | 0.6996 |
| 0.0889 | 6.8539 | 12200 | 0.7079 |
| 0.0889 | 6.9101 | 12300 | 0.7042 |
| 0.0889 | 6.9663 | 12400 | 0.7041 |
| 0.0882 | 7.0225 | 12500 | 0.8056 |
| 0.0882 | 7.0787 | 12600 | 0.7964 |
| 0.0882 | 7.1348 | 12700 | 0.8166 |
| 0.0882 | 7.1910 | 12800 | 0.7870 |
| 0.0882 | 7.2472 | 12900 | 0.7947 |
| 0.051 | 7.3034 | 13000 | 0.8039 |
| 0.051 | 7.3596 | 13100 | 0.8114 |
| 0.051 | 7.4157 | 13200 | 0.8023 |
| 0.051 | 7.4719 | 13300 | 0.7966 |
| 0.051 | 7.5281 | 13400 | 0.8093 |
| 0.0548 | 7.5843 | 13500 | 0.8105 |
| 0.0548 | 7.6404 | 13600 | 0.8086 |
| 0.0548 | 7.6966 | 13700 | 0.8034 |
| 0.0548 | 7.7528 | 13800 | 0.8062 |
| 0.0548 | 7.8090 | 13900 | 0.8047 |
| 0.0573 | 7.8652 | 14000 | 0.8120 |
| 0.0573 | 7.9213 | 14100 | 0.8071 |
| 0.0573 | 7.9775 | 14200 | 0.8157 |
| 0.0573 | 8.0337 | 14300 | 0.9198 |
| 0.0573 | 8.0899 | 14400 | 0.8966 |
| 0.0445 | 8.1461 | 14500 | 0.8894 |
| 0.0445 | 8.2022 | 14600 | 0.8998 |
| 0.0445 | 8.2584 | 14700 | 0.9129 |
| 0.0445 | 8.3146 | 14800 | 0.8927 |
| 0.0445 | 8.3708 | 14900 | 0.9086 |
| 0.0338 | 8.4270 | 15000 | 0.8949 |
| 0.0338 | 8.4831 | 15100 | 0.8939 |
| 0.0338 | 8.5393 | 15200 | 0.9140 |
| 0.0338 | 8.5955 | 15300 | 0.9153 |
| 0.0338 | 8.6517 | 15400 | 0.9109 |
| 0.0357 | 8.7079 | 15500 | 0.8986 |
| 0.0357 | 8.7640 | 15600 | 0.9024 |
| 0.0357 | 8.8202 | 15700 | 0.9103 |
| 0.0357 | 8.8764 | 15800 | 0.9079 |
| 0.0357 | 8.9326 | 15900 | 0.8986 |
| 0.0367 | 8.9888 | 16000 | 0.9032 |
| 0.0367 | 9.0449 | 16100 | 0.9997 |
| 0.0367 | 9.1011 | 16200 | 0.9937 |
| 0.0367 | 9.1573 | 16300 | 0.9871 |
| 0.0367 | 9.2135 | 16400 | 1.0131 |
| 0.0218 | 9.2697 | 16500 | 0.9899 |
| 0.0218 | 9.3258 | 16600 | 0.9927 |
| 0.0218 | 9.3820 | 16700 | 0.9814 |
| 0.0218 | 9.4382 | 16800 | 0.9863 |
| 0.0218 | 9.4944 | 16900 | 0.9788 |
| 0.023 | 9.5506 | 17000 | 0.9984 |
| 0.023 | 9.6067 | 17100 | 0.9826 |
| 0.023 | 9.6629 | 17200 | 0.9918 |
| 0.023 | 9.7191 | 17300 | 0.9791 |
| 0.023 | 9.7753 | 17400 | 0.9865 |
| 0.0239 | 9.8315 | 17500 | 0.9847 |
| 0.0239 | 9.8876 | 17600 | 0.9927 |
| 0.0239 | 9.9438 | 17700 | 0.9830 |
| 0.0239 | 10.0 | 17800 | 0.9828 |
| 0.0239 | 10.0562 | 17900 | 1.0673 |
| 0.0201 | 10.1124 | 18000 | 1.0707 |
| 0.0201 | 10.1685 | 18100 | 1.0607 |
| 0.0201 | 10.2247 | 18200 | 1.0597 |
| 0.0201 | 10.2809 | 18300 | 1.0609 |
| 0.0201 | 10.3371 | 18400 | 1.0578 |
| 0.0155 | 10.3933 | 18500 | 1.0559 |
| 0.0155 | 10.4494 | 18600 | 1.0686 |
| 0.0155 | 10.5056 | 18700 | 1.0563 |
| 0.0155 | 10.5618 | 18800 | 1.0568 |
| 0.0155 | 10.6180 | 18900 | 1.0559 |
| 0.0164 | 10.6742 | 19000 | 1.0602 |
| 0.0164 | 10.7303 | 19100 | 1.0516 |
| 0.0164 | 10.7865 | 19200 | 1.0678 |
| 0.0164 | 10.8427 | 19300 | 1.0675 |
| 0.0164 | 10.8989 | 19400 | 1.0661 |
| 0.0168 | 10.9551 | 19500 | 1.0672 |
| 0.0168 | 11.0112 | 19600 | 1.1573 |
| 0.0168 | 11.0674 | 19700 | 1.1417 |
| 0.0168 | 11.1236 | 19800 | 1.1281 |
| 0.0168 | 11.1798 | 19900 | 1.1526 |
| 0.0119 | 11.2360 | 20000 | 1.1403 |
| 0.0119 | 11.2921 | 20100 | 1.1421 |
| 0.0119 | 11.3483 | 20200 | 1.1262 |
| 0.0119 | 11.4045 | 20300 | 1.1282 |
| 0.0119 | 11.4607 | 20400 | 1.1331 |
| 0.0118 | 11.5169 | 20500 | 1.1305 |
| 0.0118 | 11.5730 | 20600 | 1.1328 |
| 0.0118 | 11.6292 | 20700 | 1.1477 |
| 0.0118 | 11.6854 | 20800 | 1.1300 |
| 0.0118 | 11.7416 | 20900 | 1.1331 |
| 0.0123 | 11.7978 | 21000 | 1.1293 |
| 0.0123 | 11.8539 | 21100 | 1.1361 |
| 0.0123 | 11.9101 | 21200 | 1.1398 |
| 0.0123 | 11.9663 | 21300 | 1.1356 |
| 0.0123 | 12.0225 | 21400 | 1.1983 |
| 0.0112 | 12.0787 | 21500 | 1.2027 |
| 0.0112 | 12.1348 | 21600 | 1.2202 |
| 0.0112 | 12.1910 | 21700 | 1.2014 |
| 0.0112 | 12.2472 | 21800 | 1.2050 |
| 0.0112 | 12.3034 | 21900 | 1.2062 |
| 0.0089 | 12.3596 | 22000 | 1.1954 |
| 0.0089 | 12.4157 | 22100 | 1.2088 |
| 0.0089 | 12.4719 | 22200 | 1.1850 |
| 0.0089 | 12.5281 | 22300 | 1.1852 |
| 0.0089 | 12.5843 | 22400 | 1.1928 |
| 0.0093 | 12.6404 | 22500 | 1.2075 |
| 0.0093 | 12.6966 | 22600 | 1.2002 |
| 0.0093 | 12.7528 | 22700 | 1.2016 |
| 0.0093 | 12.8090 | 22800 | 1.2004 |
| 0.0093 | 12.8652 | 22900 | 1.2044 |
| 0.0096 | 12.9213 | 23000 | 1.2016 |
| 0.0096 | 12.9775 | 23100 | 1.2074 |
| 0.0096 | 13.0337 | 23200 | 1.2598 |
| 0.0096 | 13.0899 | 23300 | 1.2811 |
| 0.0096 | 13.1461 | 23400 | 1.2721 |
| 0.0076 | 13.2022 | 23500 | 1.2587 |
| 0.0076 | 13.2584 | 23600 | 1.2612 |
| 0.0076 | 13.3146 | 23700 | 1.2798 |
| 0.0076 | 13.3708 | 23800 | 1.2744 |
| 0.0076 | 13.4270 | 23900 | 1.2729 |
| 0.0073 | 13.4831 | 24000 | 1.2650 |
| 0.0073 | 13.5393 | 24100 | 1.2714 |
| 0.0073 | 13.5955 | 24200 | 1.2492 |
| 0.0073 | 13.6517 | 24300 | 1.2649 |
| 0.0073 | 13.7079 | 24400 | 1.2716 |
| 0.0076 | 13.7640 | 24500 | 1.2730 |
| 0.0076 | 13.8202 | 24600 | 1.2645 |
| 0.0076 | 13.8764 | 24700 | 1.2671 |
| 0.0076 | 13.9326 | 24800 | 1.2744 |
| 0.0076 | 13.9888 | 24900 | 1.2682 |
| 0.0074 | 14.0449 | 25000 | 1.3352 |
| 0.0074 | 14.1011 | 25100 | 1.3497 |
| 0.0074 | 14.1573 | 25200 | 1.3443 |
| 0.0074 | 14.2135 | 25300 | 1.3336 |
| 0.0074 | 14.2697 | 25400 | 1.3409 |
| 0.006 | 14.3258 | 25500 | 1.3308 |
| 0.006 | 14.3820 | 25600 | 1.3369 |
| 0.006 | 14.4382 | 25700 | 1.3428 |
| 0.006 | 14.4944 | 25800 | 1.3338 |
| 0.006 | 14.5506 | 25900 | 1.3404 |
| 0.0064 | 14.6067 | 26000 | 1.3460 |
| 0.0064 | 14.6629 | 26100 | 1.3435 |
| 0.0064 | 14.7191 | 26200 | 1.3457 |
| 0.0064 | 14.7753 | 26300 | 1.3494 |
| 0.0064 | 14.8315 | 26400 | 1.3507 |
| 0.0064 | 14.8876 | 26500 | 1.3577 |
| 0.0064 | 14.9438 | 26600 | 1.3359 |
| 0.0064 | 15.0 | 26700 | 1.3506 |
| 0.0064 | 15.0562 | 26800 | 1.4156 |
| 0.0064 | 15.1124 | 26900 | 1.4040 |
| 0.0057 | 15.1685 | 27000 | 1.4108 |
| 0.0057 | 15.2247 | 27100 | 1.4026 |
| 0.0057 | 15.2809 | 27200 | 1.4221 |
| 0.0057 | 15.3371 | 27300 | 1.4160 |
| 0.0057 | 15.3933 | 27400 | 1.4161 |
| 0.0054 | 15.4494 | 27500 | 1.4347 |
| 0.0054 | 15.5056 | 27600 | 1.4196 |
| 0.0054 | 15.5618 | 27700 | 1.4249 |
| 0.0054 | 15.6180 | 27800 | 1.4280 |
| 0.0054 | 15.6742 | 27900 | 1.4200 |
| 0.0056 | 15.7303 | 28000 | 1.4192 |
| 0.0056 | 15.7865 | 28100 | 1.4202 |
| 0.0056 | 15.8427 | 28200 | 1.4284 |
| 0.0056 | 15.8989 | 28300 | 1.4307 |
| 0.0056 | 15.9551 | 28400 | 1.4259 |
| 0.0056 | 16.0112 | 28500 | 1.4654 |
| 0.0056 | 16.0674 | 28600 | 1.4845 |
| 0.0056 | 16.1236 | 28700 | 1.4806 |
| 0.0056 | 16.1798 | 28800 | 1.4831 |
| 0.0056 | 16.2360 | 28900 | 1.4913 |
| 0.0047 | 16.2921 | 29000 | 1.5039 |
| 0.0047 | 16.3483 | 29100 | 1.4987 |
| 0.0047 | 16.4045 | 29200 | 1.5099 |
| 0.0047 | 16.4607 | 29300 | 1.4961 |
| 0.0047 | 16.5169 | 29400 | 1.4937 |
| 0.005 | 16.5730 | 29500 | 1.4875 |
| 0.005 | 16.6292 | 29600 | 1.4967 |
| 0.005 | 16.6854 | 29700 | 1.4991 |
| 0.005 | 16.7416 | 29800 | 1.5045 |
| 0.005 | 16.7978 | 29900 | 1.5078 |
| 0.0051 | 16.8539 | 30000 | 1.5007 |
| 0.0051 | 16.9101 | 30100 | 1.5067 |
| 0.0051 | 16.9663 | 30200 | 1.5082 |
| 0.0051 | 17.0225 | 30300 | 1.5553 |
| 0.0051 | 17.0787 | 30400 | 1.5679 |
| 0.0048 | 17.1348 | 30500 | 1.5638 |
| 0.0048 | 17.1910 | 30600 | 1.5660 |
| 0.0048 | 17.2472 | 30700 | 1.5699 |
| 0.0048 | 17.3034 | 30800 | 1.5750 |
| 0.0048 | 17.3596 | 30900 | 1.5751 |
| 0.0046 | 17.4157 | 31000 | 1.5774 |
| 0.0046 | 17.4719 | 31100 | 1.5797 |
| 0.0046 | 17.5281 | 31200 | 1.5793 |
| 0.0046 | 17.5843 | 31300 | 1.5804 |
| 0.0046 | 17.6404 | 31400 | 1.5801 |
| 0.0047 | 17.6966 | 31500 | 1.5828 |
| 0.0047 | 17.7528 | 31600 | 1.5791 |
| 0.0047 | 17.8090 | 31700 | 1.5865 |
| 0.0047 | 17.8652 | 31800 | 1.5907 |
| 0.0047 | 17.9213 | 31900 | 1.5855 |
| 0.0048 | 17.9775 | 32000 | 1.5867 |
| 0.0048 | 18.0337 | 32100 | 1.6185 |
| 0.0048 | 18.0899 | 32200 | 1.6252 |
| 0.0048 | 18.1461 | 32300 | 1.6293 |
| 0.0048 | 18.2022 | 32400 | 1.6332 |
| 0.0043 | 18.2584 | 32500 | 1.6315 |
| 0.0043 | 18.3146 | 32600 | 1.6371 |
| 0.0043 | 18.3708 | 32700 | 1.6388 |
| 0.0043 | 18.4270 | 32800 | 1.6405 |
| 0.0043 | 18.4831 | 32900 | 1.6431 |
| 0.0043 | 18.5393 | 33000 | 1.6423 |
| 0.0043 | 18.5955 | 33100 | 1.6411 |
| 0.0043 | 18.6517 | 33200 | 1.6418 |
| 0.0043 | 18.7079 | 33300 | 1.6443 |
| 0.0043 | 18.7640 | 33400 | 1.6423 |
| 0.0044 | 18.8202 | 33500 | 1.6415 |
| 0.0044 | 18.8764 | 33600 | 1.6426 |
| 0.0044 | 18.9326 | 33700 | 1.6420 |
| 0.0044 | 18.9888 | 33800 | 1.6422 |
| 0.0044 | 19.0449 | 33900 | 1.6532 |
| 0.0043 | 19.1011 | 34000 | 1.6594 |
| 0.0043 | 19.1573 | 34100 | 1.6612 |
| 0.0043 | 19.2135 | 34200 | 1.6618 |
| 0.0043 | 19.2697 | 34300 | 1.6626 |
| 0.0043 | 19.3258 | 34400 | 1.6628 |
| 0.0042 | 19.3820 | 34500 | 1.6637 |
| 0.0042 | 19.4382 | 34600 | 1.6635 |
| 0.0042 | 19.4944 | 34700 | 1.6633 |
| 0.0042 | 19.5506 | 34800 | 1.6634 |
| 0.0042 | 19.6067 | 34900 | 1.6634 |
| 0.0042 | 19.6629 | 35000 | 1.6632 |
| 0.0042 | 19.7191 | 35100 | 1.6635 |
| 0.0042 | 19.7753 | 35200 | 1.6632 |
| 0.0042 | 19.8315 | 35300 | 1.6634 |
| 0.0042 | 19.8876 | 35400 | 1.6632 |
| 0.0042 | 19.9438 | 35500 | 1.6633 |
| 0.0042 | 20.0 | 35600 | 1.6633 |
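
In the table above, validation loss bottoms out at 0.4071 (step 3500, epoch 1.9663) and then rises to 1.6633 by step 35600 while training loss keeps falling, a pattern consistent with overfitting. The sketch below shows one way to locate the lowest-loss row programmatically. It parses only a handful of rows copied from the table; `parse_rows` is a hypothetical helper, not part of any library.

```python
# A few rows copied verbatim from the training results table.
rows_md = """
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| No log | 0.0562 | 100 | 0.5980 |
| 0.4174 | 0.9551 | 1700 | 0.4162 |
| 0.3682 | 1.9663 | 3500 | 0.4071 |
| 0.3228 | 2.9775 | 5300 | 0.4188 |
| 0.1966 | 5.0 | 8900 | 0.5170 |
| 0.0239 | 10.0 | 17800 | 0.9828 |
| 0.0042 | 20.0 | 35600 | 1.6633 |
"""


def parse_rows(markdown):
    """Parse pipe-delimited table rows into dicts, skipping non-data lines."""
    records = []
    for line in markdown.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 4:
            continue
        train_loss, epoch, step, val_loss = cells
        try:
            records.append({
                # "No log" means no training loss had been logged yet.
                "train_loss": None if train_loss == "No log" else float(train_loss),
                "epoch": float(epoch),
                "step": int(step),
                "val_loss": float(val_loss),
            })
        except ValueError:
            continue  # header or separator row
    return records


records = parse_rows(rows_md)
best = min(records, key=lambda r: r["val_loss"])
final = records[-1]
print("best:", best["step"], best["val_loss"])
print("final:", final["step"], final["val_loss"])
```

Running the same function over the full table should select the same step, since 0.4071 is the smallest value in the Validation Loss column.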
Framework versions
- Transformers 4.43.4
- Pytorch 2.4.0+cu121
- Datasets 3.0.1
- Tokenizers 0.19.1
Model tree for qfq/Llama-3.1-8B-Instruct-EI1-20ep-sft
- Base model: meta-llama/Llama-3.1-8B
- Finetuned: meta-llama/Llama-3.1-8B-Instruct