train_conala_789_1760637894

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the conala dataset. It achieves the following results on the evaluation set:

Loss: 1.2822
Num Input Tokens Seen: 3037136

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 4
eval_batch_size: 4
seed: 789
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 20
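
To make the scheduler settings concrete, the following is a minimal sketch of a cosine schedule with linear warmup, using the peak learning rate and warmup ratio listed above and the final step count (10720) from the training results. It assumes the usual form of such a schedule (linear warmup from zero, then cosine decay to zero). The card does not state the exact implementation, so the function name `lr_at` and the way the warmup length is rounded are illustrative assumptions.

```python
import math

# Values taken from the hyperparameter list and the last row of the results table.
PEAK_LR = 0.001
WARMUP_RATIO = 0.1
TOTAL_STEPS = 10720  # final step reported in the training results

# Assumption: the warmup length is the warmup ratio applied to the total step count.
WARMUP_STEPS = round(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at a given step: linear warmup, then cosine decay to zero."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, 536, 1072, 5360, 10720):
        print(f"step {step:>5}: lr = {lr_at(step):.6f}")
```

Under these assumptions the warmup spans the first 1072 steps, which matches the first two epochs of the run.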

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.6543	1.0	536	0.7184	152296
0.963	2.0	1072	0.6811	304440
0.6835	3.0	1608	0.6779	455928
0.6298	4.0	2144	0.6566	608072
0.6007	5.0	2680	0.6503	759296
0.4515	6.0	3216	0.6554	910984
0.354	7.0	3752	0.6558	1062816
0.5476	8.0	4288	0.6500	1214520
0.6967	9.0	4824	0.6582	1366480
0.4536	10.0	5360	0.6594	1518976
0.3455	11.0	5896	0.6764	1670320
0.3709	12.0	6432	0.6794	1822624
0.3378	13.0	6968	0.7134	1974336
0.3264	14.0	7504	0.7172	2126488
0.3556	15.0	8040	0.7683	2278280
0.2914	16.0	8576	0.7782	2430272
0.2339	17.0	9112	0.8055	2581848
0.2257	18.0	9648	0.8163	2733712
0.2782	19.0	10184	0.8254	2885208
0.2769	20.0	10720	0.8287	3037136
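
The table can be summarized with a few lines of code. The sketch below copies the epoch, training loss, and validation loss columns from the table and picks the epoch with the lowest validation loss. It also derives an upper bound on training examples per epoch from the 536 steps per epoch and the batch size of 4; that last figure assumes a single device and no gradient accumulation, neither of which is stated in this card.

```python
# (epoch, training loss, validation loss), copied from the results table.
RESULTS = [
    (1, 0.6543, 0.7184), (2, 0.963, 0.6811), (3, 0.6835, 0.6779),
    (4, 0.6298, 0.6566), (5, 0.6007, 0.6503), (6, 0.4515, 0.6554),
    (7, 0.354, 0.6558), (8, 0.5476, 0.6500), (9, 0.6967, 0.6582),
    (10, 0.4536, 0.6594), (11, 0.3455, 0.6764), (12, 0.3709, 0.6794),
    (13, 0.3378, 0.7134), (14, 0.3264, 0.7172), (15, 0.3556, 0.7683),
    (16, 0.2914, 0.7782), (17, 0.2339, 0.8055), (18, 0.2257, 0.8163),
    (19, 0.2782, 0.8254), (20, 0.2769, 0.8287),
]

STEPS_PER_EPOCH = 536   # step count at epoch 1.0 in the table
TRAIN_BATCH_SIZE = 4    # from the hyperparameter list

# Epoch with the lowest validation loss.
best_epoch, _, best_val_loss = min(RESULTS, key=lambda row: row[2])

# Gap between validation and training loss at the last epoch.
final_epoch, final_train_loss, final_val_loss = RESULTS[-1]
final_gap = round(final_val_loss - final_train_loss, 4)

# Assumption: one device, no gradient accumulation, so one step = one batch.
max_examples_per_epoch = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE

if __name__ == "__main__":
    print(f"best epoch: {best_epoch} (validation loss {best_val_loss:.4f})")
    print(f"final epoch {final_epoch}: validation - training loss = {final_gap}")
    print(f"at most {max_examples_per_epoch} training examples per epoch")
```

Validation loss is lowest at epoch 8 and rises steadily after epoch 10 while training loss keeps falling, so an earlier checkpoint may generalize better than the final one, if such a checkpoint was saved.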

Framework versions

PEFT 0.17.1
Transformers 4.51.3
Pytorch 2.9.0+cu128
Datasets 4.0.0
Tokenizers 0.21.4
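
Because PEFT appears in the framework list, the weights in this repository are presumably a PEFT adapter that is applied on top of the base model rather than a full checkpoint. The following is a minimal loading sketch under that assumption. The adapter type is not stated in this card, the helper name `load_model` is hypothetical, and access to the base model on the Hub may require accepting its license.

```python
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_ID = "rbelanec/train_conala_789_1760637894"


def load_model():
    """Load the base model and attach this repository's adapter (hypothetical helper)."""
    # Imported lazily so the identifiers above can be inspected
    # without the heavy dependencies installed.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base_model = AutoModelForCausalLM.from_pretrained(BASE_MODEL_ID, torch_dtype="auto")

    # Assumption: the repository holds adapter weights readable by PEFT.
    model = PeftModel.from_pretrained(base_model, ADAPTER_ID)
    model.eval()
    return model, tokenizer


if __name__ == "__main__":
    model, tokenizer = load_model()
    print(type(model).__name__)
```

Matching the library versions listed above is the safest way to reproduce the reported behavior; other versions may work but are untested here.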

Model tree for rbelanec/train_conala_789_1760637894

Base model: meta-llama/Meta-Llama-3-8B-Instruct
Adapter: this model
