train_conala_101112_1760638008

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the conala dataset. It achieves the following results on the evaluation set:

Loss: 0.5822
Num Input Tokens Seen: 3060208

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 101112
optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 20

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.5473	1.0	536	0.6247	153344
0.9979	2.0	1072	0.5822	306640
0.6446	3.0	1608	0.5855	459376
0.368	4.0	2144	0.6266	612008
0.3112	5.0	2680	0.6818	764936
0.2302	6.0	3216	0.7083	917624
0.1329	7.0	3752	0.8323	1070488
0.0625	8.0	4288	0.9386	1223384
0.0308	9.0	4824	1.0149	1376240
0.0301	10.0	5360	1.0951	1529640
0.0328	11.0	5896	1.1814	1682336
0.0357	12.0	6432	1.2307	1835928
0.0496	13.0	6968	1.2626	1989136
0.0238	14.0	7504	1.3118	2142632
0.0148	15.0	8040	1.3449	2295280
0.0008	16.0	8576	1.3851	2447904
0.0422	17.0	9112	1.4336	2600776
0.0054	18.0	9648	1.4696	2753536
0.0214	19.0	10184	1.4862	2906984
0.0101	20.0	10720	1.4930	3060208

Framework versions

PEFT 0.17.1
Transformers 4.51.3
Pytorch 2.9.0+cu128
Datasets 4.0.0
Tokenizers 0.21.4

Downloads last month: 171

Model tree for rbelanec/train_conala_101112_1760638008

Base model

meta-llama/Meta-Llama-3-8B-Instruct

Adapter

(2013)

this model

Evaluation results

Metadata error: specify a dataset to view leaderboard