train_wic_789_1760637922

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the wic dataset. It achieves the following results on the evaluation set:

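Loss: 0.2779 (the lowest validation loss in the training results table, reached at epoch 8)
Num Input Tokens Seen: 8431032 (cumulative total at the end of epoch 20)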
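
The card does not include usage instructions. The following is a minimal sketch of how an adapter like this one is typically loaded. It assumes the repository contains a PEFT adapter (PEFT is listed under Framework versions), that the repository id is the one shown in the model tree, and that you have access to the base model, which is gated on the Hub. The function name `load_model` is hypothetical.

```python
# Identifiers taken from this card; everything else is an illustrative assumption.
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_ID = "rbelanec/train_wic_789_1760637922"


def load_model(adapter_id: str = ADAPTER_ID, base_id: str = BASE_MODEL_ID):
    """Load the base model and attach the adapter weights (downloads both)."""
    # Imported lazily so this file can be imported without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto")
    # Wrap the base model with the fine-tuned adapter weights.
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return tokenizer, model
```

The prompt template used during fine-tuning is not documented in this card, so any input format you pass to the loaded model is a guess until the training data section is filled in.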

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 789
optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 20
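
To make the schedule concrete, here is a small sketch of the learning rate implied by these settings. It assumes linear warmup followed by a half-cosine decay to zero, which is the common meaning of a "cosine" scheduler; the card does not state the exact implementation. The total step count of 24440 is taken from the last row of the training results table, and `lr_at` is a hypothetical helper name.

```python
import math

PEAK_LR = 5e-05       # learning_rate
WARMUP_RATIO = 0.1    # lr_scheduler_warmup_ratio
TOTAL_STEPS = 24440   # final Step value in the training results table


def lr_at(step: int, total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate at a given optimizer step (assumed schedule shape)."""
    warmup_steps = math.ceil(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return PEAK_LR * step / max(1, warmup_steps)
    # Half-cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, 1222, 2444, 12220, 24440):
        print(step, lr_at(step))
```

Under these assumptions the warmup covers roughly the first two epochs, since one epoch is 1222 steps in the results table.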

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.2279	1.0	1222	0.3245	421768
0.3553	2.0	2444	0.2967	843296
0.3132	3.0	3666	0.2887	1265072
0.2892	4.0	4888	0.2801	1687136
0.2072	5.0	6110	0.2847	2108680
0.3586	6.0	7332	0.2875	2530168
0.1343	7.0	8554	0.2790	2951208
0.1772	8.0	9776	0.2779	3372504
0.1823	9.0	10998	0.2828	3793768
0.4928	10.0	12220	0.2950	4214928
0.2494	11.0	13442	0.2860	4636520
0.179	12.0	14664	0.2868	5057560
0.296	13.0	15886	0.2865	5479248
0.1626	14.0	17108	0.2902	5901056
0.2611	15.0	18330	0.2939	6323016
0.2375	16.0	19552	0.2906	6744792
0.4483	17.0	20774	0.2937	7165960
0.3013	18.0	21996	0.2948	7587872
0.3182	19.0	23218	0.2945	8009040
0.0466	20.0	24440	0.2953	8431032
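
The table can be checked programmatically. The sketch below copies the validation loss column and picks the epoch with the lowest value; the variable and function names are illustrative only.

```python
# (epoch, validation_loss) pairs copied from the table above.
VAL_LOSS = [
    (1, 0.3245), (2, 0.2967), (3, 0.2887), (4, 0.2801), (5, 0.2847),
    (6, 0.2875), (7, 0.2790), (8, 0.2779), (9, 0.2828), (10, 0.2950),
    (11, 0.2860), (12, 0.2868), (13, 0.2865), (14, 0.2902), (15, 0.2939),
    (16, 0.2906), (17, 0.2937), (18, 0.2948), (19, 0.2945), (20, 0.2953),
]


def best_epoch(rows):
    """Return the (epoch, loss) pair with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])


epoch, loss = best_epoch(VAL_LOSS)
final_epoch, final_loss = VAL_LOSS[-1]
print(f"best: epoch {epoch}, loss {loss}")
print(f"final: epoch {final_epoch}, loss {final_loss}")
```

The minimum, 0.2779 at epoch 8, matches the evaluation loss reported at the top of the card, while validation loss at epoch 20 is higher at 0.2953. Whether the published adapter weights correspond to the epoch 8 checkpoint is not stated in the card.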

Framework versions

PEFT 0.17.1
Transformers 4.51.3
Pytorch 2.9.0+cu128
Datasets 4.0.0
Tokenizers 0.21.4

Model tree for rbelanec/train_wic_789_1760637922

Base model: meta-llama/Meta-Llama-3-8B-Instruct
Adapter: this model