qa_documents_ft_val_raft

This model is a fine-tuned version of TheBloke/Mistral-7B-Instruct-v0.2-GPTQ on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 0.9265

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0002
train_batch_size: 4
eval_batch_size: 4
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 16
optimizer: Use OptimizerNames.PAGED_ADAMW_8BIT with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 2
num_epochs: 20
mixed_precision_training: Native AMP

Training results

Training Loss	Epoch	Step	Validation Loss
2.48	0.8889	6	1.9979
2.1592	1.8889	12	1.8025
1.952	2.8889	18	1.6321
1.7515	3.8889	24	1.4888
1.58	4.8889	30	1.3749
1.4693	5.8889	36	1.3157
1.3788	6.8889	42	1.2693
1.2924	7.8889	48	1.2205
1.2201	8.8889	54	1.1763
1.1347	9.8889	60	1.1319
1.0467	10.8889	66	1.0933
0.9925	11.8889	72	1.0581
0.9212	12.8889	78	1.0300
0.8719	13.8889	84	1.0036
0.832	14.8889	90	0.9772
0.7933	15.8889	96	0.9592
0.7562	16.8889	102	0.9458
0.725	17.8889	108	0.9356
0.7143	18.8889	114	0.9303
0.6098	19.8889	120	0.9265

Framework versions

PEFT 0.14.0
Transformers 4.50.3
Pytorch 2.1.0+cu121
Datasets 3.5.0
Tokenizers 0.21.1

Downloads last month: -

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for isaiasgutierrezcruz/qa_documents_ft_val_raft

Base model

mistralai/Mistral-7B-Instruct-v0.2

Quantized

TheBloke/Mistral-7B-Instruct-v0.2-GPTQ

Adapter

(438)

this model