ca8aebc8d2d06211b9317c1e9d5c74cd

This model is a fine-tuned version of Qwen/Qwen2.5-7B on the google/boolq dataset. At the last logged checkpoint (epoch 12, step 3528), it achieves the following results on the evaluation set:

  • Loss: 3.9882
  • Data Size: 1.0
  • Epoch Runtime: 734.1698
  • Accuracy: 0.6235
  • F1 Macro: 0.6163
  • Rouge1: 0.6238
  • Rouge2: 0.0
  • Rougel: 0.6235
  • Rougelsum: 0.6238
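
For readers unfamiliar with the two classification metrics above, the following is a minimal sketch of how accuracy and macro-averaged F1 are conventionally defined. It is not the evaluation code used for this model (the card does not include it); the function names and the toy labels are illustrative assumptions.

```python
# Sketch only: conventional definitions of accuracy and macro F1.
# The helper names and toy data are hypothetical, not taken from the card.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_macro(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class with no true or predicted members contributes 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy BoolQ-style answers (True = "yes", False = "no").
y_true = [True, True, True, False, False]
y_pred = [True, True, False, False, True]

acc = accuracy(y_true, y_pred)
f1 = f1_macro(y_true, y_pred)
print(f"accuracy={acc:.4f} f1_macro={f1:.4f}")
```

Because macro F1 weights both classes equally, it drops well below accuracy whenever a model favors one answer, which is why the two numbers are reported side by side.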

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: constant
  • num_epochs: 50
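
The total batch sizes listed above follow from the per-device values and the device count. The short sketch below reproduces that arithmetic and relates the step counts in the results table to epochs. The gradient accumulation value is an assumption (the card does not list it); the 294 steps per epoch and the final step 3528 are read from the results table.

```python
# Sketch only: batch-size and step arithmetic implied by the listed values.
per_device_train_batch_size = 8
per_device_eval_batch_size = 8
num_devices = 4
gradient_accumulation_steps = 1  # assumption: not stated in the card

total_train_batch_size = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)
total_eval_batch_size = per_device_eval_batch_size * num_devices

# Values read from the training results table.
steps_per_epoch = 294
last_logged_step = 3528
configured_epochs = 50

epochs_logged = last_logged_step // steps_per_epoch

print(total_train_batch_size, total_eval_batch_size)  # 32 32
print(f"{epochs_logged} of {configured_epochs} configured epochs were logged")
```

Note that num_epochs is set to 50, while the results table ends at epoch 12; the card does not say why the log stops there.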

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Accuracy | F1 Macro | Rouge1 | Rouge2 | Rougel | Rougelsum |
|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 8.0175 | 0 | 20.6655 | 0.5928 | 0.4152 | 0.5928 | 0.0 | 0.5919 | 0.5925 |
| No log | 1 | 294 | 18.6013 | 0.0078 | 27.0807 | 0.3781 | 0.2751 | 0.3781 | 0.0 | 0.3785 | 0.3781 |
| No log | 2 | 588 | 12.0732 | 0.0156 | 36.0150 | 0.6072 | 0.4104 | 0.6075 | 0.0 | 0.6068 | 0.6075 |
| No log | 3 | 882 | 21.2906 | 0.0312 | 57.1575 | 0.6216 | 0.3909 | 0.6216 | 0.0 | 0.6213 | 0.6216 |
| 0.5979 | 4 | 1176 | 4.9192 | 0.0625 | 82.7569 | 0.4038 | 0.3602 | 0.4035 | 0.0 | 0.4038 | 0.4038 |
| 0.2704 | 5 | 1470 | 4.7668 | 0.125 | 121.8431 | 0.5524 | 0.5049 | 0.5522 | 0.0 | 0.5530 | 0.5527 |
| 0.4559 | 6 | 1764 | 3.0337 | 0.25 | 202.6585 | 0.3952 | 0.3242 | 0.3952 | 0.0 | 0.3958 | 0.3955 |
| 2.8035 | 7 | 2058 | 2.7901 | 0.5 | 390.2641 | 0.6213 | 0.3832 | 0.6213 | 0.0 | 0.6207 | 0.6210 |
| 2.7666 | 8.0 | 2352 | 2.6635 | 1.0 | 736.9165 | 0.6213 | 0.3832 | 0.6213 | 0.0 | 0.6207 | 0.6210 |
| 2.7882 | 9.0 | 2646 | 2.6738 | 1.0 | 734.9518 | 0.6213 | 0.3832 | 0.6213 | 0.0 | 0.6207 | 0.6210 |
| 2.4314 | 10.0 | 2940 | 2.7440 | 1.0 | 734.0090 | 0.6492 | 0.4912 | 0.6492 | 0.0 | 0.6486 | 0.6492 |
| 1.596 | 11.0 | 3234 | 2.8961 | 1.0 | 728.1010 | 0.6618 | 0.6328 | 0.6618 | 0.0 | 0.6621 | 0.6615 |
| 0.8903 | 12.0 | 3528 | 3.9882 | 1.0 | 734.1698 | 0.6235 | 0.6163 | 0.6238 | 0.0 | 0.6235 | 0.6238 |
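
The per-epoch results can be inspected programmatically to see which checkpoint scores best under each metric. The sketch below hard-codes four columns from the results above (epoch, validation loss, accuracy, F1 macro) and is only a reading aid, not part of the training pipeline. It also checks one observation: for a binary task, a model that always predicts the majority class with accuracy p has macro F1 equal to p / (1 + p), and the identical values at epochs 7 to 9 are consistent with that case. That reading is an inference from the numbers, not something the card states.

```python
# Sketch only: values copied from the training results; names are illustrative.
# Columns: (epoch, validation_loss, accuracy, f1_macro)
rows = [
    (0, 8.0175, 0.5928, 0.4152),
    (1, 18.6013, 0.3781, 0.2751),
    (2, 12.0732, 0.6072, 0.4104),
    (3, 21.2906, 0.6216, 0.3909),
    (4, 4.9192, 0.4038, 0.3602),
    (5, 4.7668, 0.5524, 0.5049),
    (6, 3.0337, 0.3952, 0.3242),
    (7, 2.7901, 0.6213, 0.3832),
    (8, 2.6635, 0.6213, 0.3832),
    (9, 2.6738, 0.6213, 0.3832),
    (10, 2.7440, 0.6492, 0.4912),
    (11, 2.8961, 0.6618, 0.6328),
    (12, 3.9882, 0.6235, 0.6163),
]

best_by_loss = min(rows, key=lambda r: r[1])
best_by_accuracy = max(rows, key=lambda r: r[2])
best_by_f1 = max(rows, key=lambda r: r[3])

def constant_predictor_macro_f1(p):
    """Macro F1 of a binary classifier that always predicts the class
    whose share of the evaluation set is p (so its accuracy is p)."""
    f1_predicted_class = 2 * p / (1 + p)  # precision p, recall 1
    f1_other_class = 0.0                  # never predicted
    return (f1_predicted_class + f1_other_class) / 2

print("lowest validation loss: epoch", best_by_loss[0])
print("highest accuracy:       epoch", best_by_accuracy[0])
print("highest F1 macro:       epoch", best_by_f1[0])
print("constant-predictor F1 at p=0.6213:",
      round(constant_predictor_macro_f1(0.6213), 4))
```

By these numbers the lowest validation loss (epoch 8) and the highest accuracy and F1 macro (epoch 11) fall on different checkpoints, and neither is the final epoch 12 checkpoint whose metrics are reported at the top of the card.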

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1