99741d8a02311f0bb9052c9d125646d5
This model is a fine-tuned version of distilbert/distilbert-base-uncased-distilled-squad on the qqp subset of the nyu-mll/glue dataset. It achieves the following results on the evaluation set (these values match the last logged epoch, epoch 11, in the training results table):
- Loss: 0.3604
- Data Size: 1.0
- Epoch Runtime: 330.9103
- Accuracy: 0.8930
- F1 Macro: 0.8855
- Rouge1: 0.8931
- Rouge2: 0.0
- Rougel: 0.8930
- Rougelsum: 0.8931
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
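The total batch sizes listed above follow from the per-device sizes and the device count. A minimal sketch of that arithmetic, assuming no gradient accumulation (the card lists none, so `gradient_accumulation_steps = 1` is an assumption, and the variable names are illustrative):

```python
# Values copied from the hyperparameter list above.
per_device_train_batch_size = 8
per_device_eval_batch_size = 8
num_devices = 4

# Assumption: no gradient accumulation, since the card does not list any.
gradient_accumulation_steps = 1

# Effective batch size seen by one optimizer step across all devices.
total_train_batch_size = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)
total_eval_batch_size = per_device_eval_batch_size * num_devices

print(total_train_batch_size, total_eval_batch_size)
```

Both products come out to the 32 reported as total_train_batch_size and total_eval_batch_size.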
Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Accuracy | F1 Macro | Rouge1 | Rouge2 | Rougel | Rougelsum |
|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 0.6739 | 0 | 10.7956 | 0.6316 | 0.3886 | 0.6314 | 0.0 | 0.6316 | 0.6313 |
| 0.5682 | 1 | 11370 | 0.4679 | 0.0078 | 13.7434 | 0.7657 | 0.7573 | 0.7658 | 0.0 | 0.7657 | 0.7656 |
| 0.46 | 2 | 22740 | 0.4330 | 0.0156 | 15.7083 | 0.7974 | 0.7811 | 0.7975 | 0.0 | 0.7974 | 0.7974 |
| 0.4022 | 3 | 34110 | 0.3779 | 0.0312 | 20.5150 | 0.8242 | 0.8163 | 0.8244 | 0.0 | 0.8242 | 0.8242 |
| 0.3727 | 4 | 45480 | 0.3822 | 0.0625 | 29.1999 | 0.8384 | 0.8237 | 0.8384 | 0.0 | 0.8383 | 0.8383 |
| 0.3457 | 5 | 56850 | 0.3174 | 0.125 | 48.1525 | 0.8600 | 0.8514 | 0.8600 | 0.0 | 0.8600 | 0.8600 |
| 0.3067 | 6 | 68220 | 0.3041 | 0.25 | 82.3618 | 0.8658 | 0.8586 | 0.8658 | 0.0 | 0.8658 | 0.8657 |
| 0.2567 | 7 | 79590 | 0.2765 | 0.5 | 164.7543 | 0.8792 | 0.8721 | 0.8792 | 0.0 | 0.8792 | 0.8792 |
| 0.2553 | 8 | 90960 | 0.2829 | 1.0 | 319.7446 | 0.8874 | 0.8801 | 0.8874 | 0.0 | 0.8875 | 0.8875 |
| 0.219 | 9 | 102330 | 0.2857 | 1.0 | 329.9028 | 0.8910 | 0.8835 | 0.8910 | 0.0 | 0.8911 | 0.8911 |
| 0.1614 | 10 | 113700 | 0.2981 | 1.0 | 327.9676 | 0.8922 | 0.8847 | 0.8922 | 0.0 | 0.8922 | 0.8922 |
| 0.1165 | 11 | 125070 | 0.3604 | 1.0 | 330.9103 | 0.8930 | 0.8855 | 0.8931 | 0.0 | 0.8930 | 0.8931 |
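The table can be read programmatically to see where each metric peaks. The sketch below copies a few columns from the rows above and finds the epoch with the lowest validation loss and the epoch with the highest accuracy. It also checks whether the Data Size column is consistent with a doubling schedule that reaches 1.0 at epoch 8. That schedule is an inference from the logged numbers, not something the card states, and the tuple layout and variable names are illustrative.

```python
# (epoch, training_loss, validation_loss, data_size, accuracy, f1_macro),
# copied from the training results table. None stands for "No log".
rows = [
    (0, None, 0.6739, 0.0, 0.6316, 0.3886),
    (1, 0.5682, 0.4679, 0.0078, 0.7657, 0.7573),
    (2, 0.46, 0.4330, 0.0156, 0.7974, 0.7811),
    (3, 0.4022, 0.3779, 0.0312, 0.8242, 0.8163),
    (4, 0.3727, 0.3822, 0.0625, 0.8384, 0.8237),
    (5, 0.3457, 0.3174, 0.125, 0.8600, 0.8514),
    (6, 0.3067, 0.3041, 0.25, 0.8658, 0.8586),
    (7, 0.2567, 0.2765, 0.5, 0.8792, 0.8721),
    (8, 0.2553, 0.2829, 1.0, 0.8874, 0.8801),
    (9, 0.219, 0.2857, 1.0, 0.8910, 0.8835),
    (10, 0.1614, 0.2981, 1.0, 0.8922, 0.8847),
    (11, 0.1165, 0.3604, 1.0, 0.8930, 0.8855),
]

# Epoch with the lowest validation loss and epoch with the highest accuracy.
best_by_loss = min(rows, key=lambda r: r[2])
best_by_accuracy = max(rows, key=lambda r: r[4])

# Assumed schedule: the data fraction doubles each epoch, hitting 1.0 at epoch 8.
# Compare against the logged values, which are rounded to four decimals.
doubling_matches = all(
    abs(data_size - min(1.0, 2.0 ** (epoch - 8))) < 1e-4
    for epoch, _, _, data_size, _, _ in rows
    if epoch >= 1
)

print("lowest validation loss at epoch", best_by_loss[0])
print("highest accuracy at epoch", best_by_accuracy[0])
print("data size consistent with doubling:", doubling_matches)
```

On these rows, validation loss bottoms out at epoch 7 while training loss keeps falling and accuracy continues to rise slightly through epoch 11, so which checkpoint is preferable depends on whether loss or accuracy is the selection criterion.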
Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.3.0
- Tokenizers 0.22.1