# distilbert-classn-LinearAlg-finetuned-span-width
This model is a fine-tuned version of dslim/distilbert-NER on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.8048
- Accuracy: 0.7619
- F1: 0.7638
- Precision: 0.7732
- Recall: 0.7619
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 8
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 25
- mixed_precision_training: Native AMP
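Assuming the Hugging Face `Trainer` API (the usual setup for auto-generated cards like this one), the hyperparameters above correspond roughly to the configuration sketched below; this is a reconstruction, not the exact training script. Note that `total_train_batch_size` is derived rather than set directly:

```python
# Hypothetical reconstruction of the training configuration above,
# expressed as a plain dict (keys follow TrainingArguments naming).
config = {
    "learning_rate": 1e-05,
    "per_device_train_batch_size": 2,
    "per_device_eval_batch_size": 2,
    "seed": 42,
    "gradient_accumulation_steps": 4,
    "lr_scheduler_type": "linear",
    "warmup_steps": 500,
    "num_train_epochs": 25,
    "fp16": True,  # "Native AMP" mixed-precision training
}

# The effective (total) train batch size is the per-device batch size
# multiplied by the gradient accumulation steps (single device assumed).
total_train_batch_size = (
    config["per_device_train_batch_size"] * config["gradient_accumulation_steps"]
)
print(total_train_batch_size)  # 8, matching the value reported above
```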
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Precision | Recall |
|---|---|---|---|---|---|---|---|
| 9.5553 | 1.3562 | 50 | 2.4410 | 0.1111 | 0.0988 | 0.0991 | 0.1111 |
| 9.5082 | 2.7123 | 100 | 2.4145 | 0.1032 | 0.0946 | 0.1005 | 0.1032 |
| 9.2844 | 4.0548 | 150 | 2.3758 | 0.1429 | 0.1286 | 0.1283 | 0.1429 |
| 9.1664 | 5.4110 | 200 | 2.3287 | 0.1587 | 0.1491 | 0.1781 | 0.1587 |
| 8.8251 | 6.7671 | 250 | 2.2563 | 0.2460 | 0.2491 | 0.3626 | 0.2460 |
| 8.3414 | 8.1096 | 300 | 2.1525 | 0.3095 | 0.3109 | 0.3506 | 0.3095 |
| 7.6498 | 9.4658 | 350 | 2.0206 | 0.4048 | 0.4014 | 0.4357 | 0.4048 |
| 6.7787 | 10.8219 | 400 | 1.8072 | 0.5159 | 0.4967 | 0.5453 | 0.5159 |
| 5.5972 | 12.1644 | 450 | 1.5651 | 0.5952 | 0.5807 | 0.6248 | 0.5952 |
| 4.3206 | 13.5205 | 500 | 1.3148 | 0.6905 | 0.6886 | 0.7173 | 0.6905 |
| 3.1237 | 14.8767 | 550 | 1.1469 | 0.7063 | 0.7126 | 0.7461 | 0.7063 |
| 2.1003 | 16.2192 | 600 | 0.9970 | 0.7381 | 0.7431 | 0.7783 | 0.7381 |
| 1.5158 | 17.5753 | 650 | 0.9129 | 0.7698 | 0.7743 | 0.7961 | 0.7698 |
| 1.0663 | 18.9315 | 700 | 0.8501 | 0.7778 | 0.7833 | 0.8084 | 0.7778 |
| 0.7797 | 20.2740 | 750 | 0.7928 | 0.7698 | 0.7726 | 0.7927 | 0.7698 |
| 0.58 | 21.6301 | 800 | 0.7950 | 0.7619 | 0.7649 | 0.7765 | 0.7619 |
| 0.5137 | 22.9863 | 850 | 0.8031 | 0.7778 | 0.7796 | 0.7905 | 0.7778 |
| 0.4405 | 24.3288 | 900 | 0.8048 | 0.7619 | 0.7638 | 0.7732 | 0.7619 |
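As a rough sanity check (a reader-side estimate, not a figure reported with the model), the (step, epoch) pairs in the table imply the training-set size: with an effective batch of 8, about 37 optimizer steps per epoch corresponds to roughly 295 training examples.

```python
# Estimate optimizer steps per epoch from logged (step, epoch) pairs taken
# from the table above, then back out an approximate training-set size.
pairs = [(50, 1.3562), (100, 2.7123), (900, 24.3288)]  # (step, epoch)

steps_per_epoch = sum(step / epoch for step, epoch in pairs) / len(pairs)
effective_batch = 8  # train_batch_size 2 * gradient_accumulation_steps 4

approx_train_examples = steps_per_epoch * effective_batch
print(round(steps_per_epoch))        # 37 optimizer steps per epoch
print(round(approx_train_examples))  # ~295 training examples
```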
### Framework versions
- Transformers 4.48.2
- Pytorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0