Initial model push

- README.md +25 -40
- model.safetensors +1 -1
README.md CHANGED

@@ -20,11 +20,11 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [bert-base-chinese](https://huggingface.co/bert-base-chinese) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.
-- Accuracy: 0.
-- Precision: 0.
-- Recall: 0.
-- F1: 0.
+- Loss: 1.2399
+- Accuracy: 0.6873
+- Precision: 0.6966
+- Recall: 0.6873
+- F1: 0.6892
 
 ## Model description
 
@@ -44,10 +44,10 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
-- train_batch_size:
-- eval_batch_size:
+- train_batch_size: 16
+- eval_batch_size: 16
 - seed: 42
-- optimizer:
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 5
 
@@ -55,40 +55,25 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1 |
 |:-------------:|:------:|:----:|:---------------:|:--------:|:---------:|:------:|:------:|
-| 1.
-| 1.
-| 1.
-|
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.4493 | 2.5773 | 750 | 1.0542 | 0.6581 | 0.6698 | 0.6581 | 0.6545 |
-| 0.6027 | 2.7491 | 800 | 0.8682 | 0.6873 | 0.6913 | 0.6873 | 0.6881 |
-| 0.4366 | 2.9210 | 850 | 0.9622 | 0.6753 | 0.6837 | 0.6753 | 0.6700 |
-| 0.3124 | 3.0928 | 900 | 1.0760 | 0.6821 | 0.6877 | 0.6821 | 0.6769 |
-| 0.4476 | 3.2646 | 950 | 1.2845 | 0.6804 | 0.6852 | 0.6804 | 0.6794 |
-| 0.3383 | 3.4364 | 1000 | 1.4062 | 0.6598 | 0.6720 | 0.6598 | 0.6564 |
-| 0.2415 | 3.6082 | 1050 | 1.4028 | 0.6598 | 0.6697 | 0.6598 | 0.6552 |
-| 0.3712 | 3.7801 | 1100 | 1.3918 | 0.6770 | 0.6790 | 0.6770 | 0.6759 |
-| 0.2293 | 3.9519 | 1150 | 1.3776 | 0.6838 | 0.6871 | 0.6838 | 0.6841 |
-| 0.0608 | 4.1237 | 1200 | 1.4518 | 0.6856 | 0.6921 | 0.6856 | 0.6852 |
-| 0.1634 | 4.2955 | 1250 | 1.4920 | 0.6718 | 0.6910 | 0.6718 | 0.6761 |
-| 0.136 | 4.4674 | 1300 | 1.5883 | 0.6667 | 0.6726 | 0.6667 | 0.6660 |
-| 0.2907 | 4.6392 | 1350 | 1.6322 | 0.6718 | 0.6745 | 0.6718 | 0.6683 |
-| 0.0488 | 4.8110 | 1400 | 1.5561 | 0.6787 | 0.6842 | 0.6787 | 0.6797 |
-| 0.3063 | 4.9828 | 1450 | 1.5937 | 0.6684 | 0.6736 | 0.6684 | 0.6676 |
+| 1.1171 | 0.3425 | 50 | 1.0987 | 0.5962 | 0.6031 | 0.5962 | 0.5899 |
+| 1.1728 | 0.6849 | 100 | 1.0413 | 0.5636 | 0.6400 | 0.5636 | 0.5309 |
+| 1.0455 | 1.0274 | 150 | 1.0055 | 0.6254 | 0.6466 | 0.6254 | 0.6130 |
+| 0.7633 | 1.3699 | 200 | 1.0100 | 0.5928 | 0.6324 | 0.5928 | 0.5809 |
+| 0.7814 | 1.7123 | 250 | 0.9436 | 0.6340 | 0.6499 | 0.6340 | 0.6282 |
+| 0.6404 | 2.0548 | 300 | 0.8559 | 0.6529 | 0.6831 | 0.6529 | 0.6493 |
+| 0.3885 | 2.3973 | 350 | 0.9820 | 0.6718 | 0.6748 | 0.6718 | 0.6637 |
+| 0.463 | 2.7397 | 400 | 0.8935 | 0.6770 | 0.6767 | 0.6770 | 0.6764 |
+| 0.317 | 3.0822 | 450 | 1.0278 | 0.6856 | 0.6865 | 0.6856 | 0.6799 |
+| 0.3404 | 3.4247 | 500 | 1.0437 | 0.6924 | 0.7017 | 0.6924 | 0.6927 |
+| 0.2154 | 3.7671 | 550 | 1.1039 | 0.6838 | 0.6865 | 0.6838 | 0.6843 |
+| 0.2714 | 4.1096 | 600 | 1.1328 | 0.6976 | 0.7001 | 0.6976 | 0.6971 |
+| 0.2692 | 4.4521 | 650 | 1.2025 | 0.6821 | 0.7008 | 0.6821 | 0.6843 |
+| 0.1445 | 4.7945 | 700 | 1.2399 | 0.6873 | 0.6966 | 0.6873 | 0.6892 |
 
 
 ### Framework versions
 
-- Transformers 4.
-- Pytorch 2.
+- Transformers 4.44.2
+- Pytorch 2.4.1+cu121
 - Datasets 3.2.0
-- Tokenizers 0.
+- Tokenizers 0.19.1
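For context, here is a minimal sketch of how the hyperparameters in the updated card would map onto a Hugging Face `Trainer` run. Only the values visible in the diff (base checkpoint, learning rate, batch sizes, seed, scheduler, epochs, the 50-step evaluation interval) come from the card; the dataset, the number of labels, and the output directory are placeholders, not part of this commit.

```python
# Sketch only: reproduces the hyperparameters listed in the card above.
# num_labels and output_dir are assumptions; the card does not state them.
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    TrainingArguments,
)

base_checkpoint = "bert-base-chinese"  # base model named in the card
num_labels = 3                         # assumption: label count is not given

tokenizer = AutoTokenizer.from_pretrained(base_checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(
    base_checkpoint, num_labels=num_labels
)

training_args = TrainingArguments(
    output_dir="bert-base-chinese-finetuned",  # hypothetical output path
    learning_rate=5e-05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=5,
    eval_strategy="steps",  # the card logs validation metrics every 50 steps
    eval_steps=50,
)
# Trainer(model=model, args=training_args,
#         train_dataset=..., eval_dataset=...).train()
```

The Adam betas=(0.9,0.999) and epsilon=1e-08 listed in the card are the Trainer's default optimizer settings, so they are not set explicitly in this sketch.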
model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:fce0a614c7286adc576e255ec210a073b624aa98514c67fb6bd7b1dd30449b7a
 size 409106392
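The model.safetensors entry above is a Git LFS pointer: it stores the file's SHA-256 digest (`oid`) and size in bytes rather than the weights themselves. As a small sketch, assuming the weight file has been downloaded locally as `model.safetensors`, a download can be checked against that pointer like this:

```python
# Sketch only: verify a downloaded model.safetensors against the LFS pointer.
import hashlib
from pathlib import Path

EXPECTED_OID = "fce0a614c7286adc576e255ec210a073b624aa98514c67fb6bd7b1dd30449b7a"
EXPECTED_SIZE = 409106392  # bytes, from the pointer file

path = Path("model.safetensors")  # assumed local download path

digest = hashlib.sha256()
with path.open("rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):  # hash in 1 MiB chunks
        digest.update(chunk)

assert path.stat().st_size == EXPECTED_SIZE, "size mismatch"
assert digest.hexdigest() == EXPECTED_OID, "sha256 mismatch"
print("model.safetensors matches the LFS pointer")
```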