wsqstar committed on
Commit dee3175 · verified · 1 Parent(s): 1843499

Initial model push

Files changed (2):
  1. README.md +25 -40
  2. model.safetensors +1 -1
README.md CHANGED
@@ -20,11 +20,11 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [bert-base-chinese](https://huggingface.co/bert-base-chinese) on an unknown dataset.
 It achieves the following results on the evaluation set:
- - Loss: 1.5937
- - Accuracy: 0.6684
- - Precision: 0.6736
- - Recall: 0.6684
- - F1: 0.6676
 
 ## Model description
 
@@ -44,10 +44,10 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
- - train_batch_size: 8
- - eval_batch_size: 8
 - seed: 42
- - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - num_epochs: 5
 
@@ -55,40 +55,25 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1 |
 |:-------------:|:------:|:----:|:---------------:|:--------:|:---------:|:------:|:------:|
- | 1.3393 | 0.1718 | 50 | 1.2223 | 0.4175 | 0.3896 | 0.4175 | 0.3254 |
- | 1.1139 | 0.3436 | 100 | 1.1012 | 0.5206 | 0.5780 | 0.5206 | 0.4857 |
- | 1.0948 | 0.5155 | 150 | 1.1199 | 0.5344 | 0.6315 | 0.5344 | 0.4667 |
- | 1.1433 | 0.6873 | 200 | 1.0790 | 0.5464 | 0.6740 | 0.5464 | 0.5311 |
- | 0.9917 | 0.8591 | 250 | 0.9433 | 0.6134 | 0.6139 | 0.6134 | 0.6132 |
- | 0.8964 | 1.0309 | 300 | 0.9473 | 0.6031 | 0.6419 | 0.6031 | 0.5974 |
- | 0.7287 | 1.2027 | 350 | 0.9888 | 0.6168 | 0.6534 | 0.6168 | 0.6160 |
- | 0.8982 | 1.3746 | 400 | 0.9988 | 0.6357 | 0.6465 | 0.6357 | 0.6245 |
- | 0.8875 | 1.5464 | 450 | 0.9702 | 0.6478 | 0.6578 | 0.6478 | 0.6396 |
- | 0.7741 | 1.7182 | 500 | 0.8539 | 0.6581 | 0.6598 | 0.6581 | 0.6547 |
- | 0.6370 | 1.8900 | 550 | 0.8348 | 0.6701 | 0.6767 | 0.6701 | 0.6679 |
- | 0.6021 | 2.0619 | 600 | 0.9177 | 0.6581 | 0.6735 | 0.6581 | 0.6517 |
- | 0.4407 | 2.2337 | 650 | 1.0360 | 0.6667 | 0.6720 | 0.6667 | 0.6653 |
- | 0.4253 | 2.4055 | 700 | 1.1312 | 0.6701 | 0.6897 | 0.6701 | 0.6715 |
- | 0.4493 | 2.5773 | 750 | 1.0542 | 0.6581 | 0.6698 | 0.6581 | 0.6545 |
- | 0.6027 | 2.7491 | 800 | 0.8682 | 0.6873 | 0.6913 | 0.6873 | 0.6881 |
- | 0.4366 | 2.9210 | 850 | 0.9622 | 0.6753 | 0.6837 | 0.6753 | 0.6700 |
- | 0.3124 | 3.0928 | 900 | 1.0760 | 0.6821 | 0.6877 | 0.6821 | 0.6769 |
- | 0.4476 | 3.2646 | 950 | 1.2845 | 0.6804 | 0.6852 | 0.6804 | 0.6794 |
- | 0.3383 | 3.4364 | 1000 | 1.4062 | 0.6598 | 0.6720 | 0.6598 | 0.6564 |
- | 0.2415 | 3.6082 | 1050 | 1.4028 | 0.6598 | 0.6697 | 0.6598 | 0.6552 |
- | 0.3712 | 3.7801 | 1100 | 1.3918 | 0.6770 | 0.6790 | 0.6770 | 0.6759 |
- | 0.2293 | 3.9519 | 1150 | 1.3776 | 0.6838 | 0.6871 | 0.6838 | 0.6841 |
- | 0.0608 | 4.1237 | 1200 | 1.4518 | 0.6856 | 0.6921 | 0.6856 | 0.6852 |
- | 0.1634 | 4.2955 | 1250 | 1.4920 | 0.6718 | 0.6910 | 0.6718 | 0.6761 |
- | 0.1360 | 4.4674 | 1300 | 1.5883 | 0.6667 | 0.6726 | 0.6667 | 0.6660 |
- | 0.2907 | 4.6392 | 1350 | 1.6322 | 0.6718 | 0.6745 | 0.6718 | 0.6683 |
- | 0.0488 | 4.8110 | 1400 | 1.5561 | 0.6787 | 0.6842 | 0.6787 | 0.6797 |
- | 0.3063 | 4.9828 | 1450 | 1.5937 | 0.6684 | 0.6736 | 0.6684 | 0.6676 |
 
 
 ### Framework versions
 
- - Transformers 4.47.1
- - Pytorch 2.5.1+cu121
 - Datasets 3.2.0
- - Tokenizers 0.21.0
 
 
 This model is a fine-tuned version of [bert-base-chinese](https://huggingface.co/bert-base-chinese) on an unknown dataset.
 It achieves the following results on the evaluation set:
+ - Loss: 1.2399
+ - Accuracy: 0.6873
+ - Precision: 0.6966
+ - Recall: 0.6873
+ - F1: 0.6892
 
 ## Model description
 
 
 
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
+ - train_batch_size: 16
+ - eval_batch_size: 16
 - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 5
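The linear lr_scheduler_type listed above just interpolates the learning rate down to zero over the course of training. A minimal sketch of that schedule (the 730-step total and zero warmup are illustrative assumptions, not values stated in this card):

```python
def linear_lr(step, base_lr=5e-05, total_steps=730, warmup_steps=0):
    """Linear schedule: ramp up over warmup_steps, then decay linearly to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
```

At step 0 this returns the full 5e-05; at the final step it has decayed to 0.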
 
 
 
 | Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1 |
 |:-------------:|:------:|:----:|:---------------:|:--------:|:---------:|:------:|:------:|
+ | 1.1171 | 0.3425 | 50 | 1.0987 | 0.5962 | 0.6031 | 0.5962 | 0.5899 |
+ | 1.1728 | 0.6849 | 100 | 1.0413 | 0.5636 | 0.6400 | 0.5636 | 0.5309 |
+ | 1.0455 | 1.0274 | 150 | 1.0055 | 0.6254 | 0.6466 | 0.6254 | 0.6130 |
+ | 0.7633 | 1.3699 | 200 | 1.0100 | 0.5928 | 0.6324 | 0.5928 | 0.5809 |
+ | 0.7814 | 1.7123 | 250 | 0.9436 | 0.6340 | 0.6499 | 0.6340 | 0.6282 |
+ | 0.6404 | 2.0548 | 300 | 0.8559 | 0.6529 | 0.6831 | 0.6529 | 0.6493 |
+ | 0.3885 | 2.3973 | 350 | 0.9820 | 0.6718 | 0.6748 | 0.6718 | 0.6637 |
+ | 0.4630 | 2.7397 | 400 | 0.8935 | 0.6770 | 0.6767 | 0.6770 | 0.6764 |
+ | 0.3170 | 3.0822 | 450 | 1.0278 | 0.6856 | 0.6865 | 0.6856 | 0.6799 |
+ | 0.3404 | 3.4247 | 500 | 1.0437 | 0.6924 | 0.7017 | 0.6924 | 0.6927 |
+ | 0.2154 | 3.7671 | 550 | 1.1039 | 0.6838 | 0.6865 | 0.6838 | 0.6843 |
+ | 0.2714 | 4.1096 | 600 | 1.1328 | 0.6976 | 0.7001 | 0.6976 | 0.6971 |
+ | 0.2692 | 4.4521 | 650 | 1.2025 | 0.6821 | 0.7008 | 0.6821 | 0.6843 |
+ | 0.1445 | 4.7945 | 700 | 1.2399 | 0.6873 | 0.6966 | 0.6873 | 0.6892 |
 
 
 ### Framework versions
 
+ - Transformers 4.44.2
+ - Pytorch 2.4.1+cu121
 - Datasets 3.2.0
+ - Tokenizers 0.19.1
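In the results above, the Recall column always equals the Accuracy column, which is exactly what weighted-averaged recall reduces to. A hypothetical `compute_metrics` helper along these lines (assuming scikit-learn; this is a sketch, not the evaluation code actually used for this run):

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(y_true, y_pred):
    accuracy = accuracy_score(y_true, y_pred)
    # average="weighted" makes recall mathematically identical to accuracy,
    # matching the identical Accuracy and Recall columns in the table.
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="weighted", zero_division=0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Dummy 4-class labels for illustration (the card's real dataset is unknown)
m = compute_metrics([0, 1, 2, 3, 0, 1], [0, 1, 2, 0, 0, 2])
```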
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:a9e26d10ce8876725f393d87be25cca1e3b6f70921ca25ef998fab54d212c064
 size 409106392

 version https://git-lfs.github.com/spec/v1
+ oid sha256:fce0a614c7286adc576e255ec210a073b624aa98514c67fb6bd7b1dd30449b7a
 size 409106392
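The model.safetensors change swaps only the Git LFS pointer's oid; per the LFS pointer spec, that oid is the SHA-256 of the actual file contents. A small sketch for checking a downloaded weight file against the pointer (the path is illustrative):

```python
import hashlib

def lfs_sha256(path, chunk_size=1 << 20):
    """SHA-256 of a file's contents, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# After this commit, lfs_sha256("model.safetensors") on the downloaded file
# should match the new pointer oid (fce0a614...).
```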