2023-10-08 22:25:09,722 ----------------------------------------------------------------------------------------------------
2023-10-08 22:25:09,723 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): T5LayerNorm()
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): T5LayerNorm()
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-08 22:25:09,723 ----------------------------------------------------------------------------------------------------
2023-10-08 22:25:09,723 MultiCorpus: 966 train + 219 dev + 204 test sentences
- NER_HIPE_2022 Corpus: 966 train + 219 dev + 204 test sentences - /app/.flair/datasets/ner_hipe_2022/v2.1/ajmc/fr/with_doc_seperator
2023-10-08 22:25:09,723 ----------------------------------------------------------------------------------------------------
2023-10-08 22:25:09,723 Train: 966 sentences
2023-10-08 22:25:09,724 (train_with_dev=False, train_with_test=False)
2023-10-08 22:25:09,724 ----------------------------------------------------------------------------------------------------
2023-10-08 22:25:09,724 Training Params:
2023-10-08 22:25:09,724 - learning_rate: "0.00015"
2023-10-08 22:25:09,724 - mini_batch_size: "8"
2023-10-08 22:25:09,724 - max_epochs: "10"
2023-10-08 22:25:09,724 - shuffle: "True"
2023-10-08 22:25:09,724 ----------------------------------------------------------------------------------------------------
2023-10-08 22:25:09,724 Plugins:
2023-10-08 22:25:09,724 - TensorboardLogger
2023-10-08 22:25:09,724 - LinearScheduler | warmup_fraction: '0.1'
2023-10-08 22:25:09,724 ----------------------------------------------------------------------------------------------------
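The LinearScheduler plugin above, with warmup_fraction 0.1, ramps the learning rate linearly from 0 to the peak of 0.00015 over the first 10% of batches, then decays it linearly back to 0. A minimal pure-Python sketch of that shape (not Flair's actual implementation; the function name and exact step bookkeeping are illustrative, and logged values can differ by one step of the schedule):

```python
def linear_schedule_lr(step, total_steps, peak_lr, warmup_fraction=0.1):
    """Linear warmup to peak_lr over the first warmup_fraction of steps,
    then linear decay back to 0 (illustrative sketch, not Flair's code)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step <= warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# 121 batches/epoch x 10 epochs and peak lr 0.00015, as in this log
total = 121 * 10
for step in (12, 121, 605, 1210):
    print(step, f"{linear_schedule_lr(step, total, 0.00015):.6f}")
```

The lr column in the iteration lines below follows this shape: it climbs through epoch 1 (the first 121 of 1210 steps), peaks near 0.000148, and falls toward 0.000001 by the last batch of epoch 10.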
2023-10-08 22:25:09,724 Final evaluation on model from best epoch (best-model.pt)
2023-10-08 22:25:09,724 - metric: "('micro avg', 'f1-score')"
2023-10-08 22:25:09,724 ----------------------------------------------------------------------------------------------------
2023-10-08 22:25:09,724 Computation:
2023-10-08 22:25:09,724 - compute on device: cuda:0
2023-10-08 22:25:09,724 - embedding storage: none
2023-10-08 22:25:09,724 ----------------------------------------------------------------------------------------------------
2023-10-08 22:25:09,725 Model training base path: "hmbench-ajmc/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-4"
2023-10-08 22:25:09,725 ----------------------------------------------------------------------------------------------------
2023-10-08 22:25:09,725 ----------------------------------------------------------------------------------------------------
2023-10-08 22:25:09,725 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-08 22:25:19,051 epoch 1 - iter 12/121 - loss 3.24900965 - time (sec): 9.33 - samples/sec: 269.81 - lr: 0.000014 - momentum: 0.000000
2023-10-08 22:25:27,957 epoch 1 - iter 24/121 - loss 3.24426110 - time (sec): 18.23 - samples/sec: 264.00 - lr: 0.000029 - momentum: 0.000000
2023-10-08 22:25:37,626 epoch 1 - iter 36/121 - loss 3.23307878 - time (sec): 27.90 - samples/sec: 266.37 - lr: 0.000043 - momentum: 0.000000
2023-10-08 22:25:46,856 epoch 1 - iter 48/121 - loss 3.21571402 - time (sec): 37.13 - samples/sec: 265.17 - lr: 0.000058 - momentum: 0.000000
2023-10-08 22:25:56,529 epoch 1 - iter 60/121 - loss 3.18187185 - time (sec): 46.80 - samples/sec: 266.37 - lr: 0.000073 - momentum: 0.000000
2023-10-08 22:26:05,668 epoch 1 - iter 72/121 - loss 3.12935934 - time (sec): 55.94 - samples/sec: 266.33 - lr: 0.000088 - momentum: 0.000000
2023-10-08 22:26:14,694 epoch 1 - iter 84/121 - loss 3.06309912 - time (sec): 64.97 - samples/sec: 266.58 - lr: 0.000103 - momentum: 0.000000
2023-10-08 22:26:23,425 epoch 1 - iter 96/121 - loss 2.99009477 - time (sec): 73.70 - samples/sec: 266.46 - lr: 0.000118 - momentum: 0.000000
2023-10-08 22:26:32,586 epoch 1 - iter 108/121 - loss 2.90222709 - time (sec): 82.86 - samples/sec: 268.73 - lr: 0.000133 - momentum: 0.000000
2023-10-08 22:26:41,072 epoch 1 - iter 120/121 - loss 2.81749293 - time (sec): 91.35 - samples/sec: 269.79 - lr: 0.000148 - momentum: 0.000000
2023-10-08 22:26:41,534 ----------------------------------------------------------------------------------------------------
2023-10-08 22:26:41,534 EPOCH 1 done: loss 2.8139 - lr: 0.000148
2023-10-08 22:26:47,472 DEV : loss 1.8567742109298706 - f1-score (micro avg) 0.0
2023-10-08 22:26:47,478 ----------------------------------------------------------------------------------------------------
2023-10-08 22:26:56,120 epoch 2 - iter 12/121 - loss 1.82869066 - time (sec): 8.64 - samples/sec: 283.54 - lr: 0.000148 - momentum: 0.000000
2023-10-08 22:27:05,389 epoch 2 - iter 24/121 - loss 1.72583013 - time (sec): 17.91 - samples/sec: 287.45 - lr: 0.000147 - momentum: 0.000000
2023-10-08 22:27:13,493 epoch 2 - iter 36/121 - loss 1.64466489 - time (sec): 26.01 - samples/sec: 281.93 - lr: 0.000145 - momentum: 0.000000
2023-10-08 22:27:22,170 epoch 2 - iter 48/121 - loss 1.54713294 - time (sec): 34.69 - samples/sec: 286.02 - lr: 0.000144 - momentum: 0.000000
2023-10-08 22:27:30,697 epoch 2 - iter 60/121 - loss 1.46968568 - time (sec): 43.22 - samples/sec: 287.43 - lr: 0.000142 - momentum: 0.000000
2023-10-08 22:27:38,966 epoch 2 - iter 72/121 - loss 1.37839808 - time (sec): 51.49 - samples/sec: 286.58 - lr: 0.000140 - momentum: 0.000000
2023-10-08 22:27:47,892 epoch 2 - iter 84/121 - loss 1.29828571 - time (sec): 60.41 - samples/sec: 286.75 - lr: 0.000139 - momentum: 0.000000
2023-10-08 22:27:57,220 epoch 2 - iter 96/121 - loss 1.22462702 - time (sec): 69.74 - samples/sec: 287.11 - lr: 0.000137 - momentum: 0.000000
2023-10-08 22:28:05,490 epoch 2 - iter 108/121 - loss 1.18238969 - time (sec): 78.01 - samples/sec: 285.38 - lr: 0.000135 - momentum: 0.000000
2023-10-08 22:28:13,941 epoch 2 - iter 120/121 - loss 1.12850311 - time (sec): 86.46 - samples/sec: 284.59 - lr: 0.000134 - momentum: 0.000000
2023-10-08 22:28:14,426 ----------------------------------------------------------------------------------------------------
2023-10-08 22:28:14,427 EPOCH 2 done: loss 1.1281 - lr: 0.000134
2023-10-08 22:28:20,237 DEV : loss 0.6606601476669312 - f1-score (micro avg) 0.0
2023-10-08 22:28:20,243 ----------------------------------------------------------------------------------------------------
2023-10-08 22:28:28,452 epoch 3 - iter 12/121 - loss 0.65604974 - time (sec): 8.21 - samples/sec: 271.72 - lr: 0.000132 - momentum: 0.000000
2023-10-08 22:28:36,878 epoch 3 - iter 24/121 - loss 0.63741540 - time (sec): 16.63 - samples/sec: 273.97 - lr: 0.000130 - momentum: 0.000000
2023-10-08 22:28:46,111 epoch 3 - iter 36/121 - loss 0.61760978 - time (sec): 25.87 - samples/sec: 278.44 - lr: 0.000129 - momentum: 0.000000
2023-10-08 22:28:54,090 epoch 3 - iter 48/121 - loss 0.61448220 - time (sec): 33.85 - samples/sec: 279.86 - lr: 0.000127 - momentum: 0.000000
2023-10-08 22:29:02,661 epoch 3 - iter 60/121 - loss 0.59757428 - time (sec): 42.42 - samples/sec: 281.52 - lr: 0.000125 - momentum: 0.000000
2023-10-08 22:29:11,463 epoch 3 - iter 72/121 - loss 0.58109575 - time (sec): 51.22 - samples/sec: 284.02 - lr: 0.000124 - momentum: 0.000000
2023-10-08 22:29:20,574 epoch 3 - iter 84/121 - loss 0.56190891 - time (sec): 60.33 - samples/sec: 287.38 - lr: 0.000122 - momentum: 0.000000
2023-10-08 22:29:29,925 epoch 3 - iter 96/121 - loss 0.53200756 - time (sec): 69.68 - samples/sec: 288.46 - lr: 0.000120 - momentum: 0.000000
2023-10-08 22:29:37,880 epoch 3 - iter 108/121 - loss 0.51675821 - time (sec): 77.63 - samples/sec: 285.72 - lr: 0.000119 - momentum: 0.000000
2023-10-08 22:29:46,392 epoch 3 - iter 120/121 - loss 0.50248915 - time (sec): 86.15 - samples/sec: 285.87 - lr: 0.000117 - momentum: 0.000000
2023-10-08 22:29:46,878 ----------------------------------------------------------------------------------------------------
2023-10-08 22:29:46,879 EPOCH 3 done: loss 0.5011 - lr: 0.000117
2023-10-08 22:29:52,662 DEV : loss 0.38955041766166687 - f1-score (micro avg) 0.0911
2023-10-08 22:29:52,668 saving best model
2023-10-08 22:29:53,502 ----------------------------------------------------------------------------------------------------
2023-10-08 22:30:01,750 epoch 4 - iter 12/121 - loss 0.48386370 - time (sec): 8.25 - samples/sec: 285.20 - lr: 0.000115 - momentum: 0.000000
2023-10-08 22:30:10,627 epoch 4 - iter 24/121 - loss 0.42735391 - time (sec): 17.12 - samples/sec: 291.35 - lr: 0.000114 - momentum: 0.000000
2023-10-08 22:30:20,102 epoch 4 - iter 36/121 - loss 0.38016291 - time (sec): 26.60 - samples/sec: 289.38 - lr: 0.000112 - momentum: 0.000000
2023-10-08 22:30:28,643 epoch 4 - iter 48/121 - loss 0.36501396 - time (sec): 35.14 - samples/sec: 285.97 - lr: 0.000110 - momentum: 0.000000
2023-10-08 22:30:37,651 epoch 4 - iter 60/121 - loss 0.35335814 - time (sec): 44.15 - samples/sec: 286.99 - lr: 0.000109 - momentum: 0.000000
2023-10-08 22:30:47,047 epoch 4 - iter 72/121 - loss 0.33675474 - time (sec): 53.54 - samples/sec: 288.59 - lr: 0.000107 - momentum: 0.000000
2023-10-08 22:30:55,705 epoch 4 - iter 84/121 - loss 0.32763381 - time (sec): 62.20 - samples/sec: 288.08 - lr: 0.000105 - momentum: 0.000000
2023-10-08 22:31:04,357 epoch 4 - iter 96/121 - loss 0.32341354 - time (sec): 70.85 - samples/sec: 286.69 - lr: 0.000104 - momentum: 0.000000
2023-10-08 22:31:12,510 epoch 4 - iter 108/121 - loss 0.32263201 - time (sec): 79.01 - samples/sec: 285.34 - lr: 0.000102 - momentum: 0.000000
2023-10-08 22:31:20,455 epoch 4 - iter 120/121 - loss 0.31874878 - time (sec): 86.95 - samples/sec: 283.77 - lr: 0.000101 - momentum: 0.000000
2023-10-08 22:31:20,849 ----------------------------------------------------------------------------------------------------
2023-10-08 22:31:20,850 EPOCH 4 done: loss 0.3195 - lr: 0.000101
2023-10-08 22:31:26,721 DEV : loss 0.2868027985095978 - f1-score (micro avg) 0.4672
2023-10-08 22:31:26,728 saving best model
2023-10-08 22:31:31,083 ----------------------------------------------------------------------------------------------------
2023-10-08 22:31:39,208 epoch 5 - iter 12/121 - loss 0.23884032 - time (sec): 8.12 - samples/sec: 283.48 - lr: 0.000099 - momentum: 0.000000
2023-10-08 22:31:47,807 epoch 5 - iter 24/121 - loss 0.24002408 - time (sec): 16.72 - samples/sec: 280.16 - lr: 0.000097 - momentum: 0.000000
2023-10-08 22:31:56,895 epoch 5 - iter 36/121 - loss 0.24189282 - time (sec): 25.81 - samples/sec: 285.00 - lr: 0.000095 - momentum: 0.000000
2023-10-08 22:32:05,611 epoch 5 - iter 48/121 - loss 0.24662212 - time (sec): 34.53 - samples/sec: 281.95 - lr: 0.000094 - momentum: 0.000000
2023-10-08 22:32:14,565 epoch 5 - iter 60/121 - loss 0.24298616 - time (sec): 43.48 - samples/sec: 283.42 - lr: 0.000092 - momentum: 0.000000
2023-10-08 22:32:23,432 epoch 5 - iter 72/121 - loss 0.23940302 - time (sec): 52.35 - samples/sec: 281.98 - lr: 0.000091 - momentum: 0.000000
2023-10-08 22:32:32,422 epoch 5 - iter 84/121 - loss 0.24597045 - time (sec): 61.34 - samples/sec: 282.14 - lr: 0.000089 - momentum: 0.000000
2023-10-08 22:32:41,819 epoch 5 - iter 96/121 - loss 0.24474100 - time (sec): 70.74 - samples/sec: 279.83 - lr: 0.000087 - momentum: 0.000000
2023-10-08 22:32:50,010 epoch 5 - iter 108/121 - loss 0.24034422 - time (sec): 78.93 - samples/sec: 277.70 - lr: 0.000086 - momentum: 0.000000
2023-10-08 22:32:59,049 epoch 5 - iter 120/121 - loss 0.23582739 - time (sec): 87.96 - samples/sec: 278.25 - lr: 0.000084 - momentum: 0.000000
2023-10-08 22:32:59,950 ----------------------------------------------------------------------------------------------------
2023-10-08 22:32:59,950 EPOCH 5 done: loss 0.2348 - lr: 0.000084
2023-10-08 22:33:06,117 DEV : loss 0.22080878913402557 - f1-score (micro avg) 0.5607
2023-10-08 22:33:06,123 saving best model
2023-10-08 22:33:10,499 ----------------------------------------------------------------------------------------------------
2023-10-08 22:33:19,203 epoch 6 - iter 12/121 - loss 0.18999162 - time (sec): 8.70 - samples/sec: 273.85 - lr: 0.000082 - momentum: 0.000000
2023-10-08 22:33:28,526 epoch 6 - iter 24/121 - loss 0.18623698 - time (sec): 18.03 - samples/sec: 273.78 - lr: 0.000081 - momentum: 0.000000
2023-10-08 22:33:37,893 epoch 6 - iter 36/121 - loss 0.19645797 - time (sec): 27.39 - samples/sec: 274.60 - lr: 0.000079 - momentum: 0.000000
2023-10-08 22:33:46,683 epoch 6 - iter 48/121 - loss 0.18669721 - time (sec): 36.18 - samples/sec: 269.30 - lr: 0.000077 - momentum: 0.000000
2023-10-08 22:33:55,908 epoch 6 - iter 60/121 - loss 0.18926314 - time (sec): 45.41 - samples/sec: 269.10 - lr: 0.000076 - momentum: 0.000000
2023-10-08 22:34:05,138 epoch 6 - iter 72/121 - loss 0.19296299 - time (sec): 54.64 - samples/sec: 267.79 - lr: 0.000074 - momentum: 0.000000
2023-10-08 22:34:14,575 epoch 6 - iter 84/121 - loss 0.18871964 - time (sec): 64.07 - samples/sec: 267.22 - lr: 0.000072 - momentum: 0.000000
2023-10-08 22:34:24,183 epoch 6 - iter 96/121 - loss 0.18386245 - time (sec): 73.68 - samples/sec: 267.57 - lr: 0.000071 - momentum: 0.000000
2023-10-08 22:34:33,037 epoch 6 - iter 108/121 - loss 0.18245191 - time (sec): 82.54 - samples/sec: 266.91 - lr: 0.000069 - momentum: 0.000000
2023-10-08 22:34:42,434 epoch 6 - iter 120/121 - loss 0.18290327 - time (sec): 91.93 - samples/sec: 266.29 - lr: 0.000067 - momentum: 0.000000
2023-10-08 22:34:43,298 ----------------------------------------------------------------------------------------------------
2023-10-08 22:34:43,298 EPOCH 6 done: loss 0.1832 - lr: 0.000067
2023-10-08 22:34:49,816 DEV : loss 0.18161384761333466 - f1-score (micro avg) 0.7194
2023-10-08 22:34:49,822 saving best model
2023-10-08 22:34:54,074 ----------------------------------------------------------------------------------------------------
2023-10-08 22:35:04,051 epoch 7 - iter 12/121 - loss 0.13278540 - time (sec): 9.98 - samples/sec: 267.23 - lr: 0.000066 - momentum: 0.000000
2023-10-08 22:35:12,631 epoch 7 - iter 24/121 - loss 0.15453181 - time (sec): 18.56 - samples/sec: 259.43 - lr: 0.000064 - momentum: 0.000000
2023-10-08 22:35:21,547 epoch 7 - iter 36/121 - loss 0.14716641 - time (sec): 27.47 - samples/sec: 258.88 - lr: 0.000062 - momentum: 0.000000
2023-10-08 22:35:31,104 epoch 7 - iter 48/121 - loss 0.14727213 - time (sec): 37.03 - samples/sec: 257.45 - lr: 0.000061 - momentum: 0.000000
2023-10-08 22:35:40,223 epoch 7 - iter 60/121 - loss 0.14846752 - time (sec): 46.15 - samples/sec: 258.93 - lr: 0.000059 - momentum: 0.000000
2023-10-08 22:35:49,626 epoch 7 - iter 72/121 - loss 0.14570512 - time (sec): 55.55 - samples/sec: 258.55 - lr: 0.000057 - momentum: 0.000000
2023-10-08 22:35:59,185 epoch 7 - iter 84/121 - loss 0.14496672 - time (sec): 65.11 - samples/sec: 258.39 - lr: 0.000056 - momentum: 0.000000
2023-10-08 22:36:08,222 epoch 7 - iter 96/121 - loss 0.14292115 - time (sec): 74.15 - samples/sec: 257.95 - lr: 0.000054 - momentum: 0.000000
2023-10-08 22:36:18,509 epoch 7 - iter 108/121 - loss 0.14409720 - time (sec): 84.43 - samples/sec: 260.25 - lr: 0.000052 - momentum: 0.000000
2023-10-08 22:36:28,346 epoch 7 - iter 120/121 - loss 0.14530205 - time (sec): 94.27 - samples/sec: 261.54 - lr: 0.000051 - momentum: 0.000000
2023-10-08 22:36:28,848 ----------------------------------------------------------------------------------------------------
2023-10-08 22:36:28,848 EPOCH 7 done: loss 0.1458 - lr: 0.000051
2023-10-08 22:36:35,388 DEV : loss 0.16460329294204712 - f1-score (micro avg) 0.8102
2023-10-08 22:36:35,394 saving best model
2023-10-08 22:36:39,777 ----------------------------------------------------------------------------------------------------
2023-10-08 22:36:49,394 epoch 8 - iter 12/121 - loss 0.10956992 - time (sec): 9.62 - samples/sec: 270.51 - lr: 0.000049 - momentum: 0.000000
2023-10-08 22:36:58,320 epoch 8 - iter 24/121 - loss 0.10733949 - time (sec): 18.54 - samples/sec: 257.10 - lr: 0.000047 - momentum: 0.000000
2023-10-08 22:37:07,877 epoch 8 - iter 36/121 - loss 0.11071697 - time (sec): 28.10 - samples/sec: 263.64 - lr: 0.000046 - momentum: 0.000000
2023-10-08 22:37:16,828 epoch 8 - iter 48/121 - loss 0.11759892 - time (sec): 37.05 - samples/sec: 263.59 - lr: 0.000044 - momentum: 0.000000
2023-10-08 22:37:25,640 epoch 8 - iter 60/121 - loss 0.12537088 - time (sec): 45.86 - samples/sec: 261.09 - lr: 0.000042 - momentum: 0.000000
2023-10-08 22:37:35,102 epoch 8 - iter 72/121 - loss 0.12952155 - time (sec): 55.32 - samples/sec: 262.19 - lr: 0.000041 - momentum: 0.000000
2023-10-08 22:37:44,296 epoch 8 - iter 84/121 - loss 0.12737468 - time (sec): 64.52 - samples/sec: 261.96 - lr: 0.000039 - momentum: 0.000000
2023-10-08 22:37:53,781 epoch 8 - iter 96/121 - loss 0.12591850 - time (sec): 74.00 - samples/sec: 263.20 - lr: 0.000038 - momentum: 0.000000
2023-10-08 22:38:03,262 epoch 8 - iter 108/121 - loss 0.12362271 - time (sec): 83.48 - samples/sec: 263.64 - lr: 0.000036 - momentum: 0.000000
2023-10-08 22:38:12,425 epoch 8 - iter 120/121 - loss 0.12234941 - time (sec): 92.65 - samples/sec: 264.69 - lr: 0.000034 - momentum: 0.000000
2023-10-08 22:38:13,115 ----------------------------------------------------------------------------------------------------
2023-10-08 22:38:13,116 EPOCH 8 done: loss 0.1219 - lr: 0.000034
2023-10-08 22:38:19,145 DEV : loss 0.15558937191963196 - f1-score (micro avg) 0.8099
2023-10-08 22:38:19,151 ----------------------------------------------------------------------------------------------------
2023-10-08 22:38:29,350 epoch 9 - iter 12/121 - loss 0.09053591 - time (sec): 10.20 - samples/sec: 291.07 - lr: 0.000032 - momentum: 0.000000
2023-10-08 22:38:37,739 epoch 9 - iter 24/121 - loss 0.09289045 - time (sec): 18.59 - samples/sec: 286.18 - lr: 0.000031 - momentum: 0.000000
2023-10-08 22:38:45,831 epoch 9 - iter 36/121 - loss 0.09665172 - time (sec): 26.68 - samples/sec: 285.59 - lr: 0.000029 - momentum: 0.000000
2023-10-08 22:38:54,325 epoch 9 - iter 48/121 - loss 0.10928968 - time (sec): 35.17 - samples/sec: 284.60 - lr: 0.000028 - momentum: 0.000000
2023-10-08 22:39:02,937 epoch 9 - iter 60/121 - loss 0.11052597 - time (sec): 43.78 - samples/sec: 284.49 - lr: 0.000026 - momentum: 0.000000
2023-10-08 22:39:10,948 epoch 9 - iter 72/121 - loss 0.10965277 - time (sec): 51.79 - samples/sec: 283.08 - lr: 0.000024 - momentum: 0.000000
2023-10-08 22:39:20,070 epoch 9 - iter 84/121 - loss 0.10949041 - time (sec): 60.92 - samples/sec: 283.17 - lr: 0.000023 - momentum: 0.000000
2023-10-08 22:39:28,701 epoch 9 - iter 96/121 - loss 0.11090953 - time (sec): 69.55 - samples/sec: 283.73 - lr: 0.000021 - momentum: 0.000000
2023-10-08 22:39:37,111 epoch 9 - iter 108/121 - loss 0.11103717 - time (sec): 77.96 - samples/sec: 283.93 - lr: 0.000019 - momentum: 0.000000
2023-10-08 22:39:45,815 epoch 9 - iter 120/121 - loss 0.10866095 - time (sec): 86.66 - samples/sec: 283.89 - lr: 0.000018 - momentum: 0.000000
2023-10-08 22:39:46,355 ----------------------------------------------------------------------------------------------------
2023-10-08 22:39:46,355 EPOCH 9 done: loss 0.1082 - lr: 0.000018
2023-10-08 22:39:52,143 DEV : loss 0.14723752439022064 - f1-score (micro avg) 0.8144
2023-10-08 22:39:52,149 saving best model
2023-10-08 22:39:56,556 ----------------------------------------------------------------------------------------------------
2023-10-08 22:40:05,872 epoch 10 - iter 12/121 - loss 0.10971863 - time (sec): 9.32 - samples/sec: 302.19 - lr: 0.000016 - momentum: 0.000000
2023-10-08 22:40:13,890 epoch 10 - iter 24/121 - loss 0.11125723 - time (sec): 17.33 - samples/sec: 282.70 - lr: 0.000014 - momentum: 0.000000
2023-10-08 22:40:22,434 epoch 10 - iter 36/121 - loss 0.10350049 - time (sec): 25.88 - samples/sec: 283.10 - lr: 0.000013 - momentum: 0.000000
2023-10-08 22:40:30,973 epoch 10 - iter 48/121 - loss 0.09816038 - time (sec): 34.42 - samples/sec: 282.75 - lr: 0.000011 - momentum: 0.000000
2023-10-08 22:40:39,848 epoch 10 - iter 60/121 - loss 0.09507264 - time (sec): 43.29 - samples/sec: 280.73 - lr: 0.000009 - momentum: 0.000000
2023-10-08 22:40:48,886 epoch 10 - iter 72/121 - loss 0.09477910 - time (sec): 52.33 - samples/sec: 281.93 - lr: 0.000008 - momentum: 0.000000
2023-10-08 22:40:57,294 epoch 10 - iter 84/121 - loss 0.09712686 - time (sec): 60.74 - samples/sec: 283.03 - lr: 0.000006 - momentum: 0.000000
2023-10-08 22:41:05,778 epoch 10 - iter 96/121 - loss 0.09905517 - time (sec): 69.22 - samples/sec: 282.40 - lr: 0.000004 - momentum: 0.000000
2023-10-08 22:41:14,857 epoch 10 - iter 108/121 - loss 0.10229551 - time (sec): 78.30 - samples/sec: 283.89 - lr: 0.000003 - momentum: 0.000000
2023-10-08 22:41:23,155 epoch 10 - iter 120/121 - loss 0.09936083 - time (sec): 86.60 - samples/sec: 283.26 - lr: 0.000001 - momentum: 0.000000
2023-10-08 22:41:23,840 ----------------------------------------------------------------------------------------------------
2023-10-08 22:41:23,840 EPOCH 10 done: loss 0.1003 - lr: 0.000001
2023-10-08 22:41:29,733 DEV : loss 0.14771409332752228 - f1-score (micro avg) 0.8193
2023-10-08 22:41:29,739 saving best model
2023-10-08 22:41:34,963 ----------------------------------------------------------------------------------------------------
2023-10-08 22:41:34,964 Loading model from best epoch ...
2023-10-08 22:41:37,998 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
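The 25-tag dictionary above is the BIOES encoding of the 6 AJMC entity types plus the outside tag: each type contributes S- (single), B- (begin), E- (end), and I- (inside) variants. A quick illustrative reconstruction:

```python
# Reconstruct the 25-tag BIOES inventory listed in the log line above
entity_types = ["scope", "pers", "work", "loc", "object", "date"]
tags = ["O"] + [f"{prefix}-{t}" for t in entity_types for prefix in ("S", "B", "E", "I")]
print(len(tags))  # 1 + 6 types x 4 prefixes = 25
```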
2023-10-08 22:41:43,904
Results:
- F-score (micro) 0.7909
- F-score (macro) 0.4759
- Accuracy 0.6893
By class:
              precision    recall  f1-score   support

        pers     0.7763    0.8489    0.8110       139
       scope     0.8000    0.8992    0.8467       129
        work     0.6854    0.7625    0.7219        80
         loc     0.0000    0.0000    0.0000         9
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7642    0.8194    0.7909       360
   macro avg     0.4523    0.5021    0.4759       360
weighted avg     0.7387    0.8194    0.7770       360
2023-10-08 22:41:43,905 ----------------------------------------------------------------------------------------------------
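The gap between micro F1 (0.7909) and macro F1 (0.4759) in the table above comes from the two zero-score minority classes (loc, date): macro F1 averages per-class F1 without support weighting, while micro F1 pools true/false positives across classes. A sketch recomputing both from the table (values copied from the log; small differences are rounding of the reported precision/recall):

```python
# Per-class rows from the evaluation table: (precision, recall, f1, support)
per_class = {
    "pers":  (0.7763, 0.8489, 0.8110, 139),
    "scope": (0.8000, 0.8992, 0.8467, 129),
    "work":  (0.6854, 0.7625, 0.7219, 80),
    "loc":   (0.0000, 0.0000, 0.0000, 9),
    "date":  (0.0000, 0.0000, 0.0000, 3),
}

# Macro F1: unweighted mean of per-class F1 scores
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)
print(round(macro_f1, 4))  # 0.4759

# Micro F1: harmonic mean of the micro-averaged precision and recall
p, r = 0.7642, 0.8194
micro_f1 = 2 * p * r / (p + r)
print(round(micro_f1, 4))  # close to the reported 0.7909
```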