floflodebilbao committed on
Commit 39d567c · verified · 1 Parent(s): 215fe46

End of training

README.md CHANGED
@@ -22,21 +22,21 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
- - Loss: 8.0673
- - Rouge1: 0.1845
- - Rouge2: 0.0244
- - Rougel: 0.129
- - Rougelsum: 0.1293
- - Gen Len: 55.32
- - Bleu: 0.0
- - Precisions: 0.0357
+ - Loss: 4.8960
+ - Rouge1: 0.0787
+ - Rouge2: 0.0206
+ - Rougel: 0.0662
+ - Rougelsum: 0.067
+ - Gen Len: 63.0
+ - Bleu: 0.0106
+ - Precisions: 0.0227
 - Brevity Penalty: 1.0
- - Length Ratio: 1.6374
- - Translation Length: 1978.0
+ - Length Ratio: 1.4992
+ - Translation Length: 1811.0
 - Reference Length: 1208.0
- - Precision: 0.8258
- - Recall: 0.8507
- - F1: 0.838
+ - Precision: 0.7672
+ - Recall: 0.8255
+ - F1: 0.7948
 - Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
 
 ## Model description
@@ -64,24 +64,27 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 16
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
- - num_epochs: 12
+ - num_epochs: 15
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:---------:|:------:|:------:|:---------------------------------------------------------:|
- | No log | 1.0 | 7 | 15.9454 | 0.193 | 0.0296 | 0.1313 | 0.1315 | 61.62 | 0.0098 | 0.0376 | 1.0 | 1.7988 | 2173.0 | 1208.0 | 0.8219 | 0.8539 | 0.8376 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 2.0 | 14 | 14.6416 | 0.2017 | 0.0338 | 0.1393 | 0.1398 | 59.64 | 0.0107 | 0.0402 | 1.0 | 1.745 | 2108.0 | 1208.0 | 0.8261 | 0.8556 | 0.8405 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 3.0 | 21 | 13.5836 | 0.2044 | 0.0328 | 0.1404 | 0.1408 | 59.28 | 0.0088 | 0.0404 | 1.0 | 1.7359 | 2097.0 | 1208.0 | 0.8263 | 0.8551 | 0.8404 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 4.0 | 28 | 12.7352 | 0.196 | 0.0304 | 0.1365 | 0.1366 | 58.98 | 0.0085 | 0.0398 | 1.0 | 1.7293 | 2089.0 | 1208.0 | 0.8256 | 0.8537 | 0.8393 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 5.0 | 35 | 12.0126 | 0.1954 | 0.0312 | 0.1343 | 0.1343 | 58.46 | 0.0085 | 0.0395 | 1.0 | 1.7169 | 2074.0 | 1208.0 | 0.8266 | 0.854 | 0.84 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 6.0 | 42 | 11.3598 | 0.1909 | 0.0307 | 0.1337 | 0.1331 | 57.46 | 0.0085 | 0.0394 | 1.0 | 1.6921 | 2044.0 | 1208.0 | 0.827 | 0.8534 | 0.8399 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 7.0 | 49 | 10.7039 | 0.1891 | 0.0351 | 0.1342 | 0.1343 | 57.02 | 0.0138 | 0.0405 | 1.0 | 1.6639 | 2010.0 | 1208.0 | 0.8269 | 0.8534 | 0.8398 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 8.0 | 56 | 10.0681 | 0.1885 | 0.0352 | 0.1321 | 0.1324 | 57.26 | 0.0135 | 0.0401 | 1.0 | 1.6921 | 2044.0 | 1208.0 | 0.8266 | 0.8528 | 0.8394 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 9.0 | 63 | 9.3697 | 0.1931 | 0.0337 | 0.1354 | 0.1355 | 57.76 | 0.0087 | 0.0397 | 1.0 | 1.7061 | 2061.0 | 1208.0 | 0.826 | 0.8528 | 0.8391 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 10.0 | 70 | 8.7531 | 0.1839 | 0.0283 | 0.1264 | 0.1269 | 58.26 | 0.0 | 0.0373 | 1.0 | 1.7194 | 2077.0 | 1208.0 | 0.8233 | 0.8502 | 0.8364 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 11.0 | 77 | 8.2701 | 0.1843 | 0.0262 | 0.1321 | 0.1324 | 55.72 | 0.0 | 0.0365 | 1.0 | 1.6432 | 1985.0 | 1208.0 | 0.8259 | 0.8509 | 0.8381 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 12.0 | 84 | 8.0673 | 0.1845 | 0.0244 | 0.129 | 0.1293 | 55.32 | 0.0 | 0.0357 | 1.0 | 1.6374 | 1978.0 | 1208.0 | 0.8258 | 0.8507 | 0.838 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 1.0 | 7 | 28.7642 | 0.186 | 0.0294 | 0.1265 | 0.1264 | 62.06 | 0.0096 | 0.0362 | 1.0 | 1.822 | 2201.0 | 1208.0 | 0.8206 | 0.8529 | 0.8364 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 2.0 | 14 | 26.0446 | 0.1828 | 0.0274 | 0.1243 | 0.1244 | 61.32 | 0.0091 | 0.0352 | 1.0 | 1.7815 | 2152.0 | 1208.0 | 0.8204 | 0.8522 | 0.8359 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 3.0 | 21 | 23.8385 | 0.187 | 0.0264 | 0.1252 | 0.1248 | 60.82 | 0.0 | 0.0368 | 1.0 | 1.7707 | 2139.0 | 1208.0 | 0.8223 | 0.8531 | 0.8373 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 4.0 | 28 | 21.9343 | 0.1873 | 0.0255 | 0.1282 | 0.1279 | 59.98 | 0.0 | 0.0374 | 1.0 | 1.7475 | 2111.0 | 1208.0 | 0.8229 | 0.8523 | 0.8373 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 5.0 | 35 | 20.2839 | 0.1972 | 0.0303 | 0.1325 | 0.1326 | 59.46 | 0.0079 | 0.0387 | 1.0 | 1.7566 | 2122.0 | 1208.0 | 0.825 | 0.8537 | 0.839 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 6.0 | 42 | 18.7365 | 0.1927 | 0.0329 | 0.1346 | 0.1346 | 59.28 | 0.0127 | 0.039 | 1.0 | 1.7492 | 2113.0 | 1208.0 | 0.8246 | 0.8532 | 0.8386 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 7.0 | 49 | 17.0557 | 0.1935 | 0.0352 | 0.1361 | 0.1363 | 59.44 | 0.0134 | 0.0403 | 1.0 | 1.7409 | 2103.0 | 1208.0 | 0.8248 | 0.8529 | 0.8385 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 8.0 | 56 | 15.0848 | 0.187 | 0.0355 | 0.1301 | 0.1308 | 59.14 | 0.0115 | 0.0381 | 1.0 | 1.7351 | 2096.0 | 1208.0 | 0.8232 | 0.8515 | 0.837 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 9.0 | 63 | 12.3513 | 0.1883 | 0.0324 | 0.1297 | 0.1298 | 59.32 | 0.0108 | 0.0374 | 1.0 | 1.7384 | 2100.0 | 1208.0 | 0.8231 | 0.8511 | 0.8368 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 10.0 | 70 | 8.3897 | 0.1907 | 0.0356 | 0.1361 | 0.1363 | 57.58 | 0.0108 | 0.0384 | 1.0 | 1.6887 | 2040.0 | 1208.0 | 0.825 | 0.8515 | 0.838 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 11.0 | 77 | 5.0447 | 0.1462 | 0.0254 | 0.1123 | 0.1124 | 57.06 | 0.0102 | 0.0334 | 1.0 | 1.5712 | 1898.0 | 1208.0 | 0.8064 | 0.8435 | 0.8242 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 12.0 | 84 | 4.7254 | 0.0956 | 0.0096 | 0.0715 | 0.0714 | 61.38 | 0.0 | 0.0228 | 1.0 | 1.5132 | 1828.0 | 1208.0 | 0.7674 | 0.8298 | 0.7968 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 13.0 | 91 | 4.8274 | 0.0795 | 0.013 | 0.061 | 0.0603 | 61.68 | 0.0075 | 0.0219 | 1.0 | 1.4909 | 1801.0 | 1208.0 | 0.7666 | 0.8251 | 0.7943 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 14.0 | 98 | 4.8941 | 0.0822 | 0.0202 | 0.0684 | 0.0691 | 63.0 | 0.0098 | 0.0217 | 1.0 | 1.5414 | 1862.0 | 1208.0 | 0.7746 | 0.8263 | 0.7992 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 15.0 | 105 | 4.8960 | 0.0787 | 0.0206 | 0.0662 | 0.067 | 63.0 | 0.0106 | 0.0227 | 1.0 | 1.4992 | 1811.0 | 1208.0 | 0.7672 | 0.8255 | 0.7948 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
 
 
 ### Framework versions
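For anyone reproducing the metric block in this card: the Rouge*, Bleu, Precisions, Brevity Penalty, and Length Ratio rows come from the ROUGE and BLEU metrics, while the bare Precision/Recall/F1 rows are BERTScore values, as the hashcode indicates (roberta-large, layer 17, no IDF weighting, bert_score 0.3.12 under transformers 4.53.1). A minimal sketch with the `evaluate` library; `preds` and `refs` are placeholder lists, not the card's actual evaluation data:

```python
import evaluate

preds = ["a generated summary"]   # placeholder model outputs
refs = ["the reference summary"]  # placeholder gold references

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")
bertscore = evaluate.load("bertscore")

print(rouge.compute(predictions=preds, references=refs))
print(bleu.compute(predictions=preds, references=refs))
# lang="en" selects roberta-large at its default layer (17) with no IDF
# weighting, matching the hashcode reported in the card.
print(bertscore.compute(predictions=preds, references=refs, lang="en"))
```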
 
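The second hunk of the README diff raises num_epochs from 12 to 15, with the optimizer and scheduler settings unchanged. A minimal sketch of how those lines map onto `transformers.Seq2SeqTrainingArguments`; the output directory and the per-device/accumulation split behind total_train_batch_size=16 are assumptions, since the hunk does not show them:

```python
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="long-t5-tglobal-base-finetuned",  # hypothetical name
    per_device_train_batch_size=4,   # assumed split: 4 x 4 accumulation = 16 total
    gradient_accumulation_steps=4,
    optim="adamw_torch",             # OptimizerNames.ADAMW_TORCH
    adam_beta1=0.9,                  # betas=(0.9, 0.999) from the card
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=15,             # raised from 12 in this commit
)
```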
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:6775b66a1007770897e652e42d91bfc03994d7f283dee3725f319b2c55acf9c9
+ oid sha256:7505a676b9ab398044918c117dcd8d1a0d91bda9f7f7ddf9efb9b3331b40d7de
 size 1187780840
runs/Jul10_11-30-29_tardis/events.out.tfevents.1752139830.tardis.31443.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c60dc00864f61f0a27e23a4f22f79544170f64d1b6675a510cc0478d8fffafcb
+ size 22680
tokenizer.json CHANGED
@@ -7,7 +7,9 @@
 "stride": 0
 },
 "padding": {
- "strategy": "BatchLongest",
+ "strategy": {
+ "Fixed": 64
+ },
 "direction": "Right",
 "pad_to_multiple_of": null,
 "pad_id": 0,
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:3540ea8fc39a4444a9c659fb39e5ee976fb8756e41e055cfd908df9612c7e5ea
+ oid sha256:4f0e208ac9fde542e02371cb15d531fd42ea59e9ea3810e596ded26b0a436b01
 size 5905