floflodebilbao committed on
Commit 39d567c · verified · 1 Parent(s): 215fe46

End of training

README.md CHANGED
@@ -22,21 +22,21 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
- - Loss: 8.0673
- - Rouge1: 0.1845
- - Rouge2: 0.0244
- - Rougel: 0.129
- - Rougelsum: 0.1293
- - Gen Len: 55.32
- - Bleu: 0.0
- - Precisions: 0.0357
+ - Loss: 4.8960
+ - Rouge1: 0.0787
+ - Rouge2: 0.0206
+ - Rougel: 0.0662
+ - Rougelsum: 0.067
+ - Gen Len: 63.0
+ - Bleu: 0.0106
+ - Precisions: 0.0227
 - Brevity Penalty: 1.0
- - Length Ratio: 1.6374
- - Translation Length: 1978.0
+ - Length Ratio: 1.4992
+ - Translation Length: 1811.0
 - Reference Length: 1208.0
- - Precision: 0.8258
- - Recall: 0.8507
- - F1: 0.838
+ - Precision: 0.7672
+ - Recall: 0.8255
+ - F1: 0.7948
 - Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
 
 ## Model description
@@ -64,24 +64,27 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 16
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
- - num_epochs: 12
+ - num_epochs: 15
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:---------:|:------:|:------:|:---------------------------------------------------------:|
- | No log | 1.0 | 7 | 15.9454 | 0.193 | 0.0296 | 0.1313 | 0.1315 | 61.62 | 0.0098 | 0.0376 | 1.0 | 1.7988 | 2173.0 | 1208.0 | 0.8219 | 0.8539 | 0.8376 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 2.0 | 14 | 14.6416 | 0.2017 | 0.0338 | 0.1393 | 0.1398 | 59.64 | 0.0107 | 0.0402 | 1.0 | 1.745 | 2108.0 | 1208.0 | 0.8261 | 0.8556 | 0.8405 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 3.0 | 21 | 13.5836 | 0.2044 | 0.0328 | 0.1404 | 0.1408 | 59.28 | 0.0088 | 0.0404 | 1.0 | 1.7359 | 2097.0 | 1208.0 | 0.8263 | 0.8551 | 0.8404 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 4.0 | 28 | 12.7352 | 0.196 | 0.0304 | 0.1365 | 0.1366 | 58.98 | 0.0085 | 0.0398 | 1.0 | 1.7293 | 2089.0 | 1208.0 | 0.8256 | 0.8537 | 0.8393 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 5.0 | 35 | 12.0126 | 0.1954 | 0.0312 | 0.1343 | 0.1343 | 58.46 | 0.0085 | 0.0395 | 1.0 | 1.7169 | 2074.0 | 1208.0 | 0.8266 | 0.854 | 0.84 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 6.0 | 42 | 11.3598 | 0.1909 | 0.0307 | 0.1337 | 0.1331 | 57.46 | 0.0085 | 0.0394 | 1.0 | 1.6921 | 2044.0 | 1208.0 | 0.827 | 0.8534 | 0.8399 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 7.0 | 49 | 10.7039 | 0.1891 | 0.0351 | 0.1342 | 0.1343 | 57.02 | 0.0138 | 0.0405 | 1.0 | 1.6639 | 2010.0 | 1208.0 | 0.8269 | 0.8534 | 0.8398 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 8.0 | 56 | 10.0681 | 0.1885 | 0.0352 | 0.1321 | 0.1324 | 57.26 | 0.0135 | 0.0401 | 1.0 | 1.6921 | 2044.0 | 1208.0 | 0.8266 | 0.8528 | 0.8394 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 9.0 | 63 | 9.3697 | 0.1931 | 0.0337 | 0.1354 | 0.1355 | 57.76 | 0.0087 | 0.0397 | 1.0 | 1.7061 | 2061.0 | 1208.0 | 0.826 | 0.8528 | 0.8391 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 10.0 | 70 | 8.7531 | 0.1839 | 0.0283 | 0.1264 | 0.1269 | 58.26 | 0.0 | 0.0373 | 1.0 | 1.7194 | 2077.0 | 1208.0 | 0.8233 | 0.8502 | 0.8364 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 11.0 | 77 | 8.2701 | 0.1843 | 0.0262 | 0.1321 | 0.1324 | 55.72 | 0.0 | 0.0365 | 1.0 | 1.6432 | 1985.0 | 1208.0 | 0.8259 | 0.8509 | 0.8381 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 12.0 | 84 | 8.0673 | 0.1845 | 0.0244 | 0.129 | 0.1293 | 55.32 | 0.0 | 0.0357 | 1.0 | 1.6374 | 1978.0 | 1208.0 | 0.8258 | 0.8507 | 0.838 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 1.0 | 7 | 28.7642 | 0.186 | 0.0294 | 0.1265 | 0.1264 | 62.06 | 0.0096 | 0.0362 | 1.0 | 1.822 | 2201.0 | 1208.0 | 0.8206 | 0.8529 | 0.8364 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 2.0 | 14 | 26.0446 | 0.1828 | 0.0274 | 0.1243 | 0.1244 | 61.32 | 0.0091 | 0.0352 | 1.0 | 1.7815 | 2152.0 | 1208.0 | 0.8204 | 0.8522 | 0.8359 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 3.0 | 21 | 23.8385 | 0.187 | 0.0264 | 0.1252 | 0.1248 | 60.82 | 0.0 | 0.0368 | 1.0 | 1.7707 | 2139.0 | 1208.0 | 0.8223 | 0.8531 | 0.8373 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 4.0 | 28 | 21.9343 | 0.1873 | 0.0255 | 0.1282 | 0.1279 | 59.98 | 0.0 | 0.0374 | 1.0 | 1.7475 | 2111.0 | 1208.0 | 0.8229 | 0.8523 | 0.8373 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 5.0 | 35 | 20.2839 | 0.1972 | 0.0303 | 0.1325 | 0.1326 | 59.46 | 0.0079 | 0.0387 | 1.0 | 1.7566 | 2122.0 | 1208.0 | 0.825 | 0.8537 | 0.839 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 6.0 | 42 | 18.7365 | 0.1927 | 0.0329 | 0.1346 | 0.1346 | 59.28 | 0.0127 | 0.039 | 1.0 | 1.7492 | 2113.0 | 1208.0 | 0.8246 | 0.8532 | 0.8386 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 7.0 | 49 | 17.0557 | 0.1935 | 0.0352 | 0.1361 | 0.1363 | 59.44 | 0.0134 | 0.0403 | 1.0 | 1.7409 | 2103.0 | 1208.0 | 0.8248 | 0.8529 | 0.8385 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 8.0 | 56 | 15.0848 | 0.187 | 0.0355 | 0.1301 | 0.1308 | 59.14 | 0.0115 | 0.0381 | 1.0 | 1.7351 | 2096.0 | 1208.0 | 0.8232 | 0.8515 | 0.837 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 9.0 | 63 | 12.3513 | 0.1883 | 0.0324 | 0.1297 | 0.1298 | 59.32 | 0.0108 | 0.0374 | 1.0 | 1.7384 | 2100.0 | 1208.0 | 0.8231 | 0.8511 | 0.8368 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 10.0 | 70 | 8.3897 | 0.1907 | 0.0356 | 0.1361 | 0.1363 | 57.58 | 0.0108 | 0.0384 | 1.0 | 1.6887 | 2040.0 | 1208.0 | 0.825 | 0.8515 | 0.838 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 11.0 | 77 | 5.0447 | 0.1462 | 0.0254 | 0.1123 | 0.1124 | 57.06 | 0.0102 | 0.0334 | 1.0 | 1.5712 | 1898.0 | 1208.0 | 0.8064 | 0.8435 | 0.8242 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 12.0 | 84 | 4.7254 | 0.0956 | 0.0096 | 0.0715 | 0.0714 | 61.38 | 0.0 | 0.0228 | 1.0 | 1.5132 | 1828.0 | 1208.0 | 0.7674 | 0.8298 | 0.7968 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 13.0 | 91 | 4.8274 | 0.0795 | 0.013 | 0.061 | 0.0603 | 61.68 | 0.0075 | 0.0219 | 1.0 | 1.4909 | 1801.0 | 1208.0 | 0.7666 | 0.8251 | 0.7943 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 14.0 | 98 | 4.8941 | 0.0822 | 0.0202 | 0.0684 | 0.0691 | 63.0 | 0.0098 | 0.0217 | 1.0 | 1.5414 | 1862.0 | 1208.0 | 0.7746 | 0.8263 | 0.7992 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 15.0 | 105 | 4.8960 | 0.0787 | 0.0206 | 0.0662 | 0.067 | 63.0 | 0.0106 | 0.0227 | 1.0 | 1.4992 | 1811.0 | 1208.0 | 0.7672 | 0.8255 | 0.7948 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
 
 
 ### Framework versions
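For anyone reproducing the metric block in this card: the Rouge*, Bleu, Precisions, Brevity Penalty, and Length Ratio rows come from the ROUGE and BLEU metrics, while the bare Precision/Recall/F1 rows are BERTScore values, as the hashcode indicates (roberta-large, layer 17, no IDF weighting, bert_score 0.3.12 under transformers 4.53.1). A minimal sketch with the `evaluate` library; `preds` and `refs` are placeholder lists, not the card's actual evaluation data:

```python
import evaluate

preds = ["a generated summary"]   # placeholder model outputs
refs = ["the reference summary"]  # placeholder gold references

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")
bertscore = evaluate.load("bertscore")

print(rouge.compute(predictions=preds, references=refs))
print(bleu.compute(predictions=preds, references=refs))
# lang="en" selects roberta-large at its default layer (17) with no IDF
# weighting, matching the hashcode reported in the card.
print(bertscore.compute(predictions=preds, references=refs, lang="en"))
```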
 
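The second hunk of the README diff raises num_epochs from 12 to 15, with the optimizer and scheduler settings unchanged. A minimal sketch of how those lines map onto `transformers.Seq2SeqTrainingArguments`; the output directory and the per-device/accumulation split behind total_train_batch_size=16 are assumptions, since the hunk does not show them:

```python
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="long-t5-tglobal-base-finetuned",  # hypothetical name
    per_device_train_batch_size=4,   # assumed split: 4 x 4 accumulation = 16 total
    gradient_accumulation_steps=4,
    optim="adamw_torch",             # OptimizerNames.ADAMW_TORCH
    adam_beta1=0.9,                  # betas=(0.9, 0.999) from the card
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=15,             # raised from 12 in this commit
)
```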
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:6775b66a1007770897e652e42d91bfc03994d7f283dee3725f319b2c55acf9c9
+ oid sha256:7505a676b9ab398044918c117dcd8d1a0d91bda9f7f7ddf9efb9b3331b40d7de
 size 1187780840
runs/Jul10_11-30-29_tardis/events.out.tfevents.1752139830.tardis.31443.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c60dc00864f61f0a27e23a4f22f79544170f64d1b6675a510cc0478d8fffafcb
+ size 22680
tokenizer.json CHANGED
@@ -7,7 +7,9 @@
 "stride": 0
 },
 "padding": {
- "strategy": "BatchLongest",
+ "strategy": {
+ "Fixed": 64
+ },
 "direction": "Right",
 "pad_to_multiple_of": null,
 "pad_id": 0,
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:3540ea8fc39a4444a9c659fb39e5ee976fb8756e41e055cfd908df9612c7e5ea
+ oid sha256:4f0e208ac9fde542e02371cb15d531fd42ea59e9ea3810e596ded26b0a436b01
 size 5905