End of training

Files changed:
- README.md (+29 -26)
- model.safetensors (+1 -1)
- runs/Jul10_11-30-29_tardis/events.out.tfevents.1752139830.tardis.31443.0 (+3 -0)
- tokenizer.json (+3 -1)
- training_args.bin (+1 -1)

README.md CHANGED

```diff
@@ -22,21 +22,21 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss:
-- Rouge1: 0.
-- Rouge2: 0.
-- Rougel: 0.
-- Rougelsum: 0.
-- Gen Len:
-- Bleu: 0.
-- Precisions: 0.
+- Loss: 4.8960
+- Rouge1: 0.0787
+- Rouge2: 0.0206
+- Rougel: 0.0662
+- Rougelsum: 0.067
+- Gen Len: 63.0
+- Bleu: 0.0106
+- Precisions: 0.0227
 - Brevity Penalty: 1.0
-- Length Ratio: 1.
-- Translation Length:
+- Length Ratio: 1.4992
+- Translation Length: 1811.0
 - Reference Length: 1208.0
-- Precision: 0.
-- Recall: 0.
-- F1: 0.
+- Precision: 0.7672
+- Recall: 0.8255
+- F1: 0.7948
 - Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
 
 ## Model description
@@ -64,24 +64,27 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 16
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
-- num_epochs:
+- num_epochs: 15
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:---------:|:------:|:------:|:---------------------------------------------------------:|
-| No log | 1.0 | 7 |
-| No log | 2.0 | 14 |
-| No log | 3.0 | 21 |
-| No log | 4.0 | 28 |
-| No log | 5.0 | 35 |
-| No log | 6.0 | 42 |
-| No log | 7.0 | 49 |
-| No log | 8.0 | 56 |
-| No log | 9.0 | 63 |
-| No log | 10.0 | 70 | 8.
-| No log | 11.0 | 77 |
-| No log | 12.0 | 84 |
+| No log | 1.0 | 7 | 28.7642 | 0.186 | 0.0294 | 0.1265 | 0.1264 | 62.06 | 0.0096 | 0.0362 | 1.0 | 1.822 | 2201.0 | 1208.0 | 0.8206 | 0.8529 | 0.8364 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| No log | 2.0 | 14 | 26.0446 | 0.1828 | 0.0274 | 0.1243 | 0.1244 | 61.32 | 0.0091 | 0.0352 | 1.0 | 1.7815 | 2152.0 | 1208.0 | 0.8204 | 0.8522 | 0.8359 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| No log | 3.0 | 21 | 23.8385 | 0.187 | 0.0264 | 0.1252 | 0.1248 | 60.82 | 0.0 | 0.0368 | 1.0 | 1.7707 | 2139.0 | 1208.0 | 0.8223 | 0.8531 | 0.8373 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| No log | 4.0 | 28 | 21.9343 | 0.1873 | 0.0255 | 0.1282 | 0.1279 | 59.98 | 0.0 | 0.0374 | 1.0 | 1.7475 | 2111.0 | 1208.0 | 0.8229 | 0.8523 | 0.8373 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| No log | 5.0 | 35 | 20.2839 | 0.1972 | 0.0303 | 0.1325 | 0.1326 | 59.46 | 0.0079 | 0.0387 | 1.0 | 1.7566 | 2122.0 | 1208.0 | 0.825 | 0.8537 | 0.839 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| No log | 6.0 | 42 | 18.7365 | 0.1927 | 0.0329 | 0.1346 | 0.1346 | 59.28 | 0.0127 | 0.039 | 1.0 | 1.7492 | 2113.0 | 1208.0 | 0.8246 | 0.8532 | 0.8386 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| No log | 7.0 | 49 | 17.0557 | 0.1935 | 0.0352 | 0.1361 | 0.1363 | 59.44 | 0.0134 | 0.0403 | 1.0 | 1.7409 | 2103.0 | 1208.0 | 0.8248 | 0.8529 | 0.8385 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| No log | 8.0 | 56 | 15.0848 | 0.187 | 0.0355 | 0.1301 | 0.1308 | 59.14 | 0.0115 | 0.0381 | 1.0 | 1.7351 | 2096.0 | 1208.0 | 0.8232 | 0.8515 | 0.837 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| No log | 9.0 | 63 | 12.3513 | 0.1883 | 0.0324 | 0.1297 | 0.1298 | 59.32 | 0.0108 | 0.0374 | 1.0 | 1.7384 | 2100.0 | 1208.0 | 0.8231 | 0.8511 | 0.8368 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| No log | 10.0 | 70 | 8.3897 | 0.1907 | 0.0356 | 0.1361 | 0.1363 | 57.58 | 0.0108 | 0.0384 | 1.0 | 1.6887 | 2040.0 | 1208.0 | 0.825 | 0.8515 | 0.838 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| No log | 11.0 | 77 | 5.0447 | 0.1462 | 0.0254 | 0.1123 | 0.1124 | 57.06 | 0.0102 | 0.0334 | 1.0 | 1.5712 | 1898.0 | 1208.0 | 0.8064 | 0.8435 | 0.8242 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| No log | 12.0 | 84 | 4.7254 | 0.0956 | 0.0096 | 0.0715 | 0.0714 | 61.38 | 0.0 | 0.0228 | 1.0 | 1.5132 | 1828.0 | 1208.0 | 0.7674 | 0.8298 | 0.7968 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| No log | 13.0 | 91 | 4.8274 | 0.0795 | 0.013 | 0.061 | 0.0603 | 61.68 | 0.0075 | 0.0219 | 1.0 | 1.4909 | 1801.0 | 1208.0 | 0.7666 | 0.8251 | 0.7943 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| No log | 14.0 | 98 | 4.8941 | 0.0822 | 0.0202 | 0.0684 | 0.0691 | 63.0 | 0.0098 | 0.0217 | 1.0 | 1.5414 | 1862.0 | 1208.0 | 0.7746 | 0.8263 | 0.7992 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| No log | 15.0 | 105 | 4.8960 | 0.0787 | 0.0206 | 0.0662 | 0.067 | 63.0 | 0.0106 | 0.0227 | 1.0 | 1.4992 | 1811.0 | 1208.0 | 0.7672 | 0.8255 | 0.7948 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
 
 
 ### Framework versions
```
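
The metric columns in the results table above (ROUGE, BLEU with its precisions and length statistics, and the BERTScore precision/recall/F1 whose hashcode names `roberta-large`) correspond to standard Hugging Face `evaluate` metrics. A minimal sketch of reproducing them offline, assuming `predictions` and `references` are placeholder lists of decoded summaries rather than values from this commit:

```python
import evaluate

# Placeholder data; in practice these come from model.generate() on the eval set.
predictions = ["a generated summary ..."]
references = ["the reference summary ..."]

rouge = evaluate.load("rouge").compute(predictions=predictions, references=references)
bleu = evaluate.load("bleu").compute(predictions=predictions, references=references)
# lang="en" defaults BERTScore to roberta-large, matching the reported hashcode.
bertscore = evaluate.load("bertscore").compute(
    predictions=predictions, references=references, lang="en"
)

print(rouge["rouge1"], rouge["rouge2"], rouge["rougeL"], rouge["rougeLsum"])
print(bleu["bleu"], bleu["precisions"], bleu["brevity_penalty"],
      bleu["length_ratio"], bleu["translation_length"], bleu["reference_length"])
print(bertscore["precision"], bertscore["recall"], bertscore["f1"], bertscore["hashcode"])
```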
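
The hyperparameter hunk above only exposes the total train batch size (16), the AdamW settings, the linear scheduler, and the new epoch count (15). A hedged sketch of the corresponding `Seq2SeqTrainingArguments`; the output directory, per-device batch size, and gradient accumulation below are illustrative guesses, not values from this commit:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="long-t5-tglobal-base-finetuned",  # hypothetical output directory
    per_device_train_batch_size=8,                # assumed split of the total batch size of 16
    gradient_accumulation_steps=2,                # assumed; 8 * 2 = 16 total
    num_train_epochs=15,
    lr_scheduler_type="linear",
    optim="adamw_torch",                          # OptimizerNames.ADAMW_TORCH
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    predict_with_generate=True,                   # needed to compute ROUGE/BLEU at eval time
)
```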

model.safetensors CHANGED

```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:7505a676b9ab398044918c117dcd8d1a0d91bda9f7f7ddf9efb9b3331b40d7de
 size 1187780840
```
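
With the updated `model.safetensors` (and the tokenizer below), the checkpoint loads like any other seq2seq model. A minimal sketch, assuming the repository has been cloned locally to `./long-t5-tglobal-base-finetuned` (the actual repo id is not stated in this commit, and the `summarize:` prefix is a placeholder input):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_dir = "./long-t5-tglobal-base-finetuned"  # placeholder local path to this repo
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForSeq2SeqLM.from_pretrained(model_dir)  # reads model.safetensors

text = "summarize: ..."  # placeholder input
inputs = tokenizer(text, return_tensors="pt")
summary_ids = model.generate(**inputs, max_new_tokens=64)  # Gen Len above is ~63 tokens
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```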

runs/Jul10_11-30-29_tardis/events.out.tfevents.1752139830.tardis.31443.0 ADDED

```diff
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c60dc00864f61f0a27e23a4f22f79544170f64d1b6675a510cc0478d8fffafcb
+size 22680
```

tokenizer.json CHANGED

```diff
@@ -7,7 +7,9 @@
     "stride": 0
   },
   "padding": {
-    "strategy":
+    "strategy": {
+      "Fixed": 64
+    },
     "direction": "Right",
     "pad_to_multiple_of": null,
     "pad_id": 0,
```
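
The `tokenizer.json` change switches the saved padding strategy to a fixed length of 64 tokens (`{"Fixed": 64}`), padding on the right with `pad_id` 0. A small sketch of producing the same configuration with the `tokenizers` library; the `<pad>` token string is assumed from the T5 vocabulary rather than taken from this diff:

```python
from tokenizers import Tokenizer

tokenizer = Tokenizer.from_file("tokenizer.json")
# Fixed-length padding to 64 tokens, right side, pad id 0 (serialized as {"Fixed": 64}).
tokenizer.enable_padding(direction="right", pad_id=0, pad_token="<pad>", length=64)

encoding = tokenizer.encode("A short input that gets padded out to the fixed length.")
print(len(encoding.ids))  # 64
```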

training_args.bin CHANGED

```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:4f0e208ac9fde542e02371cb15d531fd42ea59e9ea3810e596ded26b0a436b01
 size 5905
```