---
library_name: peft
license: apache-2.0
base_model: allenai/led-base-16384
tags:
  - generated_from_trainer
metrics:
  - rouge
  - bleu
  - precision
  - recall
  - f1
model-index:
  - name: Lora_LED_sum_approach
    results: []
---

# Lora_LED_sum_approach

This model is a fine-tuned version of [allenai/led-base-16384](https://huggingface.co/allenai/led-base-16384) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 3.5599
- Rouge1: 0.4697
- Rouge2: 0.239
- Rougel: 0.3921
- Rougelsum: 0.3927
- Gen Len: 29.3
- Bleu: 0.1424
- Precisions: 0.2062
- Brevity Penalty: 0.8922
- Length Ratio: 0.8976
- Translation Length: 1096.0
- Reference Length: 1221.0
- Precision: 0.9067
- Recall: 0.9023
- F1: 0.9044
- Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)

The Precision, Recall, and F1 values are BERTScore metrics; the hashcode identifies the BERTScore configuration (roberta-large, layer 17, no IDF weighting).
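
Since this is a PEFT adapter on top of `allenai/led-base-16384`, it can presumably be loaded as sketched below. The adapter repo id is an assumption (inferred from the card name), not confirmed by this card; LED models expect a global attention mask with at least the first token marked global.

```python
# Minimal usage sketch, assuming the adapter is published under the repo id below.
import torch
from transformers import AutoTokenizer, LEDForConditionalGeneration
from peft import PeftModel

base = LEDForConditionalGeneration.from_pretrained("allenai/led-base-16384")
# Hypothetical adapter repo id; substitute the actual location of this adapter.
model = PeftModel.from_pretrained(base, "floflodebilbao/Lora_LED_sum_approach")
tokenizer = AutoTokenizer.from_pretrained("allenai/led-base-16384")

text = "Long document to summarize ..."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=16384)

# LED requires global attention on at least the first token.
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1

summary_ids = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    global_attention_mask=global_attention_mask,
    max_new_tokens=64,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```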

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.002
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 16
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 10
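
A hedged sketch of how these hyperparameters map onto `Seq2SeqTrainingArguments`. The LoRA configuration itself (rank, alpha, target modules) is not reported in this card, so those values are placeholders, not the settings actually used.

```python
from transformers import LEDForConditionalGeneration, Seq2SeqTrainingArguments
from peft import LoraConfig, get_peft_model

# Placeholder LoRA settings: the actual r/alpha/target_modules are not stated in the card.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    task_type="SEQ_2_SEQ_LM",
)

base = LEDForConditionalGeneration.from_pretrained("allenai/led-base-16384")
model = get_peft_model(base, lora_config)

# Reported hyperparameters mapped onto Seq2SeqTrainingArguments.
training_args = Seq2SeqTrainingArguments(
    output_dir="Lora_LED_sum_approach",
    learning_rate=2e-3,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=16,  # total train batch size: 16
    num_train_epochs=10,
    lr_scheduler_type="linear",
    optim="adamw_torch",  # AdamW with betas=(0.9, 0.999), eps=1e-08 (defaults)
    seed=42,
)
```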

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:---------:|:------:|:------:|:--------:|
| 7.7469 | 1.0 | 7 | 6.8825 | 0.4069 | 0.2053 | 0.3491 | 0.3496 | 32.0 | 0.1293 | 0.164 | 1.0 | 1.0713 | 1308.0 | 1221.0 | 0.8782 | 0.8868 | 0.8824 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 5.647 | 2.0 | 14 | 4.7268 | 0.4079 | 0.2091 | 0.3571 | 0.3564 | 24.94 | 0.1023 | 0.2027 | 0.6841 | 0.7248 | 885.0 | 1221.0 | 0.9076 | 0.8896 | 0.8984 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 4.2551 | 3.0 | 21 | 3.9355 | 0.4487 | 0.2508 | 0.3879 | 0.3876 | 27.34 | 0.1555 | 0.2293 | 0.8182 | 0.8329 | 1017.0 | 1221.0 | 0.9067 | 0.8982 | 0.9023 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 3.6931 | 4.0 | 28 | 3.7415 | 0.4466 | 0.2287 | 0.3819 | 0.3833 | 25.88 | 0.126 | 0.217 | 0.7559 | 0.7813 | 954.0 | 1221.0 | 0.9073 | 0.8943 | 0.9006 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 3.4714 | 5.0 | 35 | 3.6417 | 0.4519 | 0.2393 | 0.3936 | 0.3948 | 27.74 | 0.1386 | 0.2131 | 0.8231 | 0.837 | 1022.0 | 1221.0 | 0.9094 | 0.8988 | 0.904 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 3.3284 | 6.0 | 42 | 3.6012 | 0.4464 | 0.2381 | 0.3804 | 0.383 | 28.96 | 0.1494 | 0.2089 | 0.8721 | 0.8796 | 1074.0 | 1221.0 | 0.9039 | 0.8991 | 0.9014 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 3.245 | 7.0 | 49 | 3.5702 | 0.4443 | 0.2155 | 0.3753 | 0.3765 | 28.2 | 0.1286 | 0.198 | 0.8525 | 0.8624 | 1053.0 | 1221.0 | 0.906 | 0.8975 | 0.9016 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 3.1794 | 8.0 | 56 | 3.5747 | 0.4596 | 0.2332 | 0.3882 | 0.3881 | 30.18 | 0.148 | 0.2069 | 0.9075 | 0.9115 | 1113.0 | 1221.0 | 0.9018 | 0.9007 | 0.9012 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 3.1144 | 9.0 | 63 | 3.5583 | 0.4513 | 0.2278 | 0.3795 | 0.3806 | 29.26 | 0.1358 | 0.2003 | 0.8794 | 0.8862 | 1082.0 | 1221.0 | 0.9037 | 0.9 | 0.9018 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 3.1082 | 10.0 | 70 | 3.5599 | 0.4697 | 0.239 | 0.3921 | 0.3927 | 29.3 | 0.1424 | 0.2062 | 0.8922 | 0.8976 | 1096.0 | 1221.0 | 0.9067 | 0.9023 | 0.9044 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
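
The hashcode column suggests the Precision/Recall/F1 figures were computed with the `bert-score` package (roberta-large, layer 17, no IDF). A sketch of reproducing the evaluation metrics with the `evaluate` library follows; the exact evaluation setup used for this card is an assumption.

```python
# Sketch: computing ROUGE, BLEU, and BERTScore as they appear in this card.
import evaluate

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")
bertscore = evaluate.load("bertscore")

preds = ["a generated summary"]   # placeholder predictions
refs = ["the reference summary"]  # placeholder references

print(rouge.compute(predictions=preds, references=refs))
print(bleu.compute(predictions=preds, references=refs))
# roberta-large defaults to layer 17 in bert-score, matching the hashcode.
print(bertscore.compute(predictions=preds, references=refs,
                        model_type="roberta-large"))
```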

### Framework versions

- PEFT 0.15.2
- Transformers 4.53.1
- Pytorch 2.7.0+cu126
- Datasets 3.6.0
- Tokenizers 0.21.1