# FLAN-T5-Base Fine-tuned on XSum with LoRA
This model is a fine-tuned version of google/flan-t5-base on the XSum dataset using LoRA (Low-Rank Adaptation) for parameter-efficient fine-tuning.
## Model Description
- Base Model: google/flan-t5-base
- Task: Extreme Summarization (one-sentence summaries)
- Dataset: XSum (BBC news articles)
- Training Method: LoRA (Low-Rank Adaptation)
- Parameters: ~1.77M trainable (~0.71% of 249.35M total)
## Training Details

### LoRA Configuration
- Rank (r): 16
- Alpha: 32
- Target modules: q, v
- Dropout: 0.05
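For reference, the settings above correspond to the following PEFT `LoraConfig` (a minimal sketch; the variable name is illustrative):

```python
from peft import LoraConfig, TaskType

# LoRA adapter configuration matching the settings listed above.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,  # encoder-decoder (T5-style) task
    r=16,                             # rank of the low-rank update matrices
    lora_alpha=32,                    # scaling factor (alpha / r = 2.0)
    target_modules=["q", "v"],        # apply LoRA to query and value projections
    lora_dropout=0.05,
)
```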
### Training Hyperparameters
- Learning rate: 3e-4
- Batch size: 8
- Epochs: 3
- Optimizer: AdamW
- Mixed precision: FP16
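These hyperparameters map onto `Seq2SeqTrainingArguments` from `transformers` roughly as sketched below; this is illustrative (the `output_dir` is a placeholder), not the exact training script:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-base-xsum-lora",  # placeholder output path
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    num_train_epochs=3,
    fp16=True,            # mixed-precision training
    optim="adamw_torch",  # AdamW optimizer
)
```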
## Usage
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from peft import PeftModel

# Load base model and tokenizer
base_model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")
tokenizer = AutoTokenizer.from_pretrained("AKG2/flan-t5-base-xsum-lora")

# Load LoRA adapters on top of the base model
model = PeftModel.from_pretrained(base_model, "AKG2/flan-t5-base-xsum-lora")

# Generate a one-sentence summary
text = "Your article text here..."
inputs = tokenizer("summarize: " + text, return_tensors="pt", max_length=512, truncation=True)
outputs = model.generate(**inputs, max_length=64, num_beams=4, length_penalty=2.0)
summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(summary)
```
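To serve the model without the PEFT wrapper, the adapter weights can be folded into the base model using standard PEFT functionality (the save path below is illustrative):

```python
# Optional: merge the LoRA weights into the base model for standalone inference.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("flan-t5-base-xsum-merged")  # illustrative path
```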
## Performance
Evaluation metrics on XSum test set:
- ROUGE-1: [Add your score]
- ROUGE-2: [Add your score]
- ROUGE-L: [Add your score]
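The scores above can be computed with the Hugging Face `evaluate` library; a minimal sketch, assuming `predictions` are model-generated summaries and `references` are the gold one-sentence summaries from the XSum test split:

```python
import evaluate

rouge = evaluate.load("rouge")
results = rouge.compute(
    predictions=["Generated one-sentence summary."],   # model outputs
    references=["Reference one-sentence summary."],    # XSum gold summaries
)
print(results)  # keys: 'rouge1', 'rouge2', 'rougeL', 'rougeLsum'
```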
## Citation
If you use this model, please cite the original FLAN-T5 paper and the XSum dataset:
```bibtex
@article{chung2022scaling,
  title={Scaling instruction-finetuned language models},
  author={Chung, Hyung Won and others},
  journal={arXiv preprint arXiv:2210.11416},
  year={2022}
}

@inproceedings{narayan2018don,
  title={Don't give me the details, just the summary! Topic-aware convolutional neural networks for extreme summarization},
  author={Narayan, Shashi and others},
  booktitle={EMNLP},
  year={2018}
}
```
## License
This model inherits the Apache 2.0 license from its base model, google/flan-t5-base.
Trained by: AKG2