This model is google/flan-t5-base fine-tuned for abstractive summarization on the XSum dataset using LoRA (Low-Rank Adaptation), a parameter-efficient method that trains only small low-rank adapter matrices while the base model weights stay frozen.
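As a minimal sketch of what the adapter setup looks like with the peft library (the rank, alpha, dropout, and target modules below are assumptions for illustration, not the values used for this checkpoint):
from transformers import AutoModelForSeq2SeqLM
from peft import LoraConfig, TaskType, get_peft_model

# Wrap the frozen base model with trainable low-rank adapters
base = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")
lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=16,                       # adapter rank (assumed, not stated in this card)
    lora_alpha=32,              # scaling factor (assumed)
    lora_dropout=0.1,           # dropout on the adapter layers (assumed)
    target_modules=["q", "v"],  # T5 attention query/value projections, a common choice
)
peft_model = get_peft_model(base, lora_config)
peft_model.print_trainable_parameters()  # typically well under 1% of all parameters
To load the published adapters for inference: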
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from peft import PeftModel
# Load base model and tokenizer
base_model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")
tokenizer = AutoTokenizer.from_pretrained("Ekansh112/flan-t5-base-xsum-lora")
# Load LoRA adapters
model = PeftModel.from_pretrained(base_model, "Ekansh112/flan-t5-base-xsum-lora")
# Generate summary
text = "Your article text here..."
inputs = tokenizer("summarize: " + text, return_tensors="pt", max_length=512, truncation=True)
outputs = model.generate(**inputs, max_length=64, num_beams=4, length_penalty=2.0)
summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(summary)
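If you prefer a standalone checkpoint with no peft dependency at inference time, the adapters can optionally be merged into the base weights; the output directory name below is illustrative.
from transformers import AutoModelForSeq2SeqLM
from peft import PeftModel

base_model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")
model = PeftModel.from_pretrained(base_model, "Ekansh112/flan-t5-base-xsum-lora")

# Fold the LoRA weights into the base model and save a plain seq2seq checkpoint
merged = model.merge_and_unload()
merged.save_pretrained("flan-t5-base-xsum-merged")  # hypothetical local path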
Evaluation metrics were computed on the XSum test set.
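ROUGE is the standard metric for XSum. As a hedged sketch (not the exact evaluation script used for this model), scores can be recomputed with the evaluate and datasets libraries; the dataset id, the small test slice, and the generation settings below are illustrative.
import evaluate
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained("Ekansh112/flan-t5-base-xsum-lora")
base_model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")
model = PeftModel.from_pretrained(base_model, "Ekansh112/flan-t5-base-xsum-lora")

rouge = evaluate.load("rouge")
xsum_test = load_dataset("EdinburghNLP/xsum", split="test[:100]")  # small slice for illustration

# Generate a summary for each article and score against the reference summaries
predictions = []
for article in xsum_test["document"]:
    inputs = tokenizer("summarize: " + article, return_tensors="pt", max_length=512, truncation=True)
    outputs = model.generate(**inputs, max_length=64, num_beams=4, length_penalty=2.0)
    predictions.append(tokenizer.decode(outputs[0], skip_special_tokens=True))

print(rouge.compute(predictions=predictions, references=xsum_test["summary"]))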
If you use this model, please cite the original FLAN-T5 paper and the XSum dataset:
@article{chung2022scaling,
title={Scaling instruction-finetuned language models},
author={Chung, Hyung Won and others},
journal={arXiv preprint arXiv:2210.11416},
year={2022}
}
@inproceedings{narayan2018don,
title={Don't give me the details, just the summary! topic-aware convolutional neural networks for extreme summarization},
author={Narayan, Shashi and others},
booktitle={EMNLP},
year={2018}
}
This model inherits the license from the base model: Apache 2.0
Trained by: Ekansh112
Base model: google/flan-t5-base