FlanT5-Small Grammar Correction

A google/flan-t5-small model fine-tuned on the grammarly/coedit dataset for English grammatical error correction (GEC).

Training Details

Base model: google/flan-t5-small (77M params)
Dataset: grammarly/coedit (GEC subset, 2000 training examples)
Training recipe: based on the CoEdIT paper (EMNLP 2023)
Epochs: 3
Learning rate: 3e-4
Final training loss: 0.27
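
The card does not say how the 2,000 GEC examples were chosen. The sketch below shows one way such a subset could be drawn. It assumes each record has a "task" field whose value is "gec" for grammar-correction rows; the field name, the seed, and the function select_gec_subset are illustrative assumptions, not the author's actual preprocessing.

```python
import random
from typing import Iterable


def select_gec_subset(records: Iterable[dict], n: int = 2000, seed: int = 0) -> list[dict]:
    """Keep only GEC rows, then take up to n of them in a reproducible random order.

    Assumption: each record is a dict with a "task" key equal to "gec"
    for grammar-correction examples.
    """
    gec_rows = [r for r in records if r.get("task") == "gec"]
    rng = random.Random(seed)  # fixed seed so the subset is repeatable
    rng.shuffle(gec_rows)
    return gec_rows[:n]


# Hypothetical usage (needs the `datasets` package and network access):
# from datasets import load_dataset
# train = load_dataset("grammarly/coedit", split="train")
# subset = select_gec_subset(train, n=2000)
```

Fixing the seed keeps the subset stable across runs, which makes loss curves from repeated experiments comparable.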

Usage

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the fine-tuned checkpoint and its tokenizer from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("xhimanshuz/flan-t5-small-grammar-correction")
model = AutoModelForSeq2SeqLM.from_pretrained("xhimanshuz/flan-t5-small-grammar-correction")

# The input must start with one of the supported instruction prefixes.
text = "Fix the grammar: I goes to school yesterday and learn many thing."
# Inputs longer than 128 tokens are truncated.
inputs = tokenizer(text, return_tensors="pt", max_length=128, truncation=True)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
# Output: I went to school yesterday and learned many things.

Supported Instructions

Prefix the input text with one of these instructions from the CoEdIT format:

"Fix the grammar: <text>"
"Fix grammatical errors in this sentence: <text>"
"Improve the grammaticality: <text>"
"Remove all grammatical errors from this text: <text>"
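
A small helper can keep prompts consistent with the prefixes listed above. This is a minimal sketch: the names INSTRUCTIONS and build_prompt are hypothetical and not part of the model or the transformers library.

```python
# The four instruction prefixes listed in this card.
INSTRUCTIONS = (
    "Fix the grammar",
    "Fix grammatical errors in this sentence",
    "Improve the grammaticality",
    "Remove all grammatical errors from this text",
)


def build_prompt(text: str, instruction: str = INSTRUCTIONS[0]) -> str:
    """Return '<instruction>: <text>', rejecting prefixes not listed in this card."""
    if instruction not in INSTRUCTIONS:
        raise ValueError(f"Unsupported instruction: {instruction!r}")
    return f"{instruction}: {text.strip()}"


print(build_prompt("She don't know what are she doing."))
# Fix the grammar: She don't know what are she doing.
```

The returned string can be passed to the tokenizer in place of the hand-written text in the usage example.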

Example Results

Input	Output
I goes to school yesterday and learn many thing.	I went to school yesterday and learned many things.
She don't know what are she doing.	She doesn't know what she is doing.
The informations was very helpfull for our researchs.	The information was very helpful for our research.
He have went to the market and buyed some apple.	He has gone to the market and bought some apple.
The childs was playing in park when it start raining.	The children were playing in the park when it started raining.
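
To see which words the model changed in a pair like those above, a word-level diff is often enough. The sketch below uses Python's standard difflib module; the function name word_edits is a hypothetical helper, and splitting on whitespace is a simplifying assumption (punctuation stays attached to words).

```python
import difflib


def word_edits(source: str, corrected: str) -> list[tuple[str, str]]:
    """List (before, after) word spans that differ between two sentences."""
    src, out = source.split(), corrected.split()
    matcher = difflib.SequenceMatcher(a=src, b=out, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # replace, insert, or delete
            edits.append((" ".join(src[i1:i2]), " ".join(out[j1:j2])))
    return edits


for before, after in word_edits(
    "I goes to school yesterday and learn many thing.",
    "I went to school yesterday and learned many things.",
):
    print(f"{before!r} -> {after!r}")
```

An empty string on the left of a pair marks an insertion, and an empty string on the right marks a deletion.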

Training Loss Curve

Step	Loss	Epoch
1	0.669	0.00
100	0.484	0.40
250	0.448	1.00
500	0.325	2.00
750	0.292	3.00
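
The table also implies a few quantities that the card does not state directly. The arithmetic below is a sketch under the assumption that every one of the 2,000 examples is seen once per epoch. The resulting effective batch size is an inference from the table, not a documented setting, and it could reflect gradient accumulation.

```python
# Values taken from the training details and the loss table above.
train_examples = 2000
total_steps = 750
epochs = 3
first_loss, last_loss = 0.669, 0.292

# 750 steps over 3 epochs gives 250 optimizer steps per epoch.
steps_per_epoch = total_steps // epochs

# Assumption: all examples are used each epoch, so examples / steps
# gives the effective batch size (per-device batch x accumulation).
effective_batch_size = train_examples / steps_per_epoch

# Fraction by which the logged loss fell between step 1 and step 750.
relative_drop = (first_loss - last_loss) / first_loss

print(steps_per_epoch, effective_batch_size, round(relative_drop, 3))
# 250 8.0 0.564
```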

Scaling Up

This model was trained on CPU on a 2,000-example subset as a demonstration. To improve performance:

More data: Train on the full 19K GEC examples from grammarly/coedit, or all 69K examples (including simplification, paraphrasing, etc.)
Larger model: Use google/flan-t5-base (250M) or google/flan-t5-large (770M)
GPU training: Use A10G or A100 GPUs for faster training with larger batch sizes
More epochs: Train for 5 epochs with early stopping (CoEdIT paper recipe)
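
Early stopping, mentioned in the last item, can be illustrated without any training framework. The sketch below is a generic patience-based rule. The patience value, the min_delta threshold, and the example losses are hypothetical; the card does not state the settings used in the CoEdIT paper.

```python
def early_stopping_epoch(eval_losses, patience=1, min_delta=0.0):
    """Return (stop_epoch, best_epoch), both 1-indexed.

    Training stops once the validation loss has failed to improve by more
    than min_delta for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = 0
    waited = 0
    for epoch, loss in enumerate(eval_losses, start=1):
        if loss < best_loss - min_delta:
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    # Ran through every epoch without triggering the stop rule.
    return len(eval_losses), best_epoch


# Hypothetical validation losses for a 5-epoch run.
print(early_stopping_epoch([0.50, 0.40, 0.38, 0.39, 0.41], patience=1))
# (4, 3)
```

In this example the run stops after epoch 4, and the checkpoint from epoch 3 would be the one to keep.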

Citation

@inproceedings{raheja2023coedit,
  title={CoEdIT: Text Editing by Task-Specific Instruction Tuning},
  author={Raheja, Vipul and Kumar, Dhruv and Koo, Ryan and Kang, Dongyeop},
  booktitle={Findings of the Association for Computational Linguistics: EMNLP 2023},
  year={2023}
}

Paper: "CoEdIT: Text Editing by Task-Specific Instruction Tuning", arXiv:2305.09857 (published May 17, 2023)