---
license: mit
metrics:
- rouge
tags:
- summarization
- legal
- led
- transformers
- huggingface
- legal-domain
- india
- india-legal
- indian-law
language:
- en
base_model:
- allenai/led-base-16384
---

# 🧠 Legal Summarizer Model (Indian Legal Domain)

This model is a fine-tuned version of [`allenai/led-base-16384`](https://huggingface.co/allenai/led-base-16384), trained on a curated dataset of **Indian legal documents**. It is optimized for summarizing long legal texts such as court judgments, case law, contracts, and regulatory documents from the Indian judiciary and legal system.

## 📌 Model Use Case

This model is intended for generating concise, informative summaries of complex and lengthy legal documents from the **Indian legal system**, including:

- Court judgments (Supreme Court, High Courts)
- Government acts and bills
- Contracts governed by Indian law
- Legal notices and petitions
- Regulatory texts

## 🇮🇳 Domain Specialization

Unlike general-purpose summarization models, this model was trained specifically on Indian legal content:

- Judgments and case law sourced from Indian court databases
- Indian statutes, acts, and amendments
- Public legal notices and contract templates relevant to Indian jurisprudence

As a result, the model captures the vocabulary, phrasing, and structure of Indian legal writing more accurately than a general-purpose summarizer.

## 📈 Evaluation Metrics

| Metric     | Score |
|------------|-------|
| ROUGE-1    | 50.13 |
| ROUGE-2    | 27.15 |
| ROUGE-L    | 28.14 |
| ROUGE-Lsum | 44.75 |

## 🚀 How to Use

```python
import torch
from transformers import LEDTokenizer, LEDForConditionalGeneration

tokenizer = LEDTokenizer.from_pretrained("TheGod-2003/legal-summarizer")
model = LEDForConditionalGeneration.from_pretrained("TheGod-2003/legal-summarizer")

text = "Your long legal document here..."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=16384)

# LED uses sparse local attention; give the first token global attention
# so the decoder can attend to the whole document.
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1

summary_ids = model.generate(
    inputs["input_ids"],
    global_attention_mask=global_attention_mask,
    max_length=512,
    num_beams=4,
)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(summary)
```
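For documents that exceed the 16,384-token encoder window, inputs are truncated and the tail of the document is lost. One common workaround is to summarize overlapping chunks and then join (or re-summarize) the partial summaries. Below is a minimal, model-agnostic sketch of the chunking step; the function name and the word-based splitting are illustrative, not part of this model's API:

```python
def chunk_words(text, chunk_size=12000, overlap=500):
    """Split text into overlapping word-based chunks.

    chunk_size and overlap are in words; pick them so each chunk
    stays safely under the model's 16,384-token limit after
    tokenization (tokens per word varies with the tokenizer).
    """
    words = text.split()
    if not words:
        return []
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks
```

Each chunk would then be passed through the tokenizer and `model.generate` as shown above, and the partial summaries concatenated or summarized once more into a final summary.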