DeAR-Reranking
Collection
DeAR (Deep Agent Rank): Dual-Stage Document Reranking with Reasoning Agents Accepted at EMNLP Findings 2025
DeAR-8B-Reranker-RankNet-v1 is an 8B-parameter neural reranker trained with RankNet loss and knowledge distillation. It is part of the DeAR framework family and reaches 74.5 NDCG@10 on TREC DL19 while reranking 100 documents in about 2.2 s, compared with 5.8 s for the 13B teacher model.
- High Performance: Competitive with the 13B teacher on BEIR benchmarks
- Fast Inference: 2.2s average latency on a standard GPU
- Memory Efficient: Fits on a single 24GB GPU
- Knowledge Distillation: Enhanced with Chain-of-Thought reasoning
| Benchmark | NDCG@10 |
|---|---|
| TREC DL19 | 74.5 |
| TREC DL20 | 72.8 |
| BEIR (Avg) | 45.2 |
| MS MARCO Dev | 68.9 |
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load model
model_path = "abdoelsayed/dear-8b-reranker-ranknet-v1"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16
)
model.eval().cuda()

# Score a query-document pair
query = "What is machine learning?"
document = "Machine learning is a subset of artificial intelligence..."
inputs = tokenizer(
    f"query: {query}",
    f"document: {document}",
    return_tensors="pt",
    truncation=True,
    max_length=228,  # q_max_len(32) + p_max_len(196)
    padding="max_length"
)
inputs = {k: v.cuda() for k, v in inputs.items()}

# Inference only, so gradients are not needed
with torch.no_grad():
    score = model(**inputs).logits.squeeze().item()
print(f"Relevance score: {score}")
def rerank_documents(query, documents, model, tokenizer, batch_size=64):
    """
    Rerank a list of documents for a query.

    Args:
        query: Search query string
        documents: List of (title, text) tuples
        model: Loaded reranker model
        tokenizer: Loaded tokenizer
        batch_size: Batch size for inference

    Returns:
        List of (index, score) tuples sorted by relevance
    """
    scores = []
    for i in range(0, len(documents), batch_size):
        batch_docs = documents[i:i + batch_size]

        # Prepare inputs
        queries = [f"query: {query}"] * len(batch_docs)
        docs = [f"document: {title} {text}" for title, text in batch_docs]
        inputs = tokenizer(
            queries,
            docs,
            return_tensors="pt",
            truncation=True,
            max_length=228,
            padding=True
        )
        inputs = {k: v.to(model.device) for k, v in inputs.items()}

        # Get scores
        with torch.no_grad():
            logits = model(**inputs).logits.squeeze(-1)
        scores.extend(logits.cpu().tolist())

    # Sort by score (descending)
    ranked = sorted(enumerate(scores), key=lambda x: x[1], reverse=True)
    return ranked

# Example usage
query = "When was the Eiffel Tower built?"
documents = [
    ("Eiffel Tower", "The Eiffel Tower was built in 1889 for the World's Fair."),
    ("Paris", "Paris is the capital of France."),
    ("Architecture", "Modern architecture has evolved significantly."),
]
ranking = rerank_documents(query, documents, model, tokenizer)
print(ranking)
# Output: [(0, 8.23), (1, 2.45), (2, -1.87)]
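Because `rerank_documents` returns `(index, score)` pairs rather than the documents themselves, a small helper is usually needed to map the ranking back to the original list. The following is a minimal sketch; `top_k_documents` is a hypothetical name, not part of the released code, and the scores are the illustrative values from the example output above.

```python
def top_k_documents(ranking, documents, k=2):
    """Map the first k (index, score) pairs back to their (title, text) documents."""
    return [(documents[idx], score) for idx, score in ranking[:k]]


documents = [
    ("Eiffel Tower", "The Eiffel Tower was built in 1889 for the World's Fair."),
    ("Paris", "Paris is the capital of France."),
    ("Architecture", "Modern architecture has evolved significantly."),
]
# Illustrative ranking, copied from the example output above (not a fresh model run)
ranking = [(0, 8.23), (1, 2.45), (2, -1.87)]

for (title, text), score in top_k_documents(ranking, documents):
    print(f"{score:6.2f}  {title}")
```

Keeping indices in the ranking makes it easy to join the scores with any metadata stored alongside the documents.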
RankNet Loss with Knowledge Distillation:
L_total = (1 - α) * L_RankNet + α * L_KD
where:
- L_RankNet: Pairwise ranking loss
- L_KD: KL divergence with teacher (temperature=2)
- α: 0.1 (distillation weight)
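To make the weighting concrete, the sketch below evaluates the combined objective for one query in plain Python. It is an illustration under stated assumptions, not the training code: it assumes the RankNet term is the mean logistic loss over pairs where one document has a higher label than the other, and that the distillation term is KL(teacher || student) over temperature-softened softmax distributions of the candidate scores. All function names are hypothetical.

```python
import math


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ranknet_loss(scores, labels):
    """Mean pairwise logistic loss over pairs (i, j) with labels[i] > labels[j]."""
    losses = []
    for i in range(len(scores)):
        for j in range(len(scores)):
            if labels[i] > labels[j]:
                diff = scores[i] - scores[j]
                # log(1 + exp(-diff)), written in a numerically stable form
                losses.append(max(-diff, 0.0) + math.log1p(math.exp(-abs(diff))))
    return sum(losses) / len(losses) if losses else 0.0


def kd_loss(student_scores, teacher_scores, temperature=2.0):
    """KL(teacher || student) between temperature-softened distributions (assumed form)."""
    p = softmax(teacher_scores, temperature)
    q = softmax(student_scores, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def total_loss(student_scores, teacher_scores, labels, alpha=0.1, temperature=2.0):
    """L_total = (1 - alpha) * L_RankNet + alpha * L_KD."""
    l_rank = ranknet_loss(student_scores, labels)
    l_kd = kd_loss(student_scores, teacher_scores, temperature)
    return (1 - alpha) * l_rank + alpha * l_kd


# Hypothetical scores for three candidate documents of one query
student = [2.0, 0.5, -1.0]
teacher = [3.0, 0.0, -2.0]
labels = [1, 0, 0]
print(total_loss(student, teacher, labels))
```

With a distillation weight of 0.1, the pairwise ranking term dominates and the teacher signal acts as a mild regularizer.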
| Dataset | NDCG@10 | NDCG@20 | MAP |
|---|---|---|---|
| DL19 | 74.50 | 70.23 | 45.67 |
| DL20 | 72.80 | 69.15 | 43.21 |
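The NDCG values above are reported on a 0-100 scale. As a reminder of what the metric measures, the sketch below computes NDCG@k for a single query from graded relevance labels. It assumes linear gain (the label itself), a log2 rank discount, and that the input list contains every judged document for the query; the evaluation tooling behind the table may differ in these details. The names `dcg_at_k` and `ndcg_at_k` are illustrative.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k positions (linear gain)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_relevances, k=10):
    """DCG of the given ordering divided by the DCG of the ideal ordering."""
    ideal = sorted(ranked_relevances, reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    return dcg_at_k(ranked_relevances, k) / ideal_dcg if ideal_dcg > 0 else 0.0


# Relevance labels of the documents in the order the reranker returned them
print(100 * ndcg_at_k([3, 0, 2, 1], k=10))
```

A benchmark score is the mean of this per-query value over all queries in the test set.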
| Dataset | NDCG@10 |
|---|---|
| MS MARCO | 68.9 |
| NQ | 52.3 |
| HotpotQA | 61.8 |
| FiQA | 47.2 |
| ArguAna | 59.4 |
| SciFact | 73.6 |
| TREC-COVID | 85.2 |
| NFCorpus | 39.8 |
| Metric | Value |
|---|---|
| Inference Time (100 docs) | 2.2s |
| GPU Memory (inference) | 18GB |
| Throughput | ~45 docs/sec |
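The latency and throughput rows are consistent with each other: 100 documents in 2.2 s is roughly 45 documents per second. To reproduce such numbers on other hardware, a simple timing harness along the following lines could be used. This is a sketch; `measure_throughput` is a hypothetical helper, and `rerank_fn` stands for any callable that reranks a list of documents. When timing GPU inference, the callable should call `torch.cuda.synchronize()` before returning so that asynchronous kernels are included in the measurement.

```python
import time


def measure_throughput(rerank_fn, documents, repeats=3):
    """Return (mean_seconds, docs_per_second) for calling rerank_fn(documents)."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        rerank_fn(documents)
        timings.append(time.perf_counter() - start)
    mean_seconds = sum(timings) / len(timings)
    return mean_seconds, len(documents) / mean_seconds


# Cross-check of the reported figures: 100 documents in 2.2 seconds
docs_per_second = 100 / 2.2
print(f"{docs_per_second:.1f} docs/sec")
```

Running a warm-up call before measuring avoids counting one-time costs such as CUDA initialization.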
| Model | Size | TREC DL19 | BEIR Avg | Inference (s) |
|---|---|---|---|---|
| MonoT5-3B | 3B | 71.8 | 43.5 | 3.5 |
| DeAR-P-8B-RL | 8B | 74.5 | 45.2 | 2.2 |
| Teacher (13B) | 13B | 73.8 | 44.8 | 5.8 |
Input: "query: [Q] [SEP] document: [D]"
↓
LLaMA-3.1-8B Encoder
↓
[CLS] Token Representation
↓
Linear Classification Head
↓
Relevance Score (scalar)
This model inherits biases present in its base model and training data. Users should evaluate fairness for their specific use cases.
DeAR Family:
Teacher:
Dataset:
@article{abdallah2025dear,
  title={DeAR: Dual-Stage Document Reranking with Reasoning Agents via LLM Distillation},
  author={Abdallah, Abdelrahman and Mozafari, Jamshid and Piryani, Bhawna and Jatowt, Adam},
  journal={arXiv preprint arXiv:2508.16998},
  year={2025}
}
MIT License
Base model: meta-llama/Llama-3.1-8B