DeAR-8B-Reranker-RankNet-v1

Model Description

DeAR-8B-Reranker-RankNet-v1 is an 8B-parameter neural reranker trained with RankNet loss and knowledge distillation. It is part of the DeAR framework family and achieves strong results on standard IR benchmarks (74.5 NDCG@10 on TREC DL19) while reranking 100 documents in about 2.2 s, compared with 5.8 s for the 13B teacher model.

Model Details

  • Model Type: Pointwise Reranker (Sequence Classification)
  • Base Model: LLaMA-3.1-8B
  • Parameters: 8 billion
  • Training Method: Knowledge Distillation + RankNet Loss
  • Teacher Model: LLaMA2-13B-RankLLaMA
  • Training Data: MS MARCO
  • Precision: BFloat16

Key Features

✅ High Performance: Competitive with the 13B teacher on BEIR (45.2 vs. 44.8 average NDCG@10)
✅ Fast Inference: About 2.2 s to rerank 100 documents (~45 docs/sec)
✅ Memory Efficient: Fits on a single 24 GB GPU (about 18 GB at inference)
✅ Knowledge Distillation: Enhanced with Chain-of-Thought reasoning

Performance

| Benchmark    | NDCG@10 |
|--------------|---------|
| TREC DL19    | 74.5    |
| TREC DL20    | 72.8    |
| BEIR (Avg)   | 45.2    |
| MS MARCO Dev | 68.9    |
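
For readers unfamiliar with the metric, the sketch below shows one common way to compute NDCG@10 for a single query. It is an illustration, not the evaluation script behind the numbers above: it assumes linear gains (the graded label itself), a log2 rank discount, and that the input list contains every judged document for the query. The function names are hypothetical. The reported values appear to be NDCG@10 multiplied by 100.

```python
import math

def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the first k graded labels (linear gain)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(ranked_relevances, k=10):
    """NDCG@k: DCG of the system ranking divided by DCG of the ideal ranking."""
    ideal = sorted(ranked_relevances, reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    if ideal_dcg == 0:
        return 0.0  # no relevant documents judged for this query
    return dcg_at_k(ranked_relevances, k) / ideal_dcg

# Graded labels of the documents, in the order the reranker returned them
print(ndcg_at_k([3, 2, 0, 1], k=10))
```

A benchmark score is then the mean of this per-query value over all queries.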

Usage

Quick Start

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load model
model_path = "abdoelsayed/dear-8b-reranker-ranknet-v1"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16
)
model.eval().cuda()

# Score a query-document pair
query = "What is machine learning?"
document = "Machine learning is a subset of artificial intelligence..."

inputs = tokenizer(
    f"query: {query}",
    f"document: {document}",
    return_tensors="pt",
    truncation=True,
    max_length=228,  # q_max_len(32) + p_max_len(196)
    padding="max_length"
)
inputs = {k: v.cuda() for k, v in inputs.items()}

with torch.no_grad():
    score = model(**inputs).logits.squeeze().item()
    
print(f"Relevance score: {score}")

Batch Reranking

def rerank_documents(query, documents, model, tokenizer, batch_size=64):
    """
    Rerank a list of documents for a query.
    
    Args:
        query: Search query string
        documents: List of (title, text) tuples
        model: Loaded reranker model
        tokenizer: Loaded tokenizer
        batch_size: Batch size for inference
    
    Returns:
        List of (index, score) tuples sorted by relevance
    """
    scores = []
    
    for i in range(0, len(documents), batch_size):
        batch_docs = documents[i:i + batch_size]
        
        # Prepare inputs
        queries = [f"query: {query}"] * len(batch_docs)
        docs = [f"document: {title} {text}" for title, text in batch_docs]
        
        inputs = tokenizer(
            queries,
            docs,
            return_tensors="pt",
            truncation=True,
            max_length=228,
            padding=True
        )
        inputs = {k: v.to(model.device) for k, v in inputs.items()}
        
        # Get scores
        with torch.no_grad():
            logits = model(**inputs).logits.squeeze(-1)
            scores.extend(logits.cpu().tolist())
    
    # Sort by score (descending)
    ranked = sorted(enumerate(scores), key=lambda x: x[1], reverse=True)
    return ranked


# Example usage
query = "When was the Eiffel Tower built?"
documents = [
    ("Eiffel Tower", "The Eiffel Tower was built in 1889 for the World's Fair."),
    ("Paris", "Paris is the capital of France."),
    ("Architecture", "Modern architecture has evolved significantly."),
]

ranking = rerank_documents(query, documents, model, tokenizer)
print(ranking)
# Example output (scores are illustrative): [(0, 8.23), (1, 2.45), (2, -1.87)]
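
The returned list holds indices into the original document list, so a small post-processing step is usually needed to recover the documents themselves. The helper below is a hypothetical convenience function, not part of the model's API, and the scores in the example are the illustrative ones from the snippet above.

```python
def top_k_documents(ranking, documents, k=2):
    """Map (index, score) pairs back to their (title, text) tuples, keeping the top k."""
    return [(documents[idx], score) for idx, score in ranking[:k]]

documents = [
    ("Eiffel Tower", "The Eiffel Tower was built in 1889 for the World's Fair."),
    ("Paris", "Paris is the capital of France."),
    ("Architecture", "Modern architecture has evolved significantly."),
]
ranking = [(0, 8.23), (1, 2.45), (2, -1.87)]  # illustrative scores, already sorted

for (title, _text), score in top_k_documents(ranking, documents, k=2):
    print(f"{score:6.2f}  {title}")
```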

Training Details

Training Data

  • Primary Dataset: MS MARCO Passage Ranking

Hardware

  • GPUs: 4x NVIDIA A100 (40GB)
  • Training Time: ~36 hours
  • DeepSpeed: ZeRO Stage 2

Loss Function

RankNet Loss with Knowledge Distillation:

L_total = (1 - α) * L_RankNet + α * L_KD

where:
- L_RankNet: Pairwise ranking loss
- L_KD: KL divergence with teacher (temperature=2)
- α: 0.1 (distillation weight)
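
To make the arithmetic of the combined objective concrete, here is a framework-free sketch for one query and its candidate list. Several details are assumptions, since the card does not specify them: the RankNet term is averaged over all pairs whose labels differ, the KL term compares softmax distributions over the candidate scores (teacher as target), and no temperature-squared rescaling is applied. All function names are hypothetical, and actual training code would use autograd tensors rather than Python floats.

```python
import math

def ranknet_loss(scores, labels):
    """Mean pairwise logistic loss over pairs (i, j) with labels[i] > labels[j]."""
    losses = []
    for i in range(len(scores)):
        for j in range(len(scores)):
            if labels[i] > labels[j]:
                x = -(scores[i] - scores[j])
                # Numerically stable softplus: log(1 + exp(x))
                losses.append(max(x, 0.0) + math.log1p(math.exp(-abs(x))))
    return sum(losses) / len(losses) if losses else 0.0

def softmax(values, temperature=1.0):
    """Temperature-scaled softmax over a list of scores."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_scores, teacher_scores, temperature=2.0):
    """KL(teacher || student) between temperature-softened score distributions."""
    p = softmax(teacher_scores, temperature)
    q = softmax(student_scores, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def total_loss(student_scores, teacher_scores, labels, alpha=0.1, temperature=2.0):
    """(1 - alpha) * L_RankNet + alpha * L_KD, with alpha as the distillation weight."""
    rank_term = ranknet_loss(student_scores, labels)
    kd_term = kd_loss(student_scores, teacher_scores, temperature)
    return (1 - alpha) * rank_term + alpha * kd_term

# One relevant and two non-relevant candidates (made-up scores)
print(total_loss([2.0, 0.5, -1.0], [3.0, 0.0, -2.0], [1, 0, 0]))
```

With a distillation weight of 0.1, the pairwise ranking term dominates and the teacher signal acts as a mild regularizer.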

Evaluation Results

TREC Deep Learning

| Dataset | NDCG@10 | NDCG@20 | MAP   |
|---------|---------|---------|-------|
| DL19    | 74.50   | 70.23   | 45.67 |
| DL20    | 72.80   | 69.15   | 43.21 |

BEIR Benchmark

| Dataset    | NDCG@10 |
|------------|---------|
| MS MARCO   | 68.9    |
| NQ         | 52.3    |
| HotpotQA   | 61.8    |
| FiQA       | 47.2    |
| ArguAna    | 59.4    |
| SciFact    | 73.6    |
| TREC-COVID | 85.2    |
| NFCorpus   | 39.8    |

Efficiency

| Metric                    | Value        |
|---------------------------|--------------|
| Inference Time (100 docs) | 2.2s         |
| GPU Memory (inference)    | 18GB         |
| Throughput                | ~45 docs/sec |
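
The throughput figure follows from the 100-document timing, and the same arithmetic gives a rough estimate for other candidate-list sizes. This is a back-of-the-envelope sketch that assumes latency scales linearly with the number of documents, which the card does not state; `estimate_latency_seconds` is a hypothetical helper.

```python
DOCS_TIMED = 100        # candidate-list size used for the reported timing
SECONDS_TIMED = 2.2     # reported inference time for that list

# Documents per second implied by the reported timing
throughput = DOCS_TIMED / SECONDS_TIMED

def estimate_latency_seconds(num_docs, docs_per_second=throughput):
    """Rough reranking time for num_docs candidates, assuming linear scaling."""
    return num_docs / docs_per_second

print(f"throughput: {throughput:.1f} docs/sec")
print(f"estimated time for 1000 docs: {estimate_latency_seconds(1000):.1f} s")
```

Real latency also depends on batch size, sequence length, and hardware, so measured numbers may differ.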

Comparison with Other Models

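| Model         | Size | TREC DL19 | BEIR Avg | Inference (s) |
|---------------|------|-----------|----------|---------------|
| MonoT5-3B     | 3B   | 71.8      | 43.5     | 3.5           |
| DeAR-P-8B-RL  | 8B   | 74.5      | 45.2     | 2.2           |
| Teacher (13B) | 13B  | 73.8      | 44.8     | 5.8           |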

Model Architecture

Input: "query: [Q] [SEP] document: [D]"
    ↓
LLaMA-3.1-8B Backbone (decoder-only)
    ↓
Last Non-Padding Token Representation
    ↓
Linear Classification Head
    ↓
Relevance Score (scalar)
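
The last two stages of the pipeline, pooling one token representation and mapping it to a scalar, can be sketched in a few lines. The sketch assumes the pooled position is the last non-padding token, which is how Hugging Face's `LlamaForSequenceClassification` pools when a pad token is defined. The toy hidden states and head weights are made up, and the function names are hypothetical.

```python
def last_token_index(attention_mask):
    """Position of the final non-padding token (mask value 1)."""
    return max(i for i, m in enumerate(attention_mask) if m == 1)

def score_sequence(hidden_states, attention_mask, head_weight):
    """Apply a bias-free linear head to the pooled token's hidden vector."""
    pooled = hidden_states[last_token_index(attention_mask)]
    return sum(h * w for h, w in zip(pooled, head_weight))

# Toy example: 3 token positions, hidden size 2, last position is padding
hidden_states = [[0.1, 0.2], [0.5, -0.3], [9.0, 9.0]]
attention_mask = [1, 1, 0]
head_weight = [2.0, 1.0]

# Pools position 1 and ignores the padded position 2
print(score_sequence(hidden_states, attention_mask, head_weight))
```

Because the score comes from a single position, padding must be reflected correctly in the attention mask, which the tokenizer handles when called as in the Usage section.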

Limitations

  • Domain Adaptation: Trained primarily on MS MARCO; may require fine-tuning for specialized domains
  • Query Length: Optimized for queries up to 32 tokens
  • Document Length: Truncated to 196 tokens; longer documents lose information
  • Language: English only
  • Numerical Reasoning: Limited capability for queries requiring calculations
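
One possible workaround for the 196-token document limit is to split a long document into overlapping windows, score each window separately, and keep the highest score (often called MaxP aggregation). This is a swapped-in technique that the card neither describes nor evaluates, so treat it as a sketch. `split_into_windows`, `max_passage_score`, the 98-token stride, and the `score_fn` callback (for example, a wrapper around the reranker) are all hypothetical.

```python
def split_into_windows(tokens, window=196, stride=98):
    """Split a token list into overlapping windows of at most `window` tokens."""
    if len(tokens) <= window:
        return [tokens]
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + window])
        if start + window >= len(tokens):
            break  # this window already reaches the end of the document
        start += stride
    return windows

def max_passage_score(tokens, score_fn, window=196, stride=98):
    """Score every window with score_fn and return the maximum (MaxP)."""
    return max(score_fn(w) for w in split_into_windows(tokens, window, stride))

# Toy example: a 400-token "document" and a stand-in scoring function
tokens = list(range(400))
windows = split_into_windows(tokens)
print(len(windows), [len(w) for w in windows])
```

Each extra window costs one more forward pass, so this trades latency for coverage of the full document.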

Bias and Fairness

This model inherits biases present in:

  • Base LLaMA-3.1-8B model
  • MS MARCO training data
  • Teacher model annotations

Users should evaluate fairness for their specific use cases.

Ethical Considerations

  • Search Ranking: Can influence information access and visibility
  • Training Data: May contain biased or sensitive content
  • Misuse Potential: Should not be used for surveillance or discriminatory ranking

Related Models

DeAR Family:

Teacher:

Dataset:

Citation

@article{abdallah2025dear,
  title={DeAR: Dual-Stage Document Reranking with Reasoning Agents via LLM Distillation},
  author={Abdallah, Abdelrahman and Mozafari, Jamshid and Piryani, Bhawna and Jatowt, Adam},
  journal={arXiv preprint arXiv:2508.16998},
  year={2025}
}

License

MIT License

Contact
