Paper-Summarizer-Qwen3-14B
A fine-tuned Qwen3-14B model specialized for generating structured summaries of scientific research papers in standardized JSON format.
Model Description
This model is part of Project AELLA, developed in collaboration with LAION and Wynd Labs to democratize access to scientific knowledge by creating structured summaries of research papers at scale.
- Base Model: Qwen3-14B
- Training Data: 110,000 curated research papers
- Performance: 73.9% accuracy on QA evaluation, comparable to GPT-5 (74.6%)
- Cost Efficiency: 98% lower cost than closed-source alternatives
The model generates comprehensive structured summaries in JSON format. Each input is first classified as SCIENTIFIC_TEXT, PARTIAL_SCIENTIFIC_TEXT, or NON_SCIENTIFIC_TEXT, and key research elements are then extracted, including methodology, results, claims, and limitations.
The model supports papers up to 131K tokens.
Usage
Serving the Model
vllm serve inference-net/Paper-Summarizer-Qwen3-14B \
--port 8000 \
--host 0.0.0.0 \
--trust-remote-code \
--data-parallel-size 1 \
--tensor-parallel-size 1 \
--max-num-seqs 32 \
--max-model-len 131072 \
--max-num-batched-tokens 8192 \
--gpu-memory-utilization 0.90 \
--enable-prefix-caching \
--enable-chunked-prefill
Making Requests
import requests

# System prompt (required)
system_prompt = """[Insert the full system prompt from the prompt.txt file -
see the full prompt in the model repository]"""

# User prompt: the paper text to summarize
paper_text = """
Title: Your Paper Title
Authors: Author 1, Author 2
Abstract: ...
[Full paper content]
"""

# API request
response = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "inference-net/Paper-Summarizer-Qwen3-14B",
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": paper_text},
        ],
        "temperature": 0.2,
    },
    timeout=600,
)

result = response.json()
summary = result["choices"][0]["message"]["content"]
print(summary)
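The response content is a JSON object serialized as a string, so it is usually worth parsing and sanity-checking it before downstream use. A minimal sketch, assuming the structure described in the Output Format section below; the `parse_summary` helper is illustrative, not part of the model's API:

```python
import json

def parse_summary(raw: str) -> dict:
    """Parse the model's JSON output and do basic validation."""
    data = json.loads(raw)
    allowed = {"SCIENTIFIC_TEXT", "PARTIAL_SCIENTIFIC_TEXT", "NON_SCIENTIFIC_TEXT"}
    if data.get("article_classification") not in allowed:
        raise ValueError(f"Unexpected classification: {data.get('article_classification')!r}")
    return data

# Example with a minimal (truncated) response string
raw = '{"article_classification": "SCIENTIFIC_TEXT", "reason": null, "summary": {"title": "Example"}}'
parsed = parse_summary(raw)
print(parsed["article_classification"])  # SCIENTIFIC_TEXT
```

Wrapping `json.loads` this way also surfaces the rare case where the model emits malformed JSON, which is easier to retry than to debug downstream.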
System Prompt
The model requires a specific system prompt that defines the JSON schema and extraction instructions. The prompt instructs the model to:
- Classify the text as SCIENTIFIC_TEXT, PARTIAL_SCIENTIFIC_TEXT, or NON_SCIENTIFIC_TEXT
- Extract structured information including:
- Title, authors, publication year
- Research context and hypotheses
- Methodological details
- Key results with quantitative data
- Claims with supporting evidence
- Limitations and ethical considerations
The full system prompt is available in the model repository's prompt.txt file.
Output Format
The model outputs a single valid JSON object with this structure:
{
  "article_classification": "SCIENTIFIC_TEXT",
  "reason": null,
  "summary": {
    "title": "",
    "authors": "",
    "publication_year": null,
    "field_subfield": "",
    "executive_summary": "",
    "research_context": "",
    "methodological_details": "",
    "key_results": "",
    "claims": [...],
    "contradictions_and_limitations": "",
    ...
  }
}
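When processing papers in batches, outputs can be routed on `article_classification`. A small sketch using only fields shown above; note that treating `reason` as the explanation for a non-scientific classification is my inference from the schema, not something stated here:

```python
parsed = {
    "article_classification": "SCIENTIFIC_TEXT",
    "reason": None,
    "summary": {"title": "An Example Paper", "publication_year": 2024},
}

if parsed["article_classification"] == "NON_SCIENTIFIC_TEXT":
    # Assumption: `reason` is populated when no summary is produced
    print("Skipped:", parsed["reason"])
else:
    s = parsed["summary"]
    print(f'{s["title"]} ({s["publication_year"]})')  # An Example Paper (2024)
```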
Performance
LLM-as-a-Judge Evaluation
- Score: 4.207/5.0
- Comparison: Within 15% of GPT-5 (4.805/5.0)
QA Dataset Evaluation
- Accuracy: 73.9%
- Comparison: Ties with Gemini 2.5 Flash, nearly matches GPT-5 (74.6%)
Throughput
- Requests/sec: 0.43
- Input Tokens/sec: 7,516.54
- Output Tokens/sec: 2,588.30
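Taken together, these figures imply roughly 17.5K input tokens and 6K output tokens per request on average, which is consistent with long full-text papers. This is a simple derived check, not a separately measured number:

```python
req_per_sec = 0.43
in_tok_per_sec = 7516.54
out_tok_per_sec = 2588.30

# Average tokens per request, derived from the throughput figures above
print(round(in_tok_per_sec / req_per_sec))   # ~17480 input tokens per request
print(round(out_tok_per_sec / req_per_sec))  # ~6019 output tokens per request
```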
Training Details
- Training Set: 100,000 papers
- Validation Set: 10,000 papers
- Average Paper Length: 81,334 characters
- Training Approach: Post-training on summaries generated by frontier models (GPT-5, Claude 4.5 Sonnet, Gemini 2.5 Pro)
Limitations
- May generate subtle factual errors (hallucinations) for fine-grained details
- Context limit (131K tokens) may truncate extremely long documents
- Unified schema may not capture all domain-specific nuances
- Summaries are research aids, not replacements for primary sources in high-stakes scenarios
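Given the 131K-token context limit, it can help to estimate input length before sending a request. A rough character-based sketch; the ~4-characters-per-token ratio is a common rule of thumb for English text, not an exact tokenizer count, and the reserved-output figure is an assumption:

```python
MAX_MODEL_LEN = 131_072    # context window from the serving command above
CHARS_PER_TOKEN = 4        # rough heuristic; use the model's tokenizer for accuracy
RESERVED_OUTPUT = 8_192    # assumed headroom for the generated summary

def fits_in_context(paper_text: str) -> bool:
    """Cheap pre-check that a paper likely fits the context window."""
    est_tokens = len(paper_text) / CHARS_PER_TOKEN
    return est_tokens + RESERVED_OUTPUT <= MAX_MODEL_LEN

print(fits_in_context("word " * 10_000))  # True
```

For an exact count, tokenize with the model's own tokenizer instead; this heuristic is only a fast filter for obviously oversized inputs.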
Related Resources
- Paper Visualization Website: https://laion.inference.net
- Visualization Repository: https://github.com/context-labs/laion-data-explorer
- Alexandria Paper: https://arxiv.org/abs/2502.19413
- Nemotron Variant: inference-net/Paper-Summarizer-Nemotron-12B
License
[License information to be added]
Acknowledgments
This work was made possible through collaboration with:
- LAION
- Wynd Labs
- Inference.net
- Contributors to bethgelab, PeS2o, Common Pile, and OpenAlex