Paper-Summarizer-Qwen3-14B
A fine-tuned Qwen3-14B model specialized for generating structured summaries of scientific research papers in standardized JSON format.
Model Description
This model is part of Project AELLA, developed in collaboration with LAION and Wynd Labs to democratize access to scientific knowledge by creating structured summaries of research papers at scale.
- Base Model: Qwen3-14B
- Training Data: 110,000 curated research papers
- Performance: 73.9% accuracy on QA evaluation, comparable to GPT-5 (74.6%)
- Cost Efficiency: 98% lower cost than closed-source alternatives
The model generates comprehensive structured summaries in JSON format. Each input is first classified as SCIENTIFIC_TEXT, PARTIAL_SCIENTIFIC_TEXT, or NON_SCIENTIFIC_TEXT, and key research elements are then extracted, including methodology, results, claims, and limitations.
The model supports papers up to 131K tokens.
Usage
Serving the Model
vllm serve inference-net/Paper-Summarizer-Qwen3-14B \
--port 8000 \
--host 0.0.0.0 \
--trust-remote-code \
--data-parallel-size 1 \
--tensor-parallel-size 1 \
--max-num-seqs 32 \
--max-model-len 131072 \
--max-num-batched-tokens 8192 \
--gpu-memory-utilization 0.90 \
--enable-prefix-caching \
--enable-chunked-prefill
Making Requests
import requests

# System prompt (required)
system_prompt = """[Insert the full system prompt from the prompt.txt file -
see the full prompt in the model repository]"""

# User prompt: the paper text to summarize
paper_text = """
Title: Your Paper Title
Authors: Author 1, Author 2
Abstract: ...
[Full paper content]
"""

# API request
response = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "inference-net/Paper-Summarizer-Qwen3-14B",
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": paper_text},
        ],
        "temperature": 0.2,
    },
    timeout=600,
)

result = response.json()
summary = result["choices"][0]["message"]["content"]
print(summary)
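The response content is a JSON object serialized as a string, so it is usually worth parsing and sanity-checking it before downstream use. A minimal sketch, assuming the structure described in the Output Format section below; the `parse_summary` helper is illustrative, not part of the model's API:

```python
import json

def parse_summary(raw: str) -> dict:
    """Parse the model's JSON output and do basic validation."""
    data = json.loads(raw)
    allowed = {"SCIENTIFIC_TEXT", "PARTIAL_SCIENTIFIC_TEXT", "NON_SCIENTIFIC_TEXT"}
    if data.get("article_classification") not in allowed:
        raise ValueError(f"Unexpected classification: {data.get('article_classification')!r}")
    return data

# Example with a minimal (truncated) response string
raw = '{"article_classification": "SCIENTIFIC_TEXT", "reason": null, "summary": {"title": "Example"}}'
parsed = parse_summary(raw)
print(parsed["article_classification"])  # SCIENTIFIC_TEXT
```

Wrapping `json.loads` this way also surfaces the rare case where the model emits malformed JSON, which is easier to retry than to debug downstream.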
System Prompt
The model requires a specific system prompt that defines the JSON schema and extraction instructions. The prompt instructs the model to:
- Classify the text as SCIENTIFIC_TEXT, PARTIAL_SCIENTIFIC_TEXT, or NON_SCIENTIFIC_TEXT
- Extract structured information including:
- Title, authors, publication year
- Research context and hypotheses
- Methodological details
- Key results with quantitative data
- Claims with supporting evidence
- Limitations and ethical considerations
The full system prompt is available in the model repository's prompt.txt file.
Output Format
The model outputs a single valid JSON object with this structure:
{
  "article_classification": "SCIENTIFIC_TEXT",
  "reason": null,
  "summary": {
    "title": "",
    "authors": "",
    "publication_year": null,
    "field_subfield": "",
    "executive_summary": "",
    "research_context": "",
    "methodological_details": "",
    "key_results": "",
    "claims": [...],
    "contradictions_and_limitations": "",
    ...
  }
}
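When processing papers in batches, outputs can be routed on `article_classification`. A small sketch using only fields shown above; note that treating `reason` as the explanation for a non-scientific classification is my inference from the schema, not something stated here:

```python
parsed = {
    "article_classification": "SCIENTIFIC_TEXT",
    "reason": None,
    "summary": {"title": "An Example Paper", "publication_year": 2024},
}

if parsed["article_classification"] == "NON_SCIENTIFIC_TEXT":
    # Assumption: `reason` is populated when no summary is produced
    print("Skipped:", parsed["reason"])
else:
    s = parsed["summary"]
    print(f'{s["title"]} ({s["publication_year"]})')  # An Example Paper (2024)
```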
Performance
LLM-as-a-Judge Evaluation
- Score: 4.207/5.0
- Comparison: Within 15% of GPT-5 (4.805/5.0)
QA Dataset Evaluation
- Accuracy: 73.9%
- Comparison: Ties with Gemini 2.5 Flash, nearly matches GPT-5 (74.6%)
Throughput
- Requests/sec: 0.43
- Input Tokens/sec: 7,516.54
- Output Tokens/sec: 2,588.30
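Taken together, these figures imply roughly 17.5K input tokens and 6K output tokens per request on average, which is consistent with long full-text papers. This is a simple derived check, not a separately measured number:

```python
req_per_sec = 0.43
in_tok_per_sec = 7516.54
out_tok_per_sec = 2588.30

# Average tokens per request, derived from the throughput figures above
print(round(in_tok_per_sec / req_per_sec))   # ~17480 input tokens per request
print(round(out_tok_per_sec / req_per_sec))  # ~6019 output tokens per request
```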
Training Details
- Training Set: 100,000 papers
- Validation Set: 10,000 papers
- Average Paper Length: 81,334 characters
- Training Approach: Post-training on summaries generated by frontier models (GPT-5, Claude 4.5 Sonnet, Gemini 2.5 Pro)
Limitations
- May generate subtle factual errors (hallucinations) for fine-grained details
- Context limit (131K tokens) may truncate extremely long documents
- Unified schema may not capture all domain-specific nuances
- Summaries are research aids, not replacements for primary sources in high-stakes scenarios
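Given the 131K-token context limit, it can help to estimate input length before sending a request. A rough character-based sketch; the ~4-characters-per-token ratio is a common rule of thumb for English text, not an exact tokenizer count, and the reserved-output figure is an assumption:

```python
MAX_MODEL_LEN = 131_072    # context window from the serving command above
CHARS_PER_TOKEN = 4        # rough heuristic; use the model's tokenizer for accuracy
RESERVED_OUTPUT = 8_192    # assumed headroom for the generated summary

def fits_in_context(paper_text: str) -> bool:
    """Cheap pre-check that a paper likely fits the context window."""
    est_tokens = len(paper_text) / CHARS_PER_TOKEN
    return est_tokens + RESERVED_OUTPUT <= MAX_MODEL_LEN

print(fits_in_context("word " * 10_000))  # True
```

For an exact count, tokenize with the model's own tokenizer instead; this heuristic is only a fast filter for obviously oversized inputs.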
Related Resources
- Paper Visualization Website: https://laion.inference.net
- Visualization Repository: https://github.com/context-labs/laion-data-explorer
- Alexandria Paper: https://arxiv.org/abs/2502.19413
- Nemotron Variant: inference-net/Paper-Summarizer-Nemotron-12B
License
[License information to be added]
Acknowledgments
This work was made possible through collaboration with:
- LAION
- Wynd Labs
- Inference.net
- Contributors to bethgelab, PeS2o, Common Pile, and OpenAlex