Paper-Summarizer-Qwen3-14B

A fine-tuned Qwen3-14B model specialized for generating structured summaries of scientific research papers in standardized JSON format.

Model Description

This model is part of Project AELLA, developed in collaboration with LAION and Wynd Labs to democratize access to scientific knowledge by creating structured summaries of research papers at scale.

  • Base Model: Qwen3-14B
  • Training Data: 110,000 curated research papers
  • Performance: Achieves 73.9% accuracy on QA evaluation, comparable to GPT-5 (74.6%)
  • Cost Efficiency: 98% lower cost than closed-source alternatives

The model generates comprehensive structured summaries in a standardized JSON format. Each paper is classified as SCIENTIFIC_TEXT, PARTIAL_SCIENTIFIC_TEXT, or NON_SCIENTIFIC_TEXT, and key research elements such as methodology, results, claims, and limitations are extracted into dedicated fields.

The model supports papers up to 131K tokens.

Usage

Serving the Model

vllm serve inference-net/Paper-Summarizer-Qwen3-14B \
--port 8000 \
--host 0.0.0.0 \
--trust-remote-code \
--data-parallel-size 1 \
--tensor-parallel-size 1 \
--max-num-seqs 32 \
--max-model-len 131072 \
--max-num-batched-tokens 8192 \
--gpu-memory-utilization 0.90 \
--enable-prefix-caching \
--enable-chunked-prefill
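
Once the server is running, a quick way to confirm it is serving the model is to query the OpenAI-compatible /v1/models endpoint exposed by vLLM. A minimal sketch using requests:

import requests

# List the models registered with the vLLM OpenAI-compatible server.
resp = requests.get("http://localhost:8000/v1/models", timeout=10)
resp.raise_for_status()
print([m["id"] for m in resp.json()["data"]])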

Making Requests

import requests

# System prompt (required)
system_prompt = """[Insert the full system prompt from the prompt.txt file -
see the full prompt in the model repository]"""

# User prompt: the paper text to summarize
paper_text = """
Title: Your Paper Title
Authors: Author 1, Author 2
Abstract: ...
[Full paper content]
"""

# API request
response = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "inference-net/Paper-Summarizer-Qwen3-14B",
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": paper_text},
        ],
        "temperature": 0.2,
    },
    timeout=600,
)

result = response.json()
summary = result["choices"][0]["message"]["content"]
print(summary)
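
The returned content should be a single JSON object. A minimal sketch of parsing it and branching on the classification (this assumes the model emits bare JSON with no surrounding text; if parsing fails, inspect the raw string):

import json

record = json.loads(summary)  # raises json.JSONDecodeError if the output is not bare JSON

if record["article_classification"] == "SCIENTIFIC_TEXT":
    print(record["summary"]["title"])
    print(record["summary"]["executive_summary"])
else:
    # For the other classifications, "reason" indicates why no full summary was produced
    # (assumption based on the output format shown below).
    print(record["article_classification"], record.get("reason"))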

System Prompt

The model requires a specific system prompt that defines the JSON schema and extraction instructions. The prompt instructs the model to:

  1. Classify the text as SCIENTIFIC_TEXT, PARTIAL_SCIENTIFIC_TEXT, or NON_SCIENTIFIC_TEXT
  2. Extract structured information including:
  • Title, authors, publication year
  • Research context and hypotheses
  • Methodological details
  • Key results with quantitative data
  • Claims with supporting evidence
  • Limitations and ethical considerations

The full system prompt is available in the model repository's prompt.txt file.
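
A sketch for loading the prompt programmatically, assuming prompt.txt sits at the repository root and huggingface_hub is installed:

from huggingface_hub import hf_hub_download

# Fetch prompt.txt from the model repository (assumed to live at the repo root).
prompt_path = hf_hub_download(
    repo_id="inference-net/Paper-Summarizer-Qwen3-14B",
    filename="prompt.txt",
)
with open(prompt_path, encoding="utf-8") as f:
    system_prompt = f.read()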

Output Format

The model outputs a single valid JSON object with this structure:

{
  "article_classification": "SCIENTIFIC_TEXT",
  "reason": null,
  "summary": {
    "title": "",
    "authors": "",
    "publication_year": null,
    "field_subfield": "",
    "executive_summary": "",
    "research_context": "",
    "methodological_details": "",
    "key_results": "",
    "claims": [...],
    "contradictions_and_limitations": "",
    ...
  }
}
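
When storing summaries in a downstream pipeline, a lightweight key check against the structure above can catch malformed outputs early. A sketch (field names are taken from the example above; the full schema lives in prompt.txt, and the handling of non-scientific records is an assumption):

REQUIRED_SUMMARY_FIELDS = {
    "title", "authors", "publication_year", "field_subfield",
    "executive_summary", "research_context", "methodological_details",
    "key_results", "claims", "contradictions_and_limitations",
}

def is_well_formed(record: dict) -> bool:
    """Check that a parsed record contains the fields shown in the example above."""
    if "article_classification" not in record or "summary" not in record:
        return False
    summary = record["summary"]
    if summary is None:
        # Assumption: non-scientific records may omit the summary object entirely.
        return record["article_classification"] != "SCIENTIFIC_TEXT"
    return REQUIRED_SUMMARY_FIELDS.issubset(summary)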

Performance

LLM-as-a-Judge Evaluation

  • Score: 4.207/5.0
  • Comparison: Within 15% of GPT-5 (4.805/5.0)

QA Dataset Evaluation

  • Accuracy: 73.9%
  • Comparison: Ties with Gemini 2.5 Flash, nearly matches GPT-5 (74.6%)

Throughput

  • Requests/sec: 0.43
  • Input Tokens/sec: 7,516.54
  • Output Tokens/sec: 2,588.30

Training Details

  • Training Set: 100,000 papers
  • Validation Set: 10,000 papers
  • Average Paper Length: 81,334 characters
  • Training Approach: Post-training on summaries generated by frontier models (GPT-5, Claude 4.5 Sonnet, Gemini 2.5 Pro)

Limitations

  • May generate subtle factual errors (hallucinations) for fine-grained details
  • Context limit (131K tokens) may truncate extremely long documents; see the token-count sketch after this list
  • Unified schema may not capture all domain-specific nuances
  • Summaries are research aids, not replacements for primary sources in high-stakes scenarios
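
To avoid silent truncation, you can count tokens before sending a paper. A sketch using the Hugging Face tokenizer (assumes tokenizer files ship with the model repository, as is typical for Qwen3 fine-tunes):

from transformers import AutoTokenizer

MAX_CONTEXT = 131_072  # model context limit in tokens

paper_text = "..."  # full paper text, as in the Usage example above

tokenizer = AutoTokenizer.from_pretrained("inference-net/Paper-Summarizer-Qwen3-14B")
n_tokens = len(tokenizer.encode(paper_text))
# Note: the system prompt and generated output also consume context, so leave headroom.
if n_tokens > MAX_CONTEXT:
    print(f"Paper is {n_tokens} tokens; content beyond {MAX_CONTEXT} tokens will be dropped.")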

Related Resources

License

[License information to be added]

Acknowledgments

This work was made possible through collaboration with:

  • LAION
  • Wynd Labs
  • Inference.net
  • Contributors to bethgelab, PeS2o, Common Pile, and OpenAlex