# Model Card for quinex-context-v0-77M
quinex-context-v0-77M is based on FLAN-T5 small, a pre-trained and instruction-finetuned encoder-decoder transformer model. We further fine-tuned this model to extract the measurement context of quantities in text (i.e., the entity and property being measured, as well as qualifiers). For more details, please refer to our paper "Quinex: Quantitative Information Extraction from Text using Open and Lightweight LLMs" (to appear).
## Usage
This model is intended for extracting the measurement context of quantities in text using multi-turn question-answering. To first identify the quantities, you can use our quantity identification models.
This model assumes the use of specific question templates, and that already extracted information is highlighted in the context. Inputs follow the format `question: {question} context: {highlighted_text}`. First, the measured property is extracted, then the entity, and then, in no particular order, further context such as the temporal scope, spatial scope, references, determination method, and other qualifiers. The quantity, property, and entity spans are highlighted in the context using `$...$`, `**...**`, and `[[...]]`, respectively.
### Example
Let's say we want to extract the property that is measured by the quantity "6 MW" in the sentence "The nominal power and rotor diameter of the Enercon E-175 EP5 are 6 MW and 175 m, respectively". The procedure would be as follows:
- First, the measured property is extracted using the following input:

  `question: Which property or quality is characterized by 6 MW? context: The nominal power and rotor diameter of the Enercon E-175 EP5 are $6 MW$ and 175 m, respectively.`

- Given the answer "nominal power", the entity is extracted using the following input:

  `question: Which entity's nominal power is characterized by 6 MW? context: The **nominal power** and rotor diameter of the Enercon E-175 EP5 are $6 MW$ and 175 m, respectively.`

  In case the property could not be extracted, a fallback question is used instead.

- Given the answer "Enercon E-175 EP5", further context can be extracted, e.g., the temporal scope using the following input:

  `question: For which point in time is the statement true that nominal power of Enercon E-175 EP5 is 6 MW? context: The **nominal power** and rotor diameter of the [[Enercon E-175 EP5]] are $6 MW$ and 175 m, respectively.`

  For each of the context types (temporal scope, spatial scope, references, determination method, and other qualifiers), there is also a fallback question in case either the property or entity could not be extracted.
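The turn-by-turn input construction above can be sketched in Python. The helper names below are our own for illustration, not from the Quinex codebase; only the resulting strings matter, and each one is fed to the model as a regular seq2seq input (e.g., via `transformers`).

```python
# Build the per-turn inputs for the example above. Helper names are
# illustrative (not from the Quinex codebase).

def highlight(context, span, left, right):
    """Wrap the first occurrence of `span` in the given highlight markers."""
    return context.replace(span, f"{left}{span}{right}", 1)

def format_input(question, context):
    """Serialize one turn in the expected 'question: ... context: ...' format."""
    return f"question: {question} context: {context}"

text = ("The nominal power and rotor diameter of the Enercon E-175 EP5 "
        "are 6 MW and 175 m, respectively.")

# Turn 1: highlight the quantity with $...$ and ask for the property.
ctx = highlight(text, "6 MW", "$", "$")
turn1 = format_input("Which property or quality is characterized by 6 MW?", ctx)

# Turn 2: given the answer "nominal power", highlight it with **...**.
ctx = highlight(ctx, "nominal power", "**", "**")
turn2 = format_input("Which entity's nominal power is characterized by 6 MW?", ctx)

# Turn 3: given the answer "Enercon E-175 EP5", highlight it with [[...]].
ctx = highlight(ctx, "Enercon E-175 EP5", "[[", "]]")
turn3 = format_input(
    "For which point in time is the statement true that "
    "nominal power of Enercon E-175 EP5 is 6 MW?", ctx)
```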
Full list of question templates:
- Property:
  - Default: `Which property or quality is characterized by {quantity_span}?`
- Entity:
  - Default: `Which entity's {property_span} {is_or_are} characterized by {quantity_span}?`
  - Fallback: `Which entity does {quantity_span} characterize?`
- Qualifiers:
  - Temporal scope:
    - Default: `For which point in time is the statement true that {property_span} of {entity_span} {is_or_are} {quantity_span}?`
    - Fallback: `For which point in time is the statement true that {entity_or_property_span} {is_or_are} {quantity_span}?`
  - Spatial scope:
    - Default: `For which location is the statement true that {property_span} of {entity_span} {is_or_are} {quantity_span}?`
    - Fallback: `For which location is the statement true that {entity_or_property_span} {is_or_are} {quantity_span}?`
  - Reference:
    - Default: `According to whom or which reference is the statement true that {property_span} of {entity_span} {is_or_are} {quantity_span}?`
    - Fallback: `According to whom or which reference is the statement true that {entity_or_property_span} {is_or_are} {quantity_span}?`
  - Method:
    - Default: `What methods and instruments were used in determining that {property_span} of {entity_span} {is_or_are} {quantity_span}?`
    - Fallback: `What methods and instruments were used in determining that {entity_or_property_span} {is_or_are} {quantity_span}?`
  - Other qualifiers:
    - Default: `Under which constraints is the statement true that {property_span} of {entity_span} {is_or_are} {quantity_span}?`
    - Fallback: `Under which constraints is the statement true that {entity_or_property_span} {is_or_are} {quantity_span}?`
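Choosing between a default and a fallback template can be sketched as follows. The dictionary, function name, and the number-agreement heuristic for `{is_or_are}` are our own assumptions for illustration (the card does not specify how `{is_or_are}` is resolved), not the Quinex implementation; only the temporal-scope templates are shown.

```python
# Illustrative template selection/filling for the temporal-scope qualifier.
# TEMPLATES keys, build_question, and the number-agreement heuristic are
# assumptions, not taken from the Quinex codebase.
TEMPLATES = {
    ("temporal_scope", "default"): (
        "For which point in time is the statement true that "
        "{property_span} of {entity_span} {is_or_are} {quantity_span}?"
    ),
    ("temporal_scope", "fallback"): (
        "For which point in time is the statement true that "
        "{entity_or_property_span} {is_or_are} {quantity_span}?"
    ),
}

def build_question(qualifier, quantity_span, property_span=None, entity_span=None):
    """Use the default template when both spans were extracted, else fall back."""
    subject = property_span or entity_span
    # Crude singular/plural agreement (assumption; the card leaves this open).
    is_or_are = "are" if subject is not None and subject.endswith("s") else "is"
    if property_span and entity_span:
        return TEMPLATES[(qualifier, "default")].format(
            property_span=property_span, entity_span=entity_span,
            is_or_are=is_or_are, quantity_span=quantity_span)
    # Fallback: use whichever of the two spans is available.
    return TEMPLATES[(qualifier, "fallback")].format(
        entity_or_property_span=subject,
        is_or_are=is_or_are, quantity_span=quantity_span)
```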
## Model details
- Base Model: FLAN-T5 small (`google/flan-t5-small`)
- Tokenizer: T5 tokenizer
- Parameters: 77M
## Fine-tuning data
The model was fine-tuned sequentially on all non-curated "silver" examples and all curated "gold" examples from a combination of datasets for measurement context extraction that includes:
- Wiki-Measurements
- MeasEval (relabeled)
- Materials Science Procedural Text Corpus (relabeled)
- MuLMS (relabeled)
- orkg-R0 (relabeled)
- PolyIE (relabeled)
- PHEE (relabeled)
- SuperMat (subset, relabeled)
- Quinex-Hydrogen-TechData
## Evaluation results
Evaluation results on the test set as described in the paper:
| Concept | Aggregation | F1 (SQuAD overlap) |
|---|---|---|
| Overall | Micro average | 83.32 |
| Entity and property | Macro average | 80.13 |
| Measured Property | | 81.68 |
| Measured Entity | | 78.58 |
| Qualifiers | Macro average | 85.41 |
| Spatial Scope | | 97.40 |
| Temporal Scope | | 92.52 |
| Method | | 86.65 |
| Reference | | 86.41 |
| Other Qualifiers | | 64.08 |
Note that the scores for the qualifier questions are higher because abstaining is often the correct answer for them.
## Citation
If you use this model in your research, please cite the following paper:
```bibtex
@article{quinex2025,
  title  = {{Quinex: Quantitative Information Extraction from Text using Open and Lightweight LLMs}},
  author = {Göpfert, Jan and Kuckertz, Patrick and Müller, Gian and Lütz, Luna and Körner, Celine and Khuat, Hang and Stolten, Detlef and Weinand, Jann M.},
  month  = oct,
  year   = {2025},
}
```
## Framework versions
- Transformers 4.36.2
- Pytorch 2.1.2
- Datasets 2.16.1
- Tokenizers 0.15.0