# Model Card for quinex-statement-clf-v0-125M
quinex-statement-clf-v0-125M is a RoBERTa-base model fine-tuned to classify quantitative statements into different types. For more details, please refer to our paper "Quinex: Quantitative Information Extraction from Text using Open and Lightweight LLMs" (published soon).
Note that this model is only a placeholder. It was trained on only 247 examples, mainly hydrogen-related text, so it is not expected to perform or generalize well. It reaches an F1 score of 68.6 on the dev (not test) set. We recommend using this model with caution and not relying on its predictions.
## Example input and output
The input is a text containing the quantitative statement of interest, with the quantity enclosed in 🍏...🍏.
For example: "In order to mitigate the consequences to overshoot the 🍏1.5 °C🍏 global warming level..."
To identify the quantities and extract further context, you can use our quantity identification and measurement context extraction models.
The output is a list of labels with the respective scores. The labels can be grouped into three categories:
- Type: What kind of statement is it?
  `assumption`, `feasibility_estimation`, `goal`, `observation`, `prediction`, `requirement`, `specification`
- Rationale: How was the value obtained?
  `arbitrary`, `company_reported`, `experiments`, `expert_elicitation`, `individual_literature_sources`, `literature_review`, `regression`, `simulation_or_calculation`, `rough_estimate_or_analogy`
- System: What kind of system is it about?
  `real_world`, `lab_or_prototype_or_pilot_system`, `model`
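Assuming the model is published on the Hugging Face Hub under the id shown in this card and uses a standard sequence-classification head, inference can be sketched with the `transformers` pipeline API. The helper and function names below are illustrative, not part of a released codebase:

```python
def mark_quantity(text: str, start: int, end: int) -> str:
    """Wrap the quantity span [start, end) in the 🍏...🍏 markers the model expects."""
    return text[:start] + "🍏" + text[start:end] + "🍏" + text[end:]


def classify_statement(marked_text: str):
    """Score every label for one marked statement.

    Requires `transformers` (and network access to the Hub on first run);
    imported lazily so the marker helper above works without it installed.
    """
    from transformers import pipeline

    clf = pipeline(
        "text-classification",
        model="JuelichSystemsAnalysis/quinex-statement-clf-v0-125M",
        top_k=None,  # return scores for all labels, not just the top one
    )
    return clf(marked_text)


text = "In order to mitigate the consequences to overshoot the 1.5 °C global warming level..."
q = text.index("1.5 °C")
marked = mark_quantity(text, q, q + len("1.5 °C"))
# classify_statement(marked) returns a list of {"label": ..., "score": ...} dicts,
# which can then be grouped into the Type / Rationale / System categories above.
```

With `top_k=None`, the pipeline returns one score per label, so the caller can pick the highest-scoring label within each of the three categories separately.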
## Model details
- Base Model: RoBERTa-base (FacebookAI/roberta-base)
- Tokenizer: RoBERTa
- Parameters: 125M
### Framework versions
- Transformers 4.36.2
- Pytorch 2.1.2
- Tokenizers 0.15.0