---
license: mit
language:
- en
base_model:
- FacebookAI/roberta-base
pipeline_tag: text-classification
library_name: transformers
tags:
- quantitative information extraction
- measurement extraction
- quantitative claim
- quantitative statement
- quantitative data
- numeric information
---

# Model Card for quinex-statement-clf-v0-125M

`quinex-statement-clf-v0-125M` is based on [RoBERTa-base](https://huggingface.co/FacebookAI/roberta-base), which we further fine-tuned to classify quantitative statements into different types. For more details, please refer to our paper *"Quinex: Quantitative Information Extraction from Text using Open and Lightweight LLMs"* (to be published soon).

> [!CAUTION]
> **Note that this model is only a placeholder.** It was trained on only 247 examples, mainly hydrogen-related text, so it is not expected to perform or generalize well. It reaches an F1 score of 68.6 on the dev (not test) set. We recommend using this model with caution and not relying on its predictions.

## Example input and output

The input is a text containing the quantitative statement of interest, with the quantity enclosed in 🍏...🍏. For example:

*"In order to mitigate the consequences to overshoot the 🍏1.5 °C🍏 global warming level..."*

To identify the quantities and extract further context, you can use our [quantity identification](https://huggingface.co/JuelichSystemsAnalysis/quinex-quantity-v0-124M) and [measurement context extraction](https://huggingface.co/JuelichSystemsAnalysis/quinex-context-v0-783M) models.

The output is a list of labels with their respective scores. The labels can be grouped into three categories:

* Type: What kind of statement is it?
  - `assumption`
  - `feasibility_estimation`
  - `goal`
  - `observation`
  - `prediction`
  - `requirement`
  - `specification`
* Rationale: How was the value obtained?
  - `arbitrary`
  - `company_reported`
  - `experiments`
  - `expert_elicitation`
  - `individual_literature_sources`
  - `literature_review`
  - `regression`
  - `simulation_or_calculation`
  - `rough_estimate_or_analogy`
* System: What kind of system is it about?
  - `real_world`
  - `lab_or_prototype_or_pilot_system`
  - `model`

## Model details

- **Base Model**: [RoBERTa-base](https://huggingface.co/FacebookAI/roberta-base)
- **Tokenizer**: RoBERTa
- **Parameters**: 125M

### Framework versions

- Transformers 4.36.2
- Pytorch 2.1.2
- Tokenizers 0.15.0
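## Example usage

A minimal usage sketch with 🤗 Transformers. The repository ID below is an assumption inferred from this card's title and the sibling Quinex models linked above; adjust it if the model is hosted elsewhere. Remember to enclose the quantity of interest in 🍏...🍏 markers before classification.

```python
from transformers import pipeline

# Assumed repo ID (inferred from this model card's title); adjust if needed.
MODEL_ID = "JuelichSystemsAnalysis/quinex-statement-clf-v0-125M"


def classify_statement(text: str):
    """Return scores for all statement labels of a quantitative statement.

    The quantity of interest must be enclosed in 🍏...🍏 markers.
    """
    clf = pipeline("text-classification", model=MODEL_ID, top_k=None)
    # top_k=None returns the score for every label, not just the best one.
    return clf(text)[0]


example = (
    "In order to mitigate the consequences to overshoot the "
    "🍏1.5 °C🍏 global warming level..."
)
# for pred in classify_statement(example):
#     print(pred["label"], round(pred["score"], 3))
```

Each prediction is a dict with `label` and `score` keys; given the placeholder nature of this model, the scores should be treated as indicative only.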