---
base_model: meta-llama/Llama-3.1-8B-Instruct
library_name: transformers
license: apache-2.0
datasets:
- iedeveci/WillowNLtoFOL
- yuan-yang/MALLS-v0
language:
- en
pipeline_tag: translation
tags:
- NLTOFOL
- NL
- FOL
---

# Model Card for fvossel/Llama-3.1-8B-Instruct-nl-to-fol
This model contains **LoRA adapter weights** for the base model [`meta-llama/Llama-3.1-8B-Instruct`](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct). It was trained to translate **natural language statements into First-Order Logic (FOL)** representations.



## Model Details

### Model Description




- **Developed by:** Vossel et al. at Osnabrück University
- **Funded by:** Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), project number 456666331  
- **Model type:** Decoder-only causal language model (LLaMA architecture)
- **Language(s) (NLP):** English, FOL
- **License:** This repository contains **only LoRA adapter weights**, trained using the base model [`meta-llama/Llama-3.1-8B-Instruct`](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct), which is released under the [Llama 3.1 Community License](https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/LICENSE).
  These adapter weights are released under the **Apache 2.0 License**.
- **Finetuned from model:** meta-llama/Llama-3.1-8B-Instruct


## Uses


### Direct Use

This model is designed to translate natural language (NL) sentences into corresponding first-order logic (FOL) expressions. It can be used directly for:

- Automated semantic parsing and formalization of NL statements into symbolic logic.
- Supporting explainable AI systems that require symbolic reasoning based on language input.
- Research in neurosymbolic AI, logic-based natural language understanding, and formal verification.
- Integrating into larger pipelines for natural language inference, question answering, or knowledge base population.

Users should be aware that the model outputs symbolic formulas which may require further validation or post-processing depending on the application.


### Downstream Use

This LoRA adapter can be further fine-tuned or combined with other models for specialized domains, e.g., legal, biomedical, or scientific text formalization. It is also suitable for integration into interactive systems that require formal reasoning capabilities.
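
For further fine-tuning, the adapter can be loaded in trainable mode on top of the base model. The snippet below is a minimal sketch, not the original training code; the surrounding trainer and dataset wiring are left out.

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the base model and attach the published adapter with is_trainable=True,
# so the LoRA weights can receive gradients during further domain adaptation.
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct", device_map="auto"
)
model = PeftModel.from_pretrained(
    base,
    "fvossel/Llama-3.1-8B-Instruct-nl-to-fol",
    is_trainable=True,  # keep adapter weights trainable instead of frozen
)
model.print_trainable_parameters()  # only the LoRA parameters are trainable
```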

### Out-of-Scope Use

- This model is not intended for general-purpose natural language generation.
- It may not perform well on ambiguous, highly figurative, or out-of-domain sentences.
- Users should not rely on the model output as final legal, medical, or safety-critical logic without expert review.

### Recommendations

- The model may produce logically incorrect or syntactically invalid formulas, especially on complex or ambiguous input. Users should validate outputs before use in critical applications.
- As the training data contains inherent biases from the source datasets and GPT-4 generation, users should be cautious about potential biases reflected in the model’s predictions.
- The model is specialized for English natural language and first-order logic; it may not generalize well to other languages or more expressive logics.
- Users applying this model in downstream tasks should carefully evaluate its performance and consider human-in-the-loop validation to ensure correctness.
- This model is intended for research and prototyping purposes and should not be used as a standalone solution in high-stakes decision making without expert oversight.


## How to Get Started with the Model

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model repository identifiers
base_model_name = "meta-llama/Llama-3.1-8B-Instruct"
lora_weights = "fvossel/Llama-3.1-8B-Instruct-nl-to-fol"

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_name, trust_remote_code=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "left"

# Load the base model
model = AutoModelForCausalLM.from_pretrained(base_model_name, trust_remote_code=True, device_map="auto")

# Load the LoRA adapter weights
model = PeftModel.from_pretrained(model, lora_weights, device_map="auto")

# Define formatting function for preparing the input prompt
def formatting_func(text):
    return tokenizer.apply_chat_template(
        [
            {
                "role": "system",
                "content": (
                    "You are a helpful AI assistant that translates Natural Language (NL) text "
                    "into First-Order Logic (FOL) using only the given quantors and junctors: "
                    "∀ (for all), ∃ (there exists), ¬ (not), ∧ (and), ∨ (or), → (implies), "
                    "↔ (if and only if), ⊕ (xor). "
                    "Start your answer with '𝜙=' followed by the FOL-formula. Do not include any other text."
                ),
            },
            {"role": "user", "content": text},
        ],
        tokenize=False,
        add_generation_prompt=False,
    )

# Example NL input
input_text = "All dogs are animals."

# Use formatting_func to create the model input
prompt = formatting_func(input_text)

# Tokenize the prompt and move it to the model's device
inputs = tokenizer(prompt, return_tensors="pt", padding=True).to(model.device)

# Generate output
outputs = model.generate(**inputs, max_new_tokens=100)

# Decode and print the result
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

```
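
The decoded output contains the full chat prompt followed by the model's answer, which is instructed to start with '𝜙='. As noted in the recommendations above, outputs should be validated before downstream use; the helper below is an illustrative sketch (not part of this repository) that extracts the formula after the last '𝜙=' marker, checks parenthesis balance, and rejects characters outside the symbol set named in the system prompt.

```python
import re

ALLOWED_CONNECTIVES = "∀∃¬∧∨→↔⊕"

def extract_fol(decoded_output: str) -> str | None:
    """Illustrative post-processing: pull out the FOL formula and run
    basic sanity checks. Returns None if the output looks malformed."""
    # The decoded text also contains the prompt, which itself mentions '𝜙=',
    # so take the text after the *last* occurrence of the marker.
    if "𝜙=" not in decoded_output:
        return None
    formula = decoded_output.rsplit("𝜙=", 1)[1].strip()

    # Check that parentheses are balanced.
    depth = 0
    for ch in formula:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return None
    if depth != 0:
        return None

    # Allow only predicate/variable names, commas, parentheses, whitespace,
    # and the connectives listed in the system prompt.
    if not re.fullmatch(rf"[\w\s(),{ALLOWED_CONNECTIVES}]+", formula):
        return None
    return formula
```

Checks like this only catch obviously malformed outputs; they do not guarantee that the formula is semantically faithful to the input sentence, so human review remains advisable for critical applications.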


## Training Details

### Training Data

The model was fine-tuned on two datasets:

1. **WillowNLtoFOL**: A dataset containing over 16,000 paired natural language (NL) sentences and first-order logic (FOL) expressions. Originally published as part of the following Master's thesis:

   Deveci, İ. E. (2024). Transformer models for translating natural language sentences into formal logical expressions [M.S. - Master of Science]. Middle East Technical University.  
   Available at: https://open.metu.edu.tr/handle/11511/109445

   Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0).

2. **MALLS-v0**: A dataset with 34,000 pairs of NL statements and corresponding FOL rules, generated using GPT-4. The FOL rules were checked for syntactic correctness but not for semantic alignment with the NL statements. More details: https://huggingface.co/datasets/yuan-yang/MALLS-v0

   Licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). As the data was collected using GPT-4, usage is also subject to OpenAI’s terms of use: https://openai.com/policies/terms-of-use

### Training Procedure

The model was fine-tuned with LoRA adapters on top of the pre-trained [`meta-llama/Llama-3.1-8B-Instruct`](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) model, using prompt-based instruction tuning.

#### Training Hyperparameters

- **Training regime:** bf16 mixed precision  
- **Batch size:** 8 (per device)  
- **Learning rate:** 1e-5  
- **Number of epochs:** 12  
- **Optimizer:** AdamW  
- **Scheduler:** Cosine learning rate scheduler  
- **Warmup ratio:** 0.05  
- **Gradient accumulation steps:** 2  
- **Weight decay:** 0.01  
- **LoRA rank (r):** 16  
- **LoRA alpha:** 32  
- **LoRA dropout:** 0.05  
- **Target modules:** ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]  
- **Bias:** none  
- **Task type:** CAUSAL_LM  
- **Early stopping patience:** 4 epochs
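
The exact training scripts are not included in this repository, but the hyperparameters above map directly onto a standard `peft`/`transformers` setup. The following is a hedged sketch of such a configuration; the output path and the trainer/dataset wiring are illustrative only.

```python
from peft import LoraConfig
from transformers import EarlyStoppingCallback, TrainingArguments

# LoRA configuration mirroring the values listed above.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)

# Training arguments mirroring the values listed above.
training_args = TrainingArguments(
    output_dir="nl-to-fol-lora",   # illustrative output path
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,
    learning_rate=1e-5,
    num_train_epochs=12,
    optim="adamw_torch",
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    weight_decay=0.01,
    bf16=True,
    eval_strategy="epoch",         # per-epoch evaluation enables early stopping
    save_strategy="epoch",
    load_best_model_at_end=True,
)

# Early stopping after 4 epochs without improvement.
callbacks = [EarlyStoppingCallback(early_stopping_patience=4)]
```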