|
|
--- |
|
|
base_model: meta-llama/Llama-3.1-8B-Instruct |
|
|
library_name: transformers |
|
|
license: apache-2.0 |
|
|
datasets: |
|
|
- iedeveci/WillowNLtoFOL |
|
|
- yuan-yang/MALLS-v0 |
|
|
language: |
|
|
- en |
|
|
pipeline_tag: translation |
|
|
tags: |
|
|
- NLTOFOL |
|
|
- NL |
|
|
- FOL |
|
|
--- |
|
|
|
|
|
# Model Card for fvossel/Llama-3.1-8B-Instruct-nl-to-fol |
|
|
This model contains **LoRA adapter weights** for the base model [`meta-llama/Llama-3.1-8B-Instruct`](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct). It was trained to translate **natural language statements into First-Order Logic (FOL)** representations. |
|
|
|
|
|
|
|
|
|
|
|
## Model Details |
|
|
|
|
|
### Model Description |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
- **Developed by:** Vossel et al. at Osnabrück University |
|
|
- **Funded by:** Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), project no. 456666331 |
|
|
- **Model type:** Decoder-only causal language model (LLaMA architecture) |
|
|
- **Language(s) (NLP):** English, First-Order Logic (FOL) |
|
|
- **License:** The adapter weights in this repository are released under the **Apache 2.0 License**. This repository contains **only LoRA adapter weights**, trained on top of the base model [`meta-llama/Llama-3.1-8B-Instruct`](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct), which is released under the [Llama 3.1 Community License](https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/LICENSE). |
|
|
- **Finetuned from model:** meta-llama/Llama-3.1-8B-Instruct |
|
|
|
|
|
|
|
|
## Uses |
|
|
|
|
|
|
|
|
|
|
### Direct Use |
|
|
|
|
|
This model is designed to translate natural language (NL) sentences into corresponding first-order logic (FOL) expressions. It can be used directly for: |
|
|
|
|
|
- Automated semantic parsing and formalization of NL statements into symbolic logic. |
|
|
- Supporting explainable AI systems that require symbolic reasoning based on language input. |
|
|
- Research in neurosymbolic AI, logic-based natural language understanding, and formal verification. |
|
|
- Integration into larger pipelines for natural language inference, question answering, or knowledge base population. |
|
|
|
|
|
Users should be aware that the model outputs symbolic formulas which may require further validation or post-processing depending on the application. |
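
For illustration, a lightweight syntactic sanity check might verify that a returned formula only uses the connectives the model is prompted with (see the usage example below) and that its parentheses are balanced. This is a sketch, not part of the released code, and it says nothing about semantic correctness:

```python
# Logical symbols allowed by the prompt, plus punctuation used in predicates
ALLOWED_SYMBOLS = set("∀∃¬∧∨→↔⊕(),_ ")

def looks_like_fol(formula: str) -> bool:
    """Rough syntactic check: balanced parentheses and only known symbols."""
    depth = 0
    for ch in formula:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False  # closing parenthesis without a matching opener
    if depth != 0:
        return False  # unclosed parenthesis
    return all(ch.isalnum() or ch in ALLOWED_SYMBOLS for ch in formula)

print(looks_like_fol("∀x (Dog(x) → Animal(x))"))  # True
```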
|
|
|
|
|
|
|
|
### Downstream Use |
|
|
|
|
|
This LoRA adapter can be further fine-tuned or combined with other models for specialized domains, e.g., legal, biomedical, or scientific text formalization. It is also suitable for integration into interactive systems that require formal reasoning capabilities. |
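
For downstream integration where the adapter indirection is not needed at inference time, one option is to merge the LoRA weights into the base model with PEFT. A minimal sketch (the output directory name is hypothetical):

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
model = PeftModel.from_pretrained(base, "fvossel/Llama-3.1-8B-Instruct-nl-to-fol")

# Fold the LoRA weights into the base weights and drop the adapter wrapper
merged = model.merge_and_unload()
merged.save_pretrained("llama-3.1-8b-nl-to-fol-merged")  # hypothetical output directory
```

Note that a merged checkpoint contains the base model's weights and therefore remains subject to the Llama 3.1 Community License (see the license note above).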
|
|
|
|
|
### Out-of-Scope Use |
|
|
|
|
|
- This model is not intended for general-purpose natural language generation. |
|
|
- It may not perform well on ambiguous, highly figurative, or out-of-domain sentences. |
|
|
- Users should not rely on the model output as final legal, medical, or safety-critical logic without expert review. |
|
|
|
|
|
### Recommendations |
|
|
|
|
|
- The model may produce logically incorrect or syntactically invalid formulas, especially on complex or ambiguous input. Users should validate outputs before use in critical applications. |
|
|
- As the training data contains inherent biases from the source datasets and GPT-4 generation, users should be cautious about potential biases reflected in the model’s predictions. |
|
|
- The model is specialized for English natural language and first-order logic; it may not generalize well to other languages or more expressive logics. |
|
|
- Users applying this model in downstream tasks should carefully evaluate its performance and consider human-in-the-loop validation to ensure correctness. |
|
|
- This model is intended for research and prototyping purposes and should not be used as a standalone solution in high-stakes decision making without expert oversight. |
|
|
|
|
|
|
|
|
## How to Get Started with the Model |
|
|
|
|
|
```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model repository identifiers
base_model_name = "meta-llama/Llama-3.1-8B-Instruct"
lora_weights = "fvossel/Llama-3.1-8B-Instruct-nl-to-fol"

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_name, trust_remote_code=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "left"

# Load the base model (bf16 matches the training precision)
model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Load the LoRA adapter weights on top of the base model
model = PeftModel.from_pretrained(model, lora_weights, device_map="auto")

# Build the chat prompt with the system instruction used for NL-to-FOL translation
def formatting_func(text):
    return tokenizer.apply_chat_template(
        [
            {
                "role": "system",
                "content": (
                    "You are a helpful AI assistant that translates Natural Language (NL) text "
                    "into First-Order Logic (FOL) using only the given quantors and junctors: "
                    "∀ (for all), ∃ (there exists), ¬ (not), ∧ (and), ∨ (or), → (implies), "
                    "↔ (if and only if), ⊕ (xor). "
                    "Start your answer with '𝜙=' followed by the FOL-formula. Do not include any other text."
                ),
            },
            {"role": "user", "content": text},
        ],
        tokenize=False,
        add_generation_prompt=True,  # append the assistant header so the model starts its answer
    )

# Example NL input
input_text = "All dogs are animals."

# Use formatting_func to create the model input
prompt = formatting_func(input_text)

# Tokenize the prompt; the chat template already contains the special tokens
inputs = tokenizer(prompt, return_tensors="pt", padding=True, add_special_tokens=False).to(model.device)

# Generate the FOL translation
outputs = model.generate(**inputs, max_new_tokens=100)

# Decode and print the full transcript (prompt + generated answer)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
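
The decoded output above includes the full chat transcript (system and user turns plus the generated answer). The following follow-on sketch, reusing the variables from the example, keeps only the newly generated tokens and strips the `𝜙=` prefix the model is instructed to emit (the exact formula produced may differ from the one shown in the comment):

```python
# Keep only the tokens generated after the prompt
generated_tokens = outputs[0][inputs["input_ids"].shape[1]:]
answer = tokenizer.decode(generated_tokens, skip_special_tokens=True).strip()

# The prompt asks the model to answer as '𝜙=<formula>'; drop the prefix if present
formula = answer.removeprefix("𝜙=").strip()
print(formula)  # e.g. ∀x (Dog(x) → Animal(x))
```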
|
|
|
|
|
|
|
|
## Training Details |
|
|
|
|
|
### Training Data |
|
|
|
|
|
The model was fine-tuned on two datasets, both available on the Hugging Face Hub (a loading sketch follows their descriptions below): |
|
|
|
|
|
1. **WillowNLtoFOL**: A dataset containing over 16,000 paired natural language (NL) sentences and first-order logic (FOL) expressions. Originally published as part of the following Master's thesis: |
|
|
|
|
|
Deveci, İ. E. (2024). Transformer models for translating natural language sentences into formal logical expressions [M.S. - Master of Science]. Middle East Technical University. |
|
|
Available at: https://open.metu.edu.tr/handle/11511/109445 |
|
|
|
|
|
Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0). |
|
|
|
|
|
2. **MALLS-v0**: A dataset with 34,000 pairs of NL statements and corresponding FOL rules, generated using GPT-4. The FOL rules were checked for syntactic correctness but not for semantic alignment with the NL statements. More details: https://huggingface.co/datasets/yuan-yang/MALLS-v0 |
|
|
|
|
|
Licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). As the data was collected using GPT-4, usage is also subject to OpenAI’s terms of use: https://openai.com/policies/terms-of-use |
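
Both datasets can be pulled from the Hugging Face Hub with the `datasets` library. This is a sketch only; split names and column names are not documented here and should be inspected after loading:

```python
from datasets import load_dataset

# Dataset identifiers as listed in this card's metadata
willow = load_dataset("iedeveci/WillowNLtoFOL")
malls = load_dataset("yuan-yang/MALLS-v0")

# Split and column names differ between the two datasets;
# inspect them before assembling NL/FOL training pairs.
print(willow)
print(malls)
```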
|
|
|
|
|
### Training Procedure |
|
|
|
|
|
The model was fine-tuned using LoRA adapters on top of the pre-trained `meta-llama/Llama-3.1-8B-Instruct` model, using prompt-based instruction tuning on NL/FOL pairs. A configuration sketch reconstructed from the hyperparameters below follows the list. |
|
|
|
|
|
#### Training Hyperparameters |
|
|
|
|
|
- **Training regime:** bf16 mixed precision |
|
|
- **Batch size:** 8 (per device) |
|
|
- **Learning rate:** 1e-5 |
|
|
- **Number of epochs:** 12 |
|
|
- **Optimizer:** AdamW |
|
|
- **Scheduler:** Cosine learning rate scheduler |
|
|
- **Warmup ratio:** 0.05 |
|
|
- **Gradient accumulation steps:** 2 |
|
|
- **Weight decay:** 0.01 |
|
|
- **LoRA rank (r):** 16 |
|
|
- **LoRA alpha:** 32 |
|
|
- **LoRA dropout:** 0.05 |
|
|
- **Target modules:** ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"] |
|
|
- **Bias:** none |
|
|
- **Task type:** CAUSAL_LM |
|
|
- **Early stopping patience:** 4 epochs |
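
For reference, the LoRA and optimizer settings above roughly correspond to the following PEFT / `transformers` configuration. This is a reconstruction from the listed values, not the original training script; the output directory is a placeholder:

```python
from peft import LoraConfig, TaskType
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    bias="none",
    task_type=TaskType.CAUSAL_LM,
)

training_args = TrainingArguments(
    output_dir="nl-to-fol-lora",      # hypothetical output directory
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,
    learning_rate=1e-5,
    num_train_epochs=12,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    weight_decay=0.01,
    bf16=True,                        # bf16 mixed precision
    optim="adamw_torch",              # AdamW
)
# Early stopping (patience of 4 epochs) would additionally require
# transformers.EarlyStoppingCallback and per-epoch evaluation.
```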
|
|
|
|
|
|