# RegulaUAE-1.2B: UAE Rulebook Q&A Assistant (Fine-tuned LFM2 Model)

**Model ID:** `rajeshthangaraj1/uae_rule_book_QA_assistant`
**Base Model:** [unsloth/LFM2-1.2B](https://docs.unsloth.ai/)
## Model Overview

This model is a fine-tuned version of LFM2 (1.2B), optimized as a conversational assistant for answering questions based on the UAE Central Bank Rulebook (Banking Regulations). It specializes in navigating regulatory sections such as Capital Adequacy, Licensing, Corporate Governance, and Risk Management.

- **Domain:** UAE Central Bank banking regulations
- **Precision:** Quantized to 4-bit with bitsandbytes for efficient deployment
- **Pipeline:** text-generation with chat template support
- **Framework:** Hugging Face transformers
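
The card does not publish the exact quantization settings, but a typical bitsandbytes 4-bit configuration looks like the sketch below (NF4 quantization with fp16 compute is an assumption, not a documented choice). The resulting config object is passed to `AutoModelForCausalLM.from_pretrained` via `quantization_config=`.

```python
import torch
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantization with fp16 compute.
# These are common defaults, assumed here rather than taken from the card.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
```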
## Use Cases

- **Legal and regulatory Q&A:** Ask precise questions such as:
  - "What is the relationship between Decree Law No. (20) of 2018 and Cabinet Decision No. (10) of 2019?"
  - "What are the minimum capital ratios specified in Article (2)?"
- **Educational tool:** Useful for students and professionals who need quick, accurate answers to banking-regulation questions.
## Limitations

- **Hallucination risk:** Without explicit context or document retrieval, the model may generate plausible but incorrect answers.
- **Domain-specific:** Tailored exclusively to the banking sections of the UAE Central Bank Rulebook.
- **Precision:** May occasionally misstate percentages or the contents of articles that were not in the training set.
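
One way to reduce hallucination risk is to retrieve the relevant rulebook chunk yourself and paste it into the prompt, mirroring the `"context"` field the model was trained with. A minimal sketch (the `build_messages` helper is hypothetical, not part of this model's API):

```python
def build_messages(question, context):
    """Build a chat message list that grounds the model in a retrieved rulebook chunk.

    `context` is assumed to be a chunk retrieved from the UAE Central Bank Rulebook;
    retrieval itself (e.g. embedding search) is out of scope for this sketch.
    """
    system = (
        "You are an assistant specialized in the UAE Central Bank Rulebook. "
        "Answer only from the provided context. "
        "If the answer is not in the context, reply 'Not found in UAE Rulebook'."
    )
    user = f"Context:\n{context}\n\nQuestion: {question}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]
```

The returned list can be passed directly to `tokenizer.apply_chat_template` as in the usage example below.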
## 📊 Dataset Creation

**Source Data:** The dataset was built using publicly available content from the official UAE Central Bank Rulebook, accessible at rulebook.centralbank.ae. The rulebook outlines the legal and compliance frameworks governing financial institutions in the UAE, with a focus on banking regulations such as Capital Adequacy, Licensing, Governance, and Risk Management.
**Preprocessing:**

- The scraped content was cleaned and segmented into approximately 65,000 text chunks.
- Each chunk contains ~500 characters, preserving semantic boundaries such as article titles, clauses, and legal definitions.
- These chunks were used as context for generating question-answer pairs.

Each record in the resulting dataset has three fields:

- `"context"`: the rulebook chunk
- `"question"`: a question generated from that chunk
- `"answer"`: an answer grounded in the context

This dataset was then used to fine-tune the model for domain-specific legal QA behavior.
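
The exact preprocessing code was not released; the sketch below illustrates one plausible way to produce ~500-character chunks while keeping paragraph boundaries intact (the `chunk_text` helper and its greedy packing strategy are assumptions, not the published pipeline).

```python
def chunk_text(text, max_chars=500):
    """Greedily pack paragraphs into chunks of at most ~max_chars characters.

    Splitting on blank lines keeps semantic units (articles, clauses,
    definitions) whole, at the cost of some chunks being shorter than
    max_chars. A single paragraph longer than max_chars becomes its own chunk.
    """
    chunks, current = [], ""
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```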
Example Usage
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
model_id = "rajeshthangaraj1/uae_rule_book_QA_assistant"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
model_id,
device_map="auto",
torch_dtype=torch.float16
)
messages = [
{"role": "system", "content":
"You are an assistant specialized in the UAE Central Bank Rulebook. "
"Only answer based on the UAE Rulebook. "
"If the answer is not in the Rulebook, reply 'Not found in UAE Rulebook'."},
{"role": "user", "content":
"According to the UAE Central Bank Rulebook – Capital Adequacy Section, "
"what does Article (2) specify about minimum capital ratios?"}
]
# Apply chat template
inputs = tokenizer.apply_chat_template(
messages,
add_generation_prompt=True,
return_tensors="pt"
).to(model.device)
inputs.pop("token_type_ids", None)
# Generate response
outputs = model.generate(**inputs, max_new_tokens=128)
answer = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(answer) ```
## Gradio Demo

```python
import gradio as gr

def chat_with_model(message, history):
    # Reuses `tokenizer` and `model` from the loading example above
    messages = [{"role": "user", "content": message}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_dict=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=128)
    return tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)

gr.ChatInterface(fn=chat_with_model, title="UAE Rulebook QA Assistant").launch()
```
## 🔧 Technical Details

- **Quantization:** 4-bit (bitsandbytes)
- **Training Framework:** Hugging Face transformers + accelerate
- **Task Type:** Domain-specific legal Q&A
**✍️ Author:** @rajeshthangaraj1
**📅 Last Updated:** 2025