# Model Card for Islamspecialist PRO 12B

## Model Description

Islamspecialist PRO 12B is a 12-billion-parameter large language model fine-tuned from mistralai/Mistral-Nemo-Base-2407. It is a specialized model trained to act as a domain expert in Islamic scholarship and criticism, and it improves on the 7B version with a larger pretraining dataset, higher-quality data, and more parameters. Its knowledge base includes the Hadith collection Sahih Muslim (translated by Abdul Hamid Siddiqui), multiple Quran translations (including the Oxford Quran), and other high-quality texts on Islamic themes.

  • Developed by: Catdrout
  • Model type: Transformer-based decoder-only language model
  • Language(s): English (Primary, based on translated and transliterated source texts)
  • License: CC-BY-SA 4.0
  • Finetuned from: mistralai/Mistral-Nemo-Base-2407 (important: set **Finished.** as the stop token)

## Uses

### Direct Use

This model is designed for direct use in generating analysis, responses, and critiques based on Islamic texts. It functions as a digital scholar for tasks involving:

  • Question-Answering on Islamic topics
  • Retrieval-Augmented Generation (RAG); a minimal prompt sketch follows this list
  • Discussion and reasoning about Islamic scholarship

### Out-of-Scope Use

The model is not intended for:

  • General-purpose tasks outside its domain expertise.
  • Promoting hate speech, religious intolerance, or illegal activities.
  • Use without critical verification of its outputs against original sources.

## Bias, Risks, and Limitations

This model is trained on a curated dataset focused on Islamic criticism and scholarship. Users should be aware of the following:

  • Biases: The model's outputs may reflect the perspectives and potential biases present in its training data, including the specific translations used (e.g., Abdul Hamid Siddiqui's Sahih Muslim, Oxford Quran).
  • Risks: It may generate controversial or offensive content related to religious topics.
  • Limitations: The model was trained on 8-10 million tokens for a short duration (1 hour). Its performance on nuanced or highly specific theological questions may be limited compared to larger, more general models.

**Recommendations:** Users are advised to cross-verify the model's outputs against original authoritative sources and to exercise caution in sensitive discussions.

## How to Get Started

Load the model using the transformers library:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "c4tdr0ut/Islamspecialist-PRO-12B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Continue with your inference code.
```

## Training Details

### Training Data

The model was trained on approximately 8-10 million tokens from:

  • Sahih Muslim (Translated by Abdul Hamid Siddiqui)
  • Multiple Quran translations, including the Oxford Quran
  • Other high-quality books on Islamic criticism and scholarship

### Training Procedure

  • Pretraining: 8 epochs
  • Supervised Fine-Tuning (SFT): 2.3 epochs
  • SFT Techniques: Open-ended, follow-up, negative-example, RAG, and correction training with masked entries (a label-masking sketch follows this list)

### Key Hyperparameters

  • Training Regime: bf16
  • Learning Rate: 2.0e-05 (constant scheduler)
  • Micro Batch Size: 2
  • Gradient Accumulation Steps: 75
  • Sequence Length: 5000
  • Optimizer: paged_adamw_8bit

### Infrastructure

  • Hardware: 1x NVIDIA B200
  • Software: PyTorch 2.7, CUDA 12.8, Axolotl
  • Training Time: 1 hour

## Evaluation

The model was evaluated conversationally on its ability to reason and answer questions based on Islamic texts. As a specialized conversational agent, it was not evaluated using standard benchmark metrics.

## Citation

No accompanying paper is available for this model.

## Contact

For questions and comments, please contact:

Model Card Author: Catdrout
