🧠 ARIES 1.5B - Reasoning Language Model

A 1.5B-parameter reasoning model fine-tuned with custom reasoning tokens for step-by-step mathematical problem solving.

📊 Model Details

  • Architecture: Qwen2-1.5B-Instruct (base) + Custom Reasoning Tokens
  • Parameters: 1.54B
  • Training Method: Fine-tuned on GSM8K with reasoning token integration
  • Special Tokens: <think>, <context>, <answer>, <end> (see the registration sketch after this list)
  • Training Loss: 0.2130
  • Version: v1.0-finetuned
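
The special tokens above are added on top of the base tokenizer. The exact training setup is not published, so the snippet below is only a minimal sketch of how such tokens are typically registered with HuggingFace Transformers; the model/tokenizer names and the resize call are assumptions, not the model's actual code:

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-1.5B-Instruct")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-1.5B-Instruct")

# Register the reasoning tokens and grow the embedding matrix to match.
tokenizer.add_special_tokens(
    {"additional_special_tokens": ["<think>", "<context>", "<answer>", "<end>"]}
)
model.resize_token_embeddings(len(tokenizer))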

🎯 What Makes This Model Special

This model extends Qwen2-1.5B-Instruct with:

  1. Custom reasoning tokens for structured thought processes
  2. Step-by-step explanation capabilities
  3. GSM8K-style calculator notation support (<<calculation=result>>; a small parsing sketch follows this list)
  4. Chain-of-thought reasoning integration
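
The <<calculation=result>> markers follow GSM8K's calculator-annotation convention. A minimal way to pull them out of generated text is a regular expression; the helper below is an illustration of that convention, not part of the model's code:

import re

ANNOTATION = re.compile(r"<<([^=<>]+)=([^<>]+)>>")

def find_annotations(text):
    # Return (expression, claimed_result) pairs found in the text.
    return ANNOTATION.findall(text)

print(find_annotations("25 + 17 = <<25+17=42>>42"))  # [('25+17', '42')]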

πŸ“ Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "ziadrone/aries-1.5b-reasoning",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("ziadrone/aries-1.5b-reasoning")

prompt = "<think> What is 25 + 17?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.7,
    top_p=0.9
)

print(tokenizer.decode(outputs[0], skip_special_tokens=False))
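
Continuing from the snippet above, the final answer can be pulled out of the decoded string whenever the model emits the <answer> ... <end> pattern shown in the examples below; this helper is a sketch based on that format, not an official API:

from typing import Optional

def extract_answer(decoded: str) -> Optional[str]:
    # Return the text between <answer> and <end>, or None if the pattern is absent.
    if "<answer>" in decoded and "<end>" in decoded:
        return decoded.split("<answer>", 1)[1].split("<end>", 1)[0].strip()
    return None

text = tokenizer.decode(outputs[0], skip_special_tokens=False)
print(extract_answer(text))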

🧪 Example Outputs

Addition:

Input: <think> What is 25 + 17?
Output: The answer to 25 + 17 is 42. 
        Explanation: We add the two numbers together. 
        25 + 17 = <<25+17=42>>42

Word Problems:

Input: <think> If I have $50 and spend $23, how much is left?
Output: You are left with 27 dollars. 
        If you have $50 and spend $23, then the amount left will be 
        $50 - $23 = <<50-23=27>>27

Distance Problems:

Input: <think> A train travels 60 mph for 3 hours. How far does it go?
Output: It goes 180 miles because 60 times 3 is 180.
        <answer> 180 <end>

📈 Training Details

  • Dataset: GSM8K (1,500 training examples)
  • Epochs: 2
  • Batch Size: 1 × 32 gradient accumulation steps (effective batch size 32; a configuration sketch follows this list)
  • Learning Rate: 3e-5 with cosine schedule + warmup
  • Optimizer: AdamW with CPU offloading (memory efficient)
  • Training Time: ~42 minutes on single GPU
  • Hardware: NVIDIA GPU with 24GB VRAM
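
The exact training script is not published, so the configuration below is only a hedged sketch of the hyperparameters listed above, expressed as HuggingFace TrainingArguments; the output path and warmup ratio are assumptions:

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="aries-1.5b-reasoning-ft",   # hypothetical output path
    num_train_epochs=2,                     # 2 epochs over ~1,500 GSM8K examples
    per_device_train_batch_size=1,          # micro-batch of 1
    gradient_accumulation_steps=32,         # effective batch size 32
    learning_rate=3e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,                      # warmup fraction is an assumption
    bf16=True,                              # mixed precision (BF16)
    gradient_checkpointing=True,
    logging_steps=10,
)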

🎓 Training Strategy

The model was trained using a memory-efficient approach:

  • CPU-offloaded optimizer states (saved ~6 GB of GPU memory; see the sketch after this list)
  • Gradient checkpointing enabled
  • Mixed precision (BF16)
  • Custom learning rate scheduler with warmup
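
The card does not say which mechanism provides the CPU offloading; one common option is DeepSpeed ZeRO-Offload, so the snippet below is only an assumption shown for illustration. The dictionary keys follow DeepSpeed's documented config schema, and gradient_checkpointing_enable is the standard Transformers call:

# Enable activation (gradient) checkpointing on the loaded model.
model.gradient_checkpointing_enable()

# Assumed DeepSpeed-style config for BF16 + CPU-offloaded optimizer states.
ds_config = {
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 2,
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
    },
    "train_micro_batch_size_per_gpu": 1,
    "gradient_accumulation_steps": 32,
}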

🔄 Roadmap

  • v1.0 (Current): Fine-tuned on GSM8K
  • v2.0 (Coming): Knowledge distillation for improved performance
  • v3.0 (Planned): Extended to MATH and MMLU datasets

📄 License

Apache 2.0

🙏 Credits

  • Base Model: Qwen Team (Qwen2-1.5B-Instruct)
  • Reasoning Framework: ARIES (Autonomous Reasoning Improvement via Ensembling Systems)
  • Training Dataset: OpenAI GSM8K
  • Framework: HuggingFace Transformers

📧 Contact

For questions or collaborations: [Your contact]
