🧠 ARIES 1.5B - Reasoning Language Model

A 1.5B-parameter reasoning model fine-tuned with custom reasoning tokens for step-by-step mathematical problem solving.

📊 Model Details

  • Architecture: Qwen2-1.5B-Instruct (base) + Custom Reasoning Tokens
  • Parameters: 1.54B
  • Training Method: Fine-tuned on GSM8K with reasoning token integration
  • Special Tokens: <think>, <context>, <answer>, <end> (see the registration sketch after this list)
  • Training Loss: 0.2130
  • Version: v1.0-finetuned
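
The special tokens above are added on top of the base tokenizer. The exact training setup is not published, so the snippet below is only a minimal sketch of how such tokens are typically registered with HuggingFace Transformers; the model/tokenizer names and the resize call are assumptions, not the model's actual code:

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-1.5B-Instruct")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-1.5B-Instruct")

# Register the reasoning tokens and grow the embedding matrix to match.
tokenizer.add_special_tokens(
    {"additional_special_tokens": ["<think>", "<context>", "<answer>", "<end>"]}
)
model.resize_token_embeddings(len(tokenizer))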

🎯 What Makes This Model Special

This model extends Qwen2-1.5B-Instruct with:

  1. Custom reasoning tokens for structured thought processes
  2. Step-by-step explanation capabilities
  3. GSM8K-style calculator notation support (<<calculation=result>>; a small parsing sketch follows this list)
  4. Chain-of-thought reasoning integration
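
The <<calculation=result>> markers follow GSM8K's calculator-annotation convention. A minimal way to pull them out of generated text is a regular expression; the helper below is an illustration of that convention, not part of the model's code:

import re

ANNOTATION = re.compile(r"<<([^=<>]+)=([^<>]+)>>")

def find_annotations(text):
    # Return (expression, claimed_result) pairs found in the text.
    return ANNOTATION.findall(text)

print(find_annotations("25 + 17 = <<25+17=42>>42"))  # [('25+17', '42')]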

πŸ“ Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "ziadrone/aries-1.5b-reasoning",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("ziadrone/aries-1.5b-reasoning")

prompt = "<think> What is 25 + 17?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.7,
    top_p=0.9
)

print(tokenizer.decode(outputs[0], skip_special_tokens=False))
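
Continuing from the snippet above, the final answer can be pulled out of the decoded string whenever the model emits the <answer> ... <end> pattern shown in the examples below; this helper is a sketch based on that format, not an official API:

from typing import Optional

def extract_answer(decoded: str) -> Optional[str]:
    # Return the text between <answer> and <end>, or None if the pattern is absent.
    if "<answer>" in decoded and "<end>" in decoded:
        return decoded.split("<answer>", 1)[1].split("<end>", 1)[0].strip()
    return None

text = tokenizer.decode(outputs[0], skip_special_tokens=False)
print(extract_answer(text))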

🧪 Example Outputs

Addition:

Input: <think> What is 25 + 17?
Output: The answer to 25 + 17 is 42. 
        Explanation: We add the two numbers together. 
        25 + 17 = <<25+17=42>>42

Word Problems:

Input: <think> If I have $50 and spend $23, how much is left?
Output: You are left with 27 dollars. 
        If you have $50 and spend $23, then the amount left will be 
        $50 - $23 = <<50-23=27>>27

Distance Problems:

Input: <think> A train travels 60 mph for 3 hours. How far does it go?
Output: It goes 180 miles because 60 times 3 is 180.
        <answer> 180 <end>

📈 Training Details

  • Dataset: GSM8K (1,500 training examples)
  • Epochs: 2
  • Batch Size: 1 × 32 gradient accumulation steps (effective batch size 32; a configuration sketch follows this list)
  • Learning Rate: 3e-5 with cosine schedule + warmup
  • Optimizer: AdamW with CPU offloading (memory efficient)
  • Training Time: ~42 minutes on single GPU
  • Hardware: NVIDIA GPU with 24GB VRAM
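
The exact training script is not published, so the configuration below is only a hedged sketch of the hyperparameters listed above, expressed as HuggingFace TrainingArguments; the output path and warmup ratio are assumptions:

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="aries-1.5b-reasoning-ft",   # hypothetical output path
    num_train_epochs=2,                     # 2 epochs over ~1,500 GSM8K examples
    per_device_train_batch_size=1,          # micro-batch of 1
    gradient_accumulation_steps=32,         # effective batch size 32
    learning_rate=3e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,                      # warmup fraction is an assumption
    bf16=True,                              # mixed precision (BF16)
    gradient_checkpointing=True,
    logging_steps=10,
)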

🎓 Training Strategy

The model was trained using a memory-efficient approach:

  • CPU-offloaded optimizer states (saved ~6 GB of GPU memory; see the sketch after this list)
  • Gradient checkpointing enabled
  • Mixed precision (BF16)
  • Custom learning rate scheduler with warmup
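
The card does not say which mechanism provides the CPU offloading; one common option is DeepSpeed ZeRO-Offload, so the snippet below is only an assumption shown for illustration. The dictionary keys follow DeepSpeed's documented config schema, and gradient_checkpointing_enable is the standard Transformers call:

# Enable activation (gradient) checkpointing on the loaded model.
model.gradient_checkpointing_enable()

# Assumed DeepSpeed-style config for BF16 + CPU-offloaded optimizer states.
ds_config = {
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 2,
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
    },
    "train_micro_batch_size_per_gpu": 1,
    "gradient_accumulation_steps": 32,
}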

🔄 Roadmap

  • v1.0 (Current): Fine-tuned on GSM8K
  • v2.0 (Coming): Knowledge distillation for improved performance
  • v3.0 (Planned): Extended to MATH and MMLU datasets

📄 License

Apache 2.0

🙏 Credits

  • Base Model: Qwen Team (Qwen2-1.5B-Instruct)
  • Reasoning Framework: ARIES (Autonomous Reasoning Improvement via Ensembling Systems)
  • Training Dataset: OpenAI GSM8K
  • Framework: HuggingFace Transformers

📧 Contact

For questions or collaborations: [Your contact]
