ARIES 1.5B - Reasoning Language Model
A 1.5B parameter reasoning model fine-tuned with custom reasoning tokens for step-by-step mathematical problem solving.
Model Details
- Architecture: Qwen2-1.5B-Instruct (base) + Custom Reasoning Tokens
- Parameters: 1.54B
- Training Method: Fine-tuned on GSM8K with reasoning token integration
- Special Tokens: <think>, <context>, <answer>, <end>
- Training Loss: 0.2130
- Version: v1.0-finetuned
What Makes This Model Special
This model extends Qwen2-1.5B-Instruct with:
- Custom reasoning tokens for structured thought processes
- Step-by-step explanation capabilities
- GSM8K-style notation support (<<calculation=result>>)
- Chain-of-thought reasoning integration
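Because the `<<calculation=result>>` annotations are machine-readable, generated text can be checked for arithmetic slips. The following is a minimal sketch, not part of the model's tooling: `check_annotations` and its helper are hypothetical names, and the sketch assumes annotations contain only numbers and the operators `+`, `-`, `*`, `/`.

```python
import ast
import operator
import re

# Matches GSM8K calculator annotations such as <<25+17=42>>.
ANNOTATION = re.compile(r"<<([^=<>]+)=([^<>]+)>>")

# Only basic arithmetic is supported in this sketch (an assumption).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _evaluate(node):
    """Evaluate a parsed arithmetic expression without calling eval()."""
    if isinstance(node, ast.Expression):
        return _evaluate(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_evaluate(node.operand)
    raise ValueError("unsupported expression")


def check_annotations(text, tol=1e-6):
    """Return (expression, claimed_result, is_correct) for each annotation."""
    results = []
    for expr, claimed in ANNOTATION.findall(text):
        try:
            value = _evaluate(ast.parse(expr.strip(), mode="eval"))
            ok = abs(value - float(claimed)) <= tol
        except (ValueError, SyntaxError, ZeroDivisionError):
            ok = False
        results.append((expr, claimed, ok))
    return results


print(check_annotations("25 + 17 = <<25+17=42>>42"))
```

Parsing with `ast` instead of `eval` keeps the check limited to arithmetic, which matters when the text being checked comes from a sampling model.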
Usage
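from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "ziadrone/aries-1.5b-reasoning"

# Load the weights in BF16 and let device_map place them on the available device
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Prefix the question with the <think> reasoning token
prompt = "<think> What is 25 + 17?"
# Move inputs to the model's device instead of hard-coding "cuda"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,  # enable sampling explicitly so temperature/top_p take effect
    temperature=0.7,
    top_p=0.9,
)

# Keep special tokens so <answer> and <end> stay visible in the output
print(tokenizer.decode(outputs[0], skip_special_tokens=False))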
Example Outputs
Addition:
Input: <think> What is 25 + 17?
Output: The answer to 25 + 17 is 42.
Explanation: We add the two numbers together.
25 + 17 = <<25+17=42>>42
Word Problems:
Input: <think> If I have $50 and spend $23, how much is left?
Output: You are left with 27 dollars.
If you have $50 and spend $23, then the amount left will be
$50 - $23 = <<50-23=27>>27
Distance Problems:
Input: <think> A train travels 60 mph for 3 hours. How far does it go?
Output: It goes 180 miles because 60 times 3 is 180.
<answer> 180 <end>
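When the output ends with an `<answer> ... <end>` pair, as in the distance example, the final answer can be pulled out of the decoded string. This is a minimal sketch with a hypothetical helper name, `extract_answer`. It assumes the text was decoded with `skip_special_tokens=False`, and it returns `None` when the tags are absent, as they are in the first two examples above.

```python
import re

# Captures whatever sits between <answer> and <end>, trimmed of whitespace.
ANSWER = re.compile(r"<answer>\s*(.*?)\s*<end>", re.DOTALL)


def extract_answer(decoded):
    """Return the text of the last <answer> ... <end> pair, or None if absent."""
    matches = ANSWER.findall(decoded)
    return matches[-1] if matches else None


sample = "It goes 180 miles because 60 times 3 is 180.\n<answer> 180 <end>"
print(extract_answer(sample))
```

Taking the last match rather than the first is a design choice for cases where the model emits more than one tagged answer in a single generation.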
Training Details
- Dataset: GSM8K (1,500 training examples)
- Epochs: 2
- Batch Size: 1 × 32 gradient accumulation steps (effective batch size 32)
- Learning Rate: 3e-5 with cosine schedule + warmup
- Optimizer: AdamW with CPU offloading (memory efficient)
- Training Time: ~42 minutes on single GPU
- Hardware: NVIDIA GPU with 24GB VRAM
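The figures above imply a fairly short run. The sketch below works out the effective batch size and an approximate optimizer step count, then evaluates a cosine schedule with linear warmup at a few steps. The function names are hypothetical, the warmup length of 5 steps is an assumption (the card does not state it), and the step count assumes a partial final accumulation window still triggers an update.

```python
import math


def optimizer_steps(num_examples, micro_batch, grad_accum, epochs):
    """Return (effective_batch_size, total_optimizer_steps)."""
    effective_batch = micro_batch * grad_accum
    # Assumption: the last, partially filled accumulation window still steps.
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return effective_batch, steps_per_epoch * epochs


def cosine_lr(step, total_steps, peak_lr, warmup_steps):
    """Linear warmup from 0 to peak_lr, then cosine decay to 0."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


effective_batch, total_steps = optimizer_steps(1500, 1, 32, 2)
print(effective_batch, total_steps)

WARMUP_STEPS = 5  # hypothetical value, not taken from the model card
for step in (0, WARMUP_STEPS, total_steps // 2, total_steps):
    print(step, cosine_lr(step, total_steps, 3e-5, WARMUP_STEPS))
```

Frameworks differ in how they count the final partial window, so the actual number of updates may be slightly lower than this estimate.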
Training Strategy
The model was trained using a memory-efficient approach:
- CPU-offloaded optimizer states (saved ~6GB GPU memory)
- Gradient checkpointing enabled
- Mixed precision (BF16)
- Custom learning rate scheduler with warmup
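As a rough sanity check on the ~6GB saving, AdamW keeps two moment buffers per parameter. The arithmetic below is a sketch under assumptions: the 2-byte and 4-byte storage widths are illustrative, since the card does not say which precision the optimizer states used, and `adamw_state_gb` is a hypothetical helper.

```python
def adamw_state_gb(num_params, bytes_per_value):
    """Size of AdamW's first and second moment buffers, in decimal GB."""
    moment_buffers = 2  # exp_avg and exp_avg_sq, one value each per parameter
    return moment_buffers * num_params * bytes_per_value / 1e9


NUM_PARAMS = 1.54e9  # parameter count from Model Details

# Assumed storage widths: 2 bytes (BF16) and 4 bytes (FP32).
print(f"2-byte states: {adamw_state_gb(NUM_PARAMS, 2):.2f} GB")
print(f"4-byte states: {adamw_state_gb(NUM_PARAMS, 4):.2f} GB")
```

Under the 2-byte assumption the buffers come to about 6.16 GB, which is in line with the reported saving. With 4-byte states the figure would be roughly twice that.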
Roadmap
- v1.0 (Current): Fine-tuned on GSM8K
- v2.0 (Coming): Knowledge distillation for improved performance
- v3.0 (Planned): Extended to MATH and MMLU datasets
License
Apache 2.0
Credits
- Base Model: Qwen Team (Qwen2-1.5B-Instruct)
- Reasoning Framework: ARIES (Autonomous Reasoning Improvement via Ensembling Systems)
- Training Dataset: OpenAI GSM8K
- Framework: HuggingFace Transformers
Contact
For questions or collaborations: [Your contact]