---
license: apache-2.0
base_model: Qwen/Qwen2-1.5B-Instruct
language:
- en
tags:
- reasoning
- math
- gsm8k
- chain-of-thought
pipeline_tag: text-generation
---

# 🧠 ARIES 1.5B - Reasoning Language Model

A 1.5B-parameter reasoning model fine-tuned with custom reasoning tokens for step-by-step mathematical problem solving.

## 📊 Model Details

- **Architecture:** Qwen2-1.5B-Instruct (base) + custom reasoning tokens
- **Parameters:** 1.54B
- **Training Method:** Fine-tuned on GSM8K with reasoning-token integration
- **Special Tokens:** four custom reasoning tokens added on top of the base vocabulary (see the token-registration sketch at the end of this card)
- **Training Loss:** 0.2130
- **Version:** v1.0-finetuned

## 🎯 What Makes This Model Special

This model extends Qwen2-1.5B-Instruct with:

1. **Custom reasoning tokens** for structured thought processes
2. **Step-by-step explanation** capabilities
3. **GSM8K-style calculator notation** support (`<<...>>`)
4. **Chain-of-thought reasoning** integration

## 📝 Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "ziadrone/aries-1.5b-reasoning",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("ziadrone/aries-1.5b-reasoning")

prompt = " What is 25 + 17?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=False))
```

## 🧪 Example Outputs

**Addition:**
```
Input:  What is 25 + 17?
Output: The answer to 25 + 17 is 42.
        Explanation: We add the two numbers together. 25 + 17 = <<25+17=42>>42
```

**Word Problems:**
```
Input:  If I have $50 and spend $23, how much is left?
Output: You are left with 27 dollars. If you have $50 and spend $23,
        then the amount left will be $50 - $23 = <<50-23=27>>27
```

**Distance Problems:**
```
Input:  A train travels 60 mph for 3 hours. How far does it go?
Output: It goes 180 miles because 60 times 3 is 180. 180
```

## 📈 Training Details

- **Dataset:** GSM8K (1,500 training examples)
- **Epochs:** 2
- **Batch Size:** 1 per device × 32 gradient-accumulation steps
- **Learning Rate:** 3e-5 with warmup followed by a cosine schedule
- **Optimizer:** AdamW with CPU-offloaded optimizer states (memory efficient)
- **Training Time:** ~42 minutes on a single GPU
- **Hardware:** NVIDIA GPU with 24 GB VRAM

## 🎓 Training Strategy

The model was trained with a memory-efficient setup (a minimal sketch appears at the end of this card):

- **CPU-offloaded optimizer states** (saved ~6 GB of GPU memory)
- **Gradient checkpointing** enabled
- **Mixed precision** (BF16)
- **Custom learning-rate scheduler** with warmup

## 🔄 Roadmap

- **v1.0** (current): Fine-tuned on GSM8K
- **v2.0** (coming): Knowledge distillation for improved performance
- **v3.0** (planned): Extended to the MATH and MMLU datasets

## 📄 License

Apache 2.0

## 🙏 Credits

- **Base Model:** Qwen Team (Qwen2-1.5B-Instruct)
- **Reasoning Framework:** ARIES (Autonomous Reasoning Improvement via Ensembling Systems)
- **Training Dataset:** OpenAI GSM8K
- **Framework:** Hugging Face Transformers

## 📧 Contact

For questions or collaborations: [Your contact]
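
## 🧩 Appendix: Illustrative Sketches

**Reasoning-token registration.** The literal reasoning-token strings are not reproduced in this card, so the sketch below uses placeholder names (`<reasoning>`, `</reasoning>`, `<answer>`, `</answer>`) purely for illustration. The mechanism shown — registering extra special tokens and resizing the embedding matrix — is standard Hugging Face Transformers usage, not the published ARIES training script.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder names: the actual ARIES reasoning tokens are not listed in this card.
REASONING_TOKENS = ["<reasoning>", "</reasoning>", "<answer>", "</answer>"]

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-1.5B-Instruct")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-1.5B-Instruct")

# Register the extra tokens, then grow the embedding table so the new ids have rows.
tokenizer.add_special_tokens({"additional_special_tokens": REASONING_TOKENS})
model.resize_token_embeddings(len(tokenizer))
```

After this step the new tokens can be used as delimiters in the fine-tuning data, and their embeddings are learned during training like any other token.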
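
**Memory-efficient fine-tuning setup.** The exact training script is not included with this card; the sketch below only shows how the pieces named in the Training Strategy section (BF16 weights, gradient checkpointing, AdamW, 32-step gradient accumulation, warmup + cosine learning-rate schedule) fit together using standard PyTorch / Transformers APIs. The warmup ratio and the empty dataloader are placeholder assumptions, and the CPU offloading of optimizer states (e.g. via DeepSpeed ZeRO-Offload) is omitted here.

```python
import torch
from torch.optim import AdamW
from transformers import AutoModelForCausalLM, get_cosine_schedule_with_warmup

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-1.5B-Instruct", torch_dtype=torch.bfloat16
)
model.gradient_checkpointing_enable()  # trade extra compute for activation memory

# 1,500 examples x 2 epochs with per-device batch 1 and 32 accumulation steps
# works out to roughly 94 optimizer steps.
ACCUM_STEPS = 32
num_optimizer_steps = (1500 * 2) // ACCUM_STEPS
optimizer = AdamW(model.parameters(), lr=3e-5)
lr_scheduler = get_cosine_schedule_with_warmup(
    optimizer,
    num_warmup_steps=int(0.1 * num_optimizer_steps),  # assumed warmup ratio
    num_training_steps=num_optimizer_steps,
)

dataloader = []  # placeholder: replace with tokenized GSM8K batches
for step, batch in enumerate(dataloader):
    loss = model(**batch).loss / ACCUM_STEPS  # scale loss for gradient accumulation
    loss.backward()
    if (step + 1) % ACCUM_STEPS == 0:
        optimizer.step()
        lr_scheduler.step()
        optimizer.zero_grad()
```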