🧠 Srikri7/qwen3.5-2b-reasoning

📢 Release Note: Build Environment Upgrades

Fine-tuning Framework: Unsloth 2026.3.7

Core Dependencies: Transformers 5.3.0, Torch 2.10.0+cu128

Hardware: Optimized for Tesla T4 (16GB VRAM) using 4-bit NormalFloat (NF4) quantization.

Native Developer Role: Support for the "developer" role natively to ensure compatibility with modern coding agents (Claude Code, OpenCode).

Continuous Thinking: Optimized to run autonomously for over 9 minutes without stalling.

💡 Model Introduction

qwen3.5-2b-reasoning is a highly efficient reasoning model fine-tuned on the Qwen3.5-2B architecture. Despite its 2-billion parameter count, it leverages high-density Chain-of-Thought (CoT) distillation primarily sourced from Claude-4.6 Opus trajectories.

The model is specifically trained to avoid the "repetitive loop" failure common in small models by enforcing a strict hierarchy of analytical thought within <think> tags.

🧠 Learned Reasoning Scaffold

The model adopts a streamlined structured thinking pattern to ensure deep analytical capacity without redundant cognitive loops:

<think>
1. [Understanding]: Restate the core objective and identify key numerical constraints (e.g., "252 students", "41-seater bus").
2. [Plan]: Identify necessary strategies or math rules (e.g., Product Rule, Rounding-up logic).
3. [Step-by-step Reasoning]: Execute transformations with intermediate justifications.
4. [Verification]: Cross-check the final result against the initial constraints.
</think>
[Final Answer]

Downloads last month: -; Downloads are not tracked for this model. How to track

Model tree for Srikri7/qwen3.5-2b-reasoning

Base model

Qwen/Qwen3.5-2B-Base

Finetuned

Qwen/Qwen3.5-2B

Finetuned

(193)

this model

Srikri7
/

qwen3.5-2b-reasoning

🧠 Srikri7/qwen3.5-2b-reasoning

💡 Model Introduction

🧠 Learned Reasoning Scaffold

Model tree for Srikri7/qwen3.5-2b-reasoning

Dataset used to train Srikri7/qwen3.5-2b-reasoning