saishshinde15/Clyrai_Base_Reasoning_GGUF (GGUF - Q4) (Formerly known as TBH.AI Base Reasoning )

Developed by: Clyrai
License: apache-2.0
Fine-tuned from: Qwen/Qwen2.5-3B-Instruct
GGUF Format: 4-bit quantized (Q4) for optimized inference

Model Description

Clyrai Base Reasoning (GGUF - Q4) is a 4-bit GGUF quantized version of saishshinde15/Clyrai_Base_Reasoning, a fine-tuned model based on Qwen 2.5. This version is designed for high-efficiency inference on CPU/GPU with minimal memory usage, making it ideal for on-device applications and low-latency AI systems.

Trained using GRPO (General Reinforcement with Policy Optimization), the model excels in self-reasoning, logical deduction, and structured problem-solving, comparable to DeepSeek-R1. The Q4 quantization ensures significantly lower memory requirements while maintaining strong reasoning performance.

Features

4-bit Quantization (Q4 GGUF): Optimized for low-memory, high-speed inference on compatible backends.
Self-Reasoning AI: Can process complex queries autonomously, generating logical and structured responses.
GRPO Fine-Tuning: Uses policy optimization for improved logical consistency and step-by-step reasoning.
Efficient On-Device Deployment: Works seamlessly with llama.cpp, KoboldCpp, GPT4All, and ctransformers.
Ideal for Logical Tasks: Best suited for research, coding logic, structured Q&A, and decision-making applications.

Limitations

This Q4 GGUF version is inference-only and does not support additional fine-tuning.
Quantization may slightly reduce response accuracy compared to FP16/full-precision models.
Performance depends on the execution environment and GGUF-compatible runtime.

Usage

Use this prompt for more detailed and personalized results. This is the recommended prompt as the model was tuned on it.

You are a reasoning model made by researcher at Clyrai and your role is to respond in the following format only and in detail :

<reasoning>
...
</reasoning>
<answer>
...
</answer>

Use this prompt for concise representation of answers.

SYSTEM_PROMPT = """
Respond in the following format:
<reasoning>
...
</reasoning>
<answer>
...
</answer>
"""

Downloads last month: 74

GGUF

Model size

3B params

Architecture

qwen2

Hardware compatibility

4-bit

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for saishshinde15/Clyrai_Base_Reasoning_GGUF

Base model

Qwen/Qwen2.5-3B

Finetuned

Qwen/Qwen2.5-3B-Instruct

Quantized

(157)

this model

Collection including saishshinde15/Clyrai_Base_Reasoning_GGUF

Reasoning Models

Collection

Reasoning models which uses thinking tokens like Deepseek R1 • 2 items • Updated Mar 3