Qwen3-4b-tcomanr-merge-v2.5

Ties merged COde MAth aNd general Reasoning model

Description

Qwen3-4b-tcomanr-merge-v2.5 is an upgraded revision of [Qwen3-4b-tcomanr-merge-v2.3]. It combines the reasoning, code, and mathematics capabilities of Qwen3-4B-Thinking-2507 with several other Qwen3 4B fine-tuned models. The result is a robust, versatile model well suited to use cases such as text generation, coding, math, and multi-turn conversation.

Key Features

  • New Chat Template: Enhanced prompt features with special commands:
    • /think: Enables an in-depth chain-of-thought reasoning mode.
    • /shortthink: Enables a brief, step-by-step reasoning mode.
    • /nothink: Provides direct answers with no intermediate reasoning.

Note: These commands only work on platforms that support Jinja chat templating.
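As a sketch of how these commands might be wired into a request, the helper below appends the chosen command to the user turn. The `build_messages` function and the exact placement of the command (suffix vs. prefix) are assumptions for illustration; the chat template shipped with the model defines the authoritative behavior.

```python
# Hypothetical helper: select a reasoning mode by appending a command
# (/think, /shortthink, or /nothink) to the user message. The placement
# is an assumption; consult the model's chat template for specifics.

def build_messages(prompt, mode=None):
    """Return a chat message list, optionally tagged with a reasoning command."""
    valid = {"/think", "/shortthink", "/nothink"}
    if mode is not None and mode not in valid:
        raise ValueError(f"unknown reasoning mode: {mode}")
    content = f"{prompt} {mode}" if mode else prompt
    return [{"role": "user", "content": content}]

messages = build_messages("Solve 12 * 17 step by step.", mode="/shortthink")
print(messages[0]["content"])  # Solve 12 * 17 step by step. /shortthink
```

The resulting `messages` list can then be passed to `tokenizer.apply_chat_template` exactly as in the Transformers example below.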

How to Use

Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ertghiu256/Qwen3-4b-tcomanr-merge-v2.5"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

prompt = "Give me a short introduction to large language models."
messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
generated_ids = model.generate(**model_inputs, max_new_tokens=32768)

output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()

# Split the output at the last </think> token (id 151668) to separate
# the reasoning trace from the final answer.
try:
    index = len(output_ids) - output_ids[::-1].index(151668)
except ValueError:
    # No </think> token found; treat the entire output as final content.
    index = 0
thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")

print("thinking content:", thinking_content)
print("content:", content)

vLLM

vllm serve ertghiu256/Qwen3-4b-tcomanr-merge-v2.5 --enable-reasoning --reasoning-parser deepseek_r1

SGLang

python -m sglang.launch_server --model-path ertghiu256/Qwen3-4b-tcomanr-merge-v2.5 --reasoning-parser deepseek-r1

llama.cpp

llama-server --hf-repo ertghiu256/Qwen3-4b-tcomanr-merge-v2.5
llama-cli --hf-repo ertghiu256/Qwen3-4b-tcomanr-merge-v2.5

Ollama

# For Q8_0
ollama run ertghiu256/Qwen3-4b-tcomanr-merge-v2.5:Q8_0
# For Q5_K_M
ollama run ertghiu256/Qwen3-4b-tcomanr-merge-v2.5:Q5_K_M
# For IQ4_NL
ollama run ertghiu256/Qwen3-4b-tcomanr-merge-v2.5:IQ4_NL

LM Studio

Search for ertghiu256/Qwen3-4b-tcomanr-merge-v2.5 in the LM Studio model search and download directly.


Recommended Parameters

  • Reasoning: temp: 0.6, num_ctx: ≥8192, top_p: 0.95, top_k: 20
  • Non-reasoning: temp: 0.7, num_ctx: ≥2048, top_p: 0.85, top_k: 20, repeat_penalty: 1.0–1.1
  • System prompt:
    You are 'Tcomanr 2.5', an AI model made by Ertghiu256 using the base model of Qwen3 4b 2507 made by Alibaba. You are a helpful, playful, and neutral AI chatbot. Use markdown formatting and emojis to make your responses less plain.
    
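The recommended parameters above can be translated into `generate()` keyword arguments, as sketched below. The names follow Hugging Face Transformers `GenerationConfig` conventions; `num_ctx` is an Ollama-side context-window setting with no direct `generate()` equivalent, and the `repetition_penalty` value of 1.05 is an assumed midpoint of the recommended 1.0–1.1 range.

```python
# Sketch: mapping the recommended sampling settings onto Transformers
# generate() kwargs. Values come from the model card; repetition_penalty
# 1.05 is an assumed midpoint of the suggested 1.0-1.1 range.

REASONING = {"do_sample": True, "temperature": 0.6, "top_p": 0.95, "top_k": 20}
NON_REASONING = {"do_sample": True, "temperature": 0.7, "top_p": 0.85,
                 "top_k": 20, "repetition_penalty": 1.05}

def sampling_kwargs(reasoning):
    """Return a copy of the recommended sampling settings for the chosen mode."""
    return dict(REASONING if reasoning else NON_REASONING)

# Usage with the Transformers example above:
# model.generate(**model_inputs, max_new_tokens=32768, **sampling_kwargs(True))
```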

Merge Details

This model was created using the TIES method ([arXiv:2306.01708]) with Qwen3-4B-Thinking-2507 as the base. The following models are included in the merge (see the model card for detailed settings and URLs):

  • ertghiu256/qwen3-4b-code-reasoning
  • Goekdeniz-Guelmez/Josiefied-Qwen3-4B-Instruct-2507-gabliterated-v1
  • ertghiu256/qwen3-math-reasoner
  • ertghiu256/Qwen3-Hermes-4b
  • ertghiu256/Qwen3-4B-Thinking-2507-Hermes-3
  • ValiantLabs/Qwen3-4B-Esper3
  • Tesslate/WEBGEN-4B-Preview
  • ValiantLabs/Qwen3-4B-ShiningValiant3
  • GetSoloTech/Qwen3-Code-Reasoning-4B
  • Qwen/Qwen3-4b-Instruct-2507
  • POLARIS-Project/Polaris-4B-Preview
  • janhq/Jan-v1-4B
  • huihui-ai/Huihui-Qwen3-4B-Thinking-2507-abliterated
  • ertghiu256/qwen-3-4b-mixture-of-thought
  • ertghiu256/qwen3-multi-reasoner
  • quelmap/Lightning-4b

YAML configuration is available on the model card for reference.
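To illustrate what TIES merging does, the sketch below implements its three steps (trim, elect sign, disjoint merge) over flat parameter lists, following the procedure described in arXiv:2306.01708. This is a toy illustration under simplifying assumptions: real merges operate per-tensor with tooling such as mergekit, and the `density` value here is arbitrary.

```python
# Toy TIES-merging sketch: trim task vectors, elect a per-coordinate sign,
# then average only the sign-agreeing contributions into the base weights.
# Illustrative only; actual merges use per-tensor tooling (e.g. mergekit).

def ties_merge(base, finetuned, density=0.5):
    """Merge task vectors (finetuned - base) into the base parameters."""
    # 1. Trim: keep only the top-`density` fraction of each task vector's
    #    entries by magnitude, zeroing the rest.
    task_vectors = []
    for params in finetuned:
        tv = [p - b for p, b in zip(params, base)]
        k = max(1, int(len(tv) * density))
        threshold = sorted((abs(x) for x in tv), reverse=True)[k - 1]
        task_vectors.append([x if abs(x) >= threshold else 0.0 for x in tv])

    merged = list(base)
    for i in range(len(base)):
        # 2. Elect sign: the dominant sign is the one with the larger
        #    total magnitude across models at this coordinate.
        pos = sum(tv[i] for tv in task_vectors if tv[i] > 0)
        neg = -sum(tv[i] for tv in task_vectors if tv[i] < 0)
        sign = 1.0 if pos >= neg else -1.0
        # 3. Disjoint merge: average only entries agreeing with that sign.
        agreeing = [tv[i] for tv in task_vectors if tv[i] * sign > 0]
        if agreeing:
            merged[i] += sum(agreeing) / len(agreeing)
    return merged

base = [0.0, 1.0, -1.0]
models = [[0.2, 1.0, -1.4], [0.4, 0.9, -0.6]]
merged = ties_merge(base, models, density=1.0)
```

The sign election is what distinguishes TIES from plain weight averaging: conflicting updates at a coordinate no longer cancel each other out, only the minority-sign contributions are dropped.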

Credits

Special thanks to the Qwen team for providing the Qwen3 4B and Qwen3 4B 2507 base models, and to all contributors of the fine-tuned models merged into this release.


Model Size: 4.02B params (F16)
Language: English
Tasks: Text Generation, Reasoning, Code, Math, Conversational
