Qwen3-14B-AI-Expert-250925 - A Fine-tuned Model for AI Core Technologies

🤗 Hugging Face   |   🤖 ModelScope

This model is a specialized expert on core Artificial Intelligence concepts, developed by performing instruction-based supervised fine-tuning (SFT) on the Qwen/Qwen3-14B model.

The fine-tuning was conducted using Low-Rank Adaptation (LoRA), a parameter-efficient technique, on a custom-built dataset. This process adapted the model to provide high-quality, detailed responses specifically within the domains of:

  • Large Language Models (LLMs)
  • Retrieval-Augmented Generation (RAG)
  • AI Agents

The model was fine-tuned with LLaMA-Factory; a sketch of a comparable LoRA configuration follows the details below.

  • Developed by: real-jiakai
  • License: apache-2.0
  • Finetuned from model: Qwen/Qwen3-14B
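The exact LoRA hyperparameters were not published. The following is a minimal sketch of what a comparable setup might look like with Hugging Face peft; the rank, alpha, dropout, and target modules are illustrative assumptions, not the actual training configuration.

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-14B", torch_dtype="auto")

lora_config = LoraConfig(
    r=16,               # assumed rank of the low-rank update matrices
    lora_alpha=32,      # assumed scaling factor
    lora_dropout=0.05,  # assumed dropout on the adapter layers
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the small adapter matrices are trainable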

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "GXMZU/Qwen3-14B-ai-expert-250925"

# Load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# Prepare the model input
prompt = "What is the MCP protocol?"
messages = [
    {"role": "system", "content": "You are an AI expert assistant (Focus on LLM, RAG, and Agent Domain) to help with technical questions. You should provide clear, accurate, and helpful responses."},
    {"role": "user", "content": prompt}
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False  # Switches between thinking and non-thinking modes. Default is True.
)

model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Conduct text completion
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)

output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()

# Parse thinking content (if enabled)
try:
    # rindex finding 151668 (</think>)
    index = len(output_ids) - output_ids[::-1].index(151668)
except ValueError:
    index = 0

thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")

if thinking_content:
    print("Thinking content:", thinking_content)
print("Response:", content)

Performance

The primary objective of this fine-tuning is to adapt the model to a specialized domain, enhancing its performance on specific tasks by injecting relevant knowledge and terminology while preserving its foundational generalist capabilities.

Before Fine-tuning vs After Fine-tuning

Qualitatively, the model produces noticeably more detailed and accurate answers on domain-specific questions about LLMs, RAG, and AI Agents.

Note: In the absence of a dedicated test dataset, we cannot yet quantify the model's performance before and after fine-tuning. We plan to construct such a dataset for a more rigorous evaluation.

NLG Evaluation

The following table shows the model's performance on standard benchmarks before and after fine-tuning:

| Benchmark | Metric          | Qwen/Qwen3-14B (Base) | Qwen3-14B-ai-expert-250925 (Fine-tuned) |
|-----------|-----------------|-----------------------|------------------------------------------|
| MMLU      | Average         | 35.29                 | 31.68                                    |
|           | STEM            | 35.49                 | 34.39                                    |
|           | Social Sciences | 38.64                 | 30.52                                    |
|           | Humanities      | 31.90                 | 29.08                                    |
|           | Other           | 36.86                 | 34.05                                    |
| CEval     | Average         | 33.21                 | 39.00                                    |
|           | STEM            | 34.42                 | 40.47                                    |
|           | Social Sciences | 32.36                 | 40.36                                    |
|           | Humanities      | 29.96                 | 33.85                                    |
|           | Other           | 34.64                 | 39.84                                    |
| CMMLU     | Average         | 32.35                 | 33.61                                    |
|           | STEM            | 31.61                 | 36.67                                    |
|           | Social Sciences | 34.15                 | 32.45                                    |
|           | Humanities      | 31.06                 | 32.62                                    |
|           | Other           | 31.86                 | 33.26                                    |

The fine-tuned model improves on the Chinese-language benchmarks (CEval and CMMLU) while showing a modest regression on MMLU, indicating that the specialization largely preserves the model's general capabilities.

Fine-tuning Procedure

Dataset

The model was fine-tuned on a custom, high-quality dataset of 9,735 items in the Alpaca format (an illustrative item is shown after the list below). The dataset was carefully curated to cover three core areas:

  • Large Language Models (LLM)
  • Retrieval-Augmented Generation (RAG)
  • AI Agents
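Each item follows the standard Alpaca schema of instruction, input, and output fields. The example below is hypothetical, written only to illustrate the format; it is not taken from the actual dataset.

# Hypothetical example of one Alpaca-format training item (not from the dataset).
alpaca_item = {
    "instruction": "Explain the difference between RAG and fine-tuning "
                   "for injecting domain knowledge.",
    "input": "",  # optional context; left empty for instruction-only items
    "output": "Retrieval-Augmented Generation (RAG) retrieves relevant documents "
              "at inference time and conditions the answer on them, while "
              "fine-tuning bakes the knowledge into the model's weights..."
}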

Training Loss

You can view the full training run on Weights & Biases.

Future Plans

  • Enhanced Evaluation Framework: Implement more flexible evaluation metrics, including LLM-as-a-Judge methodologies (prerequisite: building comprehensive test datasets for rigorous assessment).
  • Dataset Expansion: Enrich the instruction fine-tuning dataset with new data, maintaining the quality of the existing data and emphasizing both quality and quantity.
  • Data Quality Enhancement: Refine the existing instruction-tuning dataset by correcting and standardizing its phrasing and formatting.

Citation

If you use this model in your work, please cite it as:

@misc{Qwen3-14B-AI-Expert-250925,
  author={real-jiakai},
  title={Qwen3-14B-AI-Expert-250925},
  year={2025},
  url={https://huggingface.co/GXMZU/Qwen3-14B-ai-expert-250925},
  publisher={Hugging Face}
}

@misc{qwen3technicalreport,
  title={Qwen3 Technical Report}, 
  author={Qwen Team},
  year={2025},
  eprint={2505.09388},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2505.09388}
}

@inproceedings{zheng2024llamafactory,
  title={LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models},
  author={Yaowei Zheng and Richong Zhang and Junhao Zhang and Yanhan Ye and Zheyan Luo and Zhangchi Feng and Yongqiang Ma},
  booktitle={Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)},
  address={Bangkok, Thailand},
  publisher={Association for Computational Linguistics},
  year={2024},
  url={http://arxiv.org/abs/2403.13372}
}