safal312
/

qwen2.5-3b-kk-distilled


	---
	license: apache-2.0
	base_model:
	- Qwen/Qwen2.5-3B
	---

	# 🧠 Model Card: Qwen2.5-3b-kk-distilled

	## 🧬 Model Description

	This is a distilled language model: Qwen2.5-3B fine-tuned on reasoning traces that were generated by the QwQ-32B model on Knights and Knaves logic puzzles.
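For readers unfamiliar with the puzzle family, here is a minimal sketch of what a Knights and Knaves instance looks like and how it can be checked by brute force. Knights always tell the truth and knaves always lie. The puzzle below and the `solve` helper are illustrative assumptions; they are not taken from the training data or from the paper.

```python
from itertools import product


def solve(names, statements):
    """Return every knight/knave assignment consistent with the statements.

    `statements` maps a speaker to a function that takes an assignment
    (dict of name -> True if knight) and returns whether the claim is true.
    """
    solutions = []
    for values in product([True, False], repeat=len(names)):
        world = dict(zip(names, values))
        # A knight's claim must be true; a knave's claim must be false.
        if all(world[speaker] == claim(world) for speaker, claim in statements.items()):
            solutions.append(world)
    return solutions


# Hypothetical puzzle: A says "B is a knave."  B says "A and I are both knights."
puzzle = {
    "A": lambda w: not w["B"],
    "B": lambda w: w["A"] and w["B"],
}

print(solve(["A", "B"], puzzle))  # [{'A': True, 'B': False}]
```

The only consistent assignment is that A is a knight and B is a knave. The reasoning traces used for distillation work through this kind of case analysis in natural language.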

	---

	## 📄 Associated Paper

	Title: Warm Up Before You Train: Unlocking General Reasoning in Resource-Constrained Settings

	arXiv: [https://arxiv.org/pdf/2505.13718](https://arxiv.org/pdf/2505.13718)

	Hugging Face Papers: [https://huggingface.co/papers/2505.13718](https://huggingface.co/papers/2505.13718)

	---

	## 📚 Model Details

	* Base model: Qwen2.5-3B
	* Training data: [QwQ-Knights-and-Knaves-Traces](https://huggingface.co/datasets/safal312/knights_and_knaves_reasoning)
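
A minimal sketch of loading the model for inference with the `transformers` library. It assumes `transformers` and `torch` are installed and that the repository id is `safal312/qwen2.5-3b-kk-distilled`, as shown in this page's header. This card does not document a prompt format, so the wording in `build_prompt` is a hypothetical placeholder and may differ from what was used in training.

```python
MODEL_ID = "safal312/qwen2.5-3b-kk-distilled"  # repo id as shown on this page


def build_prompt(puzzle: str) -> str:
    """Wrap a puzzle in a plain instruction (assumed wording, not the training format)."""
    return (
        "Solve the following Knights and Knaves puzzle. "
        "Think step by step.\n\n" + puzzle + "\n"
    )


def generate(puzzle: str, max_new_tokens: int = 512) -> str:
    """Download the model on first use and return its continuation for the puzzle."""
    # Imported lazily so build_prompt stays usable without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")

    inputs = tokenizer(build_prompt(puzzle), return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens and decode only the newly generated part.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Example call (downloads the weights, so it is left commented out):
# print(generate("A says: 'B is a knave.' B says: 'We are both knights.' Who is who?"))
```

Reasoning traces can be long, so `max_new_tokens` may need to be raised if the output is cut off before a final answer.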

	---