---
license: apache-2.0
base_model:
- Qwen/Qwen2.5-3B
---

# 🧠 Model Card: **Qwen2.5-3b-kk-distilled**

## 🧬 Model Description

This is a distilled language model, fine-tuned on reasoning traces derived from the **QwQ-32B** model on **Knights and Knaves** logic puzzles. The base model is Qwen2.5-3B.

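
For readers unfamiliar with the puzzle family: in a Knights and Knaves puzzle, knights always tell the truth and knaves always lie, and the task is to work out who is which from their statements. The sketch below brute-forces one small puzzle. The puzzle, the `solve` helper, and the encoding of statements as functions are illustrative assumptions, not items from the training data or the paper's method.

```python
from itertools import product


def solve(names, statements):
    """Return every knight/knave assignment consistent with the statements.

    `statements` maps a speaker to a function that takes an assignment
    (name -> True for knight, False for knave) and returns whether the
    speaker's claim holds under that assignment.
    """
    solutions = []
    for values in product([True, False], repeat=len(names)):
        world = dict(zip(names, values))
        # A knight's claim must be true; a knave's claim must be false.
        if all(world[speaker] == claim(world)
               for speaker, claim in statements.items()):
            solutions.append(world)
    return solutions


# Hypothetical puzzle: A says "B is a knave"; B says "We are both knights".
puzzle = {
    "A": lambda w: not w["B"],
    "B": lambda w: w["A"] and w["B"],
}
solutions = solve(["A", "B"], puzzle)
print(solutions)  # [{'A': True, 'B': False}]: A is a knight, B is a knave
```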
---

## Associated Paper

**Title:** *Warm Up Before You Train: Unlocking General Reasoning in Resource-Constrained Settings*

**arXiv:** [https://arxiv.org/pdf/2505.13718](https://arxiv.org/pdf/2505.13718)

**Hugging Face Papers:** [https://huggingface.co/papers/2505.13718](https://huggingface.co/papers/2505.13718)

---

## Model Details

* **Base model:** Qwen2.5-3B
* **Training data:** [QwQ-Knights-and-Knaves-Traces](https://huggingface.co/datasets/safal312/knights_and_knaves_reasoning)

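
Given the details above, a minimal inference sketch with the `transformers` library might look like the following. The repository ID is not stated in this card, so the caller must supply it. The prompt wording in `build_prompt` and the generation settings are also assumptions. Because the base model is not an instruction-tuned chat model, the sketch uses a plain text prompt rather than a chat template.

```python
def build_prompt(puzzle: str) -> str:
    """Wrap a puzzle in a plain-text instruction (wording is an assumption)."""
    return (
        "Solve the following Knights and Knaves puzzle. "
        "Think step by step.\n\n"
        f"{puzzle}\n"
    )


def generate(puzzle: str, model_id: str, max_new_tokens: int = 1024) -> str:
    """Load the model from the Hugging Face Hub and generate a response."""
    # Imported here so that build_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

    inputs = tokenizer(build_prompt(puzzle), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so that only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Example call (downloads the weights on first use; replace the placeholder
# with the actual repository ID of this model):
# print(generate('A says "B is a knave". B says "We are both knights".',
#                "<namespace>/Qwen2.5-3b-kk-distilled"))
```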
---