metadata
license: apache-2.0
base_model:
- Qwen/Qwen2.5-3B
Model Card: Qwen2.5-3b-kk-distilled
Model Description
This is a distilled language model, fine-tuned on reasoning traces that the QwQ-32B model generated for Knights and Knaves logic puzzles. The base model is Qwen2.5-3B.
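For readers unfamiliar with the puzzle family, the sketch below shows what a Knights and Knaves instance looks like and how a proposed answer can be checked by brute force. Knights always tell the truth and knaves always lie. The `solve` helper and the two-person puzzle are illustrative assumptions and are not taken from the training data or the paper.

```python
from itertools import product


def solve(names, statements):
    """Return every knight/knave assignment consistent with the statements.

    `statements` maps a speaker's name to a function that takes a candidate
    assignment (name -> True if knight) and returns whether the claim holds.
    """
    solutions = []
    for values in product([True, False], repeat=len(names)):
        world = dict(zip(names, values))
        # A knight's claim must be true and a knave's claim must be false.
        if all(world[speaker] == claim(world) for speaker, claim in statements.items()):
            solutions.append(world)
    return solutions


# Illustrative puzzle (hypothetical, not from the dataset):
# A says "B is a knave"; B says "A and I are both knights".
puzzle = {
    "A": lambda w: not w["B"],
    "B": lambda w: w["A"] and w["B"],
}

print(solve(["A", "B"], puzzle))  # [{'A': True, 'B': False}]
```

A checker like this could be used to score a model's final answer independently of its reasoning trace.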
Associated Paper
Title: Warm Up Before You Train: Unlocking General Reasoning in Resource-Constrained Settings
arXiv: https://arxiv.org/pdf/2505.13718
Hugging Face Papers: https://huggingface.co/papers/2505.13718
Model Details
- Base model: Qwen2.5-3B
- Training data: QwQ-Knights-and-Knaves-Traces
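A minimal sketch of how the model might be loaded and queried with the `transformers` library. The repository id in `MODEL_ID` is a placeholder, since this card does not state the full Hub path, and the prompt wording in `build_prompt` is an assumption because no required prompt format is documented here.

```python
# Placeholder repository id; replace it with the actual Hub path of this model.
MODEL_ID = "your-namespace/Qwen2.5-3b-kk-distilled"


def build_prompt(puzzle: str) -> str:
    # Assumed plain-text prompt; the card does not document a required format.
    return f"{puzzle.strip()}\nWho is a knight and who is a knave? Think step by step.\n"


def generate_answer(puzzle: str, max_new_tokens: int = 1024) -> str:
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(build_prompt(puzzle), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_answer("A says: 'B is a knave.' B says: 'A and I are both knights.'")` downloads the weights on first use, so the placeholder id must be replaced before running it.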