license: apache-2.0
base_model:
  - Qwen/Qwen2.5-3B

🧠 Model Card: Qwen2.5-3b-kk-distilled

🧬 Model Description

This is a distilled language model: the base model, Qwen2.5-3B, was fine-tuned on reasoning traces generated by the QwQ-32B model on Knights and Knaves logic puzzles.
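For readers unfamiliar with the puzzle family: in a Knights and Knaves puzzle, every knight always tells the truth and every knave always lies, and the task is to work out who is which from their statements. The sketch below is only an illustration of that rule, not the data-generation or evaluation code used for this model. The `solve` helper and the two-person example puzzle are hypothetical and were written for this card.

```python
from itertools import product


def solve(names, statements):
    """Return every knight/knave assignment consistent with the statements.

    `statements` maps a speaker to a function that takes an assignment
    (name -> True if that person is a knight) and returns whether the
    speaker's claim is true under that assignment.
    """
    solutions = []
    for values in product([True, False], repeat=len(names)):
        world = dict(zip(names, values))
        # A knight's claim must be true; a knave's claim must be false.
        if all(world[speaker] == claim(world) for speaker, claim in statements.items()):
            solutions.append(world)
    return solutions


# Hypothetical puzzle: A says "B is a knave"; B says "A and I are both knights".
puzzle = {
    "A": lambda w: not w["B"],
    "B": lambda w: w["A"] and w["B"],
}
solutions = solve(["A", "B"], puzzle)
print(solutions)
```

Brute force over all assignments is enough here because a puzzle with n characters has only 2^n candidate worlds. The reasoning traces the model was trained on reach the answer step by step in natural language instead.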


📄 Associated Paper

Title: Warm Up Before You Train: Unlocking General Reasoning in Resource-Constrained Settings

arXiv: https://arxiv.org/pdf/2505.13718

Hugging Face Papers: https://huggingface.co/papers/2505.13718


📚 Model Details