---
license: apache-2.0
datasets:
- GetSoloTech/Code-Reasoning
base_model:
- Qwen/Qwen3-4B-Thinking-2507
pipeline_tag: text-generation
library_name: transformers
tags:
- code-generation
- competitive-programming
- code-reasoning
- programming
- algorithms
- problem-solving
- python
---

# GetSoloTech/Qwen3-Code-Reasoning-4B

A finetuned version of Qwen3-4B-Thinking-2507 specifically optimized for competitive programming and code reasoning tasks. This model has been trained on the high-quality [Code-Reasoning](https://huggingface.co/datasets/GetSoloTech/Code-Reasoning) dataset to enhance its capabilities in solving complex programming problems with detailed reasoning.

## Model Overview

This model is a **LoRA-finetuned** version of [Qwen3-4B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507) with the following specifications:

- **Base Model**: Qwen3-4B-Thinking-2507 (4.0B parameters)
- **Training Method**: LoRA (Low-Rank Adaptation)
- **Training Dataset**: GetSoloTech/Code-Reasoning
- **Training Framework**: Unsloth with QLoRA
- **Context Length**: 4096 tokens (configurable up to 262,144)
- **Model Type**: Causal Language Model with Thinking Capabilities

## Key Features

- **Enhanced Code Reasoning**: Specifically trained on competitive programming problems
- **Thinking Capabilities**: Inherits the advanced reasoning capabilities from the base model
- **High-Quality Solutions**: Trained on solutions with ≥50% test case pass rates
- **Structured Output**: Optimized for generating well-reasoned programming solutions
- **Efficient Training**: Uses LoRA adapters for efficient parameter updates

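To make the efficiency point concrete, the sketch below compares the trainable parameters of one full weight matrix with those of a LoRA adapter on the same matrix. The matrix shape and rank are illustrative assumptions only; the values actually used to train this model are not stated in this card.

```python
def full_trainable_params(d_out: int, d_in: int) -> int:
    """Parameters updated when the whole weight matrix W (d_out x d_in) is trained."""
    return d_out * d_in


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters updated by LoRA, which learns W + B @ A with
    B of shape (d_out, rank) and A of shape (rank, d_in)."""
    return rank * (d_out + d_in)


# Hypothetical shape and rank, chosen only for illustration.
d_out, d_in, rank = 2560, 2560, 16

full = full_trainable_params(d_out, d_in)
lora = lora_trainable_params(d_out, d_in, rank)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")
```

Under these assumed numbers the adapter trains 81,920 parameters instead of 6,553,600 for that matrix, which is where the memory savings of LoRA come from.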
### Dataset Statistics

- **Split**: Python
- **Source**: High-quality competitive programming problems from TACO, APPS, CodeContests, and Codeforces
- **Quality Filter**: Only correctly solved problems with ≥50% test case pass rates

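The quality filter can be pictured as a simple threshold on each sample's test pass rate. This is a minimal sketch, not the dataset's actual preprocessing code; the record layout and the `pass_rate` field name are hypothetical.

```python
from typing import Dict, Iterable, List

MIN_PASS_RATE = 0.5  # the >=50% threshold described above


def filter_by_pass_rate(
    samples: Iterable[Dict], min_pass_rate: float = MIN_PASS_RATE
) -> List[Dict]:
    """Keep samples whose solution passed at least `min_pass_rate` of its tests.

    Samples without a `pass_rate` field (a hypothetical name) are dropped.
    """
    return [s for s in samples if s.get("pass_rate", 0.0) >= min_pass_rate]


# Toy records for illustration only.
samples = [
    {"id": "a", "pass_rate": 1.0},
    {"id": "b", "pass_rate": 0.5},
    {"id": "c", "pass_rate": 0.25},
]
kept = filter_by_pass_rate(samples)
print([s["id"] for s in kept])
```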
## Usage

### Basic Inference

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "GetSoloTech/Qwen3-Code-Reasoning-4B"

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",  # use the dtype stored in the checkpoint
    device_map="auto",   # place weights on available devices (requires accelerate)
)

# Prepare the input for a competitive programming problem
messages = [
    {"role": "system", "content": "You are an expert competitive programmer. Read the problem and produce a correct, efficient solution. Include reasoning if helpful."},
    {"role": "user", "content": "Your programming problem here..."},
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate a solution; do_sample=True ensures the sampling parameters take effect
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=4096,
    do_sample=True,
    temperature=0.7,
    top_p=0.8,
    top_k=20,
)

# Keep only the newly generated tokens (drop the prompt) and decode them
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
content = tokenizer.decode(output_ids, skip_special_tokens=True).strip("\n")
print(content)
```

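Because this is a thinking model, the decoded `content` usually mixes reasoning with the final answer. The sketch below separates the two and pulls out the last code block. It assumes the model keeps the base model's convention of ending its reasoning with a `</think>` tag and of wrapping code in Markdown fences; neither is guaranteed by this card. The helper names are hypothetical.

```python
import re
from typing import Optional, Tuple

THINK_END = "</think>"  # assumed end-of-reasoning marker, inherited from the base model
FENCE = "`" * 3         # a Markdown code fence, built here to avoid nesting fences

# Matches a fenced block with an optional "python"/"py" language tag.
CODE_BLOCK = re.compile(
    re.escape(FENCE) + r"(?:python|py)?[ \t]*\n(.*?)" + re.escape(FENCE),
    re.DOTALL,
)


def split_reasoning(content: str) -> Tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no marker is found."""
    reasoning, sep, answer = content.rpartition(THINK_END)
    if not sep:
        return "", content.strip()
    return reasoning.replace("<think>", "").strip(), answer.strip()


def extract_code(answer: str) -> Optional[str]:
    """Return the last fenced code block in the answer, or None if there is none."""
    blocks = CODE_BLOCK.findall(answer)
    return blocks[-1].strip() if blocks else None


# Toy model output for illustration only.
sample = (
    "Read two integers and add them.\n" + THINK_END + "\n\n"
    "Here is the solution:\n"
    + FENCE + "python\nprint(sum(map(int, input().split())))\n" + FENCE
)
reasoning, answer = split_reasoning(sample)
print(extract_code(answer))
```

With real model output, pass the `content` string from the example above to `split_reasoning`.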
## Performance Expectations

This finetuned model is expected to show improved performance on:

- **Competitive Programming Problems**: Better understanding of problem constraints and requirements
- **Code Generation**: More accurate and efficient solutions
- **Reasoning Quality**: Enhanced step-by-step reasoning for complex problems
- **Solution Completeness**: More comprehensive solutions with proper edge case handling

## Recommended Settings

### For Code Generation

- **Temperature**: 0.7
- **Top-p**: 0.8
- **Top-k**: 20
- **Max New Tokens**: 4096 (adjust based on problem complexity)

### For Reasoning Tasks

- **Temperature**: 0.6
- **Top-p**: 0.95
- **Top-k**: 20
- **Max New Tokens**: 81920 (for complex reasoning)

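One way to keep the two recommended configurations in one place is a small preset table that produces keyword arguments for `model.generate`. This is a sketch: the preset names and the `generation_kwargs` helper are hypothetical, and `do_sample=True` is added on the assumption that sampling is wanted whenever a temperature is set.

```python
from typing import Any, Dict

# Values copied from the recommended settings above.
PRESETS: Dict[str, Dict[str, Any]] = {
    "code": {"temperature": 0.7, "top_p": 0.8, "top_k": 20, "max_new_tokens": 4096},
    "reasoning": {"temperature": 0.6, "top_p": 0.95, "top_k": 20, "max_new_tokens": 81920},
}


def generation_kwargs(task: str, **overrides: Any) -> Dict[str, Any]:
    """Build generate() keyword arguments for a preset, with optional overrides."""
    if task not in PRESETS:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(PRESETS)}")
    kwargs: Dict[str, Any] = {"do_sample": True, **PRESETS[task]}
    kwargs.update(overrides)
    return kwargs


print(generation_kwargs("code"))
print(generation_kwargs("reasoning", max_new_tokens=16384))
```

With the objects from the Basic Inference example, a call would look like `model.generate(**model_inputs, **generation_kwargs("reasoning"))`.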
## Related Resources

- **Base Model**: [Qwen3-4B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507)
- **Training Dataset**: [Code-Reasoning](https://huggingface.co/datasets/GetSoloTech/Code-Reasoning)
- **Training Framework**: [Unsloth](https://github.com/unslothai/unsloth)
- **Original Dataset**: [OpenCodeReasoning-2](https://huggingface.co/datasets/nvidia/OpenCodeReasoning-2)

## Contributing

This model was created using the Unsloth framework and the Code-Reasoning dataset. For questions about a specific component, see:

- The base model: [Qwen3 GitHub](https://github.com/QwenLM/Qwen3)
- The training dataset: [Code-Reasoning Repository](https://huggingface.co/datasets/GetSoloTech/Code-Reasoning)
- The training framework: [Unsloth Documentation](https://docs.unsloth.ai/)

## License

This model follows the same license as the base model (Apache 2.0). Please refer to the [base model license](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507/blob/main/LICENSE) for details.

## Acknowledgments

- **Qwen Team** for the excellent base model
- **Unsloth Team** for the efficient training framework
- **NVIDIA Research** for the original OpenCodeReasoning-2 dataset

## Contact

For questions about this finetuned model, please open an issue in the repository.

---

**Note**: This model is specifically optimized for competitive programming and code reasoning tasks.