---
library_name: transformers
license: apache-2.0
datasets:
- HuggingFaceH4/ultrafeedback_binarized
language:
- en
base_model:
- Qwen/Qwen2-0.5B-Instruct
pipeline_tag: text-generation
---

# Model Card for Qwen2-0.5B-DPO

## Model Details

- **Base Model:** Qwen/Qwen2-0.5B-Instruct
- **Fine-tuning Method:** Direct Preference Optimization (DPO)
- **Framework:** Unsloth
- **Quantization:** 4-bit QLoRA (during training)

## Uses

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "VinitT/Qwen2-0.5B-DPO",
    dtype = None,          # auto-detect the best dtype for the GPU
    load_in_4bit = False,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's fast inference mode

messages = [{"role": "user", "content": "Hello, how can I develop a habit of drawing daily?"}]
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
)
inputs = {k: v.to(model.device) for k, v in inputs.items()}

# Generate a response
outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)

# Decode only the newly generated tokens (not the prompt)
prompt_len = inputs["input_ids"].shape[-1]
response = tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True)
print(response.strip())
```
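To illustrate the Direct Preference Optimization objective named under Model Details, here is a minimal, self-contained sketch of the per-pair DPO loss. This is not the training code used for this model (that ran through Unsloth with 4-bit QLoRA); the function name and the default `beta=0.1` are illustrative assumptions. The inputs are summed token log-probabilities of the chosen and rejected responses under the policy and the frozen reference model.

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for one preference pair: -log sigmoid(beta * margin),
    where the margin compares policy-vs-reference log-ratios of the
    chosen and rejected responses."""
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(x)) == log(1 + exp(-x)), computed stably with log1p
    return math.log1p(math.exp(-logits))
```

When the policy prefers the chosen response more strongly than the reference does, the margin grows and the loss falls toward zero; with no preference margin the loss sits at `log 2`.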