---
license: apache-2.0
base_model: Qwen/Qwen3-4B-Instruct-2507
tags:
- text-generation
- bias-mitigation
- self-correction
- sherlock
- lora
- peft
library_name: transformers
---

# Qwen-4B-Instruct-2505-Self-correct

This is a **Sherlock-style debiasing model** trained with a self-correction approach for bias mitigation.

## Model Description

- **Base Model**: [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507)
- **Training Method**: LoRA (Low-Rank Adaptation) with QLoRA (4-bit quantization), adapters merged into the base weights after training
- **Task**: Bias mitigation and self-correction
- **Framework**: PyTorch + Transformers + PEFT

## Training Details

This model was trained using the Sherlock framework, which includes:

1. **Stage I (SFT)**: Supervised fine-tuning on bias-correction examples
2. **Stage II (Offline)**: Preference learning with DPO plus a self-correction loss

### Key Features

- Self-correction capability for biased reasoning
- Trajectory-level preference learning
- Dynamic β adaptation based on divergence points

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the merged model and its tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "fenffef/Qwen-4B-Instruct-2505-Self-correct",
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("fenffef/Qwen-4B-Instruct-2505-Self-correct")

# Build a chat prompt and generate
messages = [
    {"role": "user", "content": "Your prompt here"}
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, skipping the echoed prompt
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
```

## Training Configuration

The original training configuration file is not included in this repository; a hypothetical QLoRA setup consistent with the description above is sketched in the appendix at the end of this card.

## Citation

If you use this model, please cite:

```bibtex
@article{sherlock2024,
  title={Sherlock: Self-Correcting Framework for Bias Mitigation},
  author={Your Name},
  year={2024}
}
```

## License

This model is released under the Apache 2.0 license.
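
## Appendix: Hypothetical QLoRA Setup

Since the actual configuration file is not available, the following is a minimal sketch of a QLoRA setup matching the training description: the base model loaded in 4-bit, LoRA adapters attached via PEFT, and the adapters merged after training. All hyperparameter values (rank, alpha, dropout, target modules) are illustrative assumptions, not the values actually used.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit (QLoRA-style) quantization of the base model
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Instruct-2507",
    quantization_config=bnb_config,
    device_map="auto",
)

# Hypothetical LoRA settings; the actual rank/alpha/targets were not published
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()

# After Stage I (SFT) and Stage II (DPO + self-correction) training,
# the adapters would be merged back into the base weights, e.g.:
# merged = model.merge_and_unload()
```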