---
base_model:
- huihui-ai/Huihui-gpt-oss-20b-BF16-abliterated
license: apache-2.0
pipeline_tag: text-generation
library_name: transformers
tags:
- vllm
- generated_from_trainer
- trl
- sft
- abliterated
- uncensored
---

# huihui-ai/Huihui-gpt-oss-20b-BF16-abliterated-v2

This model is a fine-tuned version of [huihui-ai/Huihui-gpt-oss-20b-BF16-abliterated](https://huggingface.co/huihui-ai/Huihui-gpt-oss-20b-BF16-abliterated). It has been trained using [TRL](https://github.com/huggingface/trl).

Please refer to [Quantization-Aware Training (QAT)](https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/76e8ce21bf9ce4e0510fea96c998aaee7cfeaf7c/examples/gpt-oss/README.md) for fine-tuning and quantization ([huihui-ai/Huihui-gpt-oss-20b-mxfp4-abliterated-v2](https://huggingface.co/huihui-ai/Huihui-gpt-oss-20b-mxfp4-abliterated-v2)).

## Dataset

The dataset of harmful instructions was generated with [huihui-ai/Huihui-gpt-oss-20b-BF16-abliterated](https://huggingface.co/huihui-ai/Huihui-gpt-oss-20b-BF16-abliterated).

**Advantages**: All core metrics (loss, accuracy, entropy) improve in step, and the gap between eval and train stays small (<0.01), indicating strong generalization. Fine-tuning shows an effect within just 400 steps, so training is efficient.

**Potential issues**: The rise in gradient norm in the later stages may be caused by the lack of learning-rate decay or by batch noise; check the logs for signs of gradient explosion.

**Training metrics**

![training metrics](training_metrics_plot.png)

## Ollama

Ollama requires the latest version: [v0.11.8](https://github.com/ollama/ollama/releases/tag/v0.11.8).

You can use [huihui_ai/gpt-oss-abliterated:20b-v2-q4_K_M](https://ollama.com/huihui_ai/gpt-oss-abliterated:20b-v2-q4_K_M) directly:

```
ollama run huihui_ai/gpt-oss-abliterated:20b-v2-q4_K_M
```

## GGUF

[llama.cpp-b6115](https://github.com/ggml-org/llama.cpp/releases/tag/b6115) supports converting this model to GGUF format, and the result can be tested with llama-cli. The [GGUF](https://huggingface.co/huihui-ai/Huihui-gpt-oss-20b-BF16-abliterated-v2/tree/main/GGUF) file has been uploaded.

```
llama-cli -m huihui-ai/Huihui-gpt-oss-20b-BF16-abliterated-v2/GGUF/Huihui-gpt-oss-20b-BF16-abliterated-v2-Q4_K_M.gguf
```

## Quick start

```python
from transformers import pipeline

question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
generator = pipeline("text-generation", model="huihui-ai/Huihui-gpt-oss-20b-BF16-abliterated-v2", device="cuda")
output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
print(output["generated_text"])
```

## Training procedure

This model was trained with SFT; a minimal sketch of a comparable setup is shown after the framework versions below.

### Framework versions

- TRL: 0.23.0
- Transformers: 4.57.0.dev0
- Pytorch: 2.8.0+cu128
- Datasets: 4.0.0
- Tokenizers: 0.22.0
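The exact training script is not included in this card, so the snippet below is only a minimal sketch of a comparable TRL SFT run. The dataset file name and every hyperparameter are illustrative placeholders, not the configuration actually used for this model.

```python
# Minimal TRL SFT sketch (assumption: dataset path and hyperparameters are
# placeholders, not the actual configuration used to train this model).
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Hypothetical JSONL file of chat-formatted examples generated with the base model.
dataset = load_dataset("json", data_files="harmful_instructions.jsonl", split="train")

config = SFTConfig(
    output_dir="Huihui-gpt-oss-20b-BF16-abliterated-v2",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=2e-5,   # placeholder; the card does not state the learning rate
    max_steps=400,        # the training metrics above show gains within ~400 steps
    logging_steps=10,
    bf16=True,
)

trainer = SFTTrainer(
    model="huihui-ai/Huihui-gpt-oss-20b-BF16-abliterated",  # base model from this card
    args=config,
    train_dataset=dataset,
)
trainer.train()
```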
## Citations

Cite TRL as:

```bibtex
@misc{vonwerra2022trl,
    title        = {{TRL: Transformer Reinforcement Learning}},
    author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
    year         = 2020,
    journal      = {GitHub repository},
    publisher    = {GitHub},
    howpublished = {\url{https://github.com/huggingface/trl}}
}
```

### Usage Warnings

- **Risk of Sensitive or Controversial Outputs**: This model's safety filtering has been significantly reduced, and it may generate sensitive, controversial, or inappropriate content. Users should exercise caution and rigorously review generated outputs.
- **Not Suitable for All Audiences**: Due to limited content filtering, the model's outputs may be inappropriate for public settings, underage users, or applications requiring high security.
- **Legal and Ethical Responsibilities**: Users must ensure their usage complies with local laws and ethical standards. Generated content may carry legal or ethical risks, and users are solely responsible for any consequences.
- **Research and Experimental Use**: This model is recommended for research, testing, or controlled environments; avoid direct use in production or public-facing commercial applications.
- **Monitoring and Review Recommendations**: Users are strongly advised to monitor model outputs in real time and conduct manual reviews when necessary to prevent the dissemination of inappropriate content.
- **No Default Safety Guarantees**: Unlike standard models, this model has not undergone rigorous safety optimization. huihui.ai bears no responsibility for any consequences arising from its use.

### Donation

If you like it, please click "like" and follow us for more updates. You can follow [x.com/support_huihui](https://x.com/support_huihui) to get the latest model information from huihui.ai.

##### Your donation helps us continue our further development and improvement; a cup of coffee can do it.

- bitcoin (BTC):

```
bc1qqnkhuchxw0zqjh2ku3lu4hq45hc6gy84uk70ge
```

- Support our work on [Ko-fi](https://ko-fi.com/huihuiai)!