---
base_model:
- huihui-ai/Huihui-gpt-oss-20b-BF16-abliterated
license: apache-2.0
pipeline_tag: text-generation
library_name: transformers
tags:
- vllm
- generated_from_trainer
- trl
- sft
- abliterated
- uncensored
---

# huihui-ai/Huihui-gpt-oss-20b-BF16-abliterated-v2

This model is a fine-tuned version of [huihui-ai/Huihui-gpt-oss-20b-BF16-abliterated](https://huggingface.co/huihui-ai/Huihui-gpt-oss-20b-BF16-abliterated). It has been trained using [TRL](https://github.com/huggingface/trl).

Please refer to [Quantization-Aware Training (QAT)](https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/76e8ce21bf9ce4e0510fea96c998aaee7cfeaf7c/examples/gpt-oss/README.md) for fine-tuning and quantization ([huihui-ai/Huihui-gpt-oss-20b-mxfp4-abliterated-v2](https://huggingface.co/huihui-ai/Huihui-gpt-oss-20b-mxfp4-abliterated-v2)).

## Dataset

The dataset of harmful instructions was generated with [huihui-ai/Huihui-gpt-oss-20b-BF16-abliterated](https://huggingface.co/huihui-ai/Huihui-gpt-oss-20b-BF16-abliterated).

**Advantages**: All core metrics (loss, accuracy, entropy) improve in step, and the gap between eval and train stays small (<0.01), indicating strong generalization. Fine-tuning shows an effect within just 400 steps, so training is efficient.

**Potential issues**: The rise in gradient norm in the later stages may be caused by the lack of learning-rate decay or by batch noise; check the logs for signs of gradient explosion.

**Training metrics**

![training metrics](training_metrics_plot.png)

## Ollama

Ollama requires the latest version: [v0.11.8](https://github.com/ollama/ollama/releases/tag/v0.11.8).

You can use [huihui_ai/gpt-oss-abliterated:20b-v2-q4_K_M](https://ollama.com/huihui_ai/gpt-oss-abliterated:20b-v2-q4_K_M) directly:

```
ollama run huihui_ai/gpt-oss-abliterated:20b-v2-q4_K_M
```

## GGUF

[llama.cpp-b6115](https://github.com/ggml-org/llama.cpp/releases/tag/b6115) supports converting this model to GGUF format, and the result can be tested with llama-cli. The [GGUF](https://huggingface.co/huihui-ai/Huihui-gpt-oss-20b-BF16-abliterated-v2/tree/main/GGUF) file has been uploaded.

```
llama-cli -m huihui-ai/Huihui-gpt-oss-20b-BF16-abliterated-v2/GGUF/Huihui-gpt-oss-20b-BF16-abliterated-v2-Q4_K_M.gguf
```

## Quick start

```python
from transformers import pipeline

question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
generator = pipeline("text-generation", model="huihui-ai/Huihui-gpt-oss-20b-BF16-abliterated-v2", device="cuda")
output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
print(output["generated_text"])
```

## Training procedure

This model was trained with SFT; a minimal sketch of a comparable setup is shown after the framework versions below.

### Framework versions

- TRL: 0.23.0
- Transformers: 4.57.0.dev0
- Pytorch: 2.8.0+cu128
- Datasets: 4.0.0
- Tokenizers: 0.22.0
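The exact training script is not included in this card, so the snippet below is only a minimal sketch of a comparable TRL SFT run. The dataset file name and every hyperparameter are illustrative placeholders, not the configuration actually used for this model.

```python
# Minimal TRL SFT sketch (assumption: dataset path and hyperparameters are
# placeholders, not the actual configuration used to train this model).
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Hypothetical JSONL file of chat-formatted examples generated with the base model.
dataset = load_dataset("json", data_files="harmful_instructions.jsonl", split="train")

config = SFTConfig(
    output_dir="Huihui-gpt-oss-20b-BF16-abliterated-v2",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=2e-5,   # placeholder; the card does not state the learning rate
    max_steps=400,        # the training metrics above show gains within ~400 steps
    logging_steps=10,
    bf16=True,
)

trainer = SFTTrainer(
    model="huihui-ai/Huihui-gpt-oss-20b-BF16-abliterated",  # base model from this card
    args=config,
    train_dataset=dataset,
)
trainer.train()
```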
## Citations

Cite TRL as:

```bibtex
@misc{vonwerra2022trl,
    title        = {{TRL: Transformer Reinforcement Learning}},
    author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
    year         = 2020,
    journal      = {GitHub repository},
    publisher    = {GitHub},
    howpublished = {\url{https://github.com/huggingface/trl}}
}
```

### Usage Warnings

- **Risk of Sensitive or Controversial Outputs**: This model's safety filtering has been significantly reduced, and it may generate sensitive, controversial, or inappropriate content. Users should exercise caution and rigorously review generated outputs.
- **Not Suitable for All Audiences**: Due to limited content filtering, the model's outputs may be inappropriate for public settings, underage users, or applications requiring high security.
- **Legal and Ethical Responsibilities**: Users must ensure their usage complies with local laws and ethical standards. Generated content may carry legal or ethical risks, and users are solely responsible for any consequences.
- **Research and Experimental Use**: This model is recommended for research, testing, or controlled environments; avoid direct use in production or public-facing commercial applications.
- **Monitoring and Review Recommendations**: Users are strongly advised to monitor model outputs in real time and conduct manual reviews when necessary to prevent the dissemination of inappropriate content.
- **No Default Safety Guarantees**: Unlike standard models, this model has not undergone rigorous safety optimization. huihui.ai bears no responsibility for any consequences arising from its use.

### Donation

If you like it, please click "like" and follow us for more updates. You can follow [x.com/support_huihui](https://x.com/support_huihui) to get the latest model information from huihui.ai.

##### Your donation helps us continue our further development and improvement; a cup of coffee can do it.

- bitcoin (BTC):

```
bc1qqnkhuchxw0zqjh2ku3lu4hq45hc6gy84uk70ge
```

- Support our work on [Ko-fi](https://ko-fi.com/huihuiai)!