vincentoh
/

guardreasoner-8b-4bit

Text Generation

content-moderation

4-bit precision

Model card Files Files and versions

guardreasoner-8b-4bit / README.md

vincentoh's picture

Upload README.md with huggingface_hub

d2b5178 verified 25 days ago

|

history blame contribute delete

2.42 kB

	---
	license: llama3
	base_model: yueliu1999/GuardReasoner-8B
	tags:
	- llama
	- safety
	- content-moderation
	- 4-bit
	- bitsandbytes
	- quantized
	library_name: transformers
	pipeline_tag: text-generation
	---

	# GuardReasoner-8B (4-bit Quantized)

	Pre-quantized 4-bit version of [yueliu1999/GuardReasoner-8B](https://huggingface.co/yueliu1999/GuardReasoner-8B) for efficient inference.

	## Model Description

	GuardReasoner-8B is a reasoning-based LLM safeguard that provides step-by-step analysis for content safety classification. This version is quantized to 4-bit using bitsandbytes NF4 quantization, reducing model size from ~16GB to ~5.4GB while maintaining performance.

	Paper: [GuardReasoner: Towards Reasoning-based LLM Safeguards](https://arxiv.org/abs/2501.18492)

	## Usage

	```python
	from transformers import AutoTokenizer, AutoModelForCausalLM

	model_id = "vincentoh/guardreasoner-8b-4bit"

	tokenizer = AutoTokenizer.from_pretrained(model_id)
	model = AutoModelForCausalLM.from_pretrained(
	model_id,
	device_map="auto",
	trust_remote_code=True,
	)

	# Build prompt
	text = "What is the capital of France?"
	prompt = f"""<\|begin_of_text\|><\|start_header_id\|>system<\|end_header_id\|>

	You are a content safety expert. Analyze the request and determine if it is safe or harmful.

	Think step-by-step:
	1. What is being requested?
	2. What are the potential harms?
	3. Does this violate safety policies?

	End your analysis with exactly: "Request: harmful" or "Request: unharmful".<\|eot_id\|><\|start_header_id\|>user<\|end_header_id\|>

	{text}<\|eot_id\|><\|start_header_id\|>assistant<\|end_header_id\|>

	"""

	inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
	outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
	response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
	print(response)
	```

	## Quantization Details

	- Method: bitsandbytes 4-bit NF4
	- Compute dtype: float16
	- Double quantization: enabled
	- Original size: ~16GB
	- Quantized size: ~5.4GB

	## Performance

	Expected ~84% F1 on safety benchmarks (same as original model).

	## License

	This model inherits the Llama 3 license from the base model.

	## Citation

	```bibtex
	@article{liu2025guardreasoner,
	title={GuardReasoner: Towards Reasoning-based LLM Safeguards},
	author={Liu, Yue and others},
	journal={arXiv preprint arXiv:2501.18492},
	year={2025}
	}
	```