Uppaal / gpt-j-ProFS-toxicity
Tags: Text Generation, Transformers, Safetensors, English, gptj, generation, safety, model-editing, editing, activation-steering, activation-editing, dpo, rlhf, profs, detox, toxicity, iclr, iclr2025
arxiv: 2405.13967
arxiv: 2401.01967
License: mit
Files and versions
Branch: main
Repository: gpt-j-ProFS-toxicity (12.1 GB, 1 contributor)
History: 8 commits
Latest commit: Update README.md (6f02751, verified), by Uppaal, 16 days ago
File                              Scan  Size       Xet  Last commit              Updated
.gitattributes                    Safe  1.58 kB    -    Upload ProFS Method.png  18 days ago
ProFS Method.png                  -     111 kB     xet  Upload ProFS Method.png  18 days ago
README.md                         -     7.08 kB    -    Update README.md         16 days ago
added_tokens.json                 Safe  4.33 kB    -    Upload tokenizer         19 days ago
config.json                       Safe  947 Bytes  -    Upload GPTJForCausalLM   19 days ago
generation_config.json            Safe  119 Bytes  -    Upload GPTJForCausalLM   19 days ago
merges.txt                        Safe  456 kB     -    Upload tokenizer         19 days ago
model-00001-of-00003.safetensors  Safe  4.98 GB    xet  Upload GPTJForCausalLM   19 days ago
model-00002-of-00003.safetensors  -     4.97 GB    xet  Upload GPTJForCausalLM   19 days ago
model-00003-of-00003.safetensors  Safe  2.16 GB    xet  Upload GPTJForCausalLM   19 days ago
model.safetensors.index.json      Safe  21.8 kB    -    Upload GPTJForCausalLM   19 days ago
special_tokens_map.json           Safe  470 Bytes  -    Upload tokenizer         19 days ago
tokenizer.json                    Safe  3.59 MB    -    Upload tokenizer         19 days ago
tokenizer_config.json             Safe  26.8 kB    -    Upload tokenizer         19 days ago
vocab.json                        Safe  798 kB     -    Upload tokenizer         19 days ago