responsible-ai committed on
Commit 5c41ac7 · verified
1 Parent(s): 0070aed

Update README.md

Files changed (1)
  1. README.md +20 -4
README.md CHANGED
@@ -13,14 +13,16 @@ tags:
13
  <br>
14
 
15
  # News 📢
16
- - 📘 `2025/09/24`: Published a [Responsible AI Technical Report](https://huggingface.co/K-intelligence/Llama-SafetyGuard-Content-Binary/resolve/main/assets/RAI_Technical_Report.pdf)
17
- - ⚡️`2025/09/24`: Released SafetyGuard Model collection on Hugging Face🤗.
 
18
  <br>
19
  <br>
20
 
21
  # Overview
22
  ## Description
23
- **SafetyGuard :: Content Binary Guard** is a streaming-aware safety classifier built with Llama.
 
24
 
25
 
26
  ## What it does
@@ -134,7 +136,7 @@ def guard_stream(response_chunks, step_chars: int = 100, tau: float = 0.5):
134
  | | Sensitive Uses | Content providing advice in specialized domains that may significantly influence user decision-making beyond the scope of basic domain-specific knowledge. |
135
  | **Legal and Rights related Risks** | Privacy | Content requesting, misusing, or facilitating the unauthorized disclosure of an individual’s private information. |
136
  | | Illegal or Unethical | Content promoting or endorsing illegal or unethical behavior, or providing information related to such activities. |
137
- | | Copyrights | Content requesting or encouraging violations of copyright or security as defined. |
138
  | | Weaponization | Content promoting the possession, distribution, or manufacturing of firearms, or encouraging methods and intentions related to cyberattacks, infrastructure sabotage, or CBRN (Chemical, Biological, Radiological, and Nuclear) weapons. |
139
 
140
 
@@ -184,5 +186,19 @@ KT proprietary evaluation dataset
184
  ## License
185
  This model is released under the [Llama 3.1 Community License Agreement](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B/blob/main/LICENSE).
186
 
187
  ## Contact
188
  Technical Inquiries: [[email protected]](mailto:[email protected])
 
13
  <br>
14
 
15
  # News 📢
16
+ - 📑 `2025/10/01`: Published a [Research Paper](https://arxiv.org/abs/2509.23381)
17
+ - 📘 `2025/09/24`: Published a [Responsible AI Technical Report](https://arxiv.org/abs/2509.20057)
18
+ - ⚡️ `2025/09/24`: Released SafetyGuard Model collection on Hugging Face🤗.
19
  <br>
20
  <br>
21
 
22
  # Overview
23
  ## Description
24
+ **SafetyGuard :: Content Binary Guard** is a streaming-aware safety classifier built with Llama.
25
+ For more technical details, please refer to our [Research Paper](https://arxiv.org/abs/2509.23381).
26
 
27
 
28
  ## What it does
 
136
  | | Sensitive Uses | Content providing advice in specialized domains that may significantly influence user decision-making beyond the scope of basic domain-specific knowledge. |
137
  | **Legal and Rights related Risks** | Privacy | Content requesting, misusing, or facilitating the unauthorized disclosure of an individual’s private information. |
138
  | | Illegal or Unethical | Content promoting or endorsing illegal or unethical behavior, or providing information related to such activities. |
139
+ | | Copyrights | Content requesting or encouraging violations of copyright or security as defined under South Korean law. |
140
  | | Weaponization | Content promoting the possession, distribution, or manufacturing of firearms, or encouraging methods and intentions related to cyberattacks, infrastructure sabotage, or CBRN (Chemical, Biological, Radiological, and Nuclear) weapons. |
141
 
142
 
 
186
  ## License
187
  This model is released under the [Llama 3.1 Community License Agreement](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B/blob/main/LICENSE).
188
 
189
+
190
+ ## Citation
191
+
192
+ ```
193
+ @misc{lee2025guardvectorenglishllm,
194
+ title={Guard Vector: Beyond English LLM Guardrails with Task-Vector Composition and Streaming-Aware Prefix SFT},
195
+ author={Wonhyuk Lee and Youngchol Kim and Yunjin Park and Junhyung Moon and Dongyoung Jeong and Wanjin Park},
196
+ year={2025},
197
+ eprint={2509.23381},
198
+ archivePrefix={arXiv},
199
+ primaryClass={cs.CL},
200
+ url={https://arxiv.org/abs/2509.23381},
201
+ }
202
+ ```
203
  ## Contact
204
  Technical Inquiries: [[email protected]](mailto:[email protected])