RWKV-Red-Team/ARWKV-R1-1B5

Text Generation


xiaol committed on Feb 7

Commit 7753d3d · verified · 1 Parent(s): c2aa8d4

Update README.md

Files changed (1)

README.md +1 -1


@@ -31,7 +31,7 @@ library_name: transformers
 **ALL YOU NEED IS RWKV**
-This is an **early preview** of our 7B parameter hybrid RNN-Transformer model, trained on 2k context length **(only stage-2 applied, without SFT or DPO)** through 3-stage knowledge distillation from DeepSeek-R1-Distill-Qwen-1.5B. While being a foundational version, it demonstrates:
+This is an **early preview** of our 7B parameter RNN-based model, trained on 2k context length **(only stage-2 applied, without SFT or DPO)** through 3-stage knowledge distillation from DeepSeek-R1-Distill-Qwen-1.5B. While being a foundational version, it demonstrates:
 - ✅ RWKV-7's efficient recurrence mechanism
 - ✅ No self-attention, fully O(n)