Commit a5551ca by Seerkfang · verified · 1 Parent(s): 1fb9666

Fix typos in README.md

Files changed (1)
  1. README.md +9 -9
README.md CHANGED
@@ -42,15 +42,15 @@ This repository hosts the model weights for AHN. For installation, usage instruc
  ### Model Zoo
  | base model | AHN module | #params | checkpoint (AHN only) |
  |:---:|:---:| :---:|:---:|
- | Qwen2.5-3B-Instruct | Mamba2 | 119M | [🤗model](https://huggingface.co/ByteDance-Seed/AHN-Mamba2-for-Qwen-2.5-Instruct-3B) |
- | Qwen2.5-3B-Instruct | DeltaNet | 118M | [🤗model](https://huggingface.co/ByteDance-Seed/AHN-DN-for-Qwen-2.5-Instruct-3B) |
- | Qwen2.5-3B-Instruct | GatedDeltaNet | 130M | [🤗model](https://huggingface.co/ByteDance-Seed/AHN-GDN-for-Qwen-2.5-Instruct-3B) |
- | Qwen2.5-7B-Instruct | Mamba2 | 186M | [🤗model](https://huggingface.co/ByteDance-Seed/AHN-Mamba2-for-Qwen-2.5-Instruct-7B) |
- | Qwen2.5-7B-Instruct | DeltaNet | 185M | [🤗model](https://huggingface.co/ByteDance-Seed/AHN-DN-for-Qwen-2.5-Instruct-7B) |
- | Qwen2.5-7B-Instruct | GatedDeltaNet | 213M | [🤗model](https://huggingface.co/ByteDance-Seed/AHN-GDN-for-Qwen-2.5-Instruct-7B) |
- | Qwen2.5-14B-Instruct | Mamba2 | 514M | [🤗model](https://huggingface.co/ByteDance-Seed/AHN-Mamba2-for-Qwen-2.5-Instruct-14B) |
- | Qwen2.5-14B-Instruct | DeltaNet | 511M | [🤗model](https://huggingface.co/ByteDance-Seed/AHN-DN-for-Qwen-2.5-Instruct-14B) |
- | Qwen2.5-14B-Instruct | GatedDeltaNet | 610M | [🤗model](https://huggingface.co/ByteDance-Seed/AHN-GDN-for-Qwen-2.5-Instruct-14B) |
+ | Qwen2.5-3B-Instruct | Mamba2 | 11.9M | [🤗model](https://huggingface.co/ByteDance-Seed/AHN-Mamba2-for-Qwen-2.5-Instruct-3B) |
+ | Qwen2.5-3B-Instruct | DeltaNet | 11.8M | [🤗model](https://huggingface.co/ByteDance-Seed/AHN-DN-for-Qwen-2.5-Instruct-3B) |
+ | Qwen2.5-3B-Instruct | GatedDeltaNet | 13.0M | [🤗model](https://huggingface.co/ByteDance-Seed/AHN-GDN-for-Qwen-2.5-Instruct-3B) |
+ | Qwen2.5-7B-Instruct | Mamba2 | 18.6M | [🤗model](https://huggingface.co/ByteDance-Seed/AHN-Mamba2-for-Qwen-2.5-Instruct-7B) |
+ | Qwen2.5-7B-Instruct | DeltaNet | 18.5M | [🤗model](https://huggingface.co/ByteDance-Seed/AHN-DN-for-Qwen-2.5-Instruct-7B) |
+ | Qwen2.5-7B-Instruct | GatedDeltaNet | 21.3M | [🤗model](https://huggingface.co/ByteDance-Seed/AHN-GDN-for-Qwen-2.5-Instruct-7B) |
+ | Qwen2.5-14B-Instruct | Mamba2 | 51.4M | [🤗model](https://huggingface.co/ByteDance-Seed/AHN-Mamba2-for-Qwen-2.5-Instruct-14B) |
+ | Qwen2.5-14B-Instruct | DeltaNet | 51.1M | [🤗model](https://huggingface.co/ByteDance-Seed/AHN-DN-for-Qwen-2.5-Instruct-14B) |
+ | Qwen2.5-14B-Instruct | GatedDeltaNet | 61.0M | [🤗model](https://huggingface.co/ByteDance-Seed/AHN-GDN-for-Qwen-2.5-Instruct-14B) |
 
  ### Evaluation