Commit daccaf1 · verified · committed by luguoshan
1 Parent(s): c01ef14

Update README.md

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -21,6 +21,7 @@ This model is based on the principles described in the paper [Large Language Dif

  - `LLaDA-MoE-7B-A1B-Base`: A base pre-trained model designed for research and secondary development.
  - `LLaDA-MoE-7B-A1B-Instruct`: An instruction-tuned model optimized for practical applications.
+ - `LLaDA-MoE-7B-A1B-Instruct-TD`: A specialized instruction-tuned model, further optimized for accelerated inference using Trajectory Distillation.

  ---
  <div align="center">
@@ -56,6 +57,7 @@ This model is based on the principles described in the paper [Large Language Dif
  |--------|-------------|-------------------|
  | [`inclusionAI/LLaDA-MoE-7B-A1B-Base`](https://huggingface.co/inclusionAI/LLaDA-MoE-7B-A1B-Base) | Base pre-trained model for research and fine-tuning. | [🤗 Model Card](https://huggingface.co/inclusionAI/LLaDA-MoE-7B-A1B-Base) |
  | [`inclusionAI/LLaDA-MoE-7B-A1B-Instruct`](https://huggingface.co/inclusionAI/LLaDA-MoE-7B-A1B-Instruct) | Instruction-tuned model, ready for downstream applications. | [🤗 Model Card](https://huggingface.co/inclusionAI/LLaDA-MoE-7B-A1B-Instruct) |
+ | [`inclusionAI/LLaDA-MoE-7B-A1B-Instruct-TD`](https://huggingface.co/inclusionAI/LLaDA-MoE-7B-A1B-Instruct-TD) | An instruction-tuned model further optimized with **Trajectory Distillation (TD)** for accelerated inference. Decodes multiple tokens per forward pass. | [🤗 Model Card](https://huggingface.co/inclusionAI/LLaDA-MoE-7B-A1B-Instruct-TD) |

  ---