Glitch Token Issue in DeepSeek-R1-0528-AWQ – Incorrect “极” Character in Long Prompts
#5 · opened by alexchenyu
Description:
When using cognitivecomputations/DeepSeek-R1-0528-AWQ, I often encounter a glitch during long-prompt inference. Specifically, the output sometimes contains the Chinese character “极” incorrectly inserted in the middle of English text. It looks like a mis-decoded token (e.g. appearing where a subword such as `_St` would be expected), possibly caused by quantization/tokenizer misalignment.
Steps to Reproduce:
- Load the AWQ-quantized model with vLLM (see the sketch after this list).
- Provide a very long English prompt (e.g. >50K tokens) containing code or markdown.
- Observe that the response intermittently includes “极” in places where it should not appear.
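
For reference, here is a minimal, hypothetical repro sketch in Python. The vLLM arguments (`quantization`, `max_model_len`, `tensor_parallel_size`) and the prompt construction are illustrative assumptions and will need adjusting for the actual deployment:

```python
# Hypothetical repro sketch; argument values are illustrative, not a known-good config.
from vllm import LLM, SamplingParams

llm = LLM(
    model="cognitivecomputations/DeepSeek-R1-0528-AWQ",
    quantization="awq",       # AWQ weights
    max_model_len=65536,      # long-context setting (assumed supported by the deployment)
    tensor_parallel_size=8,   # adjust to the available GPUs
)

# Build a long English prompt (>50K tokens) by repeating code/markdown content.
long_prompt = "```python\ndef example():\n    return 42\n```\n" * 4000

params = SamplingParams(temperature=0.6, max_tokens=2048)
output = llm.generate([long_prompt], params)[0].outputs[0].text

print("Glitch present:", "极" in output)
```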
Expected Behavior:
The response should contain only English tokens or valid ASCII characters. No Chinese characters like “极” should appear.
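
A simple post-processing check like the sketch below (nothing model-specific; the sample string is made up) can flag any non-ASCII characters in a response that should be pure English:

```python
import re

def find_non_ascii(text: str):
    """Return (position, character) pairs for any non-ASCII characters in the output."""
    return [(m.start(), m.group()) for m in re.finditer(r"[^\x00-\x7F]", text)]

# Example: flag the glitch character in an otherwise-English response.
sample = "The St极ring class provides..."
print(find_non_ascii(sample))  # [(6, '极')]
```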
Temporary Workaround / Notes:
- The glitch seems to occur more frequently as the input length grows.
- I’ve tried clearing caches and ensuring the tokenizer files come from the same commit as the weights, but the issue persists.
- Some community discussion suggests a tokenizer/embedding offset caused by outdated AWQ tokenizer files; a full re-quantization or an update to the AWQ files might fix it. In the meantime, a sampling-time workaround like the sketch below can at least suppress the glitch token.
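
This is only a sketch that masks the symptom rather than fixing the underlying misalignment. It assumes the vLLM version in use accepts per-request `logits_processors` in `SamplingParams`; the tokenizer lookup is illustrative:

```python
# Hedged workaround sketch: block the "极" token at sampling time.
from transformers import AutoTokenizer
from vllm import SamplingParams

tokenizer = AutoTokenizer.from_pretrained("cognitivecomputations/DeepSeek-R1-0528-AWQ")
glitch_ids = tokenizer.encode("极", add_special_tokens=False)

def block_glitch(token_ids, logits):
    # Push the glitch token id(s) to -inf so they can never be sampled.
    for tid in glitch_ids:
        logits[tid] = -float("inf")
    return logits

params = SamplingParams(
    temperature=0.6,
    max_tokens=2048,
    logits_processors=[block_glitch],
)
```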
Please let me know if there’s a patch/fix, updated model, or recommended workaround. Thanks!