Glitch Token Issue in DeepSeek-R1-0528-AWQ – Incorrect “极” Character in Long Prompts

#5
by alexchenyu

Description:
When using cognitivecomputations/DeepSeek-R1-0528-AWQ, I frequently hit a glitch during long-prompt inference: the output sometimes contains the Chinese character “极” inserted in the middle of otherwise normal English text. It looks like a mis-decoded token (e.g. a token such as `_St` apparently coming out as “极”), possibly caused by a quantization/tokenizer misalignment.
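For reference, here is a minimal sketch (assuming the tokenizer files in this repo load via `transformers.AutoTokenizer`) for looking up which token id(s) the glitch character maps to, so the id can be compared against the base model’s tokenizer or suppressed at sampling time:

```python
from transformers import AutoTokenizer

# Assumption: the tokenizer shipped in this repo loads with AutoTokenizer;
# point the repo id / revision at the exact files you serve with vLLM.
tok = AutoTokenizer.from_pretrained(
    "cognitivecomputations/DeepSeek-R1-0528-AWQ",
    trust_remote_code=True,
)

# Which id(s) does the glitch character map to, and do they round-trip cleanly?
glitch_ids = tok.encode("极", add_special_tokens=False)
print("ids for '极':", glitch_ids)
print("tokens   :", tok.convert_ids_to_tokens(glitch_ids))
print("decoded  :", tok.decode(glitch_ids))
```

If these ids differ from what the original DeepSeek-R1-0528 tokenizer produces, that would support the offset theory mentioned in the notes below.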

Steps to Reproduce:

  1. Load one of the AWQ quantized models using vLLM.
  2. Provide a very long English prompt (e.g. >50K tokens) containing code or Markdown.
  3. Observe that the response intermittently includes “极” in places where it should not (a minimal repro sketch follows this list).
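Roughly the setup, as a sketch (the parallelism and context-length values are placeholders; adjust them to your deployment):

```python
from vllm import LLM, SamplingParams

# Sketch of the serving setup; tensor_parallel_size and max_model_len are
# placeholders for whatever the actual deployment uses.
llm = LLM(
    model="cognitivecomputations/DeepSeek-R1-0528-AWQ",
    quantization="awq",
    tensor_parallel_size=8,
    max_model_len=65536,
    trust_remote_code=True,
)

# A very long English prompt (>50K tokens) containing code / Markdown.
long_prompt = open("long_prompt.txt").read()

params = SamplingParams(temperature=0.6, max_tokens=2048)
out = llm.generate([long_prompt], params)[0].outputs[0].text

# The glitch: “极” shows up in the middle of otherwise normal English output.
print("glitch present:", "极" in out)
```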

Expected Behavior:
The response should contain only English tokens or valid ASCII characters. No Chinese characters like “极” should appear.
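A simple post-hoc check can flag affected responses in logs (just a sketch; it only looks for non-ASCII characters in output that is expected to be ASCII-only):

```python
def has_glitch(text: str) -> bool:
    # Flag any non-ASCII character in output that should be pure English/ASCII.
    return any(ord(ch) > 127 for ch in text)
```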

Temporary Workaround / Notes:

  1. The glitch seems to occur more frequently as the input length grows.
  2. I’ve tried clearing caches and making sure the tokenizer files come from the same commit as the model weights, but the issue persists.
  3. Some community discussion suggests a tokenizer/embedding offset caused by outdated tokenizer files in older AWQ exports; a full re-quantization or updated AWQ files might fix it. In the meantime, a sampling-time stopgap is sketched below.
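As a stopgap rather than a fix, the offending token id can be suppressed at sampling time. A minimal sketch, assuming a vLLM version whose `SamplingParams` accepts `logits_processors`, and that the id found with the tokenizer snippet above is indeed the culprit:

```python
from vllm import SamplingParams

GLITCH_TOKEN_ID = 12345  # placeholder: substitute the id found with the tokenizer snippet


def ban_glitch_token(token_ids, logits):
    # Force the glitch token's logit to -inf so it can never be sampled.
    logits[GLITCH_TOKEN_ID] = -float("inf")
    return logits


params = SamplingParams(
    temperature=0.6,
    max_tokens=2048,
    logits_processors=[ban_glitch_token],
)
```

This only hides the symptom; if the root cause really is a tokenizer/embedding offset, re-quantizing or updating the tokenizer files would still be the proper fix.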

Please let me know if there’s a patch/fix, an updated model, or a recommended workaround. Thanks!
