The model is outputting garbage
#7 opened 3 months ago by sakurakotomi
Quantization Process?
#6 opened 5 months ago by yangchen123
Glitch Token Issue in DeepSeek-R1-0528-AWQ – Incorrect “极” Character in Long Prompts
#5 opened 5 months ago by alexchenyu
Future Plans for Multi-Token Prediction Support?
#4 opened 6 months ago by NaiveYan
The result is problematic.
#3 opened 6 months ago by zhnagchenchne
running with flashmla on A100s
#1 opened 6 months ago by ehartford