wenhua cheng
wenhuach
AI & ML interests
Model Compression, CV
Recent Activity
new activity 8 days ago
Intel/gemma-4-31B-it-int4-AutoRound: This quant is the best. Please conver it to GGUF
updated a model 9 days ago
Intel/gemma-4-31B-it-q4km-AutoRound-preview
published a model 9 days ago
Intel/gemma-4-31B-it-q4km-AutoRound-preview
Organizations
This quant is the best. Please conver it to GGUF
5
#7 opened 9 days ago by alexcardo
torch RuntimeError: Shape mismatch: a.size(1) = 4096, size_k = 8192
2
#1 opened 14 days ago by saadsafi
INT8 version for TP=2 / dual Ampere GPUs?
🚀 1
1
#6 opened about 1 month ago by mancub
Correct metadata, add library name, and link SignRoundV2 paper
#1 opened 30 days ago by nielsr
Why delete Intel/Qwen3.6-35B-A3B-int4-AutoRound?
15
#1 opened about 2 months ago by bgeneto
it can use dflash directly with z-lab/Qwen3.6-35B-A3B-Dflash
1
#2 opened about 1 month ago by syvvvv
Please update chat template
2
#4 opened about 2 months ago by alexcardo
does this even run on intel gpus?
6
#2 opened about 2 months ago by Thomas98519864
AutoRound quant fails to load with mlx-lm
👍 1
1
#1 opened about 2 months ago by smcleod
How does this compare to the original 8bit qwen quant and the 4 bit auto-round quant?
2
#5 opened 2 months ago by sparx3
any plan for an Ampere compatible version?
2
#2 opened about 2 months ago by electroglyph
Fails to load on Ampere (sm_86) at TP=2: Marlin kernel rejects 32-dim weight slice
2
#3 opened about 2 months ago by wasifb
MTP 0 accept rate
2
#4 opened 2 months ago by AMUN-RA1
Installation Video and Testing - Step by Step
🚀 3
5
#1 opened 2 months ago by fahdmirzac
GGUF version
🔥 1
1
#1 opened 2 months ago by limcheekin
Performance indicators
👍 4
4
#1 opened 3 months ago by dehnhaide
This model always predicts some few nonsense sequences
8
#1 opened 4 months ago by CharlesChen2023
Does the A100 work?
12
#1 opened 4 months ago by xz123321
Thanks! And MTP key question
11
#1 opened 4 months ago by seanthomaswilliams