wenhua cheng

wenhuach

wenhuach21

AI & ML interests

Model Compression, CV

Recent Activity

new activity 8 days ago

Intel/gemma-4-31B-it-int4-AutoRound: This quant is the best. Please convert it to GGUF

updated a model 9 days ago

Intel/gemma-4-31B-it-q4km-AutoRound-preview

published a model 9 days ago

Intel/gemma-4-31B-it-q4km-AutoRound-preview

New activity in Intel/gemma-4-31B-it-int4-AutoRound 8 days ago

This quant is the best. Please convert it to GGUF

#7 opened 9 days ago by

alexcardo

New activity in Intel/gemma-4-12B-it-int4-AutoRound 14 days ago

torch RuntimeError: Shape mismatch: a.size(1) = 4096, size_k = 8192

#1 opened 14 days ago by

saadsafi

New activity in Intel/gemma-4-31B-it-int4-AutoRound 25 days ago

INT8 version for TP=2 / dual Ampere GPUs?

🚀 1

#6 opened about 1 month ago by

mancub

New activity in Intel/MiroThinker-v1.5-30B-gguf-q2ks-mixed-AutoRound 28 days ago

Correct metadata, add library name, and link SignRoundV2 paper

#1 opened 30 days ago by

nielsr

New activity in Intel/Qwen3.6-27B-int4-AutoRound 28 days ago

Why delete Intel/Qwen3.6-35B-A3B-int4-AutoRound?

#1 opened about 2 months ago by

bgeneto

New activity in Intel/Qwen3.6-35B-A3B-int4-mixed-AutoRound about 1 month ago

it can use dflash directly with z-lab/Qwen3.6-35B-A3B-Dflash

#2 opened about 1 month ago by

syvvvv

New activity in Intel/gemma-4-31B-it-int4-AutoRound about 2 months ago

Please update chat template

#4 opened about 2 months ago by

alexcardo

FP4?

#5 opened about 2 months ago by

alexcardo

New activity in Intel/Qwen3.6-27B-int4-AutoRound about 2 months ago

does this even run on intel gpus?

#2 opened about 2 months ago by

Thomas98519864

New activity in Intel/Qwen3.6-27B-4.5b-mlx-AutoRound about 2 months ago

AutoRound quant fails to load with mlx-lm

👍 1

#1 opened about 2 months ago by

smcleod

New activity in cyankiwi/Qwen3.6-35B-A3B-AWQ-4bit about 2 months ago

How does this compare to the original 8bit qwen quant and the 4 bit auto-round quant?

#5 opened 2 months ago by

sparx3

New activity in Intel/gemma-4-26B-A4B-it-int4-mixed-AutoRound about 2 months ago

any plan for an Ampere compatible version?

#2 opened about 2 months ago by

electroglyph

New activity in Intel/gemma-4-31B-it-int4-AutoRound about 2 months ago

Fails to load on Ampere (sm_86) at TP=2: Marlin kernel rejects 32-dim weight slice

#3 opened about 2 months ago by

wasifb

New activity in Intel/GLM-4.7-Flash-int4-AutoRound 2 months ago

MTP 0 accept rate

#4 opened 2 months ago by

AMUN-RA1

New activity in Intel/gemma-4-31B-it-int4-AutoRound 2 months ago

Installation Video and Testing - Step by Step

🚀 3

#1 opened 2 months ago by

fahdmirzac

New activity in Intel/gemma-4-26B-A4B-it-int4-mixed-AutoRound 2 months ago

GGUF version

🔥 1

#1 opened 2 months ago by

limcheekin

New activity in Intel/Qwen3.5-397B-A17B-gguf-q2ks-mixed-AutoRound 3 months ago

Performance indicators

👍 4

#1 opened 3 months ago by

dehnhaide

New activity in Intel/GLM-5-int4-mixed-AutoRound 3 months ago

This model always predicts some few nonsense sequences

#1 opened 4 months ago by

CharlesChen2023

New activity in Intel/Qwen3.5-122B-A10B-int4-AutoRound 3 months ago

Does the A100 work?

#1 opened 4 months ago by

xz123321

New activity in Intel/Qwen3.5-35B-A3B-int4-AutoRound 4 months ago

Thanks! And MTP key question

#1 opened 4 months ago by

seanthomaswilliams
