Code adapted from: https://github.com/vllm-project/llm-compressor/blob/main/examples/awq/qwen3-vl-30b-a3b-Instruct-example.py
Qwen3-VL-8B-Instruct-AWQ
AWQ (W4A16) quantized version of Qwen/Qwen3-VL-8B-Instruct.
- Quantization: AWQ, 4 bits, group_size=128, zero_point=true, version="gemm" (see the config sketch after this list)
- modules_to_not_convert: ["visual"]
- Prepared with the LLM Compressor oneshot AWQ flow, using the recipe below (a fuller quantization sketch follows this list):

```python
recipe = AWQModifier(
    targets="Linear",
    scheme="W4A16",
    ignore=[r"re:model.visual.*", r"re:visual.*"],  # lm_head dropped from the ignore list
    duo_scaling=True,
)
```
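A minimal sketch of the oneshot flow described above, based on the linked llm-compressor AWQ example. The model class, calibration dataset, and sample counts here are illustrative assumptions, not the exact settings of the original script:

```python
from llmcompressor import oneshot
from llmcompressor.modifiers.awq import AWQModifier
from transformers import AutoModelForImageTextToText, AutoProcessor

MODEL_ID = "Qwen/Qwen3-VL-8B-Instruct"
SAVE_DIR = "Qwen3-VL-8B-Instruct-quantized.w4a16"

# Load the base model and processor (auto class assumed to cover Qwen3-VL).
model = AutoModelForImageTextToText.from_pretrained(MODEL_ID, torch_dtype="auto")
processor = AutoProcessor.from_pretrained(MODEL_ID)

# Same recipe as above: quantize Linear layers to W4A16, skip the vision tower.
recipe = AWQModifier(
    targets="Linear",
    scheme="W4A16",
    ignore=[r"re:model.visual.*", r"re:visual.*"],
    duo_scaling=True,
)

# Calibration-based one-shot quantization.
oneshot(
    model=model,
    recipe=recipe,
    dataset="open_platypus",       # assumed text-only calibration set
    max_seq_length=2048,           # assumed
    num_calibration_samples=256,   # assumed
)

model.save_pretrained(SAVE_DIR, save_compressed=True)
processor.save_pretrained(SAVE_DIR)
```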
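For reference, the quantization parameters listed above correspond to an AutoAWQ-style `quantization_config`. An LLM Compressor export may serialize them differently on disk (compressed-tensors format), so treat this as a sketch of the parameters rather than the literal `config.json` contents:

```json
"quantization_config": {
  "quant_method": "awq",
  "bits": 4,
  "group_size": 128,
  "zero_point": true,
  "version": "gemm",
  "modules_to_not_convert": ["visual"]
}
```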
Model tree:
- Model: SherlockID365/Qwen3-VL-8B-Instruct-quantized.w4a16
- Base model: Qwen/Qwen3-VL-8B-Instruct
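A minimal serving sketch with vLLM, using the repo id above. Sampling settings are illustrative, and multimodal (image) inputs via vLLM's chat API are omitted for brevity:

```python
from vllm import LLM, SamplingParams

# Load the quantized checkpoint directly from the Hub.
llm = LLM(model="SherlockID365/Qwen3-VL-8B-Instruct-quantized.w4a16")
params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=128)

outputs = llm.generate(["Briefly explain AWQ W4A16 quantization."], params)
print(outputs[0].outputs[0].text)
```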