Int4ChooseQParamsAlgorithm error with vllm server
In a venv with:
- torch 2.8.0
- torchao 0.13.0
- vllm 0.10.2rc3.dev324
I see the following error when running `vllm serve pytorch/Qwen3-8B-AWQ-INT4`:
(APIServer pid=109129) Value error, Failed to find class Int4ChooseQParamsAlgorithm in any of the allowed modules:
torchao.quantization, torchao.dtypes, torchao.prototype.awq, torchao.prototype.mx_formats,
torchao.quantization.quantize_.common, torchao.sparsity.sparse_api, torchao.prototype.quantization [type=value_error,
input_value=ArgsKwargs((), {'model_co...additional_config': {}}), input_type=ArgsKwargs]
Update: From the code, it looks like this model now depends on torchao > 0.13.0, which, as far as I can tell, requires torch >= 2.9.0, breaking compatibility with vllm's latest nightly build (which depends on torch 2.8.0). Is that right? Any suggestions on how to make it servable with vllm?
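For reference, a minimal probe (illustrative only; the module list is copied from the error message above) to check whether the installed torchao actually exposes the class:

```python
# Illustrative probe, not part of vllm: check the modules listed in the
# error message for Int4ChooseQParamsAlgorithm.
import importlib

import torch
import torchao

print("torch  :", torch.__version__)
print("torchao:", torchao.__version__)

modules = [
    "torchao.quantization",
    "torchao.dtypes",
    "torchao.prototype.awq",
    "torchao.prototype.mx_formats",
    "torchao.quantization.quantize_.common",
    "torchao.sparsity.sparse_api",
    "torchao.prototype.quantization",
]

hits = []
for name in modules:
    try:
        mod = importlib.import_module(name)
    except ImportError:
        continue  # the module itself may not exist in older torchao releases
    if hasattr(mod, "Int4ChooseQParamsAlgorithm"):
        hits.append(name)

print("found in:", hits if hits else "nowhere -- torchao is likely too old")
```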
Thanks for trying this out. Yeah, this model requires a more recent torchao; it has to use torchao nightly currently, and the compatibility story is a bit complicated at the moment: https://github.com/pytorch/ao/issues/2919
One thing we can try now, I think, is to install the torchao, torch, and vllm nightlies together:
`pip install --pre torchao torch vllm --extra-index-url https://download.pytorch.org/whl/nightly/cu128`
vllm was added to that index very recently; I believe it is built against the torch nightly.
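After installing, a quick sanity check (just a sketch that confirms the three packages import together and shows the resolved versions) before running `vllm serve` again:

```python
# Quick sanity check after installing the nightlies: confirm torch, torchao,
# and vllm import together and print the resolved versions.
import torch
import torchao
import vllm

print("torch  :", torch.__version__)
print("torchao:", torchao.__version__)
print("vllm   :", vllm.__version__)
```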
I found the following issues so far, but will fix them soon:
- serving with vllm: this checkpoint seems to produce wrong results. I also tested Phi4-mini-AWQ-INT4, and that one is fine, so this is something specific to this checkpoint. Update [10/01/2025]: this seems specific to Qwen3-4B-AWQ-INT4 and to vllm...I don't have time to debug it right now though.
- serving with transformers: seems to work now, but I'll put up a fix if it fails (a sketch of this path is below).
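For the transformers path, a minimal sketch (assuming the checkpoint id from this thread loads through the standard Auto* APIs; untested beyond that):

```python
# Minimal sketch: generate with transformers instead of vllm while the
# vllm path is being fixed. Assumes the checkpoint loads via the standard
# Auto* APIs; model id is the one from this thread.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "pytorch/Qwen3-8B-AWQ-INT4"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",    # place layers automatically (needs accelerate)
    torch_dtype="auto",   # keep the dtype baked into the checkpoint
)

inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```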