Int4ChooseQParamsAlgorithm error with vllm server
In a venv with:
- torch 2.8.0
- torchao 0.13.0
- vllm 0.10.2rc3.dev324
I see the following error when running `vllm serve pytorch/Qwen3-8B-AWQ-INT4`:
(APIServer pid=109129) Value error, Failed to find class Int4ChooseQParamsAlgorithm in any of the allowed modules:
torchao.quantization, torchao.dtypes, torchao.prototype.awq, torchao.prototype.mx_formats,
torchao.quantization.quantize_.common, torchao.sparsity.sparse_api, torchao.prototype.quantization [type=value_error,
input_value=ArgsKwargs((), {'model_co...additional_config': {}}), input_type=ArgsKwargs]
Update: From the code, it looks like this model now depends on torchao > 0.13.0, which, as far as I can tell, requires torch >= 2.9.0, breaking compatibility with vllm's latest nightly build (which depends on torch 2.8.0). Is that right? Any suggestions on how to make it servable with vllm?
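For reference, a minimal probe (illustrative only; the module list is copied from the error message above) to check whether the installed torchao actually exposes the class:

```python
# Illustrative probe, not part of vllm: check the modules listed in the
# error message for Int4ChooseQParamsAlgorithm.
import importlib

import torch
import torchao

print("torch  :", torch.__version__)
print("torchao:", torchao.__version__)

modules = [
    "torchao.quantization",
    "torchao.dtypes",
    "torchao.prototype.awq",
    "torchao.prototype.mx_formats",
    "torchao.quantization.quantize_.common",
    "torchao.sparsity.sparse_api",
    "torchao.prototype.quantization",
]

hits = []
for name in modules:
    try:
        mod = importlib.import_module(name)
    except ImportError:
        continue  # the module itself may not exist in older torchao releases
    if hasattr(mod, "Int4ChooseQParamsAlgorithm"):
        hits.append(name)

print("found in:", hits if hits else "nowhere -- torchao is likely too old")
```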
Thanks for trying this out. Yeah, this model requires a more recent torchao; it has to use torchao nightly currently, and the compatibility story is a bit complicated at the moment: https://github.com/pytorch/ao/issues/2919
One thing we can try now, I think, is to install the torchao, torch, and vllm nightlies together:
`pip install --pre torchao torch vllm --extra-index-url https://download.pytorch.org/whl/nightly/cu128`
vllm was added to that index very recently; I believe it is built against the torch nightly.
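After installing, a quick sanity check (just a sketch that confirms the three packages import together and shows the resolved versions) before running `vllm serve` again:

```python
# Quick sanity check after installing the nightlies: confirm torch, torchao,
# and vllm import together and print the resolved versions.
import torch
import torchao
import vllm

print("torch  :", torch.__version__)
print("torchao:", torchao.__version__)
print("vllm   :", vllm.__version__)
```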
I found the following issues so far, but will fix them soon:
- serving with vllm: this checkpoint seems to produce wrong results. I also tested Phi4-mini-AWQ-INT4, and that one is fine, so this is something specific to this checkpoint. Update [10/01/2025]: this seems specific to Qwen3-4B-AWQ-INT4 and to vllm...I don't have time to debug it right now though.
- serving with transformers: seems to work now, but I'll put up a fix if it fails (a sketch of this path is below).
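For the transformers path, a minimal sketch (assuming the checkpoint id from this thread loads through the standard Auto* APIs; untested beyond that):

```python
# Minimal sketch: generate with transformers instead of vllm while the
# vllm path is being fixed. Assumes the checkpoint loads via the standard
# Auto* APIs; model id is the one from this thread.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "pytorch/Qwen3-8B-AWQ-INT4"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",    # place layers automatically (needs accelerate)
    torch_dtype="auto",   # keep the dtype baked into the checkpoint
)

inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```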