(vLLM) Tool calling broken after update to tokenizer_config.json
I'm serving the model with vLLM, but commit 66c370b modified the chat template in a way that removed tool support, breaking features such as tool calling via tool_choice='auto' with the OpenAI Chat Completion Client.
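For reference, a minimal request of the kind that breaks, sent to vLLM's OpenAI-compatible endpoint (the `get_weather` tool and the localhost URL are placeholders, not from the original report):

```bash
# Minimal repro against vLLM's OpenAI-compatible server.
# Endpoint and the get_weather tool are placeholder assumptions.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-VL-32B-Instruct-AWQ",
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]
        }
      }
    }],
    "tool_choice": "auto"
  }'
```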
Workaround: use the previous version (hash: 05440b7) of tokenizer_config.json.
For vLLM specifically, serve your model with the following argument:
--tokenizer-revision 05440b7
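A full serve invocation might look like the sketch below; the two tool-calling flags are my assumption of a typical setup (vLLM's docs suggest the `hermes` parser for Qwen2.5 models) and are not part of the original workaround:

```bash
# Sketch: pin the tokenizer to the pre-change revision and enable
# tool calling. --enable-auto-tool-choice and --tool-call-parser
# are assumed from vLLM's documented Qwen2.5 tool-calling setup.
vllm serve Qwen/Qwen2.5-VL-32B-Instruct-AWQ \
  --tokenizer-revision 05440b7 \
  --enable-auto-tool-choice \
  --tool-call-parser hermes
```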
Link to commit change:
https://huggingface.co/Qwen/Qwen2.5-VL-32B-Instruct-AWQ/commit/66c370b74a18e7b1e871c97918f032ed3578dfef
Qwen 2.5 VL is a great model, but tool calling works only with tool_choice="required" in vLLM.
The option suggested here (adding --tokenizer-revision 05440b7005147091006f2d72024a2d86801a4418) doesn't work anymore and throws an error:
ValueError: Unrecognized model in Qwen/Qwen2.5-VL-72B-Instruct-AWQ. Should have a `model_type` key in its config.json, or contain one of the following strings in its name: ...
... qwen2_5_omni, qwen2_5_vl, qwen2_5_vl_text, qwen2_audio ...
I also tried copying the template from that commit and pasting it manually into tokenizer_config.json, but in that case vLLM simply hangs and the request is never answered (memory usage and power draw in nvidia-smi spike, indicating that it is doing something).
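Instead of editing tokenizer_config.json in place, the old template can in principle be saved to a local file and passed externally via vLLM's `--chat-template` flag; a sketch is below (the file name is hypothetical, and I have not confirmed whether this avoids the hang):

```bash
# Untested alternative: save the pre-66c370b template to a local
# Jinja file and point vLLM at it, rather than editing
# tokenizer_config.json. The file name is hypothetical.
vllm serve Qwen/Qwen2.5-VL-72B-Instruct-AWQ \
  --chat-template ./qwen2_5_vl_tool_template.jinja \
  --enable-auto-tool-choice \
  --tool-call-parser hermes
```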
Any suggestions at this point? Is Qwen 2.5 VL (both AWQ and non-quantized) still usable with tool_choice="auto"? I'm using vLLM 0.10.1.1.