Update config.json for sglang

by ayachinenefan - opened 12 days ago

base: refs/heads/main ← from: refs/pr/3


+138

-22

ayachinenefan

12 days ago

In sglang, the original ignore list has no effect. It needs to be changed to follow the configuration format used by https://huggingface.co/Qwen/Qwen3-VL-235B-A22B-Instruct-FP8.

Update config.json for sglang (e6bb86e0)

ayachinenefan

12 days ago

@alexmarques @nm-research @dsikka could you please review this when you are free? Thanks.

ayachinenefan

4 days ago

@alexmarques @nm-research @dsikka could you please review this when you are free? Thanks.

dsikka

Red Hat AI org 4 days ago

Hi @ayachinenefan, the vision tower is unquantized, which is why all the vision layers are listed in the ignore list as they are. It seems like you're keeping some layers as is while removing the `model.` prefix in other cases?

ayachinenefan

3 days ago

@dsikka Yes. In sglang, with the original config.json the ignore entries for the non-quantized vision layers do not take effect, so the vision part ends up being quantized as well. To follow the approach used in https://huggingface.co/Qwen/Qwen3-VL-235B-A22B-Instruct-FP8, the prefix in config.json needs to be removed.
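For illustration, here is a minimal Python sketch of the kind of rewrite described above: dropping a leading `model.` prefix from every entry of an ignore list so that all entries are treated the same way. The key names (`quantization_config`, `ignore`), the layer names, and the helper `strip_prefix` are assumptions made for the example and are not taken from the actual config.json in this PR.

```python
import json


def strip_prefix(names, prefix="model."):
    """Return names with a leading prefix removed; names without it pass through unchanged."""
    return [n[len(prefix):] if n.startswith(prefix) else n for n in names]


# Hypothetical config fragment; the real file's keys and layer names may differ.
config = {
    "quantization_config": {
        "ignore": [
            "model.visual.blocks.0.attn.qkv",
            "model.visual.blocks.0.mlp.linear_fc1",
            "lm_head",
        ]
    }
}

quant = config["quantization_config"]
quant["ignore"] = strip_prefix(quant["ignore"])

# Show the rewritten fragment.
print(json.dumps(config, indent=2))
```

Applying one helper to the whole list, rather than editing entries by hand, avoids ending up with a mix of prefixed and unprefixed names.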


Ready to merge

This branch is ready to get merged automatically.
