vLLM version for inference of Qwen/Qwen3-VL-4B-Instruct-FP8 and Qwen/Qwen3-VL-4B-Instruct
#3 opened about 4 hours ago by saiyanhuang
VRAM usage not making sense
#2 opened 13 days ago by spanspek