# Optimized Model: Qwen3-VL-2B-Instruct-4bit-bnb

This is an optimized version of Qwen/Qwen3-VL-2B-Instruct.

## Optimizations Applied

- Quantized to 4-bit with bitsandbytes (BNB)

## How to Use

```python
# Requires a transformers release with Qwen3-VL support, plus the
# bitsandbytes package for the 4-bit weights.
from transformers import AutoProcessor, AutoModelForImageTextToText

model_id = "broadfield-dev/Qwen3-VL-2B-Instruct-4bit-bnb"
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
# Qwen3-VL is a vision-language model, so it loads through the
# image-text-to-text auto class rather than AutoModelForCausalLM.
model = AutoModelForImageTextToText.from_pretrained(
    model_id, device_map="auto", trust_remote_code=True
)
```
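Once loaded, inference follows the usual multimodal chat-template flow. The sketch below assumes the `model` and `processor` from the snippet above and uses a placeholder image URL; it is an illustration, not an official recipe from this repo:

```python
# Hypothetical usage sketch: image + text chat through the loaded model.
# `model` and `processor` come from the loading snippet; the URL is a placeholder.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/demo.jpg"},
            {"type": "text", "text": "Describe this image."},
        ],
    }
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=128)
# Decode only the newly generated tokens, not the echoed prompt.
new_tokens = output_ids[:, inputs["input_ids"].shape[1]:]
print(processor.batch_decode(new_tokens, skip_special_tokens=True)[0])
```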

## ONNX Information

This model was converted to ONNX with the task: vision2seq-lm.

## Model Details

- Format: Safetensors
- Model size: 2B params
- Tensor types: F32, F16, U8