Vision LoRA Adapter

This is a LoRA adapter for a vision-language model, trained to adapt the vision tower and multimodal connector layers in addition to the language model layers.

Model Details

  • Base Model: Qwen/Qwen2.5-VL-3B-Instruct
  • LoRA Rank: 32
  • LoRA Alpha: 32
  • Target Modules:
    • Language Model: ✓
    • Vision Tower: ✓
    • Connector/Projector: ✓
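The module targeting above can be expressed as a PEFT `LoraConfig`. The sketch below is a hypothetical reconstruction: the specific `target_modules` names are assumptions for illustration (actual names depend on the base model's layer naming), while the rank and alpha values come from the Model Details list.

```python
# Hypothetical LoRA config matching the details above.
# The target_modules entries are ASSUMED names, not the adapter's actual config.
from peft import LoraConfig

lora_config = LoraConfig(
    r=32,           # LoRA rank, as listed in Model Details
    lora_alpha=32,  # LoRA alpha, as listed in Model Details
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # language model attention (assumed)
        "qkv", "proj",                           # vision tower attention (assumed)
        "mlp.0", "mlp.2",                        # connector/projector layers (assumed)
    ],
    task_type="CAUSAL_LM",
)
```

With equal rank and alpha (32/32), the effective LoRA scaling factor alpha/r is 1.0.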

Usage with vLLM

from vllm import LLM
from vllm.lora.request import LoRARequest

# Load model with LoRA support
llm = LLM(
    model="Qwen/Qwen2.5-VL-3B-Instruct",
    enable_lora=True,
    max_loras=1,
    max_lora_rank=32,
)

# Generate with LoRA
lora_request = LoRARequest("adapter", 1, "prashanth058/qwen2.5-3b-vl-flickr-lora-vision")
outputs = llm.generate(
    prompts=["<your prompt>"],
    lora_request=lora_request,
)
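The same adapter can also be served through vLLM's OpenAI-compatible server. This is a minimal sketch assuming a recent vLLM release; the adapter name `adapter` on the command line is an arbitrary label you choose, and clients select it via the `model` field.

```shell
# Serve the base model with the LoRA adapter registered (sketch).
vllm serve Qwen/Qwen2.5-VL-3B-Instruct \
  --enable-lora \
  --max-lora-rank 32 \
  --lora-modules adapter=prashanth058/qwen2.5-3b-vl-flickr-lora-vision

# Clients then request the adapter by the name given above, e.g.:
# curl http://localhost:8000/v1/chat/completions \
#   -H "Content-Type: application/json" \
#   -d '{"model": "adapter", "messages": [{"role": "user", "content": "Hi"}]}'
```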

Usage with Transformers + PEFT

from transformers import AutoModelForVision2Seq, AutoProcessor
from peft import PeftModel

# Load base model
model = AutoModelForVision2Seq.from_pretrained("Qwen/Qwen2.5-VL-3B-Instruct")
processor = AutoProcessor.from_pretrained("Qwen/Qwen2.5-VL-3B-Instruct")

# Load adapter
model = PeftModel.from_pretrained(model, "prashanth058/qwen2.5-3b-vl-flickr-lora-vision")

# Generate
# ... (process your inputs)
outputs = model.generate(**inputs)
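The "process your inputs" step above typically means building a chat-style message list and running it through the processor. A minimal sketch, assuming a local image file `image.jpg` and the optional `qwen-vl-utils` helper package (both are assumptions, not part of this adapter):

```python
# Sketch of input preparation for Qwen2.5-VL; the image path is a placeholder.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "image.jpg"},
            {"type": "text", "text": "Describe this image."},
        ],
    }
]

# With `processor` and `model` loaded as above, the remaining steps would be:
# text = processor.apply_chat_template(
#     messages, tokenize=False, add_generation_prompt=True
# )
# image_inputs, _ = process_vision_info(messages)  # from qwen_vl_utils
# inputs = processor(text=[text], images=image_inputs, return_tensors="pt")
# outputs = model.generate(**inputs, max_new_tokens=128)
```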

Training Details

This adapter was trained to demonstrate vision layer adaptation capabilities in vLLM.

  • Dataset: Synthetic/small-scale training data
  • Training: PEFT LoRA with vision layer targeting
  • Purpose: Testing and demonstration

License

This adapter inherits the license of its base model, Qwen/Qwen2.5-VL-3B-Instruct.
