Vision LoRA Adapter

This is a LoRA adapter for a vision-language model, trained to adapt the vision tower and multimodal connector layers in addition to the language model layers.

Model Details

  • Base Model: Qwen/Qwen2.5-VL-3B-Instruct
  • LoRA Rank: 32
  • LoRA Alpha: 32
  • Target Modules:
    • Language Model: ✓
    • Vision Tower: ✓
    • Connector/Projector: ✓
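The module targeting above can be expressed as a PEFT `LoraConfig`. The sketch below is a hypothetical reconstruction: the specific `target_modules` names are assumptions for illustration (actual names depend on the base model's layer naming), while the rank and alpha values come from the Model Details list.

```python
# Hypothetical LoRA config matching the details above.
# The target_modules entries are ASSUMED names, not the adapter's actual config.
from peft import LoraConfig

lora_config = LoraConfig(
    r=32,           # LoRA rank, as listed in Model Details
    lora_alpha=32,  # LoRA alpha, as listed in Model Details
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # language model attention (assumed)
        "qkv", "proj",                           # vision tower attention (assumed)
        "mlp.0", "mlp.2",                        # connector/projector layers (assumed)
    ],
    task_type="CAUSAL_LM",
)
```

With equal rank and alpha (32/32), the effective LoRA scaling factor alpha/r is 1.0.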

Usage with vLLM

from vllm import LLM
from vllm.lora.request import LoRARequest

# Load model with LoRA support
llm = LLM(
    model="Qwen/Qwen2.5-VL-3B-Instruct",
    enable_lora=True,
    max_loras=1,
    max_lora_rank=32,
)

# Generate with LoRA
lora_request = LoRARequest("adapter", 1, "prashanth058/qwen2.5-3b-vl-flickr-lora-vision")
outputs = llm.generate(
    prompts=["<your prompt>"],
    lora_request=lora_request,
)
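The same adapter can also be served through vLLM's OpenAI-compatible server. This is a minimal sketch assuming a recent vLLM release; the adapter name `adapter` on the command line is an arbitrary label you choose, and clients select it via the `model` field.

```shell
# Serve the base model with the LoRA adapter registered (sketch).
vllm serve Qwen/Qwen2.5-VL-3B-Instruct \
  --enable-lora \
  --max-lora-rank 32 \
  --lora-modules adapter=prashanth058/qwen2.5-3b-vl-flickr-lora-vision

# Clients then request the adapter by the name given above, e.g.:
# curl http://localhost:8000/v1/chat/completions \
#   -H "Content-Type: application/json" \
#   -d '{"model": "adapter", "messages": [{"role": "user", "content": "Hi"}]}'
```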

Usage with Transformers + PEFT

from transformers import AutoModelForVision2Seq, AutoProcessor
from peft import PeftModel

# Load base model
model = AutoModelForVision2Seq.from_pretrained("Qwen/Qwen2.5-VL-3B-Instruct")
processor = AutoProcessor.from_pretrained("Qwen/Qwen2.5-VL-3B-Instruct")

# Load adapter
model = PeftModel.from_pretrained(model, "prashanth058/qwen2.5-3b-vl-flickr-lora-vision")

# Generate
# ... (process your inputs)
outputs = model.generate(**inputs)
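The "process your inputs" step above typically means building a chat-style message list and running it through the processor. A minimal sketch, assuming a local image file `image.jpg` and the optional `qwen-vl-utils` helper package (both are assumptions, not part of this adapter):

```python
# Sketch of input preparation for Qwen2.5-VL; the image path is a placeholder.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "image.jpg"},
            {"type": "text", "text": "Describe this image."},
        ],
    }
]

# With `processor` and `model` loaded as above, the remaining steps would be:
# text = processor.apply_chat_template(
#     messages, tokenize=False, add_generation_prompt=True
# )
# image_inputs, _ = process_vision_info(messages)  # from qwen_vl_utils
# inputs = processor(text=[text], images=image_inputs, return_tensors="pt")
# outputs = model.generate(**inputs, max_new_tokens=128)
```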

Training Details

This adapter was trained to demonstrate vision layer adaptation capabilities in vLLM.

  • Dataset: Synthetic/small-scale training data
  • Training: PEFT LoRA with vision layer targeting
  • Purpose: Testing and demonstration

License

This adapter inherits the license of its base model, Qwen/Qwen2.5-VL-3B-Instruct.
