So how do I pass multiple input images? (Plain Python code, without ComfyUI.)
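import torch
from PIL import Image

# Load both reference images as 512x512 RGB.
image1 = Image.open("./1.png").convert("RGB").resize((512, 512))
image2 = Image.open("./2.png").convert("RGB").resize((512, 512))

# `prompt` is defined elsewhere in my script.
inputs = {
    "image": [image1, image2],  # passing a list here triggers the error below
    "prompt": prompt,
    "generator": torch.manual_seed(0),
    "true_cfg_scale": 4.0,
    "negative_prompt": "",
    "num_inference_steps": 8,
}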
I tried passing multiple images to the pipeline as a list, but it does not work. The error is as follows:
"/home/zyy/miniconda3/envs/py311/lib/python3.11/site-packages/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py", line 1220, in get_placeholder_mask
    raise ValueError(
ValueError: Image features and image tokens do not match: tokens: 324, features: 648
image_paths = ["./1.png", "./2.png"]
prompt = "Wushu stance"

# Build one set of inputs per image instead of passing a list.
for image_path in image_paths:
    image = Image.open(image_path).convert("RGB").resize((512, 512))
    inputs = {
        "image": image,
        "prompt": prompt,
        "generator": torch.manual_seed(0),
        "true_cfg_scale": 4.0,
        "negative_prompt": "",
        "num_inference_steps": 8,
    }
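A possible reading of the error, offered as a guess rather than a confirmed diagnosis: the reported feature count (648) is exactly twice the token count (324), which would fit the vision encoder producing features for both images while the prompt only contains placeholder tokens for one. The sketch below just checks that arithmetic. It assumes the Qwen2.5-VL defaults of 14-pixel patches merged 2x2 (one token per 28x28 pixel area) and that the images reach the encoder at roughly 512x512; the helper name `tokens_for_square_image` is made up for illustration and is not part of any library.

```python
# Figures taken from the error message in the traceback.
reported_tokens = 324
reported_features = 648
num_images = 2  # length of the list passed as "image"

# Assumption: Qwen2.5-VL defaults of 14-pixel patches merged 2x2,
# so one image placeholder token covers a 28x28 pixel area.
PATCH_SIZE = 14
MERGE_SIZE = 2
pixels_per_token = PATCH_SIZE * MERGE_SIZE


def tokens_for_square_image(side):
    """Approximate token count for a square image (hypothetical helper).

    The side length is rounded to the nearest multiple of 28 pixels.
    """
    cells = round(side / pixels_per_token)
    return cells * cells


per_image = tokens_for_square_image(512)
print(per_image)               # 324
print(per_image * num_images)  # 648
```

If this reading is right, a list input would need the prompt to carry one image placeholder per image, which the pipeline version in use here may not do. Processing the images one at a time, as in the loop above, avoids the mismatch.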