I compared V10 with Qwen Image Edit 2509 + LoRAs

#110
by YZH9376 - opened

I ran a comparison between V10 and a combination of Qwen Image Edit 2509 fp8 + Lightning 4-steps LoRA + Snofs LoRA in an almost identical workflow.

The results are as follows (please pardon me for not being able to upload the NSFW test images):

V10 produced quite noticeable grid artifacts (though less pronounced than V8). Character consistency is very good (compared to V8), and the character skin tones and the overall image became more vibrant.

The base Qwen + LoRA combination had almost no grid artifacts. Character consistency was the same as V10, and the skin tones and image were slightly brightened.

Perhaps by following this method, we can test the LoRAs used in the AIO (All-in-One) one by one to identify the cause of the grid artifacts and color deviation.
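A minimal sketch of how such a one-by-one test could be planned, assuming a fixed baseline stack and a list of candidate LoRAs. The LoRA names and weights below are hypothetical placeholders (the actual AIO recipe is not listed in this thread), and the function only builds the list of stacks to try; running them through the workflow is left to the tester.

```python
from typing import Dict, List

def one_at_a_time(baseline: Dict[str, float],
                  candidates: Dict[str, float]) -> List[Dict[str, float]]:
    """Return one LoRA stack per candidate: the fixed baseline plus that single candidate."""
    runs = []
    for name, weight in candidates.items():
        stack = dict(baseline)   # copy so the baseline itself is never modified
        stack[name] = weight
        runs.append(stack)
    return runs

# Hypothetical names and weights, for illustration only.
baseline = {"lightning_4step": 1.0}
candidates = {"snofs": 1.0, "lora_b": 0.5, "lora_c": 0.5}

runs = one_at_a_time(baseline, candidates)
for stack in runs:
    print(stack)
```

Keeping the seed, sampler, and input image identical across the runs would make it easier to attribute any grid artifacts or color shift to the single LoRA that changed.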

Yes, I did experience grid artifacts from V9... SeedVR2 can make up for some of it, but it takes some time and hassle.

It's called refining
@Tofusang

I include far more LORAs than just snofs to hopefully cover more variety and situations. I also include both 4 step and 8 step LORAs partially to support a wider range of step counts. I've noticed minimal grid patterns using euler/beta. It is a balance and completely getting rid of grid artifacts could come at other costs.
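One rough way to reason about why this is a balance, assuming the standard LoRA formulation (the AIO's actual merge recipe is not described in this thread): each LoRA adds a low-rank update to a base weight matrix, scaled by its strength.

```latex
W' = W_0 + \sum_{i=1}^{n} w_i \, B_i A_i
```

Here $W_0$ is the base model weight, $B_i A_i$ is the low-rank product learned by LoRA $i$, and $w_i$ is the strength it is applied at. Because the updates simply add, every extra LoRA shifts the same weights, which is one plausible reason artifacts and color shifts could build up as more are included.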

Actually, you can try using a regular upscaling model to fix most artifacts, and it only takes 1-3 seconds

What's the model?

Been trying NSFW v10 with 8 steps, euler/beta, denoise 0.9. Hardly notice a grid issue. Character consistency seems to be very good. Impressed.

Also not sure if anyone else does so here, but I tend to use the second input image slot with the same image as image 1, unless I need to use image 2 for something else. Possibly placebo, but it feels like there's an improvement in consistency when I do that.

Keep up the good work!

Having great results, thanks! I add:

and get nearly always perfect consistency (no zooming or panning) when asking to edit something. Also, you can edit at higher megapixel counts and the model still works fine (tried up to 5 MP), which is frankly amazing. I don't think Flux Kontext can be pushed that far.
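For anyone wanting to try higher resolutions, here is a small sketch for picking a width and height that land near a target megapixel count while keeping the aspect ratio. The function name and the `multiple` rounding are assumptions for illustration: many latent-diffusion workflows want dimensions divisible by 8 or 16, so adjust it to whatever your workflow requires.

```python
import math
from typing import Tuple

def size_for_megapixels(width: int, height: int, megapixels: float,
                        multiple: int = 16) -> Tuple[int, int]:
    """Scale (width, height) to roughly `megapixels` million pixels, keeping aspect ratio.

    Each side is rounded to the nearest multiple of `multiple` (an assumed constraint).
    """
    scale = math.sqrt(megapixels * 1_000_000 / (width * height))
    new_w = max(multiple, round(width * scale / multiple) * multiple)
    new_h = max(multiple, round(height * scale / multiple) * multiple)
    return new_w, new_h

# Example: a square 1024x1024 input pushed to about 5 MP.
print(size_for_megapixels(1024, 1024, 5.0))
```

Because of the rounding, the result is only approximately the requested megapixel count and aspect ratio.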

OK, I finally got around to testing v10, and I have to say, @CamiloMM nailed it with this one. That combination is amazing for realism! I'm also using ddim/beta, which someone else mentioned getting great results with, and I'm blown away by the clarity and quality of the generations. I tested this same combination with the vanilla 2509 workflow, and v10 is MUCH better.

I have seen the "vagina testicles" issue still just using the loras mentioned here, adding the penis lora at 0.3 fixes that completely and doesn't take away from the quality of the image at all. This is my new baseline for generations. This is why I love knowledge sharing, we all want to get to the best combinations, having multiple testers trying different things and reporting results benefits everyone!

NOTE: I do get some SLIGHT gridlines adding all the loras above AND the penislora, but only when zoomed out, and because I am using these images with WAN the gridlines aren't even a factor as my 1024x1024 generation with v10 gets scaled down to 720x720 video generation anyway.

Glad it helps! May I also suggest, for anyone who doesn't know, you can have all your LoRAs on a list with this node, which is much more convenient than adding LoRAs one after the other as individual nodes:

image

Also, the Qwen template can be altered. I've been experimenting with it, and not sure if it helps yet, but I think more people should know this is one avenue of experimentation. Remember there's an LLM running each image you generate, and this template might shoehorn it into not paying attention to details you think matter!

image

It is connected to this node before going to the sampler:

image

You can find more about it and workflows here:
https://github.com/fblissjr/ComfyUI-QwenImageWanBridge

For the lora loader, where are the clips connecting to in the workflow?

Modern LoRA training tends to skip the text encoder entirely, so you can simply leave the CLIP unconnected.

From my testing, v10 seems to retain character identity for me, unlike in 9.

Something inside of it though really pushes the contrast and saturation hard and alters the original image's colors even when explicitly told not to do so.
This part I do not like. (Wayyyy less prevalent in 5.3)
But this seems like a huge step up from 9, at least for my use case. Time will tell.

I connect like so
image

But as TheNecr0mancer mentioned, I think LoRAs aren't trained on CLIP anyway. I'm not aware of any problem with connecting it regardless, and I believe it should be just as fast either way.

Can you link me the consistency lora please?

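I just tested this. The consistence_edit_v2 LoRA is okay, but adding the other two LoRAs (adorablegirls, A2R_2509_Base) tends to produce grid artifacts. I use consistence_edit_v2 regularly myself; since V7 in particular, I have been using it to suppress the effect of the beautification LoRA. But consistence_edit_v2 has a drawback: it affects the prompt quite a bit. After extensive testing, I found that prompt adherence drops somewhat once consistence_edit_v2 is added.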

I've been playing with these LORAs a bit myself. AdorableGirls seems to set a specific face theme (like the "flux chin") and reduces variety in my opinion, while also introducing grid artifacts. Anime2Realistic appeared to increase grid artifacts too. Consistency was ehhhhhh not really sure what it was doing... might have been making it harder to prompt because it is trained to stick to the original image much more (even when not desired). After quite a bit of testing, I think the only thing I might consider adding to the AIO is the PenisLora to reduce genitalia mixup, but I'm worried that might be a balance between starting to show dicks on the ladies when not desired if added too much.

Oh, I'm just suggesting adding the consistency LoRA and A2R at a low weight; please do not add the adorablegirls or penis LoRAs. I would be put off by the chicks with dicks, just as gay people would be put off by adorable girls. Those are stylistic.

But as for Consistency and A2R, even at a low weight they help make the edit snap to the original better, which is a huge plus. I suggest adding them at some low weight. Despite the name, I'm not using A2R for its anime to realism effect, it just adds consistency for some reason.

image
The upscaler I use is just a regular 4x upscaling model.

The LoRA that I found best for consistency is edit_0928_lora_step40000.safetensors. Although it is optimized for faces, it also tries to retain the overall characteristics of the original image while allowing for better prompt following. I prefer to load it as a separate LoRA so I can adjust its weight (0.10 to 0.40) depending on the prompt and image.
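A quick sketch of how that weight range could be swept systematically rather than by feel. The step size and the seed value are hypothetical choices, and the snippet only builds the list of runs to try; it does not call any ComfyUI API.

```python
from typing import List

def weight_sweep(start: float, stop: float, step: float) -> List[float]:
    """Return evenly spaced weights from start to stop inclusive, rounded to 2 decimals."""
    count = round((stop - start) / step)  # number of steps between start and stop
    return [round(start + i * step, 2) for i in range(count + 1)]

weights = weight_sweep(0.10, 0.40, 0.10)

# One planned run per weight; the fixed seed (a made-up value) keeps runs comparable.
planned = [
    {"lora": "edit_0928_lora_step40000.safetensors", "weight": w, "seed": 1234}
    for w in weights
]
for run in planned:
    print(run)
```

Holding the seed and prompt fixed while only the weight changes makes it easier to see where prompt following starts to suffer.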

@Shiny2480, for the LoRA you mentioned above, if I wanted to integrate it with the 3 LoRAs Camilo recommended, which one would I replace to try yours? His 3 give me good results for the most part, but I still have to play the seed lottery to get the consistency I want. Happy to try your recommendation with any others :)
