Model Summary

UnifiedReward-Edit-qwen3vl-4b is a unified reward model for both Text-to-Image and Image-to-Image generation. For the image editing reward task, our models support three evaluation modes:

Pairwise Rank — directly judge which of two edited images is better.

Pairwise Score — assign a separate score to each image in a pair.

Pointwise Score — rate a single image on two axes: instruction-following and overall image quality.
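The three modes differ mainly in the shape of the judgment they return. The sketch below models those shapes as plain Python dataclasses to make the difference concrete. Every class, field, and function name here is hypothetical and chosen for illustration; this is not the model's actual output format, and the assumption that a higher score means a better image is ours.

```python
from dataclasses import dataclass
from typing import Literal, Optional

# NOTE: all names below are hypothetical illustrations of the three modes,
# not the model's real output schema.


@dataclass
class PairwiseRank:
    """Pairwise Rank: which of the two edited images is better."""
    better: Literal["first", "second"]


@dataclass
class PairwiseScore:
    """Pairwise Score: one separate score per image in the pair."""
    first: float
    second: float


@dataclass
class PointwiseScore:
    """Pointwise Score: a single image rated on two axes."""
    instruction_following: float
    image_quality: float


def rank_from_scores(scores: PairwiseScore) -> Optional[PairwiseRank]:
    """Derive a ranking from a pair of scores (assumes higher is better).

    Returns None when the two scores are equal, since no ranking follows.
    """
    if scores.first == scores.second:
        return None
    winner = "first" if scores.first > scores.second else "second"
    return PairwiseRank(better=winner)
```

Under these assumptions, a Pairwise Score result carries strictly more information than a Pairwise Rank result, since a ranking can be derived from the two scores but not the other way around.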

🚀 The image editing reward inference code is available in the UnifiedReward-Edit/ directory; the T2I inference code is unchanged from previous models. The editing training data is preprocessed from EditScore, EditReward, and Pico-Nano-Banana. We sincerely thank all contributors!

For further details, please refer to the following resources:

📰 Paper: https://arxiv.org/pdf/2503.05236
🪐 Project Page: https://codegoat24.github.io/UnifiedReward/
🤗 Model Collections: https://huggingface.co/collections/CodeGoat24/unifiedreward-models-67c3008148c3a380d15ac63a
🤗 Dataset Collections: https://huggingface.co/collections/CodeGoat24/unifiedreward-training-data-67c300d4fd5eff00fa7f1ede
👋 Point of Contact: Yibin Wang

Citation

@article{unifiedreward,
  title={Unified reward model for multimodal understanding and generation},
  author={Wang, Yibin and Zang, Yuhang and Li, Hao and Jin, Cheng and Wang, Jiaqi},
  journal={arXiv preprint arXiv:2503.05236},
  year={2025}
}