Update README.md #5
opened by CVPIE

README.md CHANGED

---
license: apache-2.0
---

<div align="center">
<h1>X2Edit</h1>
<a href='https://arxiv.org/abs/2508.07607'><img src='https://img.shields.io/badge/arXiv-2508.07607-b31b1b.svg'></a>
<a href='https://huggingface.co/datasets/OPPOer/X2Edit-Dataset'><img src='https://img.shields.io/badge/🤗%20HuggingFace-X2Edit Dataset-ffd21f.svg'></a>
<a href='https://huggingface.co/OPPOer/X2Edit'><img src='https://img.shields.io/badge/🤗%20HuggingFace-X2Edit-ffd21f.svg'></a>
<a href='https://www.modelscope.cn/datasets/AIGCer-OPPO/X2Edit-Dataset'><img src='https://img.shields.io/badge/🤖%20ModelScope-X2Edit Dataset-purple.svg'></a>
<a href='https://github.com/OPPO-Mente-Lab/X2Edit'><img src="https://img.shields.io/badge/GitHub-OPPOer/X2Edit-blue.svg?logo=github" alt="GitHub"></a>
</div>

## Environment

Prepare the environment and install the required libraries:

```shell
$ git clone https://github.com/OPPO-Mente-Lab/X2Edit.git
$ cd X2Edit
$ conda create --name X2Edit python==3.11
$ conda activate X2Edit
$ pip install -r requirements.txt
```
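
After installation, a quick sanity check can confirm that PyTorch sees your GPU. This is only a minimal sketch, assuming `requirements.txt` installs a CUDA-enabled PyTorch build; on a CPU-only machine the second command will simply print `False`.

```shell
# Hypothetical sanity check: interpreter version, PyTorch version, and CUDA visibility.
$ python --version
$ python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```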

## Inference

We provide inference scripts for editing images at resolutions of **1024** and **512**. You can also choose the base model for X2Edit, including **[FLUX.1-Krea](https://huggingface.co/black-forest-labs/FLUX.1-Krea-dev)**, **[FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev)**, **[FLUX.1-schnell](https://huggingface.co/black-forest-labs/FLUX.1-schnell)**, **[PixelWave](https://huggingface.co/mikeyandfriends/PixelWave_FLUX.1-dev_03)**, and **[shuttle-3-diffusion](https://huggingface.co/shuttleai/shuttle-3-diffusion)**, and pick a LoRA to integrate with the MoE-LoRA, including **[Turbo-Alpha](https://huggingface.co/alimama-creative/FLUX.1-Turbo-Alpha)**, **[AntiBlur](https://huggingface.co/Shakker-Labs/FLUX.1-dev-LoRA-AntiBlur)**, **[Midjourney-Mix2](https://huggingface.co/strangerzonehf/Flux-Midjourney-Mix2-LoRA)**, **[Super-Realism](https://huggingface.co/strangerzonehf/Flux-Super-Realism-LoRA)**, and **[Chatgpt-Ghibli](https://huggingface.co/openfree/flux-chatgpt-ghibli-lora)**. Choose the models you like and download them. For the MoE-LoRA, we will open-source a unified checkpoint that can be used for both the 512 and 1024 resolutions.

Before executing the script, download **[Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B)**, which selects the task type for the input instruction, along with the base model (**FLUX.1-Krea**, **FLUX.1-dev**, **FLUX.1-schnell**, or **shuttle-3-diffusion**), the **[MLLM](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct)**, and **[Alignet](https://huggingface.co/OPPOer/X2I/blob/main/qwen2.5-vl-7b_proj.pt)**.
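
As one possible way to fetch these weights, the checkpoints can be pulled with `huggingface-cli`. This is a sketch: the `./ckpts/...` target directories are placeholders chosen for illustration, and any other download method works just as well.

```shell
# Hypothetical layout: download each checkpoint into a local ./ckpts directory.
$ pip install -U "huggingface_hub[cli]"
$ huggingface-cli download Qwen/Qwen3-8B --local-dir ./ckpts/Qwen3-8B
$ huggingface-cli download Qwen/Qwen2.5-VL-7B-Instruct --local-dir ./ckpts/Qwen2.5-VL-7B-Instruct
$ huggingface-cli download black-forest-labs/FLUX.1-dev --local-dir ./ckpts/FLUX.1-dev
$ huggingface-cli download OPPOer/X2Edit --local-dir ./ckpts/X2Edit
```

The resulting local paths are what you pass to the `--base_path`, `--qwen_path`, and `--lora_path` flags below.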

All scripts follow analogous command patterns: simply replace the script filename and keep the parameter configuration consistent.

```shell
$ python infer.py --device cuda --pixel 1024 --num_experts 12 --base_path BASE_PATH --qwen_path QWEN_PATH --lora_path LORA_PATH --extra_lora_path EXTRA_LORA_PATH
```

**device:** The device used for inference. default: `cuda`<br>
**pixel:** The resolution of the input image; choose from **[512, 1024]**. default: `1024`<br>
**num_experts:** The number of experts in the MoE. default: `12`<br>
**base_path:** The path of the base model.<br>
**qwen_path:** The path of the model used to select the task type for the input instruction. We use **Qwen3-8B** here.<br>
**lora_path:** The path of the MoE-LoRA in X2Edit.<br>
**extra_lora_path:** The path of an extra LoRA for plug-and-play use. default: `None`<br>
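
For concreteness, here is a hypothetical filled-in call. It is a sketch only: the `./ckpts/...` paths follow the placeholder layout used in the download example above, and the extra-LoRA path assumes you fetched one of the optional LoRAs listed earlier.

```shell
# Hypothetical 512-resolution run with FLUX.1-dev as the base model and an optional extra LoRA.
$ python infer.py \
    --device cuda \
    --pixel 512 \
    --num_experts 12 \
    --base_path ./ckpts/FLUX.1-dev \
    --qwen_path ./ckpts/Qwen3-8B \
    --lora_path ./ckpts/X2Edit \
    --extra_lora_path ./ckpts/FLUX.1-Turbo-Alpha
```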

## Citation

If you find our work helpful, please consider citing our paper and leaving us a valuable star.

```bibtex
@misc{ma2025x2editrevisitingarbitraryinstructionimage,
      title={X2Edit: Revisiting Arbitrary-Instruction Image Editing through Self-Constructed Data and Task-Aware Representation Learning},
      author={Jian Ma and Xujie Zhu and Zihao Pan and Qirong Peng and Xu Guo and Chen Chen and Haonan Lu},
      year={2025},
      eprint={2508.07607},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2508.07607},
}
```