Diffusers

CVPIE committed · verified
Commit 4d95820 · Parent(s): 6c90d07

Update README.md

Files changed (1): README.md (+58 -57)

README.md CHANGED

---
license: apache-2.0
---

<div align="center">
<h1>X2Edit</h1>
<a href='https://arxiv.org/abs/2508.07607'><img src='https://img.shields.io/badge/arXiv-2508.07607-b31b1b.svg'></a> &nbsp;
<a href='https://huggingface.co/datasets/OPPOer/X2Edit-Dataset'><img src='https://img.shields.io/badge/🤗%20HuggingFace-X2Edit Dataset-ffd21f.svg'></a>
<a href='https://huggingface.co/OPPOer/X2Edit'><img src='https://img.shields.io/badge/🤗%20HuggingFace-X2Edit-ffd21f.svg'></a>
<a href='https://www.modelscope.cn/datasets/AIGCer-OPPO/X2Edit-Dataset'><img src='https://img.shields.io/badge/🤖%20ModelScope-X2Edit Dataset-purple.svg'></a>
<a href='https://github.com/OPPO-Mente-Lab/X2Edit'><img src="https://img.shields.io/badge/GitHub-OPPOer/X2Edit-blue.svg?logo=github" alt="GitHub"></a>
</div>

## Environment

Prepare the environment and install the required libraries:

```shell
$ git clone https://github.com/OPPO-Mente-Lab/X2Edit.git
$ cd X2Edit
$ conda create --name X2Edit python==3.11
$ conda activate X2Edit
$ pip install -r requirements.txt
```
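
A quick way to sanity-check the environment (a minimal sketch, assuming PyTorch with CUDA support is pulled in by `requirements.txt`; this check is only illustrative, not part of the official setup):

```shell
# Confirm that PyTorch is importable and that a GPU is visible before running inference.
$ python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```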

## Inference
We provide inference scripts for editing images at resolutions of **1024** and **512**. You can also choose the base model for X2Edit, including **[FLUX.1-Krea](https://huggingface.co/black-forest-labs/FLUX.1-Krea-dev)**, **[FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev)**, **[FLUX.1-schnell](https://huggingface.co/black-forest-labs/FLUX.1-schnell)**, **[PixelWave](https://huggingface.co/mikeyandfriends/PixelWave_FLUX.1-dev_03)** and **[shuttle-3-diffusion](https://huggingface.co/shuttleai/shuttle-3-diffusion)**, as well as an extra LoRA to integrate with the MoE-LoRA, including **[Turbo-Alpha](https://huggingface.co/alimama-creative/FLUX.1-Turbo-Alpha)**, **[AntiBlur](https://huggingface.co/Shakker-Labs/FLUX.1-dev-LoRA-AntiBlur)**, **[Midjourney-Mix2](https://huggingface.co/strangerzonehf/Flux-Midjourney-Mix2-LoRA)**, **[Super-Realism](https://huggingface.co/strangerzonehf/Flux-Super-Realism-LoRA)** and **[Chatgpt-Ghibli](https://huggingface.co/openfree/flux-chatgpt-ghibli-lora)**. Choose the models you like and download them. For the MoE-LoRA, we will open-source a unified checkpoint that can be used for both the 512 and 1024 resolutions.

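One possible way to fetch these checkpoints (a minimal sketch, assuming a recent `huggingface_hub` CLI and that the MoE-LoRA weights are published in the `OPPOer/X2Edit` repository; the local directory names are only examples):

```shell
# Base model (FLUX.1-dev shown here; substitute any of the supported backbones).
$ huggingface-cli download black-forest-labs/FLUX.1-dev --local-dir ./ckpt/FLUX.1-dev
# X2Edit MoE-LoRA weights.
$ huggingface-cli download OPPOer/X2Edit --local-dir ./ckpt/X2Edit
# Optional extra LoRA for plug-and-play use (Turbo-Alpha shown here).
$ huggingface-cli download alimama-creative/FLUX.1-Turbo-Alpha --local-dir ./ckpt/FLUX.1-Turbo-Alpha
```
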
Before executing the script, download **[Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B)**, which is used to select the task type for the input instruction, a base model (**FLUX.1-Krea**, **FLUX.1-dev**, **FLUX.1-schnell** or **shuttle-3-diffusion**), the **[MLLM](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct)** and the **[Alignet](https://huggingface.co/OPPOer/X2I/blob/main/qwen2.5-vl-7b_proj.pt)**. All scripts follow analogous command patterns: simply replace the script filename while keeping the parameter configuration consistent.

```shell
$ python infer.py --device cuda --pixel 1024 --num_experts 12 --base_path BASE_PATH --qwen_path QWEN_PATH --lora_path LORA_PATH --extra_lora_path EXTRA_LORA_PATH
```

**device:** The device used for inference. default: `cuda`<br>
**pixel:** The resolution of the input image; choose from **[512, 1024]**. default: `1024`<br>
**num_experts:** The number of experts in the MoE. default: `12`<br>
**base_path:** The path of the base model.<br>
**qwen_path:** The path of the model used to select the task type for the input instruction. We use **Qwen3-8B** here.<br>
**lora_path:** The path of the MoE-LoRA in X2Edit.<br>
**extra_lora_path:** The path of an extra LoRA for plug-and-play use. default: `None`<br>

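For example, a 512-resolution run that stacks an extra style LoRA on top of the MoE-LoRA might look like this (the paths below are placeholders for wherever you downloaded the corresponding checkpoints, not fixed names):

```shell
$ python infer.py --device cuda --pixel 512 --num_experts 12 \
    --base_path ./ckpt/FLUX.1-dev \
    --qwen_path ./ckpt/Qwen3-8B \
    --lora_path ./ckpt/X2Edit \
    --extra_lora_path ./ckpt/FLUX.1-Turbo-Alpha
```
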
## Citation

🌟 If you find our work helpful, please consider citing our paper and leaving a star.

```bibtex
@misc{ma2025x2editrevisitingarbitraryinstructionimage,
      title={X2Edit: Revisiting Arbitrary-Instruction Image Editing through Self-Constructed Data and Task-Aware Representation Learning},
      author={Jian Ma and Xujie Zhu and Zihao Pan and Qirong Peng and Xu Guo and Chen Chen and Haonan Lu},
      year={2025},
      eprint={2508.07607},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2508.07607},
}
```