---
language:
  - en
base_model:
  - BAAI/Emu3-Stage1
---

# EARL - SFT (S) (8B)

- **Model Name:** mair-lab/sft-simple
- **Model Size:** 8B parameters
- **Base Model:** BAAI/Emu3-Stage1
- **Training Method:** Supervised Fine-Tuning (SFT)
- **Dataset:** Simple Edit (S)

This model is part of the EARL benchmark effort introduced in our paper:
👉 EARL: The Promise of RL for Autoregressive Image Editing

## Model Summary

This SFT model is fine-tuned from Emu3 with direct supervision on the Simple Edit (S) dataset. It is optimized for general-purpose autoregressive image editing and does not require intermediate reasoning steps. In our evaluation it achieves the top OmniEdit score among the compared models and competitive results on the other editing benchmarks (see the table below).

➡️ **Inference script and usage:** GitHub Repo
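
The snippet below is a minimal loading sketch, not the official pipeline: it assumes the checkpoint loads through the standard `transformers` Auto classes with `trust_remote_code=True`, and the prompt string is an illustrative placeholder. The exact prompt template and the Emu3 image (de)tokenization steps are in the inference script of the GitHub repo linked above.

```python
# Minimal loading sketch (assumption: the checkpoint is compatible with the
# standard transformers Auto* classes; see the GitHub repo for the full
# inference pipeline, including the Emu3 vision tokenizer).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mair-lab/sft-simple"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Autoregressive editing feeds the instruction plus the tokenized source image
# as one prompt and decodes the edited image from the generated tokens.
# The prompt below is illustrative only; follow the repo's script for the
# exact template and the image encoding/decoding steps.
prompt = "Edit instruction: make the sky sunset orange."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```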

## Benchmark Results (Avg Score Across Benchmarks)

| Model           | Base Model | OmniEdit | EmuEdit | AURORA | MB   | VisMin | I2EBench | AVG  |
|-----------------|------------|----------|---------|--------|------|--------|----------|------|
| MagicBrush      | SD v1.5    | 3.43     | 3.28    | 3.01   | 3.64 | 3.48   | 3.06     | 3.32 |
| InstructPix2Pix | SD v1.5    | 3.97     | 3.24    | 3.05   | 3.12 | 2.94   | 3.23     | 3.26 |
| AURORA          | SD v1.5    | 4.50     | 4.40    | 4.12   | 4.62 | 3.82   | 3.58     | 4.17 |
| OmniGen*        | -          | 5.68     | 5.00    | 4.10   | 4.68 | 4.09   | 4.68     | 4.70 |
| SFT (S)         | Emu3       | 5.73     | 3.66    | 3.58   | 3.19 | 3.57   | 3.59     | 3.88 |

📈 **Note:** The Emu3-based SFT (S) model achieves top results among all open-source supervised models on OmniEdit and competitive performance across the other benchmarks.

## Use Cases

- Open-ended and instruction-guided image editing
- Object, attribute, style, and environment changes