---
language:
- en
library_name: diffusers
license: mit
pipeline_tag: image-to-image
---

# Arc2Face Model Card
[**Project Page**](https://arc2face.github.io/) **|** [**Original Paper (ArXiv)**](https://arxiv.org/abs/2403.11641) **|** [**Expression Adapter Paper (HF)**](https://huggingface.co/papers/2510.04706) **|** [**Code**](https://github.com/foivospar/Arc2Face) **|** [🤗 **Gradio demo**](https://huggingface.co/spaces/FoivosPar/Arc2Face)
## Introduction

Arc2Face is an ID-conditioned face model that can generate diverse, ID-consistent photos of a person given only their ArcFace ID embedding. It is trained on a restored version of the WebFace42M face recognition database and further fine-tuned on FFHQ and CelebA-HQ. Arc2Face has been extended with a fine-grained **Expression Adapter**, enabling the generation of any subject under any facial expression (even rare, asymmetric, subtle, or extreme ones). This extension is detailed in the paper [ID-Consistent, Precise Expression Generation with Blendshape-Guided Diffusion](https://huggingface.co/papers/2510.04706).
## Model Details

Arc2Face consists of two components:

- **encoder**, a fine-tuned CLIP ViT-L/14 model
- **arc2face**, a fine-tuned UNet model

both of which are fine-tuned from [runwayml/stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5). The encoder is tailored for projecting ID-embeddings to the CLIP latent space. Arc2Face adapts the pre-trained backbone to the task of ID-to-face generation, conditioned solely on ID vectors.

## ControlNet (pose)

We also provide a ControlNet model trained on top of Arc2Face for pose control (a usage sketch follows the generation example below).
## Download Models

The models can be downloaded directly from this repository or using Python:

```python
from huggingface_hub import hf_hub_download

hf_hub_download(repo_id="FoivosPar/Arc2Face", filename="arc2face/config.json", local_dir="./models")
hf_hub_download(repo_id="FoivosPar/Arc2Face", filename="arc2face/diffusion_pytorch_model.safetensors", local_dir="./models")
hf_hub_download(repo_id="FoivosPar/Arc2Face", filename="encoder/config.json", local_dir="./models")
hf_hub_download(repo_id="FoivosPar/Arc2Face", filename="encoder/pytorch_model.bin", local_dir="./models")
hf_hub_download(repo_id="FoivosPar/Arc2Face", filename="controlnet/config.json", local_dir="./models")
hf_hub_download(repo_id="FoivosPar/Arc2Face", filename="controlnet/diffusion_pytorch_model.safetensors", local_dir="./models")
```

Please check our [GitHub repository](https://github.com/foivospar/Arc2Face) for complete inference instructions.

## Sample Usage with Diffusers

To use the Arc2Face model with the `diffusers` library, first load the pipeline components:

```python
from diffusers import (
    StableDiffusionPipeline,
    UNet2DConditionModel,
    DPMSolverMultistepScheduler,
)
from arc2face import CLIPTextModelWrapper, project_face_embs

import torch
from insightface.app import FaceAnalysis
from PIL import Image
import numpy as np

# Arc2Face is built upon SD1.5
# The repo below can be used instead of the now deprecated 'runwayml/stable-diffusion-v1-5'
base_model = 'stable-diffusion-v1-5/stable-diffusion-v1-5'

encoder = CLIPTextModelWrapper.from_pretrained(
    'models', subfolder="encoder", torch_dtype=torch.float16
)

unet = UNet2DConditionModel.from_pretrained(
    'models', subfolder="arc2face", torch_dtype=torch.float16
)

pipeline = StableDiffusionPipeline.from_pretrained(
    base_model,
    text_encoder=encoder,
    unet=unet,
    torch_dtype=torch.float16,
    safety_checker=None
)
pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config)
pipeline = pipeline.to('cuda')
```

Then, pick an image to extract the ID-embedding and generate images:

```python
app = FaceAnalysis(name='antelopev2', root='./', providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])
app.prepare(ctx_id=0, det_size=(640, 640))

# download the example image from this repository (any face photo can be used instead)
from huggingface_hub import hf_hub_download
img_path = hf_hub_download(repo_id="FoivosPar/Arc2Face", filename="assets/examples/joacquin.png")
img = np.array(Image.open(img_path))[:,:,::-1]   # RGB -> BGR for insightface

faces = app.get(img)
faces = sorted(faces, key=lambda x:(x['bbox'][2]-x['bbox'][0])*(x['bbox'][3]-x['bbox'][1]))[-1]  # select largest face (if more than one detected)
id_emb = torch.tensor(faces['embedding'], dtype=torch.float16)[None].cuda()
id_emb = id_emb/torch.norm(id_emb, dim=1, keepdim=True)   # normalize embedding
id_emb = project_face_embs(pipeline, id_emb)    # pass through the encoder
```
Finally, generate images:

```python
num_images = 4
images = pipeline(
    prompt_embeds=id_emb,
    num_inference_steps=25,
    guidance_scale=3.0,
    num_images_per_prompt=num_images
).images
```
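The returned `images` are standard PIL images and can be saved with their `save` method. For pose control with the ControlNet checkpoint downloaded above, the sketch below is a minimal, hedged example using the generic `diffusers` ControlNet API rather than the full Arc2Face ControlNet pipeline; `pose_image` is a placeholder for a prepared pose conditioning image, whose exact format is described in the [GitHub repository](https://github.com/foivospar/Arc2Face).

```python
# Minimal sketch (assumed usage, not the official Arc2Face ControlNet pipeline):
# reuses base_model, encoder, unet, and id_emb from the snippets above.
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

controlnet = ControlNetModel.from_pretrained(
    'models', subfolder="controlnet", torch_dtype=torch.float16
)

cn_pipeline = StableDiffusionControlNetPipeline.from_pretrained(
    base_model,
    controlnet=controlnet,
    text_encoder=encoder,
    unet=unet,
    torch_dtype=torch.float16,
    safety_checker=None
).to('cuda')

images = cn_pipeline(
    prompt_embeds=id_emb,
    image=pose_image,          # placeholder: pose conditioning image (see the GitHub repo)
    num_inference_steps=25,
    guidance_scale=3.0,
).images
```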
## Limitations and Bias

- Only one person per image can be generated.
- Poses are constrained to the frontal hemisphere, similar to FFHQ images.
- The model may reflect the biases of the training data or the ID encoder.

## Citation

If you find Arc2Face useful for your research, please consider citing us:

**BibTeX for Arc2Face:**

```bibtex
@inproceedings{paraperas2024arc2face,
      title={Arc2Face: A Foundation Model for ID-Consistent Human Faces},
      author={Paraperas Papantoniou, Foivos and Lattas, Alexandros and Moschoglou, Stylianos and Deng, Jiankang and Kainz, Bernhard and Zafeiriou, Stefanos},
      booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
      year={2024}
}
```

Additionally, if you use the Expression Adapter, please also cite the extension:

**BibTeX for Expression Adapter:**

```bibtex
@inproceedings{paraperas2025arc2face_exp,
      title={ID-Consistent, Precise Expression Generation with Blendshape-Guided Diffusion},
      author={Paraperas Papantoniou, Foivos and Zafeiriou, Stefanos},
      booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
      year={2025}
}
```