Update model card: Add Expression Adapter paper, pipeline tag, and diffusers usage
#1
by nielsr (HF Staff) · opened
README.md CHANGED

---
language:
- en
library_name: diffusers
license: mit
pipeline_tag: image-to-image
---

# Arc2Face Model Card

<div align="center">

[**Project Page**](https://arc2face.github.io/) **|** [**Original Paper (ArXiv)**](https://arxiv.org/abs/2403.11641) **|** [**Expression Adapter Paper (HF)**](https://huggingface.co/papers/2510.04706) **|** [**Code**](https://github.com/foivospar/Arc2Face) **|** [🤗 **Gradio demo**](https://huggingface.co/spaces/FoivosPar/Arc2Face)

</div>

Arc2Face is an ID-conditioned face model that can generate diverse, ID-consistent photos of a person given only their ArcFace ID-embedding.
It is trained on a restored version of the WebFace42M face recognition database and is further fine-tuned on FFHQ and CelebA-HQ.

Arc2Face has been extended with a fine-grained **Expression Adapter**, enabling the generation of any subject under any facial expression (even rare, asymmetric, subtle, or extreme ones). This extension is detailed in the paper [ID-Consistent, Precise Expression Generation with Blendshape-Guided Diffusion](https://huggingface.co/papers/2510.04706).

<div align="center">
<img src='https://huggingface.co/foivospar/Arc2Face/resolve/main/assets/exp_teaser.jpg'>
</div>

<div align="center">
<img src='https://huggingface.co/foivospar/Arc2Face/resolve/main/assets/samples_short.jpg'>
</div>

## Model Details

Arc2Face adapts the pre-trained backbone to the task of ID-to-face generation, conditioned on ArcFace ID-embeddings.

We also provide a ControlNet model trained on top of Arc2Face for pose control.

<div align="center">
<img src='https://huggingface.co/foivospar/Arc2Face/resolve/main/assets/controlnet_short.jpg'>
</div>
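
The pose-control inference code itself is not shown in this card; the following is only a rough sketch of how the weights could be plugged into the standard `diffusers` ControlNet pipeline. The `controlnet` subfolder name matches the download snippet below, `base_model`, `encoder`, and `unet` are assumed to be defined as in the Sample Usage section, and pose-image preparation follows the GitHub repo:

```python
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# Assumes the ControlNet weights (plus a config.json) were fetched to
# ./models/controlnet as in the download snippet below.
controlnet = ControlNetModel.from_pretrained(
    'models', subfolder="controlnet", torch_dtype=torch.float16
)
pipeline = StableDiffusionControlNetPipeline.from_pretrained(
    base_model,
    text_encoder=encoder,
    unet=unet,
    controlnet=controlnet,
    torch_dtype=torch.float16,
    safety_checker=None,
)
# Generation then takes an extra pose-conditioning image:
# images = pipeline(prompt_embeds=id_emb, image=pose_image, num_inference_steps=25).images
```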

## Download Models

The models can be downloaded directly from this repository or using python:

```python
from huggingface_hub import hf_hub_download

# NOTE: only the controlnet filename below is confirmed by this card; the
# arc2face and encoder filenames are assumed from the subfolders used in the
# Sample Usage section. The repo may contain additional files (e.g. the ArcFace
# ONNX model); verify with the listing snippet below.
hf_hub_download(repo_id="FoivosPar/Arc2Face", filename="arc2face/config.json", local_dir="./models")
hf_hub_download(repo_id="FoivosPar/Arc2Face", filename="arc2face/diffusion_pytorch_model.safetensors", local_dir="./models")
hf_hub_download(repo_id="FoivosPar/Arc2Face", filename="encoder/config.json", local_dir="./models")
hf_hub_download(repo_id="FoivosPar/Arc2Face", filename="encoder/pytorch_model.safetensors", local_dir="./models")
hf_hub_download(repo_id="FoivosPar/Arc2Face", filename="controlnet/diffusion_pytorch_model.safetensors", local_dir="./models")
```
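
Since some of the filenames above are inferred rather than quoted, the repository contents can be verified with the standard `huggingface_hub` listing API:

```python
from huggingface_hub import list_repo_files

# Prints every file in the model repo, e.g. 'arc2face/...', 'encoder/...', 'controlnet/...'.
for filename in list_repo_files("FoivosPar/Arc2Face"):
    print(filename)
```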

Please check our [GitHub repository](https://github.com/foivospar/Arc2Face) for complete inference instructions.

## Sample Usage with Diffusers

To use the Arc2Face model with the `diffusers` library, first load the pipeline components:

```python
from diffusers import (
    StableDiffusionPipeline,
    UNet2DConditionModel,
    DPMSolverMultistepScheduler,
)

from arc2face import CLIPTextModelWrapper, project_face_embs

import torch
from insightface.app import FaceAnalysis
from PIL import Image
import numpy as np

# Arc2Face is built upon SD1.5. The 'runwayml/stable-diffusion-v1-5' repository
# is deprecated; the mirror 'stable-diffusion-v1-5/stable-diffusion-v1-5' hosts
# the same weights and can be substituted if the original becomes unavailable.
base_model = 'runwayml/stable-diffusion-v1-5'

encoder = CLIPTextModelWrapper.from_pretrained(
    'models', subfolder="encoder", torch_dtype=torch.float16
)

unet = UNet2DConditionModel.from_pretrained(
    'models', subfolder="arc2face", torch_dtype=torch.float16
)

pipeline = StableDiffusionPipeline.from_pretrained(
    base_model,
    text_encoder=encoder,
    unet=unet,
    torch_dtype=torch.float16,
    safety_checker=None
)
pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config)
pipeline = pipeline.to('cuda')
```
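
On GPUs with limited memory, `diffusers`' standard offloading helper can be used in place of `pipeline.to('cuda')` (an optional convenience, not part of the original instructions; requires `accelerate`):

```python
# Keeps submodules on CPU and moves each to the GPU only while it runs,
# trading some speed for a much smaller memory footprint.
pipeline.enable_model_cpu_offload()
```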

Then, pick an image and extract the ID-embedding:

```python
app = FaceAnalysis(name='antelopev2', root='./', providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])
app.prepare(ctx_id=0, det_size=(640, 640))

# PIL cannot open a URL directly, so fetch the example image first
# (requests is an extra dependency used only for this example).
from io import BytesIO
import requests

url = 'https://huggingface.co/foivospar/Arc2Face/resolve/main/assets/examples/joacquin.png'
img = np.array(Image.open(BytesIO(requests.get(url).content)).convert('RGB'))[:,:,::-1]  # RGB -> BGR for insightface

faces = app.get(img)
faces = sorted(faces, key=lambda x:(x['bbox'][2]-x['bbox'][0])*(x['bbox'][3]-x['bbox'][1]))[-1]  # select largest face (if more than one detected)
id_emb = torch.tensor(faces['embedding'], dtype=torch.float16)[None].cuda()
id_emb = id_emb/torch.norm(id_emb, dim=1, keepdim=True)  # normalize embedding
id_emb = project_face_embs(pipeline, id_emb)  # pass through the encoder
```
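
After `project_face_embs`, the ID-embedding has been mapped into the token-embedding sequence that SD1.5's cross-attention consumes, which is why it can be passed directly as `prompt_embeds` below. A quick sanity check (the exact shape is an assumption based on the SD1.5 text-encoder output format, not stated in this card):

```python
# One "prompt" of 77 tokens with 768-dim features, as expected by the UNet.
print(id_emb.shape)  # expected: torch.Size([1, 77, 768])
```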

<div align="center">
<img src='https://huggingface.co/foivospar/Arc2Face/resolve/main/assets/examples/joacquin.png' style='width:25%;'>
</div>

Finally, generate images:

```python
num_images = 4
images = pipeline(prompt_embeds=id_emb, num_inference_steps=25, guidance_scale=3.0, num_images_per_prompt=num_images).images
```
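
The pipeline returns a list of standard PIL images, so the results can be saved directly (illustrative snippet; filenames are arbitrary):

```python
# Write each generated sample to disk.
for i, image in enumerate(images):
    image.save(f'arc2face_sample_{i}.png')
```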

<div align="center">
<img src='https://huggingface.co/foivospar/Arc2Face/resolve/main/assets/samples.jpg'>
</div>

## Limitations and Bias

- Only one person per image can be generated.


## Citation

If you find Arc2Face useful for your research, please consider citing us:

**BibTeX for Arc2Face:**

```bibtex
@inproceedings{paraperas2024arc2face,
  title={Arc2Face: A Foundation Model for ID-Consistent Human Faces},
  author={Paraperas Papantoniou, Foivos and Lattas, Alexandros and Moschoglou, Stylianos and Deng, Jiankang and Kainz, Bernhard and Zafeiriou, Stefanos},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
  year={2024}
}
```

Additionally, if you use the Expression Adapter, please also cite the extension:

**BibTeX for Expression Adapter:**

```bibtex
@inproceedings{paraperas2025arc2face_exp,
  title={ID-Consistent, Precise Expression Generation with Blendshape-Guided Diffusion},
  author={Paraperas Papantoniou, Foivos and Zafeiriou, Stefanos},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
  year={2025}
}
```