MetaCLIP-2-Gender-Identifier is an image classification model fine-tuned from the facebook/metaclip-2-worldwide-s16 vision-language encoder for a single-label classification task. It predicts the gender of a person in an image using the MetaClip2ForImageClassification architecture.
MetaCLIP 2: A Worldwide Scaling Recipe: https://huggingface.co/papers/2507.22062
Classification Report:
              precision    recall  f1-score   support

      female     0.9815    0.9631    0.9722      1600
        male     0.9638    0.9819    0.9728      1600

    accuracy                         0.9725      3200
   macro avg     0.9727    0.9725    0.9725      3200
weighted avg     0.9727    0.9725    0.9725      3200
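To make the relationship between the columns of the report concrete, the sketch below recomputes precision, recall, F1, and accuracy from a 2x2 table of confusion counts. The counts are not published with the model; they are an assumption, inferred by multiplying each reported recall by the support of 1600 and rounding to whole images. The helper name `per_class_metrics` is illustrative only.

```python
# Confusion counts inferred from the reported recall and support values.
# They are an assumption, not figures published with the model.
# Layout: counts[true_label][predicted_label]
counts = {
    "female": {"female": 1541, "male": 59},
    "male": {"female": 29, "male": 1571},
}


def per_class_metrics(counts, label):
    """Return (precision, recall, f1) for one class from confusion counts."""
    tp = counts[label][label]
    # False negatives: images of this class predicted as another class
    fn = sum(n for pred, n in counts[label].items() if pred != label)
    # False positives: images of other classes predicted as this class
    fp = sum(row[label] for true, row in counts.items() if true != label)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


metrics = {label: per_class_metrics(counts, label) for label in counts}
total = sum(sum(row.values()) for row in counts.values())
accuracy = sum(counts[label][label] for label in counts) / total

for label, (p, r, f1) in metrics.items():
    print(f"{label:>8}  precision={p:.4f}  recall={r:.4f}  f1={f1:.4f}")
print(f"accuracy={accuracy:.4f}")
```

Under these inferred counts, the values rounded to four decimals agree with the report, which suggests roughly 59 female images were predicted as male and 29 male images as female.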
The model categorizes images into two gender classes: female (class 0) and male (class 1).
!pip install -q transformers torch pillow gradio

import gradio as gr
import torch
from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image

# Model name from Hugging Face Hub
model_name = "prithivMLmods/MetaCLIP-2-Gender-Identifier"

# Load processor and model
processor = AutoImageProcessor.from_pretrained(model_name)
model = AutoModelForImageClassification.from_pretrained(model_name)
model.eval()

# Define labels (class index -> class name)
LABELS = {
    0: "female",
    1: "male"
}

def gender_classification(image):
    """Predict the gender of a person from an image."""
    # Gradio passes the upload as a NumPy array; convert it to an RGB PIL image
    image = Image.fromarray(image).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    # Run inference without tracking gradients
    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits
        probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()

    # Map each class index to its label and rounded probability
    predictions = {LABELS[i]: round(probs[i], 3) for i in range(len(probs))}
    return predictions

# Build Gradio interface
iface = gr.Interface(
    fn=gender_classification,
    inputs=gr.Image(type="numpy", label="Upload Image"),
    outputs=gr.Label(label="Predicted Gender"),
    title="MetaCLIP-2-Gender-Identifier",
    description="Upload an image to predict the person's gender."
)

# Launch app
if __name__ == "__main__":
    iface.launch()
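The post-processing step inside the prediction function of the script above (softmax over the logits, then mapping class indices to labels) can be illustrated without loading the model. The following is a minimal pure-Python sketch; the function name `logits_to_scores` and the example logits are hypothetical and do not come from real model output.

```python
import math

# Class index to label mapping, mirroring the LABELS dictionary in the script
LABELS = {0: "female", 1: "male"}


def logits_to_scores(logits):
    """Convert raw logits to a {label: probability} dictionary via softmax."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return {LABELS[i]: round(e / total, 3) for i, e in enumerate(exps)}


# Hypothetical logits for a single image, not real model output
scores = logits_to_scores([2.0, 0.0])
top_label = max(scores, key=scores.get)
print(scores, top_label)
```

The dictionary has the same shape as the one the Gradio `Label` component receives in the script, and taking the key with the highest score gives the single predicted class.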
The MetaCLIP-2-Gender-Identifier model is designed to classify images into gender categories.