UG Food Detection Model

This model identifies food ingredients and kitchen utensils in images, and estimates portion sizes.

Model Description

This Vision Transformer (ViT) model is trained on the UG Food Dataset to recognize:

  • Food ingredients: Various food items and ingredients
  • Kitchen utensils: Cooking tools and equipment
  • Portion sizes: Measurement estimates

Classes

The model predicts one of 40 classes, covering the ingredient, utensil, and portion-size categories above.

Usage

from transformers import ViTImageProcessor, ViTForImageClassification
from PIL import Image
import torch

# Load model and processor
processor = ViTImageProcessor.from_pretrained("ssevan/ug-food-detector")
model = ViTForImageClassification.from_pretrained("ssevan/ug-food-detector")

# Process image (convert to RGB to handle grayscale or RGBA inputs)
image = Image.open('food_image.jpg').convert('RGB')
inputs = processor(images=image, return_tensors='pt')

# Get predictions
with torch.no_grad():
    outputs = model(**inputs)
    probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class_idx = torch.argmax(probabilities, dim=-1).item()

# Map the class index to its human-readable label
print(f'Predicted class: {model.config.id2label[predicted_class_idx]}')
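The softmax and argmax steps above can be sketched in plain Python to show what the classification head computes, using dummy logits so no model download is needed:

```python
import math

def softmax(logits):
    # Numerically stable softmax: subtract the max before exponentiating
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Dummy logits standing in for outputs.logits for one image
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)           # probabilities sum to 1.0
pred = max(range(len(probs)), key=probs.__getitem__)  # argmax -> 0 here
```

The index `pred` is what `model.config.id2label` would translate into a class name.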

Mobile Usage

This model is optimized for mobile deployment.
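The card does not specify an export format for mobile. As one hypothetical route, TorchScript tracing produces a file loadable by PyTorch Mobile; the sketch below traces a small stand-in module rather than the actual ViT, purely to illustrate the workflow:

```python
import torch
import torch.nn as nn

class TinyClassifier(nn.Module):
    """Small stand-in for the real ViT, used only to demonstrate tracing."""
    def __init__(self, num_classes=40):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)   # collapse spatial dims
        self.fc = nn.Linear(3, num_classes)   # 3 channels -> 40 classes

    def forward(self, x):
        # x: (batch, 3, H, W) image tensor
        pooled = self.pool(x).flatten(1)      # (batch, 3)
        return self.fc(pooled)                # (batch, num_classes)

model = TinyClassifier().eval()
example = torch.randn(1, 3, 224, 224)        # dummy 224x224 RGB input

# Trace with an example input and save for on-device loading
traced = torch.jit.trace(model, example)
traced.save("ug_food_mobile.pt")
```

For the actual model, the same `torch.jit.trace` call would be applied to the loaded `ViTForImageClassification` (with its dict output handled appropriately), or an ONNX export could be used instead.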

Model Details

85.8M parameters, F32 tensors, stored in Safetensors format.