# UG Food Detection Model

This model identifies food ingredients and kitchen utensils, and estimates portion sizes, from images.
## Model Description
This Vision Transformer (ViT) model is trained on the UG Food Dataset to recognize:
- Food ingredients: Various food items and ingredients
- Kitchen utensils: Cooking tools and equipment
- Portion sizes: Measurement estimates
## Classes

The model classifies images into 40 classes spanning food ingredients, kitchen utensils, and portion-size categories.
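The class names ship with the checkpoint: `model.config.id2label` maps each of the 40 indices to a label string. A minimal sketch of that lookup, using hypothetical label names (the real names come from the downloaded config):

```python
# Hypothetical subset of the id2label mapping; the real names come
# from model.config.id2label in the downloaded checkpoint.
id2label = {0: "tomato", 1: "spoon", 2: "half-portion"}

def label_for(index: int, mapping: dict) -> str:
    """Return the class name for a predicted index, with a fallback."""
    return mapping.get(index, f"unknown ({index})")

print(label_for(1, id2label))  # spoon
```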
## Usage
```python
from transformers import ViTImageProcessor, ViTForImageClassification
from PIL import Image
import torch

# Load model and processor
processor = ViTImageProcessor.from_pretrained("ssevan/ug-food-detector")
model = ViTForImageClassification.from_pretrained("ssevan/ug-food-detector")
model.eval()

# Process image (convert to RGB in case the file is grayscale or RGBA)
image = Image.open("food_image.jpg").convert("RGB")
inputs = processor(image, return_tensors="pt")

# Get predictions
with torch.no_grad():
    outputs = model(**inputs)

probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)
predicted_class_idx = torch.argmax(probabilities, dim=-1).item()
predicted_label = model.config.id2label[predicted_class_idx]
print(f"Predicted class: {predicted_label} (index {predicted_class_idx})")
```
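To surface more than the single best class, the logits can be ranked after softmax. A self-contained sketch in plain Python on dummy logits (in practice the values come from `outputs.logits[0].tolist()`):

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(probs, k=3):
    """Return (index, probability) pairs for the k most likely classes."""
    return sorted(enumerate(probs), key=lambda p: p[1], reverse=True)[:k]

# Dummy logits standing in for outputs.logits[0].tolist()
logits = [0.1, 2.5, -1.0, 1.2]
for idx, p in top_k(softmax(logits), k=2):
    print(f"class {idx}: {p:.3f}")
```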
## Mobile Usage
This model is optimized for mobile deployment.
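The card does not state the checkpoint size. Assuming a ViT-Base backbone (roughly 86M parameters, which this card does not confirm), back-of-envelope memory math shows why weight quantization matters for on-device use:

```python
# Rough model-size estimate; assumes a ViT-Base backbone (~86M
# parameters), an assumption not confirmed by the model card.
params = 86_000_000
bytes_fp32 = params * 4   # 32-bit float weights
bytes_int8 = params * 1   # 8-bit quantized weights

mib = 1024 * 1024
print(f"fp32: {bytes_fp32 / mib:.0f} MiB")
print(f"int8: {bytes_int8 / mib:.0f} MiB")
```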