ALBEF Price Prediction Model
This is a multimodal ALBEF (Align Before Fuse) model trained for product price prediction using both images and text descriptions.
Model Description
- Architecture: ALBEF with ResNet50 vision encoder and BERT text encoder
- Task: Product price prediction from images and catalog descriptions
- Training: Cross-modal fusion with contrastive learning
Latest Metrics (Epoch 3)
| Metric | Value |
|---|---|
| Validation Loss | 0.8670 |
| RMSE | 29.24 |
| MAE | 13.00 |
| SMAPE | 56.06% |
| MAPE | 73.07% |
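The card does not say how these metrics were computed. The sketch below shows one common set of definitions in dependency-free Python. The function name `regression_metrics` and the specific SMAPE variant (mean of 2|error| / (|true| + |predicted|), bounded by 200%) are assumptions made for illustration, not the author's evaluation code, and it assumes strictly positive prices.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, SMAPE (%) and MAPE (%) for two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    # Assumed SMAPE variant: mean of 2|e| / (|y| + |y_hat|), so it never exceeds 200%.
    smape = 100.0 * sum(
        2.0 * abs(e) / (abs(t) + abs(p)) for e, t, p in zip(errors, y_true, y_pred)
    ) / n
    # MAPE divides by the true price, so it is undefined when a true price is zero.
    mape = 100.0 * sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n
    return {"rmse": rmse, "mae": mae, "smape": smape, "mape": mape}

# Toy example with two hypothetical prices.
metrics = regression_metrics([10.0, 20.0], [12.0, 15.0])
```

Under these definitions MAPE has no upper bound while SMAPE is capped, so the two percentages are not directly comparable.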
Training Configuration
- Vision Encoder: ResNet50 (pretrained)
- Text Encoder: BERT-base-uncased
- Hidden Dimension: 1024
- Cross-modal Layers: 6
- Optimizer: AdamW with a cosine annealing learning-rate schedule
- Loss: Combined MSE + SMAPE + Contrastive
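For readers unfamiliar with cosine annealing, the sketch below computes the learning rate at a given step for a single cosine decay, using the same closed form as PyTorch's `CosineAnnealingLR`. The peak rate, step count, and minimum rate are hypothetical; the card does not state them, nor whether warmup or restarts were used.

```python
import math

def cosine_annealing_lr(step, total_steps, base_lr, min_lr=0.0):
    """Learning rate at `step` for one cosine decay from base_lr down to min_lr."""
    step = min(max(step, 0), total_steps)  # clamp to the schedule's range
    cosine = 0.5 * (1.0 + math.cos(math.pi * step / total_steps))
    return min_lr + (base_lr - min_lr) * cosine

# Hypothetical values: peak rate 1e-4 decayed over 1000 steps.
schedule = [cosine_annealing_lr(s, 1000, 1e-4) for s in (0, 500, 1000)]
```

The rate starts at `base_lr`, passes through the midpoint halfway through training, and reaches `min_lr` at the final step.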
Usage
import torch
from transformers import AutoTokenizer
from PIL import Image
import torchvision.transforms as T
from huggingface_hub import hf_hub_download

# Download checkpoint
checkpoint_path = hf_hub_download(
    repo_id="Rudra12567/albef-price-prediction",
    filename="best_model.pth",
)

# Load checkpoint onto the CPU so that no GPU is required
checkpoint = torch.load(checkpoint_path, map_location="cpu")

# Initialize your model (the model class is not included in this snippet)
# and load the state_dict
# model.load_state_dict(checkpoint['model_state_dict'])
# model.eval()

# Prepare image (resize, then normalize with ImageNet mean and std)
transform = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
])
image = Image.open('product.jpg').convert('RGB')
pixel_values = transform(image).unsqueeze(0)  # add a batch dimension

# Prepare text
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
text_inputs = tokenizer(
    "Product description here",
    truncation=True,
    padding='max_length',
    max_length=128,
    return_tensors='pt',
)

# Predict
with torch.no_grad():
    outputs = model(pixel_values, text_inputs)
    price_log = outputs['price_pred']
    price = torch.expm1(price_log)  # map the log-space prediction back to a price
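The `torch.expm1` call at the end of the usage code suggests that the model predicts `log(1 + price)` rather than the raw price. The sketch below illustrates that round trip with the standard library only. The helper names `encode_price` and `decode_price` are hypothetical, and the `log1p` target is an inference from the usage code rather than something the card states explicitly.

```python
import math

def encode_price(price):
    """Map a non-negative price to the log space assumed for the regression target."""
    if price < 0:
        raise ValueError("price must be non-negative")
    return math.log1p(price)

def decode_price(price_log):
    """Invert encode_price; mirrors the torch.expm1 call in the usage code."""
    return math.expm1(price_log)

# Round trip on a hypothetical price.
recovered = decode_price(encode_price(19.99))
```

Using `log1p` rather than a plain logarithm keeps a price of zero well defined, since it maps to zero in log space.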
Training Details
- Trained on product images and catalog descriptions
- Prices are log-transformed for regression; the usage code inverts this with `torch.expm1`, which implies a `log1p` target
- Multi-task learning with contrastive objectives
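The contrastive objective mentioned above is not specified further in this card. As a rough illustration, the sketch below implements a symmetric InfoNCE loss over paired image and text embeddings in plain Python. It is a simplified stand-in, not the author's training code: it omits refinements such as the momentum distillation described for ALBEF, the temperature of 0.07 is an assumed value, and how this term is weighted against the MSE and SMAPE terms is not stated. A real implementation would operate on batched tensors.

```python
import math

def _normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def _cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)

def image_text_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of paired image/text embeddings."""
    img = [_normalize(v) for v in image_emb]
    txt = [_normalize(v) for v in text_emb]
    # sim[i][j]: cosine similarity of image i and text j, divided by the temperature
    sim = [[sum(a * b for a, b in zip(i, t)) / temperature for t in txt] for i in img]
    sim_t = [list(col) for col in zip(*sim)]  # text-to-image logits
    return 0.5 * (_cross_entropy_diagonal(sim) + _cross_entropy_diagonal(sim_t))

# Toy batch: matched pairs should score a much lower loss than swapped pairs.
aligned = image_text_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = image_text_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

The loss pulls each image embedding toward its own description and pushes it away from the other descriptions in the batch, which is the alignment step that gives ALBEF its name.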
License
Apache 2.0