Swin Transformer for Fish Classification


This Swin-Tiny transformer was trained to classify fish species from the fish-vista dataset. The dataset is heavily class-imbalanced. Although I tried to mitigate the effect of this imbalance, the model's accuracy still varies with class frequency. I therefore recommend using these weights as a baseline for building a classification or segmentation model on a fish dataset; the model is not suitable for any task out of the box.

Usage

Load the model and its image processor:

import torch
from PIL import Image
from transformers import AutoImageProcessor
from transformers import AutoModel

# 1. Load the model and processor from the Hub
repo_id = "paulprt/swin-fish-classification"

# This downloads the code from the Hub and initializes the classes automatically
processor = AutoImageProcessor.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModel.from_pretrained(repo_id, trust_remote_code=True)

You can find the class mappings using:

model.config.id2label  # Mapping from class IDs to labels
model.config.label2id  # Mapping from labels to class IDs
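
To turn a model output into readable predictions, the logits can be passed through a softmax and looked up in `id2label`. The sketch below is dependency-free and works on a plain list of floats. With the model loaded above, one would typically pass something like `outputs.logits[0].tolist()`, assuming the remote code returns an object with a `logits` attribute, which is not confirmed here. The `top_k_labels` helper and the example labels are hypothetical and only serve as illustration.

```python
import math


def top_k_labels(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs.

    `logits` is a flat list of floats, one per class.
    `id2label` maps integer class IDs to label strings.
    """
    # Numerically stable softmax: shift by the maximum before exponentiating.
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank class IDs by descending probability and keep the first k.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Hypothetical mapping and logits, standing in for model.config.id2label
# and a real model output.
example_id2label = {0: "species_a", 1: "species_b", 2: "species_c"}
example_logits = [0.5, 2.0, -1.0]

for label, prob in top_k_labels(example_logits, example_id2label, k=2):
    print(f"{label}: {prob:.3f}")
```

Given the class imbalance, it is probably worth looking at the top few predictions rather than only the argmax, since rare species may rank below a frequent look-alike.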

Training

The model was trained for 80 epochs on an A10 GPU: first the classifier head alone for 30 epochs, then the entire model for 50 more epochs.

[Figure: training loss curve]

Results

Following the original fish-vista paper, I computed the accuracy on the test dataset across class-frequency bins (ultra rare 2-10, minority 10-100, neutral 100-500, majority 500+).

[Figure: test accuracy per class-frequency bin]
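
A binned accuracy of this kind can be sketched as follows. This is a minimal illustration, not the evaluation script used for the figure. It assumes the bins are half-open intervals on the number of samples per class (2 ≤ n < 10, 10 ≤ n < 100, and so on), that the counts come from the training split, and that accuracy is averaged over test samples within each bin. The names `BINS`, `bin_name`, and `accuracy_per_bin` are hypothetical.

```python
from collections import defaultdict

# Assumed half-open bins [low, high) on per-class sample counts.
BINS = [
    ("ultra rare", 2, 10),
    ("minority", 10, 100),
    ("neutral", 100, 500),
    ("majority", 500, float("inf")),
]


def bin_name(count):
    """Return the bin a class falls into, or None if it fits no bin."""
    for name, low, high in BINS:
        if low <= count < high:
            return name
    return None  # e.g. classes with fewer than 2 samples


def accuracy_per_bin(y_true, y_pred, class_counts):
    """Per-sample accuracy, grouped by the frequency bin of the true class."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        name = bin_name(class_counts[truth])
        if name is None:
            continue  # skip samples whose class is outside every bin
        total[name] += 1
        correct[name] += int(truth == pred)
    return {name: correct[name] / total[name] for name in total}


# Toy data for illustration only; these are not results from the model.
toy_counts = {"a": 5, "b": 50, "c": 600}
toy_true = ["a", "a", "b", "c", "c"]
toy_pred = ["a", "b", "b", "c", "a"]
print(accuracy_per_bin(toy_true, toy_pred, toy_counts))
```

Reporting one number per bin keeps frequent species from dominating the score, which is the point of the binned evaluation for an imbalanced dataset.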
