🌿 Sisigoks/FloraSense

FloraSense is a fine-tuned Vision Transformer (ViT) model for accurate classification of plant species and flora-related imagery. It builds on the google/vit-base-patch16-224 base model and is fine-tuned on the Planter_GARDEN_EDITION dataset curated by Sisigoks, which contains over 10,000 diverse plant images.


🧠 Model Description

  • Architecture: Vision Transformer (ViT)
  • Base Model: google/vit-base-patch16-224
  • Task: Image Classification
  • Use Case: Automated plant and flora species recognition in digital botany, garden classification systems, plant care apps, biodiversity projects, and educational tools.

📊 Model Performance

  • Evaluation Accuracy: 35.46%
  • Evaluation Loss: 4.2894
  • Epochs Trained: 10
  • Evaluation Speed:
    • 33.9 samples/sec
    • 2.12 steps/sec

⚠️ While the accuracy may appear modest, the model distinguishes over 10,000 highly similar plant species, a non-trivial fine-grained classification challenge; for comparison, random-chance accuracy at that scale would be roughly 0.01%.


🧪 Training Procedure

| Hyperparameter        | Value                    |
|-----------------------|--------------------------|
| Learning Rate         | 5e-5                     |
| Train Batch Size      | 16                       |
| Eval Batch Size       | 16                       |
| Gradient Accumulation | 4                        |
| Total Effective Batch | 64                       |
| Optimizer             | Adam (β1=0.9, β2=0.999)  |
| Scheduler             | Linear w/ warmup (10%)   |
| Epochs                | 15                       |
| Seed                  | 42                       |
  • Framework: PyTorch
  • Libraries: Transformers 4.45.1, Datasets 3.0.1, Tokenizers 0.20.0
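The effective batch size in the table follows from the per-device batch size times the gradient-accumulation steps, and the 10% warmup fixes the warmup steps as a fraction of total optimizer steps. A quick sketch of that arithmetic (the total step count below is illustrative, not taken from the actual training run):

```python
per_device_train_batch_size = 16
gradient_accumulation_steps = 4

# Effective batch size per optimizer update:
# gradients from 4 micro-batches of 16 are accumulated before each step
effective_batch = per_device_train_batch_size * gradient_accumulation_steps
print(effective_batch)  # 64

# Linear scheduler with 10% warmup: warmup steps are a fixed fraction
# of total optimizer steps (total_steps here is a hypothetical value)
total_steps = 1000
warmup_steps = int(0.10 * total_steps)
print(warmup_steps)  # 100
```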

📚 Dataset

  • Name: Sisigoks/Planter_GARDEN_EDITION
  • Type: Image Classification
  • Language: English
  • Scope: Over 10,000 unique plant and floral species
  • Format: Real-world garden and nature photography
  • Use Case: Realistic and diverse training scenarios for classification models
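Each species class in the dataset maps to an integer ID, and in a fine-tuned checkpoint that mapping is typically stored in the model config as id2label / label2id. A minimal sketch of how such a mapping is built, using hypothetical species names (the real class list ships with the checkpoint's config.json):

```python
# Hypothetical species names for illustration only;
# the actual class list lives in the model's config.json
class_names = ["Rosa rubiginosa", "Tulipa gesneriana", "Lavandula angustifolia"]

# Forward and reverse lookups between class IDs and species names
id2label = {i: name for i, name in enumerate(class_names)}
label2id = {name: i for i, name in id2label.items()}

print(id2label[1])                   # Tulipa gesneriana
print(label2id["Rosa rubiginosa"])   # 0
```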

✅ Intended Use

Use Cases

  • Botanical image recognition apps
  • Educational tools for students and researchers
  • Smart gardening & plant care solutions
  • Field-use flora identification via AR and mobile apps

Target Users

  • Botanists
  • AI and ML researchers
  • Gardeners and farmers
  • Biology educators and students

⚠️ Limitations

  • May confuse visually similar species due to fine-grained class diversity.
  • Performance could degrade in poor lighting or occlusion-heavy environments.
  • Biases may exist based on the geographic scope of the dataset (e.g., underrepresentation of tropical or rare plants).

πŸ” Ethical Considerations

  • Accuracy: Misclassification of medicinal/toxic plants can have real-world safety implications.
  • Bias: Regional, lighting, or season-specific training data may skew predictions in certain environments.
  • Usage: This is a research-grade model and should not be relied on for critical decisions without expert validation.

🚀 How to Use

```python
from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image
import torch

# Load model and processor
processor = AutoImageProcessor.from_pretrained("Sisigoks/FloraSense")
model = AutoModelForImageClassification.from_pretrained("Sisigoks/FloraSense")

# Load and preprocess image (convert to RGB to handle grayscale/RGBA inputs)
image = Image.open("your_image.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

# Inference
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits
    predicted_id = logits.argmax(-1).item()

print(f"Predicted class ID: {predicted_id}")
print(f"Predicted label: {model.config.id2label[predicted_id]}")
```
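Beyond the single argmax class ID, it is often useful to inspect the top-k softmax probabilities, e.g. to flag low-confidence predictions for expert review. A self-contained sketch of that post-processing (pure Python for illustration; in practice you would pass it `outputs.logits[0].tolist()`):

```python
import math

def topk_probs(logits, k=3):
    """Softmax over raw logits, then return the k highest (index, prob) pairs."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return sorted(enumerate(probs), key=lambda p: p[1], reverse=True)[:k]

# Placeholder logits; real values come from model(**inputs).logits
print(topk_probs([2.0, 0.5, 1.0], k=2))
```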

📄 Citation

If you use this model or dataset in your work, please cite:

  @misc{sisigoks_florasense_2025,
    author = {Sisigoks},
    title = {FloraSense: ViT-based Fine-Grained Plant Classifier},
    year = {2025},
    publisher = {Hugging Face},
    howpublished = {\url{https://huggingface.co/Sisigoks/FloraSense}}
  }

🙌 Acknowledgements

  • Hugging Face 🤗 – for providing the model and dataset hosting infrastructure.
  • Google Research – for the original ViT architecture that enabled scalable vision transformers.