Model Description

This is a Stable Diffusion model fine-tuned to generate realistic pedestrian-perspective images of crosswalks. It was trained on a dataset of 150 first-person-view (FPV) images, captured primarily in sunny conditions, to enable controlled text-to-image generation for data augmentation in crosswalk segmentation tasks.

  • Base model: Stable Diffusion v1.4
  • Fine-tuning method: Text-to-image fine-tuning on a custom FPV crosswalk dataset
  • Components:
    • unet: fine-tuned U-Net weights
    • vae: fine-tuned VAE weights
  • Intended use: Synthetic data generation for semantic segmentation augmentation
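
The component list above suggests the fine-tuned weights can be dropped into a standard Stable Diffusion v1.4 pipeline. The following is a minimal sketch of that idea using the diffusers library, not the repository's own script. It assumes that the base checkpoint is available under the Hub id CompVis/stable-diffusion-v1-4 and that the fine-tuned weights are stored in diffusers format in unet/ and vae/ subfolders of this repository; neither detail is confirmed by the model card. The function and constant names are illustrative.

```python
"""Sketch: load the fine-tuned U-Net and VAE into a Stable Diffusion v1.4 pipeline."""

# Assumption: Hub id of the Stable Diffusion v1.4 base checkpoint.
BASE_MODEL_ID = "CompVis/stable-diffusion-v1-4"
# Repository named in this model card.
FINETUNED_REPO_ID = "kromic/sd-crosswalk-augmentation"


def load_pipeline(finetuned_repo=FINETUNED_REPO_ID, base_model=BASE_MODEL_ID, device="cpu"):
    """Return a pipeline whose U-Net and VAE come from the fine-tuned repository."""
    # Imported lazily so this module can be inspected without the heavy dependencies.
    from diffusers import AutoencoderKL, StableDiffusionPipeline, UNet2DConditionModel

    # Assumption: weights live in `unet/` and `vae/` subfolders in diffusers format.
    unet = UNet2DConditionModel.from_pretrained(finetuned_repo, subfolder="unet")
    vae = AutoencoderKL.from_pretrained(finetuned_repo, subfolder="vae")

    # The text encoder, tokenizer, and scheduler are taken from the base model.
    pipe = StableDiffusionPipeline.from_pretrained(base_model, unet=unet, vae=vae)
    return pipe.to(device)


def main():
    """Generate one image from the card's example prompt and save it to disk."""
    pipe = load_pipeline()
    image = pipe("a crosswalk image").images[0]
    image.save("crosswalk.png")
```

Calling main() downloads the weights on first use and writes crosswalk.png to the working directory. Passing device="cuda" to load_pipeline moves the pipeline to a GPU if one is available.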

Use Cases

  • Data augmentation for crosswalk segmentation models
  • Generating diverse weather and lighting scenarios (e.g., fog, rain, snow, night) from text prompts
  • Research on assistive navigation systems for visually impaired pedestrians
  • Benchmarking model generalization across diverse environments
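
One way to cover the weather and lighting scenarios listed above is to derive a prompt per condition from a single base prompt. The sketch below shows that pattern in plain Python. The condition phrases and the helper name build_prompts are hypothetical examples, not prompts taken from the training setup, and may need tuning to produce good results with this model.

```python
# Hypothetical condition phrases; adjust the wording to suit your own data.
CONDITIONS = {
    "fog": "in dense fog",
    "rain": "in heavy rain",
    "snow": "covered in light snow",
    "night": "at night under street lights",
}


def build_prompts(base_prompt, conditions=CONDITIONS):
    """Return {condition_name: prompt} by appending each phrase to the base prompt."""
    return {name: f"{base_prompt}, {phrase}" for name, phrase in conditions.items()}


# Build one prompt per condition from the card's example prompt.
prompts = build_prompts("a crosswalk image")
for name, prompt in prompts.items():
    print(f"{name}: {prompt}")
```

Each resulting string can be passed as the prompt to the inference script or to a diffusers pipeline, and the dictionary key can be reused as a filename suffix to keep the generated images grouped by condition.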

How to Use

You can generate images with the provided Python inference script:

# Clone the repository
git clone https://huggingface.co/kromic/sd-crosswalk-augmentation
cd sd-crosswalk-augmentation

# Install dependencies
pip install diffusers transformers torch

# Run inference
python generate.py

# Customize your prompt (assumed to be the prompt variable in generate.py)
prompt = "a crosswalk image"