Model Description
This model is a Stable Diffusion model fine-tuned to generate realistic pedestrian-perspective images of crosswalks. It was fine-tuned on a dataset of 150 first-person view (FPV) images, captured primarily in sunny conditions, to enable controlled text-to-image generation for data augmentation in crosswalk segmentation tasks.
- Base model: Stable Diffusion v1.4 (CompVis/stable-diffusion-v1-4)
- Fine-tuning method: Text-to-image fine-tuning using custom FPV crosswalk dataset
- Components (see the loading sketch after this list):
  - `unet`: fine-tuned U-Net weights
  - `vae`: fine-tuned VAE weights
- Intended use: Synthetic data generation for semantic segmentation augmentation
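
Below is a minimal sketch of how these components could be loaded with the `diffusers` library. The `unet` and `vae` subfolder names are assumptions based on the component list above; the remaining pipeline parts are taken from the base model.

```python
import torch
from diffusers import AutoencoderKL, StableDiffusionPipeline, UNet2DConditionModel

REPO = "kromic/sd-crosswalk-augmentation"

# Assumption: the fine-tuned weights are stored in `unet/` and `vae/` subfolders of this repo.
unet = UNet2DConditionModel.from_pretrained(REPO, subfolder="unet", torch_dtype=torch.float16)
vae = AutoencoderKL.from_pretrained(REPO, subfolder="vae", torch_dtype=torch.float16)

# Text encoder, tokenizer, and scheduler come from the base Stable Diffusion v1.4 checkpoint.
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    unet=unet,
    vae=vae,
    torch_dtype=torch.float16,
).to("cuda")
```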
Use Cases
- Data augmentation for crosswalk segmentation models
- Generating diverse weather and lighting scenarios (e.g., fog, rain, snow, night) from text prompts (see the prompt-loop sketch after this list)
- Research on assistive navigation systems for visually impaired pedestrians
- Benchmarking model generalization across diverse environments
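
As an illustration of the weather and lighting use case, the loop below generates one image per condition from a prompt template. The template wording and sampling parameters are illustrative assumptions; `pipe` is the pipeline loaded in the sketch above.

```python
# Generate one image per weather/lighting condition for augmentation.
conditions = ["fog", "rain", "snow", "night"]
for cond in conditions:
    # Assumed prompt template; adjust to match the phrasing used during fine-tuning.
    prompt = f"a first-person view of a pedestrian crosswalk in {cond}"
    image = pipe(prompt, num_inference_steps=30, guidance_scale=7.5).images[0]
    image.save(f"crosswalk_{cond}.png")
```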
How to Use
You can generate images with the provided Python inference script:

```bash
# Clone the repository
git clone https://huggingface.co/kromic/sd-crosswalk-augmentation
cd sd-crosswalk-augmentation

# Install dependencies
pip install diffusers transformers torch

# Run inference
python generate.py
```

To customize the output, edit the prompt inside `generate.py`:

```python
prompt = "a crosswalk image"
```
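
If you call the pipeline directly instead of going through `generate.py`, fixing the random seed makes the generated augmentation set reproducible. The snippet below is a sketch that assumes the `pipe` object loaded earlier.

```python
import torch

# Assumption: `pipe` is the StableDiffusionPipeline loaded in the Model Description section.
generator = torch.Generator(device="cuda").manual_seed(42)  # fixed seed for reproducibility
image = pipe("a crosswalk image", generator=generator).images[0]
image.save("crosswalk.png")
```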