Sprite-flow

Flow-based generative model for unguided generation of 128x128 RGBA pixel art characters.

Model Details

Model Description

  • Developed by: Mihailo Radović
  • Model type: Unconditional Image Generation
  • License: MIT

Model Sources

Uses

Direct Use

Predicts the vector field used to generate 128x128 RGBA pixel art character images from an isotropic Gaussian distribution by simulating an ODE with linear noise scheduling.
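
To make the sampling procedure concrete, here is a minimal, framework-free sketch of Euler integration along a linear path. It assumes the schedule alpha_t = t, beta_t = 1 - t (an assumption about what the repository's linear schedulers implement) and replaces the trained network with the closed-form conditional vector field toward a single known target z. The names `conditional_vector_field` and `euler_simulate` are illustrative and not part of the repository.

```python
import random


def conditional_vector_field(x, z, t):
    """Closed-form field for alpha_t = t, beta_t = 1 - t: u_t(x | z) = (z - x) / (1 - t)."""
    return [(zi - xi) / (1.0 - t) for xi, zi in zip(x, z)]


def euler_simulate(x0, z, num_steps):
    """Integrate dx/dt = u_t(x | z) from t = 0 to t = 1 with explicit Euler steps."""
    x = list(x0)
    h = 1.0 / num_steps
    for k in range(num_steps):
        t = k * h  # the field is only evaluated at t < 1, so 1 - t never reaches zero
        u = conditional_vector_field(x, z, t)
        x = [xi + h * ui for xi, ui in zip(x, u)]
    return x


random.seed(0)
target = [0.5, -0.25, 1.0]                        # stand-in for one data point
noise = [random.gauss(0.0, 1.0) for _ in target]  # sample from N(0, I)
sample = euler_simulate(noise, target, num_steps=200)
print(sample)  # lands on the target up to floating-point error
```

In the actual model the network's prediction takes the place of `conditional_vector_field`, so samples follow the learned data distribution rather than converging to one fixed target.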

Out-of-Scope Use

Could also be used with a cosine or any other noise scheduler, although the model described here relies on linear noise scheduling.
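
For orientation, the sketch below contrasts a linear schedule with one common cosine-style alternative. Both the linear formulas (alpha_t = t, beta_t = 1 - t) and the trigonometric variant are assumptions made for illustration; the repository's scheduler classes may define them differently, and the function names here are hypothetical.

```python
import math


def linear_schedule(t):
    """Return (alpha_t, beta_t) for the assumed linear schedule."""
    return t, 1.0 - t


def cosine_schedule(t):
    """Return (alpha_t, beta_t) for a hypothetical cosine-style schedule."""
    return math.sin(0.5 * math.pi * t), math.cos(0.5 * math.pi * t)


# Both satisfy the boundary conditions a Gaussian path needs:
# alpha_0 = 0, beta_0 = 1 (pure noise) and alpha_1 = 1, beta_1 = 0 (pure data).
for name, schedule in [("linear", linear_schedule), ("cosine", cosine_schedule)]:
    for t in (0.0, 0.5, 1.0):
        alpha, beta = schedule(t)
        print(f"{name:>6}  t={t:.1f}  alpha={alpha:.3f}  beta={beta:.3f}")
```

The learned vector field depends on the schedule used during training, so switching to a different schedule generally means retraining rather than only changing the sampler.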

How to Get Started with the Model

  • Step 1: Clone the GitHub repo

  • Step 2: Initialize the model:

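    import torch

    from models.unet import PixelArtUNet

    # Use a GPU when available; the later steps reuse `device`
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    model = PixelArtUNet(
        channels=[128, 256, 512, 1024],
        num_residual_layers=2,
        t_embed_dim=128,
        midcoder_dropout_p=0.2,
    ).to(device)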
    
  • Step 3: Load the model weights:

    from huggingface_hub import hf_hub_download
    from safetensors.torch import load_file
    
    repo_id = "mradovic38/sprite-flow"
    filename = "model.safetensors"
    file_path = hf_hub_download(repo_id=repo_id, filename=filename)
    checkpoint = load_file(file_path)
    model.load_state_dict(checkpoint)
    model.to(device)
    model.eval()
    
  • Step 4: Initialize the probability path:

    from sampling.conditional_probability_path import GaussianConditionalProbabilityPath
    from sampling.noise_scheduling import LinearAlpha, LinearBeta
    
    path = GaussianConditionalProbabilityPath(
        p_data=None,
        p_simple_shape=[4, 128, 128],
        alpha=LinearAlpha(),
        beta=LinearBeta()
    ).to(device)
    path.eval()
    
  • Step 5: Simulate ODE:

    import torch
    
    from diff_eq.ode_sde import UnguidedVectorFieldODE
    from diff_eq.simulator import EulerSimulator
    
    num_timesteps = 200 # example number of timesteps
    num_samples = 3 # example number of samples
    
    ts = torch.linspace(0, 1, num_timesteps).view(1, -1, 1, 1, 1).expand(num_samples, -1, 1, 1, 1).to(device)
    x0 = path.p_simple.sample(num_samples).to(device)  # (num_samples, 4, 128, 128)
    ode = UnguidedVectorFieldODE(model)
    simulator = EulerSimulator(ode)
    x1 = simulator.simulate(x0, ts)  # (num_samples, 4, 128, 128)
    
  • Step 6: Convert the torch tensor to PIL images:

    from utils.helpers import tensor_to_rgba_image, normalize_to_unit
    
    x1 = normalize_to_unit(x1) # [-1, 1] -> [0, 1]
    imgs = tensor_to_rgba_image(x1)
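
The conversion in Step 6 can be pictured per channel value as follows. This is a plain-Python sketch of the range mapping only; the actual `normalize_to_unit` and `tensor_to_rgba_image` helpers operate on tensors and may clamp or round differently, and the names `to_unit` and `to_byte` are hypothetical.

```python
def to_unit(value):
    """Map a value from [-1, 1] to [0, 1], clamping anything outside the range."""
    return min(1.0, max(0.0, (value + 1.0) / 2.0))


def to_byte(unit_value):
    """Scale a [0, 1] value to an 8-bit channel intensity in 0..255."""
    return int(round(unit_value * 255))


# One RGBA pixel as a model might emit it, with a slight overshoot in alpha.
pixel = [-1.0, 0.0, 0.5, 1.2]
rgba = tuple(to_byte(to_unit(v)) for v in pixel)
print(rgba)
```

Clamping before scaling matters because an ODE sample is not guaranteed to stay inside [-1, 1], and out-of-range values would otherwise wrap or overflow when cast to 8-bit channels.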
    