Sprite-flow

Flow-based generative model for unguided generation of 128x128 RGBA pixel art characters.

Model Details

Model Description

Developed by: Mihailo Radović
Model type: Unconditional Image Generation
License: MIT

Model Sources

Repository: GitHub Repo
Demo: Gradio App

Uses

Direct Use

The model predicts the vector field that carries samples from an isotropic Gaussian distribution to 128x128 RGBA pixel art character images. Images are generated by simulating the resulting ODE with linear noise scheduling.
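
In equation form, the sampling procedure can be sketched as follows. The symbols are introduced here for illustration and do not appear in the card: u_theta is the learned vector field, z is a data sample and epsilon is Gaussian noise. The linear schedule is assumed to be alpha_t = t and beta_t = 1 - t, which is the usual convention but is not spelled out above. The first line is what sampling integrates; the second is the Gaussian conditional path that such a model is typically trained on.

```latex
\begin{aligned}
x_0 &\sim \mathcal{N}(0, I), \qquad \frac{\mathrm{d}x_t}{\mathrm{d}t} = u_\theta(x_t, t), \quad t \in [0, 1], \\
x_t &= \alpha_t z + \beta_t \epsilon, \qquad \alpha_t = t, \quad \beta_t = 1 - t .
\end{aligned}
```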

Out-of-Scope Use

Using the model with a cosine or any other noise scheduler is possible, but out of scope.
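
For orientation, the difference between a linear and a cosine schedule can be sketched in a few lines. This is only an illustration: the function names are hypothetical, the sine/cosine form is one common choice rather than anything taken from the repository, and the linear form assumes alpha_t = t and beta_t = 1 - t. Both schedules start at pure noise (alpha = 0, beta = 1) and end at pure data (alpha = 1, beta = 0).

```python
import math


def linear_alpha(t: float) -> float:
    """Data weight of the assumed linear schedule."""
    return t


def linear_beta(t: float) -> float:
    """Noise weight of the assumed linear schedule."""
    return 1.0 - t


def cosine_alpha(t: float) -> float:
    """Data weight of one common cosine-style schedule (hypothetical here)."""
    return math.sin(0.5 * math.pi * t)


def cosine_beta(t: float) -> float:
    """Noise weight of one common cosine-style schedule (hypothetical here)."""
    return math.cos(0.5 * math.pi * t)


# Compare the two schedules at a few points in time
for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(
        f"t={t:.2f}  linear=({linear_alpha(t):.3f}, {linear_beta(t):.3f})  "
        f"cosine=({cosine_alpha(t):.3f}, {cosine_beta(t):.3f})"
    )
```

With this cosine form, alpha_t^2 + beta_t^2 = 1 for every t, whereas the linear schedule keeps alpha_t + beta_t = 1.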

How to Get Started with the Model

Step 1: Clone the GitHub repo (the imports in the following steps come from it).

Step 2: Initialize the model:

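import torch

from models.unet import PixelArtUNet

# Use a GPU when available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = PixelArtUNet(
    channels=[128, 256, 512, 1024],
    num_residual_layers=2,
    t_embed_dim=128,
    midcoder_dropout_p=0.2,
).to(device)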

Step 3: Load the model weights:

from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

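repo_id = "mradovic38/sprite-flow"
filename = "model.safetensors"

# Download the weights from the Hugging Face Hub (cached locally after the first call)
file_path = hf_hub_download(repo_id=repo_id, filename=filename)

# Load the state dict, then switch to inference mode (this disables dropout)
checkpoint = load_file(file_path)
model.load_state_dict(checkpoint)
model.to(device)
model.eval()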

Step 4: Initialize the probability path:

from sampling.conditional_probability_path import GaussianConditionalProbabilityPath
from sampling.noise_scheduling import LinearAlpha, LinearBeta

path = GaussianConditionalProbabilityPath(
    p_data=None,
    p_simple_shape=[4, 128, 128],
    alpha=LinearAlpha(),
    beta=LinearBeta()
).to(device)
path.eval()

Step 5: Simulate the ODE:

import torch

from diff_eq.ode_sde import UnguidedVectorFieldODE
from diff_eq.simulator import EulerSimulator

num_timesteps = 200 # example number of timesteps
num_samples = 3 # example number of samples

ts = torch.linspace(0, 1, num_timesteps).view(1, -1, 1, 1, 1).expand(num_samples, -1, 1, 1, 1).to(device)
x0 = path.p_simple.sample(num_samples).to(device)  # (num_samples, 4, 128, 128)
ode = UnguidedVectorFieldODE(model)
simulator = EulerSimulator(ode)
x1 = simulator.simulate(x0, ts)  # (num_samples, 4, 128, 128)
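
To illustrate what an Euler simulation of this kind does, here is a minimal, self-contained sketch on scalars. It does not use the repository code: euler_simulate and toy_vector_field are hypothetical names, and the toy field stands in for the trained UNet. The field used is the conditional vector field toward a single target z under an assumed linear schedule (alpha_t = t, beta_t = 1 - t), namely (z - x) / (1 - t).

```python
def euler_simulate(x0, ts, vector_field):
    """Integrate dx/dt = vector_field(x, t) along ts with the explicit Euler method."""
    x = x0
    for t, t_next in zip(ts[:-1], ts[1:]):
        h = t_next - t  # step size between consecutive time points
        x = x + h * vector_field(x, t)
    return x


num_timesteps = 200  # same example value as above
ts = [i / (num_timesteps - 1) for i in range(num_timesteps)]  # evenly spaced in [0, 1]

z = 0.75  # toy scalar "data point" that the flow should reach at t = 1


def toy_vector_field(x, t):
    # Stand-in for the trained model; only evaluated at t < 1, so no division by zero
    return (z - x) / (1.0 - t)


x0 = -1.3  # toy starting point, playing the role of a Gaussian noise sample
x1 = euler_simulate(x0, ts, toy_vector_field)
print(x1)  # very close to z
```

The real simulator applies the same update to whole image tensors, with the model supplying the vector field at each step.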

Step 6: Convert the output tensor to PIL images:

from utils.helpers import tensor_to_rgba_image, normalize_to_unit

x1 = normalize_to_unit(x1) # [-1, 1] -> [0, 1]
imgs = tensor_to_rgba_image(x1)
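
The value mapping in this step can be illustrated on single channel values. The sketch below is an assumption about what such a conversion typically involves, not the repository's implementation: to_uint8 is a hypothetical helper, and whether the real helpers clamp out-of-range values is not stated in this card.

```python
def to_uint8(v: float) -> int:
    """Map a value from [-1, 1] to an 8-bit channel value in [0, 255]."""
    unit = (v + 1.0) / 2.0           # [-1, 1] -> [0, 1]
    unit = min(max(unit, 0.0), 1.0)  # clamp values that overshoot the range
    return int(unit * 255 + 0.5)     # scale to [0, 255] and round to nearest


# One RGBA pixel with a slightly out-of-range alpha value
pixel = [-1.0, 0.0, 1.0, 1.7]
print([to_uint8(v) for v in pixel])
```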
