---
license: apache-2.0
---

## Comparison

![vae](vae.png)

## === Metrics ===

```
SD15 VAE               | MSE=2.732e-03 PSNR=28.10 LPIPS=0.147 Edge=0.206 KL=19.821 | Z[min/mean/max/std]=[-17.375, 0.072, 16.203, 0.900] | Skew[min/mean/max]=[-0.543, -0.126, 0.070] | Kurt[min/mean/max]=[-0.151, 1.228, 4.574]
SDXL VAE fp16 fix      | MSE=2.018e-03 PSNR=29.67 LPIPS=0.124 Edge=0.188 KL=32.222 | Z[min/mean/max/std]=[-4.066, -0.014, 4.301, 0.861] | Skew[min/mean/max]=[-0.017, 0.105, 0.165] | Kurt[min/mean/max]=[-0.380, -0.228, -0.107]
AiArtLab/sdxl_vae      | MSE=1.736e-03 PSNR=30.29 LPIPS=0.116 Edge=0.181 KL=32.222 | Z[min/mean/max/std]=[-4.066, -0.014, 4.301, 0.861] | Skew[min/mean/max]=[-0.017, 0.105, 0.165] | Kurt[min/mean/max]=[-0.380, -0.228, -0.107]
LTX-Video VAE          | MSE=1.202e-03 PSNR=31.84 LPIPS=0.141 Edge=0.168 KL=6.656 | Z[min/mean/max/std]=[-5.043, 0.011, 4.969, 0.272] | Skew[min/mean/max]=[-0.542, -0.018, 0.411] | Kurt[min/mean/max]=[-0.576, 0.741, 1.843]
Wan2.2-TI2V-5B         | MSE=7.782e-04 PSNR=34.25 LPIPS=0.052 Edge=0.121 KL=9.472 | Z[min/mean/max/std]=[-4.789, -0.012, 4.266, 0.375] | Skew[min/mean/max]=[-0.397, 0.022, 0.653] | Kurt[min/mean/max]=[-0.482, 0.006, 0.538]
AiArtLab/wan16x_vae    | MSE=7.275e-04 PSNR=34.51 LPIPS=0.051 Edge=0.118 KL=9.472 | Z[min/mean/max/std]=[-4.789, -0.012, 4.266, 0.375] | Skew[min/mean/max]=[-0.397, 0.022, 0.653] | Kurt[min/mean/max]=[-0.482, 0.006, 0.538]
Wan2.2-T2V-A14B        | MSE=7.073e-04 PSNR=34.59 LPIPS=0.048 Edge=0.115 KL=7.781 | Z[min/mean/max/std]=[-15.336, -0.159, 17.703, 2.563] | Skew[min/mean/max]=[-0.343, 0.006, 0.367] | Kurt[min/mean/max]=[-0.538, -0.071, 0.594]
QwenImage              | MSE=6.549e-04 PSNR=35.21 LPIPS=0.047 Edge=0.110 KL=7.776 | Z[min/mean/max/std]=[-15.297, -0.158, 17.688, 2.561] | Skew[min/mean/max]=[-0.346, 0.005, 0.368] | Kurt[min/mean/max]=[-0.538, -0.072, 0.597]
AuraDiffusion/16ch-vae | MSE=5.361e-04 PSNR=35.80 LPIPS=0.041 Edge=0.100 KL=4.421 | Z[min/mean/max/std]=[-1.373, -0.005, 1.621, 0.165] | Skew[min/mean/max]=[-0.331, 0.040, 0.413] | Kurt[min/mean/max]=[-0.170, 0.303, 0.670]
FLUX.1-schnell VAE     | MSE=4.594e-04 PSNR=35.87 LPIPS=0.035 Edge=0.088 KL=13.016 | Z[min/mean/max/std]=[-5.824, -0.076, 6.246, 0.945] | Skew[min/mean/max]=[-0.268, 0.048, 0.483] | Kurt[min/mean/max]=[-0.498, 0.037, 0.568]
AiArtLab/simplevae     | MSE=4.818e-04 PSNR=36.20 LPIPS=0.035 Edge=0.095 KL=4.032 | Z[min/mean/max/std]=[-7.762, -0.061, 9.914, 0.965] | Skew[min/mean/max]=[-0.320, 0.044, 0.411] | Kurt[min/mean/max]=[-0.045, 0.346, 0.696]
```

## === Percent ===

Scores relative to SD15 VAE (100%); higher is better in every column.

```
| Model                      | PSNR      | LPIPS     | Edge      |
|----------------------------|-----------|-----------|-----------|
| SD15 VAE                   | 100%      | 100%      | 100%      |
| SDXL VAE fp16 fix          | 105.6%    | 118.3%    | 109.7%    |
| AiArtLab/sdxl_vae          | 107.8%    | 126.8%    | 113.8%    |
| LTX-Video VAE              | 113.3%    | 103.8%    | 122.5%    |
| Wan2.2-TI2V-5B             | 121.9%    | 280.8%    | 170.8%    |
| AiArtLab/wan16x_vae        | 122.8%    | 287.3%    | 174.2%    |
| Wan2.2-T2V-A14B            | 123.1%    | 303.2%    | 179.4%    |
| QwenImage                  | 125.3%    | 308.8%    | 188.0%    |
| AuraDiffusion/16ch-vae     | 127.4%    | 355.5%    | 206.6%    |
| FLUX.1-schnell VAE         | 127.6%    | 424.4%    | 234.8%    |
| AiArtLab/simplevae         | 128.8%    | 415.2%    | 217.7%    |
```

## Compare

https://imgsli.com/NDE1MzE0/5/2

### Diffusers

```
from diffusers import AutoencoderKL

# Load the VAE weights from the "vae" subfolder, move them to the GPU, and cast to fp16
vae = AutoencoderKL.from_pretrained("AiArtLab/simplevae", subfolder="vae").cuda().half()
```

## VAE Training Process

- Initialized from AuraDiffusion/16ch-vae (not compatible with it), with a mid block added and retrained
- Dataset: 100,000 PNG images
- Training Time: ~2 weeks
- Hardware: Single RTX 5090
- Resolution: 512px
- Precision: FP32
- Effective Batch Size: 16
- Optimizer: AdamW (8-bit)
- Balanced losses (LPIPS, MSE, MAE, Edge, KL)

## Source

https://huggingface.co/AiArtLab/simplevae/blob/main/train_vae.py

## Acknowledgments

- **[Stan](https://t.me/Stangle)**: Key investor. Thank you for believing in us when others called it madness.
- **Captainsaturnus**
- **Love. Death. Transformers.**
- **TOPAPEC**

## Donations

Please contact us if you can provide GPUs or funding for training.

DOGE: DEw2DR8C7BnF8GgcrfTzUjSnGkuMeJhg83

BTC: 3JHv9Hb8kEW8zMAccdgCdZGfrHeMhH1rpN

## Contacts

[recoilme](https://t.me/recoilme)
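
## Reading the Percent table

The Percent table reports each model relative to SD15 VAE. The sketch below shows one way those figures could be derived from the Metrics block. The formula is an assumption inferred from the published numbers, not taken from `train_vae.py`: PSNR is scored as value / baseline, while LPIPS and Edge (lower is better) are scored as baseline / value. The names `METRICS`, `relative_percent`, and `percent_row` are hypothetical, and only three models are included for brevity.

```python
# Rounded values copied from the Metrics block; SD15 VAE is the 100% baseline.
METRICS = {
    "SD15 VAE": {"PSNR": 28.10, "LPIPS": 0.147, "Edge": 0.206},
    "SDXL VAE fp16 fix": {"PSNR": 29.67, "LPIPS": 0.124, "Edge": 0.188},
    "AiArtLab/simplevae": {"PSNR": 36.20, "LPIPS": 0.035, "Edge": 0.095},
}

# Assumption: PSNR improves upward, LPIPS and Edge improve downward.
HIGHER_IS_BETTER = {"PSNR": True, "LPIPS": False, "Edge": False}


def relative_percent(value, baseline, higher_is_better):
    """Score a metric against the baseline so that a larger percentage is better."""
    ratio = value / baseline if higher_is_better else baseline / value
    return 100.0 * ratio


def percent_row(model, baseline="SD15 VAE"):
    """Return the relative scores for one model, rounded to one decimal place."""
    base = METRICS[baseline]
    return {
        name: round(
            relative_percent(METRICS[model][name], base[name], HIGHER_IS_BETTER[name]), 1
        )
        for name in HIGHER_IS_BETTER
    }


if __name__ == "__main__":
    for model in METRICS:
        print(model, percent_row(model))
```

The PSNR column reproduces exactly from the rounded inputs. The LPIPS and Edge columns come out within a few points of the table (for example, about 420% rather than 415.2% for AiArtLab/simplevae on LPIPS), which suggests the published percentages were computed from unrounded metric values.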