title: Image GS
emoji: π»
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.49.1
python_version: '3.10'
app_file: gradio_app.py
suggested_hardware: cpu-basic
models:
- blanchon/image-gs-models-utils
pinned: false
Image-GS: Content-Adaptive Image Representation via 2D Gaussians
Yunxiang Zhang1*, Bingxuan Li1*, Alexandr Kuznetsov3†, Akshay Jindal2, Stavros Diolatzis2, Kenneth Chen1, Anton Sochenov2, Anton Kaplanyan2, Qi Sun1
* Equal contribution. † Work done while at Intel.
Inspired by recent advancements in radiance field rendering, we introduce Image-GS, a content-adaptive image representation based on 2D Gaussians. Leveraging a custom differentiable renderer, Image-GS reconstructs images by adaptively allocating and progressively optimizing a group of anisotropic, colored 2D Gaussians. It achieves a favorable balance between visual fidelity and memory efficiency across a variety of stylized images frequently seen in graphics workflows, especially for those showing non-uniformly distributed features and in low-bitrate regimes. Moreover, it supports hardware-friendly rapid random access for real-time usage, requiring only 0.3K MACs to decode a pixel. Through error-guided progressive optimization, Image-GS naturally constructs a smooth level-of-detail hierarchy. We demonstrate its versatility with several applications, including texture compression, semantics-aware compression, and joint image compression and restoration.
Figure 1: Image-GS reconstructs an image by adaptively allocating and progressively optimizing a set of colored 2D Gaussians. It achieves favorable rate-distortion trade-offs, hardware-friendly random access, and flexible quality control through a smooth level-of-detail stack. (a) visualizes the optimized spatial distribution of Gaussians (20% randomly sampled for clarity). (b) Image-GS's explicit content-adaptive design effectively captures non-uniformly distributed image features and better preserves fine details under constrained memory budgets. In the inset error maps, brighter colors indicate larger errors.
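To make the representation more concrete, the sketch below evaluates a single pixel as a normalized blend of the k highest-weight anisotropic 2D Gaussians. It is an illustration under assumptions, not the project's renderer: the function render_pixel, the dictionary layout, and the default k are hypothetical, and the actual tile-based implementation may differ in its details.

```python
import math

def render_pixel(x, y, gaussians, k=10):
    """Blend the k strongest Gaussians at pixel (x, y).

    Each Gaussian is a dict with keys (hypothetical layout):
      mean  -> (mx, my) center in pixels
      scale -> (sx, sy) standard deviation along each principal axis
      angle -> orientation of the principal axes in radians
      color -> tuple of channel values
    """
    weighted = []
    for g in gaussians:
        dx, dy = x - g["mean"][0], y - g["mean"][1]
        c, s = math.cos(g["angle"]), math.sin(g["angle"])
        # Rotate the offset into the Gaussian's local frame.
        u = c * dx + s * dy
        v = -s * dx + c * dy
        w = math.exp(-0.5 * ((u / g["scale"][0]) ** 2 + (v / g["scale"][1]) ** 2))
        weighted.append((w, g["color"]))
    # Keep only the k largest weights, then normalize by their sum.
    top = sorted(weighted, key=lambda t: t[0], reverse=True)[:k]
    total = sum(w for w, _ in top) + 1e-8
    channels = len(top[0][1])
    return tuple(sum(w * col[i] for w, col in top) / total for i in range(channels))
```

Because only a small, fixed number of Gaussians contributes to each pixel, any pixel can be decoded independently of the others, which is the property that makes random access cheap.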
Setup
- Create a dedicated Python environment and install the dependencies
git clone https://github.com/NYU-ICL/image-gs.git
cd image-gs
conda env create -f environment.yml
conda activate image-gs
pip install git+https://github.com/rahul-goel/fused-ssim/ --no-build-isolation
cd gsplat
pip install -e ".[dev]"
cd ..
- Download the image and texture datasets from OneDrive and organize the folder structure as follows
image-gs
└── media
    ├── images
    └── textures
- (Optional) To run saliency-guided Gaussian position initialization, download the pre-trained EML-Net models (res_imagenet.pth, res_places.pth, res_decoder.pth) and place them under the models/emlnet/ folder
image-gs
└── models
    └── emlnet
        ├── res_decoder.pth
        ├── res_imagenet.pth
        └── res_places.pth
Quick Start
Image Compression
- Optimize an Image-GS representation for an input image anime-1_2k.png using 10000 Gaussians with half-precision parameters
python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize
- Render the corresponding optimized Image-GS representation at a new resolution with height 4000 (aspect ratio is maintained)
python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize --eval --render_height=4000
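The output width follows from the source aspect ratio. A minimal sketch of that arithmetic, assuming the width is rounded to the nearest pixel; the helper render_size is hypothetical, the 2048x1024 source size is an assumed example, and the rounding used by main.py may differ:

```python
def render_size(src_width, src_height, render_height):
    """Return (width, height) for a render that keeps the source aspect ratio."""
    scale = render_height / src_height
    return round(src_width * scale), render_height

# Assumed example: a 2048x1024 source rendered at height 4000.
print(render_size(2048, 1024, 4000))  # (8000, 4000)
```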
Texture Stack Compression
- Optimize an Image-GS representation for an input texture stack alarm-clock_2k using 30000 Gaussians with half-precision parameters
python main.py --input_path="textures/alarm-clock_2k" --exp_name="test/alarm-clock_2k" --num_gaussians=30000 --quantize
- Render the corresponding optimized Image-GS representation at a new resolution with height 3000 (aspect ratio is maintained)
python main.py --input_path="textures/alarm-clock_2k" --exp_name="test/alarm-clock_2k" --num_gaussians=30000 --quantize --eval --render_height=3000
Control bit precision of Gaussian parameters
- Optimize an Image-GS representation for an input image anime-1_2k.png using 10000 Gaussians with 12-bit-precision parameters
python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize --pos_bits=12 --scale_bits=12 --rot_bits=12 --feat_bits=12
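Bit precision translates directly into storage. The sketch below estimates the parameter payload under stated assumptions: each Gaussian is taken to store 2 position coordinates, 2 scale values, 1 orientation angle, and 3 feature channels (RGB). The function payload_bytes and these dimension counts are illustrative, and any container or header overhead is ignored.

```python
def payload_bytes(num_gaussians, pos_bits=32, scale_bits=32, rot_bits=32,
                  feat_bits=32, feat_dim=3):
    """Estimate the parameter storage of a set of 2D Gaussians in bytes."""
    bits_per_gaussian = (2 * pos_bits      # x and y position (assumed 2 dims)
                         + 2 * scale_bits  # scale along each axis (assumed 2 dims)
                         + rot_bits        # single orientation angle
                         + feat_dim * feat_bits)
    return num_gaussians * bits_per_gaussian / 8

full = payload_bytes(10000)                    # 32 bits per dimension
half = payload_bytes(10000, 16, 16, 16, 16)    # half precision
twelve = payload_bytes(10000, 12, 12, 12, 12)  # 12-bit parameters
print(full, half, twelve)  # 320000.0 160000.0 120000.0
```

Under these assumptions, dropping from 16 to 12 bits per dimension saves a quarter of the payload at the same Gaussian count.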
Switch to saliency-guided Gaussian position initialization
- Optimize an Image-GS representation for an input image anime-1_2k.png using 10000 Gaussians with half-precision parameters and saliency-guided initialization
python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize --init_mode="saliency"
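When comparing several settings, it can help to generate these commands programmatically. The helper below is a hypothetical convenience wrapper, not part of the repository. It uses only the flags shown in the examples above and returns an argument list that could be passed to subprocess.run. The experiment names in the sweep are arbitrary.

```python
import sys

def build_command(input_path, exp_name, num_gaussians, quantize=True,
                  init_mode=None, render_height=None):
    """Assemble a main.py invocation from flags documented in this README."""
    cmd = [sys.executable, "main.py",
           f"--input_path={input_path}",
           f"--exp_name={exp_name}",
           f"--num_gaussians={num_gaussians}"]
    if quantize:
        cmd.append("--quantize")
    if init_mode is not None:
        cmd.append(f"--init_mode={init_mode}")
    if render_height is not None:
        # Rendering at a new resolution uses --eval, as in the examples above.
        cmd += ["--eval", f"--render_height={render_height}"]
    return cmd

# Sweep over Gaussian budgets; run each with subprocess.run(cmd, check=True).
for n in (5000, 10000, 20000):
    print(" ".join(build_command("images/anime-1_2k.png", f"test/anime-1_2k_n{n}", n)))
```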
Gradio Web Interface
We provide a user-friendly web interface built with Gradio for easy experimentation and training visualization.
Setup for Web Interface
- Install Gradio (in addition to the main dependencies):
pip install "gradio>=5.0.0"
- Launch the web interface:
python gradio_app.py
- Open your browser and navigate to
http://localhost:7860
Features
The Gradio interface provides:
- Interactive Parameter Configuration: Adjust all training parameters through an intuitive UI
- Image Upload: Drag and drop any image to train on
- Real-time Training Progress: Stream training logs and intermediate results
- Live Visualization: Watch Gaussian placement and rendering progress during training
- Result Gallery: View final renders, gradient maps, and saliency maps
- Easy Experimentation: No need to remember command-line arguments
Interface Sections
Configuration Panel:
- Basic parameters (number of Gaussians, training steps)
- Quantization settings for memory efficiency
- Initialization modes (gradient, saliency, random)
- Advanced optimization parameters (learning rates, loss weights)
Training Progress:
- Real-time streaming logs
- Current render and Gaussian visualization updates
- Training status and control buttons
Results Display:
- Final optimized image
- Gradient and saliency maps used for initialization
- Download capabilities for all results
Usage Tips
- Start with default parameters for your first run
- Use saliency initialization for better results on complex images
- Enable Gaussian visualization to see how the representation evolves
- Adjust save image steps to control visualization frequency (lower = more updates, but slower)
- For quick tests, reduce max steps to 500-1000
Command Line Arguments
Please refer to cfgs/default.yaml for the full list of arguments and their default values.
Post-optimization rendering
- --eval: render the optimized Image-GS representation.
- --render_height: image height for rendering (aspect ratio is maintained).
Bit precision control: 32 bits (float32) per dimension by default
- --quantize: enable bit precision control of Gaussian parameters.
- --pos_bits: bit precision of individual coordinate dimension.
- --scale_bits: bit precision of individual scale dimension.
- --rot_bits: bit precision of Gaussian orientation angle.
- --feat_bits: bit precision of individual feature dimension.
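To illustrate what an n-bit parameter means in practice, here is a generic uniform quantizer. This is a textbook scheme shown for intuition only. The repository's quantization may use a different value range or rounding rule, and the function quantize_uniform and its default range are assumptions.

```python
def quantize_uniform(value, bits, lo=0.0, hi=1.0):
    """Snap value in [lo, hi] to one of 2**bits evenly spaced levels.

    Returns the integer code to store and the value it decodes to.
    """
    levels = (1 << bits) - 1
    clipped = min(max(value, lo), hi)
    code = round((clipped - lo) / (hi - lo) * levels)
    return code, lo + code / levels * (hi - lo)

code, decoded = quantize_uniform(0.5, bits=12)
print(code, decoded)
```

With this scheme the worst-case rounding error is half a step, (hi - lo) / (2 * (2**bits - 1)), so each additional bit roughly halves it.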
Logging
- --exp_name: path to the logging directory.
- --vis_gaussians: visualize Gaussians during optimization.
- --save_image_steps: frequency of rendering intermediate results during optimization.
- --save_ckpt_steps: frequency of checkpointing during optimization.
Input image
- --input_path: path to an image file or a directory containing a texture stack.
- --downsample: load a downsampled version of the input image or texture stack as the optimization target to evaluate image upsampling performance.
- --downsample_ratio: downsampling ratio.
- --gamma: optimize in a gamma-corrected space, modify with caution.
Gaussian
- --num_gaussians: number of Gaussians (for compression rate control).
- --init_scale: initial Gaussian scale in number of pixels.
- --disable_topk_norm: disable top-K normalization.
- --disable_inverse_scale: disable inverse Gaussian scale optimization.
- --init_mode: Gaussian position initialization mode, valid values include "gradient", "saliency", and "random".
- --init_random_ratio: ratio of Gaussians with randomly initialized position.
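One way to picture how --init_mode and --init_random_ratio interact is as sampling positions from a weight map and mixing in uniformly random ones. The sketch below is an assumption about the general idea, not the repository's implementation: sample_positions is a hypothetical helper, and weight_map stands in for a gradient-magnitude or saliency map.

```python
import random

def sample_positions(weight_map, num_gaussians, random_ratio, seed=0):
    """Draw (row, col) positions for Gaussian centers.

    A fraction random_ratio is drawn uniformly; the rest is drawn with
    probability proportional to weight_map, which must contain at least
    one positive value.
    """
    rng = random.Random(seed)
    height, width = len(weight_map), len(weight_map[0])
    pixels = [(r, c) for r in range(height) for c in range(width)]
    weights = [weight_map[r][c] for r, c in pixels]
    num_random = int(num_gaussians * random_ratio)
    uniform = [rng.choice(pixels) for _ in range(num_random)]
    guided = rng.choices(pixels, weights=weights, k=num_gaussians - num_random)
    return uniform + guided

# Toy 2x2 map where only the bottom-right pixel carries weight.
print(sample_positions([[0, 0], [0, 1]], num_gaussians=4, random_ratio=0.0))
```

Keeping some uniformly placed Gaussians gives smooth, low-weight regions a starting budget, while the guided samples concentrate capacity where the map indicates detail.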
Optimization
- --disable_tiles: disable tile-based rendering (warning: optimization and rendering without tiles will be much slower).
- --max_steps: maximum number of optimization steps.
- --pos_lr: Gaussian position learning rate.
- --scale_lr: Gaussian scale learning rate.
- --rot_lr: Gaussian orientation angle learning rate.
- --feat_lr: Gaussian feature learning rate.
- --disable_lr_schedule: disable learning rate decay and early stopping schedule.
- --disable_prog_optim: disable error-guided progressive optimization.
Acknowledgements
We would like to thank the gsplat team, and the authors of 3DGS, fused-ssim, and EML-Net for their great work, based on which Image-GS was developed.
License
This project is licensed under the terms of the MIT license.
Citation
If you find this project helpful to your research, please consider citing it with the following BibTeX entry:
@inproceedings{zhang2025image,
title={Image-GS: Content-Adaptive Image Representation via 2D Gaussians},
author={Zhang, Yunxiang and Li, Bingxuan and Kuznetsov, Alexandr and Jindal, Akshay and Diolatzis, Stavros and Chen, Kenneth and Sochenov, Anton and Kaplanyan, Anton and Sun, Qi},
booktitle={Proceedings of the Special Interest Group on Computer Graphics and Interactive Techniques Conference Conference Papers},
pages={1--11},
year={2025}
}


