---
title: Image GS
emoji: 💻
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.49.1
python_version: "3.10"
app_file: gradio_app.py
suggested_hardware: "cpu-basic"
models:
- blanchon/image-gs-models-utils
pinned: false
---
# Image-GS: Content-Adaptive Image Representation via 2D Gaussians
[**Yunxiang Zhang**](https://yunxiangzhang.github.io/)<sup>1\*</sup>,
[**Bingxuan Li**](https://bingxuan-li.github.io/)<sup>1\*</sup>,
[**Alexandr Kuznetsov**](https://alexku.me/)<sup>3†</sup>,
[**Akshay Jindal**](https://www.akshayjindal.com/)<sup>2</sup>,
[**Stavros Diolatzis**](https://www.sdiolatz.info/)<sup>2</sup>,
[**Kenneth Chen**](https://kenchen10.github.io/)<sup>1</sup>,
[**Anton Sochenov**](https://www.intel.com/content/www/us/en/developer/articles/community/gpu-researchers-anton-sochenov.html)<sup>2</sup>,
[**Anton Kaplanyan**](http://kaplanyan.com/)<sup>2</sup>,
[**Qi Sun**](https://qisun.me/)<sup>1</sup>
\* Equal contribution † Work done while at Intel
1
2
3
Neural image representations have emerged as a promising approach for encoding and rendering visual data. Combined with learning-based workflows, they demonstrate impressive trade-offs between visual fidelity and memory footprint. Existing methods in this domain, however, often rely on fixed data structures that suboptimally allocate memory or compute-intensive implicit models, hindering their practicality for real-time graphics applications.
Inspired by recent advancements in radiance field rendering, we introduce Image-GS, a content-adaptive image representation based on 2D Gaussians. Leveraging a custom differentiable renderer, Image-GS reconstructs images by adaptively allocating and progressively optimizing a group of anisotropic, colored 2D Gaussians. It achieves a favorable balance between visual fidelity and memory efficiency across a variety of stylized images frequently seen in graphics workflows, especially for those showing non-uniformly distributed features and in low-bitrate regimes. Moreover, it supports hardware-friendly rapid random access for real-time usage, requiring only 0.3K MACs to decode a pixel. Through error-guided progressive optimization, Image-GS naturally constructs a smooth level-of-detail hierarchy. We demonstrate its versatility with several applications, including texture compression, semantics-aware compression, and joint image compression and restoration.
Figure 1: Image-GS reconstructs an image by adaptively allocating and progressively optimizing a set of colored 2D Gaussians. It achieves favorable rate-distortion trade-offs, hardware-friendly random access, and flexible quality control through a smooth level-of-detail stack. (a) visualizes the optimized spatial distribution of Gaussians (20% randomly sampled for clarity). (b) Image-GS’s explicit content-adaptive design effectively captures non-uniformly distributed image features and better preserves fine details under constrained memory budgets. In the inset error maps, brighter colors indicate larger errors.
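
To make the representation more concrete, here is a minimal sketch of how a single pixel could be decoded from a few anisotropic, colored 2D Gaussians. It is an illustration only, not the project's custom differentiable renderer: the `Gaussian2D` class, the `weight` and `decode_pixel` functions, and the choice of a normalized weighted sum over the `top_k` strongest contributions are all assumptions made for clarity.

```python
import math
from dataclasses import dataclass

@dataclass
class Gaussian2D:
    # Illustrative fields; the names do not come from the Image-GS code base.
    x: float      # center x, in pixels
    y: float      # center y, in pixels
    sx: float     # standard deviation along the local x axis, in pixels
    sy: float     # standard deviation along the local y axis, in pixels
    theta: float  # orientation angle, in radians
    color: tuple  # (r, g, b) in [0, 1]

def weight(g, px, py):
    """Unnormalized Gaussian falloff of `g` at pixel (px, py)."""
    dx, dy = px - g.x, py - g.y
    c, s = math.cos(g.theta), math.sin(g.theta)
    # Rotate the offset into the Gaussian's local frame.
    u = c * dx + s * dy
    v = -s * dx + c * dy
    return math.exp(-0.5 * ((u / g.sx) ** 2 + (v / g.sy) ** 2))

def decode_pixel(gaussians, px, py, top_k=10, eps=1e-8):
    """Blend the colors of the `top_k` strongest Gaussians at one pixel."""
    scored = sorted(((weight(g, px, py), g) for g in gaussians),
                    key=lambda t: t[0], reverse=True)[:top_k]
    total = sum(w for w, _ in scored) + eps
    return tuple(sum(w * g.color[i] for w, g in scored) / total
                 for i in range(3))

red = Gaussian2D(2.0, 2.0, 1.0, 1.0, 0.0, (1.0, 0.0, 0.0))
blue = Gaussian2D(6.0, 2.0, 1.0, 1.0, 0.0, (0.0, 0.0, 1.0))
# Halfway between the two centers, both Gaussians contribute equally.
print(decode_pixel([red, blue], 4.0, 2.0))
```

Because each pixel depends only on the Gaussians that overlap it, a pixel can be decoded without touching the rest of the image, which is the property that random access relies on.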
## Setup
1. Create a dedicated Python environment and install the dependencies
```bash
git clone https://github.com/NYU-ICL/image-gs.git
cd image-gs
conda env create -f environment.yml
conda activate image-gs
pip install git+https://github.com/rahul-goel/fused-ssim/ --no-build-isolation
cd gsplat
pip install -e ".[dev]"
cd ..
```
2. Download the image and texture datasets from [OneDrive](https://1drv.ms/u/c/3a8968df8a027819/EeshjZJlMtdCmvvmESiN2pABM71EDaoLYmEwuOvecg0tAA?e=GybqBv) and organize the folder structure as follows
```
image-gs
└── media
    ├── images
    └── textures
```
3. (Optional) To run saliency-guided Gaussian position initialization, download the pre-trained [EML-Net](https://github.com/SenJia/EML-NET-Saliency) models ([res_imagenet.pth](https://drive.google.com/open?id=1-a494canr9qWKLdm-DUDMgbGwtlAJz71), [res_places.pth](https://drive.google.com/open?id=18nRz0JSRICLqnLQtAvq01azZAsH0SEzS), [res_decoder.pth](https://drive.google.com/open?id=1vwrkz3eX-AMtXQE08oivGMwS4lKB74sH)) and place them under the `models/emlnet/` folder
```
image-gs
└── models
    └── emlnet
        ├── res_decoder.pth
        ├── res_imagenet.pth
        └── res_places.pth
```
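
Before starting a run, it can help to confirm that the folders and model files are where the steps above expect them. The helper below is a small sketch for that purpose; `missing_paths` is a hypothetical function that is not part of the repository, and the paths are copied from the layouts shown above.

```python
from pathlib import Path

# Relative paths taken from the folder layouts in the setup steps.
REQUIRED_DIRS = ["media/images", "media/textures"]
OPTIONAL_FILES = [
    "models/emlnet/res_decoder.pth",
    "models/emlnet/res_imagenet.pth",
    "models/emlnet/res_places.pth",
]

def missing_paths(repo_root, include_saliency=False):
    """Return the expected paths that do not exist under `repo_root`."""
    root = Path(repo_root)
    missing = [d for d in REQUIRED_DIRS if not (root / d).is_dir()]
    if include_saliency:
        # Only needed for saliency-guided initialization.
        missing += [f for f in OPTIONAL_FILES if not (root / f).is_file()]
    return missing

# Run from the repository root; an empty list means nothing is missing.
print(missing_paths(".", include_saliency=True))
```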
## Quick Start
#### Image Compression
- Optimize an Image-GS representation for an input image `anime-1_2k.png` using `10000` Gaussians with half-precision parameters
```bash
python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize
```
- Render the corresponding optimized Image-GS representation at a new resolution with height `4000` (aspect ratio is maintained)
```bash
python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize --eval --render_height=4000
```
#### Texture Stack Compression
- Optimize an Image-GS representation for an input texture stack `alarm-clock_2k` using `30000` Gaussians with half-precision parameters
```bash
python main.py --input_path="textures/alarm-clock_2k" --exp_name="test/alarm-clock_2k" --num_gaussians=30000 --quantize
```
- Render the corresponding optimized Image-GS representation at a new resolution with height `3000` (aspect ratio is maintained)
```bash
python main.py --input_path="textures/alarm-clock_2k" --exp_name="test/alarm-clock_2k" --num_gaussians=30000 --quantize --eval --render_height=3000
```
#### Control bit precision of Gaussian parameters
- Optimize an Image-GS representation for an input image `anime-1_2k.png` using `10000` Gaussians with 12-bit-precision parameters
```bash
python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize --pos_bits=12 --scale_bits=12 --rot_bits=12 --feat_bits=12
```
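
To give a feel for what a reduced bit precision means for a single parameter, here is a sketch of uniform quantization to a given number of bits. This is an assumption for illustration: the `quantize` and `dequantize` helpers and the `[lo, hi]` value range are hypothetical, and the scheme used inside Image-GS may differ.

```python
def quantize(value, bits, lo=0.0, hi=1.0):
    """Map a value in [lo, hi] to an integer code with `bits` bits."""
    levels = (1 << bits) - 1
    clamped = min(max(value, lo), hi)  # out-of-range values saturate
    return round((clamped - lo) / (hi - lo) * levels)

def dequantize(code, bits, lo=0.0, hi=1.0):
    """Map an integer code back to a value in [lo, hi]."""
    levels = (1 << bits) - 1
    return lo + code / levels * (hi - lo)

x = 0.3
for bits in (8, 12, 16):
    restored = dequantize(quantize(x, bits), bits)
    # The round-trip error is at most half a quantization step.
    print(bits, abs(restored - x))
```

Under this scheme each extra bit halves the step size, so the worst-case round-trip error shrinks by roughly a factor of two per bit.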
#### Switch to saliency-guided Gaussian position initialization
- Optimize an Image-GS representation for an input image `anime-1_2k.png` using `10000` Gaussians with half-precision parameters and saliency-guided initialization
```bash
python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize --init_mode="saliency"
```
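
When comparing several Gaussian budgets, the commands above can be generated from a script instead of being typed by hand. The following is a sketch under a few assumptions: `build_command` is a hypothetical helper, the experiment names are made up for the example, and only flags that appear in this README are used.

```python
import shlex

def build_command(input_path, exp_name, num_gaussians,
                  quantize=True, init_mode=None):
    """Assemble a `main.py` invocation as an argument list."""
    cmd = [
        "python", "main.py",
        f"--input_path={input_path}",
        f"--exp_name={exp_name}",
        f"--num_gaussians={num_gaussians}",
    ]
    if quantize:
        cmd.append("--quantize")
    if init_mode is not None:
        cmd.append(f"--init_mode={init_mode}")
    return cmd

for n in (5000, 10000, 20000):
    cmd = build_command("images/anime-1_2k.png", f"test/anime-1_2k_{n}", n)
    # Print the command; pass `cmd` to subprocess.run(cmd, check=True) to run it.
    print(shlex.join(cmd))
```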
## Gradio Web Interface
We provide a user-friendly web interface built with Gradio for easy experimentation and training visualization.
### Setup for Web Interface
1. Install Gradio (in addition to the main dependencies):
```bash
pip install "gradio>=5.0.0"
```
2. Launch the web interface:
```bash
python gradio_app.py
```
3. Open your browser and navigate to `http://localhost:7860`
### Features
The Gradio interface provides:
- **Interactive Parameter Configuration**: Adjust all training parameters through an intuitive UI
- **Image Upload**: Drag and drop any image to train on
- **Real-time Training Progress**: Stream training logs and intermediate results
- **Live Visualization**: Watch Gaussian placement and rendering progress during training
- **Result Gallery**: View final renders, gradient maps, and saliency maps
- **Easy Experimentation**: No need to remember command-line arguments
### Interface Sections
1. **Configuration Panel**:
   - Basic parameters (number of Gaussians, training steps)
   - Quantization settings for memory efficiency
   - Initialization modes (gradient, saliency, random)
   - Advanced optimization parameters (learning rates, loss weights)
2. **Training Progress**:
   - Real-time streaming logs
   - Current render and Gaussian visualization updates
   - Training status and control buttons
3. **Results Display**:
   - Final optimized image
   - Gradient and saliency maps used for initialization
   - Download capabilities for all results
### Usage Tips
- Start with default parameters for your first run
- Use **saliency initialization** for better results on complex images
- Enable **Gaussian visualization** to see how the representation evolves
- Adjust **save image steps** to control visualization frequency (lower values give more frequent updates but slow down training)
- For quick tests, reduce **max steps** to 500-1000
## Command Line Arguments
Please refer to `cfgs/default.yaml` for the full list of arguments and their default values.
**Post-optimization rendering**
- `--eval` render the optimized Image-GS representation.
- `--render_height` image height for rendering (aspect ratio is maintained).
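
As a sketch of what "aspect ratio is maintained" implies, the output width can be derived from the requested height and the source dimensions. The `render_size` helper and the rounding rule are assumptions for illustration, and the 2048x1024 source size is a made-up example.

```python
def render_size(src_width, src_height, render_height):
    """Return (width, height) that preserves the source aspect ratio."""
    scale = render_height / src_height
    return round(src_width * scale), render_height

# A hypothetical 2048x1024 source rendered at height 4000.
print(render_size(2048, 1024, 4000))  # (8000, 4000)
```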
**Bit precision control**: 32 bits (float32) per dimension by default
- `--quantize` enable bit precision control of Gaussian parameters.
- `--pos_bits` bit precision of individual coordinate dimension.
- `--scale_bits` bit precision of individual scale dimension.
- `--rot_bits` bit precision of Gaussian orientation angle.
- `--feat_bits` bit precision of individual feature dimension.
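
The bit-precision flags above translate fairly directly into a memory estimate. The sketch below assumes that each 2D Gaussian stores two position coordinates, two scales, one orientation angle, and `feat_dims` feature values (3 for an RGB image; a texture stack would use more). It ignores any file-format overhead, and `footprint_bytes` is a hypothetical helper rather than part of the code base.

```python
def footprint_bytes(num_gaussians, pos_bits=32, scale_bits=32,
                    rot_bits=32, feat_bits=32, feat_dims=3):
    """Estimate the parameter storage of an Image-GS representation."""
    bits_per_gaussian = (2 * pos_bits + 2 * scale_bits
                         + rot_bits + feat_dims * feat_bits)
    total_bits = num_gaussians * bits_per_gaussian
    return (total_bits + 7) // 8  # round up to whole bytes

print(footprint_bytes(10000))                  # 320000 (float32)
print(footprint_bytes(10000, 16, 16, 16, 16))  # 160000 (16 bits each)
print(footprint_bytes(10000, 12, 12, 12, 12))  # 120000 (12 bits each)
```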
**Logging**
- `--exp_name` path to the logging directory.
- `--vis_gaussians` visualize Gaussians during optimization.
- `--save_image_steps` frequency of rendering intermediate results during optimization.
- `--save_ckpt_steps` frequency of checkpointing during optimization.
**Input image**
- `--input_path` path to an image file or a directory containing a texture stack.
- `--downsample` load a downsampled version of the input image or texture stack as the optimization target to evaluate image upsampling performance.
- `--downsample_ratio` downsampling ratio.
- `--gamma` optimize in a gamma-corrected space, modify with caution.
**Gaussian**
- `--num_gaussians` number of Gaussians (for compression rate control).
- `--init_scale` initial Gaussian scale in number of pixels.
- `--disable_topk_norm` disable top-K normalization.
- `--disable_inverse_scale` disable inverse Gaussian scale optimization.
- `--init_mode` Gaussian position initialization mode, valid values include "gradient", "saliency", and "random".
- `--init_random_ratio` ratio of Gaussians with randomly initialized position.
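
One way to picture the interplay of `--init_mode` and `--init_random_ratio` is to sample most positions in proportion to a per-pixel weight map (gradient magnitude or saliency) and the remainder uniformly at random. The sketch below follows that reading; it is an assumption about the general idea rather than the repository's implementation, and `init_positions` and its default `random_ratio` are hypothetical.

```python
import random

def init_positions(weight_map, num_gaussians, random_ratio=0.3, seed=0):
    """Sample (x, y) pixel positions, partly guided by `weight_map`."""
    rng = random.Random(seed)
    height, width = len(weight_map), len(weight_map[0])
    pixels = [(x, y) for y in range(height) for x in range(width)]
    weights = [weight_map[y][x] for x, y in pixels]
    num_random = round(num_gaussians * random_ratio)
    num_guided = num_gaussians - num_random
    # Guided samples concentrate where the map is large
    # (requires at least one positive weight).
    guided = rng.choices(pixels, weights=weights, k=num_guided)
    # The remaining samples are spread uniformly over the image.
    uniform = rng.choices(pixels, k=num_random)
    return guided + uniform

# A toy 3x3 map with a single strong edge pixel at (x=2, y=1).
toy_map = [[0.0, 0.0, 0.0],
           [0.0, 0.0, 1.0],
           [0.0, 0.0, 0.0]]
print(init_positions(toy_map, num_gaussians=10))
```

Keeping a share of uniformly placed Gaussians means smooth regions with little gradient or saliency still receive some coverage.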
**Optimization**
- `--disable_tiles` disable tile-based rendering (warning: optimization and rendering without tiles will be much slower).
- `--max_steps` maximum number of optimization steps.
- `--pos_lr` Gaussian position learning rate.
- `--scale_lr` Gaussian scale learning rate.
- `--rot_lr` Gaussian orientation angle learning rate.
- `--feat_lr` Gaussian feature learning rate.
- `--disable_lr_schedule` disable learning rate decay and early stopping schedule.
- `--disable_prog_optim` disable error-guided progressive optimization.
## Acknowledgements
We would like to thank the [gsplat](https://github.com/nerfstudio-project/gsplat) team, and the authors of [3DGS](https://github.com/graphdeco-inria/gaussian-splatting), [fused-ssim](https://github.com/rahul-goel/fused-ssim), and [EML-Net](https://github.com/SenJia/EML-NET-Saliency) for their great work, based on which Image-GS was developed.
## License
This project is licensed under the terms of the MIT license.
## Citation
If you find this project helpful to your research, please consider citing our paper ([BibTeX](assets/docs/image-gs.bib)):
```bibtex
@inproceedings{zhang2025image,
title={Image-gs: Content-adaptive image representation via 2d gaussians},
author={Zhang, Yunxiang and Li, Bingxuan and Kuznetsov, Alexandr and Jindal, Akshay and Diolatzis, Stavros and Chen, Kenneth and Sochenov, Anton and Kaplanyan, Anton and Sun, Qi},
booktitle={Proceedings of the Special Interest Group on Computer Graphics and Interactive Techniques Conference Conference Papers},
pages={1--11},
year={2025}
}
```