metadata
title: Image GS
emoji: 💻
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.49.1
python_version: '3.10'
app_file: gradio_app.py
suggested_hardware: cpu-basic
models:
  - blanchon/image-gs-models-utils
pinned: false

Image-GS: Content-Adaptive Image Representation via 2D Gaussians

Yunxiang Zhang1*, Bingxuan Li1*, Alexandr Kuznetsov3†, Akshay Jindal2, Stavros Diolatzis2, Kenneth Chen1, Anton Sochenov2, Anton Kaplanyan2, Qi Sun1

* Equal contribution   † Work done while at Intel

1 NYU   2 Intel   3 AMD


Neural image representations have emerged as a promising approach for encoding and rendering visual data. Combined with learning-based workflows, they demonstrate impressive trade-offs between visual fidelity and memory footprint. Existing methods in this domain, however, often rely on fixed data structures that suboptimally allocate memory or compute-intensive implicit models, hindering their practicality for real-time graphics applications.

Inspired by recent advancements in radiance field rendering, we introduce Image-GS, a content-adaptive image representation based on 2D Gaussians. Leveraging a custom differentiable renderer, Image-GS reconstructs images by adaptively allocating and progressively optimizing a group of anisotropic, colored 2D Gaussians. It achieves a favorable balance between visual fidelity and memory efficiency across a variety of stylized images frequently seen in graphics workflows, especially for those showing non-uniformly distributed features and in low-bitrate regimes. Moreover, it supports hardware-friendly rapid random access for real-time usage, requiring only 0.3K MACs to decode a pixel. Through error-guided progressive optimization, Image-GS naturally constructs a smooth level-of-detail hierarchy. We demonstrate its versatility with several applications, including texture compression, semantics-aware compression, and joint image compression and restoration.

Figure 1: Image-GS reconstructs an image by adaptively allocating and progressively optimizing a set of colored 2D Gaussians. It achieves favorable rate-distortion trade-offs, hardware-friendly random access, and flexible quality control through a smooth level-of-detail stack. (a) visualizes the optimized spatial distribution of Gaussians (20% randomly sampled for clarity). (b) Image-GS’s explicit content-adaptive design effectively captures non-uniformly distributed image features and better preserves fine details under constrained memory budgets. In the inset error maps, brighter colors indicate larger errors.
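
To make the representation more concrete, here is a minimal pure-Python sketch of how a single pixel could be decoded from a handful of anisotropic, colored 2D Gaussians. It is an illustration under assumptions, not the project's differentiable renderer: the `Gaussian2D` class, the `decode_pixel` function, and the default `top_k` value are hypothetical, and the normalized weighted sum is only one plausible reading of the top-K normalization that `--disable_topk_norm` refers to.

```python
import math
from dataclasses import dataclass


@dataclass
class Gaussian2D:
    """Hypothetical container for one anisotropic, colored 2D Gaussian."""
    x: float       # center x, in pixels
    y: float       # center y, in pixels
    sx: float      # standard deviation along the first principal axis
    sy: float      # standard deviation along the second principal axis
    theta: float   # orientation angle, in radians
    color: tuple   # (r, g, b)


def weight(g, px, py):
    """Unnormalized falloff of Gaussian g evaluated at pixel (px, py)."""
    dx, dy = px - g.x, py - g.y
    # Rotate the offset into the Gaussian's principal-axis frame.
    c, s = math.cos(g.theta), math.sin(g.theta)
    u = c * dx + s * dy
    v = -s * dx + c * dy
    return math.exp(-0.5 * ((u / g.sx) ** 2 + (v / g.sy) ** 2))


def decode_pixel(gaussians, px, py, top_k=10):
    """Blend the top_k strongest Gaussians at (px, py) with normalized weights."""
    ranked = sorted(((weight(g, px, py), g) for g in gaussians),
                    key=lambda wg: wg[0], reverse=True)[:top_k]
    total = sum(w for w, _ in ranked)
    if total == 0.0:
        # No Gaussian contributes here (all weights underflowed to zero).
        return (0.0, 0.0, 0.0)
    return tuple(sum(w * g.color[ch] for w, g in ranked) / total
                 for ch in range(3))
```

With one red Gaussian at (0, 0) and one blue Gaussian at (10, 0) of equal size, the pixel halfway between them decodes to an even mix of the two colors, while a pixel at either center is dominated by the Gaussian located there.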

Setup

  1. Create a dedicated Python environment and install the dependencies
    git clone https://github.com/NYU-ICL/image-gs.git
    cd image-gs
    conda env create -f environment.yml
    conda activate image-gs
    pip install git+https://github.com/rahul-goel/fused-ssim/ --no-build-isolation
    cd gsplat
    pip install -e ".[dev]"
    cd ..
    
  2. Download the image and texture datasets from OneDrive and organize the folder structure as follows
    image-gs
    └── media
        ├── images
        └── textures
    
  3. (Optional) To run saliency-guided Gaussian position initialization, download the pre-trained EML-Net models (res_imagenet.pth, res_places.pth, res_decoder.pth) and place them under the models/emlnet/ folder
    image-gs
    └── models
        └── emlnet
            ├── res_decoder.pth
            ├── res_imagenet.pth
            └── res_places.pth
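
Before the first run, it can help to confirm that the folders and files from steps 2 and 3 are where the code expects them. The following is a small sketch, not part of the repository: `check_layout` is a hypothetical helper, and the paths are copied from the folder trees above.

```python
from pathlib import Path

# Folder layout taken from the setup steps above.
REQUIRED_DIRS = ["media/images", "media/textures"]
# Only needed for saliency-guided initialization.
OPTIONAL_FILES = [
    "models/emlnet/res_decoder.pth",
    "models/emlnet/res_imagenet.pth",
    "models/emlnet/res_places.pth",
]


def check_layout(repo_root):
    """Return (missing_required, missing_optional) as lists of relative paths."""
    root = Path(repo_root)
    missing_required = [d for d in REQUIRED_DIRS if not (root / d).is_dir()]
    missing_optional = [f for f in OPTIONAL_FILES if not (root / f).is_file()]
    return missing_required, missing_optional
```

Calling `check_layout(".")` from the repository root returns two empty lists when everything is in place. Entries in the second list only matter if you plan to use `--init_mode="saliency"`.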
    

Quick Start

Image Compression

  • Optimize an Image-GS representation for an input image anime-1_2k.png using 10000 Gaussians with half-precision parameters
python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize
  • Render the corresponding optimized Image-GS representation at a new resolution with height 4000 (aspect ratio is maintained)
python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize --eval --render_height=4000

Texture Stack Compression

  • Optimize an Image-GS representation for an input texture stack alarm-clock_2k using 30000 Gaussians with half-precision parameters
python main.py --input_path="textures/alarm-clock_2k" --exp_name="test/alarm-clock_2k" --num_gaussians=30000 --quantize
  • Render the corresponding optimized Image-GS representation at a new resolution with height 3000 (aspect ratio is maintained)
python main.py --input_path="textures/alarm-clock_2k" --exp_name="test/alarm-clock_2k" --num_gaussians=30000 --quantize --eval --render_height=3000

Control bit precision of Gaussian parameters

  • Optimize an Image-GS representation for an input image anime-1_2k.png using 10000 Gaussians with 12-bit-precision parameters
python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize --pos_bits=12 --scale_bits=12 --rot_bits=12 --feat_bits=12

Switch to saliency-guided Gaussian position initialization

  • Optimize an Image-GS representation for an input image anime-1_2k.png using 10000 Gaussians with half-precision parameters and saliency-guided initialization
python main.py --input_path="images/anime-1_2k.png" --exp_name="test/anime-1_2k" --num_gaussians=10000 --quantize --init_mode="saliency"
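
The commands above can also be assembled from Python, which is convenient when comparing several Gaussian budgets for the same input. This is a sketch under assumptions: `build_command` and `sweep` are hypothetical helpers, the `test/<name>_<count>` experiment naming is made up for illustration, and only flags shown in this Quick Start are used.

```python
import shlex
import subprocess
from pathlib import Path


def build_command(input_path, exp_name, num_gaussians, quantize=True,
                  render_height=None):
    """Assemble a main.py invocation from the flags shown in the Quick Start."""
    cmd = ["python", "main.py",
           f"--input_path={input_path}",
           f"--exp_name={exp_name}",
           f"--num_gaussians={num_gaussians}"]
    if quantize:
        cmd.append("--quantize")
    if render_height is not None:
        # --eval renders an already optimized representation at the new height.
        cmd += ["--eval", f"--render_height={render_height}"]
    return cmd


def sweep(input_path, counts, dry_run=True):
    """Optimize one input at several Gaussian budgets (hypothetical helper)."""
    stem = Path(input_path).stem
    commands = []
    for n in counts:
        cmd = build_command(input_path, f"test/{stem}_{n}", n)
        commands.append(cmd)
        print(shlex.join(cmd))
        if not dry_run:
            # Must be run from the repository root so main.py is found.
            subprocess.run(cmd, check=True)
    return commands
```

For example, `sweep("images/anime-1_2k.png", [5000, 10000, 20000])` prints three commands without running them. Pass `dry_run=False` to execute them one after another.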

Gradio Web Interface

We provide a user-friendly web interface built with Gradio for easy experimentation and training visualization.

Setup for Web Interface

  1. Install Gradio (in addition to the main dependencies). The quotes keep the shell from treating >= as an output redirection:
pip install "gradio>=5.0.0"
  2. Launch the web interface:
python gradio_app.py
  3. Open your browser and navigate to http://localhost:7860

Features

The Gradio interface provides:

  • Interactive Parameter Configuration: Adjust all training parameters through an intuitive UI
  • Image Upload: Drag and drop any image to train on
  • Real-time Training Progress: Stream training logs and intermediate results
  • Live Visualization: Watch Gaussian placement and rendering progress during training
  • Result Gallery: View final renders, gradient maps, and saliency maps
  • Easy Experimentation: No need to remember command-line arguments

Interface Sections

  1. Configuration Panel:

    • Basic parameters (number of Gaussians, training steps)
    • Quantization settings for memory efficiency
    • Initialization modes (gradient, saliency, random)
    • Advanced optimization parameters (learning rates, loss weights)
  2. Training Progress:

    • Real-time streaming logs
    • Current render and Gaussian visualization updates
    • Training status and control buttons
  3. Results Display:

    • Final optimized image
    • Gradient and saliency maps used for initialization
    • Download capabilities for all results

Usage Tips

  • Start with default parameters for your first run
  • Use saliency initialization for better results on complex images
  • Enable Gaussian visualization to see how the representation evolves
  • Adjust save image steps to control visualization frequency (lower = more updates, but slower)
  • For quick tests, reduce max steps to 500-1000

Command Line Arguments

Please refer to cfgs/default.yaml for the full list of arguments and their default values.

Post-optimization rendering

  • --eval render the optimized Image-GS representation.
  • --render_height image height for rendering (aspect ratio is maintained).
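
Since only the height is specified, the output width follows from the aspect ratio of the source image. A minimal sketch of that arithmetic is shown below. `render_size` is a hypothetical helper, and rounding to the nearest integer is an assumption that may differ from what `main.py` does.

```python
def render_size(src_width, src_height, render_height):
    """Return the (width, height) that keeps the source aspect ratio.

    Rounding to the nearest integer is an assumption of this sketch.
    """
    if src_width <= 0 or src_height <= 0 or render_height <= 0:
        raise ValueError("all dimensions must be positive")
    width = round(src_width * render_height / src_height)
    return max(1, width), render_height
```

For a hypothetical 2048x1024 source, `render_size(2048, 1024, 4000)` gives `(8000, 4000)`.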

Bit precision control: 32 bits (float32) per dimension by default

  • --quantize enable bit precision control of Gaussian parameters.
  • --pos_bits bit precision of individual coordinate dimension.
  • --scale_bits bit precision of individual scale dimension.
  • --rot_bits bit precision of Gaussian orientation angle.
  • --feat_bits bit precision of individual feature dimension.
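
To get a feel for how these flags and `--num_gaussians` translate into memory, the raw parameter size can be estimated with simple arithmetic. The sketch below assumes a per-Gaussian layout of 2 position coordinates, 2 scales, 1 orientation angle, and 3 feature channels (an RGB image). That layout is an assumption for illustration, the helper names are hypothetical, and a texture stack would carry more feature dimensions.

```python
# Assumed per-Gaussian layout for an RGB image (not confirmed by the code base):
# 2 position coordinates, 2 scales, 1 orientation angle, 3 feature channels.
DIMS = {"pos": 2, "scale": 2, "rot": 1, "feat": 3}


def footprint_bytes(num_gaussians, pos_bits=32, scale_bits=32,
                    rot_bits=32, feat_bits=32, dims=DIMS):
    """Raw parameter storage in bytes, ignoring any file or container overhead."""
    bits_per_gaussian = (dims["pos"] * pos_bits
                         + dims["scale"] * scale_bits
                         + dims["rot"] * rot_bits
                         + dims["feat"] * feat_bits)
    return num_gaussians * bits_per_gaussian / 8


def bits_per_pixel(num_bytes, width, height):
    """Bitrate of the representation relative to the target resolution."""
    return num_bytes * 8 / (width * height)
```

Under this assumed layout, 10000 Gaussians take 320,000 bytes at the default 32 bits per dimension, 160,000 bytes at 16 bits, and 120,000 bytes at 12 bits.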

Logging

  • --exp_name path to the logging directory.
  • --vis_gaussians visualize Gaussians during optimization.
  • --save_image_steps frequency of rendering intermediate results during optimization.
  • --save_ckpt_steps frequency of checkpointing during optimization.

Input image

  • --input_path path to an image file or a directory containing a texture stack.
  • --downsample load a downsampled version of the input image or texture stack as the optimization target to evaluate image upsampling performance.
  • --downsample_ratio downsampling ratio.
  • --gamma optimize in a gamma-corrected space, modify with caution.

Gaussian

  • --num_gaussians number of Gaussians (for compression rate control).
  • --init_scale initial Gaussian scale in number of pixels.
  • --disable_topk_norm disable top-K normalization.
  • --disable_inverse_scale disable inverse Gaussian scale optimization.
  • --init_mode Gaussian position initialization mode, valid values include "gradient", "saliency", and "random".
  • --init_random_ratio ratio of Gaussians with randomly initialized position.
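
As a rough illustration of how `--init_mode="gradient"` and `--init_random_ratio` could interact, the sketch below samples initial Gaussian positions from a mixture of a gradient-magnitude distribution and a uniform one. It is a guess at the idea, not the repository's implementation: the function names, the forward-difference gradient, the default ratio of 0.3, and the mixture formulation are all assumptions.

```python
import random


def gradient_magnitude(img):
    """Forward-difference gradient magnitude of a 2D grayscale image (list of rows)."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][x]
            gy = img[min(y + 1, h - 1)][x] - img[y][x]
            mag[y][x] = (gx * gx + gy * gy) ** 0.5
    return mag


def sampling_weights(img, random_ratio=0.3):
    """Per-pixel probabilities: (1 - ratio) * gradient-guided + ratio * uniform."""
    h, w = len(img), len(img[0])
    mag = gradient_magnitude(img)
    total = sum(map(sum, mag))
    uniform = 1.0 / (h * w)
    weights = []
    for row in mag:
        for m in row:
            # Fall back to uniform sampling for a perfectly flat image.
            guided = m / total if total > 0 else uniform
            weights.append((1.0 - random_ratio) * guided + random_ratio * uniform)
    return weights


def init_positions(img, num_gaussians, random_ratio=0.3, seed=0):
    """Draw (x, y) pixel positions for the initial Gaussians."""
    h, w = len(img), len(img[0])
    rng = random.Random(seed)
    flat = rng.choices(range(h * w),
                       weights=sampling_weights(img, random_ratio),
                       k=num_gaussians)
    return [(i % w, i // w) for i in flat]
```

With `random_ratio=0.0`, every position lands on a pixel with non-zero gradient, so Gaussians cluster along edges. With `random_ratio=1.0`, positions are spread uniformly over the image, which corresponds to the "random" mode in spirit.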

Optimization

  • --disable_tiles disable tile-based rendering (warning: optimization and rendering without tiles will be much slower).
  • --max_steps maximum number of optimization steps.
  • --pos_lr Gaussian position learning rate.
  • --scale_lr Gaussian scale learning rate.
  • --rot_lr Gaussian orientation angle learning rate.
  • --feat_lr Gaussian feature learning rate.
  • --disable_lr_schedule disable learning rate decay and early stopping schedule.
  • --disable_prog_optim disable error-guided progressive optimization.

Acknowledgements

We would like to thank the gsplat team, and the authors of 3DGS, fused-ssim, and EML-Net for their great work, based on which Image-GS was developed.

License

This project is licensed under the terms of the MIT license.

Citation

If you find this project helpful to your research, please consider citing it with the following BibTeX entry:

@inproceedings{zhang2025image,
  title={Image-gs: Content-adaptive image representation via 2d gaussians},
  author={Zhang, Yunxiang and Li, Bingxuan and Kuznetsov, Alexandr and Jindal, Akshay and Diolatzis, Stavros and Chen, Kenneth and Sochenov, Anton and Kaplanyan, Anton and Sun, Qi},
  booktitle={Proceedings of the Special Interest Group on Computer Graphics and Interactive Techniques Conference Conference Papers},
  pages={1--11},
  year={2025}
}