OmniX: From Unified Panoramic Generation and Perception to Graphics-Ready 3D Scenes

We present OmniX, a family of panoramic flow matching models for unified panorama generation, perception, and completion.



We introduce OmniX, a family of flow matching generative models that achieves unified panorama perception, generation, and completion. Using OmniX as a world generator, we can create graphics-ready 3D scenes that support physically based rendering, relighting, and simulation.

Paper Abstract

There are two prevalent approaches to constructing 3D scenes: procedural generation and 2D lifting. Among them, panorama-based 2D lifting has emerged as a promising technique, leveraging powerful 2D generative priors to produce immersive, realistic, and diverse 3D environments. In this work, we advance this technique to generate graphics-ready 3D scenes suitable for physically based rendering (PBR), relighting, and simulation. Our key insight is to repurpose 2D generative models for panoramic perception of geometry, textures, and PBR materials. Unlike existing 2D lifting approaches that emphasize appearance generation and ignore the perception of intrinsic properties, we present OmniX, a versatile and unified framework. Based on a lightweight and efficient cross-modal adapter structure, OmniX reuses 2D generative priors for a broad range of panoramic vision tasks, including panoramic perception, generation, and completion. Furthermore, we construct a large-scale synthetic panorama dataset containing high-quality multimodal panoramas from diverse indoor and outdoor scenes. Extensive experiments demonstrate the effectiveness of our model in panoramic visual perception and graphics-ready 3D scene generation, opening new possibilities for immersive and physically realistic virtual world generation.
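For intuition only, the sketch below illustrates the general idea of a lightweight adapter grafted onto a frozen backbone layer. It is a purely illustrative PyTorch toy, not OmniX's actual cross-modal adapter; all module names and dimensions are assumptions (see the paper for the real architecture).

# Illustrative only: a generic bottleneck adapter attached to a frozen backbone layer.
# Module names, dimensions, and wiring are assumptions, not OmniX's actual design.
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Small residual adapter: down-project, nonlinearity, up-project."""
    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck, dim)
        nn.init.zeros_(self.up.weight)  # start as identity so the frozen prior is preserved
        nn.init.zeros_(self.up.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))

# Only the adapter is trained; the pretrained block stays frozen.
frozen_block = nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True)
for p in frozen_block.parameters():
    p.requires_grad_(False)
adapter = BottleneckAdapter(dim=512)

tokens = torch.randn(1, 77, 512)     # dummy token sequence
out = adapter(frozen_block(tokens))  # adapter refines the frozen block's output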

βš™οΈ Installation

Please follow the instructions below to get the code and install dependencies.

Clone the repo:

git clone https://github.com/HKU-MMLab/OmniX.git
cd OmniX

Create a conda environment:

conda create -n omnix python=3.11
conda activate omnix

Install dependencies:

pip install -r requirements.txt

Install Blender (optional, for exporting 3D scenes only):

Please refer to the official installation guide to install Blender on your PC or remote server. We use Blender 4.4.3 for Linux.

Alternatively, you may use:

pip install bpy

to use the Blender Python API without installing the full Blender application, but we have not tested this path thoroughly.
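If you go the pip route, a quick sanity check that the module imports correctly (assuming the install succeeded) is:

# Verify that the bpy module imports and reports its Blender version.
import bpy
print(bpy.app.version_string)  # e.g. "4.4.3"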

πŸš€ Inference

Panorama Generation

OmniX can generate high-quality panoramic images from image or text prompts:

# Generation from Text
python run_pano_generation.py --prompt "Photorealistic modern living room" --output_dir "outputs/generation_from_text"

# Generation from Image and Text
python run_pano_generation.py --image "assets/examples/image.png" --prompt "Photorealistic modern living room" --output_dir "outputs/generation_from_image_and_text"
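As a quick way to inspect results, the snippet below extracts a perspective crop from an equirectangular panorama with equilib (acknowledged below). This is not part of the OmniX scripts, and the panorama path is an assumption about where run_pano_generation.py writes its output.

# Illustrative post-processing: extract a perspective view from an equirectangular panorama.
# The panorama path below is an assumed output location of run_pano_generation.py.
import numpy as np
from PIL import Image
from equilib import equi2pers

pano = np.asarray(Image.open("outputs/generation_from_text/panorama.png").convert("RGB"))
pano = np.transpose(pano, (2, 0, 1))  # equilib expects channels-first (C, H, W)

pers = equi2pers(
    equi=pano,
    rots={"roll": 0.0, "pitch": 0.0, "yaw": np.pi / 2},  # rotate the view by 90 degrees
    height=512,
    width=512,
    fov_x=90.0,   # horizontal field of view in degrees
    mode="bilinear",
)
Image.fromarray(np.transpose(pers, (1, 2, 0)).astype(np.uint8)).save("perspective_view.png")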

Panorama Perception

Given an RGB panorama as input, OmniX can predict geometric, intrinsic, and semantic properties:

# Perception (Distance, Normal, Albedo, Roughness, Metallic, Semantic) from Panorama
python run_pano_perception.py --panorama "assets/examples/panorama.png" --output_dir "outputs/perception_from_panorama"
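For reference, one common way to use such predictions is to back-project the distance map into a colored point cloud. The sketch below is hypothetical: the file paths, and the assumption that the distance map is saved as a metric NumPy array at the panorama's resolution, are illustrative; check the actual outputs of run_pano_perception.py.

# Illustrative geometry: back-project an equirectangular distance map into a point cloud.
# Input paths and the distance-map format are assumptions, not guaranteed script outputs.
import numpy as np
from PIL import Image

rgb = np.asarray(Image.open("assets/examples/panorama.png").convert("RGB"))
dist = np.load("outputs/perception_from_panorama/distance.npy")  # (H, W), metric distance (assumed)

H, W = dist.shape
u = (np.arange(W) + 0.5) / W      # horizontal pixel centers in [0, 1]
v = (np.arange(H) + 0.5) / H      # vertical pixel centers in [0, 1]
lon = (u - 0.5) * 2.0 * np.pi     # longitude in [-pi, pi]
lat = (0.5 - v) * np.pi           # latitude in [-pi/2, pi/2]
lon, lat = np.meshgrid(lon, lat)

# Unit ray direction per pixel, scaled by the predicted distance.
dirs = np.stack([np.cos(lat) * np.sin(lon), np.sin(lat), np.cos(lat) * np.cos(lon)], axis=-1)
points = dirs * dist[..., None]   # (H, W, 3) points in the camera frame

xyz_rgb = np.concatenate([points.reshape(-1, 3), rgb.reshape(-1, 3) / 255.0], axis=1)
np.savetxt("point_cloud.xyz", xyz_rgb, fmt="%.4f")  # simple ASCII point cloud with colors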

Panorama Generation and Perception

Naturally, we can combine panorama generation and perception to obtain a panoramic image with multiple property annotations:

# Generation and Perception from Text
python run_pano_all.py --prompt "Photorealistic modern living room" --output_dir "outputs/generation_and_perception_from_text"

# Generation and Perception from Image and Text
python run_pano_all.py --image "assets/examples/image.png" --prompt "Photorealistic modern living room" --output_dir "outputs/generation_and_perception_from_image_and_text"

Graphics-Ready Scene Generation (Beta)

Note that the code for graphics-ready scene reconstruction/generation is still a work in progress.

# Generation from Text
python run_scene_generation.py --prompt "Photorealistic modern living room" --output_dir "outputs/construction_from_text"
# Generation from Text (Fast)
python run_scene_generation.py --prompt "Photorealistic modern living room" --output_dir "outputs/construction_fast_from_text" --rgb_as_albedo --disable_normal --use_default_pbr --fill_invalid_depth

# Generation from Image and Text
python run_scene_generation.py --image "assets/examples/image.png" --prompt "Photorealistic modern living room" --output_dir "outputs/construction_from_image_and_text"
# Generation from Image and Text (Fast)
python run_scene_generation.py --image "assets/examples/image.png" --prompt "Photorealistic modern living room" --output_dir "outputs/construction_fast_from_image_and_text" --rgb_as_albedo --disable_normal --use_default_pbr --fill_invalid_depth

# Generation from Panorama
python run_scene_generation.py --panorama "assets/examples/panorama.png" --output_dir "outputs/construction_from_panorama"
# Generation from Panorama (Fast)
python run_scene_generation.py --panorama "assets/examples/panorama.png" --output_dir "outputs/construction_fast_from_panorama" --rgb_as_albedo --disable_normal --use_default_pbr --fill_invalid_depth
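Once a scene has been exported through Blender, it can be rendered headlessly with the Blender Python API. The sketch below is hypothetical: the .blend path is an assumption about what the export step produces.

# Illustrative only: render an exported scene headlessly with the Blender Python API.
# The .blend path is an assumption about what the Blender export step produces.
import bpy

bpy.ops.wm.open_mainfile(filepath="outputs/construction_from_text/scene.blend")

scene = bpy.context.scene
scene.render.engine = "CYCLES"    # path tracing, so the PBR materials are used
scene.render.resolution_x = 1024
scene.render.resolution_y = 1024
scene.render.filepath = "outputs/construction_from_text/render.png"

bpy.ops.render.render(write_still=True)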

πŸ‘ Acknowledgement

This repository is based on many amazing research works and open-source projects: PanFusion, DreamCube, WorldGen, diffusers, equilib, etc. Thanks to all the authors for their generous contributions to the community!

πŸ˜‰ Citation

If you find this repository helpful for your work, please consider citing it as follows:

@article{omnix,
    title={OmniX: From Unified Panoramic Generation and Perception to Graphics-Ready 3D Scenes},
    author={Huang, Yukun and Yu, Jiwen and Zhou, Yanning and Wang, Jianan and Wang, Xintao and Wan, Pengfei and Liu, Xihui},
    journal={arXiv preprint arXiv:2510.26800},
    year={2025}
}