# ZeroGPU Migration Guide

This document describes the changes made to enable FlashWorld to run on Hugging Face Spaces with ZeroGPU.

## Overview

FlashWorld has been adapted to support ZeroGPU deployment on Hugging Face Spaces. This allows the model to run on free, dynamically allocated GPU resources with a configurable time budget.

## Changes Made

### 1. New Gradio Application (`app_gradio.py`)

Created a new Gradio-based interface that replaces the Flask API for ZeroGPU deployment:

**Key Features:**
- Uses Gradio 5.49.1+ for the interface
- Implements the `@spaces.GPU(duration=15)` decorator with a 15-second GPU budget
- Model loading happens in global scope (outside the GPU decorator) for efficiency
- Simpler interface than the original Flask app with its custom HTML
- Accepts the camera trajectory as JSON input
- Returns PLY files for download

**Architecture:**
```python
import spaces

# Model loads globally (once, at startup)
generation_system = GenerationSystem(ckpt_path=ckpt_path, device=device, offload_t5=args.offload_t5)

# Generation function uses GPU only when called
@spaces.GPU(duration=15)
def generate_scene(image_prompt, text_prompt, camera_json, resolution):
    # GPU-intensive work happens here
    # Returns PLY file + status message
    ...  # body elided
```

### 2. Requirements Updates (`requirements.txt`)

**Removed:**
- `flask==3.1.2` (not needed for ZeroGPU deployment)

**Added:**
- `spaces` (Hugging Face Spaces integration library)

**Kept:**
- `gradio==5.49.1` (required for Gradio SDK)
- All other dependencies remain unchanged

### 3. System Dependencies (`packages.txt`)

**Created new file** to install system-level dependencies required by gsplat for CUDA compilation:
- `libglm-dev` (OpenGL Mathematics library headers)
- `build-essential` (compilation tools)

### 4. README Updates

**Added YAML frontmatter:**
```yaml
---
title: FlashWorld
emoji: 🌎
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.49.1
app_file: app_gradio.py
pinned: false
license: cc-by-nc-sa-4.0
python_version: 3.10.13
---
```

**Added ZeroGPU deployment section:**
- Instructions for deploying on Hugging Face Spaces
- Documentation of the 15-second GPU budget
- Explanation of the model loading strategy

### 5. CLAUDE.md Updates

Updated the development documentation to include:
- Instructions for running both the Flask (local) and Gradio (ZeroGPU) versions
- Documentation of the ZeroGPU configuration
- Explanation of decorator usage and model loading patterns

### 6. Example Camera Trajectory

Created `examples/simple_trajectory.json` with a basic 5-camera forward-moving trajectory to help users get started.

## Key Design Decisions

### Why 15 Seconds?

The GPU duration budget was set to 15 seconds for the following reasons:

1. Generation takes ~7 seconds on A100/A800
2. Additional time is needed for:
   - Input processing (image resizing, camera parsing)
   - Export to PLY format
   - Buffer for slower GPUs or variable load
3. The ZeroGPU default is 60 seconds, so 15 seconds stays well within it

### Model Loading Strategy

The model is loaded **once** in global scope, not inside the `@spaces.GPU` decorator:

**Advantages:**
- Model loads at startup, not on every generation
- Faster response time for users
- More efficient use of the GPU time budget
- Follows ZeroGPU best practices

**Implementation:**
```python
# Global scope - loads once at startup
generation_system = GenerationSystem(...)

# GPU decorator - only for inference
@spaces.GPU(duration=15)
def generate_scene(image_prompt, text_prompt, camera_json, resolution):
    return generation_system.generate(...)  # arguments elided
```

### Input Format

Camera trajectories are provided as JSON to keep the Gradio interface simple. Each camera's `quaternion` is ordered `[w, x, y, z]` and its `position` is `[x, y, z]`:

```json
{
  "cameras": [
    {
      "quaternion": [1.0, 0.0, 0.0, 0.0],
      "position": [0.0, 0.0, 0.0],
      "fx": 352.0,
      "fy": 352.0,
      "cx": 352.0,
      "cy": 240.0
    }
  ]
}
```

This differs from the Flask API, which used nested dictionaries in the POST request.

## Deployment Instructions

### Local Testing

Test the Gradio app locally before deploying:

```bash
python app_gradio.py
```

This starts the Gradio interface at `http://localhost:7860`.

### Hugging Face Spaces Deployment

1. **Create a new Space:**
   - Go to https://huggingface.co/spaces
   - Click "Create new Space"
   - Select "ZeroGPU" as hardware

2. **Upload files:**
   - Push this repository to the Space
   - Ensure `app_gradio.py` is set as the app file in README.md

3. **Configuration:**
   - The Space will automatically use the YAML frontmatter in README.md
   - The model checkpoint will auto-download from the Hugging Face Hub
   - No additional configuration is needed

4. **Optional: Enable T5 offloading:**
   - Edit `app_gradio.py` to add `offload_t5=True` in the `GenerationSystem` initialization
   - This reduces GPU memory usage but may slightly increase generation time

## Limitations

### ZeroGPU Constraints

1. **60-second default:** A GPU call is limited to 60 seconds unless a different `duration` is set in the decorator
2. **No torch.compile:** Not supported in the ZeroGPU environment
3. **Gradio only:** Must use the Gradio SDK (no Flask or other frameworks)
4. **Python 3.10.13:** Recommended Python version

### Feature Differences from Flask App

The Gradio app (`app_gradio.py`) differs from the original Flask app (`app.py`):

**Missing features:**
- Custom HTML/CSS interface
- Real-time 3D preview with Spark.js
- Manual camera trajectory recording with mouse/keyboard
- Template-based trajectory generation
- Queue visualization with progress bars
- Concurrent request handling

**Present features:**
- Image and text prompts
- Camera trajectory input (via JSON)
- PLY file generation and download
- Simple, accessible Gradio interface

### Recommended Usage

For **ZeroGPU deployment:**
- Use `app_gradio.py`
- Keep camera trajectories reasonable (≤24 frames)
- Consider enabling `--offload_t5` for memory savings

For **local development with full features:**
- Use `app.py`
- Enjoy the full custom UI with interactive camera controls
- Support for multiple concurrent generations

## Testing

### Test the Gradio App

```bash
# Start the app
python app_gradio.py

# In the browser (http://localhost:7860):
# 1. Upload an image (optional)
# 2. Enter text prompt (optional)
# 3. Paste example camera JSON from examples/simple_trajectory.json
# 4. Select resolution (24x480x704)
# 5. Click "Generate 3D Scene"
```

### Verify GPU Decorator

Check that model loading happens outside the decorator:

```python
# Good - model loads once at startup
generation_system = GenerationSystem(...)

@spaces.GPU(duration=15)
def generate_scene(image_prompt, text_prompt, camera_json, resolution):
    return generation_system.generate(...)  # arguments elided

# Bad - would reload model on every call (slow!)
@spaces.GPU(duration=15)
def generate_scene(image_prompt, text_prompt, camera_json, resolution):
    generation_system = GenerationSystem(...)  # Don't do this!
    return generation_system.generate(...)
```

## Troubleshooting

### "GPU budget exceeded"

**Cause:** Generation took longer than 15 seconds

**Solutions:**
- Reduce the number of frames in the camera trajectory
- Enable the `--offload_t5` flag
- Increase the duration: `@spaces.GPU(duration=20)`

### "Out of memory"

**Cause:** GPU memory exhausted

**Solutions:**
- Enable T5 offloading: `offload_t5=True`
- Enable VAE offloading: `offload_vae=True`
- Reduce resolution
- Reduce the number of frames

### "Model checkpoint not found"

**Cause:** Automatic download failed

**Solutions:**
- Check the internet connection
- Verify Hugging Face access
- Manually download the checkpoint and specify it with the `--ckpt` flag

### "Error building extension 'gsplat_cuda'" or "glm/gtc/type_ptr.hpp: No such file or directory"

**Cause:** Missing GLM library headers required for gsplat CUDA compilation

**Solutions:**
- Ensure `packages.txt` exists with `libglm-dev` and `build-essential`
- Restart the Space to reinstall dependencies
- Check the Space build logs for system package installation errors

### "Bias is not supported when out_dtype is set to Float32"

**Cause:** PyTorch FP8 operations limitation on certain GPU architectures

**Solutions:**
- This is fixed in `quant.py` by applying the bias separately when needed
- Ensure you have the latest version of the code

## Future Improvements

Potential enhancements for ZeroGPU deployment:

1. **Gradio Blocks UI:** Add more interactive controls
2. **Example gallery:** Pre-loaded example camera trajectories
3. **3D visualization:** Embed a PLY viewer in the Gradio interface
4. **Video preview:** Show the rendered video before downloading the PLY
5. **Dynamic duration:** Adjust the GPU budget based on camera count

## References

- [ZeroGPU Documentation](https://huggingface.co/docs/hub/en/spaces-zerogpu)
- [Gradio Documentation](https://gradio.app/docs/)
- [FlashWorld Paper](https://arxiv.org/pdf/2510.13678)
- [FlashWorld Project Page](https://imlixinyang.github.io/FlashWorld-Project-Page)
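
## Example: Checking a Camera Trajectory Before Submission

Because the GPU budget is short, it can help to reject a malformed camera trajectory before any GPU time is requested. The sketch below checks the JSON layout described under Input Format using only the standard library. It is not taken from `app_gradio.py`: the function name `validate_trajectory`, the `max_cameras` parameter (mirroring the ≤24 frames recommendation, and assuming one camera corresponds to one frame), and the sample trajectory (which moves along the z-axis and is not the contents of `examples/simple_trajectory.json`) are all assumptions made for illustration.

```python
import json

# Field layout taken from the Input Format section of this guide.
VECTOR_KEYS = {"quaternion": 4, "position": 3}  # [w, x, y, z] and [x, y, z]
INTRINSIC_KEYS = ("fx", "fy", "cx", "cy")


def _is_number(value):
    # bool is a subclass of int, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_trajectory(camera_json, max_cameras=24):
    """Parse a trajectory JSON string and return its list of cameras.

    Hypothetical helper: raises ValueError on the first problem found.
    """
    try:
        data = json.loads(camera_json)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Not valid JSON: {exc}") from exc

    cameras = data.get("cameras") if isinstance(data, dict) else None
    if not isinstance(cameras, list) or not cameras:
        raise ValueError('Expected a non-empty "cameras" list')
    if len(cameras) > max_cameras:
        raise ValueError(f"Too many cameras: {len(cameras)} > {max_cameras}")

    for i, cam in enumerate(cameras):
        if not isinstance(cam, dict):
            raise ValueError(f"Camera {i} is not a JSON object")
        for key, length in VECTOR_KEYS.items():
            value = cam.get(key)
            if not (isinstance(value, list) and len(value) == length
                    and all(_is_number(v) for v in value)):
                raise ValueError(f"Camera {i}: '{key}' must be {length} numbers")
        for key in INTRINSIC_KEYS:
            if not _is_number(cam.get(key)):
                raise ValueError(f"Camera {i}: '{key}' must be a number")
    return cameras


# Hypothetical 5-camera trajectory: identity rotation, stepping along z.
example = json.dumps({
    "cameras": [
        {"quaternion": [1.0, 0.0, 0.0, 0.0], "position": [0.0, 0.0, 0.1 * i],
         "fx": 352.0, "fy": 352.0, "cx": 352.0, "cy": 240.0}
        for i in range(5)
    ]
})
cameras = validate_trajectory(example)
print(len(cameras))  # prints 5
```

A check like this would run outside the `@spaces.GPU`-decorated function, so a rejected input never consumes any of the GPU budget.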