---
license: mit
datasets:
- rsi/PixelsPointsPolygons
language:
- en
metrics:
- accuracy
base_model:
- timm/vit_small_patch8_224.dino
pipeline_tag: object-detection
tags:
- building
- vectorization
- polygon
- aerial
- image
- pointcloud
- multimodal
---

# The P3 dataset: Pixels, Points and Polygons for Multimodal Building Vectorization

Raphael Sulzer<sup>1,2</sup> · Liuyun Duan<sup>1</sup> · Nicolas Girard<sup>1</sup> · Florent Lafarge<sup>2</sup>

<sup>1</sup>LuxCarta Technology  
<sup>2</sup>Centre Inria d'UniversitΓ© CΓ΄te d'Azur

*Figure 1: A view of our dataset of Zurich, Switzerland*
## Table of Contents

- [Abstract](#abstract)
- [Highlights](#highlights)
- [Dataset](#dataset)
- [Pretrained model weights](#pretrained-model-weights)
- [Code](#code)
- [Citation](#citation)
- [Acknowledgements](#acknowledgements)

## Abstract
We present the P3 dataset, a large-scale multimodal benchmark for building vectorization, constructed from aerial LiDAR point clouds, high-resolution aerial imagery, and vectorized 2D building outlines, collected across three continents. The dataset contains over 10 billion LiDAR points with decimeter-level accuracy and RGB images at a ground sampling distance of 25 cm. While many existing datasets primarily focus on the image modality, P3 offers a complementary perspective by also incorporating dense 3D information. We demonstrate that LiDAR point clouds serve as a robust modality for predicting building polygons, both in hybrid and end-to-end learning frameworks. Moreover, fusing aerial LiDAR and imagery further improves the accuracy and geometric quality of the predicted polygons. The P3 dataset is publicly available, along with code and pretrained weights of three state-of-the-art models for building polygon prediction, at https://github.com/raphaelsulzer/PixelsPointsPolygons.
## Highlights

- A global, multimodal dataset of aerial images, aerial LiDAR point clouds and building outline polygons, available at [huggingface.co/datasets/rsi/PixelsPointsPolygons](https://huggingface.co/datasets/rsi/PixelsPointsPolygons)
- A library for training and evaluating state-of-the-art deep learning methods on the dataset, available at [github.com/raphaelsulzer/PixelsPointsPolygons](https://github.com/raphaelsulzer/PixelsPointsPolygons)
- Pretrained model weights, available at [huggingface.co/rsi/PixelsPointsPolygons](https://huggingface.co/rsi/PixelsPointsPolygons)
## Dataset

### Overview

### Download

The recommended and fastest way to download the dataset is to run

```
pip install huggingface_hub
python scripts/download_dataset.py --dataset-root $DATA_ROOT
```

Optionally, you can also download the dataset by running

```
git lfs install
git clone https://huggingface.co/datasets/rsi/PixelsPointsPolygons $DATA_ROOT
```

Both options download the full dataset, including aerial images (as `.tif`), aerial LiDAR point clouds (as `.copc.laz`) and building polygon annotations (as MS-COCO `.json`), into `$DATA_ROOT`. The size of the dataset is around 163 GB.
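The download can also be scripted directly with the `huggingface_hub` Python API (presumably what the helper script above builds on, given its `huggingface_hub` requirement). A minimal sketch; the `local_dir` value is an example and should point at your `$DATA_ROOT`:

```python
from huggingface_hub import snapshot_download

# Fetch the full dataset repository (~163 GB) into a local folder.
# repo_type="dataset" is required because rsi/PixelsPointsPolygons
# lives on the Hugging Face dataset hub, not the model hub.
snapshot_download(
    repo_id="rsi/PixelsPointsPolygons",
    repo_type="dataset",
    local_dir="./data/PixelsPointsPolygons",  # your $DATA_ROOT
)
```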
πŸ“ Click to expand dataset folder structure ```text PixelsPointsPolygons/data/224 β”œβ”€β”€ annotations β”‚ β”œβ”€β”€ annotations_all_test.json β”‚ β”œβ”€β”€ annotations_all_train.json β”‚ └── annotations_all_val.json β”‚ ... (24 files total) β”œβ”€β”€ images β”‚ β”œβ”€β”€ train β”‚ β”‚ β”œβ”€β”€ CH β”‚ β”‚ β”‚ β”œβ”€β”€ 0 β”‚ β”‚ β”‚ β”‚ β”œβ”€β”€ image0_CH_train.tif β”‚ β”‚ β”‚ β”‚ β”œβ”€β”€ image1000_CH_train.tif β”‚ β”‚ β”‚ β”‚ └── image1001_CH_train.tif β”‚ β”‚ β”‚ β”‚ ... (5000 files total) β”‚ β”‚ β”‚ β”œβ”€β”€ 5000 β”‚ β”‚ β”‚ β”‚ β”œβ”€β”€ image5000_CH_train.tif β”‚ β”‚ β”‚ β”‚ β”œβ”€β”€ image5001_CH_train.tif β”‚ β”‚ β”‚ β”‚ └── image5002_CH_train.tif β”‚ β”‚ β”‚ β”‚ ... (5000 files total) β”‚ β”‚ β”‚ └── 10000 β”‚ β”‚ β”‚ β”œβ”€β”€ image10000_CH_train.tif β”‚ β”‚ β”‚ β”œβ”€β”€ image10001_CH_train.tif β”‚ β”‚ β”‚ └── image10002_CH_train.tif β”‚ β”‚ β”‚ ... (5000 files total) β”‚ β”‚ β”‚ ... (11 dirs total) β”‚ β”‚ β”œβ”€β”€ NY β”‚ β”‚ β”‚ β”œβ”€β”€ 0 β”‚ β”‚ β”‚ β”‚ β”œβ”€β”€ image0_NY_train.tif β”‚ β”‚ β”‚ β”‚ β”œβ”€β”€ image1000_NY_train.tif β”‚ β”‚ β”‚ β”‚ └── image1001_NY_train.tif β”‚ β”‚ β”‚ β”‚ ... (5000 files total) β”‚ β”‚ β”‚ β”œβ”€β”€ 5000 β”‚ β”‚ β”‚ β”‚ β”œβ”€β”€ image5000_NY_train.tif β”‚ β”‚ β”‚ β”‚ β”œβ”€β”€ image5001_NY_train.tif β”‚ β”‚ β”‚ β”‚ └── image5002_NY_train.tif β”‚ β”‚ β”‚ β”‚ ... (5000 files total) β”‚ β”‚ β”‚ └── 10000 β”‚ β”‚ β”‚ β”œβ”€β”€ image10000_NY_train.tif β”‚ β”‚ β”‚ β”œβ”€β”€ image10001_NY_train.tif β”‚ β”‚ β”‚ └── image10002_NY_train.tif β”‚ β”‚ β”‚ ... (5000 files total) β”‚ β”‚ β”‚ ... (11 dirs total) β”‚ β”‚ └── NZ β”‚ β”‚ β”œβ”€β”€ 0 β”‚ β”‚ β”‚ β”œβ”€β”€ image0_NZ_train.tif β”‚ β”‚ β”‚ β”œβ”€β”€ image1000_NZ_train.tif β”‚ β”‚ β”‚ └── image1001_NZ_train.tif β”‚ β”‚ β”‚ ... (5000 files total) β”‚ β”‚ β”œβ”€β”€ 5000 β”‚ β”‚ β”‚ β”œβ”€β”€ image5000_NZ_train.tif β”‚ β”‚ β”‚ β”œβ”€β”€ image5001_NZ_train.tif β”‚ β”‚ β”‚ └── image5002_NZ_train.tif β”‚ β”‚ β”‚ ... (5000 files total) β”‚ β”‚ └── 10000 β”‚ β”‚ β”œβ”€β”€ image10000_NZ_train.tif β”‚ β”‚ β”œβ”€β”€ image10001_NZ_train.tif β”‚ β”‚ └── image10002_NZ_train.tif β”‚ β”‚ ... (5000 files total) β”‚ β”‚ ... (11 dirs total) β”‚ β”œβ”€β”€ val β”‚ β”‚ β”œβ”€β”€ CH β”‚ β”‚ β”‚ └── 0 β”‚ β”‚ β”‚ β”œβ”€β”€ image0_CH_val.tif β”‚ β”‚ β”‚ β”œβ”€β”€ image100_CH_val.tif β”‚ β”‚ β”‚ └── image101_CH_val.tif β”‚ β”‚ β”‚ ... (529 files total) β”‚ β”‚ β”œβ”€β”€ NY β”‚ β”‚ β”‚ └── 0 β”‚ β”‚ β”‚ β”œβ”€β”€ image0_NY_val.tif β”‚ β”‚ β”‚ β”œβ”€β”€ image100_NY_val.tif β”‚ β”‚ β”‚ └── image101_NY_val.tif β”‚ β”‚ β”‚ ... (529 files total) β”‚ β”‚ └── NZ β”‚ β”‚ └── 0 β”‚ β”‚ β”œβ”€β”€ image0_NZ_val.tif β”‚ β”‚ β”œβ”€β”€ image100_NZ_val.tif β”‚ β”‚ └── image101_NZ_val.tif β”‚ β”‚ ... (529 files total) β”‚ └── test β”‚ β”œβ”€β”€ CH β”‚ β”‚ β”œβ”€β”€ 0 β”‚ β”‚ β”‚ β”œβ”€β”€ image0_CH_test.tif β”‚ β”‚ β”‚ β”œβ”€β”€ image1000_CH_test.tif β”‚ β”‚ β”‚ └── image1001_CH_test.tif β”‚ β”‚ β”‚ ... (5000 files total) β”‚ β”‚ β”œβ”€β”€ 5000 β”‚ β”‚ β”‚ β”œβ”€β”€ image5000_CH_test.tif β”‚ β”‚ β”‚ β”œβ”€β”€ image5001_CH_test.tif β”‚ β”‚ β”‚ └── image5002_CH_test.tif β”‚ β”‚ β”‚ ... (5000 files total) β”‚ β”‚ └── 10000 β”‚ β”‚ β”œβ”€β”€ image10000_CH_test.tif β”‚ β”‚ β”œβ”€β”€ image10001_CH_test.tif β”‚ β”‚ └── image10002_CH_test.tif β”‚ β”‚ ... (4400 files total) β”‚ β”œβ”€β”€ NY β”‚ β”‚ β”œβ”€β”€ 0 β”‚ β”‚ β”‚ β”œβ”€β”€ image0_NY_test.tif β”‚ β”‚ β”‚ β”œβ”€β”€ image1000_NY_test.tif β”‚ β”‚ β”‚ └── image1001_NY_test.tif β”‚ β”‚ β”‚ ... 
(5000 files total) β”‚ β”‚ β”œβ”€β”€ 5000 β”‚ β”‚ β”‚ β”œβ”€β”€ image5000_NY_test.tif β”‚ β”‚ β”‚ β”œβ”€β”€ image5001_NY_test.tif β”‚ β”‚ β”‚ └── image5002_NY_test.tif β”‚ β”‚ β”‚ ... (5000 files total) β”‚ β”‚ └── 10000 β”‚ β”‚ β”œβ”€β”€ image10000_NY_test.tif β”‚ β”‚ β”œβ”€β”€ image10001_NY_test.tif β”‚ β”‚ └── image10002_NY_test.tif β”‚ β”‚ ... (4400 files total) β”‚ └── NZ β”‚ β”œβ”€β”€ 0 β”‚ β”‚ β”œβ”€β”€ image0_NZ_test.tif β”‚ β”‚ β”œβ”€β”€ image1000_NZ_test.tif β”‚ β”‚ └── image1001_NZ_test.tif β”‚ β”‚ ... (5000 files total) β”‚ β”œβ”€β”€ 5000 β”‚ β”‚ β”œβ”€β”€ image5000_NZ_test.tif β”‚ β”‚ β”œβ”€β”€ image5001_NZ_test.tif β”‚ β”‚ └── image5002_NZ_test.tif β”‚ β”‚ ... (5000 files total) β”‚ └── 10000 β”‚ β”œβ”€β”€ image10000_NZ_test.tif β”‚ β”œβ”€β”€ image10001_NZ_test.tif β”‚ └── image10002_NZ_test.tif β”‚ ... (4400 files total) β”œβ”€β”€ lidar β”‚ β”œβ”€β”€ train β”‚ β”‚ β”œβ”€β”€ CH β”‚ β”‚ β”‚ β”œβ”€β”€ 0 β”‚ β”‚ β”‚ β”‚ β”œβ”€β”€ lidar0_CH_train.copc.laz β”‚ β”‚ β”‚ β”‚ β”œβ”€β”€ lidar1000_CH_train.copc.laz β”‚ β”‚ β”‚ β”‚ └── lidar1001_CH_train.copc.laz β”‚ β”‚ β”‚ β”‚ ... (5000 files total) β”‚ β”‚ β”‚ β”œβ”€β”€ 5000 β”‚ β”‚ β”‚ β”‚ β”œβ”€β”€ lidar5000_CH_train.copc.laz β”‚ β”‚ β”‚ β”‚ β”œβ”€β”€ lidar5001_CH_train.copc.laz β”‚ β”‚ β”‚ β”‚ └── lidar5002_CH_train.copc.laz β”‚ β”‚ β”‚ β”‚ ... (5000 files total) β”‚ β”‚ β”‚ └── 10000 β”‚ β”‚ β”‚ β”œβ”€β”€ lidar10000_CH_train.copc.laz β”‚ β”‚ β”‚ β”œβ”€β”€ lidar10001_CH_train.copc.laz β”‚ β”‚ β”‚ └── lidar10002_CH_train.copc.laz β”‚ β”‚ β”‚ ... (5000 files total) β”‚ β”‚ β”‚ ... (11 dirs total) β”‚ β”‚ β”œβ”€β”€ NY β”‚ β”‚ β”‚ β”œβ”€β”€ 0 β”‚ β”‚ β”‚ β”‚ β”œβ”€β”€ lidar0_NY_train.copc.laz β”‚ β”‚ β”‚ β”‚ β”œβ”€β”€ lidar10_NY_train.copc.laz β”‚ β”‚ β”‚ β”‚ └── lidar1150_NY_train.copc.laz β”‚ β”‚ β”‚ β”‚ ... (1071 files total) β”‚ β”‚ β”‚ β”œβ”€β”€ 5000 β”‚ β”‚ β”‚ β”‚ β”œβ”€β”€ lidar5060_NY_train.copc.laz β”‚ β”‚ β”‚ β”‚ β”œβ”€β”€ lidar5061_NY_train.copc.laz β”‚ β”‚ β”‚ β”‚ └── lidar5062_NY_train.copc.laz β”‚ β”‚ β”‚ β”‚ ... (2235 files total) β”‚ β”‚ β”‚ └── 10000 β”‚ β”‚ β”‚ β”œβ”€β”€ lidar10000_NY_train.copc.laz β”‚ β”‚ β”‚ β”œβ”€β”€ lidar10001_NY_train.copc.laz β”‚ β”‚ β”‚ └── lidar10002_NY_train.copc.laz β”‚ β”‚ β”‚ ... (4552 files total) β”‚ β”‚ β”‚ ... (11 dirs total) β”‚ β”‚ └── NZ β”‚ β”‚ β”œβ”€β”€ 0 β”‚ β”‚ β”‚ β”œβ”€β”€ lidar0_NZ_train.copc.laz β”‚ β”‚ β”‚ β”œβ”€β”€ lidar1000_NZ_train.copc.laz β”‚ β”‚ β”‚ └── lidar1001_NZ_train.copc.laz β”‚ β”‚ β”‚ ... (5000 files total) β”‚ β”‚ β”œβ”€β”€ 5000 β”‚ β”‚ β”‚ β”œβ”€β”€ lidar5000_NZ_train.copc.laz β”‚ β”‚ β”‚ β”œβ”€β”€ lidar5001_NZ_train.copc.laz β”‚ β”‚ β”‚ └── lidar5002_NZ_train.copc.laz β”‚ β”‚ β”‚ ... (5000 files total) β”‚ β”‚ └── 10000 β”‚ β”‚ β”œβ”€β”€ lidar10000_NZ_train.copc.laz β”‚ β”‚ β”œβ”€β”€ lidar10001_NZ_train.copc.laz β”‚ β”‚ └── lidar10002_NZ_train.copc.laz β”‚ β”‚ ... (4999 files total) β”‚ β”‚ ... (11 dirs total) β”‚ β”œβ”€β”€ val β”‚ β”‚ β”œβ”€β”€ CH β”‚ β”‚ β”‚ └── 0 β”‚ β”‚ β”‚ β”œβ”€β”€ lidar0_CH_val.copc.laz β”‚ β”‚ β”‚ β”œβ”€β”€ lidar100_CH_val.copc.laz β”‚ β”‚ β”‚ └── lidar101_CH_val.copc.laz β”‚ β”‚ β”‚ ... (529 files total) β”‚ β”‚ β”œβ”€β”€ NY β”‚ β”‚ β”‚ └── 0 β”‚ β”‚ β”‚ β”œβ”€β”€ lidar0_NY_val.copc.laz β”‚ β”‚ β”‚ β”œβ”€β”€ lidar100_NY_val.copc.laz β”‚ β”‚ β”‚ └── lidar101_NY_val.copc.laz β”‚ β”‚ β”‚ ... (529 files total) β”‚ β”‚ └── NZ β”‚ β”‚ └── 0 β”‚ β”‚ β”œβ”€β”€ lidar0_NZ_val.copc.laz β”‚ β”‚ β”œβ”€β”€ lidar100_NZ_val.copc.laz β”‚ β”‚ └── lidar101_NZ_val.copc.laz β”‚ β”‚ ... 
(529 files total) β”‚ └── test β”‚ β”œβ”€β”€ CH β”‚ β”‚ β”œβ”€β”€ 0 β”‚ β”‚ β”‚ β”œβ”€β”€ lidar0_CH_test.copc.laz β”‚ β”‚ β”‚ β”œβ”€β”€ lidar1000_CH_test.copc.laz β”‚ β”‚ β”‚ └── lidar1001_CH_test.copc.laz β”‚ β”‚ β”‚ ... (5000 files total) β”‚ β”‚ β”œβ”€β”€ 5000 β”‚ β”‚ β”‚ β”œβ”€β”€ lidar5000_CH_test.copc.laz β”‚ β”‚ β”‚ β”œβ”€β”€ lidar5001_CH_test.copc.laz β”‚ β”‚ β”‚ └── lidar5002_CH_test.copc.laz β”‚ β”‚ β”‚ ... (5000 files total) β”‚ β”‚ └── 10000 β”‚ β”‚ β”œβ”€β”€ lidar10000_CH_test.copc.laz β”‚ β”‚ β”œβ”€β”€ lidar10001_CH_test.copc.laz β”‚ β”‚ └── lidar10002_CH_test.copc.laz β”‚ β”‚ ... (4400 files total) β”‚ β”œβ”€β”€ NY β”‚ β”‚ β”œβ”€β”€ 0 β”‚ β”‚ β”‚ β”œβ”€β”€ lidar0_NY_test.copc.laz β”‚ β”‚ β”‚ β”œβ”€β”€ lidar1000_NY_test.copc.laz β”‚ β”‚ β”‚ └── lidar1001_NY_test.copc.laz β”‚ β”‚ β”‚ ... (4964 files total) β”‚ β”‚ β”œβ”€β”€ 5000 β”‚ β”‚ β”‚ β”œβ”€β”€ lidar5000_NY_test.copc.laz β”‚ β”‚ β”‚ β”œβ”€β”€ lidar5001_NY_test.copc.laz β”‚ β”‚ β”‚ └── lidar5002_NY_test.copc.laz β”‚ β”‚ β”‚ ... (4953 files total) β”‚ β”‚ └── 10000 β”‚ β”‚ β”œβ”€β”€ lidar10000_NY_test.copc.laz β”‚ β”‚ β”œβ”€β”€ lidar10001_NY_test.copc.laz β”‚ β”‚ └── lidar10002_NY_test.copc.laz β”‚ β”‚ ... (4396 files total) β”‚ └── NZ β”‚ β”œβ”€β”€ 0 β”‚ β”‚ β”œβ”€β”€ lidar0_NZ_test.copc.laz β”‚ β”‚ β”œβ”€β”€ lidar1000_NZ_test.copc.laz β”‚ β”‚ └── lidar1001_NZ_test.copc.laz β”‚ β”‚ ... (5000 files total) β”‚ β”œβ”€β”€ 5000 β”‚ β”‚ β”œβ”€β”€ lidar5000_NZ_test.copc.laz β”‚ β”‚ β”œβ”€β”€ lidar5001_NZ_test.copc.laz β”‚ β”‚ └── lidar5002_NZ_test.copc.laz β”‚ β”‚ ... (5000 files total) β”‚ └── 10000 β”‚ β”œβ”€β”€ lidar10000_NZ_test.copc.laz β”‚ β”œβ”€β”€ lidar10001_NZ_test.copc.laz β”‚ └── lidar10002_NZ_test.copc.laz β”‚ ... (4400 files total) └── ffl β”œβ”€β”€ train β”‚ β”œβ”€β”€ CH β”‚ β”‚ β”œβ”€β”€ 0 β”‚ β”‚ β”‚ β”œβ”€β”€ image0_CH_train.pt β”‚ β”‚ β”‚ β”œβ”€β”€ image1000_CH_train.pt β”‚ β”‚ β”‚ └── image1001_CH_train.pt β”‚ β”‚ β”‚ ... (5000 files total) β”‚ β”‚ β”œβ”€β”€ 5000 β”‚ β”‚ β”‚ β”œβ”€β”€ image5000_CH_train.pt β”‚ β”‚ β”‚ β”œβ”€β”€ image5001_CH_train.pt β”‚ β”‚ β”‚ └── image5002_CH_train.pt β”‚ β”‚ β”‚ ... (5000 files total) β”‚ β”‚ └── 10000 β”‚ β”‚ β”œβ”€β”€ image10000_CH_train.pt β”‚ β”‚ β”œβ”€β”€ image10001_CH_train.pt β”‚ β”‚ └── image10002_CH_train.pt β”‚ β”‚ ... (5000 files total) β”‚ β”‚ ... (11 dirs total) β”‚ β”œβ”€β”€ NY β”‚ β”‚ β”œβ”€β”€ 0 β”‚ β”‚ β”‚ β”œβ”€β”€ image0_NY_train.pt β”‚ β”‚ β”‚ β”œβ”€β”€ image1000_NY_train.pt β”‚ β”‚ β”‚ └── image1001_NY_train.pt β”‚ β”‚ β”‚ ... (5000 files total) β”‚ β”‚ β”œβ”€β”€ 5000 β”‚ β”‚ β”‚ β”œβ”€β”€ image5000_NY_train.pt β”‚ β”‚ β”‚ β”œβ”€β”€ image5001_NY_train.pt β”‚ β”‚ β”‚ └── image5002_NY_train.pt β”‚ β”‚ β”‚ ... (5000 files total) β”‚ β”‚ └── 10000 β”‚ β”‚ β”œβ”€β”€ image10000_NY_train.pt β”‚ β”‚ β”œβ”€β”€ image10001_NY_train.pt β”‚ β”‚ └── image10002_NY_train.pt β”‚ β”‚ ... (5000 files total) β”‚ β”‚ ... (11 dirs total) β”‚ β”œβ”€β”€ NZ β”‚ β”‚ β”œβ”€β”€ 0 β”‚ β”‚ β”‚ β”œβ”€β”€ image0_NZ_train.pt β”‚ β”‚ β”‚ β”œβ”€β”€ image1000_NZ_train.pt β”‚ β”‚ β”‚ └── image1001_NZ_train.pt β”‚ β”‚ β”‚ ... (5000 files total) β”‚ β”‚ β”œβ”€β”€ 5000 β”‚ β”‚ β”‚ β”œβ”€β”€ image5000_NZ_train.pt β”‚ β”‚ β”‚ β”œβ”€β”€ image5001_NZ_train.pt β”‚ β”‚ β”‚ └── image5002_NZ_train.pt β”‚ β”‚ β”‚ ... (5000 files total) β”‚ β”‚ └── 10000 β”‚ β”‚ β”œβ”€β”€ image10000_NZ_train.pt β”‚ β”‚ β”œβ”€β”€ image10001_NZ_train.pt β”‚ β”‚ └── image10002_NZ_train.pt β”‚ β”‚ ... (5000 files total) β”‚ β”‚ ... 
(11 dirs total) β”‚ β”œβ”€β”€ processed-flag-all β”‚ β”œβ”€β”€ processed-flag-CH β”‚ └── processed-flag-NY β”‚ ... (8 files total) β”œβ”€β”€ val β”‚ β”œβ”€β”€ CH β”‚ β”‚ └── 0 β”‚ β”‚ β”œβ”€β”€ image0_CH_val.pt β”‚ β”‚ β”œβ”€β”€ image100_CH_val.pt β”‚ β”‚ └── image101_CH_val.pt β”‚ β”‚ ... (529 files total) β”‚ β”œβ”€β”€ NY β”‚ β”‚ └── 0 β”‚ β”‚ β”œβ”€β”€ image0_NY_val.pt β”‚ β”‚ β”œβ”€β”€ image100_NY_val.pt β”‚ β”‚ └── image101_NY_val.pt β”‚ β”‚ ... (529 files total) β”‚ β”œβ”€β”€ NZ β”‚ β”‚ └── 0 β”‚ β”‚ β”œβ”€β”€ image0_NZ_val.pt β”‚ β”‚ β”œβ”€β”€ image100_NZ_val.pt β”‚ β”‚ └── image101_NZ_val.pt β”‚ β”‚ ... (529 files total) β”‚ β”œβ”€β”€ processed-flag-all β”‚ β”œβ”€β”€ processed-flag-CH β”‚ └── processed-flag-NY β”‚ ... (8 files total) └── test β”œβ”€β”€ CH β”‚ β”œβ”€β”€ 0 β”‚ β”‚ β”œβ”€β”€ image0_CH_test.pt β”‚ β”‚ β”œβ”€β”€ image1000_CH_test.pt β”‚ β”‚ └── image1001_CH_test.pt β”‚ β”‚ ... (5000 files total) β”‚ β”œβ”€β”€ 5000 β”‚ β”‚ β”œβ”€β”€ image5000_CH_test.pt β”‚ β”‚ β”œβ”€β”€ image5001_CH_test.pt β”‚ β”‚ └── image5002_CH_test.pt β”‚ β”‚ ... (5000 files total) β”‚ └── 10000 β”‚ β”œβ”€β”€ image10000_CH_test.pt β”‚ β”œβ”€β”€ image10001_CH_test.pt β”‚ └── image10002_CH_test.pt β”‚ ... (4400 files total) β”œβ”€β”€ NY β”‚ β”œβ”€β”€ 0 β”‚ β”‚ β”œβ”€β”€ image0_NY_test.pt β”‚ β”‚ β”œβ”€β”€ image1000_NY_test.pt β”‚ β”‚ └── image1001_NY_test.pt β”‚ β”‚ ... (5000 files total) β”‚ β”œβ”€β”€ 5000 β”‚ β”‚ β”œβ”€β”€ image5000_NY_test.pt β”‚ β”‚ β”œβ”€β”€ image5001_NY_test.pt β”‚ β”‚ └── image5002_NY_test.pt β”‚ β”‚ ... (5000 files total) β”‚ └── 10000 β”‚ β”œβ”€β”€ image10000_NY_test.pt β”‚ β”œβ”€β”€ image10001_NY_test.pt β”‚ └── image10002_NY_test.pt β”‚ ... (4400 files total) β”œβ”€β”€ NZ β”‚ β”œβ”€β”€ 0 β”‚ β”‚ β”œβ”€β”€ image0_NZ_test.pt β”‚ β”‚ β”œβ”€β”€ image1000_NZ_test.pt β”‚ β”‚ └── image1001_NZ_test.pt β”‚ β”‚ ... (5000 files total) β”‚ β”œβ”€β”€ 5000 β”‚ β”‚ β”œβ”€β”€ image5000_NZ_test.pt β”‚ β”‚ β”œβ”€β”€ image5001_NZ_test.pt β”‚ β”‚ └── image5002_NZ_test.pt β”‚ β”‚ ... (5000 files total) β”‚ └── 10000 β”‚ β”œβ”€β”€ image10000_NZ_test.pt β”‚ β”œβ”€β”€ image10001_NZ_test.pt β”‚ └── image10002_NZ_test.pt β”‚ ... (4400 files total) β”œβ”€β”€ processed-flag-all β”œβ”€β”€ processed-flag-CH └── processed-flag-NY ... (8 files total) ```
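To sanity-check a download, a single tile of each modality can be opened from Python. The sketch below is illustrative and not part of the official library: it assumes `rasterio`, `laspy` (with a LAZ backend, e.g. `pip install "laspy[lazrs]"`) and `pycocotools` are installed; the file paths are taken from the listing above.

```python
from pathlib import Path

import laspy                       # reads .copc.laz tiles (COPC is valid LAZ)
import rasterio                    # reads the GeoTIFF image tiles
from pycocotools.coco import COCO  # reads the MS-COCO annotation files

data_root = Path("./data/PixelsPointsPolygons/data/224")  # your $DATA_ROOT

# An RGB aerial image tile (25 cm ground sampling distance).
with rasterio.open(data_root / "images/val/CH/0/image0_CH_val.tif") as src:
    image = src.read()             # numpy array of shape (bands, height, width)
    print(image.shape, src.crs)

# The co-located aerial LiDAR tile.
las = laspy.read(data_root / "lidar/val/CH/0/lidar0_CH_val.copc.laz")
print(len(las.points), "points")

# Building polygon annotations for the whole val split.
coco = COCO(str(data_root / "annotations/annotations_all_val.json"))
print(len(coco.getAnnIds()), "building annotations")
```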
## Pretrained model weights

### Download

The recommended and fastest way to download the pretrained model weights is to run

```
python scripts/download_pretrained.py --model-root $MODEL_ROOT
```

Optionally, you can also download the weights by running

```
git clone https://huggingface.co/rsi/PixelsPointsPolygons $MODEL_ROOT
```

Both options download all checkpoints (as `.pth`) and the results presented in the paper (as MS-COCO `.json`) into `$MODEL_ROOT`.

## Code

### Download

```
git clone https://github.com/raphaelsulzer/PixelsPointsPolygons
```

### Installation

To create a conda environment named `p3` and install the repository as a Python package with all dependencies, run

```
bash install.sh
```

or, if you want to manage the environment yourself, run

```
pip install -r requirements-torch-cuda.txt
pip install .
```

⚠️ **Warning**: The implementation of the LiDAR point cloud encoder uses Open3D-ML. Currently, Open3D-ML officially supports only the PyTorch version specified in `requirements-torch-cuda.txt`.

### Setup

The project uses Hydra configuration, which allows any parameter to be modified either from a `.yaml` file or directly from the command line. To set up the project structure, we recommend specifying your `$DATA_ROOT` and `$MODEL_ROOT` in `config/host/default.yaml`.

To view all available configuration options, run

```
python scripts/train.py --help
```

### Predict demo tile

After downloading the model weights and setting up the code, you can predict a demo tile by running

```
python scripts/predict_demo.py checkpoint=best_val_iou experiment=${MODEL}_${MODALITY} +image_file=demo_data/image0_CH_val.tif +lidar_file=demo_data/lidar0_CH_val.copc.laz
```

At least one of `image_file` or `lidar_file` must be specified. `$MODEL` can be one of `ffl`, `hisup` or `p2p`. `$MODALITY` can be `image`, `lidar` or `fusion`. The result is stored in `prediction.png`.

### Reproduce paper results

To reproduce the results from the paper, run the following commands

```
python scripts/modality_ablation.py
python scripts/lidar_density_ablation.py
python scripts/all_countries.py
```

### Custom training, prediction and evaluation

We recommend first setting up a custom experiment file `$EXP_FILE` in `config/experiment/`, following the structure of one of the existing files, e.g. `ffl_fusion.yaml`. You can then run

```
# train your model (on multiple GPUs)
torchrun --nproc_per_node=$NUM_GPU scripts/train.py experiment=$EXP_FILE

# predict the test set with your model (on multiple GPUs)
torchrun --nproc_per_node=$NUM_GPU scripts/predict.py experiment=$EXP_FILE evaluation=test checkpoint=best_val_iou

# evaluate your prediction of the test set
python scripts/evaluate.py experiment=$EXP_FILE evaluation=test checkpoint=best_val_iou
```

You can also continue training from a provided pretrained model with

```
# train your model (on a single GPU)
python scripts/train.py experiment=p2p_fusion checkpoint=latest
```

## Citation

If you use our work, please cite

```bibtex
TODO
```

## Acknowledgements

This repository benefits from the following open-source work. We thank the authors for their great work.

1. [Frame Field Learning](https://github.com/Lydorn/Polygonization-by-Frame-Field-Learning)
2. [HiSup](https://github.com/SarahwXU/HiSup)
3. [Pix2Poly](https://github.com/yeshwanth95/Pix2Poly)