---
license: mit
datasets:
- rsi/PixelsPointsPolygons
language:
- en
metrics:
- accuracy
base_model:
- timm/vit_small_patch8_224.dino
pipeline_tag: object-detection
tags:
- building
- vectorization
- polygon
- aerial
- image
- pointcloud
- multimodal
---
# The P3 dataset: Pixels, Points and Polygons for Multimodal Building Vectorization

Raphael Sulzer<sup>1,2</sup>, Liuyun Duan<sup>1</sup>, Nicolas Girard<sup>1</sup>, Florent Lafarge<sup>2</sup>

<sup>1</sup>LuxCarta Technology

<sup>2</sup>Centre Inria d'Université Côte d'Azur

Figure 1: A view of our dataset of Zurich, Switzerland
## Table of Contents
- [Abstract](#abstract)
- [Highlights](#highlights)
- [Dataset](#dataset)
- [Pretrained model weights](#pretrained-model-weights)
- [Code](#code)
- [Citation](#citation)
- [Acknowledgements](#acknowledgements)
## Abstract
We present the P3 dataset, a large-scale multimodal benchmark for building vectorization, constructed from aerial LiDAR point clouds, high-resolution aerial imagery, and vectorized 2D building outlines, collected across three continents. The dataset contains over 10 billion LiDAR points with decimeter-level accuracy and RGB images at a ground sampling distance of 25 cm. While many existing datasets primarily focus on the image modality, P3 offers a complementary perspective by also incorporating dense 3D information. We demonstrate that LiDAR point clouds serve as a robust modality for predicting building polygons, both in hybrid and end-to-end learning frameworks. Moreover, fusing aerial LiDAR and imagery further improves accuracy and geometric quality of predicted polygons. The P3 dataset is publicly available, along with code and pretrained weights of three state-of-the-art models for building polygon prediction at https://github.com/raphaelsulzer/PixelsPointsPolygons.
## Highlights
- A global, multimodal dataset of aerial images, aerial LiDAR point clouds and building outline polygons, available at [huggingface.co/datasets/rsi/PixelsPointsPolygons](https://huggingface.co/datasets/rsi/PixelsPointsPolygons)
- A library for training and evaluating state-of-the-art deep learning methods on the dataset, available at [github.com/raphaelsulzer/PixelsPointsPolygons](https://github.com/raphaelsulzer/PixelsPointsPolygons)
- Pretrained model weights, available at [huggingface.co/rsi/PixelsPointsPolygons](https://huggingface.co/rsi/PixelsPointsPolygons)
## Dataset
### Overview
### Download
The recommended and fastest way to download the dataset is to run
```
pip install huggingface_hub
python scripts/download_dataset.py --dataset-root $DATA_ROOT
```
Alternatively, you can download the dataset by running
```
git lfs install
git clone https://huggingface.co/datasets/rsi/PixelsPointsPolygons $DATA_ROOT
```
Both options download the full dataset into `$DATA_ROOT`, including aerial images (as .tif), aerial LiDAR point clouds (as .copc.laz) and building polygon annotations (as MS-COCO .json). The size of the dataset is around 163 GB.
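Given the size of the full dataset, fetching only part of it can be useful. The following is a minimal sketch of how this might be done with `huggingface_hub.snapshot_download` and its `allow_patterns` filter. It assumes the repository mirrors the folder layout listed under Structure (`data/224/<modality>/<split>/<country>/...`). The helpers `subset_patterns` and `download_subset` are hypothetical and not part of the P3 code base.

```python
from pathlib import Path


def subset_patterns(modalities, splits, countries, tile_size=224):
    """Build glob patterns selecting a subset of the dataset.

    Assumes the layout data/<tile_size>/<modality>/<split>/<country>/...
    """
    # Always include the annotation files.
    patterns = [f"data/{tile_size}/annotations/*"]
    for modality in modalities:
        for split in splits:
            for country in countries:
                patterns.append(f"data/{tile_size}/{modality}/{split}/{country}/*")
    return patterns


def download_subset(data_root, modalities, splits, countries):
    """Download only the files matching the requested subset."""
    # Imported lazily so the pattern helper works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="rsi/PixelsPointsPolygons",
        repo_type="dataset",
        local_dir=str(Path(data_root)),
        allow_patterns=subset_patterns(modalities, splits, countries),
    )


# Example (not executed here): only the Swiss validation tiles.
# download_subset("./p3_data", ["images", "lidar"], ["val"], ["CH"])
```

The patterns are matched against file paths inside the repository, so a wrong layout assumption simply results in fewer files being downloaded rather than an error.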
### Structure
Dataset folder structure:
```text
PixelsPointsPolygons/data/224
├── annotations
│   ├── annotations_all_test.json
│   ├── annotations_all_train.json
│   └── annotations_all_val.json
│   ... (24 files total)
├── images
│   ├── train
│   │   ├── CH
│   │   │   ├── 0
│   │   │   │   ├── image0_CH_train.tif
│   │   │   │   ├── image1000_CH_train.tif
│   │   │   │   └── image1001_CH_train.tif
│   │   │   │   ... (5000 files total)
│   │   │   ├── 5000
│   │   │   │   ├── image5000_CH_train.tif
│   │   │   │   ├── image5001_CH_train.tif
│   │   │   │   └── image5002_CH_train.tif
│   │   │   │   ... (5000 files total)
│   │   │   └── 10000
│   │   │       ├── image10000_CH_train.tif
│   │   │       ├── image10001_CH_train.tif
│   │   │       └── image10002_CH_train.tif
│   │   │       ... (5000 files total)
│   │   │   ... (11 dirs total)
│   │   ├── NY
│   │   │   ├── 0
│   │   │   │   ├── image0_NY_train.tif
│   │   │   │   ├── image1000_NY_train.tif
│   │   │   │   └── image1001_NY_train.tif
│   │   │   │   ... (5000 files total)
│   │   │   ├── 5000
│   │   │   │   ├── image5000_NY_train.tif
│   │   │   │   ├── image5001_NY_train.tif
│   │   │   │   └── image5002_NY_train.tif
│   │   │   │   ... (5000 files total)
│   │   │   └── 10000
│   │   │       ├── image10000_NY_train.tif
│   │   │       ├── image10001_NY_train.tif
│   │   │       └── image10002_NY_train.tif
│   │   │       ... (5000 files total)
│   │   │   ... (11 dirs total)
│   │   └── NZ
│   │       ├── 0
│   │       │   ├── image0_NZ_train.tif
│   │       │   ├── image1000_NZ_train.tif
│   │       │   └── image1001_NZ_train.tif
│   │       │   ... (5000 files total)
│   │       ├── 5000
│   │       │   ├── image5000_NZ_train.tif
│   │       │   ├── image5001_NZ_train.tif
│   │       │   └── image5002_NZ_train.tif
│   │       │   ... (5000 files total)
│   │       └── 10000
│   │           ├── image10000_NZ_train.tif
│   │           ├── image10001_NZ_train.tif
│   │           └── image10002_NZ_train.tif
│   │           ... (5000 files total)
│   │       ... (11 dirs total)
│   ├── val
│   │   ├── CH
│   │   │   └── 0
│   │   │       ├── image0_CH_val.tif
│   │   │       ├── image100_CH_val.tif
│   │   │       └── image101_CH_val.tif
│   │   │       ... (529 files total)
│   │   ├── NY
│   │   │   └── 0
│   │   │       ├── image0_NY_val.tif
│   │   │       ├── image100_NY_val.tif
│   │   │       └── image101_NY_val.tif
│   │   │       ... (529 files total)
│   │   └── NZ
│   │       └── 0
│   │           ├── image0_NZ_val.tif
│   │           ├── image100_NZ_val.tif
│   │           └── image101_NZ_val.tif
│   │           ... (529 files total)
│   └── test
│       ├── CH
│       │   ├── 0
│       │   │   ├── image0_CH_test.tif
│       │   │   ├── image1000_CH_test.tif
│       │   │   └── image1001_CH_test.tif
│       │   │   ... (5000 files total)
│       │   ├── 5000
│       │   │   ├── image5000_CH_test.tif
│       │   │   ├── image5001_CH_test.tif
│       │   │   └── image5002_CH_test.tif
│       │   │   ... (5000 files total)
│       │   └── 10000
│       │       ├── image10000_CH_test.tif
│       │       ├── image10001_CH_test.tif
│       │       └── image10002_CH_test.tif
│       │       ... (4400 files total)
│       ├── NY
│       │   ├── 0
│       │   │   ├── image0_NY_test.tif
│       │   │   ├── image1000_NY_test.tif
│       │   │   └── image1001_NY_test.tif
│       │   │   ... (5000 files total)
│       │   ├── 5000
│       │   │   ├── image5000_NY_test.tif
│       │   │   ├── image5001_NY_test.tif
│       │   │   └── image5002_NY_test.tif
│       │   │   ... (5000 files total)
│       │   └── 10000
│       │       ├── image10000_NY_test.tif
│       │       ├── image10001_NY_test.tif
│       │       └── image10002_NY_test.tif
│       │       ... (4400 files total)
│       └── NZ
│           ├── 0
│           │   ├── image0_NZ_test.tif
│           │   ├── image1000_NZ_test.tif
│           │   └── image1001_NZ_test.tif
│           │   ... (5000 files total)
│           ├── 5000
│           │   ├── image5000_NZ_test.tif
│           │   ├── image5001_NZ_test.tif
│           │   └── image5002_NZ_test.tif
│           │   ... (5000 files total)
│           └── 10000
│               ├── image10000_NZ_test.tif
│               ├── image10001_NZ_test.tif
│               └── image10002_NZ_test.tif
│               ... (4400 files total)
├── lidar
│   ├── train
│   │   ├── CH
│   │   │   ├── 0
│   │   │   │   ├── lidar0_CH_train.copc.laz
│   │   │   │   ├── lidar1000_CH_train.copc.laz
│   │   │   │   └── lidar1001_CH_train.copc.laz
│   │   │   │   ... (5000 files total)
│   │   │   ├── 5000
│   │   │   │   ├── lidar5000_CH_train.copc.laz
│   │   │   │   ├── lidar5001_CH_train.copc.laz
│   │   │   │   └── lidar5002_CH_train.copc.laz
│   │   │   │   ... (5000 files total)
│   │   │   └── 10000
│   │   │       ├── lidar10000_CH_train.copc.laz
│   │   │       ├── lidar10001_CH_train.copc.laz
│   │   │       └── lidar10002_CH_train.copc.laz
│   │   │       ... (5000 files total)
│   │   │   ... (11 dirs total)
│   │   ├── NY
│   │   │   ├── 0
│   │   │   │   ├── lidar0_NY_train.copc.laz
│   │   │   │   ├── lidar10_NY_train.copc.laz
│   │   │   │   └── lidar1150_NY_train.copc.laz
│   │   │   │   ... (1071 files total)
│   │   │   ├── 5000
│   │   │   │   ├── lidar5060_NY_train.copc.laz
│   │   │   │   ├── lidar5061_NY_train.copc.laz
│   │   │   │   └── lidar5062_NY_train.copc.laz
│   │   │   │   ... (2235 files total)
│   │   │   └── 10000
│   │   │       ├── lidar10000_NY_train.copc.laz
│   │   │       ├── lidar10001_NY_train.copc.laz
│   │   │       └── lidar10002_NY_train.copc.laz
│   │   │       ... (4552 files total)
│   │   │   ... (11 dirs total)
│   │   └── NZ
│   │       ├── 0
│   │       │   ├── lidar0_NZ_train.copc.laz
│   │       │   ├── lidar1000_NZ_train.copc.laz
│   │       │   └── lidar1001_NZ_train.copc.laz
│   │       │   ... (5000 files total)
│   │       ├── 5000
│   │       │   ├── lidar5000_NZ_train.copc.laz
│   │       │   ├── lidar5001_NZ_train.copc.laz
│   │       │   └── lidar5002_NZ_train.copc.laz
│   │       │   ... (5000 files total)
│   │       └── 10000
│   │           ├── lidar10000_NZ_train.copc.laz
│   │           ├── lidar10001_NZ_train.copc.laz
│   │           └── lidar10002_NZ_train.copc.laz
│   │           ... (4999 files total)
│   │       ... (11 dirs total)
│   ├── val
│   │   ├── CH
│   │   │   └── 0
│   │   │       ├── lidar0_CH_val.copc.laz
│   │   │       ├── lidar100_CH_val.copc.laz
│   │   │       └── lidar101_CH_val.copc.laz
│   │   │       ... (529 files total)
│   │   ├── NY
│   │   │   └── 0
│   │   │       ├── lidar0_NY_val.copc.laz
│   │   │       ├── lidar100_NY_val.copc.laz
│   │   │       └── lidar101_NY_val.copc.laz
│   │   │       ... (529 files total)
│   │   └── NZ
│   │       └── 0
│   │           ├── lidar0_NZ_val.copc.laz
│   │           ├── lidar100_NZ_val.copc.laz
│   │           └── lidar101_NZ_val.copc.laz
│   │           ... (529 files total)
│   └── test
│       ├── CH
│       │   ├── 0
│       │   │   ├── lidar0_CH_test.copc.laz
│       │   │   ├── lidar1000_CH_test.copc.laz
│       │   │   └── lidar1001_CH_test.copc.laz
│       │   │   ... (5000 files total)
│       │   ├── 5000
│       │   │   ├── lidar5000_CH_test.copc.laz
│       │   │   ├── lidar5001_CH_test.copc.laz
│       │   │   └── lidar5002_CH_test.copc.laz
│       │   │   ... (5000 files total)
│       │   └── 10000
│       │       ├── lidar10000_CH_test.copc.laz
│       │       ├── lidar10001_CH_test.copc.laz
│       │       └── lidar10002_CH_test.copc.laz
│       │       ... (4400 files total)
│       ├── NY
│       │   ├── 0
│       │   │   ├── lidar0_NY_test.copc.laz
│       │   │   ├── lidar1000_NY_test.copc.laz
│       │   │   └── lidar1001_NY_test.copc.laz
│       │   │   ... (4964 files total)
│       │   ├── 5000
│       │   │   ├── lidar5000_NY_test.copc.laz
│       │   │   ├── lidar5001_NY_test.copc.laz
│       │   │   └── lidar5002_NY_test.copc.laz
│       │   │   ... (4953 files total)
│       │   └── 10000
│       │       ├── lidar10000_NY_test.copc.laz
│       │       ├── lidar10001_NY_test.copc.laz
│       │       └── lidar10002_NY_test.copc.laz
│       │       ... (4396 files total)
│       └── NZ
│           ├── 0
│           │   ├── lidar0_NZ_test.copc.laz
│           │   ├── lidar1000_NZ_test.copc.laz
│           │   └── lidar1001_NZ_test.copc.laz
│           │   ... (5000 files total)
│           ├── 5000
│           │   ├── lidar5000_NZ_test.copc.laz
│           │   ├── lidar5001_NZ_test.copc.laz
│           │   └── lidar5002_NZ_test.copc.laz
│           │   ... (5000 files total)
│           └── 10000
│               ├── lidar10000_NZ_test.copc.laz
│               ├── lidar10001_NZ_test.copc.laz
│               └── lidar10002_NZ_test.copc.laz
│               ... (4400 files total)
└── ffl
    ├── train
    │   ├── CH
    │   │   ├── 0
    │   │   │   ├── image0_CH_train.pt
    │   │   │   ├── image1000_CH_train.pt
    │   │   │   └── image1001_CH_train.pt
    │   │   │   ... (5000 files total)
    │   │   ├── 5000
    │   │   │   ├── image5000_CH_train.pt
    │   │   │   ├── image5001_CH_train.pt
    │   │   │   └── image5002_CH_train.pt
    │   │   │   ... (5000 files total)
    │   │   └── 10000
    │   │       ├── image10000_CH_train.pt
    │   │       ├── image10001_CH_train.pt
    │   │       └── image10002_CH_train.pt
    │   │       ... (5000 files total)
    │   │   ... (11 dirs total)
    │   ├── NY
    │   │   ├── 0
    │   │   │   ├── image0_NY_train.pt
    │   │   │   ├── image1000_NY_train.pt
    │   │   │   └── image1001_NY_train.pt
    │   │   │   ... (5000 files total)
    │   │   ├── 5000
    │   │   │   ├── image5000_NY_train.pt
    │   │   │   ├── image5001_NY_train.pt
    │   │   │   └── image5002_NY_train.pt
    │   │   │   ... (5000 files total)
    │   │   └── 10000
    │   │       ├── image10000_NY_train.pt
    │   │       ├── image10001_NY_train.pt
    │   │       └── image10002_NY_train.pt
    │   │       ... (5000 files total)
    │   │   ... (11 dirs total)
    │   ├── NZ
    │   │   ├── 0
    │   │   │   ├── image0_NZ_train.pt
    │   │   │   ├── image1000_NZ_train.pt
    │   │   │   └── image1001_NZ_train.pt
    │   │   │   ... (5000 files total)
    │   │   ├── 5000
    │   │   │   ├── image5000_NZ_train.pt
    │   │   │   ├── image5001_NZ_train.pt
    │   │   │   └── image5002_NZ_train.pt
    │   │   │   ... (5000 files total)
    │   │   └── 10000
    │   │       ├── image10000_NZ_train.pt
    │   │       ├── image10001_NZ_train.pt
    │   │       └── image10002_NZ_train.pt
    │   │       ... (5000 files total)
    │   │   ... (11 dirs total)
    │   ├── processed-flag-all
    │   ├── processed-flag-CH
    │   └── processed-flag-NY
    │   ... (8 files total)
    ├── val
    │   ├── CH
    │   │   └── 0
    │   │       ├── image0_CH_val.pt
    │   │       ├── image100_CH_val.pt
    │   │       └── image101_CH_val.pt
    │   │       ... (529 files total)
    │   ├── NY
    │   │   └── 0
    │   │       ├── image0_NY_val.pt
    │   │       ├── image100_NY_val.pt
    │   │       └── image101_NY_val.pt
    │   │       ... (529 files total)
    │   ├── NZ
    │   │   └── 0
    │   │       ├── image0_NZ_val.pt
    │   │       ├── image100_NZ_val.pt
    │   │       └── image101_NZ_val.pt
    │   │       ... (529 files total)
    │   ├── processed-flag-all
    │   ├── processed-flag-CH
    │   └── processed-flag-NY
    │   ... (8 files total)
    └── test
        ├── CH
        │   ├── 0
        │   │   ├── image0_CH_test.pt
        │   │   ├── image1000_CH_test.pt
        │   │   └── image1001_CH_test.pt
        │   │   ... (5000 files total)
        │   ├── 5000
        │   │   ├── image5000_CH_test.pt
        │   │   ├── image5001_CH_test.pt
        │   │   └── image5002_CH_test.pt
        │   │   ... (5000 files total)
        │   └── 10000
        │       ├── image10000_CH_test.pt
        │       ├── image10001_CH_test.pt
        │       └── image10002_CH_test.pt
        │       ... (4400 files total)
        ├── NY
        │   ├── 0
        │   │   ├── image0_NY_test.pt
        │   │   ├── image1000_NY_test.pt
        │   │   └── image1001_NY_test.pt
        │   │   ... (5000 files total)
        │   ├── 5000
        │   │   ├── image5000_NY_test.pt
        │   │   ├── image5001_NY_test.pt
        │   │   └── image5002_NY_test.pt
        │   │   ... (5000 files total)
        │   └── 10000
        │       ├── image10000_NY_test.pt
        │       ├── image10001_NY_test.pt
        │       └── image10002_NY_test.pt
        │       ... (4400 files total)
        ├── NZ
        │   ├── 0
        │   │   ├── image0_NZ_test.pt
        │   │   ├── image1000_NZ_test.pt
        │   │   └── image1001_NZ_test.pt
        │   │   ... (5000 files total)
        │   ├── 5000
        │   │   ├── image5000_NZ_test.pt
        │   │   ├── image5001_NZ_test.pt
        │   │   └── image5002_NZ_test.pt
        │   │   ... (5000 files total)
        │   └── 10000
        │       ├── image10000_NZ_test.pt
        │       ├── image10001_NZ_test.pt
        │       └── image10002_NZ_test.pt
        │       ... (4400 files total)
        ├── processed-flag-all
        ├── processed-flag-CH
        └── processed-flag-NY
        ... (8 files total)
```
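The listing above suggests that tiles are grouped into sub-folders named after the tile index rounded down to a multiple of 5000, and that an image, LiDAR and `ffl` file share the same index, country and split in their names. A minimal sketch of a path helper under that assumption follows. The function `tile_paths` and the constant `BUCKET_SIZE` are hypothetical and not part of the P3 code base, and `$DATA_ROOT` is assumed to contain the `data/224` folder.

```python
from pathlib import Path

# Assumption read off the folder listing: sub-folder = index rounded down to a multiple of 5000.
BUCKET_SIZE = 5000


def tile_paths(data_root, split, country, index, tile_size=224):
    """Return the expected image, LiDAR and ffl paths for one tile."""
    bucket = str((index // BUCKET_SIZE) * BUCKET_SIZE)
    base = Path(data_root) / "data" / str(tile_size)
    stem = f"{index}_{country}_{split}"
    return {
        "image": base / "images" / split / country / bucket / f"image{stem}.tif",
        "lidar": base / "lidar" / split / country / bucket / f"lidar{stem}.copc.laz",
        "ffl": base / "ffl" / split / country / bucket / f"image{stem}.pt",
    }


paths = tile_paths("/data/P3", "train", "CH", 5001)
print(paths["image"].as_posix())
```

Some LiDAR folders in the listing hold fewer files than the matching image folders (for example `lidar/train/NY/0` with 1071 files), so it is probably worth checking `paths["lidar"].exists()` before loading a tile.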
## Pretrained model weights
### Download
The recommended and fastest way to download the pretrained model weights is to run
```
python scripts/download_pretrained.py --model-root $MODEL_ROOT
```
Alternatively, you can download the weights by running
```
git clone https://huggingface.co/rsi/PixelsPointsPolygons $MODEL_ROOT
```
Both options download all checkpoints (as .pth) and the results presented in the paper (as MS-COCO .json) into `$MODEL_ROOT`.
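In the MS-COCO convention, a polygon is stored as a flat coordinate list `[x1, y1, x2, y2, ...]` inside a `segmentation` field. The sketch below shows one way to turn such a list into vertices and compute its area with the shoelace formula. Whether the result files here follow exactly this layout is an assumption, and the `example` entry is made up for illustration.

```python
def polygon_vertices(flat_coords):
    """Turn a flat COCO polygon [x1, y1, x2, y2, ...] into [(x1, y1), (x2, y2), ...]."""
    if len(flat_coords) % 2 != 0:
        raise ValueError("expected an even number of coordinates")
    return list(zip(flat_coords[0::2], flat_coords[1::2]))


def polygon_area(vertices):
    """Absolute polygon area via the shoelace formula, in squared pixel units."""
    n = len(vertices)
    twice_area = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the ring
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


# Hypothetical entry shaped like a COCO polygon annotation (a 40 x 20 pixel rectangle).
example = {"image_id": 0, "category_id": 1, "segmentation": [[10, 10, 50, 10, 50, 30, 10, 30]]}
ring = polygon_vertices(example["segmentation"][0])
print(len(ring), polygon_area(ring))  # prints: 4 800.0
```

If the tiles are at the stated 25 cm ground sampling distance, one pixel covers 0.0625 m², so the pixel area can be scaled accordingly.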
## Code
### Download
```
git clone https://github.com/raphaelsulzer/PixelsPointsPolygons
```
### Installation
To create a conda environment named `p3` and install the repository as a Python package with all dependencies, run
```
bash install.sh
```
or, if you want to manage the environment yourself, run
```
pip install -r requirements-torch-cuda.txt
pip install .
```
⚠️ **Warning**: The implementation of the LiDAR point cloud encoder uses Open3D-ML. Currently, Open3D-ML officially only supports the PyTorch version specified in `requirements-torch-cuda.txt`.
### Setup
The project supports Hydra configuration, which lets you modify any parameter either from a `.yaml` file or directly from the command line.
To set up the project structure, we recommend specifying your `$DATA_ROOT` and `$MODEL_ROOT` in `config/host/default.yaml`.
To view all available configuration options, run
```
python scripts/train.py --help
```
### Predict demo tile
After downloading the model weights and setting up the code, you can predict a demo tile by running
```
python scripts/predict_demo.py checkpoint=best_val_iou experiment=${MODEL}_${MODALITY} +image_file=demo_data/image0_CH_val.tif +lidar_file=demo_data/lidar0_CH_val.copc.laz
```
At least one of `image_file` or `lidar_file` has to be specified. `$MODEL` can be one of the following: `ffl`, `hisup` or `p2p`. `$MODALITY` can be `image`, `lidar` or `fusion`.
The result will be stored in `prediction.png`.
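To compare all model and modality combinations on the demo tile, the command above can be generated programmatically. The sketch below builds one argument list per combination. The helper `demo_command` is hypothetical, and passing only the input files that a modality consumes is an assumption (the text above only requires that at least one of them is given).

```python
from itertools import product

MODELS = ("ffl", "hisup", "p2p")
MODALITIES = ("image", "lidar", "fusion")


def demo_command(model, modality,
                 image_file="demo_data/image0_CH_val.tif",
                 lidar_file="demo_data/lidar0_CH_val.copc.laz"):
    """Build the argument list for one predict_demo.py run."""
    cmd = ["python", "scripts/predict_demo.py",
           "checkpoint=best_val_iou", f"experiment={model}_{modality}"]
    # Assumption: pass only the inputs the chosen modality consumes.
    if modality in ("image", "fusion"):
        cmd.append(f"+image_file={image_file}")
    if modality in ("lidar", "fusion"):
        cmd.append(f"+lidar_file={lidar_file}")
    return cmd


commands = [demo_command(model, modality) for model, modality in product(MODELS, MODALITIES)]
for cmd in commands:
    # Replace print with subprocess.run(cmd, check=True) to actually execute the run.
    print(" ".join(cmd))
```

Because each run stores its result in `prediction.png`, the file would need to be renamed or moved between runs to keep all outputs.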
### Reproduce paper results
To reproduce the results from the paper, run the following commands:
```
python scripts/modality_ablation.py
python scripts/lidar_density_ablation.py
python scripts/all_countries.py
```
### Custom training, prediction and evaluation
We recommend first setting up a custom experiment file `$EXP_FILE` in `config/experiment/`, following the structure of one of the existing files, e.g. `ffl_fusion.yaml`. You can then run
```
# train your model (on multiple GPUs)
torchrun --nproc_per_node=$NUM_GPU scripts/train.py experiment=$EXP_FILE
# predict the test set with your model (on multiple GPUs)
torchrun --nproc_per_node=$NUM_GPU scripts/predict.py experiment=$EXP_FILE evaluation=test checkpoint=best_val_iou
# evaluate your prediction of the test set
python scripts/evaluate.py experiment=$EXP_FILE evaluation=test checkpoint=best_val_iou
```
You can also continue training from a provided pretrained model with
```
# train your model (on a single GPU)
python scripts/train.py experiment=p2p_fusion checkpoint=latest
```
## Citation
If you use our work, please cite
```bibtex
TODO
```
## Acknowledgements
This repository benefits from the following open-source work. We thank the authors for their great work.
1. [Frame Field Learning](https://github.com/Lydorn/Polygonization-by-Frame-Field-Learning)
2. [HiSup](https://github.com/SarahwXU/HiSup)
3. [Pix2Poly](https://github.com/yeshwanth95/Pix2Poly)