rsi committed on
Commit
e19bfa9
·
1 Parent(s): a713834

update readme

README.md CHANGED
@@ -18,6 +18,7 @@ tags:
18
  - pointcloud
19
  - multimodal
20
  ---
 
21
  <div align="center">
22
  <h1 align="center">The P<sup>3</sup> dataset: Pixels, Points and Polygons <br> for Multimodal Building Vectorization</h1>
23
  <h3 align="center">Raphael Sulzer<sup>1,2</sup> &nbsp;&nbsp;&nbsp; Liuyun Duan<sup>1</sup>
@@ -27,7 +28,7 @@ tags:
27
  <b>Figure 1</b>: A view of our dataset of Zurich, Switzerland
28
  </div>
29
 
30
- ## Abstract:
31
 
32
  <div align="justify">
33
  We present the P<sup>3</sup> dataset, a large-scale multimodal benchmark for building vectorization, constructed from aerial LiDAR point clouds, high-resolution aerial imagery, and vectorized 2D building outlines, collected across three continents. The dataset contains over 10 billion LiDAR points with decimeter-level accuracy and RGB images at a ground sampling distance of 25 cm. While many existing datasets primarily focus on the image modality, P<sup>3</sup> offers a complementary perspective by also incorporating dense 3D information. We demonstrate that LiDAR point clouds serve as a robust modality for predicting building polygons, both in hybrid and end-to-end learning frameworks. Moreover, fusing aerial LiDAR and imagery further improves accuracy and geometric quality of predicted polygons. The P<sup>3</sup> dataset is publicly available, along with code and pretrained weights of three state-of-the-art models for building polygon prediction at https://github.com/raphaelsulzer/PixelsPointsPolygons.
@@ -35,28 +36,425 @@ We present the P<sup>3</sup> dataset, a large-scale multimodal benchmark for bui
35
 
36
  ## Highlights
37
 
38
- - A global, multimodal dataset of aerial images, aerial lidar point clouds and building polygons
39
- - A library for training and evaluating state-of-the-art deep learning methods on the dataset
 
40
 
41
 
42
  ## Dataset
43
 
44
- ### Download
45
-
46
- You can download the dataset at [huggingface.co/datasets/rsi/PixelsPointsPolygons](https://huggingface.co/datasets/rsi/PixelsPointsPolygons) .
47
-
48
-
49
  ### Overview
50
 
51
  <div align="left">
52
  <img src="./worldmap.jpg" width=60% height=50%>
53
  </div>
54
 
 
55
 
56
- <!-- ### Prepare custom tile size
57
 
58
- See [datasets preprocessing](data_preprocess) for instructions on preparing a dataset with different tile sizes. -->
59
 
60
 
61
  ## Code
62
 
@@ -66,9 +464,9 @@ See [datasets preprocessing](data_preprocess) for instructions on preparing a da
66
  git clone https://github.com/raphaelsulzer/PixelsPointsPolygons
67
  ```
68
 
69
- ### Requirements
70
 
71
- To create a conda environment named `ppp` and install the repository as a python package with all dependencies run
72
  ```
73
  bash install.sh
74
  ```
@@ -97,50 +495,75 @@ pip install .
97
  | Pix2Poly |\<pix2poly>| PointPillars (PP) + ViT | \<pp_vit> | | ✅ | 0.80 | 0.88 |
98
  | Pix2Poly |\<pix2poly>| PP+ViT \& ViT | \<fusion_vit> | ✅ |✅ | 0.78 | 0.85 | -->
99
 
100
- ### Configuration
101
 
102
- The project supports hydra configuration which allows to modify any parameter from the command line, such as the model and encoder types from the table above.
103
- To view all available options run
104
  ```
105
- python train.py --help
106
  ```
107
 
108
- ### Training
109
 
110
- Start training with the following command:
111
 
112
- ```
113
- torchrun --nproc_per_node=<num GPUs> train.py model=<model> encoder=<encoder> model.batch_size=<batch size> ...
 
114
 
115
  ```
116
 
117
- ### Prediction
118
 
119
- ```
120
- torchrun --nproc_per_node=<num GPUs> predict.py model=<model> checkpoint=best_val_iou ...
 
121
 
122
  ```
123
 
124
- ### Evaluation
125
 
126
  ```
127
- python evaluate.py model=<model> checkpoint=best_val_iou
 
 
128
  ```
129
- <!-- ## Trained models
130
 
131
- asd -->
132
 
 
133
 
134
- <!-- ## Results
135
 
136
- #TODO Put paper main results table here -->
137
 
138
 
139
  ## Citation
140
 
141
  If you find our work useful, please consider citing:
142
  ```bibtex
143
- ...
144
  ```
145
 
146
  ## Acknowledgements
 
18
  - pointcloud
19
  - multimodal
20
  ---
21
+
22
  <div align="center">
23
  <h1 align="center">The P<sup>3</sup> dataset: Pixels, Points and Polygons <br> for Multimodal Building Vectorization</h1>
24
  <h3 align="center">Raphael Sulzer<sup>1,2</sup> &nbsp;&nbsp;&nbsp; Liuyun Duan<sup>1</sup>
 
28
  <b>Figure 1</b>: A view of our dataset of Zurich, Switzerland
29
  </div>
30
 
31
+ ## Abstract
32
 
33
  <div align="justify">
34
  We present the P<sup>3</sup> dataset, a large-scale multimodal benchmark for building vectorization, constructed from aerial LiDAR point clouds, high-resolution aerial imagery, and vectorized 2D building outlines, collected across three continents. The dataset contains over 10 billion LiDAR points with decimeter-level accuracy and RGB images at a ground sampling distance of 25 cm. While many existing datasets primarily focus on the image modality, P<sup>3</sup> offers a complementary perspective by also incorporating dense 3D information. We demonstrate that LiDAR point clouds serve as a robust modality for predicting building polygons, both in hybrid and end-to-end learning frameworks. Moreover, fusing aerial LiDAR and imagery further improves accuracy and geometric quality of predicted polygons. The P<sup>3</sup> dataset is publicly available, along with code and pretrained weights of three state-of-the-art models for building polygon prediction at https://github.com/raphaelsulzer/PixelsPointsPolygons.
 
36
 
37
  ## Highlights
38
 
39
+ - A global, multimodal dataset of aerial images, aerial LiDAR point clouds and building outline polygons, available at [huggingface.co/datasets/rsi/PixelsPointsPolygons](https://huggingface.co/datasets/rsi/PixelsPointsPolygons)
40
+ - A library for training and evaluating state-of-the-art deep learning methods on the dataset, available at [github.com/raphaelsulzer/PixelsPointsPolygons](https://github.com/raphaelsulzer/PixelsPointsPolygons)
41
+ - Pretrained model weights, available at [huggingface.co/rsi/PixelsPointsPolygons](https://huggingface.co/rsi/PixelsPointsPolygons)
42
 
43
 
44
  ## Dataset
45
 
46
  ### Overview
47
 
48
  <div align="left">
49
  <img src="./worldmap.jpg" width=60% height=50%>
50
  </div>
51
 
52
+ ### Download
53
 
54
+ ```
55
+ git lfs install
56
+ git clone https://huggingface.co/datasets/rsi/PixelsPointsPolygons $DATA_ROOT
57
+ ```
58
+
59
+ ### Structure
60
+
61
+ <details>
62
+ <summary>📁 Click to expand folder structure</summary>
63
+
64
+ ```text
65
+ PixelsPointsPolygons/data/224
66
+ ├── annotations
67
+ │ ├── annotations_all_test.json
68
+ │ ├── annotations_all_train.json
69
+ │ └── annotations_all_val.json
70
+ │ ... (24 files total)
71
+ ├── images
72
+ │ ├── train
73
+ │ │ ├── CH
74
+ │ │ │ ├── 0
75
+ │ │ │ │ ├── image0_CH_train.tif
76
+ │ │ │ │ ├── image1000_CH_train.tif
77
+ │ │ │ │ └── image1001_CH_train.tif
78
+ │ │ │ │ ... (5000 files total)
79
+ │ │ │ ├── 5000
80
+ │ │ │ │ ├── image5000_CH_train.tif
81
+ │ │ │ │ ├── image5001_CH_train.tif
82
+ │ │ │ │ └── image5002_CH_train.tif
83
+ │ │ │ │ ... (5000 files total)
84
+ │ │ │ └── 10000
85
+ │ │ │ ├── image10000_CH_train.tif
86
+ │ │ │ ├── image10001_CH_train.tif
87
+ │ │ │ └── image10002_CH_train.tif
88
+ │ │ │ ... (5000 files total)
89
+ │ │ │ ... (11 dirs total)
90
+ │ │ ├── NY
91
+ │ │ │ ├── 0
92
+ │ │ │ │ ├── image0_NY_train.tif
93
+ │ │ │ │ ├── image1000_NY_train.tif
94
+ │ │ │ │ └── image1001_NY_train.tif
95
+ │ │ │ │ ... (5000 files total)
96
+ │ │ │ ├── 5000
97
+ │ │ │ │ ├── image5000_NY_train.tif
98
+ │ │ │ │ ├── image5001_NY_train.tif
99
+ │ │ │ │ └── image5002_NY_train.tif
100
+ │ │ │ │ ... (5000 files total)
101
+ │ │ │ └── 10000
102
+ │ │ │ ├── image10000_NY_train.tif
103
+ │ │ │ ├── image10001_NY_train.tif
104
+ │ │ │ └── image10002_NY_train.tif
105
+ │ │ │ ... (5000 files total)
106
+ │ │ │ ... (11 dirs total)
107
+ │ │ └── NZ
108
+ │ │ ├── 0
109
+ │ │ │ ├── image0_NZ_train.tif
110
+ │ │ │ ├── image1000_NZ_train.tif
111
+ │ │ │ └── image1001_NZ_train.tif
112
+ │ │ │ ... (5000 files total)
113
+ │ │ ├── 5000
114
+ │ │ │ ├── image5000_NZ_train.tif
115
+ │ │ │ ├── image5001_NZ_train.tif
116
+ │ │ │ └── image5002_NZ_train.tif
117
+ │ │ │ ... (5000 files total)
118
+ │ │ └── 10000
119
+ │ │ ├── image10000_NZ_train.tif
120
+ │ │ ├── image10001_NZ_train.tif
121
+ │ │ └── image10002_NZ_train.tif
122
+ │ │ ... (5000 files total)
123
+ │ │ ... (11 dirs total)
124
+ │ ├── val
125
+ │ │ ├── CH
126
+ │ │ │ └── 0
127
+ │ │ │ ├── image0_CH_val.tif
128
+ │ │ │ ├── image100_CH_val.tif
129
+ │ │ │ └── image101_CH_val.tif
130
+ │ │ │ ... (529 files total)
131
+ │ │ ├── NY
132
+ │ │ │ └── 0
133
+ │ │ │ ├── image0_NY_val.tif
134
+ │ │ │ ├── image100_NY_val.tif
135
+ │ │ │ └── image101_NY_val.tif
136
+ │ │ │ ... (529 files total)
137
+ │ │ └── NZ
138
+ │ │ └── 0
139
+ │ │ ├── image0_NZ_val.tif
140
+ │ │ ├── image100_NZ_val.tif
141
+ │ │ └── image101_NZ_val.tif
142
+ │ │ ... (529 files total)
143
+ │ └── test
144
+ │ ├── CH
145
+ │ │ ├── 0
146
+ │ │ │ ├── image0_CH_test.tif
147
+ │ │ │ ├── image1000_CH_test.tif
148
+ │ │ │ └── image1001_CH_test.tif
149
+ │ │ │ ... (5000 files total)
150
+ │ │ ├── 5000
151
+ │ │ │ ├── image5000_CH_test.tif
152
+ │ │ │ ├── image5001_CH_test.tif
153
+ │ │ │ └── image5002_CH_test.tif
154
+ │ │ │ ... (5000 files total)
155
+ │ │ └── 10000
156
+ │ │ ├── image10000_CH_test.tif
157
+ │ │ ├── image10001_CH_test.tif
158
+ │ │ └── image10002_CH_test.tif
159
+ │ │ ... (4400 files total)
160
+ │ ├── NY
161
+ │ │ ├── 0
162
+ │ │ │ ├── image0_NY_test.tif
163
+ │ │ │ ├── image1000_NY_test.tif
164
+ │ │ │ └── image1001_NY_test.tif
165
+ │ │ │ ... (5000 files total)
166
+ │ │ ├── 5000
167
+ │ │ │ ├── image5000_NY_test.tif
168
+ │ │ │ ├── image5001_NY_test.tif
169
+ │ │ │ └── image5002_NY_test.tif
170
+ │ │ │ ... (5000 files total)
171
+ │ │ └── 10000
172
+ │ │ ├── image10000_NY_test.tif
173
+ │ │ ├── image10001_NY_test.tif
174
+ │ │ └── image10002_NY_test.tif
175
+ │ │ ... (4400 files total)
176
+ │ └── NZ
177
+ │ ├── 0
178
+ │ │ ├── image0_NZ_test.tif
179
+ │ │ ├── image1000_NZ_test.tif
180
+ │ │ └── image1001_NZ_test.tif
181
+ │ │ ... (5000 files total)
182
+ │ ├── 5000
183
+ │ │ ├── image5000_NZ_test.tif
184
+ │ │ ├── image5001_NZ_test.tif
185
+ │ │ └── image5002_NZ_test.tif
186
+ │ │ ... (5000 files total)
187
+ │ └── 10000
188
+ │ ├── image10000_NZ_test.tif
189
+ │ ├── image10001_NZ_test.tif
190
+ │ └── image10002_NZ_test.tif
191
+ │ ... (4400 files total)
192
+ ├── lidar
193
+ │ ├── train
194
+ │ │ ├── CH
195
+ │ │ │ ├── 0
196
+ │ │ │ │ ├── lidar0_CH_train.copc.laz
197
+ │ │ │ │ ├── lidar1000_CH_train.copc.laz
198
+ │ │ │ │ └── lidar1001_CH_train.copc.laz
199
+ │ │ │ │ ... (5000 files total)
200
+ │ │ │ ├── 5000
201
+ │ │ │ │ ├── lidar5000_CH_train.copc.laz
202
+ │ │ │ │ ├── lidar5001_CH_train.copc.laz
203
+ │ │ │ │ └── lidar5002_CH_train.copc.laz
204
+ │ │ │ │ ... (5000 files total)
205
+ │ │ │ └── 10000
206
+ │ │ │ ├── lidar10000_CH_train.copc.laz
207
+ │ │ │ ├── lidar10001_CH_train.copc.laz
208
+ │ │ │ └── lidar10002_CH_train.copc.laz
209
+ │ │ │ ... (5000 files total)
210
+ │ │ │ ... (11 dirs total)
211
+ │ │ ├── NY
212
+ │ │ │ ├── 0
213
+ │ │ │ │ ├── lidar0_NY_train.copc.laz
214
+ │ │ │ │ ├── lidar10_NY_train.copc.laz
215
+ │ │ │ │ └── lidar1150_NY_train.copc.laz
216
+ │ │ │ │ ... (1071 files total)
217
+ │ │ │ ├── 5000
218
+ │ │ │ │ ├── lidar5060_NY_train.copc.laz
219
+ │ │ │ │ ├── lidar5061_NY_train.copc.laz
220
+ │ │ │ │ └── lidar5062_NY_train.copc.laz
221
+ │ │ │ │ ... (2235 files total)
222
+ │ │ │ └── 10000
223
+ │ │ │ ├── lidar10000_NY_train.copc.laz
224
+ │ │ │ ├── lidar10001_NY_train.copc.laz
225
+ │ │ │ └── lidar10002_NY_train.copc.laz
226
+ │ │ │ ... (4552 files total)
227
+ │ │ │ ... (11 dirs total)
228
+ │ │ └── NZ
229
+ │ │ ├── 0
230
+ │ │ │ ├── lidar0_NZ_train.copc.laz
231
+ │ │ │ ├── lidar1000_NZ_train.copc.laz
232
+ │ │ │ └── lidar1001_NZ_train.copc.laz
233
+ │ │ │ ... (5000 files total)
234
+ │ │ ├── 5000
235
+ │ │ │ ├── lidar5000_NZ_train.copc.laz
236
+ │ │ │ ├── lidar5001_NZ_train.copc.laz
237
+ │ │ │ └── lidar5002_NZ_train.copc.laz
238
+ │ │ │ ... (5000 files total)
239
+ │ │ └── 10000
240
+ │ │ ├── lidar10000_NZ_train.copc.laz
241
+ │ │ ├── lidar10001_NZ_train.copc.laz
242
+ │ │ └── lidar10002_NZ_train.copc.laz
243
+ │ │ ... (4999 files total)
244
+ │ │ ... (11 dirs total)
245
+ │ ├── val
246
+ │ │ ├── CH
247
+ │ │ │ └── 0
248
+ │ │ │ ├── lidar0_CH_val.copc.laz
249
+ │ │ │ ├── lidar100_CH_val.copc.laz
250
+ │ │ │ └── lidar101_CH_val.copc.laz
251
+ │ │ │ ... (529 files total)
252
+ │ │ ├── NY
253
+ │ │ │ └── 0
254
+ │ │ │ ├── lidar0_NY_val.copc.laz
255
+ │ │ │ ├── lidar100_NY_val.copc.laz
256
+ │ │ │ └── lidar101_NY_val.copc.laz
257
+ │ │ │ ... (529 files total)
258
+ │ │ └── NZ
259
+ │ │ └── 0
260
+ │ │ ├── lidar0_NZ_val.copc.laz
261
+ │ │ ├── lidar100_NZ_val.copc.laz
262
+ │ │ └── lidar101_NZ_val.copc.laz
263
+ │ │ ... (529 files total)
264
+ │ └── test
265
+ │ ├── CH
266
+ │ │ ├── 0
267
+ │ │ │ ├── lidar0_CH_test.copc.laz
268
+ │ │ │ ├── lidar1000_CH_test.copc.laz
269
+ │ │ │ └── lidar1001_CH_test.copc.laz
270
+ │ │ │ ... (5000 files total)
271
+ │ │ ├── 5000
272
+ │ │ │ ├── lidar5000_CH_test.copc.laz
273
+ │ │ │ ├── lidar5001_CH_test.copc.laz
274
+ │ │ │ └── lidar5002_CH_test.copc.laz
275
+ │ │ │ ... (5000 files total)
276
+ │ │ └── 10000
277
+ │ │ ├── lidar10000_CH_test.copc.laz
278
+ │ │ ├── lidar10001_CH_test.copc.laz
279
+ │ │ └── lidar10002_CH_test.copc.laz
280
+ │ │ ... (4400 files total)
281
+ │ ├── NY
282
+ │ │ ├── 0
283
+ │ │ │ ├── lidar0_NY_test.copc.laz
284
+ │ │ │ ├── lidar1000_NY_test.copc.laz
285
+ │ │ │ └── lidar1001_NY_test.copc.laz
286
+ │ │ │ ... (4964 files total)
287
+ │ │ ├── 5000
288
+ │ │ │ ├── lidar5000_NY_test.copc.laz
289
+ │ │ │ ├── lidar5001_NY_test.copc.laz
290
+ │ │ │ └── lidar5002_NY_test.copc.laz
291
+ │ │ │ ... (4953 files total)
292
+ │ │ └── 10000
293
+ │ │ ├── lidar10000_NY_test.copc.laz
294
+ │ │ ├── lidar10001_NY_test.copc.laz
295
+ │ │ └── lidar10002_NY_test.copc.laz
296
+ │ │ ... (4396 files total)
297
+ │ └── NZ
298
+ │ ├── 0
299
+ │ │ ├── lidar0_NZ_test.copc.laz
300
+ │ │ ├── lidar1000_NZ_test.copc.laz
301
+ │ │ └── lidar1001_NZ_test.copc.laz
302
+ │ │ ... (5000 files total)
303
+ │ ├── 5000
304
+ │ │ ├── lidar5000_NZ_test.copc.laz
305
+ │ │ ├── lidar5001_NZ_test.copc.laz
306
+ │ │ └── lidar5002_NZ_test.copc.laz
307
+ │ │ ... (5000 files total)
308
+ │ └── 10000
309
+ │ ├── lidar10000_NZ_test.copc.laz
310
+ │ ├── lidar10001_NZ_test.copc.laz
311
+ │ └── lidar10002_NZ_test.copc.laz
312
+ │ ... (4400 files total)
313
+ └── ffl
314
+ ├── train
315
+ │ ├── CH
316
+ │ │ ├── 0
317
+ │ │ │ ├── image0_CH_train.pt
318
+ │ │ │ ├── image1000_CH_train.pt
319
+ │ │ │ └── image1001_CH_train.pt
320
+ │ │ │ ... (5000 files total)
321
+ │ │ ├── 5000
322
+ │ │ │ ├── image5000_CH_train.pt
323
+ │ │ │ ├── image5001_CH_train.pt
324
+ │ │ │ └── image5002_CH_train.pt
325
+ │ │ │ ... (5000 files total)
326
+ │ │ └── 10000
327
+ │ │ ├── image10000_CH_train.pt
328
+ │ │ ├── image10001_CH_train.pt
329
+ │ │ └── image10002_CH_train.pt
330
+ │ │ ... (5000 files total)
331
+ │ │ ... (11 dirs total)
332
+ │ ├── NY
333
+ │ │ ├── 0
334
+ │ │ │ ├── image0_NY_train.pt
335
+ │ │ │ ├── image1000_NY_train.pt
336
+ │ │ │ └── image1001_NY_train.pt
337
+ │ │ │ ... (5000 files total)
338
+ │ │ ├── 5000
339
+ │ │ │ ├── image5000_NY_train.pt
340
+ │ │ │ ├── image5001_NY_train.pt
341
+ │ │ │ └── image5002_NY_train.pt
342
+ │ │ │ ... (5000 files total)
343
+ │ │ └── 10000
344
+ │ │ ├── image10000_NY_train.pt
345
+ │ │ ├── image10001_NY_train.pt
346
+ │ │ └── image10002_NY_train.pt
347
+ │ │ ... (5000 files total)
348
+ │ │ ... (11 dirs total)
349
+ │ ├── NZ
350
+ │ │ ├── 0
351
+ │ │ │ ├── image0_NZ_train.pt
352
+ │ │ │ ├── image1000_NZ_train.pt
353
+ │ │ │ └── image1001_NZ_train.pt
354
+ │ │ │ ... (5000 files total)
355
+ │ │ ├── 5000
356
+ │ │ │ ├── image5000_NZ_train.pt
357
+ │ │ │ ├── image5001_NZ_train.pt
358
+ │ │ │ └── image5002_NZ_train.pt
359
+ │ │ │ ... (5000 files total)
360
+ │ │ └── 10000
361
+ │ │ ├── image10000_NZ_train.pt
362
+ │ │ ├── image10001_NZ_train.pt
363
+ │ │ └── image10002_NZ_train.pt
364
+ │ │ ... (5000 files total)
365
+ │ │ ... (11 dirs total)
366
+ │ ├── processed-flag-all
367
+ │ ├── processed-flag-CH
368
+ │ └── processed-flag-NY
369
+ │ ... (8 files total)
370
+ ├── val
371
+ │ ├── CH
372
+ │ │ └── 0
373
+ │ │ ├── image0_CH_val.pt
374
+ │ │ ├── image100_CH_val.pt
375
+ │ │ └── image101_CH_val.pt
376
+ │ │ ... (529 files total)
377
+ │ ├── NY
378
+ │ │ └── 0
379
+ │ │ ├── image0_NY_val.pt
380
+ │ │ ├── image100_NY_val.pt
381
+ │ │ └── image101_NY_val.pt
382
+ │ │ ... (529 files total)
383
+ │ ├── NZ
384
+ │ │ └── 0
385
+ │ │ ├── image0_NZ_val.pt
386
+ │ │ ├── image100_NZ_val.pt
387
+ │ │ └── image101_NZ_val.pt
388
+ │ │ ... (529 files total)
389
+ │ ├── processed-flag-all
390
+ │ ├── processed-flag-CH
391
+ │ └── processed-flag-NY
392
+ │ ... (8 files total)
393
+ └── test
394
+ ├── CH
395
+ │ ├── 0
396
+ │ │ ├── image0_CH_test.pt
397
+ │ │ ├── image1000_CH_test.pt
398
+ │ │ └── image1001_CH_test.pt
399
+ │ │ ... (5000 files total)
400
+ │ ├── 5000
401
+ │ │ ├── image5000_CH_test.pt
402
+ │ │ ├── image5001_CH_test.pt
403
+ │ │ └── image5002_CH_test.pt
404
+ │ │ ... (5000 files total)
405
+ │ └── 10000
406
+ │ ├── image10000_CH_test.pt
407
+ │ ├── image10001_CH_test.pt
408
+ │ └── image10002_CH_test.pt
409
+ │ ... (4400 files total)
410
+ ├── NY
411
+ │ ├── 0
412
+ │ │ ├── image0_NY_test.pt
413
+ │ │ ├── image1000_NY_test.pt
414
+ │ │ └── image1001_NY_test.pt
415
+ │ │ ... (5000 files total)
416
+ │ ├── 5000
417
+ │ │ ├── image5000_NY_test.pt
418
+ │ │ ├── image5001_NY_test.pt
419
+ │ │ └── image5002_NY_test.pt
420
+ │ │ ... (5000 files total)
421
+ │ └── 10000
422
+ │ ├── image10000_NY_test.pt
423
+ │ ├── image10001_NY_test.pt
424
+ │ └── image10002_NY_test.pt
425
+ │ ... (4400 files total)
426
+ ├── NZ
427
+ │ ├── 0
428
+ │ │ ├── image0_NZ_test.pt
429
+ │ │ ├── image1000_NZ_test.pt
430
+ │ │ └── image1001_NZ_test.pt
431
+ │ │ ... (5000 files total)
432
+ │ ├── 5000
433
+ │ │ ├── image5000_NZ_test.pt
434
+ │ │ ├── image5001_NZ_test.pt
435
+ │ │ └── image5002_NZ_test.pt
436
+ │ │ ... (5000 files total)
437
+ │ └── 10000
438
+ │ ├── image10000_NZ_test.pt
439
+ │ ├── image10001_NZ_test.pt
440
+ │ └── image10002_NZ_test.pt
441
+ │ ... (4400 files total)
442
+ ├── processed-flag-all
443
+ ├── processed-flag-CH
444
+ └── processed-flag-NY
445
+ ... (8 files total)
446
+ ```
447
 
448
+ </details>
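After downloading, a quick sanity check is to open one of the annotation files from the tree above. The snippet below is a minimal sketch: the path follows the `$DATA_ROOT` convention from the clone command, and it only inspects the top-level JSON structure rather than assuming a particular annotation schema.

```python
# Sketch: inspect one annotation file from the folder structure shown above.
# Assumes the dataset was cloned into $DATA_ROOT as in the Download section.
import json
import os

data_root = os.environ.get("DATA_ROOT", "./PixelsPointsPolygons")
ann_file = os.path.join(data_root, "data", "224", "annotations", "annotations_all_train.json")

with open(ann_file) as f:
    annotations = json.load(f)

# Print the top-level keys and, for list- or dict-valued entries, how many records they hold.
if isinstance(annotations, dict):
    for key, value in annotations.items():
        print(key, len(value) if isinstance(value, (list, dict)) else value)
else:
    print(f"Loaded {len(annotations)} records")
```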
449
+
450
+ ## Pretrained model weights
451
+
452
+ ### Download
453
 
454
+ ```
455
+ git lfs install
456
+ git clone https://huggingface.co/rsi/PixelsPointsPolygons $MODEL_ROOT
457
+ ```
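If you only want to see which checkpoint files are available before cloning the full weights repository, the `huggingface_hub` client can list them. This is a sketch, not part of the official tooling; the repo id is the one linked above.

```python
# Sketch: list the files in the pretrained-weights repository without cloning it.
from huggingface_hub import list_repo_files

for path in list_repo_files("rsi/PixelsPointsPolygons"):
    print(path)
```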
458
 
459
  ## Code
460
 
 
464
  git clone https://github.com/raphaelsulzer/PixelsPointsPolygons
465
  ```
466
 
467
+ ### Installation
468
 
469
+ To create a conda environment named `p3` and install the repository as a Python package with all dependencies, run
470
  ```
471
  bash install.sh
472
  ```
 
495
  | Pix2Poly |\<pix2poly>| PointPillars (PP) + ViT | \<pp_vit> | | ✅ | 0.80 | 0.88 |
496
  | Pix2Poly |\<pix2poly>| PP+ViT \& ViT | \<fusion_vit> | ✅ |✅ | 0.78 | 0.85 | -->
497
 
498
+ ### Setup
499
+
500
+ The project uses Hydra for configuration, which lets you modify any parameter either from a `.yaml` file or directly from the command line.
501
+
502
+ To set up the project structure, we recommend specifying your `$DATA_ROOT` and `$MODEL_ROOT` in `config/host/default.yaml`.
503
 
504
+ To view all available configuration options, run
 
505
  ```
506
+ python scripts/train.py --help
507
  ```
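For orientation, a minimal `config/host/default.yaml` could look like the sketch below. The keys are copied from the `host` block of the `.hydra/config.yaml` shipped with the pretrained models further down in this commit; adapt the paths to your machine and treat the exact key set as an assumption.

```yaml
# config/host/default.yaml (sketch; keys mirror the host block of the shipped .hydra/config.yaml)
name: default
data_root: /path/to/PixelsPointsPolygons         # $DATA_ROOT: where the dataset repo was cloned
model_root: /path/to/PixelsPointsPolygons_output # $MODEL_ROOT: where checkpoints and outputs are written
multi_gpu: false
device: cuda
update_pbar_every: 1
```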
508
 
 
509
 
 
510
 
511
+ <!-- The most important parameters are described below:
512
+ <details>
513
+ <summary>CLI Parameters</summary>
514
 
515
+ ```text
516
+ ├── processed-flag-all
517
+ ├── processed-flag-CH
518
+ └── processed-flag-NY
519
+ ... (8 files total)
520
  ```
521
 
522
+ </details> -->
523
 
524
+ ### Predict a single tile
525
+
526
+ TODO
527
 
528
  ```
529
+ python scripts/predict_demo.py
530
+ ```
531
+
532
+ ### Reproduce paper results
533
 
534
+ To reproduce the results from the paper, you can run any of the following commands:
535
 
536
  ```
537
+ python scripts/modality_ablation.py
538
+ python scripts/lidar_density_ablation.py
539
+ python scripts/all_countries.py
540
  ```
 
541
 
542
+ ### Custom training, prediction and evaluation
543
 
544
+ We recommend first setting up a custom `$EXP_FILE` in `config/experiment`, following the structure of one of the existing experiment files, e.g. `ffl_fusion.yaml`. You can then run:
545
 
546
+ ```
547
+ # train your model (on multiple GPUs)
548
+ torchrun --nproc_per_node=$NUM_GPU scripts/train.py experiment=$EXP_FILE
549
+ # predict the test set with your model (on multiple GPUs)
550
+ torchrun --nproc_per_node=$NUM_GPU scripts/predict.py evaluation=test checkpoint=best_val_iou
551
+ # evaluate your prediction of the test set
552
+ python scripts/evaluate.py model=<model> evaluation=test checkpoint=best_val_iou
553
+ ```
554
 
555
+ You can also continue training from one of the provided pretrained models with:
556
 
557
+ ```
558
+ # train your model (on a single GPU)
559
+ python scripts/train.py experiment=p2p_fusion checkpoint=latest
560
+ ```
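Any other configuration value can be overridden on the command line in the same way. The example below is a sketch that combines override keys visible in the Hydra configs shipped with this repository (e.g. `run_type.batch_size`, `experiment.country`, `checkpoint`); the exact key names may differ in your checkout.

```
# Sketch: combining Hydra overrides on the command line (key names taken from the shipped configs).
torchrun --nproc_per_node=$NUM_GPU scripts/train.py \
    experiment=ffl_fusion \
    experiment.country=CH \
    run_type.batch_size=8 \
    checkpoint=latest
```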
561
 
562
  ## Citation
563
 
564
  If you find our work useful, please consider citing:
565
  ```bibtex
566
+ TODO
567
  ```
568
 
569
  ## Acknowledgements
pix2poly/224/v0_all_bs4x16/.hydra/config.yaml CHANGED
@@ -1,117 +1,32 @@
1
  host:
2
- name: jeanzay
3
- data_root: /lustre/fswork/projects/rech/cso/uku93eu/data
4
- update_pbar_every: 60
 
 
 
 
5
  run_type:
6
- name: release
7
  batch_size: 16
8
- train_subset: null
9
- val_subset: null
10
- test_subset: null
11
- logging: INFO
12
- num_workers: 16
13
- log_to_wandb: true
14
- polygonization:
15
- method:
16
- - acm
17
- common_params:
18
- init_data_level: 0.5
19
- simple_method:
20
- data_level: 0.5
21
- tolerance:
22
- - 1.0
23
- seg_threshold: 0.5
24
- min_area: 10
25
- asm_method:
26
- init_method: skeleton
27
- data_level: 0.5
28
- loss_params:
29
- coefs:
30
- step_thresholds:
31
- - 0
32
- - 100
33
- - 200
34
- - 300
35
- data:
36
- - 1.0
37
- - 0.1
38
- - 0.0
39
- - 0.0
40
- crossfield:
41
- - 0.0
42
- - 0.05
43
- - 0.0
44
- - 0.0
45
- length:
46
- - 0.1
47
- - 0.01
48
- - 0.0
49
- - 0.0
50
- curvature:
51
- - 0.0
52
- - 0.0
53
- - 1.0
54
- - 0.0
55
- corner:
56
- - 0.0
57
- - 0.0
58
- - 0.5
59
- - 0.0
60
- junction:
61
- - 0.0
62
- - 0.0
63
- - 0.5
64
- - 0.0
65
- curvature_dissimilarity_threshold: 2
66
- corner_angles:
67
- - 45
68
- - 90
69
- - 135
70
- corner_angle_threshold: 22.5
71
- junction_angles:
72
- - 0
73
- - 45
74
- - 90
75
- - 135
76
- junction_angle_weights:
77
- - 1
78
- - 0.01
79
- - 0.1
80
- - 0.01
81
- junction_angle_threshold: 22.5
82
- lr: 0.1
83
- gamma: 0.995
84
- device: cuda
85
- tolerance:
86
- - 1
87
- seg_threshold: 0.5
88
- min_area: 10
89
- acm_method:
90
- steps: 500
91
- data_level: 0.5
92
- data_coef: 0.1
93
- length_coef: 0.4
94
- crossfield_coef: 0.5
95
- poly_lr: 0.01
96
- warmup_iters: 100
97
- warmup_factor: 0.1
98
- device: cuda
99
- tolerance:
100
- - 1
101
- seg_threshold: 0.5
102
- min_area: 10
103
  dataset:
104
- name: lidarpoly
105
  size: ${..experiment.encoder.in_size}
106
- path: ${host.data_root}/${.name}/${.size}
107
  annotations:
108
- train: ${..path}/annotations_${...country}_train.json
109
- val: ${..path}/annotations_${...country}_val.json
110
- test: ${..path}/annotations_${...country}_test.json
111
  ffl_stats:
112
- train: ${..path}/ffl/train/stats-${...country}.pt
113
- val: ${..path}/ffl/val/stats-${...country}.pt
114
- test: ${..path}/ffl/test/stats-${...country}.pt
115
  train_subset: ${..run_type.train_subset}
116
  val_subset: ${..run_type.val_subset}
117
  test_subset: ${..run_type.test_subset}
@@ -135,7 +50,7 @@ experiment:
135
  out_feature_height: 28
136
  vit:
137
  type: vit_small_patch${..patch_size}_${..in_size}.dino
138
- checkpoint_file: ${....host.data_root}/checkpoints/backbones/dino_deitsmall8_pretrain.pth
139
  pretrained: true
140
  patch_size: 8
141
  patch_feature_size: 28
@@ -185,26 +100,21 @@ experiment:
185
  weight_decay: 0.0001
186
  name: v0_all_bs4x16
187
  group_name: v2_${.model.name}
188
- output_dir: ${.host.data_root}/${.experiment.model.name}_outputs/${.dataset.name}/${.experiment.encoder.in_size}/${.experiment.name}
189
- checkpoint: null
190
- checkpoint_file: null
191
- save_best: true
192
- save_latest: true
193
- save_every: 10
194
- val_every: 1
195
- best_val_loss: 10000000.0
196
- best_val_iou: 0.0
197
- multi_gpu: true
198
- device: cuda
199
- log_to_wandb: true
200
- num_workers: ${.run_type.num_workers}
201
- update_pbar_every: ${.host.update_pbar_every}
202
- country: all
203
- use_lidar: ${.experiment.encoder.use_lidar}
204
- use_images: ${.experiment.encoder.use_images}
205
- eval:
206
  split: val
207
- pred_file: ${..output_dir}/predictions_${..country}_${.split}/${..checkpoint}.json
208
  modes:
209
  - iou
210
  eval_file: results/metrics
 
 
 
 
 
1
  host:
2
+ name: gin
3
+ data_root: /data/rsulzer/${..dataset.name}
4
+ model_root: /data/rsulzer/${..dataset.name}_output
5
+ multi_gpu: false
6
+ device: cuda
7
+ update_pbar_every: 1
8
+ ldof_exe: /user/rsulzer/home/cpp/line-DOF-metric/build/calculate_DoF
9
  run_type:
10
+ name: debug
11
  batch_size: 16
12
+ train_subset: 256
13
+ val_subset: 32
14
+ test_subset: 32
15
+ logging: DEBUG
16
+ num_workers: 0
17
+ log_to_wandb: false
18
  dataset:
19
+ name: PixelsPointsPolygons
20
  size: ${..experiment.encoder.in_size}
21
+ path: ${host.data_root}/data/${.size}
22
  annotations:
23
+ train: ${..path}/annotations/annotations_${...experiment.country}_train.json
24
+ val: ${..path}/annotations/annotations_${...experiment.country}_val.json
25
+ test: ${..path}/annotations/annotations_${...experiment.country}_test.json
26
  ffl_stats:
27
+ train: ${..path}/ffl/train/stats-${...experiment.country}.pt
28
+ val: ${..path}/ffl/val/stats-${...experiment.country}.pt
29
+ test: ${..path}/ffl/test/stats-${...experiment.country}.pt
30
  train_subset: ${..run_type.train_subset}
31
  val_subset: ${..run_type.val_subset}
32
  test_subset: ${..run_type.test_subset}
 
50
  out_feature_height: 28
51
  vit:
52
  type: vit_small_patch${..patch_size}_${..in_size}.dino
53
+ checkpoint_file: ${....host.model_root}/backbones/dino_deitsmall8_pretrain.pth
54
  pretrained: true
55
  patch_size: 8
56
  patch_feature_size: 28
 
100
  weight_decay: 0.0001
101
  name: v0_all_bs4x16
102
  group_name: v2_${.model.name}
103
+ country: all
104
+ training:
105
+ save_best: true
106
+ save_latest: true
107
+ save_every: 10
108
+ val_every: 1
109
+ best_val_loss: 10000000.0
110
+ best_val_iou: 0.0
111
+ evaluation:
112
  split: val
113
+ pred_file: ${..output_dir}/predictions_${..experiment.country}_${.split}/${..checkpoint}.json
114
  modes:
115
  - iou
116
  eval_file: results/metrics
117
+ experiment.name: debug
118
+ output_dir: ${.host.model_root}/${.experiment.model.name}/${.experiment.encoder.in_size}/${.experiment.name}
119
+ checkpoint: null
120
+ num_workers: ${.run_type.num_workers}
pix2poly/224/v0_all_bs4x16/.hydra/hydra.yaml CHANGED
@@ -112,18 +112,13 @@ hydra:
112
  hydra:
113
  - hydra.mode=RUN
114
  task:
115
- - log_to_wandb=true
116
- - host=jz
117
- - run_type=release
118
- - multi_gpu=true
119
- - checkpoint=null
120
- - experiment=p2p_fusion
121
- - experiment.name=v0_all_bs4x16
122
- - country=all
123
  job:
124
  name: train
125
  chdir: null
126
- override_dirname: checkpoint=null,country=all,experiment.name=v0_all_bs4x16,experiment=p2p_fusion,host=jz,log_to_wandb=true,multi_gpu=true,run_type=release
127
  id: ???
128
  num: ???
129
  config_name: config
@@ -137,26 +132,27 @@ hydra:
137
  runtime:
138
  version: 1.3.2
139
  version_base: '1.3'
140
- cwd: /lustre/fswork/projects/rech/cso/uku93eu/python/PixelsPointsPolygons
141
  config_sources:
142
  - path: hydra.conf
143
  schema: pkg
144
  provider: hydra
145
- - path: /lustre/fswork/projects/rech/cso/uku93eu/python/PixelsPointsPolygons/config
146
  schema: file
147
  provider: main
148
  - path: ''
149
  schema: structured
150
  provider: schema
151
- output_dir: /lustre/fswork/projects/rech/cso/uku93eu/data/pix2poly_outputs/lidarpoly/224/v0_all_bs4x16
152
  choices:
 
 
153
  experiment: p2p_fusion
154
  [email protected]: pix2poly
155
  [email protected]: early_fusion_vit
156
- dataset: lidarpoly
157
- polygonization: asm_acm
158
- run_type: release
159
- host: jz
160
  hydra/env: default
161
  hydra/callbacks: null
162
  hydra/job_logging: default
 
112
  hydra:
113
  - hydra.mode=RUN
114
  task:
115
+ - run_type=debug
116
+ - host=gin
117
+ - run_type.log_to_wandb=false
118
  job:
119
  name: train
120
  chdir: null
121
+ override_dirname: host=gin,run_type.log_to_wandb=false,run_type=debug
122
  id: ???
123
  num: ???
124
  config_name: config
 
132
  runtime:
133
  version: 1.3.2
134
  version_base: '1.3'
135
+ cwd: /run/netsop/u/home-sam/home/rsulzer/remote_python/pixelspointspolygons
136
  config_sources:
137
  - path: hydra.conf
138
  schema: pkg
139
  provider: hydra
140
+ - path: /run/netsop/u/home-sam/home/rsulzer/remote_python/pixelspointspolygons/config
141
  schema: file
142
  provider: main
143
  - path: ''
144
  schema: structured
145
  provider: schema
146
+ output_dir: /data/rsulzer/PixelsPointsPolygons_output/pix2poly/224/v0_all_bs4x16
147
  choices:
148
+ evaluation: val
149
+ training: default
150
  experiment: p2p_fusion
151
  [email protected]: pix2poly
152
  [email protected]: early_fusion_vit
153
+ dataset: p3
154
+ run_type: debug
155
+ host: gin
 
156
  hydra/env: default
157
  hydra/callbacks: null
158
  hydra/job_logging: default
pix2poly/224/v0_all_bs4x16/.hydra/overrides.yaml CHANGED
@@ -1,8 +1,3 @@
1
- - log_to_wandb=true
2
- - host=jz
3
- - run_type=release
4
- - multi_gpu=true
5
- - checkpoint=null
6
- - experiment=p2p_fusion
7
- - experiment.name=v0_all_bs4x16
8
- - country=all
 
1
+ - run_type=debug
2
+ - host=gin
3
+ - run_type.log_to_wandb=false