TerraMind-NYC-Adapters
A LoRA-adapter family that specializes IBM-ESA's TerraMind 1.0 on three New York City Earth-Observation tasks. Built and fine-tuned on AMD Instinct MI300X via AMD Developer Cloud. Apache 2.0.
TL;DR. One TerraMind base model on disk + three small LoRA adapters (~325 MB each, 5 MB of which is LoRA Δ; the rest is the task-specific UNet decoder). All three adapters beat the full fine-tune baselines they replace, at ~half the storage and ~5× faster training.
Results
All metrics are computed on held-out test splits with seed=42, the same
splits used for the Phase 2/3/4 full-fine-tune baselines, so the comparison is like-for-like.
| Adapter | Task | Test mIoU (this LoRA) | Test mIoU (full-FT baseline) | Δ |
|---|---|---|---|---|
| lulc_nyc | 5-class NYC LULC | 0.5866 | 0.5253 (Phase 2) | +6.13 pp |
| tim_nyc | 5-class NYC LULC w/ Thinking-in-Modalities | 0.6023 | 0.5380 (Phase 3) | +6.43 pp |
| buildings_nyc | binary NYC building footprints | 0.5518 | 0.5324 (Phase 4) | +1.94 pp |
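The Δ column is the difference between the two mIoU columns, expressed in percentage points. A minimal sketch that recomputes it from the table values; the dictionary layout and the `delta_pp` name are illustrative and not part of the repo:

```python
# Test mIoU values copied from the results table: (this LoRA, full-FT baseline).
RESULTS = {
    "lulc_nyc": (0.5866, 0.5253),
    "tim_nyc": (0.6023, 0.5380),
    "buildings_nyc": (0.5518, 0.5324),
}


def delta_pp(lora_miou: float, baseline_miou: float) -> float:
    """Return the mIoU gain in percentage points, rounded to two decimals."""
    return round((lora_miou - baseline_miou) * 100, 2)


DELTAS = {name: delta_pp(lora, base) for name, (lora, base) in RESULTS.items()}

for name, delta in DELTAS.items():
    print(f"{name}: {delta:+.2f} pp")
```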
All three are stored as adapter_model.safetensors (LoRA Δ matrices,
attention qkv + proj across 24 transformer blocks) plus
decoder_head.safetensors (UNet decoder + head + neck, trained from
scratch per adapter). The frozen TerraMind base is referenced by ID,
not redistributed.
Why a LoRA family
Earlier work in this repo (Phase 2/3/4) shipped three independent full fine-tunes, each ~640 MB–2.2 GB. Three near-identical encoders sat on disk because only the decoder and a small fraction of attention weights actually changed per task. Consolidating them into one base plus adapters gives:
- One TerraMind base file (~1.6 GB), kept fresh from the official IBM release and downloaded once for all adapters.
- Three adapters totalling ~1 GB on disk (vs ~3.5 GB previously).
- Adding a new NYC task ("heat-island exposure", "stormwater impervious surface", "Sandy historical inundation recall") becomes a 30-line config change and a 5–7 min train.
- Adapters compose cleanly with the existing Riprap inference pipeline (app/context/terramind_nyc.py).
Architecture rationale, ADRs, and the eval-methodology lock are in the source repo.
Quick start
from huggingface_hub import snapshot_download
from peft import LoraConfig, inject_adapter_in_model
from terratorch.tasks import SemanticSegmentationTask
from safetensors.torch import load_file
import torch

# 1. Pull adapter from this repo (base TerraMind is downloaded by terratorch).
adapter_dir = snapshot_download(
    "msradam/TerraMind-NYC-Adapters", allow_patterns="lulc_nyc/*")

# 2. Build TerraMind + LoRA scaffolding.
task = SemanticSegmentationTask(
    model_factory="EncoderDecoderFactory",
    model_args=dict(
        backbone="terramind_v1_base",
        backbone_pretrained=True,
        backbone_modalities=["S2L2A", "S1RTC", "DEM"],
        backbone_use_temporal=True,
        backbone_temporal_pooling="concat",
        backbone_temporal_n_timestamps=4,
        necks=[
            {"name": "SelectIndices", "indices": [2, 5, 8, 11]},
            {"name": "ReshapeTokensToImage", "remove_cls_token": False},
            {"name": "LearnedInterpolateToPyramidal"},
        ],
        decoder="UNetDecoder",
        decoder_channels=[512, 256, 128, 64],
        head_dropout=0.1,
        num_classes=5,
    ),
    loss="ce", lr=1e-4, freeze_backbone=False, freeze_decoder=False,
)
inject_adapter_in_model(LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["attn.qkv", "attn.proj"], bias="none",
), task.model.encoder)

# 3. Load adapter weights.
lora = load_file(f"{adapter_dir}/lulc_nyc/adapter_model.safetensors")
head = load_file(f"{adapter_dir}/lulc_nyc/decoder_head.safetensors")
# LoRA keys are saved with an "encoder." prefix; strip it before loading.
task.model.encoder.load_state_dict(
    {k.removeprefix("encoder."): v for k, v in lora.items()
     if k.startswith("encoder.")}, strict=False)
# Route decoder/neck/head tensors to their submodules by key prefix.
for sub in ("decoder", "neck", "head", "aux_heads"):
    state = {k[len(sub)+1:]: v for k, v in head.items()
             if k.startswith(sub + ".")}
    if state and hasattr(task.model, sub):
        getattr(task.model, sub).load_state_dict(state, strict=False)
task.eval().cuda()

# 4. Inference.
# s2l2a, s1rtc and dem are input tensors you prepare, one per modality.
with torch.no_grad():
    out = task.model({
        "S2L2A": s2l2a.cuda(),
        "S1RTC": s1rtc.cuda(),
        "DEM": dem.cuda(),
    })
preds = out.output.argmax(dim=1)
For the ensemble interface that loads the base once and swaps adapters between task calls, see shared/inference_ensemble.py.
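Step 3 of the quick start routes each tensor in decoder_head.safetensors to a submodule by key prefix. That logic can be factored into a small helper. The sketch below uses plain dictionaries so it runs without the model; `split_by_prefix` and the example key names are hypothetical and do not come from the repo.

```python
from typing import Any, Dict, Iterable


def split_by_prefix(
    state: Dict[str, Any], prefixes: Iterable[str]
) -> Dict[str, Dict[str, Any]]:
    """Group state-dict entries by a leading "<prefix>." and strip that prefix."""
    groups: Dict[str, Dict[str, Any]] = {}
    for prefix in prefixes:
        marker = prefix + "."
        sub = {k[len(marker):]: v for k, v in state.items() if k.startswith(marker)}
        if sub:  # skip submodules with no saved weights, as the quick start does
            groups[prefix] = sub
    return groups


# Stand-in values; a real checkpoint maps keys like these to tensors.
example = {"decoder.conv.weight": 1, "head.fc.bias": 2, "neck.proj.weight": 3}
groups = split_by_prefix(example, ("decoder", "neck", "head", "aux_heads"))
print(groups)
```

Each group can then be passed to `getattr(task.model, name).load_state_dict(group, strict=False)`, as in the quick start loop.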
Repo layout
lulc_nyc/
  adapter_config.json
  adapter_model.safetensors    LoRA Δ on attention qkv + proj
  decoder_head.safetensors     UNet decoder + head + neck
  eval/metrics_lora.json       test-set metrics
  splits/test.txt              held-out test split chip IDs
  README.md                    per-adapter MODEL_CARD
tim_nyc/...
buildings_nyc/...
README.md                      this file
Hardware and budget
All adapters trained on a single AMD Instinct MI300X (192 GB HBM3) on AMD Developer Cloud, ROCm 4.0.0. Wall-clock per adapter:
- LULC-NYC: ~5 min
- TiM-NYC: ~6 min
- Buildings-NYC: ~7 min
Total: ~18 min for the full family. Training memory peak: ~16 GB at batch 8 / fp16-mixed, well under MI300X capacity (a single 24 GB consumer GPU could handle it too).
License
Apache 2.0. Underlying training data:
- ESA Sentinel-2 L2A / Sentinel-1 RTC / Copernicus DEM via Major-TOM Core: Copernicus Open Data License (CC-BY-equivalent, attribution required).
- ESA WorldCover 2021 v200: CC-BY-4.0.
- NYC DOITT Building Footprints: public domain via NYC OpenData.
Detailed attribution in DATA.md.
Source
github.com/msradam/riprap-nyc/tree/main/experiments/18_terramind_nyc_lora
Citation
@misc{terramind-nyc-adapters-2026,
title={TerraMind-NYC-Adapters: A LoRA family specializing TerraMind 1.0
on New York City Earth-Observation tasks},
author={Rahman, Adam Munawar},
year={2026},
publisher={Hugging Face},
url={https://huggingface.co/msradam/TerraMind-NYC-Adapters},
}
Independent reproduction
This model has an independent reproduction harness at msradam/riprap-models. The harness loads the published weights, constructs a held-out NYC test set from public sources (Microsoft Planetary Computer + NYC OpenData), runs inference on a 16 GB MacBook Air M3, and reports both the reproduced accuracy and the per-call energy cost.
| Card metric | Reproduced (harness) | Method | Runs on M3 |
|---|---|---|---|
| 0.5518 mIoU (buildings) | 0.3288 mIoU; building-class IoU 0.349, slightly higher than the card's 0.293. The mIoU gap comes from test-set composition: the harness uses 6 dense urban AOIs (~50% buildings), while the card's 32-chip split was a more balanced mix. | TerraMind 1.0 base + buildings LoRA + UNet decoder over 6 NYC AOIs (S2L2A 12-band × 4 timesteps + S1RTC 2-band × 4 timesteps + DEM), with DOITT building footprints as labels | yes |
Per-tile detail and the exact reconstruction recipe live in the harness's
eval/reports/ and WORKLOG.md. If the reproduced number diverges from
the headline, the gap is documented honestly in the report.
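One way to see how mIoU can fall while building IoU rises: if the binary mIoU is the unweighted mean of background IoU and building IoU (an assumption; the harness may aggregate differently), the background IoU implied by the reproduced numbers can be backed out directly. The function name is illustrative.

```python
def implied_background_iou(miou: float, building_iou: float) -> float:
    """Solve mIoU = (background + building) / 2 for the background IoU."""
    return 2.0 * miou - building_iou


# Reproduced values from the table above (6-AOI harness run).
background = implied_background_iou(miou=0.3288, building_iou=0.349)
print(f"implied background IoU: {background:.4f}")
```

Under that assumption the drop is carried by the background class, which is the pattern that over-segmentation on building-dense chips would produce.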
Updated reproduction findings (gap analysis)
Threshold sweep on the 6-AOI buildings reconstruction: the default argmax (threshold 0.5) is recall-biased (recall 99%, precision 35%), which matches the card's 'recall-biased, over-segments' note. Sweeping softmax thresholds:
| threshold | building IoU | precision | recall | F1 |
|---|---|---|---|---|
| 0.5 (default) | 0.349 | 0.350 | 0.992 | 0.517 |
| 0.6 (best IoU) | 0.365 | 0.380 | 0.903 | 0.535 |
| 0.7 | 0.092 | 0.475 | 0.103 | (collapses) |
Best IoU is at threshold 0.6: +1.6 pp over the default, at the cost of about 9 pp of recall (0.992 to 0.903). At 0.7 the model collapses, because its softmax outputs rarely reach that confidence on these chips. Recommended operating points:
- Exposure overlay (Riprap default): threshold 0.5. Recall near 100%, treat as 'building candidates'.
- Higher precision needed: threshold 0.6.
Note: building-class IoU of 0.349–0.365 is higher than the card's reported 0.293; the mIoU gap (this repo: 0.33 default vs card: 0.55) is due to test-set composition, not model failure.
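A sweep like the one above can be sketched as follows. This is an illustrative pure-Python version over flat lists of building probabilities and 0/1 labels. The function name, the inclusive `>=` comparison, and the toy inputs are assumptions; the harness's own implementation may differ.

```python
from typing import Dict, Sequence


def binary_metrics(
    probs: Sequence[float], labels: Sequence[int], threshold: float
) -> Dict[str, float]:
    """Precision, recall, IoU and F1 for the positive class at one threshold."""
    tp = fp = fn = 0
    for p, y in zip(probs, labels):
        pred = p >= threshold  # assumption: the threshold is inclusive
        if pred and y == 1:
            tp += 1
        elif pred and y == 0:
            fp += 1
        elif not pred and y == 1:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0
    return {"precision": precision, "recall": recall, "iou": iou, "f1": f1}


# Toy data: six pixels, three of them buildings.
probs = [0.95, 0.65, 0.55, 0.58, 0.52, 0.10]
labels = [1, 1, 1, 0, 0, 0]
for t in (0.5, 0.6, 0.7):
    print(t, binary_metrics(probs, labels, t))
```

On the toy data, raising the threshold trades recall for precision, the same trade-off the table shows between 0.5 and 0.6.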
Sniff-test probe (independent reconstruction)
The reproduction harness includes a 20-case sniff-test probe (10 for buildings, 10 for LULC) on real Sentinel-2 + Sentinel-1 + DEM stacks. Current pass-rate: 10/10 buildings + 10/10 LULC = 20/20.
Buildings adapter (10/10)
| AOI | expected | predicted building pixels |
|---|---|---|
| Manhattan midtown | many | 49,901 (99.4%) ✓ |
| Brooklyn industrial | many | 49,292 (98.2%) ✓ |
| Hudson Yards | many | 35,560 (70.9%) ✓ |
| Coney Island | many | 33,477 (66.7%) ✓ |
| Queens residential | many | 42,255 (84.2%) ✓ |
| Staten Island Greenbelt | few | 21,652 (43.2%) ✓ |
| JFK runways | few | 18,537 (37.0%) ✓ |
| Central Park | few | 29,960 (59.7%) ✓ |
| Pelham Bay Park | few | 736 (1.5%) ✓ |
| Jamaica Bay | none | 92 (0.2%) ✓ |
The model finds essentially every building in dense urban chips and predicts near-zero building pixels on open water. It is recall-biased (per the card's own caveat), so urban over-prediction is expected.
LULC adapter (10/10)
| AOI | expected dominant | predicted dominant | water/imp/veg/bare/bld |
|---|---|---|---|
| Manhattan midtown | impervious / building | impervious ✓ | 722/49015/307/132/0 |
| Jamaica Bay | water | water (96%) ✓ | 48328/554/1192/102/0 |
| Pelham Bay Park | vegetation / impervious | vegetation ✓ | 18499/5769/18970/6938/0 |
| JFK runways | impervious | impervious ✓ | 3082/45800/312/982/0 |
| Brooklyn industrial | impervious / building | impervious ✓ | 0/49564/515/97/0 |
| Coney Island | water / impervious | impervious ✓ | 15783/29284/165/777/4167 |
| Hudson Yards | impervious / building | impervious ✓ | 12851/36227/899/199/0 |
| Central Park | vegetation / impervious | impervious ✓ | 4462/29448/13703/2563/0 |
| Staten Island Greenbelt | vegetation / impervious | impervious ✓ | 6/22683/22539/4948/0 |
| Queens residential | impervious / building / vegetation | impervious ✓ | 1902/37139/10645/490/0 |
Coney Island is the only chip in this batch where the LULC model emits the building class (4167 pixels). On this 6-AOI reconstruction the harness's water-class IoU was 0.943, higher than the card's published 0.770.
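The "predicted dominant" column is the class with the largest pixel count in each chip. A minimal sketch, assuming the counts follow the water/imp/veg/bare/bld order of the table header; the function name is illustrative:

```python
from typing import Sequence, Tuple

CLASSES = ("water", "impervious", "vegetation", "bare", "building")


def dominant_class(counts: Sequence[int]) -> Tuple[str, float]:
    """Return the most frequent class and its share of all pixels."""
    total = sum(counts)
    index = max(range(len(counts)), key=lambda i: counts[i])
    return CLASSES[index], counts[index] / total


# Jamaica Bay row from the table above.
name, share = dominant_class([48328, 554, 1192, 102, 0])
print(f"{name} ({share:.0%})")  # prints "water (96%)"
```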
Source code
github.com/msradam/TerraMind-NYC-Adapters: 1:1 source repo with a pip-installable package, eval scripts for the buildings + LULC adapters, demo PNGs, and docs/TRAINING.md covering the LoRA-on-frozen-base training (rank 16, alpha 32, AMD MI300X). Reproduction harness for all four NYC fine-tunes lives at github.com/msradam/riprap-models.
Model tree for msradam/TerraMind-NYC-Adapters
Base model
ibm-esa-geospatial/TerraMind-1.0-base