---
license: cc-by-nc-sa-4.0
---
# 4DGT Model Card

## Model Details
4DGT (4D Gaussian Transformer) is a neural network model that learns dynamic 3D Gaussian representations from monocular videos. It uses a transformer-based architecture to predict 4D Gaussians for a dynamic scene observed from an egocentric video.
- Paper: 4DGT: Learning a 4D Gaussian Transformer Using Real-World Monocular Videos
- Project Page: https://4dgt.github.io/
- GitHub: GitHub repository

Please refer to the project page and GitHub repository for more details about the model.
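As a rough illustration of what the model consumes and produces, here is a minimal shape sketch. The variable names, patch size, and the exact 4D Gaussian parameterization below are assumptions for exposition only; see the paper and the GitHub repository for the actual interface.

```python
# Purely illustrative: hypothetical shapes for 4DGT's inputs and outputs.
# Names and the Gaussian parameterization are assumptions, not the released API.
import torch

T, H, W = 8, 224, 224             # a short monocular, egocentric clip
frames = torch.rand(T, 3, H, W)   # RGB frames fed to the DINOv2-based encoder

# The transformer decodes one 4D Gaussian per image patch token (assumed here:
# 14x14 DINOv2 patches), each with spatial and temporal parameters.
N = T * (H // 14) * (W // 14)
gaussians = {
    "position": torch.zeros(N, 3),  # 3D center
    "time":     torch.zeros(N, 1),  # temporal center of the Gaussian
    "rotation": torch.zeros(N, 4),  # orientation as a quaternion
    "scale":    torch.zeros(N, 3),  # spatial extent
    "opacity":  torch.zeros(N, 1),
    "color":    torch.zeros(N, 3),
}
print(f"{N} Gaussians for a {T}-frame clip")
```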
## Citation
```bibtex
@inproceedings{xu20254dgt,
  title   = {4DGT: Learning a 4D Gaussian Transformer Using Real-World Monocular Videos},
  author  = {Xu, Zhen and Li, Zhengqin and Dong, Zhao and Zhou, Xiaowei and Newcombe, Richard and Lv, Zhaoyang},
  journal = {arXiv preprint arXiv:2506.08015},
  year    = {2025}
}
```
## Model Files

### Checkpoint: `4dgt_full.pth`
- Size: ~14.5 GB
- Format: PyTorch state dict (see the inspection sketch after the file listings)
- Contents:
- The full model trained as described in the paper.
- Encoder weights (DINOv2 backbone)
- Level-of-Detail Transformer
- 4D Gaussian Decoder
### Checkpoint: `4dgt_1st_stage.pth`
- Size: ~4.85 GB
- Format: PyTorch state dict
- Contents:
- The first-stage model, trained only on the Ego-Exo4D dataset as described in the paper.
- Encoder weights (DINOv2 backbone)
- Vanilla transformer (no level-of-detail).
- 4D Gaussian Decoder
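Both checkpoints are plain PyTorch state dicts, so a downloaded file can be sanity-checked before wiring it into the training or inference code. The snippet below is a hedged sketch only: the file path and the possible wrapper key are assumptions, and the official loading code lives in the GitHub repository.

```python
# Minimal sketch for inspecting a downloaded checkpoint on CPU (assumes enough
# RAM for the full file). The path and the optional "model" wrapper key are
# assumptions; follow the GitHub repository for the supported loading path.
import torch

ckpt = torch.load("4dgt_full.pth", map_location="cpu")

# Some checkpoints wrap the weights, e.g. {"model": state_dict, ...}; unwrap if so.
state_dict = ckpt["model"] if isinstance(ckpt, dict) and "model" in ckpt else ckpt

num_params = sum(v.numel() for v in state_dict.values() if torch.is_tensor(v))
print(f"{len(state_dict)} entries, ~{num_params / 1e9:.2f}B parameters")

# Peek at a few parameter names and shapes to sanity-check the download.
for name, value in list(state_dict.items())[:10]:
    if torch.is_tensor(value):
        print(name, tuple(value.shape))
```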
## Quick Start

Please refer to the 4DGT GitHub repository for the full setup instructions.
## Contact
For questions and issues, please open an issue on the GitHub repository.