Improve model card: Update pipeline tag, add library name, and detailed content

#2
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +135 -3
README.md CHANGED
@@ -1,8 +1,140 @@
  ---
  license: cc-by-nc-4.0
- pipeline_tag: image-to-image
  ---

- Models required for [TC-Light: Temporally Consistent Relighting for Dynamic Long Videos](https://huggingface.co/papers/2506.18904).

- Project page: https://dekuliutesla.github.io/tclight/
  ---
  license: cc-by-nc-4.0
+ pipeline_tag: video-to-video
+ library_name: diffusers
  ---

+ <p align="center">
+ <h1 align="center"><strong>TC-Light: Temporally Coherent Generative Rendering for Realistic World Transfer</strong></h1>
+ <p align="center">
+ <em>Institute of Automation, Chinese Academy of Sciences; University of Chinese Academy of Sciences</em>
+ </p>
+ </p>
+
+ <div id="top" align="center">
+
+ [![](https://img.shields.io/badge/%F0%9F%9A%80%20-Project%20Page-blue)](https://dekuliutesla.github.io/tclight/)
+ [![arXiv](https://img.shields.io/badge/arXiv-2506.18904-b31b1b.svg)](https://huggingface.co/papers/2506.18904)
+ ![GitHub Repo stars](https://img.shields.io/github/stars/Linketic/TC-Light)
+
+ </div>
+
+ <div align="center">
+ <img src="https://github.com/user-attachments/assets/9fc9c6ce-a83c-4ca5-9273-7cb672c99452" alt="TC-Light Demo" style="max-width: 100%;"/>
+ </div>
+
+ This repository contains the official implementation of **TC-Light**, a one-shot model that manipulates the illumination distribution of a video to achieve **realistic world transfer**, presented in the paper [TC-Light: Temporally Coherent Generative Rendering for Realistic World Transfer](https://huggingface.co/papers/2506.18904).
+
+ The code is available at: [https://github.com/Linketic/TC-Light](https://github.com/Linketic/TC-Light)
+
+ It is especially well suited to **highly dynamic videos**, such as motion-rich actions and frequent switches between foreground and background objects. It is distinguished by:
+
+ - 🔥 Outstanding temporal consistency on highly dynamic scenarios.
+ - 🔥 Superior computational efficiency that enables long video processing (it can process 300 frames at a resolution of 1280x720 on a 40 GB A100).
+
+ These features make it particularly valuable for sim2real and real2real augmentation for embodied agents, or for preparing video pairs to train stronger video relighting models. Star ⭐ us if you like it!
+
+ ## Abstract
+
+ Illumination and texture editing are critical dimensions for world-to-world transfer, which is valuable for applications including sim2real and real2real visual data scaling up for embodied AI. Existing techniques generatively re-render the input video to realize the transfer, such as video relighting models and conditioned world generation models. Nevertheless, these models are predominantly limited to the domain of training data (e.g., portrait) or fall into the bottleneck of temporal consistency and computation efficiency, especially when the input video involves complex dynamics and long durations. In this paper, we propose TC-Light, a novel generative renderer to overcome these problems. Starting from the video preliminarily relighted by an inflated video relighting model, it optimizes appearance embedding in the first stage to align global illumination. Then it optimizes the proposed canonical video representation, i.e., Unique Video Tensor (UVT), to align fine-grained texture and lighting in the second stage. To comprehensively evaluate performance, we also establish a long and highly dynamic video benchmark. Extensive experiments show that our method enables physically plausible re-rendering results with superior temporal coherence and low computation cost. The code and video demos are available at this https URL.
+
+ ## 💡 Method
+
+ <div align="center">
+ <img src='https://github.com/Linketic/TC-Light/raw/main/assets/pipeline.png' alt="TC-Light Pipeline"/>
+ </div>
+
+ <b>TC-Light</b> overview. Given the source video and text prompt p, the model tokenizes the input latents in the xy plane and the yt plane separately. The predicted noises are combined for denoising. The output then undergoes two-stage optimization: the first stage aligns exposure by optimizing an appearance embedding; the second stage aligns detailed texture and illumination by optimizing the <b>Unique Video Tensor</b>, a compressed representation of the video. Please refer to the paper for more details.
+
+ ## 💾 Preparation
+
+ Install the required environment as follows:
+ ```bash
+ git clone https://github.com/Linketic/TC-Light.git
+ cd TC-Light
+ conda create -n tclight python=3.10
+ conda activate tclight
+ pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
+ pip install -r requirements.txt
+ ```
+ Then download the required model weights to `./models` from one of the following links:
+
+ - **Hugging Face**: https://huggingface.co/TeslaYang123/TC-Light
+ - **Baidu Netdisk**: https://pan.baidu.com/s/1L-mk6Ilzd2o7KLAc7-gIHQ?pwd=rj99
+
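If you prefer the command line, the Hugging Face weights can also be fetched with `huggingface-cli` (installable via `pip install -U "huggingface_hub[cli]"`). This is our suggestion, not part of the official instructions; the `download_weights`/`weights_present` helper names are illustrative:

```bash
#!/usr/bin/env sh
# Optional CLI route for fetching the TC-Light weights.
# Helper names are illustrative, not from the official repo.

download_weights() {  # usage: download_weights [DEST]  (defaults to ./models)
  dest="${1:-./models}"
  mkdir -p "$dest"
  huggingface-cli download TeslaYang123/TC-Light --local-dir "$dest"
}

weights_present() {  # succeeds iff DEST exists and is non-empty
  dest="${1:-./models}"
  [ -d "$dest" ] && [ -n "$(ls -A "$dest" 2>/dev/null)" ]
}
```

For example, `download_weights ./models && weights_present ./models && echo ready` downloads and then sanity-checks the target directory.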
+ ## ⚡ Quick Start
+
+ As a quick start, you can use:
+ ```bash
+ # supports .mp4, .gif, .avi, and folders containing sequential images
+ # --multi_axis enables decayed multi-axis denoising, which enhances consistency but slows down the diffusion process
+ # -n (negative prompt) and --multi_axis are optional
+ python run.py -i /path/to/your/video -p "your_prompt" \
+     -n "your_negative_prompt" \
+     --multi_axis
+ ```
+ By default, it relights the first 30 frames at a resolution of 960x720. The default negative prompt is adopted from [Cosmos-Transfer1](https://github.com/nvidia-cosmos/cosmos-transfer1), which makes the edited illumination as realistic as possible. On the first run for a specific video, it will generate and save optical flow under the path to your video.
+
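If you have a whole folder of clips, the quick-start command can be looped over every supported file. A minimal sketch, assuming `run.py` and its `-i`/`-p` flags as above; the `collect_videos`/`batch_relight` helpers and the `DRY_RUN` guard are ours:

```bash
#!/usr/bin/env sh
# Batch-relight every supported clip in a directory with one prompt.
# Helper functions are illustrative; run.py and its flags come from the quick start.

collect_videos() {  # list .mp4/.gif/.avi files directly under $1, sorted
  find "$1" -maxdepth 1 \( -name '*.mp4' -o -name '*.gif' -o -name '*.avi' \) | sort
}

batch_relight() {  # usage: batch_relight DIR "prompt"
  dir="$1"; prompt="$2"
  collect_videos "$dir" | while IFS= read -r clip; do
    if [ -n "${DRY_RUN:-}" ]; then
      echo "python run.py -i $clip -p \"$prompt\""   # preview the commands only
    else
      python run.py -i "$clip" -p "$prompt"
    fi
  done
}
```

Setting `DRY_RUN=1` first lets you preview the generated commands before committing GPU time.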
+ For fine-grained control, you can customize your own .yaml config file and run:
+ ```bash
+ python run.py --config path/to/your_config.yaml
+ ```
+ A good starting point is [configs/tclight_custom.yaml](https://github.com/Linketic/TC-Light/blob/main/configs/tclight_custom.yaml), which documents the most frequently used parameters with detailed explanations.
+
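For orientation only, such a config might look like the sketch below. Every key name here is hypothetical (we have not checked the real schema); treat `configs/tclight_custom.yaml` in the repo as the authoritative reference. The values mirror the quick-start defaults stated above:

```yaml
# Hypothetical sketch only: key names are illustrative, NOT the real schema.
# Consult configs/tclight_custom.yaml for the actual parameters.
input: /path/to/your/video        # .mp4, .gif, .avi, or an image folder
prompt: "your_prompt"
negative_prompt: "your_negative_prompt"
num_frames: 30                    # quick-start default
width: 960                        # quick-start default resolution
height: 720
multi_axis: false                 # decayed multi-axis denoising on/off
```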
+ <details>
+ <summary><span style="font-weight: bold;">Examples</span></summary>
+
+ #### Relight the entire field of view
+ ```bash
+ python run.py --config configs/examples/tclight_droid.yaml
+ ```
+ ```bash
+ python run.py --config configs/examples/tclight_navsim.yaml
+ ```
+ ```bash
+ python run.py --config configs/examples/tclight_scand.yaml
+ ```
+
+ #### Relight all three videos in parallel
+ ```bash
+ bash scripts/relight.sh
+ ```
+
+ #### Relight the foreground with a static background condition
+ ```bash
+ # we generate a compatible background image with the foreground mode of IC-Light, then remove the foreground and inpaint the image with tools like sider.ai
+ # for satisfactory results, a consistent and complete foreground segmentation is preferred; we use BriaRMBG by default
+ python run.py --config configs/examples/tclight_bkgd_robotwin.yaml
+ ```
+ </details>
+
+ For evaluation, you can simply use:
+ ```bash
+ python evaluate.py --output_dir path/to/your_output_dir --eval_cost
+ ```
+
+ ## 🔎 Behaviors
+ 1. Works better on video resolutions above 512x512, the minimum resolution used to train IC-Light. A higher resolution helps the consistency of image intrinsic properties.
+ 2. Works relatively better on realistic scenes than on synthetic scenes, in terms of both temporal consistency and physical plausibility.
+ 3. Struggles to drastically change the illumination of night scenarios or hard shadows, as in IC-Light.
+
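Point 1 above can be turned into a cheap pre-flight check before launching a long run. A sketch with our own helper names; obtaining the actual width/height (e.g. via `ffprobe`) is left to you:

```bash
#!/usr/bin/env sh
# Warn before relighting inputs below IC-Light's 512x512 training floor.
# Helper names are illustrative, not part of the TC-Light codebase.

resolution_ok() {  # usage: resolution_ok WIDTH HEIGHT
  [ "$1" -ge 512 ] && [ "$2" -ge 512 ]
}

check_input() {  # prints a warning instead of failing hard
  if resolution_ok "$1" "$2"; then
    echo "ok: ${1}x${2}"
  else
    echo "warning: ${1}x${2} is below 512x512; intrinsic consistency may degrade"
  fi
}
```

For example, `check_input 960 720` prints `ok: 960x720`, while `check_input 480 360` prints a warning.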
+ ## 📝 TODO List
+ - [x] Release the arXiv and the project page.
+ - [x] Release the code base.
+ - [ ] Release the dataset.
+
+ ## 🤗 Citation
+ If you find this repository useful for your research, please use the following BibTeX entry for citation.
+
+ ```bibtex
+ @article{Liu2025TCLight,
+   title={TC-Light: Temporally Coherent Generative Rendering for Realistic World Transfer},
+   author={Liu, Yang and Luo, Chuanchen and Tang, Zimo and Li, Yingyan and Yang, Yuran and Ning, Yuanyong and Fan, Lue and Peng, Junran and Zhang, Zhaoxiang},
+   journal={arXiv preprint arXiv:2506.18904},
+   year={2025},
+ }
+ ```
+
+ ## 👏 Acknowledgements
+
+ This repo benefits from [IC-Light](https://github.com/lllyasviel/IC-Light/), [VidToMe](https://github.com/lixirui142/VidToMe/), [Slicedit](https://github.com/fallenshock/Slicedit/), [RAVE](https://github.com/RehgLab/RAVE), and [Cosmos](https://github.com/NVIDIA/Cosmos). Thanks for their great work! The repo is still under development, and we are open to pull requests and discussions!