ViBT
Collection
3 items
โข
Updated
โข
2
This repository introduces Vision Bridge Transformer (ViBT), a large-scale instantiation of Brownian Bridge Models designed for efficient conditional generation. ViBT directly models the trajectory between inputs and outputs, creating an efficient data-to-data translation paradigm. The models demonstrate effectiveness for various image and video translation tasks, including instruction-based image editing and complex video translation.