Zhongwei Zhang

zzwustc

zzw-ustc

AI & ML interests

AIGC

Recent Activity

upvoted a paper 27 days ago

FARMER: Flow AutoRegressive Transformer over Pixels

upvoted a paper about 2 months ago

Ovi: Twin Backbone Cross-Modal Fusion for Audio-Video Generation

liked a model about 2 months ago

chetwinlow1/Ovi

Organizations

upvoted a paper 27 days ago

FARMER: Flow AutoRegressive Transformer over Pixels

Paper • 2510.23588 • Published 27 days ago • 57

upvoted a paper about 2 months ago

Ovi: Twin Backbone Cross-Modal Fusion for Audio-Video Generation

Paper • 2510.01284 • Published Sep 30 • 32

upvoted an article 4 months ago

You could have designed state of the art positional encoding

Article • Nov 25, 2024 • 398

upvoted 2 papers 6 months ago

JavisDiT: Joint Audio-Video Diffusion Transformer with Hierarchical Spatio-Temporal Prior Synchronization

Paper • 2503.23377 • Published Mar 30 • 57

MotionPro: A Precise Motion Controller for Image-to-Video Generation

Paper • 2505.20287 • Published May 26 • 20

upvoted a paper 11 months ago

Wonderland: Navigating 3D Scenes from a Single Image

Paper • 2412.12091 • Published Dec 16, 2024 • 16

upvoted a paper 12 months ago

Stable Flow: Vital Layers for Training-Free Image Editing

Paper • 2411.14430 • Published Nov 21, 2024 • 21

upvoted 2 papers about 1 year ago

GenXD: Generating Any 3D and 4D Scenes

Paper • 2411.02319 • Published Nov 4, 2024 • 20

ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning

Paper • 2411.05003 • Published Nov 7, 2024 • 71

upvoted 2 papers over 1 year ago

CAT3D: Create Anything in 3D with Multi-View Diffusion Models

Paper • 2405.10314 • Published May 16, 2024 • 48

TRIP: Temporal Residual Learning with Image Noise Prior for Image-to-Video Diffusion Models

Paper • 2403.17005 • Published Mar 25, 2024 • 13

upvoted 3 papers almost 2 years ago
