VISIONx @ NYU

university
https://www.sainingxie.com/
Recent Activity

AustinWang0330 new activity about 3 hours ago
nyu-visionx/siglip2_decoder: RAE repo fails when using google/siglip2-so400m-patch14-224 as encoder
bytetriper new activity 1 day ago
nyu-visionx/siglip2_decoder: RAE repo fails when using google/siglip2-so400m-patch14-224 as encoder
AustinWang0330  updated a collection 3 days ago
Scale RAE

Papers

Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders

SIMS-V: Simulated Instruction-Tuning for Spatial Video Understanding

Team members: Ellis Brown; Peter Tong; Manoj Middepogu; Sai Charitha Akula; Penghao Wu; Jihan Yang; Saining Xie; Bingda Tang; BoYang Zheng; Sayak Paul; Shusheng Yang; Chenyu, Li; Anjali W Gupta; Xichen Pan; Pinzhi Huang; Nanye Ma; Jaskirat Singh; Ziteng Wang; Junwan Kim
nyu-visionx's Papers (4)
Submitted by BoYang Zheng
51

Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders

nyu-visionx VISIONx @ NYU
202 2
Submitted by Ellis Brown
5

SIMS-V: Simulated Instruction-Tuning for Spatial Video Understanding

nyu-visionx VISIONx @ NYU
9 2
Submitted by Jihan Yang
8

Benchmark Designers Should "Train on the Test Set" to Expose Exploitable Non-Visual Shortcuts

nyu-visionx VISIONx @ NYU
2
Submitted by Peter Tong
166

Diffusion Transformers with Representation Autoencoders

nyu-visionx VISIONx @ NYU
1.75k 6