- MM-VID: Advancing Video Understanding with GPT-4V(ision)
  Paper • 2310.19773 • Published • 20
- Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models
  Paper • 2310.05863 • Published • 2
- Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks
  Paper • 2311.06242 • Published • 95
- I&S-ViT: An Inclusive & Stable Method for Pushing the Limit of Post-Training ViTs Quantization
  Paper • 2311.10126 • Published • 10
Zach Mustafa PRO
Zmu
AI & ML interests
None yet
Recent Activity
upvoted an article about 16 hours ago
New in llama.cpp: Model Management
liked a model 3 days ago
Qwen/Qwen-Image-Edit-2509
liked a model 3 days ago
huawei-bayerlab/windowseat-reflection-removal-v1-0