Modality Curation: Building Universal Embeddings for Advanced Multimodal Information Retrieval Paper • 2505.19650 • Published May 26
TUNA: Comprehensive Fine-grained Temporal Understanding Evaluation on Dense Dynamic Videos Paper • 2505.20124 • Published May 26
From Pixels to Tokens: Revisiting Object Hallucinations in Large Vision-Language Models Paper • 2410.06795 • Published Oct 9, 2024
Evil Geniuses: Delving into the Safety of LLM-based Agents Paper • 2311.11855 • Published Nov 20, 2023