Seeing the Forest and the Trees: Query-Aware Tokenizer for Long-Video Multimodal Language Models • Paper • 2511.11910 • Published 15 days ago • 35
CMI-Bench: A Comprehensive Benchmark for Evaluating Music Instruction Following • Paper • 2506.12285 • Published Jun 14 • 53