REVISOR: Beyond Textual Reflection, Towards Multimodal Introspective Reasoning in Long-Form Video Understanding Paper • 2511.13026 • Published 14 days ago • 24
Q-Frame: Query-aware Frame Selection and Multi-Resolution Adaptation for Video-LLMs Paper • 2506.22139 • Published Jun 27 • 2
HyperClick: Advancing Reliable GUI Grounding via Uncertainty Calibration Paper • 2510.27266 • Published about 1 month ago • 20