HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding Paper • 2601.14724 • Published 2 days ago • 52
DiRL: An Efficient Post-Training Framework for Diffusion Language Models Paper • 2512.22234 • Published Dec 23, 2025 • 20
Consensus Entropy: Harnessing Multi-VLM Agreement for Self-Verifying and Self-Improving OCR Paper • 2504.11101 • Published Apr 15, 2025 • 1
Beyond Real: Imaginary Extension of Rotary Position Embeddings for Long-Context LLMs Paper • 2512.07525 • Published Dec 8, 2025 • 59
GeoVista: Web-Augmented Agentic Visual Reasoning for Geolocalization Paper • 2511.15705 • Published Nov 19, 2025 • 97
Article The Heterogeneous Feature of RoPE-based Attention in Long-Context LLMs Nov 15, 2025 • 13
MPJudge: Towards Perceptual Assessment of Music-Induced Paintings Paper • 2511.07137 • Published Nov 10, 2025 • 6
Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm Paper • 2511.04570 • Published Nov 6, 2025 • 212
IFDECORATOR: Wrapping Instruction Following Reinforcement Learning with Verifiable Rewards Paper • 2508.04632 • Published Aug 6, 2025 • 2
Sel3DCraft: Interactive Visual Prompts for User-Friendly Text-to-3D Generation Paper • 2508.00428 • Published Aug 1, 2025 • 3 • 2