Scope: Selective Cross-modal Orchestration of Visual Perception Experts Paper • 2510.12974 • Published Oct 14
VeritasFi: An Adaptable, Multi-tiered RAG Framework for Multi-modal Financial Question Answering Paper • 2510.10828 • Published Oct 12 • 1
Improving GUI Grounding with Explicit Position-to-Coordinate Mapping Paper • 2510.03230 • Published Oct 3 • 3
WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation Paper • 2508.16763 • Published Aug 22 • 2
Enquire One's Parent and Child Before Decision: Fully Exploit Hierarchical Structure for Self-Supervised Taxonomy Expansion Paper • 2101.11268 • Published Jan 27, 2021
Improving Context Fidelity via Native Retrieval-Augmented Reasoning Paper • 2509.13683 • Published Sep 17 • 8
MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation Paper • 2406.07529 • Published Jun 11, 2024
FinSage: A Multi-aspect RAG System for Financial Filings Question Answering Paper • 2504.14493 • Published Apr 20
MultiFinBen: A Multilingual, Multimodal, and Difficulty-Aware Benchmark for Financial LLM Evaluation Paper • 2506.14028 • Published Jun 16 • 93
System-1.5 Reasoning: Traversal in Language and Latent Spaces with Dynamic Shortcuts Paper • 2505.18962 • Published May 25 • 12
FACT: Examining the Effectiveness of Iterative Context Rewriting for Multi-fact Retrieval Paper • 2410.21012 • Published Oct 28, 2024
R$^3$Mem: Bridging Memory Retention and Retrieval via Reversible Compression Paper • 2502.15957 • Published Feb 21
GraphOmni: A Comprehensive and Extendable Benchmark Framework for Large Language Models on Graph-theoretic Tasks Paper • 2504.12764 • Published Apr 17 • 41
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems Paper • 2504.01990 • Published Mar 31 • 300
AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding Paper • 2502.01341 • Published Feb 3 • 39
BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks Paper • 2412.04626 • Published Dec 5, 2024 • 14
LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models Paper • 2409.00509 • Published Aug 31, 2024 • 42
Resonance RoPE: Improving Context Length Generalization of Large Language Models Paper • 2403.00071 • Published Feb 29, 2024 • 24