interesting paper
FMViT: A multiple-frequency mixing Vision Transformer
Paper
• 2311.05707
• 7
DocLLM: A layout-aware generative language model for multimodal document understanding
Paper
• 2401.00908
• 189
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report
Paper
• 2405.00732
• 122
An Introduction to Vision-Language Modeling
Paper
• 2405.17247
• 90
TabReD: A Benchmark of Tabular Machine Learning in-the-Wild
Paper
• 2406.19380
• 49
Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems
Paper
• 2407.01370
• 89
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception
Paper
• 2410.12628
• 41
Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction
Paper
• 2410.21169
• 30