Language Is Not All You Need: Aligning Perception with Language Models Paper • 2302.14045 • Published Feb 27, 2023
LayoutLM: Pre-training of Text and Layout for Document Image Understanding Paper • 1912.13318 • Published Dec 31, 2019 • 4
TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering Paper • 2311.16465 • Published Nov 28, 2023 • 2
LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding Paper • 2012.14740 • Published Dec 29, 2020 • 2
LayoutXLM: Multimodal Pre-training for Multilingual Visually-rich Document Understanding Paper • 2104.08836 • Published Apr 18, 2021
LayoutReader: Pre-training of Text and Layout for Reading Order Detection Paper • 2108.11591 • Published Aug 26, 2021 • 1
TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models Paper • 2109.10282 • Published Sep 21, 2021 • 11
MarkupLM: Pre-training of Text and Markup Language for Visually-rich Document Understanding Paper • 2110.08518 • Published Oct 16, 2021 • 2
DiT: Self-supervised Pre-training for Document Image Transformer Paper • 2203.02378 • Published Mar 4, 2022 • 2
LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking Paper • 2204.08387 • Published Apr 18, 2022 • 5
Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models Paper • 2404.03622 • Published Apr 4, 2024 • 5
XDoc: Unified Pre-training for Cross-Format Document Understanding Paper • 2210.02849 • Published Oct 6, 2022
TableBank: A Benchmark Dataset for Table Detection and Recognition Paper • 1903.01949 • Published Mar 5, 2019
RedStone: Curating General, Code, Math, and QA Data for Large Language Models Paper • 2412.03398 • Published Dec 4, 2024 • 2
MMLU-CF: A Contamination-free Multi-task Language Understanding Benchmark Paper • 2412.15194 • Published Dec 19, 2024 • 1
PEACE: Empowering Geologic Map Holistic Understanding with MLLMs Paper • 2501.06184 • Published Jan 10, 2025
Think Only When You Need with Large Hybrid-Reasoning Models Paper • 2505.14631 • Published May 20, 2025 • 20
Model as a Game: On Numerical and Spatial Consistency for Generative Games Paper • 2503.21172 • Published Mar 27, 2025