ProtoReasoning: Prototypes as the Foundation for Generalizable Reasoning in LLMs Paper • 2506.15211 • Published Jun 18, 2025 • 38
MME-Reasoning: A Comprehensive Benchmark for Logical Reasoning in MLLMs Paper • 2505.21327 • Published May 27, 2025 • 83
NovelSeek: When Agent Becomes the Scientist -- Building Closed-Loop System from Hypothesis to Verification Paper • 2505.16938 • Published May 22, 2025 • 120
Beyond Theorem Proving: Formulation, Framework and Benchmark for Formal Problem-Solving Paper • 2505.04528 • Published May 7, 2025 • 12
Formal Problem-Solving Collection • This collection is part of the official implementation of Beyond Theorem Proving: Formulation, Framework and Benchmark for Formal Problem-Solving. • 5 items • Updated May 8, 2025 • 3
TrustGeoGen: Scalable and Formal-Verified Data Engine for Trustworthy Multi-modal Geometric Problem Solving Paper • 2504.15780 • Published Apr 22, 2025 • 6
SurveyForge: On the Outline Heuristics, Memory-Driven Generation, and Multi-dimensional Evaluation for Automated Survey Writing Paper • 2503.04629 • Published Mar 6, 2025 • 19
GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training Paper • 2412.11863 • Published Dec 16, 2024 • 4
Chimera: Improving Generalist Model with Domain-Specific Experts Paper • 2412.05983 • Published Dec 8, 2024 • 9
CDM: A Reliable Metric for Fair and Accurate Formula Recognition Evaluation Paper • 2409.03643 • Published Sep 5, 2024 • 19