HUME: Measuring the Human-Model Performance Gap in Text Embedding Tasks Paper • 2510.10062 • Published 16 days ago • 8
FocusAgent: Simple Yet Effective Ways of Trimming the Large Context of Web Agents Paper • 2510.03204 • Published 24 days ago • 6
LineRetriever: Planning-Aware Observation Reduction for Web Agents Paper • 2507.00210 • Published Jun 30 • 6
Maintaining MTEB: Towards Long Term Usability and Reproducibility of Embedding Benchmarks Paper • 2506.21182 • Published Jun 26 • 2
Article MIEB: The Benchmark That Stress-Tests Image-Text Embeddings Like Never Before By isaacchung and 2 others • Apr 24 • 14
MTEB-French Collection A collection of datasets used for the French sentence embedding benchmark. • 6 items • Updated Apr 17