HUME: Measuring the Human-Model Performance Gap in Text Embedding Task Paper • 2510.10062 • Published Oct 11 • 8
Dynaword: From One-shot to Continuously Developed Datasets Paper • 2508.02271 • Published Aug 4 • 14
TextDescriptives: A Python package for calculating a large variety of metrics from text Paper • 2301.02057 • Published Jan 5, 2023
MMTEB: Massive Multilingual Text Embedding Benchmark Paper • 2502.13595 • Published Feb 19 • 42
Encoder vs Decoder: Comparative Analysis of Encoder and Decoder Language Models on Multilingual NLU Tasks Paper • 2406.13469 • Published Jun 19, 2024
Augmenty: A Python Library for Structured Text Augmentation Paper • 2312.05520 • Published Dec 9, 2023
DANSK and DaCy 2.6.0: Domain Generalization of Danish Named Entity Recognition Paper • 2402.18209 • Published Feb 28, 2024 • 1
The Scandinavian Embedding Benchmarks: Comprehensive Assessment of Multilingual and Monolingual Text Embedding Paper • 2406.02396 • Published Jun 4, 2024