Article Introducing MTEB v2: Evaluation of embedding and retrieval systems for more than just text By isaacchung and 2 others • 11 days ago • 33
Article Introducing RTEB: A New Standard for Retrieval Evaluation about 1 month ago • 118
BrowseComp-Plus: A More Fair and Transparent Evaluation Benchmark of Deep-Research Agent Paper • 2508.06600 • Published Aug 8 • 40
Article Training and Finetuning Sparse Embedding Models with Sentence Transformers v5 Jul 1 • 126
Fixing Data That Hurts Performance: Cascading LLMs to Relabel Hard Negatives for Robust Information Retrieval Paper • 2505.16967 • Published May 22 • 24
RLHN Datasets Collection RLHN: Cleaned Training Datasets with False Negatives Identified & Relabeled as ground truth. • 5 items • Updated May 23 • 4
Multilingual SFT & DPO Datasets Collection These SFT or DPO datasets were translated from English using Mistral-7B-Instruct-v0.2 or taken from other sources. • 8 items • Updated Mar 31 • 3
FreshStack: Building Realistic Benchmarks for Evaluating Retrieval on Technical Documents Paper • 2504.13128 • Published Apr 17 • 7
Multimodal DSE Retrievers Collection A collection of DSE models for multimodal retrieval • 5 items • Updated Apr 15 • 15
🌐 NoMIRACL Dataset [EMNLP'24] Collection A collection of multilingual relevance assessment datasets. We also have SFT fine-tuned models (Mistral-7B & Llama-3 8B) • 7 items • Updated Mar 31 • 1
🏜️MIRAGE-Bench [NAACL'25] Collection Dataset Collection from the MIRAGE-Bench paper • 13 items • Updated Mar 31 • 2
DRAMA Collection A collection of small (sub-1B) multilingual dense retrievers that generalize well across a number of tasks and languages. • 3 items • Updated Feb 26 • 7
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 420
NoMIRACL: Knowing When You Don't Know for Robust Multilingual Retrieval-Augmented Generation Paper • 2312.11361 • Published Dec 18, 2023 • 1
MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark Paper • 2406.01574 • Published Jun 3, 2024 • 51
Article Training and Finetuning Embedding Models with Sentence Transformers v3 May 28, 2024 • 257