NIFE models Collection Nearly Inference Free Embedding (NIFE) models trained using pyNIFE: github.com/stephantul/pynife • 2 items • Updated 20 days ago • 2
Article Introducing MTEB v2: Evaluation of embedding and retrieval systems for more than just text Oct 20 • 33
HUME: Measuring the Human-Model Performance Gap in Text Embedding Tasks Paper • 2510.10062 • Published Oct 11 • 8
Mxbai-large-v1 EmbedPress Collection Large datasets of mxbai-large-v1 embeddings with their truncated texts. Useful for distillation • 13 items • Updated about 1 month ago • 2
The Ultimate Collection of Code Classifiers Collection 🔥 15 classifiers, 124M parameters, one per programming language, for assessing the educational value of GitHub code • 15 items • Updated May 5 • 15
Article Introducing EuroBERT: A High-Performance Multilingual Encoder Model Mar 10 • 146
Article Agentic RAG Stack (1/5) - Index and retrieve documents for vector search using Sentence Transformers and DuckDB Jan 27 • 21
POTION Collection These are the flagship POTION models. Load them and use them with model2vec (https://github.com/MinishLab/model2vec) or sentence-transformers • 6 items • Updated 11 days ago • 14
NanoBEIR 🍺 Collection A collection of smaller versions of BEIR datasets with 50 queries and up to 10K documents each. • 13 items • Updated Sep 11, 2024 • 23
Article Model2Vec: Distill a Small Fast Model from any Sentence Transformer Oct 14, 2024 • 97
Model2Vec base models Collection These are the MinishLab Model2Vec base models. Load them and use them with model2vec (https://github.com/MinishLab/model2vec) or sentence-transformers • 10 items • Updated 11 days ago • 9