Fix various snippets; add required safe_serialization

by tomaarsen HF Staff - opened Feb 14, 2024

base: refs/heads/main ← from: refs/pr/2


tomaarsen

Nomic AI org Feb 14, 2024

edited Feb 14, 2024

Hello!

Pull Request overview

Fix various snippets: point to nomic-ai/nomic-embed-text-v1.5 rather than "." or nomic-ai/nomic-embed-text-v1.
Add safe_serialization=True.

Details

The serialization parameter is required because of this line. Without safe_serialization=True, loading only works for models stored as pytorch_model.bin, whereas your model is uploaded in the newer model.safetensors format.
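To make the failure mode concrete, here is a minimal sketch of the kind of file selection described above. It is not the actual loading code: `weights_filename` and `can_load` are hypothetical helpers, and the only assumption carried over from the description is that the flag decides which of the two weight files is looked up.

```python
def weights_filename(safe_serialization: bool = False) -> str:
    # Hypothetical helper: choose the weight file the loader would look for.
    # Assumption: the flag defaults to False, i.e. the legacy .bin format.
    return "model.safetensors" if safe_serialization else "pytorch_model.bin"


def can_load(repo_files, safe_serialization: bool = False) -> bool:
    # Loading can only succeed if the expected weight file exists in the repo.
    return weights_filename(safe_serialization) in repo_files


# A repository that only ships the newer safetensors weights.
repo_files = ["config.json", "model.safetensors"]

print(can_load(repo_files))                           # False
print(can_load(repo_files, safe_serialization=True))  # True
```

With only model.safetensors in the repository, the default lookup finds nothing, which is why the snippets need the flag spelled out.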

Tom Aarsen

Fix various snippets; add required safe_serialization (commit 6d202215)

tomaarsen changed pull request status to open Feb 14, 2024

zpn

Feb 14, 2024

agh thank you for this!

zpn changed pull request status to merged Feb 14, 2024

tomaarsen

Nomic AI org Feb 14, 2024

Feel free to test this with:

import torch.nn.functional as F
from sentence_transformers import SentenceTransformer

# Target size of the truncated (Matryoshka) embeddings
matryoshka_dim = 512

# Load the model from this pull request's revision
model = SentenceTransformer("nomic-ai/nomic-embed-text-v1.5", trust_remote_code=True, revision="refs/pr/2")
sentences = ['search_query: What is TSNE?', 'search_query: Who is Laurens van der Maaten?']
embeddings = model.encode(sentences, convert_to_tensor=True)
# Layer-normalize over the full embedding dimension before truncating
embeddings = F.layer_norm(embeddings, normalized_shape=(embeddings.shape[1],))
# Keep only the first matryoshka_dim components
embeddings = embeddings[:, :matryoshka_dim]
# L2-normalize the truncated embeddings
embeddings = F.normalize(embeddings, p=2, dim=1)
print(embeddings)
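For anyone who wants to see what the three post-processing steps do without downloading the model, here is a rough pure-Python sketch on a toy vector. The helper names and the toy values are made up for illustration; the defaults (eps of 1e-5 for the layer norm, 1e-12 for the L2 normalization, no learned scale or bias) are meant to mirror the PyTorch calls above.

```python
import math


def layer_norm(vec, eps=1e-5):
    # Normalize to zero mean and unit (population) variance, no scale or bias.
    mean = sum(vec) / len(vec)
    var = sum((x - mean) ** 2 for x in vec) / len(vec)
    return [(x - mean) / math.sqrt(var + eps) for x in vec]


def l2_normalize(vec, eps=1e-12):
    # Scale the vector to unit Euclidean length.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]


def resize_embedding(vec, dim):
    # Same order as the snippet above: layer norm, truncate, then L2-normalize.
    return l2_normalize(layer_norm(vec)[:dim])


toy = [0.5, -1.0, 2.0, 0.0, 1.5, -0.5]  # stand-in for a real embedding
short = resize_embedding(toy, 3)
print(len(short))
print(math.sqrt(sum(x * x for x in short)))
```

The truncated vector has the requested length and unit norm, so dot products between such vectors can be read as cosine similarities.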

Tom Aarsen
