Fix various snippets; add required safe_serialization
#2, opened by tomaarsen (HF Staff)
Hello!
Pull Request overview
- Fix various snippets: point to nomic-ai/nomic-embed-text-v1.5 rather than "." or nomic-ai/nomic-embed-text-v1.
- Add safe_serialization=True.
Details
The serialization parameter is required because of this line. Without safe_serialization=True, loading only works for models that ship pytorch_model.bin, and your model is uploaded in the newer model.safetensors format.
- Tom Aarsen
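To make the explanation above concrete, here is a minimal sketch of the kind of branch being described. It is not the actual loading code behind "this line": the helper name pick_weights_filename and the available_files set are hypothetical, and the only facts taken from this thread are the two file names and the flag.

```python
def pick_weights_filename(safe_serialization: bool = False) -> str:
    """Hypothetical helper: choose which weights file a loader would look for."""
    # Assumption for illustration: the flag switches between the two formats
    # named in the PR description; the real loading code is not reproduced here.
    return "model.safetensors" if safe_serialization else "pytorch_model.bin"


# Assumed repository contents, following the PR description
# (the model is uploaded in the safetensors format).
available_files = {"model.safetensors"}

for flag in (False, True):
    name = pick_weights_filename(flag)
    status = "found" if name in available_files else "missing"
    print(f"safe_serialization={flag}: looks for {name} ({status})")
```

Under these assumptions, the default path looks for a file the repository does not contain, which is why the snippets need the flag set explicitly.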
tomaarsen changed pull request status to open
agh, thank you for this!
zpn changed pull request status to merged
Feel free to test this with:
```python
import torch.nn.functional as F
from sentence_transformers import SentenceTransformer

matryoshka_dim = 512

# Load the model from this pull request's revision.
model = SentenceTransformer("nomic-ai/nomic-embed-text-v1.5", trust_remote_code=True, revision="refs/pr/2")
sentences = ['search_query: What is TSNE?', 'search_query: Who is Laurens van der Maaten?']
embeddings = model.encode(sentences, convert_to_tensor=True)
# Layer-normalize over the full embedding width, truncate to matryoshka_dim, then L2-normalize.
embeddings = F.layer_norm(embeddings, normalized_shape=(embeddings.shape[1],))
embeddings = embeddings[:, :matryoshka_dim]
embeddings = F.normalize(embeddings, p=2, dim=1)
print(embeddings)
```
- Tom Aarsen
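The post-processing in the test snippet (layer normalization, truncation to matryoshka_dim, then L2 normalization) can also be traced without downloading the model. The following is a rough pure-Python sketch of those three steps on a single toy vector. The function name matryoshka_truncate and the input values are made up for illustration, and the eps defaults are assumed to mirror the PyTorch defaults.

```python
import math


def matryoshka_truncate(vector, dim, eps=1e-5):
    """Layer-normalize a vector, keep its first `dim` values, then L2-normalize."""
    n = len(vector)
    mean = sum(vector) / n
    # Biased variance (divide by n), as layer normalization uses.
    var = sum((x - mean) ** 2 for x in vector) / n
    normed = [(x - mean) / math.sqrt(var + eps) for x in vector]

    # Matryoshka step: keep only the leading `dim` components.
    truncated = normed[:dim]

    # Rescale to unit length; the floor guards against division by zero.
    norm = math.sqrt(sum(x * x for x in truncated))
    return [x / max(norm, 1e-12) for x in truncated]


# Toy 6-dimensional "embedding" (made-up values), truncated to 4 dimensions.
example = matryoshka_truncate([0.5, -1.0, 2.0, 0.25, -0.75, 1.5], dim=4)
print(len(example))
print(math.sqrt(sum(x * x for x in example)))
```

The result has dim entries and unit length, so cosine similarity between truncated embeddings reduces to a plain dot product.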