Unable to deploy model with Hugging Face TEI
Hi, has anyone been able to deploy this model with Hugging Face TEI on SageMaker? I am trying to work out how to use it in SageMaker and compute the sparse embeddings through the endpoint.
Hi @dhruv-wrk, I'm not very familiar with HF TEI, but I will take a look into it. Until we have a solution, you can try this tutorial to get the model deployed on SageMaker: https://github.com/opensearch-project/ml-commons/blob/main/docs/model_serving_framework/deploy_sparse_model_to_SageMaker.ipynb
I am facing the same issue as @dhruv-wrk.
Here is the code as given on the model card:
import json
import sagemaker
import boto3
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client('iam')
    role = iam.get_role(RoleName='sagemaker_execution_role')['Role']['Arn']

# Hub Model configuration. https://huggingface.co/models
hub = {
    'HF_MODEL_ID': 'opensearch-project/opensearch-neural-sparse-encoding-doc-v3-gte'
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    image_uri=get_huggingface_llm_image_uri("huggingface-tei", version="1.8.2"),
    env=hub,
    role=role,
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
)

# send request
predictor.predict({
    "inputs": "My name is Clara and I am",
})
Hi @dhruv-wrk @vishva399 ,
To use SPLADE pooling in TEI, @vishva399's code needs one change: add a "POOLING" field to env. Note that the example below also uses a v2 model; the reason follows.
hub = {
    'HF_MODEL_ID': 'opensearch-project/opensearch-neural-sparse-encoding-doc-v2-mini',
    "POOLING": "splade",
}
Furthermore, TEI's pooling logic is hard-coded: https://github.com/huggingface/text-embeddings-inference/blob/9ef569d83083afa30784223d0a0352229d094898/backends/python/server/text_embeddings_server/models/pooling.py#L38
The v3 series uses log1p_relu, which differs from TEI's implementation, so we recommend using the v1/v2 series models with TEI.
Supporting the new pooling option would require opening an issue or PR against the huggingface/text-embeddings-inference repo.
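For anyone wondering why the two pooling variants are not interchangeable, here is a minimal pure-Python sketch. It rests on two assumptions that are not confirmed in this thread: that TEI's SPLADE pooling computes log(1 + relu(x)) followed by a max over tokens, and that log1p_relu means one extra log1p on top of that, i.e. log(1 + log(1 + relu(x))). The function name splade_max_pool and the toy logits are made up for illustration; this is not the actual code of TEI or of the model.

```python
import math


def splade_max_pool(token_logits, attention_mask, double_log=False):
    """Max-pool per-token vocabulary logits into one sparse weight vector.

    token_logits:   seq_len x vocab_size nested list of MLM logits
    attention_mask: seq_len list of 0/1 flags (0 = padding token)
    double_log:     False -> log(1 + relu(x))            (assumed TEI "splade")
                    True  -> log(1 + log(1 + relu(x)))   (assumed "log1p_relu")
    """
    vocab_size = len(token_logits[0])
    pooled = [0.0] * vocab_size
    for logits, keep in zip(token_logits, attention_mask):
        if not keep:
            continue  # padding tokens contribute nothing
        for j, x in enumerate(logits):
            weight = math.log1p(max(x, 0.0))  # log(1 + relu(x))
            if double_log:
                weight = math.log1p(weight)  # second log1p (assumption)
            pooled[j] = max(pooled[j], weight)
    return pooled


# Hypothetical logits: 3 tokens over a 3-word vocabulary, last token is padding.
toy_logits = [
    [2.0, -1.0, 0.5],
    [0.0, 3.0, -2.0],
    [9.0, 9.0, 9.0],
]
toy_mask = [1, 1, 0]

single = splade_max_pool(toy_logits, toy_mask)
double = splade_max_pool(toy_logits, toy_mask, double_log=True)
print(single)
print(double)
```

Under these assumptions the same logits produce smaller, more compressed weights with the double log, so serving a v3 model behind single-log pooling would give term weights on a different scale from the one the model was trained with.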
Thank you for your quick response, it was very helpful.