ONNX Model

Converted from: granite-embedding-reranker-english-r2

Files

  • model.onnx - FP32 version
  • model_quantized.onnx - INT8 quantized version
  • *.json - tokenizer and config files

Usage

from transformers import AutoTokenizer
import onnxruntime as ort

tokenizer = AutoTokenizer.from_pretrained("granite-onnx")
session = ort.InferenceSession("granite-onnx/model_quantized.onnx")

# Encode a query/passage pair and run the session. The exact input
# names (input_ids, attention_mask, ...) must match the exported
# graph; inspect session.get_inputs() if the names differ.
inputs = tokenizer("query", "candidate passage", return_tensors="np")
logits = session.run(None, dict(inputs))[0]
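Assuming this reranker export returns one raw relevance logit per query/passage pair (an assumption about the model output, not stated above), the logits can be mapped to (0, 1) with a sigmoid and the passages sorted by score. A minimal sketch with hypothetical logit values:

```python
import numpy as np

def rerank(logits, documents):
    """Sort documents by sigmoid(logit), most relevant first."""
    scores = 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=np.float64)))
    order = np.argsort(-scores)
    return [(documents[i], float(scores[i])) for i in order]

# Hypothetical logits as they might come back from session.run(...)
ranked = rerank([-1.2, 2.5, 0.3], ["doc a", "doc b", "doc c"])
print(ranked[0][0])  # highest-scoring document: "doc b"
```

In a real pipeline the logits would come from running the session on each query/passage pair; only the sigmoid-and-sort step is shown here because it is independent of the model files.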