embeddinggemma-300M Sentence-Transformers GGUF

This GGUF model includes all sentence-transformers modules, including the dense layers.

Recommended way to run this model:

llama-server -hf sabafallah/embeddinggemma-300m-sentence-transformers-qat-q4_0-gguf --embeddings

Once the server is running, the embedding endpoint is available at http://localhost:8080/embedding. For example, using curl:

curl --request POST \
    --url http://localhost:8080/embedding \
    --header "Content-Type: application/json" \
    --data '{"input": "task: sentence similarity | query: Hello embeddings"}' \
    --silent
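
The same request can also be sent from Python. The following is a minimal sketch using only the standard library, assuming the server is running on the default address shown above. The helper names (build_prompt, build_request, fetch_embedding_response, cosine_similarity) are hypothetical and not part of llama.cpp. The exact shape of the JSON response may differ between llama.cpp versions, so the sketch returns the parsed response as-is rather than assuming where the vector is located.

```python
import json
import math
import urllib.request

# Assumed default address, taken from the curl example above.
ENDPOINT = "http://localhost:8080/embedding"


def build_prompt(text, task="sentence similarity"):
    """Prefix the text with the task string used in the curl example."""
    return f"task: {task} | query: {text}"


def build_request(text, url=ENDPOINT):
    """Build the same POST request that the curl example sends."""
    payload = json.dumps({"input": build_prompt(text)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_embedding_response(text, url=ENDPOINT):
    """Send the request and return the parsed JSON (needs a running server)."""
    with urllib.request.urlopen(build_request(text, url)) as resp:
        return json.load(resp)


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

With the server running, fetch_embedding_response("Hello embeddings") returns the decoded JSON. Once the embedding vectors have been extracted from two such responses, cosine_similarity can compare them.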

Alternatively, the llama-embedding command line tool can be used:

llama-embedding -hf sabafallah/embeddinggemma-300m-sentence-transformers-qat-q4_0-gguf --verbose-prompt -p "task: sentence similarity | query: Hello embeddings"

Model size: 0.3B params
Architecture: gemma-embedding
Quantization: 4-bit