embeddinggemma-300M Sentence-Transformers GGUF

This GGUF model includes all of the sentence-transformers modules, including the dense modules.

The recommended way to run this model is with llama-server:

llama-server -hf sabafallah/embeddinggemma-300m-sentence-transformers-qat-q4_0-gguf --embeddings

The embedding endpoint is then available at http://localhost:8080/embedding, for example with curl:

curl --request POST \
    --url http://localhost:8080/embedding \
    --header "Content-Type: application/json" \
    --data '{"input": "task: sentence similarity | query: Hello embeddings"}' \
    --silent
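
The same request can also be sent from a script. The following is a minimal Python sketch, not part of the original model card, that posts a prompt to the endpoint shown above and compares two embeddings by cosine similarity. The helper names (format_query, extract_vector, embed, cosine_similarity) are hypothetical, and the response layout handled in extract_vector is an assumption that should be checked against the output of your llama-server build.

```python
import json
import math
import urllib.request

# Endpoint taken from the curl example; adjust host/port if needed.
EMBEDDING_URL = "http://localhost:8080/embedding"


def format_query(text, task="sentence similarity"):
    """Build the 'task: ... | query: ...' prompt used in the examples."""
    return f"task: {task} | query: {text}"


def extract_vector(payload):
    """Pull one embedding vector out of a decoded JSON response.

    ASSUMPTION: the response is either a list of objects or an object
    with a "data" list, and each object has an "embedding" field holding
    a flat vector or a single-row nested list. Verify this against the
    actual server response and adjust if it differs.
    """
    items = payload["data"] if isinstance(payload, dict) else payload
    vector = items[0]["embedding"]
    if vector and isinstance(vector[0], list):
        vector = vector[0]  # unwrap a nested [[...]] vector
    return [float(x) for x in vector]


def embed(text, url=EMBEDDING_URL):
    """POST one prompt to the running server and return its embedding."""
    body = json.dumps({"input": format_query(text)}).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return extract_vector(json.load(response))


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm
```

With llama-server running as shown above, cosine_similarity(embed("Hello embeddings"), embed("Hi there, vectors")) would give a similarity score for the two inputs. No network call happens until embed is invoked.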

Alternatively, the llama-embedding command line tool can be used:

llama-embedding -hf sabafallah/embeddinggemma-300m-sentence-transformers-qat-q4_0-gguf --verbose-prompt -p "task: sentence similarity | query: Hello embeddings"


Format: GGUF
Model size: 0.3B params
Architecture: gemma-embedding
Quantization: 4-bit


Base model: google/embeddinggemma-300m-qat-q4_0-unquantized (this model is a quantized version of it)
