GGUF for Ollama

#3
by SwimTreeWire - opened

I would like to use this with Ollama. How can I make a GGUF from this repo?

This is a new architecture, and support hasn't been merged into llama.cpp yet.

How can this be achieved? Can I somehow make a GGUF myself and upload it?

Nope, that means someone has to write the support for the model in the backend itself. You can probably sub to https://github.com/ggml-org/llama.cpp/issues/15748 to get updates.

Changes have been merged to llama.cpp and are hopefully coming to Ollama 🥳

I've been experimenting today and wrote up how I got Apertus running (experimentally) in Ollama on my Mac here: https://gist.github.com/pd95/7841bb5d15220773c4ca8666f024c7c9

Swiss AI Initiative org

This is now supported in https://github.com/ggml-org/llama.cpp

You can already find many GGUF quantizations on Hugging Face: https://huggingface.co/models?library=gguf&sort=trending&search=apertus

Those will soon work in Ollama as well (but Ollama first has to update to the most recent llama.cpp code).

awesome!

This should now also work with Ollama.
https://github.com/ollama/ollama/issues/12149

Confirmed, thanks. See my blog post for more details & instructions https://log.alets.ch/110/#using-ollama
