https://huggingface.co/swap-uniba/LLaMAntino-2-70b-hf-UltraChat-ITA
Please add GGUF quantized versions.
Many thanks in advance for your wonderful work!
Unfortunately, we tried this before and it failed; it seems the tokenizer does not match the model:
```
WARNING:hf-to-gguf:ignore token 32000: id is out of range, max=31999
  File "/root/cvs/llama.cpp/convert_hf_to_gguf.py", line 823, in _create_vocab_sentencepiece
    if toktypes[token_id] != SentencePieceTokenTypes.UNUSED:
       ~~~~~~~~^^^^^^^^^^
IndexError: list index out of range
```
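The warning suggests the tokenizer defines a token with id 32000 while the model's embedding table only covers ids 0–31999. A minimal sketch of how one might check for this mismatch locally, assuming a downloaded model directory with the usual `config.json` and an `added_tokens.json` (the helper name and file layout are assumptions, not part of the conversion script):

```python
import json
from pathlib import Path


def find_out_of_range_tokens(model_dir: str) -> list[tuple[int, str]]:
    """Return (id, token) pairs whose ids fall outside the model's vocab.

    Hypothetical helper: compares `vocab_size` from config.json against the
    ids in added_tokens.json, which is where extra tokens such as id 32000
    typically come from.
    """
    root = Path(model_dir)
    vocab_size = json.loads((root / "config.json").read_text())["vocab_size"]

    added_path = root / "added_tokens.json"
    added = json.loads(added_path.read_text()) if added_path.exists() else {}

    # Any added token with id >= vocab_size would trigger the
    # "id is out of range" warning during GGUF conversion.
    return sorted((tid, tok) for tok, tid in added.items() if tid >= vocab_size)
```

If this reports offenders, the usual fixes are to resize the model's embeddings to match the tokenizer or to drop the extra tokens before converting.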
We have a preprocessor that tries to fix such things, I'll see if I can extend it for this architecture.
many thanks
It should work; you can watch the status of the model at http://hf.tst.eu/status.html (imatrix quants are delayed due to another big model).
Unfortunately, imatrix generation failed, and I am not sure llama.cpp can load the static quants. I am not convinced this error is a problem with the model per se, so I will keep the static quants:
```
/llmjob/llama.cpp-cuda512/tools/imatrix/imatrix.cpp:915: GGML_ASSERT(!llama_vocab_get_add_eos(vocab)) failed
```