Instructions for using arcee-ai/SuperNova-Medius with libraries, inference providers, notebooks, and local apps.

Libraries

How to use arcee-ai/SuperNova-Medius with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="arcee-ai/SuperNova-Medius")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("arcee-ai/SuperNova-Medius")
model = AutoModelForCausalLM.from_pretrained("arcee-ai/SuperNova-Medius")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use arcee-ai/SuperNova-Medius with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "arcee-ai/SuperNova-Medius"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "arcee-ai/SuperNova-Medius",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
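
The same request can also be sent from Python. The following is a minimal sketch using only the standard library, assuming the vLLM server started above is listening on `localhost:8000` and returns the usual OpenAI-style response shape (`choices[0].message.content`). The helper names `build_chat_request` and `chat` are hypothetical, chosen for illustration.

```python
import json
import urllib.request


def build_chat_request(prompt, model="arcee-ai/SuperNova-Medius",
                       base_url="http://localhost:8000"):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request and return the assistant's reply text.

    Assumes the response follows the OpenAI chat completion format.
    """
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Usage (requires a running server):
#   print(chat("What is the capital of France?"))
```

To target the SGLang server instead, pass `base_url="http://localhost:30000"`.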

Use Docker

docker model run hf.co/arcee-ai/SuperNova-Medius

SGLang

How to use arcee-ai/SuperNova-Medius with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "arcee-ai/SuperNova-Medius" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "arcee-ai/SuperNova-Medius",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "arcee-ai/SuperNova-Medius" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "arcee-ai/SuperNova-Medius",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner

How to use arcee-ai/SuperNova-Medius with Docker Model Runner:

```
docker model run hf.co/arcee-ai/SuperNova-Medius
```

llama.cpp conversion problem report (about `tokenizer.json`)

by DataSoul - opened Oct 12, 2024

Discussion

DataSoul

Oct 12, 2024

I attempted to convert this model to GGUF using the convert_hf_to_gguf.py script from llama.cpp, but encountered an error:

[
FileNotFoundError: File not found: F:\OpensourceAI-models\SuperNova-Medius\tokenizer.model
Exception: data did not match any variant of untagged enum ModelWrapper at line 757443 column 3
]

After downloading tokenizer.json from Qwen2.5-14B and using it to replace the file of the same name in this model's directory, I was able to convert the model to GGUF successfully.

I made a rough comparison of the two "tokenizer.json" files and found that they are mostly similar except for some formatting differences. This model's tokenizer.json has an additional line "ignore_merges": false, while other parts seem unchanged.

I am unsure of the reason behind this issue, nor do I know if others might encounter a similar problem. Therefore, I report it here for reference.
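
For anyone who wants to repeat the comparison described above more systematically, here is a rough sketch that walks two parsed `tokenizer.json` files and lists the keys where they differ. It is an illustration only, not part of llama.cpp or the original report: `diff_json` and `report` are hypothetical helper names, and the file paths passed to `report` are whatever local copies you have.

```python
import json

MISSING = "<missing>"  # marker for a key present on only one side


def diff_json(left, right, path=""):
    """Yield (path, left_value, right_value) wherever two parsed JSON values differ.

    Dicts are compared key by key; any other value (lists, strings, numbers)
    is compared as a whole.
    """
    if isinstance(left, dict) and isinstance(right, dict):
        for key in sorted(set(left) | set(right)):
            child = f"{path}.{key}" if path else key
            if key not in left:
                yield child, MISSING, right[key]
            elif key not in right:
                yield child, left[key], MISSING
            else:
                yield from diff_json(left[key], right[key], child)
    elif left != right:
        yield path or "<root>", left, right


def report(left_path, right_path, width=80):
    """Print one line per difference, truncating long values such as merge lists."""
    with open(left_path, encoding="utf-8") as f:
        left = json.load(f)
    with open(right_path, encoding="utf-8") as f:
        right = json.load(f)
    for path, a, b in diff_json(left, right):
        print(f"{path}: {repr(a)[:width]} != {repr(b)[:width]}")


# Usage (paths are placeholders for your local files):
#   report("SuperNova-Medius/tokenizer.json", "Qwen2.5-14B/tokenizer.json")
```

Because the files are parsed before comparison, pure whitespace or indentation differences do not show up; only differences in keys and values are reported.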

Crystalcareai

Arcee AI org Oct 12, 2024

I appreciate the report. I'll loop in @bartowski, as he did our GGUF conversions.

gopi87

Oct 13, 2024

@Crystalcareai I did chat with the fp16 GGUF, but it's not doing very well; it's pretty slow, tbh.

xellDart

Oct 13, 2024

AWQ with dataset calibration?

chargoddard

Arcee AI org Oct 13, 2024

If you update transformers and tokenizers this error should go away.

bartowski

Arcee AI org Oct 13, 2024

I actually did have a problem with the tokenizer, but I think I got past it for the conversion because my Docker image had a more up-to-date version than my main OS. So yes, tokenizers and/or transformers definitely need an update.
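
Since both replies above point to outdated packages, a quick way to see what is installed in the environment that runs the conversion script is sketched below. It only reports versions; the thread does not state a minimum required version, so none is assumed here. The helper name `installed_versions` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("transformers", "tokenizers")):
    """Return {package: version string, or None if the package is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Print the versions visible to the current Python interpreter.
for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'not installed'}")
```

If the versions look old, `pip install -U transformers tokenizers` upgrades both. Run the check with the same interpreter that runs convert_hf_to_gguf.py, since a Docker image and the host OS can have different versions installed.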

DataSoul

Oct 14, 2024

Thanks for the suggestions. I will close this topic later, then. 😊

DataSoul changed discussion status to closed Oct 14, 2024
