add vllm example in readme
README.md
## How to use in vLLM
The [PR](https://github.com/vllm-project/vllm/pull/27396) adding support for the Motif model in the official vLLM package is currently under review.
Alternatively, to use our model with vLLM without building from source, please use this prebuilt container [image](https://github.com/motiftechnologies/vllm/pkgs/container/vllm).
Our model supports a sequence length of up to 32K tokens.
```bash
# run the vLLM API server
VLLM_ATTENTION_BACKEND="DIFFERENTIAL_FLASH_ATTN" vllm serve Motif-Technologies/Motif-2-12.7B-Instruct --trust-remote-code --data-parallel-size <gpu_count>

# send a chat request with curl
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital city of South Korea?"}
    ],
    "temperature": 0.6,
    "skip_special_tokens": false,
    "chat_template_kwargs": {
      "enable_thinking": true
    }
  }'
```
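To check that the server is up before sending chat requests, you can hit the OpenAI-compatible model listing endpoint that vLLM exposes:

```bash
# list the models served by the running instance
curl http://localhost:8000/v1/models
```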