Update README.md
README.md CHANGED

@@ -234,7 +234,7 @@ messages = [
 sampling_params = SamplingParams(max_tokens=128_000)
 
 # note that running this model on GPU requires over 300 GB of GPU RAM
-llm = LLM(model=model_name, tokenizer_mode="mistral",
+llm = LLM(model=model_name, tokenizer_mode="mistral", tensor_parallel_size=8, limit_mm_per_prompt={"image": 4})
 
 outputs = llm.chat(messages, sampling_params=sampling_params)
 
@@ -249,7 +249,7 @@ You can also use Pixtral-Large-Instruct-2411 in a server/client setting.
 1. Spin up a server:
 
 ```
-vllm serve mistralai/Pixtral-Large-Instruct-2411 --tokenizer_mode mistral --limit_mm_per_prompt 'image=4'
+vllm serve mistralai/Pixtral-Large-Instruct-2411 --tokenizer_mode mistral --limit_mm_per_prompt 'image=4' --tensor_parallel_size 8
 ```
 
 2. And ping the client:
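The README's step 2 ("And ping the client:") is cut off by the hunk boundary, so its example does not appear in this diff. As a minimal sketch of that step, assuming the server above is running on vLLM's default port 8000 and is queried through its OpenAI-compatible API (the prompt and image URL are placeholders, not taken from the README):

```python
from openai import OpenAI

# Point the OpenAI client at the local vLLM server (default port 8000 assumed).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# The prompt and image URL below are placeholders for illustration only.
response = client.chat.completions.create(
    model="mistralai/Pixtral-Large-Instruct-2411",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {"type": "image_url", "image_url": {"url": "https://example.com/image.png"}},
            ],
        }
    ],
    max_tokens=256,
)
print(response.choices[0].message.content)
```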