Qwen/Qwen3-30B-A3B-Thinking-2507



hzhwcmhf committed on Aug 17

Commit 144afc2 · verified · 1 parent: 328ebc6

Update README.md

Files changed (1)

README.md +3 -3


@@ -245,9 +245,9 @@ After updating the config, proceed with either **vLLM** or **SGLang** for servin
 To run Qwen with 1M context support:
 ```bash
-git clone https://github.com/vllm-project/vllm.git
-cd vllm
-pip install -e .
+pip install -U vllm \
+    --torch-backend=auto \
+    --extra-index-url https://wheels.vllm.ai/nightly
 ```
 Then launch the server with Dual Chunk Flash Attention enabled: