Update README.md
README.md CHANGED
@@ -224,9 +224,9 @@ After updating the config, proceed with either **vLLM** or **SGLang** for servin
 To run Qwen with 1M context support:

 ```bash
-
-
-
+pip install -U vllm \
+--torch-backend=auto \
+--extra-index-url https://wheels.vllm.ai/nightly
 ```

 Then launch the server with Dual Chunk Flash Attention enabled: