Error when running with TensorRT-LLM
#1, opened by k-l-lambda
[08/04/2025-03:48:22] [TRT-LLM] [E] Failed to initialize executor on rank 0: w1_weight_scale_2 != w3_weight_scale_2
[08/04/2025-03:48:22] [TRT-LLM] [E] Traceback (most recent call last):
  File "/code/tensorrt_llm/tensorrt_llm/executor/worker.py", line 700, in worker_main
    worker: ExecutorBindingsWorker = worker_cls(
                                     ^^^^^^^^^^^
  File "/code/tensorrt_llm/tensorrt_llm/executor/worker.py", line 128, in __init__
    self.engine = _create_engine()
                  ^^^^^^^^^^^^^^^^
  File "/code/tensorrt_llm/tensorrt_llm/executor/worker.py", line 126, in _create_engine
    return create_executor(**args)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/code/tensorrt_llm/tensorrt_llm/_torch/pyexecutor/py_executor_creator.py", line 73, in create_py_executor
    model_engine = PyTorchModelEngine(
                   ^^^^^^^^^^^^^^^^^^^
  File "/code/tensorrt_llm/tensorrt_llm/_torch/pyexecutor/model_engine.py", line 318, in __init__
    self.model = self._load_model(
                 ^^^^^^^^^^^^^^^^^
  File "/code/tensorrt_llm/tensorrt_llm/_torch/pyexecutor/model_engine.py", line 974, in _load_model
    model.load_weights(weights)
  File "/code/tensorrt_llm/tensorrt_llm/_torch/models/modeling_qwen3_moe.py", line 326, in load_weights
    module.load_weights(weights=[updated_module_weights])
  File "/code/tensorrt_llm/tensorrt_llm/_torch/modules/fused_moe.py", line 1336, in load_weights
    self._load_nvfp4_scales(weights)
  File "/code/tensorrt_llm/tensorrt_llm/_torch/modules/fused_moe.py", line 1660, in _load_nvfp4_scales
    load_expert_fc31_alpha_nvfp4(w1_weight_scale_2, w3_weight_scale_2,
  File "/code/tensorrt_llm/tensorrt_llm/_torch/modules/fused_moe.py", line 1613, in load_expert_fc31_alpha_nvfp4
    assert torch.allclose(
           ^^^^^^^^^^^^^^^
AssertionError: w1_weight_scale_2 != w3_weight_scale_2
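Judging from the traceback, the loader seems to require each expert's w1_weight_scale_2 and w3_weight_scale_2 to match within torch.allclose tolerance. Below is a minimal sketch of that comparison in plain Python, which could help find which experts in the checkpoint trip the assertion. Everything in it is an assumption for illustration: the key suffixes (I am guessing w1 maps to gate_proj and w3 to up_proj in the checkpoint), the helper name find_mismatched_scales, and the tolerances (torch.allclose defaults, which may not be what fused_moe.py actually passes).

```python
# Hypothetical checkpoint key suffixes; adjust to the real tensor names.
W1_SUFFIX = "gate_proj.weight_scale_2"  # assumed to correspond to w1
W3_SUFFIX = "up_proj.weight_scale_2"    # assumed to correspond to w3


def find_mismatched_scales(scales, rtol=1e-5, atol=1e-8):
    """Return (w1_key, w3_key, w1_value, w3_value) for every pair that differs.

    `scales` maps tensor names to scalar scale values (plain floats).
    The comparison uses the same formula as torch.allclose:
    |a - b| <= atol + rtol * |b|.
    """
    mismatches = []
    for w1_key, w1_val in sorted(scales.items()):
        if not w1_key.endswith(W1_SUFFIX):
            continue
        # Derive the matching w3 key for the same expert.
        w3_key = w1_key[: -len(W1_SUFFIX)] + W3_SUFFIX
        if w3_key not in scales:
            continue
        w3_val = scales[w3_key]
        if abs(w1_val - w3_val) > atol + rtol * abs(w3_val):
            mismatches.append((w1_key, w3_key, w1_val, w3_val))
    return mismatches


if __name__ == "__main__":
    # Made-up values, only to show the shape of the input.
    example = {
        "layers.0.mlp.experts.0.gate_proj.weight_scale_2": 0.5,
        "layers.0.mlp.experts.0.up_proj.weight_scale_2": 0.5,
        "layers.0.mlp.experts.1.gate_proj.weight_scale_2": 0.5,
        "layers.0.mlp.experts.1.up_proj.weight_scale_2": 0.25,
    }
    for w1_key, w3_key, w1_val, w3_val in find_mismatched_scales(example):
        print(f"{w1_key} = {w1_val} vs {w3_key} = {w3_val}")
```

The dict would have to be filled from the actual checkpoint files. If any pair shows up here, the checkpoint was presumably quantized with separate global scales for the two projections, which the fused w3/w1 path does not appear to accept.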
Any ideas?