Unable to load model with .NET (using latest version of runtime)

#6
by fwaris - opened

<PackageReference Include="Microsoft.ML.OnnxRuntimeGenAI.Cuda" Version="0.9.0" />

Issue with 'input size 12'

Microsoft.ML.OnnxRuntimeGenAI.OnnxRuntimeGenAIException: 'Load model from E:\s\models\gpt-oss-20b-onnx\cuda\cuda-int4-kquant-block-32-mixed\model.onnx failed:This is an invalid model. In Node, ("/model/layers.0/attn/GroupQueryAttention", GroupQueryAttention, "com.microsoft", -1) : ("/model/layers.0/attn/qkv_proj/Add/output_0": tensor(float16),"","","past_key_values.0.key": tensor(float16),"past_key_values.0.value": tensor(float16),"/model/attn_mask_reformat/attn_mask_subgraph/Sub/Cast/output_0": tensor(int32),"/model/attn_mask_reformat/attn_mask_subgraph/Gather/Cast/output_0": tensor(int32),"cos_cache": tensor(float16),"sin_cache": tensor(float16),"","","model.layers.0.attn.sinks": tensor(float16),) -> ("/model/layers.0/attn/GroupQueryAttention/output_0": tensor(float16),"present.0.key": tensor(float16),"present.0.value": tensor(float16),) , Error Node(/model/layers.0/attn/GroupQueryAttention) with schema(com.microsoft::GroupQueryAttention:1) has input size 12 not in range [min=7, max=11].'
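For context, the exception is thrown at model load time. A minimal sketch of the loading code, assuming the standard Microsoft.ML.OnnxRuntimeGenAI C# API (not the exact program used):

```csharp
using Microsoft.ML.OnnxRuntimeGenAI;

// Folder containing model.onnx and genai_config.json (path taken from the error above).
var modelPath = @"E:\s\models\gpt-oss-20b-onnx\cuda\cuda-int4-kquant-block-32-mixed";

// The OnnxRuntimeGenAIException above is raised here, while the ONNX graph is validated.
using var model = new Model(modelPath);
using var tokenizer = new Tokenizer(model);
```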

OK, I see from a previous discussion that I need to get the nightly build. Let me figure that out.

Still seeing the same issue with the nightly version from:
https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/nuget/v3/index.json

    <PackageReference Include="Microsoft.ML.OnnxRuntime.Gpu" Version="1.23.0-dev-20250429-1449-a9a3ad2e0c" />	  
    <PackageReference Include="Microsoft.ML.OnnxRuntimeGenAI.Cuda" Version="0.9.0" />

The date on the package shows as 'Tuesday, April 29, 2025 (4/29/2025)'. Maybe the build pipeline is broken?

Any resolution on this issue?
I have converted the openai/gpt-oss-20b model with the builder.py script from the 0.9.1 onnxruntime-genai repo and still get the error " Error Node(/model/layers.0/attn/GroupQueryAttention) with schema(com.microsoft::GroupQueryAttention:1) has input size 12 not in range [min=7, max=11]."
But now it comes with a hint that doesn't apply: "Hint: Remove or correct 'custom_ops_library' entry in the model's config.json if the referenced DLL (e.g. onnxruntime_vitis_ai_custom_ops.dll) is not present or needed for your selected execution provider.
You are using provider: cuda. VitisAI custom ops are optional unless targeting VitisAI hardware."
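For reference, the conversion ran roughly along these lines (a sketch; the flags are assumed from the onnxruntime-genai model builder's documented options, not the exact command used):

```bash
# Assumed invocation of builder.py from the onnxruntime-genai repo; flags may differ by version.
python builder.py \
    -m openai/gpt-oss-20b \
    -o ./gpt-oss-20b-onnx/cuda \
    -p int4 \
    -e cuda
```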

Running with csproj packages:
PackageReference Include="Microsoft.ML.OnnxRuntime.Gpu" Version="1.23.0-dev-20250429-1449-a9a3ad2e0c"
PackageReference Include="Microsoft.ML.OnnxRuntimeGenAI.Cuda" Version="0.9.1"

ONNX Runtime org

Your ONNX Runtime version does not contain the latest changes. See here for more information.

kvaishnavi changed discussion status to closed
