Unable to load model with .NET (using latest version of runtime)

#6
by fwaris - opened

<PackageReference Include="Microsoft.ML.OnnxRuntimeGenAI.Cuda" Version="0.9.0" />

Issue with 'input size 12'

Microsoft.ML.OnnxRuntimeGenAI.OnnxRuntimeGenAIException: 'Load model from E:\s\models\gpt-oss-20b-onnx\cuda\cuda-int4-kquant-block-32-mixed\model.onnx failed:This is an invalid model. In Node, ("/model/layers.0/attn/GroupQueryAttention", GroupQueryAttention, "com.microsoft", -1) : ("/model/layers.0/attn/qkv_proj/Add/output_0": tensor(float16),"","","past_key_values.0.key": tensor(float16),"past_key_values.0.value": tensor(float16),"/model/attn_mask_reformat/attn_mask_subgraph/Sub/Cast/output_0": tensor(int32),"/model/attn_mask_reformat/attn_mask_subgraph/Gather/Cast/output_0": tensor(int32),"cos_cache": tensor(float16),"sin_cache": tensor(float16),"","","model.layers.0.attn.sinks": tensor(float16),) -> ("/model/layers.0/attn/GroupQueryAttention/output_0": tensor(float16),"present.0.key": tensor(float16),"present.0.value": tensor(float16),) , Error Node(/model/layers.0/attn/GroupQueryAttention) with schema(com.microsoft::GroupQueryAttention:1) has input size 12 not in range [min=7, max=11].'
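For context, the exception is thrown at model load time. A minimal sketch of the loading code, assuming the standard Microsoft.ML.OnnxRuntimeGenAI C# API (not the exact program used):

```csharp
using Microsoft.ML.OnnxRuntimeGenAI;

// Folder containing model.onnx and genai_config.json (path taken from the error above).
var modelPath = @"E:\s\models\gpt-oss-20b-onnx\cuda\cuda-int4-kquant-block-32-mixed";

// The OnnxRuntimeGenAIException above is raised here, while the ONNX graph is validated.
using var model = new Model(modelPath);
using var tokenizer = new Tokenizer(model);
```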

OK, I see from a previous discussion that I need to get the nightly build. Let me figure that out.

Still seeing the same issue with the nightly version from:
https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/nuget/v3/index.json

    <PackageReference Include="Microsoft.ML.OnnxRuntime.Gpu" Version="1.23.0-dev-20250429-1449-a9a3ad2e0c" />	  
    <PackageReference Include="Microsoft.ML.OnnxRuntimeGenAI.Cuda" Version="0.9.0" />

The date on the package shows as 'Tuesday, April 29, 2025 (4/29/2025)'. Maybe the build pipeline is broken?

Any resolution on this issue?
I have converted the openai/gpt-oss-20b model with the builder.py script from the 0.9.1 onnxruntime-genai repo and still get the error " Error Node(/model/layers.0/attn/GroupQueryAttention) with schema(com.microsoft::GroupQueryAttention:1) has input size 12 not in range [min=7, max=11]."
But now it comes with a hint that doesn't apply: "Hint: Remove or correct 'custom_ops_library' entry in the model's config.json if the referenced DLL (e.g. onnxruntime_vitis_ai_custom_ops.dll) is not present or needed for your selected execution provider.
You are using provider: cuda. VitisAI custom ops are optional unless targeting VitisAI hardware."
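For reference, the conversion ran roughly along these lines (a sketch; the flags are assumed from the onnxruntime-genai model builder's documented options, not the exact command used):

```bash
# Assumed invocation of builder.py from the onnxruntime-genai repo; flags may differ by version.
python builder.py \
    -m openai/gpt-oss-20b \
    -o ./gpt-oss-20b-onnx/cuda \
    -p int4 \
    -e cuda
```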

Running with csproj packages:
PackageReference Include="Microsoft.ML.OnnxRuntime.Gpu" Version="1.23.0-dev-20250429-1449-a9a3ad2e0c"
PackageReference Include="Microsoft.ML.OnnxRuntimeGenAI.Cuda" Version="0.9.1"

ONNX Runtime org

Your ONNX Runtime version does not contain the latest changes. See here for more information.

kvaishnavi changed discussion status to closed
