Unable to load the model from .NET (using the latest version of the runtime)
 <PackageReference Include="Microsoft.ML.OnnxRuntimeGenAI.Cuda" Version="0.9.0" />
Loading fails with an 'input size 12' error:
Microsoft.ML.OnnxRuntimeGenAI.OnnxRuntimeGenAIException: 'Load model from E:\s\models\gpt-oss-20b-onnx\cuda\cuda-int4-kquant-block-32-mixed\model.onnx failed:This is an invalid model. In Node, ("/model/layers.0/attn/GroupQueryAttention", GroupQueryAttention, "com.microsoft", -1) : ("/model/layers.0/attn/qkv_proj/Add/output_0": tensor(float16),"","","past_key_values.0.key": tensor(float16),"past_key_values.0.value": tensor(float16),"/model/attn_mask_reformat/attn_mask_subgraph/Sub/Cast/output_0": tensor(int32),"/model/attn_mask_reformat/attn_mask_subgraph/Gather/Cast/output_0": tensor(int32),"cos_cache": tensor(float16),"sin_cache": tensor(float16),"","","model.layers.0.attn.sinks": tensor(float16),) -> ("/model/layers.0/attn/GroupQueryAttention/output_0": tensor(float16),"present.0.key": tensor(float16),"present.0.value": tensor(float16),) , Error Node(/model/layers.0/attn/GroupQueryAttention) with schema(com.microsoft::GroupQueryAttention:1) has input size 12 not in range [min=7, max=11].'
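For anyone reading the message above: the part that matters is the final sentence, which says the node has 12 inputs while the GroupQueryAttention schema known to the loaded runtime accepts 7 to 11. Below is a minimal Python sketch (standard library only) that pulls those numbers out of such a message. The helper name `parse_input_size_error` is hypothetical and not part of ONNX Runtime; the pattern is based only on the wording of this one error.

```python
import re

# Hypothetical helper, not an ONNX Runtime API. The pattern mirrors the wording
# of the error quoted in this thread and may not match other runtime versions.
_PATTERN = re.compile(
    r"Error Node\((?P<node>[^)]+)\) with schema\((?P<schema>[^)]+)\) "
    r"has input size (?P<size>\d+) not in range "
    r"\[min=(?P<min>\d+), max=(?P<max>\d+)\]"
)


def parse_input_size_error(message: str):
    """Return node, schema, input count and allowed range, or None if no match."""
    match = _PATTERN.search(message)
    if match is None:
        return None
    return {
        "node": match.group("node"),
        "schema": match.group("schema"),
        "size": int(match.group("size")),
        "min": int(match.group("min")),
        "max": int(match.group("max")),
    }


if __name__ == "__main__":
    msg = (
        "Error Node(/model/layers.0/attn/GroupQueryAttention) with "
        "schema(com.microsoft::GroupQueryAttention:1) has input size 12 "
        "not in range [min=7, max=11]."
    )
    info = parse_input_size_error(msg)
    # Number of inputs beyond what the loaded schema accepts.
    print(info["node"], info["size"] - info["max"])
```

Counting the inputs listed in the message, the twelfth is `model.layers.0.attn.sinks`. That suggests (an inference from the message, not something confirmed in this thread) that the model was exported against a newer GroupQueryAttention schema than the one in the runtime being loaded.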
OK, I see from a previous discussion that I need to get the nightly build. Let me figure that out.
I am still seeing the same issue with the nightly version from:
https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/nuget/v3/index.json
    <PackageReference Include="Microsoft.ML.OnnxRuntime.Gpu" Version="1.23.0-dev-20250429-1449-a9a3ad2e0c" />
    <PackageReference Include="Microsoft.ML.OnnxRuntimeGenAI.Cuda" Version="0.9.0" />
The date on the package shows as 'Tuesday, April 29, 2025 (4/29/2025)'. Maybe the build pipeline is broken?
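The date shown on the package matches the `20250429` segment of the version string. Assuming the nightly feed names builds `<version>-dev-<YYYYMMDD>-<HHMM>-<commit>` (inferred from this single version string, not from documentation), a small Python sketch can read the build date directly from a pinned version, which makes it easy to see how old the resolved package is. The function name `nightly_build_date` is hypothetical.

```python
import re
from datetime import date, datetime


def nightly_build_date(version: str):
    """Return the date embedded in a '-dev-YYYYMMDD-HHMM-<hash>' suffix, or None.

    The suffix layout is an assumption based on the one version string
    quoted in this thread.
    """
    match = re.search(r"-dev-(\d{8})-\d{4}-[0-9a-f]+$", version)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y%m%d").date()


if __name__ == "__main__":
    built = nightly_build_date("1.23.0-dev-20250429-1449-a9a3ad2e0c")
    print(built)  # 2025-04-29
    # Release versions such as "0.9.1" carry no build date.
    print(nightly_build_date("0.9.1"))  # None
```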
Any resolution on this issue?
I have converted the openai/gpt-oss-20b model with the builder.py script from the 0.9.1 onnxruntime-genai repo and still get the error "Error Node(/model/layers.0/attn/GroupQueryAttention) with schema(com.microsoft::GroupQueryAttention:1) has input size 12 not in range [min=7, max=11]."
The error now also includes a hint that doesn't apply: "Hint: Remove or correct 'custom_ops_library' entry in the model's config.json if the referenced DLL (e.g. onnxruntime_vitis_ai_custom_ops.dll) is not present or needed for your selected execution provider.
You are using provider: cuda. VitisAI custom ops are optional unless targeting VitisAI hardware."
Running with these csproj packages:
    <PackageReference Include="Microsoft.ML.OnnxRuntime.Gpu" Version="1.23.0-dev-20250429-1449-a9a3ad2e0c" />
    <PackageReference Include="Microsoft.ML.OnnxRuntimeGenAI.Cuda" Version="0.9.1" />

