Request for reproducible evaluation details for claimed ParseBench result

#1
by boyang-runllama - opened

Hi, thanks for sharing your ParseBench result.

From the model card, it looks like the main information provided is the claimed score on our benchmark, but there are not enough details for us to verify or reproduce the result.

Could you please provide the evaluation details, including:

  • The exact model used
  • Evaluation config and prompts

Without these details, we cannot validate the claimed result or compare it fairly with other submissions on the leaderboard.

Thanks!

openinnovation org
edited 5 days ago

Hi @boyang-runllama, thanks for the note.

oi-OCR is a closed-source solution (it's part of the "OI Doc Intel" platform), so there's no public sign-up page where you could grab an API key on your own. We're working on a PR to run-llama/ParseBench to register an oi_ocr provider so your team can rerun the pipeline end-to-end.

We'd be happy to provision API access for the LlamaIndex team to validate the scores. What's the best way to share credentials with you? Email, signed link, whatever works on your side.

Hey @Thiago-cs! Thanks for the submission and congrats on the great results!
However, we only host open source models on leaderboards. Would it be possible to either remove the result files or publish the weights?

openinnovation org

Hi @SaylorTwift, we'll remove the result files from HF.

We'd still like to land on the canonical leaderboard.csv via the closed-source route: opening a PR to run-llama/ParseBench that registers an oi_ocr provider so your team can rerun end-to-end on your harness.

Could you confirm that's still the right route on your side before we invest the effort?

Thanks for confirming.

For our leaderboard:

The model or API needs to be publicly accessible, either via open weights or a self-serve API that any user can sign up for.

If the API registration becomes available to general users, we'd be happy to include the result on the leaderboard.
