OpenEvals

community

AI & ML interests

LLM evaluation

Recent Activity

SaylorTwift updated a Space 7 days ago

OpenEvals/open_benchmark_index

clefourrier updated a Space 7 days ago

OpenEvals/InferenceProviderTesting

SaylorTwift updated a Space 8 days ago

OpenEvals/evals

Articles

Gaia2 and ARE: Empowering the community to study agents

OpenEvals's Spaces (6)

Find a leaderboard

Explore and discover all leaderboards from the HF community

Benchmark Finder

A space to view and inspect all the tasks in lighteval

InferenceProviderTestingBackend

Launch and monitor model evaluation jobs

Evals

Run your LLM evaluations on the hub

Generate a command to run model evaluations
