A collection of evaluation benchmarks for the Italian language.
Simone Conia
s-conia
AI & ML interests
Natural Language Processing, Multilinguality, Knowledge Graphs, Semantics, Large Language Models
Recent Activity
liked a dataset 12 days ago
BSC-LT/multi_lmentry
authored a paper 4 months ago
Right Answer, Wrong Score: Uncovering the Inconsistencies of LLM Evaluation in Multiple-Choice Question Answering