arxiv:2512.11150
Eddie Landesberg
elandy
Recent Activity
updated a dataset about 1 month ago: elandy/cje-chatbot-arena
published a dataset about 1 month ago: elandy/cje-chatbot-arena
submitted a paper about 1 month ago: Causal Judge Evaluation: Calibrated Surrogate Metrics for LLM Systems