Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators Paper • 2403.16950 • Published Mar 25, 2024 • 4
TopViewRS: Vision-Language Models as Top-View Spatial Reasoners Paper • 2406.02537 • Published Jun 4, 2024
Fairer Preferences Elicit Improved Human-Aligned Large Language Model Judgments Paper • 2406.11370 • Published Jun 17, 2024
From Few to Many: Self-Improving Many-Shot Reasoners Through Iterative Optimization and Generation Paper • 2502.00330 • Published Feb 1
Multi-Agent Design: Optimizing Agents with Better Prompts and Topologies Paper • 2502.02533 • Published Feb 4 • 3
Linguini: A benchmark for language-agnostic linguistic reasoning Paper • 2409.12126 • Published Sep 18, 2024
LCFO: Long Context and Long Form Output Dataset and Benchmarking Paper • 2412.08268 • Published Dec 11, 2024
Large Concept Models: Language Modeling in a Sentence Representation Space Paper • 2412.08821 • Published Dec 11, 2024 • 17
BOUQuET: dataset, Benchmark and Open initiative for Universal Quality Evaluation in Translation Paper • 2502.04314 • Published Feb 6
Hello everyone, I am pleased to announce that I have founded the University of Glasgow organization on Hugging Face. If you are affiliated with the University of Glasgow or have a relative who is, you can log in through the relevant link. UniversityofGlasgow
AutoPEFT: Automatic Configuration Search for Parameter-Efficient Fine-Tuning Paper • 2301.12132 • Published Jan 28, 2023 • 1
Batch Calibration: Rethinking Calibration for In-Context Learning and Prompt Engineering Paper • 2309.17249 • Published Sep 29, 2023
Survival of the Most Influential Prompts: Efficient Black-Box Prompt Search via Clustering and Pruning Paper • 2310.12774 • Published Oct 19, 2023
Multi3WOZ: A Multilingual, Multi-Domain, Multi-Parallel Dataset for Training and Evaluating Culturally Adapted Task-Oriented Dialog Systems Paper • 2307.14031 • Published Jul 26, 2023
XQA-DST: Multi-Domain and Multi-Lingual Dialogue State Tracking Paper • 2204.05895 • Published Apr 12, 2022
A Systematic Study of Performance Disparities in Multilingual Task-Oriented Dialogue Systems Paper • 2310.12892 • Published Oct 19, 2023
On Task Performance and Model Calibration with Supervised and Self-Ensembled In-Context Learning Paper • 2312.13772 • Published Dec 21, 2023