MT-Bench Results
#8 by 0-hero - opened

MT-Bench
| Model | MT-Bench |
|---|---|
| Claude 3 Opus | 9.43 |
| GPT-4-1106-Preview | 9.32 |
| Claude 3 Sonnet | 9.18 |
| WizardLM-2 8x22B | 9.12 |
| GPT-4-0314 | 8.96 |
| Mixtral-8x22B-Instruct-v0.1 | 8.66 |
| zephyr-orpo-141b-A35b-v0.1 | 8.17 |
| Matter-0.2-8x22B | 8.00 |
Nice!
It will be interesting to see more benchmark results here.
I guess Mixtral-8x22B-Instruct-v0.1 is better at multilinguality than WizardLM-2 8x22B.
Maybe merging them could work even better :)