Update README.md
README.md
CHANGED
@@ -38,7 +38,7 @@ This is the SFT + DPO version of Mixtral Hermes 2, we will also be providing an
 - GPT4All
 - AGIEval
 - BigBench
--
+- Comparison to Mixtral-Instruct
 3. [Prompt Format](#prompt-format)
 4. [Inference Example Code](#inference-code)
 5. [Quantized Models](#quantized-models)
@@ -131,14 +131,6 @@ BigBench:
 ```
 Average: 49.70
 
-TruthfulQA:
-```
-| Task |Version|Metric|Value | |Stderr|
-|-------------|------:|------|-----:|---|-----:|
-|truthfulqa_mc| 1|mc1 |0.4162|± |0.0173|
-| | |mc2 |0.5783|± |0.0151|
-```
-
 ## GPT4All
 
 
@@ -148,9 +140,11 @@ TruthfulQA:
 ## BigBench Reasoning Test
 
 
-##
+## Comparison to Mixtral Instruct:
 
+Our benchmarks show gains in many benchmarks against Mixtral Instruct v0.1, on average, beating the flagship Mixtral model.
 
+
 
 # Prompt Format
 