swj0419 committed on
Commit ba4450e · 1 Parent(s): e6dca5e
Files changed (1)
  1. README.md +14 -13
README.md CHANGED
@@ -46,20 +46,21 @@ print(tokenizer.decode(out[0]))
  ```
  
  # Evaluation Snapshot
- | Model | **MC9** | **Gen5** | **MMLU** | **MMLU Pro** | **AGIEval** | **BBH** | **Math2** | **NewsG** | **PoemG** | **SciRIFF5** | **Code4** | **Avg.** |
- |-------|---------|----------|----------|--------------|-------------|---------|-----------|-----------|-----------|--------------|-----------|----------|
- | Prev. Public model | 68.7 | 58.8 | 55.9 | 26.2 | 39.9 | 35.7 | 8.2 | 76.0 | 47.8 | 48.1 | 1.1 | **42.4** |
+ | **Model** | **MC9** | **Gen5** | **MMLU** | **MMLU Pro** | **AGIEval** | **BBH** | **Math2** | **NewsG** | **PoemG** | **SciRIFF5** | **Code4** | **Avg.** |
+ |----------|--------|----------|----------|--------------|-------------|---------|-----------|-----------|-----------|--------------|-----------|----------|
+ | Prev. Public model | 68.7 | 58.8 | 55.9 | 26.2 | 39.9 | 35.7 | 8.2 | 76.0 | 47.8 | 48.1 | 1.1 | 42.4 |
  | **Individual** |
- | [Math](https://huggingface.co/allenai/Flex-math-2x7B-1T) | 62.5 | 44.3 | 50.6 | 24.1 | 42.0 | 45.6 | **53.1** | 42.6 | 28.0 | 50.7 | 15.8 | **41.8** |
- | [Code](https://huggingface.co/allenai/Flex-code-2x7B-1T) | 40.5 | 39.4 | 29.5 | 14.5 | 27.4 | 38.1 | 6.0 | 45.1 | 28.2 | 48.0 | 21.0 | **30.7** |
- | Textbook | 64.3 | 52.1 | 56.5 | 27.0 | 39.7 | 40.3 | 13.6 | 57.6 | 51.8 | 51.7 | 3.0 | **41.6** |
- | [News](https://huggingface.co/allenai/Flex-news-2x7B-1T) | 46.5 | 48.6 | 36.4 | 15.2 | 25.7 | 30.9 | 2.5 | 77.7 | 26.9 | 47.0 | 0.0 | **32.5** |
- | [Creative Writing](https://huggingface.co/allenai/Flex-creative-2x7B-1T) | 42.7 | 43.9 | 31.5 | 11.6 | 23.3 | 27.6 | 1.7 | 56.9 | **67.5** | 42.4 | 0.0 | **31.7** |
- | [Academic](https://huggingface.co/allenai/Flex-pes2o-2x7B-1T) | 41.0 | 45.2 | 33.8 | 14.8 | 24.1 | 32.4 | 6.5 | 51.8 | 23.0 | 52.0 | 0.0 | **29.5** |
- | [Reddit](https://huggingface.co/allenai/Flex-reddit-2x7B-1T) | 64.7 | 36.5 | 56.1 | 25.5 | 35.5 | 19.7 | 2.5 | 54.1 | 8.6 | 32.7 | 1.7 | **30.7** |
- | **Combined** |
- | BTM (top-2) | 68.7 | 57.7 | 59.4 | 28.3 | 43.2 | 44.3 | 23.1 | 73.6 | 54.4 | 46.3 | **24.0** | **47.6** |
- | 🔥 **FlexOlmo-7x7B-1T** | 70.4 | 60.1 | 60.2 | 30.5 | 47.3 | 47.9 | 79.6 | 66.3 | 60.1 | **53.9** | 14.6 | **53.7** |
+ | [Math](https://huggingface.co/allenai/Flex-math-2x7B-1T) | 62.5 | 44.3 | 50.6 | 24.1 | 42.0 | 45.6 | **53.1** | 42.6 | 28.0 | 50.7 | 15.8 | 41.8 |
+ | [Code](https://huggingface.co/allenai/Flex-code-2x7B-1T) | 40.5 | 39.4 | 29.5 | 14.5 | 27.4 | 38.1 | 6.0 | 45.1 | 28.2 | 48.0 | 21.0 | 30.7 |
+ | Textbook | 64.3 | 52.1 | 56.5 | 27.0 | 39.7 | 40.3 | 13.6 | 57.6 | 51.8 | 51.7 | 3.0 | 41.6 |
+ | [News](https://huggingface.co/allenai/Flex-news-2x7B-1T) | 46.5 | 48.6 | 36.4 | 15.2 | 25.7 | 30.9 | 2.5 | 77.7 | 26.9 | 47.0 | 0.0 | 32.5 |
+ | [Creative Writing](https://huggingface.co/allenai/Flex-creative-2x7B-1T) | 42.7 | 43.9 | 31.5 | 11.6 | 23.3 | 27.6 | 1.7 | 56.9 | **67.5** | 42.4 | 0.0 | 31.7 |
+ | [Academic](https://huggingface.co/allenai/Flex-pes2o-2x7B-1T) | 41.0 | 45.2 | 33.8 | 14.8 | 24.1 | 32.4 | 6.5 | 51.8 | 23.0 | 52.0 | 0.0 | 29.5 |
+ | [Reddit](https://huggingface.co/allenai/Flex-reddit-2x7B-1T) | 64.7 | 36.5 | 56.1 | 25.5 | 35.5 | 19.7 | 2.5 | 54.1 | 8.6 | 32.7 | 1.7 | 30.7 |
+ | **Combined** |
+ | BTM (top-2) | 68.7 | 57.7 | 59.4 | 28.3 | 43.2 | 44.3 | 23.1 | 73.6 | 54.4 | 46.3 | **24.0** | 47.6 |
+ | 🔥 **FlexOlmo-7x7B-1T** | **70.4** | **60.1** | **60.2** | **30.5** | 44.8 | 46.8 | 47.9 | 77.2 | 66.2 | 53.8 | 14.6 | 52.0 |
+ | **FlexOlmo-7x7B-1T-RT** | 70.3 | 60.0 | **60.2** | 30.3 | **45.2** | **47.2** | 47.7 | **78.3** | **67.6** | **53.9** | 12.6 | **52.1** |
  
  
  # Citation
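
The **Avg.** values on the new side of this diff appear consistent with an unweighted mean of the eleven benchmark columns (MC9 through Code4). That is an assumption inferred from the numbers, since the README does not state how the average is computed. A minimal sketch for rechecking a few rows follows. The names `ROWS`, `unweighted_mean`, and `check_rows` are hypothetical, and the 0.1 tolerance is an assumption meant to absorb rounding of the published one-decimal scores.

```python
# Scores copied from the "+" rows of the diff, in column order:
# MC9, Gen5, MMLU, MMLU Pro, AGIEval, BBH, Math2, NewsG, PoemG, SciRIFF5, Code4.
# Each entry maps a model name to (benchmark scores, reported Avg.).
ROWS = {
    "Prev. Public model": (
        [68.7, 58.8, 55.9, 26.2, 39.9, 35.7, 8.2, 76.0, 47.8, 48.1, 1.1], 42.4),
    "BTM (top-2)": (
        [68.7, 57.7, 59.4, 28.3, 43.2, 44.3, 23.1, 73.6, 54.4, 46.3, 24.0], 47.6),
    "FlexOlmo-7x7B-1T": (
        [70.4, 60.1, 60.2, 30.5, 44.8, 46.8, 47.9, 77.2, 66.2, 53.8, 14.6], 52.0),
    "FlexOlmo-7x7B-1T-RT": (
        [70.3, 60.0, 60.2, 30.3, 45.2, 47.2, 47.7, 78.3, 67.6, 53.9, 12.6], 52.1),
}


def unweighted_mean(scores):
    """Plain arithmetic mean; assumed (not confirmed) to be how Avg. is derived."""
    return sum(scores) / len(scores)


def check_rows(rows, tolerance=0.1):
    """Return {model: (recomputed_mean, reported_avg, within_tolerance)}."""
    results = {}
    for model, (scores, reported) in rows.items():
        mean = unweighted_mean(scores)
        results[model] = (mean, reported, abs(mean - reported) <= tolerance)
    return results


if __name__ == "__main__":
    for model, (mean, reported, ok) in check_rows(ROWS).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{model}: recomputed {mean:.2f}, reported {reported} [{status}]")
```

A tolerance is used instead of exact equality because the per-benchmark scores are themselves rounded, so a recomputed mean can differ slightly from the reported one.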