Update pipeline tag to text-ranking and add descriptive tags (#3)
- Update pipeline tag to text-ranking and add descriptive tags (b38f9139057875a729ef5f4606a83fa74fce66ad)
Co-authored-by: Niels Rogge <[email protected]>
README.md CHANGED

@@ -1,8 +1,14 @@
 ---
-license: apache-2.0
-pipeline_tag: text-generation
 library_name: transformers
+license: apache-2.0
+pipeline_tag: text-ranking
 paper: 2507.09104
+language: en
+tags:
+- judge-model
+- evaluation
+- reward-modeling
+- text-ranking
 ---
 
 # CompassJudger-2

@@ -113,7 +119,7 @@ CompassJudger-2 sets a new state-of-the-art for judge models, outperforming gene
 
 | Model | JudgerBench V2 | JudgeBench | RMB | RewardBench | Average |
 | :--------------------------------- | :------------: | :--------: | :-------: | :---------: | :-------: |
-| **7B Judge Models** | | | | |
+| **7B Judge Models** | | | | | |
 | CompassJudger-1-7B-Instruct | 57.96 | 46.00 | 38.18 | 80.74 | 55.72 |
 | Con-J-7B-Instruct | 52.35 | 38.06 | 71.50 | 87.10 | 62.25 |
 | RISE-Judge-Qwen2.5-7B | 46.12 | 40.48 | 72.64 | 88.20 | 61.61 |

@@ -129,7 +135,7 @@ CompassJudger-2 sets a new state-of-the-art for judge models, outperforming gene
 | Qwen3-235B-A22B | 61.40 | 65.97 | 75.59 | 84.68 | 71.91 |
 
 
-For detailed benchmark performance and methodology, please refer to our
+For detailed benchmark performance and methodology, please refer to our 📑 [Paper](https://arxiv.org/abs/2507.09104).
 
 ## License
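For reference, the README's YAML front matter after this commit, assembled from the added and kept lines of the first hunk (no new fields beyond what the diff introduces), reads:

```yaml
---
library_name: transformers
license: apache-2.0
pipeline_tag: text-ranking
paper: 2507.09104
language: en
tags:
- judge-model
- evaluation
- reward-modeling
- text-ranking
---
```

The `pipeline_tag` change from `text-generation` to `text-ranking` is what moves the model under the text-ranking task filter on the Hub; the `tags` entries only affect search and discovery.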