rahulseetharaman committed (verified)
Commit b8f5f4a · 1 Parent(s): 70bc068

Add new CrossEncoder model
README.md ADDED
@@ -0,0 +1,505 @@
---
language:
- en
tags:
- sentence-transformers
- cross-encoder
- reranker
- generated_from_trainer
- dataset_size:78704
- loss:ListNetLoss
base_model: bansalaman18/bert-uncased_L-10_H-512_A-8
datasets:
- microsoft/ms_marco
pipeline_tag: text-ranking
library_name: sentence-transformers
metrics:
- map
- mrr@10
- ndcg@10
model-index:
- name: CrossEncoder based on bansalaman18/bert-uncased_L-10_H-512_A-8
  results:
  - task:
      type: cross-encoder-reranking
      name: Cross Encoder Reranking
    dataset:
      name: NanoMSMARCO R100
      type: NanoMSMARCO_R100
    metrics:
    - type: map
      value: 0.077
      name: Map
    - type: mrr@10
      value: 0.05
      name: Mrr@10
    - type: ndcg@10
      value: 0.0612
      name: Ndcg@10
  - task:
      type: cross-encoder-reranking
      name: Cross Encoder Reranking
    dataset:
      name: NanoNFCorpus R100
      type: NanoNFCorpus_R100
    metrics:
    - type: map
      value: 0.2696
      name: Map
    - type: mrr@10
      value: 0.3686
      name: Mrr@10
    - type: ndcg@10
      value: 0.2518
      name: Ndcg@10
  - task:
      type: cross-encoder-reranking
      name: Cross Encoder Reranking
    dataset:
      name: NanoNQ R100
      type: NanoNQ_R100
    metrics:
    - type: map
      value: 0.0734
      name: Map
    - type: mrr@10
      value: 0.056
      name: Mrr@10
    - type: ndcg@10
      value: 0.0866
      name: Ndcg@10
  - task:
      type: cross-encoder-nano-beir
      name: Cross Encoder Nano BEIR
    dataset:
      name: NanoBEIR R100 mean
      type: NanoBEIR_R100_mean
    metrics:
    - type: map
      value: 0.14
      name: Map
    - type: mrr@10
      value: 0.1582
      name: Mrr@10
    - type: ndcg@10
      value: 0.1332
      name: Ndcg@10
---

# CrossEncoder based on bansalaman18/bert-uncased_L-10_H-512_A-8

This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model finetuned from [bansalaman18/bert-uncased_L-10_H-512_A-8](https://huggingface.co/bansalaman18/bert-uncased_L-10_H-512_A-8) on the [ms_marco](https://huggingface.co/datasets/microsoft/ms_marco) dataset using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.

## Model Details

### Model Description
- **Model Type:** Cross Encoder
- **Base model:** [bansalaman18/bert-uncased_L-10_H-512_A-8](https://huggingface.co/bansalaman18/bert-uncased_L-10_H-512_A-8) <!-- at revision bcd427de8f981c76647be3705e2e00da7be40dbe -->
- **Maximum Sequence Length:** 512 tokens
- **Number of Output Labels:** 1 label
- **Training Dataset:**
    - [ms_marco](https://huggingface.co/datasets/microsoft/ms_marco)
- **Language:** en
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Documentation:** [Cross Encoder Documentation](https://www.sbert.net/docs/cross_encoder/usage/usage.html)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Cross Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=cross-encoder)

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("rahulseetharaman/reranker-msmarco-v1.1-bert-uncased_L-10_H-512_A-8-listnet")
# Get scores for pairs of texts
pairs = [
    ['largest mammal on earth', "The Blue whale is the largest mammal on earth. average size of the whale is 87 feet long. that's about as big as a passenger jet. it is also the loudest animal on the planet. … And the whale shark is a fish. The blue whale of course being the largest. Blue whale is not only the largest mammal on earth … , but the largest animal of any kind on the earth. Better still, the blue whale, is the largest animal to ever exist"],
    ['largest mammal on earth', 'The Largest Animal Ever: The Blue Whale. The blue whale (Balaenoptera musculus) is a marine mammal belonging to the suborder of baleen whales. At 30 metres (98 ft) in length and 180 metric tons (200 short tons) or more in weight, it is the largest known animal to have ever existed. The African Bush Elephant is the largest living terrestrial (land) animal, with males reaching 6 to 7.5 metres (19.7 to 24.6 ft) in length, 3.3 metres (10.8 ft) in height at the shoulder, and weighing 6 t (13,000 lb).'],
    ['largest mammal on earth', "By considerable measure, the largest known animal on Earth is the blue whale. Mature blue whales can measure anywhere from 75 feet (23 m) to 100 feet (30.5 m) from head to tail, and can weigh as much as 150 tons (136 metric tons). Most people believe that the largest animals to ever exist on Earth were the dinosaurs. However, one of the largest land dinosaurs, the sauropod Argentinosaurus, weighed only about 180,000 pounds (81,647 kg). That's little more than half the size of an adult blue whale"],
    ['largest mammal on earth', 'When a blue whale exhales, the spray from its blowhole can reach nearly 30 ft (9m) into the air. Size relative to a bus: Blue whales are the largest animals ever known to have lived on Earth. These magnificent marine mammals rule the oceans at up to 100 feet (30 meters) long and upwards of 200 tons (181 metric tons). Their tongues alone can weigh as much as an elephant. Their hearts, as much as an automobile. '],
    ['largest mammal on earth', 'The largest mammal on Earth is also the largest animal to ever live on land or in the ocean, dwarfing even the largest Dinosaur, and is the Blue Whale. The blue whale of course being the largest. Blue whale is not only the largest mammal on earth … , but the largest animal of any kind on the earth. Better still, the blue whale, is the largest animal to ever exist'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'largest mammal on earth',
    [
        "The Blue whale is the largest mammal on earth. average size of the whale is 87 feet long. that's about as big as a passenger jet. it is also the loudest animal on the planet. … And the whale shark is a fish. The blue whale of course being the largest. Blue whale is not only the largest mammal on earth … , but the largest animal of any kind on the earth. Better still, the blue whale, is the largest animal to ever exist",
        'The Largest Animal Ever: The Blue Whale. The blue whale (Balaenoptera musculus) is a marine mammal belonging to the suborder of baleen whales. At 30 metres (98 ft) in length and 180 metric tons (200 short tons) or more in weight, it is the largest known animal to have ever existed. The African Bush Elephant is the largest living terrestrial (land) animal, with males reaching 6 to 7.5 metres (19.7 to 24.6 ft) in length, 3.3 metres (10.8 ft) in height at the shoulder, and weighing 6 t (13,000 lb).',
        "By considerable measure, the largest known animal on Earth is the blue whale. Mature blue whales can measure anywhere from 75 feet (23 m) to 100 feet (30.5 m) from head to tail, and can weigh as much as 150 tons (136 metric tons). Most people believe that the largest animals to ever exist on Earth were the dinosaurs. However, one of the largest land dinosaurs, the sauropod Argentinosaurus, weighed only about 180,000 pounds (81,647 kg). That's little more than half the size of an adult blue whale",
        'When a blue whale exhales, the spray from its blowhole can reach nearly 30 ft (9m) into the air. Size relative to a bus: Blue whales are the largest animals ever known to have lived on Earth. These magnificent marine mammals rule the oceans at up to 100 feet (30 meters) long and upwards of 200 tons (181 metric tons). Their tongues alone can weigh as much as an elephant. Their hearts, as much as an automobile. ',
        'The largest mammal on Earth is also the largest animal to ever live on land or in the ocean, dwarfing even the largest Dinosaur, and is the Blue Whale. The blue whale of course being the largest. Blue whale is not only the largest mammal on earth … , but the largest animal of any kind on the earth. Better still, the blue whale, is the largest animal to ever exist',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
```

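In practice a cross-encoder like this is usually paired with a fast first-stage retriever that narrows the corpus to a handful of candidates. The following is a minimal retrieve-then-rerank sketch; the bi-encoder choice (`sentence-transformers/all-MiniLM-L6-v2`) and the toy corpus are illustrative assumptions, not part of this model's setup:

```python
from sentence_transformers import CrossEncoder, SentenceTransformer, util

# Assumed retriever for illustration; any bi-encoder works here.
retriever = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
reranker = CrossEncoder("rahulseetharaman/reranker-msmarco-v1.1-bert-uncased_L-10_H-512_A-8-listnet")

corpus = [
    "The blue whale is the largest mammal on earth.",
    "The African bush elephant is the largest living land animal.",
    "MOLLE uses rows of heavy-duty nylon webbing for attaching pouches.",
]
query = "largest mammal on earth"

# Stage 1: embed the corpus and retrieve top candidates with the bi-encoder.
corpus_emb = retriever.encode(corpus, convert_to_tensor=True)
query_emb = retriever.encode(query, convert_to_tensor=True)
hits = util.semantic_search(query_emb, corpus_emb, top_k=3)[0]

# Stage 2: rescore only the retrieved candidates with the cross-encoder.
candidates = [corpus[hit["corpus_id"]] for hit in hits]
for entry in reranker.rank(query, candidates):
    print(f'{entry["score"]:.4f}', candidates[entry["corpus_id"]])
```

The bi-encoder keeps stage 1 cheap (one embedding per document, computed once), while the cross-encoder spends its full attention budget on each query-candidate pair in stage 2.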
<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Cross Encoder Reranking

* Datasets: `NanoMSMARCO_R100`, `NanoNFCorpus_R100` and `NanoNQ_R100`
* Evaluated with [<code>CrossEncoderRerankingEvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderRerankingEvaluator) with these parameters:
  ```json
  {
      "at_k": 10,
      "always_rerank_positives": true
  }
  ```

| Metric      | NanoMSMARCO_R100     | NanoNFCorpus_R100    | NanoNQ_R100          |
|:------------|:---------------------|:---------------------|:---------------------|
| map         | 0.0770 (-0.4126)     | 0.2696 (+0.0087)     | 0.0734 (-0.3463)     |
| mrr@10      | 0.0500 (-0.4275)     | 0.3686 (-0.1313)     | 0.0560 (-0.3707)     |
| **ndcg@10** | **0.0612 (-0.4792)** | **0.2518 (-0.0732)** | **0.0866 (-0.4141)** |

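To reproduce this style of evaluation on your own data, the evaluator can be constructed directly. A minimal sketch with toy samples; the sample format shown (dicts with `query`, `positive`, and candidate `documents`) is assumed from the evaluator's documented interface, and the actual results above were computed on the NanoBEIR subsets, not this toy data:

```python
from sentence_transformers import CrossEncoder
from sentence_transformers.cross_encoder.evaluation import CrossEncoderRerankingEvaluator

model = CrossEncoder("rahulseetharaman/reranker-msmarco-v1.1-bert-uncased_L-10_H-512_A-8-listnet")

# Toy samples for illustration only.
samples = [
    {
        "query": "largest mammal on earth",
        "positive": ["The blue whale is the largest mammal on earth."],
        "documents": [
            "The blue whale is the largest mammal on earth.",
            "The African bush elephant is the largest living land animal.",
            "MOLLE uses rows of heavy-duty nylon webbing.",
        ],
    },
]

evaluator = CrossEncoderRerankingEvaluator(
    samples=samples,
    at_k=10,
    always_rerank_positives=True,
    name="toy-rerank",
)
results = evaluator(model)
print(results)  # dict of map / mrr@10 / ndcg@10 style metrics
```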
#### Cross Encoder Nano BEIR

* Dataset: `NanoBEIR_R100_mean`
* Evaluated with [<code>CrossEncoderNanoBEIREvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderNanoBEIREvaluator) with these parameters:
  ```json
  {
      "dataset_names": [
          "msmarco",
          "nfcorpus",
          "nq"
      ],
      "rerank_k": 100,
      "at_k": 10,
      "always_rerank_positives": true
  }
  ```

| Metric      | Value                |
|:------------|:---------------------|
| map         | 0.1400 (-0.2501)     |
| mrr@10      | 0.1582 (-0.3098)     |
| **ndcg@10** | **0.1332 (-0.3222)** |

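The aggregate row above is the mean over the three NanoBEIR subsets. A minimal sketch of re-running this evaluation, with the constructor arguments mirroring the JSON parameters listed above (the small Nano datasets are downloaded automatically):

```python
from sentence_transformers import CrossEncoder
from sentence_transformers.cross_encoder.evaluation import CrossEncoderNanoBEIREvaluator

model = CrossEncoder("rahulseetharaman/reranker-msmarco-v1.1-bert-uncased_L-10_H-512_A-8-listnet")

# Parameters copied from the evaluator configuration above.
evaluator = CrossEncoderNanoBEIREvaluator(
    dataset_names=["msmarco", "nfcorpus", "nq"],
    rerank_k=100,
    at_k=10,
    always_rerank_positives=True,
)
results = evaluator(model)
print(results)  # per-dataset and mean map / mrr@10 / ndcg@10
```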
<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### ms_marco

* Dataset: [ms_marco](https://huggingface.co/datasets/microsoft/ms_marco) at [a47ee7a](https://huggingface.co/datasets/microsoft/ms_marco/tree/a47ee7aae8d7d466ba15f9f0bfac3b3681087b3a)
* Size: 78,704 training samples
* Columns: <code>query</code>, <code>docs</code>, and <code>labels</code>
* Approximate statistics based on the first 1000 samples:
  |         | query                                                                                           | docs                                                                                    | labels                                                                                  |
  |:--------|:------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------|
  | type    | string                                                                                          | list                                                                                    | list                                                                                    |
  | details | <ul><li>min: 9 characters</li><li>mean: 33.93 characters</li><li>max: 100 characters</li></ul> | <ul><li>min: 2 elements</li><li>mean: 6.38 elements</li><li>max: 10 elements</li></ul> | <ul><li>min: 2 elements</li><li>mean: 6.38 elements</li><li>max: 10 elements</li></ul> |
* Samples:
  | query | docs | labels |
  |:------|:-----|:-------|
  | <code>what animals does wool come from</code> | <code>['Wool comes from animals who grow it. The most common animal is sheep.There are others which include lama, alpaca ,and other mammals. Sheep wool is also very popular along with Cashmere wool which is collected from cashmere goats. Sheep are often farmed in places like Australia.', 'Goats. Mohair and cashmere come from goats. Mohair comes from the Angora goat, which produces a long shaggy coat that is generally clipped twice per year. According to the Department of Animal Science at Oklahoma State University, each shearing will yield approximately 5.3 pounds of wool.', 'Wool is the fibre derived from the fur of animals of the Caprinae family, principally sheep, but the hair of certain species of other mammals such as goats, alpacas and rabbits may also be called wool. This article deals explicitly with the wool produced from domestic sheep.', 'Fashionistas with a flair for fabrics know that wool can come from many animals, and that some wools are more desirable than others. Wools range from soft and silky ones used for clothing to coarse and strong wools used for rugs.', 'Confidence votes 259. Wool comes from animals who grow it. The most common animal is sheep.There are others which include lama, alpaca ,and other mammals. Sheep wool is also very popular along with Cashmere wool which is collected from cashmere goats.']</code> | <code>[1, 0, 0, 0, 0]</code> |
  | <code>what is ecoflex</code> | <code>['ecoFLEX™: A blend of recycled polymers and wood byproducts, ecoFLEX exhibits the best characteristics of the polymers and wood that forms the basis for the material. ecoFLEX is the best of both. ecoFLEX is made by: The material is mixed with UV inhibitors and color blending compounds and granulated into a form that is used to mold the parts that are used to make an ecoFLEX based product.', 'Ecoflex is a pre-insulated pipe system at its best — versatile, flexible and durable. Ecoflex consists of Uponor crosslinked polyethylene (PEX) or high-density polyethylene (HDPE) pipe encased in closed-cell, PEX-foam insulation and covered by a watertight, corrugated HDPE jacket. Potable HDPE pipe is available in 1¼, 1½, 2, 3 and 4 sizes. 2 Ecoflex Potable Plus (one Uponor AquaPEX service pipe with heat trace) is available on a made-to-order basis. 3 Ecoflex Thermal and Potable PEX products up to 2 can use Uponor’s ProPEX® expansion fitting system.', 'JOIN THE QUIET REVOLUTION. Vauxhall’s ecoFL...</code> | <code>[1, 0, 0, 0, 0, ...]</code> |
  | <code>common cold incubation period, period of illness and contagious period</code> | <code>['Most experts agree that adults with a cold or the flu start being contagious about a day before they start experiencing symptoms. For the flu, the contagious period then lasts five to seven days into the illness. For children, the contagious period for the flu can last up to two weeks after they start feeling sick, even if they start feeling better before that. The contagious period for a cold lasts about three to four days into the illness. As a general rule, people with a cold are most contagious about three days after their initial exposure to the virus. For more information on colds, influenza and related topics, check out the links on the next page. Print.', 'Confidence votes 40.4K. The incubation period of the common cold is typically 36 to 48 hours, but, in some cases, symptoms can arise within 12 hours, or slightly less, of catching the virus. There are over 200 viruses responsible for the common cold. According to the US Centers for Disease Control and Prevention (CDC), the ...</code> | <code>[1, 1, 0, 0, 0, ...]</code> |
* Loss: [<code>ListNetLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#listnetloss) with these parameters:
  ```json
  {
      "activation_fn": "torch.nn.modules.linear.Identity",
      "mini_batch_size": 16
  }
  ```

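Putting the pieces of this section together: each training sample is one query, a list of candidate docs, and a parallel list of 0/1 relevance labels, which ListNetLoss consumes listwise. A minimal training sketch under those assumptions, with a one-row toy dataset standing in for the processed ms_marco split and hyperparameters taken from the values listed under Training Hyperparameters below:

```python
import torch
from datasets import Dataset
from sentence_transformers.cross_encoder import (
    CrossEncoder,
    CrossEncoderTrainer,
    CrossEncoderTrainingArguments,
)
from sentence_transformers.cross_encoder.losses import ListNetLoss

# Start from the same base checkpoint with a single output label.
model = CrossEncoder("bansalaman18/bert-uncased_L-10_H-512_A-8", num_labels=1)

# Toy stand-in for the processed ms_marco split described above:
# columns query (str), docs (list of str), labels (list of 0/1 ints).
train_dataset = Dataset.from_dict({
    "query": ["largest mammal on earth"],
    "docs": [[
        "The blue whale is the largest mammal on earth.",
        "MOLLE uses rows of heavy-duty nylon webbing.",
    ]],
    "labels": [[1, 0]],
})

# Loss parameters mirror the JSON block above.
loss = ListNetLoss(model, activation_fn=torch.nn.Identity(), mini_batch_size=16)

args = CrossEncoderTrainingArguments(
    output_dir="reranker-listnet",   # hypothetical output path
    num_train_epochs=1,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
    warmup_ratio=0.1,
    bf16=True,
    seed=12,
)

trainer = CrossEncoderTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
model.save_pretrained("reranker-listnet/final")
```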
### Evaluation Dataset

#### ms_marco

* Dataset: [ms_marco](https://huggingface.co/datasets/microsoft/ms_marco) at [a47ee7a](https://huggingface.co/datasets/microsoft/ms_marco/tree/a47ee7aae8d7d466ba15f9f0bfac3b3681087b3a)
* Size: 1,000 evaluation samples
* Columns: <code>query</code>, <code>docs</code>, and <code>labels</code>
* Approximate statistics based on the first 1000 samples:
  |         | query                                                                                          | docs                                                                                    | labels                                                                                  |
  |:--------|:-------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------|
  | type    | string                                                                                         | list                                                                                    | list                                                                                    |
  | details | <ul><li>min: 9 characters</li><li>mean: 34.15 characters</li><li>max: 99 characters</li></ul> | <ul><li>min: 3 elements</li><li>mean: 6.50 elements</li><li>max: 10 elements</li></ul> | <ul><li>min: 3 elements</li><li>mean: 6.50 elements</li><li>max: 10 elements</li></ul> |
* Samples:
  | query | docs | labels |
  |:------|:-----|:-------|
  | <code>largest mammal on earth</code> | <code>["The Blue whale is the largest mammal on earth. average size of the whale is 87 feet long. that's about as big as a passenger jet. it is also the loudest animal on the planet. … And the whale shark is a fish. The blue whale of course being the largest. Blue whale is not only the largest mammal on earth … , but the largest animal of any kind on the earth. Better still, the blue whale, is the largest animal to ever exist", 'The Largest Animal Ever: The Blue Whale. The blue whale (Balaenoptera musculus) is a marine mammal belonging to the suborder of baleen whales. At 30 metres (98 ft) in length and 180 metric tons (200 short tons) or more in weight, it is the largest known animal to have ever existed. The African Bush Elephant is the largest living terrestrial (land) animal, with males reaching 6 to 7.5 metres (19.7 to 24.6 ft) in length, 3.3 metres (10.8 ft) in height at the shoulder, and weighing 6 t (13,000 lb).', "By considerable measure, the largest known animal on Earth is the blu...</code> | <code>[1, 0, 0, 0, 0, ...]</code> |
  | <code>what is the common name for metacarpals</code> | <code>['Ten percent of all fractures that occur are those to the metacarpals and phalanges, the most common injuries being from car accidents, sports injuries, and work-related injuries. The goal in repairing these injuries is to do so while maintaining strength of hand grip and no residual pain upon using the hand. Metacarpals. The metacarpals are long bones within the hand that are connected to the carpals, or wrist bones, and to the phalanges, or finger bones. The metacarpals together are referred to as the metacarpus .. The tops of the metacarpals form the knuckles where they join to the wrist', "Metacarpal neck fractures are the most common type of metacarpal fracture. Such a fracture seen in the 5th (or rarely, the 4 th) metacarpal neck is called a boxer’s fracture.. Figure: A boxer's fracture, i.e., a fracture of the neck of the 5th metacarpal. (http://radiopaedia.org/cases/boxers-fracture-1). Metacarpal head fractures are rare and usually require surgical intervention.", 'The metacar...</code> | <code>[1, 0, 0, 0, 0, ...]</code> |
  | <code>how does a molle system work</code> | <code>["The system's modularity is derived from the use of PALS webbing as rows of heavy-duty nylon stitched onto the vest to allow for the attachment of various MOLLE-compatible pouches and accessories.", 'MOLLE SYSTEMS. MOLLE (pronounced MOLLY) is an acronym for MOdular Lightweight Load-carrying Equipment. Molle uses rows of heavy-duty nylon stitched onto the vest/pack to allow for attachment of various MOLLE-compatible pouches and accessories. These are called PALS (Pouch Attachment Ladder System).', 'This is the webbing you are seeing. The military standard for this webbing is horizontal rows of 1″ webbing, spaced 1″ apart, sewn to the item at 1.5″ intervals. Modular MOLLE Utility Pouch-Coyote Tan. The second term is again a military acronym-MOLLE; Modular Lightweight Load carrying Equipment.', 'The MOLLE system is a fully combined which is made up of a load bearing vest along with butt pack, main ruck that has sustainable pouches and also a partition to carry a sleeping bag. The load be...</code> | <code>[1, 0, 0, 0, 0, ...]</code> |
* Loss: [<code>ListNetLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#listnetloss) with these parameters:
  ```json
  {
      "activation_fn": "torch.nn.modules.linear.Identity",
      "mini_batch_size": 16
  }
  ```

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `learning_rate`: 2e-05
- `num_train_epochs`: 1
- `warmup_ratio`: 0.1
- `seed`: 12
- `bf16`: True
- `load_best_model_at_end`: True

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 1
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 12
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: True
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `hub_revision`: None
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `liger_kernel_config`: None
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: proportional
- `router_mapping`: {}
- `learning_rate_mapping`: {}

</details>

### Training Logs
| Epoch      | Step     | Training Loss | Validation Loss | NanoMSMARCO_R100_ndcg@10 | NanoNFCorpus_R100_ndcg@10 | NanoNQ_R100_ndcg@10  | NanoBEIR_R100_mean_ndcg@10 |
|:----------:|:--------:|:-------------:|:---------------:|:------------------------:|:-------------------------:|:--------------------:|:--------------------------:|
| -1         | -1       | -             | -               | 0.0329 (-0.5075)         | 0.3072 (-0.0179)          | 0.0411 (-0.4596)     | 0.1270 (-0.3283)           |
| 0.0002     | 1        | 2.1125        | -               | -                        | -                         | -                    | -                          |
| 0.0508     | 250      | 2.0939        | -               | -                        | -                         | -                    | -                          |
| 0.1016     | 500      | 2.0907        | 2.0901          | 0.0470 (-0.4934)         | 0.2425 (-0.0825)          | 0.0401 (-0.4605)     | 0.1099 (-0.3455)           |
| 0.1525     | 750      | 2.0878        | -               | -                        | -                         | -                    | -                          |
| 0.2033     | 1000     | 2.0852        | 2.0895          | 0.0449 (-0.4955)         | 0.2333 (-0.0917)          | 0.0691 (-0.4315)     | 0.1158 (-0.3396)           |
| 0.2541     | 1250     | 2.0922        | -               | -                        | -                         | -                    | -                          |
| 0.3049     | 1500     | 2.0852        | 2.0887          | 0.0477 (-0.4927)         | 0.2630 (-0.0621)          | 0.0605 (-0.4401)     | 0.1237 (-0.3316)           |
| 0.3558     | 1750     | 2.0927        | -               | -                        | -                         | -                    | -                          |
| 0.4066     | 2000     | 2.0834        | 2.0883          | 0.0616 (-0.4788)         | 0.2742 (-0.0508)          | 0.0562 (-0.4445)     | 0.1307 (-0.3247)           |
| 0.4574     | 2250     | 2.0838        | -               | -                        | -                         | -                    | -                          |
| 0.5082     | 2500     | 2.0844        | 2.0879          | 0.0557 (-0.4847)         | 0.2558 (-0.0692)          | 0.0811 (-0.4195)     | 0.1309 (-0.3245)           |
| 0.5591     | 2750     | 2.0852        | -               | -                        | -                         | -                    | -                          |
| **0.6099** | **3000** | **2.087**     | **2.0878**      | **0.0612 (-0.4792)**     | **0.2518 (-0.0732)**      | **0.0866 (-0.4141)** | **0.1332 (-0.3222)**       |
| 0.6607     | 3250     | 2.0878        | -               | -                        | -                         | -                    | -                          |
| 0.7115     | 3500     | 2.0854        | 2.0878          | 0.0595 (-0.4809)         | 0.2569 (-0.0681)          | 0.0750 (-0.4257)     | 0.1305 (-0.3249)           |
| 0.7624     | 3750     | 2.0866        | -               | -                        | -                         | -                    | -                          |
| 0.8132     | 4000     | 2.0868        | 2.0874          | 0.0553 (-0.4851)         | 0.2519 (-0.0731)          | 0.0729 (-0.4278)     | 0.1267 (-0.3287)           |
| 0.8640     | 4250     | 2.0915        | -               | -                        | -                         | -                    | -                          |
| 0.9148     | 4500     | 2.0851        | 2.0873          | 0.0553 (-0.4851)         | 0.2528 (-0.0722)          | 0.0722 (-0.4285)     | 0.1268 (-0.3286)           |
| 0.9656     | 4750     | 2.084         | -               | -                        | -                         | -                    | -                          |
| -1         | -1       | -             | -               | 0.0612 (-0.4792)         | 0.2518 (-0.0732)          | 0.0866 (-0.4141)     | 0.1332 (-0.3222)           |

* The bold row denotes the saved checkpoint.

### Framework Versions
- Python: 3.10.18
- Sentence Transformers: 5.0.0
- Transformers: 4.56.0.dev0
- PyTorch: 2.7.1+cu126
- Accelerate: 1.9.0
- Datasets: 4.0.0
- Tokenizers: 0.21.4

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### ListNetLoss
```bibtex
@inproceedings{cao2007learning,
    title={Learning to Rank: From Pairwise Approach to Listwise Approach},
    author={Cao, Zhe and Qin, Tao and Liu, Tie-Yan and Tsai, Ming-Feng and Li, Hang},
    booktitle={Proceedings of the 24th international conference on Machine learning},
    pages={129--136},
    year={2007}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
config.json ADDED
@@ -0,0 +1,34 @@
{
  "architectures": [
    "BertForSequenceClassification"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 512,
  "id2label": {
    "0": "LABEL_0"
  },
  "initializer_range": 0.02,
  "intermediate_size": 2048,
  "label2id": {
    "LABEL_0": 0
  },
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 8,
  "num_hidden_layers": 10,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "sentence_transformers": {
    "activation_fn": "torch.nn.modules.activation.Sigmoid",
    "version": "5.0.0"
  },
  "torch_dtype": "float32",
  "transformers_version": "4.56.0.dev0",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}
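Since the config above declares a plain `BertForSequenceClassification` head with a single output label, the checkpoint can also be scored with `transformers` directly, without sentence-transformers. A minimal sketch, applying the sigmoid recorded under `sentence_transformers.activation_fn` to map the raw logit to a score; the query/passage pair is illustrative:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "rahulseetharaman/reranker-msmarco-v1.1-bert-uncased_L-10_H-512_A-8-listnet"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)
model.eval()

# Cross-encoders score a (query, passage) pair jointly in one forward pass.
features = tokenizer(
    ["largest mammal on earth"],
    ["The blue whale is the largest mammal on earth."],
    padding=True,
    truncation=True,
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**features).logits.squeeze(-1)
scores = torch.sigmoid(logits)  # sigmoid per "activation_fn" in config.json
print(scores)
```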
model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b8922d27286bb64c39a1c44b744766869b0e348ad1c1dd950e8bd367fe7abbf7
size 190733508
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
{
  "cls_token": "[CLS]",
  "mask_token": "[MASK]",
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "unk_token": "[UNK]"
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,58 @@
{
  "added_tokens_decoder": {
    "0": {
      "content": "[PAD]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "100": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "101": {
      "content": "[CLS]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "102": {
      "content": "[SEP]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "103": {
      "content": "[MASK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "clean_up_tokenization_spaces": true,
  "cls_token": "[CLS]",
  "do_basic_tokenize": true,
  "do_lower_case": true,
  "extra_special_tokens": {},
  "mask_token": "[MASK]",
  "model_max_length": 512,
  "never_split": null,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "unk_token": "[UNK]"
}
vocab.txt ADDED
The diff for this file is too large to render. See raw diff