akot committed on
Commit 97cc552 · verified · 1 Parent(s): c5e489e

Add new SentenceTransformer model.

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
README.md ADDED
@@ -0,0 +1,838 @@
+ ---
+ base_model: jinaai/jina-embeddings-v2-base-de
+ datasets: []
+ language:
+ - de
+ library_name: sentence-transformers
+ license: apache-2.0
+ metrics:
+ - cosine_accuracy@1
+ - cosine_accuracy@3
+ - cosine_accuracy@5
+ - cosine_accuracy@10
+ - cosine_precision@1
+ - cosine_precision@3
+ - cosine_precision@5
+ - cosine_precision@10
+ - cosine_recall@1
+ - cosine_recall@3
+ - cosine_recall@5
+ - cosine_recall@10
+ - cosine_ndcg@10
+ - cosine_mrr@10
+ - cosine_map@100
+ pipeline_tag: sentence-similarity
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:4957
+ - loss:MatryoshkaLoss
+ - loss:MultipleNegativesRankingLoss
+ widget:
+ - source_sentence: 312 Aus steuerlicher Sicht ist es möglich, mehrere Versorgungszusagen
+     nebeneinander, also neben einer Altzusage auch eine Neuzusage zu erteilen (z.
+     B. „alte“ Direktversicherung und „neuer“ Pensionsfonds).
+   sentences:
+   - Wann liegt bei der betrieblichen Altersversorgung eine schädliche Verwendung vor?
+   - Welche steuerliche Behandlung erfahren Auszahlungen aus Altersvorsorgeverträgen
+     nach § 22 Nr. 5 EStG?
+   - Können verschiedene Versorgungszusagen wie Direktversicherung und Pensionsfonds
+     gleichzeitig bestehen?
+ - source_sentence: 5 Pflichtversicherte nach dem Gesetz über die Alterssicherung der
+     Landwirte gehören, soweit sie nicht als Pflichtversicherte der gesetzlichen Rentenversicherung
+     ohnehin bereits anspruchsberechtigt sind, in dieser Eigenschaft ebenfalls zum
+     begünstigten Personenkreis. Darunter fallen insbesondere die in Anlage 1 Abschnitt
+     B aufgeführten Personen.
+   sentences:
+   - Wann wird das Anrecht der ausgleichsberechtigten Person bei intern geteilter Altersvorsorge
+     als abgeschlossen betrachtet?
+   - Welche Personen sind in der Anlage 1 Abschnitt B bezüglich der Alterssicherung
+     der Landwirte aufgeführt?
+   - In welchen Fällen führt die Möglichkeit einer Beitragserstattung nicht zur Versagung
+     der Anerkennung als betriebliche Altersversorgung?
+ - source_sentence: 233 Voraussetzung für die Förderung durch Sonderausgabenabzug nach
+     § 10a EStG und Zulage nach Abschnitt XI EStG ist in den Fällen der Rz. 231 f.,
+     dass der Steuerpflichtige zum begünstigten Personenkreis gehört. Die zeitliche
+     Zuordnung dieser Altersvorsorgebeiträge richtet sich grundsätzlich nach § 11 Abs.
+     2 EStG.
+   sentences:
+   - Wer gehört zum begünstigten Personenkreis für die Altersvorsorgeförderung?
+   - Wie werden erstattete Kosten eines Altersvorsorgevertrags besteuert, wenn sie
+     dem Steuerpflichtigen ausgezahlt werden?
+   - Ist der Übertragungswert einer betrieblichen Altersversorgung bei einem Arbeitgeberwechsel
+     steuerfrei?
+ - source_sentence: 127 Die Entnahme des Teilkapitalbetrags von bis zu 30 % des zur
+     Verfügung stehenden Kapitals aus dem Vertrag hat zu Beginn der Auszahlungsphase
+     zu erfolgen. Eine Verteilung über mehrere Auszahlungszeitpunkte ist nicht möglich.
+   sentences:
+   - Kann ich den Teilkapitalbetrag aus meiner Altersvorsorge zu verschiedenen Zeitpunkten
+     entnehmen?
+   - Welche Einkunftsarten können Leistungen aus einer Versorgungszusage des Arbeitgebers
+     sein?
+   - Was ist im Todesfall des Zulageberechtigten bezüglich der Förderbeiträge zu tun?
+ - source_sentence: '67 Abwandlung des Beispiels 1 in Rn. 66: A erhält zudem zwei Kinderzulagen
+     für seine in den Jahren 2004 und 2005 geborenen Kinder. Beitragspflichtige Einnahmen
+     53.000 € 4 % 2.120 € höchstens 2.100 € anzusetzen 2.100 € abzüglich Zulage 175
+     € Mindesteigenbeitrag (§ 86 Abs. 1 Satz 2 EStG) 1.925 € Sockelbetrag (§ 86 Abs.
+     1 Satz 4 EStG) 60 € maßgebend (§ 86 Abs. 1 Satz 5 EStG) 1.925 € Die von A geleisteten
+     Beiträge übersteigen den Mindesteigenbeitrag. Die Zulage wird nicht gekürzt.'
+   sentences:
+   - Wird die Zulage für A gekürzt, wenn die Beiträge den Mindesteigenbeitrag übersteigen?
+   - Was versteht man unter Sonderzahlungen des Arbeitgebers?
+   - Wie erfolgt die Besteuerung bei der ausgleichsberechtigten Person nach einer externen
+     Teilung?
+ model-index:
+ - name: German Semantic V3 BMF
+   results:
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 768
+       type: dim_768
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.0
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.0018148820326678765
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.0018148820326678765
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 0.003629764065335753
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.0
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.0006049606775559588
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.00036297640653357535
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.00036297640653357535
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.0
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.0018148820326678765
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.0018148820326678765
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 0.003629764065335753
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.0017915370974219231
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.0012099213551119176
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.001909500365693067
+       name: Cosine Map@100
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 512
+       type: dim_512
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.0
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.0018148820326678765
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.003629764065335753
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 0.003629764065335753
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.0
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.0006049606775559588
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.0007259528130671507
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.00036297640653357535
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.0
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.0018148820326678765
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.003629764065335753
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 0.003629764065335753
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.0019266902207710538
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.0013611615245009074
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.0022709772837192373
+       name: Cosine Map@100
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 256
+       type: dim_256
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.0
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.0018148820326678765
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.0018148820326678765
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 0.003629764065335753
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.0
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.0006049606775559588
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.00036297640653357535
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.00036297640653357535
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.0
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.0018148820326678765
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.0018148820326678765
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 0.003629764065335753
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.001717594610448614
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.001134301270417423
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.0020295043929584406
+       name: Cosine Map@100
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 128
+       type: dim_128
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.0
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.0
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.0
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 0.0054446460980036296
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.0
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.0
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.0
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.000544464609800363
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.0
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.0
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.0
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 0.0054446460980036296
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.001765339491030367
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.0007309941520467837
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.0014606683373623462
+       name: Cosine Map@100
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 64
+       type: dim_64
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.0
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.0
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.0018148820326678765
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 0.0018148820326678765
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.0
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.0
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.00036297640653357535
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.00018148820326678767
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.0
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.0
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.0018148820326678765
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 0.0018148820326678765
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.0007816271471386444
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.00045372050816696913
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.0011358919077603595
+       name: Cosine Map@100
+ ---
+
+ # German Semantic V3 BMF
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [jinaai/jina-embeddings-v2-base-de](https://huggingface.co/jinaai/jina-embeddings-v2-base-de). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [jinaai/jina-embeddings-v2-base-de](https://huggingface.co/jinaai/jina-embeddings-v2-base-de) <!-- at revision 13b8b30bd0bbee829ceffb82b282cc714cef836e -->
+ - **Maximum Sequence Length:** 1024 tokens
+ - **Output Dimensionality:** 768 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ - **Language:** de
+ - **License:** apache-2.0
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 1024, 'do_lower_case': False}) with Transformer model: BertModel
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
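+
+ The `Pooling` module performs attention-masked mean pooling over the token embeddings, and `Normalize` L2-normalizes the result. A rough hand-written equivalent at the Transformers level, shown here as an illustrative sketch rather than the exact internal code:
+
+ ```python
+ import torch
+ from transformers import AutoModel, AutoTokenizer
+
+ name = "akot/jina-semantic-bmf-matryoshka-1024-10epochs"
+ tokenizer = AutoTokenizer.from_pretrained(name)
+ # trust_remote_code: the Jina BERT backbone ships custom modeling code
+ model = AutoModel.from_pretrained(name, trust_remote_code=True)
+
+ batch = tokenizer(["Ein Beispielsatz."], padding=True, truncation=True,
+                   max_length=1024, return_tensors="pt")
+ with torch.no_grad():
+     token_embeddings = model(**batch).last_hidden_state  # (batch, seq_len, 768)
+
+ # Mean pooling over real (non-padding) tokens, then L2 normalization,
+ # mirroring the Pooling and Normalize modules above.
+ mask = batch["attention_mask"].unsqueeze(-1).float()
+ embedding = (token_embeddings * mask).sum(1) / mask.sum(1).clamp(min=1e-9)
+ embedding = torch.nn.functional.normalize(embedding, p=2, dim=1)
+ ```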
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference. Note that `trust_remote_code=True` is required because the Jina BERT backbone uses custom modeling code.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("akot/jina-semantic-bmf-matryoshka-1024-10epochs", trust_remote_code=True)
+ # Run inference
+ sentences = [
+     '67 Abwandlung des Beispiels 1 in Rn. 66: A erhält zudem zwei Kinderzulagen für seine in den Jahren 2004 und 2005 geborenen Kinder. Beitragspflichtige Einnahmen 53.000 € 4 % 2.120 € höchstens 2.100 € anzusetzen 2.100 € abzüglich Zulage 175 € Mindesteigenbeitrag (§ 86 Abs. 1 Satz 2 EStG) 1.925 € Sockelbetrag (§ 86 Abs. 1 Satz 4 EStG) 60 € maßgebend (§ 86 Abs. 1 Satz 5 EStG) 1.925 € Die von A geleisteten Beiträge übersteigen den Mindesteigenbeitrag. Die Zulage wird nicht gekürzt.',
+     'Wird die Zulage für A gekürzt, wenn die Beiträge den Mindesteigenbeitrag übersteigen?',
+     'Wie erfolgt die Besteuerung bei der ausgleichsberechtigten Person nach einer externen Teilung?',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 768]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
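+
+ Because the model was trained with a Matryoshka objective (see the loss configuration under Training Details), embeddings can be truncated to one of the trained dimensionalities (768, 512, 256, 128, 64) to trade accuracy for speed and storage. A minimal sketch, assuming sentence-transformers >= 2.7 for the `truncate_dim` argument:
+
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Load with embeddings truncated to 256 of the 768 dimensions
+ model = SentenceTransformer(
+     "akot/jina-semantic-bmf-matryoshka-1024-10epochs",
+     truncate_dim=256,
+     trust_remote_code=True,
+ )
+ embeddings = model.encode(["Wer gehört zum begünstigten Personenkreis?"])
+ print(embeddings.shape)
+ # (1, 256)
+ ```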
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ ## Evaluation
+
+ ### Metrics
+
+ #### Information Retrieval
+ * Dataset: `dim_768`
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+ | Metric              | Value      |
+ |:--------------------|:-----------|
+ | cosine_accuracy@1   | 0.0        |
+ | cosine_accuracy@3   | 0.0018     |
+ | cosine_accuracy@5   | 0.0018     |
+ | cosine_accuracy@10  | 0.0036     |
+ | cosine_precision@1  | 0.0        |
+ | cosine_precision@3  | 0.0006     |
+ | cosine_precision@5  | 0.0004     |
+ | cosine_precision@10 | 0.0004     |
+ | cosine_recall@1     | 0.0        |
+ | cosine_recall@3     | 0.0018     |
+ | cosine_recall@5     | 0.0018     |
+ | cosine_recall@10    | 0.0036     |
+ | cosine_ndcg@10      | 0.0018     |
+ | cosine_mrr@10       | 0.0012     |
+ | **cosine_map@100**  | **0.0019** |
+
+ #### Information Retrieval
+ * Dataset: `dim_512`
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+ | Metric              | Value      |
+ |:--------------------|:-----------|
+ | cosine_accuracy@1   | 0.0        |
+ | cosine_accuracy@3   | 0.0018     |
+ | cosine_accuracy@5   | 0.0036     |
+ | cosine_accuracy@10  | 0.0036     |
+ | cosine_precision@1  | 0.0        |
+ | cosine_precision@3  | 0.0006     |
+ | cosine_precision@5  | 0.0007     |
+ | cosine_precision@10 | 0.0004     |
+ | cosine_recall@1     | 0.0        |
+ | cosine_recall@3     | 0.0018     |
+ | cosine_recall@5     | 0.0036     |
+ | cosine_recall@10    | 0.0036     |
+ | cosine_ndcg@10      | 0.0019     |
+ | cosine_mrr@10       | 0.0014     |
+ | **cosine_map@100**  | **0.0023** |
+
+ #### Information Retrieval
+ * Dataset: `dim_256`
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+ | Metric              | Value     |
+ |:--------------------|:----------|
+ | cosine_accuracy@1   | 0.0       |
+ | cosine_accuracy@3   | 0.0018    |
+ | cosine_accuracy@5   | 0.0018    |
+ | cosine_accuracy@10  | 0.0036    |
+ | cosine_precision@1  | 0.0       |
+ | cosine_precision@3  | 0.0006    |
+ | cosine_precision@5  | 0.0004    |
+ | cosine_precision@10 | 0.0004    |
+ | cosine_recall@1     | 0.0       |
+ | cosine_recall@3     | 0.0018    |
+ | cosine_recall@5     | 0.0018    |
+ | cosine_recall@10    | 0.0036    |
+ | cosine_ndcg@10      | 0.0017    |
+ | cosine_mrr@10       | 0.0011    |
+ | **cosine_map@100**  | **0.002** |
+
+ #### Information Retrieval
+ * Dataset: `dim_128`
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+ | Metric              | Value      |
+ |:--------------------|:-----------|
+ | cosine_accuracy@1   | 0.0        |
+ | cosine_accuracy@3   | 0.0        |
+ | cosine_accuracy@5   | 0.0        |
+ | cosine_accuracy@10  | 0.0054     |
+ | cosine_precision@1  | 0.0        |
+ | cosine_precision@3  | 0.0        |
+ | cosine_precision@5  | 0.0        |
+ | cosine_precision@10 | 0.0005     |
+ | cosine_recall@1     | 0.0        |
+ | cosine_recall@3     | 0.0        |
+ | cosine_recall@5     | 0.0        |
+ | cosine_recall@10    | 0.0054     |
+ | cosine_ndcg@10      | 0.0018     |
+ | cosine_mrr@10       | 0.0007     |
+ | **cosine_map@100**  | **0.0015** |
+
+ #### Information Retrieval
+ * Dataset: `dim_64`
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+ | Metric              | Value      |
+ |:--------------------|:-----------|
+ | cosine_accuracy@1   | 0.0        |
+ | cosine_accuracy@3   | 0.0        |
+ | cosine_accuracy@5   | 0.0018     |
+ | cosine_accuracy@10  | 0.0018     |
+ | cosine_precision@1  | 0.0        |
+ | cosine_precision@3  | 0.0        |
+ | cosine_precision@5  | 0.0004     |
+ | cosine_precision@10 | 0.0002     |
+ | cosine_recall@1     | 0.0        |
+ | cosine_recall@3     | 0.0        |
+ | cosine_recall@5     | 0.0018     |
+ | cosine_recall@10    | 0.0018     |
+ | cosine_ndcg@10      | 0.0008     |
+ | cosine_mrr@10       | 0.0005     |
+ | **cosine_map@100**  | **0.0011** |
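+
+ These tables come from `InformationRetrievalEvaluator` runs at each Matryoshka dimensionality. A minimal sketch of running such an evaluation yourself; the toy query/corpus data below is illustrative, not the actual evaluation set:
+
+ ```python
+ from sentence_transformers import SentenceTransformer
+ from sentence_transformers.evaluation import InformationRetrievalEvaluator
+
+ model = SentenceTransformer("akot/jina-semantic-bmf-matryoshka-1024-10epochs", trust_remote_code=True)
+
+ # Toy data: ids mapped to texts, plus the set of relevant corpus ids per query
+ queries = {"q1": "Wer gehört zum begünstigten Personenkreis für die Altersvorsorgeförderung?"}
+ corpus = {"d1": "Voraussetzung für die Förderung ist, dass der Steuerpflichtige zum begünstigten Personenkreis gehört."}
+ relevant_docs = {"q1": {"d1"}}
+
+ evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs, name="dim_768")
+ results = evaluator(model)  # dict of accuracy@k, precision@k, recall@k, ndcg@10, mrr@10, map@100
+ ```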
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### Unnamed Dataset
+
+ * Size: 4,957 training samples
+ * Columns: <code>positive</code> and <code>anchor</code>
+ * Approximate statistics based on the first 1000 samples:
+   |         | positive                                                                              | anchor                                                                            |
+   |:--------|:--------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
+   | type    | string                                                                                | string                                                                            |
+   | details | <ul><li>min: 5 tokens</li><li>mean: 145.09 tokens</li><li>max: 1024 tokens</li></ul> | <ul><li>min: 9 tokens</li><li>mean: 19.57 tokens</li><li>max: 41 tokens</li></ul> |
+ * Samples:
+   | positive | anchor |
+   |:---------|:-------|
+   | <code>134 Eine Rückzahlungsverpflichtung besteht nicht für den Teil der Zulagen, der auf nach § 1 Abs. 1 Nr. 2 AltZertG angespartes gefördertes Altersvorsorgevermögen entfällt, wenn es in Form einer Hinterbliebenenrente an die dort genannten Hinterbliebenen ausgezahlt wird. Dies gilt auch für den entsprechenden Teil der Steuerermäßigung.</code> | <code>Muss man Zulagen zurückzahlen, wenn das Altersvorsorgevermögen als Hinterbliebenenrente ausgezahlt wird?</code> |
+   | <code>140 Beendet der Zulageberechtigte vor der vollständigen Rückzahlung des AltersvorsorgeEigenheimbetrags die Nutzung zu eigenen Wohnzwecken, wird er so behandelt, als habe er den noch nicht zurückgezahlten Betrag schädlich verwendet. Die auf den noch ausstehenden Rückzahlungsbetrag entfallenden Zulagen sowie die nach § 10a Abs. 4 EStG gesondert festgestellten Steuerermäßigungen sind zurückzuzahlen (§ 92a Abs. 3 EStG). Die im noch ausstehenden Rückzahlungsbetrag enthaltenen Zuwächse (z.B. Zinserträge und Kursgewinne) Seite 41 sind als sonstige Einkünfte zu versteuern (§ 22 Nr. 5 Satz 5 Halbsatz 1 EStG). Außerdem hat der Zulageberechtigte den Vorteil zu versteuern, der sich aus der zinslosen Nutzung des noch nicht zurückgezahlten Betrags ergibt. Zugrunde gelegt wird hierbei eine Verzinsung von 5 % (Zins und Zinseszins) für jedes volle Kalenderjahr der Nutzung (§ 22 Nr. 5 Satz 5 Halbsatz 2 EStG). Diese Folgen treten nicht ein, wenn er den noch nicht zurückgezahlten Betrag in ein Folgeobjekt investiert (§ 92a Abs. 4 Satz 3 Nr. 1 EStG) oder zugunsten eines auf seinen Namen lautenden zertifizierten Altersvorsorgevertrags einzahlt (§ 92a Abs. 4 Satz 3 Nr. 2 EStG).</code> | <code>Was geschieht steuerlich, wenn der AltersvorsorgeEigenheimbetrag nicht vollständig zurückgezahlt wird und die Immobilie nicht mehr selbst genutzt wird?</code> |
+   | <code>144 Die als Einkünfte nach § 22 Nr. 5 Satz 3 EStG i.V.m. § 22 Nr. 5 Satz 2 EStG zu besteuernden Beträge muss der Anbieter gem. § 94 Abs. 1 Satz 4 EStG dem Zulageberechtigten bescheinigen und im Wege des Rentenbezugsmitteilungsverfahrens (§ 22a EStG) mitteilen. Ergeben sich insoweit steuerpflichtige Einkünfte nach § 22 Nr. 5 Satz 3 EStG für einen anderen Leistungsempfänger (z. B. Erben), ist für diesen eine entsprechende Rentenbezugsmitteilung der ZfA zu übermitteln.</code> | <code>Was muss im Falle eines anderen Leistungsempfängers, wie Erben, hinsichtlich der Rentenbezugsmitteilung getan werden?</code> |
+ * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
+   ```json
+   {
+       "loss": "MultipleNegativesRankingLoss",
+       "matryoshka_dims": [
+           768,
+           512,
+           256,
+           128,
+           64
+       ],
+       "matryoshka_weights": [
+           1,
+           1,
+           1,
+           1,
+           1
+       ],
+       "n_dims_per_step": -1
+   }
+   ```
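+
+ In code, this configuration corresponds to wrapping a `MultipleNegativesRankingLoss` (in-batch negatives over (anchor, positive) pairs) in a `MatryoshkaLoss` so the same objective is applied at every truncated dimensionality. A sketch, not the exact training script:
+
+ ```python
+ from sentence_transformers import SentenceTransformer
+ from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss
+
+ model = SentenceTransformer("jinaai/jina-embeddings-v2-base-de", trust_remote_code=True)
+
+ # Inner loss: in-batch negatives over (anchor, positive) pairs
+ inner_loss = MultipleNegativesRankingLoss(model)
+ # Apply the same objective at each truncated dimensionality
+ loss = MatryoshkaLoss(model, inner_loss, matryoshka_dims=[768, 512, 256, 128, 64])
+ ```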
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `eval_strategy`: epoch
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
+ - `gradient_accumulation_steps`: 16
+ - `learning_rate`: 2e-05
+ - `num_train_epochs`: 10
+ - `lr_scheduler_type`: cosine
+ - `warmup_ratio`: 0.1
+ - `bf16`: True
+ - `tf32`: True
+ - `load_best_model_at_end`: True
+ - `optim`: adamw_torch_fused
+ - `batch_sampler`: no_duplicates
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: epoch
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 16
+ - `eval_accumulation_steps`: None
+ - `learning_rate`: 2e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 10
+ - `max_steps`: -1
+ - `lr_scheduler_type`: cosine
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.1
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: True
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: True
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: True
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch_fused
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: False
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `batch_sampler`: no_duplicates
+ - `multi_dataset_batch_sampler`: proportional
+
+ </details>
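+
+ Put together, the non-default hyperparameters above map onto a trainer setup roughly like the following sketch (assuming the sentence-transformers 3.x training API; the one-pair dataset and `output_dir` are illustrative stand-ins, and `save_strategy` is an assumption added so that `load_best_model_at_end` works):
+
+ ```python
+ from datasets import Dataset
+ from sentence_transformers import (
+     SentenceTransformer,
+     SentenceTransformerTrainer,
+     SentenceTransformerTrainingArguments,
+ )
+ from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss
+ from sentence_transformers.training_args import BatchSamplers
+
+ model = SentenceTransformer("jinaai/jina-embeddings-v2-base-de", trust_remote_code=True)
+ loss = MatryoshkaLoss(model, MultipleNegativesRankingLoss(model),
+                       matryoshka_dims=[768, 512, 256, 128, 64])
+
+ # Illustrative stand-in for the 4,957 (positive, anchor) training pairs
+ train_dataset = Dataset.from_dict({
+     "anchor": ["Wer gehört zum begünstigten Personenkreis?"],
+     "positive": ["Voraussetzung ist die Zugehörigkeit zum begünstigten Personenkreis."],
+ })
+
+ args = SentenceTransformerTrainingArguments(
+     output_dir="output",                  # illustrative
+     num_train_epochs=10,
+     per_device_train_batch_size=16,
+     per_device_eval_batch_size=16,
+     gradient_accumulation_steps=16,
+     learning_rate=2e-5,
+     lr_scheduler_type="cosine",
+     warmup_ratio=0.1,
+     bf16=True,
+     tf32=True,
+     eval_strategy="epoch",
+     save_strategy="epoch",                # assumed: must match eval_strategy
+     load_best_model_at_end=True,
+     optim="adamw_torch_fused",
+     batch_sampler=BatchSamplers.NO_DUPLICATES,
+ )
+
+ trainer = SentenceTransformerTrainer(
+     model=model,
+     args=args,
+     train_dataset=train_dataset,
+     eval_dataset=train_dataset,           # illustrative; use a held-out split
+     loss=loss,
+ )
+ trainer.train()
+ ```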
+
+ ### Training Logs
+ | Epoch      | Step   | Training Loss | dim_128_cosine_map@100 | dim_256_cosine_map@100 | dim_512_cosine_map@100 | dim_64_cosine_map@100 | dim_768_cosine_map@100 |
+ |:----------:|:------:|:-------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|:----------------------:|
+ | 0.5161     | 10     | 19.1189       | -                      | -                      | -                      | -                     | -                      |
+ | 0.9806     | 19     | -             | 0.0008                 | 0.0009                 | 0.0007                 | 0.0004                | 0.0006                 |
+ | 1.0323     | 20     | 19.0037       | -                      | -                      | -                      | -                     | -                      |
+ | 1.5484     | 30     | 18.7625       | -                      | -                      | -                      | -                     | -                      |
+ | 1.9613     | 38     | -             | 0.0007                 | 0.0008                 | 0.0007                 | 0.0004                | 0.0008                 |
+ | 2.0645     | 40     | 18.3193       | -                      | -                      | -                      | -                     | -                      |
+ | 2.5806     | 50     | 18.5378       | -                      | -                      | -                      | -                     | -                      |
+ | **2.9935** | **58** | **-**         | **0.0018**             | **0.002**              | **0.0017**             | **0.0018**            | **0.0016**             |
+ | 3.0968     | 60     | 18.103        | -                      | -                      | -                      | -                     | -                      |
+ | 3.6129     | 70     | 18.0413       | -                      | -                      | -                      | -                     | -                      |
+ | 3.9742     | 77     | -             | 0.0011                 | 0.0006                 | 0.0004                 | 0.0006                | 0.0004                 |
+ | 4.1290     | 80     | 17.7272       | -                      | -                      | -                      | -                     | -                      |
+ | 4.6452     | 90     | 17.4282       | -                      | -                      | -                      | -                     | -                      |
+ | 4.9548     | 96     | -             | 0.0014                 | 0.0011                 | 0.0009                 | 0.0005                | 0.0007                 |
+ | 5.1613     | 100    | 17.2805       | -                      | -                      | -                      | -                     | -                      |
+ | 5.6774     | 110    | 17.3362       | -                      | -                      | -                      | -                     | -                      |
+ | 5.9871     | 116    | -             | 0.0005                 | 0.0002                 | 0.0002                 | 0.0016                | 0.0003                 |
+ | 6.1935     | 120    | 17.1479       | -                      | -                      | -                      | -                     | -                      |
+ | 6.7097     | 130    | 17.1761       | -                      | -                      | -                      | -                     | -                      |
+ | 6.9677     | 135    | -             | 0.0020                 | 0.0025                 | 0.0025                 | 0.0017                | 0.0020                 |
+ | 7.2258     | 140    | 16.8886       | -                      | -                      | -                      | -                     | -                      |
+ | 7.7419     | 150    | 16.9517       | -                      | -                      | -                      | -                     | -                      |
+ | 8.0        | 155    | -             | 0.0008                 | 0.0009                 | 0.0010                 | 0.0005                | 0.0010                 |
+ | 8.2581     | 160    | 17.0198       | -                      | -                      | -                      | -                     | -                      |
+ | 8.7742     | 170    | 16.7575       | -                      | -                      | -                      | -                     | -                      |
+ | 8.9806     | 174    | -             | 0.0013                 | 0.0009                 | 0.0007                 | 0.0008                | 0.0007                 |
+ | 9.2903     | 180    | 16.8138       | -                      | -                      | -                      | -                     | -                      |
+ | 9.8065     | 190    | 16.9424       | 0.0015                 | 0.0020                 | 0.0023                 | 0.0011                | 0.0019                 |
+
+ * The bold row denotes the saved checkpoint.
+
+ ### Framework Versions
+ - Python: 3.11.4
+ - Sentence Transformers: 3.0.1
+ - Transformers: 4.41.2
+ - PyTorch: 2.1.2+cu121
+ - Accelerate: 0.33.0
+ - Datasets: 2.19.1
+ - Tokenizers: 0.19.1
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### MatryoshkaLoss
+ ```bibtex
+ @misc{kusupati2024matryoshka,
+     title={Matryoshka Representation Learning},
+     author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
+     year={2024},
+     eprint={2205.13147},
+     archivePrefix={arXiv},
+     primaryClass={cs.LG}
+ }
+ ```
+
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,35 @@
+ {
+   "_name_or_path": "jinaai/jina-embeddings-v2-base-de",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.0,
+   "auto_map": {
+     "AutoConfig": "jinaai/jina-bert-implementation--configuration_bert.JinaBertConfig",
+     "AutoModel": "jinaai/jina-bert-implementation--modeling_bert.JinaBertModel",
+     "AutoModelForMaskedLM": "jinaai/jina-bert-implementation--modeling_bert.JinaBertForMaskedLM",
+     "AutoModelForSequenceClassification": "jinaai/jina-bert-implementation--modeling_bert.JinaBertForSequenceClassification"
+   },
+   "classifier_dropout": null,
+   "emb_pooler": "mean",
+   "feed_forward_type": "geglu",
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 8192,
+   "model_max_length": 8192,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 0,
+   "position_embedding_type": "alibi",
+   "torch_dtype": "float32",
+   "transformers_version": "4.41.2",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 61056
+ }
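
The `auto_map` entries above resolve to Jina's custom `JinaBertModel` implementation rather than stock Transformers code, so loading the backbone directly through Transformers requires opting in to remote code. A minimal sketch:

```python
from transformers import AutoModel, AutoTokenizer

# auto_map points at custom code from the jinaai/jina-bert-implementation
# repo, so trust_remote_code=True is required here.
tokenizer = AutoTokenizer.from_pretrained("akot/jina-semantic-bmf-matryoshka-1024-10epochs")
model = AutoModel.from_pretrained(
    "akot/jina-semantic-bmf-matryoshka-1024-10epochs",
    trust_remote_code=True,
)
```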
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.0.1",
+     "transformers": "4.41.2",
+     "pytorch": "2.1.2+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": null
+ }
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:008acf6a2c9172a3216cdde2d2fbbde428ea8d790a7d0a7ed42e07a8fc079bf7
+ size 555344752
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 1024,
+   "do_lower_case": false
+ }
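
Note the mismatch with `config.json` above: the ALiBi backbone allows up to 8192 positions, but the Sentence Transformers wrapper truncates inputs at 1024 tokens, the length used during fine-tuning. The limit is exposed as a mutable attribute; a sketch of inspecting (and, at your own risk, raising) it:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("akot/jina-semantic-bmf-matryoshka-1024-10epochs", trust_remote_code=True)
print(model.max_seq_length)  # 1024, from sentence_bert_config.json

# ALiBi tolerates longer inputs, but the model was trained with 1024-token
# truncation, so quality beyond that length is untested.
model.max_seq_length = 2048
```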
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "cls_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "<mask>",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<pad>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "<unk>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
+ {
+   "add_prefix_space": false,
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<pad>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "3": {
+       "content": "<unk>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "4": {
+       "content": "<mask>",
+       "lstrip": true,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "<s>",
+   "eos_token": "</s>",
+   "errors": "replace",
+   "mask_token": "<mask>",
+   "model_max_length": 8192,
+   "pad_token": "<pad>",
+   "sep_token": "</s>",
+   "tokenizer_class": "RobertaTokenizer",
+   "trim_offsets": true,
+   "unk_token": "<unk>"
+ }
vocab.json ADDED
The diff for this file is too large to render. See raw diff