am-azadi committed · verified · Commit fe8bb26 · 1 Parent(s): 1551402

Upload folder using huggingface_hub
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 896,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
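
This pooling configuration selects plain mean pooling: the transformer's token embeddings are averaged, with padding masked out, and `include_prompt: true` means prompt tokens (when present) are included in the average. A minimal sketch of the mean-pooling operation, assuming PyTorch (illustrative only, not the library's exact code):

```python
import torch

def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token embeddings over real (non-padding) positions.

    token_embeddings: (batch, seq_len, 896) transformer hidden states
    attention_mask:   (batch, seq_len), 1 for real tokens, 0 for padding
    """
    mask = attention_mask.unsqueeze(-1).float()      # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(dim=1)    # sum embeddings of real tokens
    counts = mask.sum(dim=1).clamp(min=1e-9)         # token counts, guarded against zero
    return summed / counts                           # (batch, 896)
```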
README.md ADDED
@@ -0,0 +1,463 @@
+ ---
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:21769
+ - loss:MultipleNegativesRankingLoss
+ base_model: am-azadi/KaLM-embedding-multilingual-mini-v1_Fine_Tuned_1e
+ widget:
+ - source_sentence: Day 1 - Job Losses Biden CANCELS Keystone Pipeline on - Day 1.
+     XX 83k jobs lost. XX Get ready for $4 Gas prices by Summer.
+   sentences:
+   - Pedro Castillo manipulated a photograph of a rally Photo of a Castillo rally in
+     Peru was altered to show alleged manipulation
+   - Biden's Keystone XL pipeline reversal will cost tens of thousands of jobs Posts
+     inflate job losses from Biden's Keystone pipeline reversal
+   - Delete System32 folder to clear your privacy browsing history? This hoax has been
+     around for years, don’t delete System32 folder
+ - source_sentence: ' Thiago Brazil The "Russian missile" turns cars and destroys
+     the street but the windows of the buildings did not break down and neither did
+     smeared with dust. 1 more fake advertising piece of the neo-Nazi Zelensky. need
+     do more next time!!'
+   sentences:
+   - Intact windows in Ukraine prove Russian attack simulation Intact windows and overturned
+     cars do not prove simulated attack in Bucha, Ukraine
+   - Editorial of the newspaper Le Monde says that Mexico is on the way to ruin The
+     newspaper Le Monde did not publish an editorial titled "Mexico on a direct path
+     to ruin"
+   - A photo of 189-year-old Jonathan the turtle This photo does not show the 189-year-old
+     Jonathan tortoise, but a giant tortoise in Australia
+ - source_sentence: ' madri Greta Thunberg, urged the Chinese to do without traditional
+     chopsticks to protect the trees The Chinese then asked Greta to return to the
+     school she was in could find out that traditional Chopsticks are made from bamboo,
+     and bamboo is a grass! The Chinese have Greta and her friends too asked to give
+     up toilet paper to wipe her ass ''cause this one will made from trees.'
+   sentences:
+   - This is how they repress in Bolivia tear gas grenade to the head The man killed
+     by the shot of a tear gas grenade to the head was the victim of the repression
+     of a protest in Iraq, not in Bolivia
+   - Greta Thunberg urged China to ban chopsticks There is no evidence for Greta Thunberg's
+     demand for a ban on chopsticks
+   - Accurate reporting on Pfizer-BioNTech Covid-19 vaccination drive and deaths in
+     Germany Social media posts misrepresent Pfizer-BioNTech Covid-19 vaccinations
+     in Germany
+ - source_sentence: Do you want to know the truth? The "Vaccine" they are voting for
+     is mRNA, it circulates through the blood until it finds a receptor for the enzyme
+     ACE2, which exists mainly in the testicles, a little in the ovaries, and in the
+     myelin of neurons. The m is for messenger, through the receptor, the RNA penetrates
+     the cell and rewrites its genetic code. The cell no longer serves what nature
+     created it for, it serves to create what the laboratories designed RNA for. Theoretically
+     "defenses". The price is that 97% of the inoculated males will remain sterile,
+     but also, if they are young children, they will never develop secondary sexual
+     characteristics. They will be androgynous, without sexual desire, or very moderate,
+     and probably more manageable and obedient. 45% of girls will be sterile. Neuronal
+     damage, on the other hand, will affect part of your frontal cortex. You will be
+     able to work, even drive a car, but not think deeply. Perfect slaves of the New
+     Normality. It is PHASE 3 of the plan, as it was finalized in EVENT 201. PHASE
+     1 was to scare you, isolate you and lock you up, due to a virus that, as a single
+     cause, only killed a tiny handful of people. Much less than last year's flu. PHASE
+     2 makes you wear a grotesque and useless mask, which depersonalizes you and deprives
+     you of oxygen. May you lose your job, partner and affections. PHASE 3, when you
+     are already desperate, is the "Vaccine". They're going to tell you I'm lying,
+     so ask what's in it. They will answer that by law not even doctors can know. Secret.
+     You don't have the maturity to know it, trust the government, the media, the WHO
+     and the employees of George Soros and Bill Gates, like Dr. Pedro Cahn and his
+     sinister Fundación Huésped. They will tell you that the laboratories are responsible,
+     but by law you will not be able to claim anyone. Let's see, when your balls dry,
+     you'll only have to cry about what's left of them. When you know that you will
+     never have a grandchild, that you will never see your son become a father, nor
+     graduate from a career, because his brain will be lobotomized. The opposition"?
+     I don't want to make you bitter, but 90% receive money from Soros' Open Society,
+     from the Bill and Melinda Gates Foundation, from the Ford Foundation, from the
+     Rockefeller Foundation, from the Chinese Communist Party through its figurehead,
+     the investor Ming Wai Lau. Who will tell you the truth? Dr. Roxana Bruno, Dr.
+     Chinda Brandolino, Dr. Heiko Schöning, Doctors for Truth, Lawyers for Truth, Teachers
+     for Truth. Turn off the TV, burn your muzzle, breathe, go out to hug your parents,
+     embrace freedom, don't let any doctor who isn't the truth touch you. In this way
+     the Global Sanitary Dictatorship will collapse, and we will wake up from this
+     nightmare. From the wall of Horacio Rivara CLIPARTO CLIPARTO CLIPARTO CLIPARTO
+     CLIPARTO CLIPARTO C C
+   sentences:
+   - Today- Mayor of São Paulo thanking the Bolsonaro government on vaccines against
+     covid-19 Video with thanks to the federal government was for funds for a hospital
+     in May 2020
+   - This photo shows Bolsonaro surrounded by global leaders at a G20 meeting No, the
+     photo is a montage and who appears in the original is US President Donald Trump
+   - Vaccines to prevent covid-19 cause infertility Vaccines to prevent covid-19 are
+     not designed to affect fertility
+ - source_sentence: 'The moment of the death of President Mohamed Morsi, may God have
+     mercy on him, God willing '
+   sentences:
+   - Cuba has Interferon Alpha 2B, the cure, the vaccine against the coronavirus The
+     Cuban antiviral Interferon Alfa 2B is used in China to treat patients with the
+     new coronavirus, but it is neither a vaccine nor a cure
+   - 'José Antonio Kast said: "Juvenile delinquent of 16 or 17 years will not go to
+     sename, he will do compulsory military service" Chilean presidential candidate
+     Kast does not propose that young people who commit crimes do military service'
+   - The moment of the death of President Mohamed Morsi This video belongs to the trial
+     of those accused of the Port Said events and does not show the moment of the death
+     of former Egyptian President Mohamed Morsi
+ pipeline_tag: sentence-similarity
+ library_name: sentence-transformers
+ ---
+ 
+ # SentenceTransformer based on am-azadi/KaLM-embedding-multilingual-mini-v1_Fine_Tuned_1e
+ 
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [am-azadi/KaLM-embedding-multilingual-mini-v1_Fine_Tuned_1e](https://huggingface.co/am-azadi/KaLM-embedding-multilingual-mini-v1_Fine_Tuned_1e). It maps sentences & paragraphs to an 896-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+ 
+ ## Model Details
+ 
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [am-azadi/KaLM-embedding-multilingual-mini-v1_Fine_Tuned_1e](https://huggingface.co/am-azadi/KaLM-embedding-multilingual-mini-v1_Fine_Tuned_1e) <!-- at revision db90e52b2078421f04b71e31b5a90f5bf8d321d7 -->
+ - **Maximum Sequence Length:** 512 tokens
+ - **Output Dimensionality:** 896 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+ 
+ ### Model Sources
+ 
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+ 
+ ### Full Model Architecture
+ 
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: Qwen2Model
+   (1): Pooling({'word_embedding_dimension': 896, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
+ 
+ ## Usage
+ 
+ ### Direct Usage (Sentence Transformers)
+ 
+ First install the Sentence Transformers library:
+ 
+ ```bash
+ pip install -U sentence-transformers
+ ```
+ 
+ Then you can load this model and run inference:
+ ```python
+ from sentence_transformers import SentenceTransformer
+ 
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("sentence_transformers_model_id")
+ # Run inference
+ sentences = [
+     'The moment of the death of President Mohamed Morsi, may God have mercy on him, God willing ',
+     'The moment of the death of President Mohamed Morsi This video belongs to the trial of those accused of the Port Said events and does not show the moment of the death of former Egyptian President Mohamed Morsi',
+     'Cuba has Interferon Alpha 2B, the cure, the vaccine against the coronavirus The Cuban antiviral Interferon Alfa 2B is used in China to treat patients with the new coronavirus, but it is neither a vaccine nor a cure',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 896]
+ 
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
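+ 
+ For example, given the fact-checking flavor of the training pairs, you can rank candidate fact-checks against a new claim. A minimal sketch continuing from the snippet above; the claim and candidate strings are invented for illustration:
+ 
+ ```python
+ # Reuses `model` from the snippet above.
+ claims = ["Intact windows prove the attack was staged"]
+ fact_checks = [
+     "Intact windows and overturned cars do not prove simulated attack in Bucha, Ukraine",
+     "Posts inflate job losses from Biden's Keystone pipeline reversal",
+ ]
+ scores = model.similarity(model.encode(claims), model.encode(fact_checks))  # shape [1, 2]
+ print(fact_checks[int(scores.argmax())])  # candidate with the highest cosine similarity
+ ```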
+ 
+ <!--
+ ### Direct Usage (Transformers)
+ 
+ <details><summary>Click to see the direct usage in Transformers</summary>
+ 
+ </details>
+ -->
+ 
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+ 
+ You can finetune this model on your own dataset.
+ 
+ <details><summary>Click to expand</summary>
+ 
+ </details>
+ -->
+ 
+ <!--
+ ### Out-of-Scope Use
+ 
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+ 
+ <!--
+ ## Bias, Risks and Limitations
+ 
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+ 
+ <!--
+ ### Recommendations
+ 
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+ 
+ ## Training Details
+ 
+ ### Training Dataset
+ 
+ #### Unnamed Dataset
+ 
+ * Size: 21,769 training samples
+ * Columns: <code>sentence_0</code> and <code>sentence_1</code>
+ * Approximate statistics based on the first 1000 samples:
+   |         | sentence_0 | sentence_1 |
+   |:--------|:-----------|:-----------|
+   | type    | string     | string     |
+   | details | <ul><li>min: 4 tokens</li><li>mean: 109.9 tokens</li><li>max: 512 tokens</li></ul> | <ul><li>min: 14 tokens</li><li>mean: 34.57 tokens</li><li>max: 132 tokens</li></ul> |
+ * Samples:
+   | sentence_0 | sentence_1 |
+   |:-----------|:-----------|
+   | <code>Sad, and we are hostages!! If that doesn't make you think about the “measures” that governors and mayors are taking there is nothing that can be made in Brazil</code> | <code>Action by a military police officer against a street vendor amid restrictive measures due to the covid-19 pandemic The photo in which a PM seizes products from a street vendor is from 2016, unrelated to the pandemic</code> |
+   | <code> This is why it's important to know your history d4 Rare photo of Queen Elizabeth II and Prince Phillip bowing before the real original African Royalty, Empress Menen Asfaw and her husband Emperor Ras Tafari Makonnen Woldemikael Haile Selassie I of Ethiopia...</code> | <code>Queen Elizabeth II bows before Ethiopian Emperor Haile Selassie British monarch's first visit to Ethiopia came 10 years after this photo was taken</code> |
+   | <code>Public Reaction on Hyderabad Priyanka Reddy Case Drabad Encounter . Common people say</code> | <code>Photo of suspects killed by police in Hyderabad rape-murder case This photo has circulated online since at least 2015 in connection with an unrelated case in a different Indian state</code> |
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+   ```json
+   {
+       "scale": 20.0,
+       "similarity_fct": "cos_sim"
+   }
+   ```
+ 
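+ With this loss, each `sentence_0` is pulled toward its paired `sentence_1` while every other `sentence_1` in the batch serves as an in-batch negative. A minimal sketch of the objective (illustrative, not the library's exact implementation):
+ 
+ ```python
+ import torch
+ import torch.nn.functional as F
+ 
+ def mnr_loss(anchors: torch.Tensor, positives: torch.Tensor, scale: float = 20.0) -> torch.Tensor:
+     """anchors, positives: (batch, dim) L2-normalized embeddings of paired sentences."""
+     scores = anchors @ positives.T * scale                         # scaled cosine similarities, (batch, batch)
+     labels = torch.arange(scores.size(0), device=scores.device)    # row i's positive sits on the diagonal
+     return F.cross_entropy(scores, labels)                         # in-batch softmax ranking
+ ```
+ 
+ Note that with the `per_device_train_batch_size` of 2 used below, each anchor sees only a single in-batch negative.
+ 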
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+ 
+ - `per_device_train_batch_size`: 2
+ - `per_device_eval_batch_size`: 2
+ - `num_train_epochs`: 1
+ - `multi_dataset_batch_sampler`: round_robin
+ 
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+ 
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: no
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 2
+ - `per_device_eval_batch_size`: 2
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 5e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1
+ - `num_train_epochs`: 1
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.0
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: False
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: None
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `include_for_metrics`: []
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`: 
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `eval_use_gather_object`: False
+ - `average_tokens_across_devices`: False
+ - `prompts`: None
+ - `batch_sampler`: batch_sampler
+ - `multi_dataset_batch_sampler`: round_robin
+ 
+ </details>
+ 
+ ### Training Logs
+ | Epoch  | Step  | Training Loss |
+ |:------:|:-----:|:-------------:|
+ | 0.0459 | 500   | 0.0083        |
+ | 0.0919 | 1000  | 0.019         |
+ | 0.1378 | 1500  | 0.0255        |
+ | 0.1837 | 2000  | 0.0372        |
+ | 0.2297 | 2500  | 0.0315        |
+ | 0.2756 | 3000  | 0.0258        |
+ | 0.3215 | 3500  | 0.0211        |
+ | 0.3675 | 4000  | 0.0187        |
+ | 0.4134 | 4500  | 0.0264        |
+ | 0.4593 | 5000  | 0.0348        |
+ | 0.5053 | 5500  | 0.0197        |
+ | 0.5512 | 6000  | 0.0102        |
+ | 0.5972 | 6500  | 0.0092        |
+ | 0.6431 | 7000  | 0.0169        |
+ | 0.6890 | 7500  | 0.0109        |
+ | 0.7350 | 8000  | 0.0115        |
+ | 0.7809 | 8500  | 0.0173        |
+ | 0.8268 | 9000  | 0.0196        |
+ | 0.8728 | 9500  | 0.028         |
+ | 0.9187 | 10000 | 0.0218        |
+ | 0.9646 | 10500 | 0.0169        |
+ | 0.0459 | 500   | 0.004         |
+ | 0.0919 | 1000  | 0.02          |
+ | 0.1378 | 1500  | 0.0154        |
+ | 0.1837 | 2000  | 0.0141        |
+ | 0.2297 | 2500  | 0.014         |
+ | 0.2756 | 3000  | 0.0077        |
+ | 0.3215 | 3500  | 0.018         |
+ | 0.3675 | 4000  | 0.0079        |
+ | 0.4134 | 4500  | 0.0238        |
+ | 0.4593 | 5000  | 0.0183        |
+ | 0.5053 | 5500  | 0.0159        |
+ | 0.5512 | 6000  | 0.0043        |
+ | 0.5972 | 6500  | 0.0066        |
+ | 0.6431 | 7000  | 0.0068        |
+ | 0.6890 | 7500  | 0.0035        |
+ | 0.7350 | 8000  | 0.0042        |
+ | 0.7809 | 8500  | 0.0084        |
+ | 0.8268 | 9000  | 0.0049        |
+ | 0.8728 | 9500  | 0.0102        |
+ | 0.9187 | 10000 | 0.0048        |
+ | 0.9646 | 10500 | 0.0045        |
+ 
+ 
+ ### Framework Versions
+ - Python: 3.11.11
+ - Sentence Transformers: 3.4.1
+ - Transformers: 4.48.3
+ - PyTorch: 2.5.1+cu124
+ - Accelerate: 1.3.0
+ - Datasets: 3.3.2
+ - Tokenizers: 0.21.0
+ 
+ ## Citation
+ 
+ ### BibTeX
+ 
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+ 
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+ 
+ <!--
+ ## Glossary
+ 
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+ 
+ <!--
+ ## Model Card Authors
+ 
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+ 
+ <!--
+ ## Model Card Contact
+ 
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
added_tokens.json ADDED
@@ -0,0 +1,5 @@
+ {
+   "<|endoftext|>": 151643,
+   "<|im_end|>": 151645,
+   "<|im_start|>": 151644
+ }
config.json ADDED
@@ -0,0 +1,29 @@
+ {
+   "_name_or_path": "am-azadi/KaLM-embedding-multilingual-mini-v1_Fine_Tuned_1e",
+   "architectures": [
+     "Qwen2Model"
+   ],
+   "attention_dropout": 0.0,
+   "bos_token_id": 151643,
+   "eos_token_id": 151643,
+   "hidden_act": "silu",
+   "hidden_size": 896,
+   "initializer_range": 0.02,
+   "intermediate_size": 4864,
+   "max_position_embeddings": 131072,
+   "max_window_layers": 24,
+   "model_type": "qwen2",
+   "num_attention_heads": 14,
+   "num_hidden_layers": 24,
+   "num_key_value_heads": 2,
+   "rms_norm_eps": 1e-06,
+   "rope_scaling": null,
+   "rope_theta": 1000000.0,
+   "sliding_window": null,
+   "tie_word_embeddings": true,
+   "torch_dtype": "float32",
+   "transformers_version": "4.48.3",
+   "use_cache": false,
+   "use_sliding_window": false,
+   "vocab_size": 151936
+ }
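
Two sanity checks on this config: `hidden_size: 896` is the source of the 896-dimensional sentence embeddings reported in the model card above, and with `torch_dtype: float32` the 1,976,161,736-byte `model.safetensors` below works out to roughly 494M parameters (1976161736 / 4 bytes per weight).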
config_sentence_transformers.json ADDED
@@ -0,0 +1,13 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.4.1",
+     "transformers": "4.48.3",
+     "pytorch": "2.5.1+cu124"
+   },
+   "prompts": {
+     "query": "",
+     "document": ""
+   },
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e2213f66a3569b4a5ad7a214cc411d38c55182a2dae8de135a61b0a13e6814d9
+ size 1976161736
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
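
These three entries describe the pipeline that `SentenceTransformer(...)` assembles at load time. A roughly equivalent manual construction, as a sketch only ("path/to/checkpoint" is a placeholder, not a path from this repo):

```python
from sentence_transformers import SentenceTransformer, models

# Sketch of what modules.json describes; "path/to/checkpoint" is a placeholder.
transformer = models.Transformer("path/to/checkpoint", max_seq_length=512)  # idx 0: Qwen2 encoder
pooling = models.Pooling(896, pooling_mode="mean")                          # idx 1: the 1_Pooling config
normalize = models.Normalize()                                              # idx 2: unit-length embeddings
model = SentenceTransformer(modules=[transformer, pooling, normalize])
```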
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,20 @@
+ {
+   "additional_special_tokens": [
+     "<|im_start|>",
+     "<|im_end|>"
+   ],
+   "eos_token": {
+     "content": "<|endoftext|>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<|endoftext|>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c8eab352ec6bb19236f8684bf40e504e86876bd4f2f43982b0561b2f07702666
+ size 11418805
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
+ {
+   "add_prefix_space": false,
+   "added_tokens_decoder": {
+     "151643": {
+       "content": "<|endoftext|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151644": {
+       "content": "<|im_start|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "151645": {
+       "content": "<|im_end|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "additional_special_tokens": [
+     "<|im_start|>",
+     "<|im_end|>"
+   ],
+   "auto_map": {
+     "AutoTokenizer": [
+       "HIT-TMG/KaLM-embedding-multilingual-mini-v1--tokenization_qwen.Qwen2Tokenizer",
+       "HIT-TMG/KaLM-embedding-multilingual-mini-v1--tokenization_qwen.Qwen2TokenizerFast"
+     ]
+   },
+   "bos_token": null,
+   "chat_template": "{% for message in messages %}{% if loop.first and messages[0]['role'] != 'system' %}{{ '<|im_start|>system\nYou are a helpful assistant<|im_end|>\n' }}{% endif %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}",
+   "clean_up_tokenization_spaces": false,
+   "eos_token": "<|endoftext|>",
+   "errors": "replace",
+   "extra_special_tokens": {},
+   "max_length": 512,
+   "model_max_length": 512,
+   "pad_to_multiple_of": null,
+   "pad_token": "<|endoftext|>",
+   "pad_token_type_id": 0,
+   "padding_side": "left",
+   "split_special_tokens": false,
+   "stride": 0,
+   "tokenizer_class": "Qwen2Tokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": null
+ }
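
With `model_max_length: 512` and `truncation_side: right`, inputs longer than 512 tokens are cut from the end before encoding, matching the model's maximum sequence length. A quick way to check this, assuming the checkpoint path or Hub id in place of the placeholder (the `auto_map` entry above points at remote tokenizer code, so `trust_remote_code=True` may be required):

```python
from transformers import AutoTokenizer

# "path/to/repo" is a placeholder for this model's checkpoint directory or Hub id.
tok = AutoTokenizer.from_pretrained("path/to/repo", trust_remote_code=True)
enc = tok("some very long text " * 1000, truncation=True)
print(len(enc["input_ids"]))  # capped at 512 by model_max_length
```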
vocab.json ADDED
The diff for this file is too large to render. See raw diff