rahulseetharaman committed (verified)
Commit b8f5f4a · 1 Parent(s): 70bc068

Add new CrossEncoder model
README.md ADDED
@@ -0,0 +1,505 @@
---
language:
- en
tags:
- sentence-transformers
- cross-encoder
- reranker
- generated_from_trainer
- dataset_size:78704
- loss:ListNetLoss
base_model: bansalaman18/bert-uncased_L-10_H-512_A-8
datasets:
- microsoft/ms_marco
pipeline_tag: text-ranking
library_name: sentence-transformers
metrics:
- map
- mrr@10
- ndcg@10
model-index:
- name: CrossEncoder based on bansalaman18/bert-uncased_L-10_H-512_A-8
  results:
  - task:
      type: cross-encoder-reranking
      name: Cross Encoder Reranking
    dataset:
      name: NanoMSMARCO R100
      type: NanoMSMARCO_R100
    metrics:
    - type: map
      value: 0.077
      name: Map
    - type: mrr@10
      value: 0.05
      name: Mrr@10
    - type: ndcg@10
      value: 0.0612
      name: Ndcg@10
  - task:
      type: cross-encoder-reranking
      name: Cross Encoder Reranking
    dataset:
      name: NanoNFCorpus R100
      type: NanoNFCorpus_R100
    metrics:
    - type: map
      value: 0.2696
      name: Map
    - type: mrr@10
      value: 0.3686
      name: Mrr@10
    - type: ndcg@10
      value: 0.2518
      name: Ndcg@10
  - task:
      type: cross-encoder-reranking
      name: Cross Encoder Reranking
    dataset:
      name: NanoNQ R100
      type: NanoNQ_R100
    metrics:
    - type: map
      value: 0.0734
      name: Map
    - type: mrr@10
      value: 0.056
      name: Mrr@10
    - type: ndcg@10
      value: 0.0866
      name: Ndcg@10
  - task:
      type: cross-encoder-nano-beir
      name: Cross Encoder Nano BEIR
    dataset:
      name: NanoBEIR R100 mean
      type: NanoBEIR_R100_mean
    metrics:
    - type: map
      value: 0.14
      name: Map
    - type: mrr@10
      value: 0.1582
      name: Mrr@10
    - type: ndcg@10
      value: 0.1332
      name: Ndcg@10
---

# CrossEncoder based on bansalaman18/bert-uncased_L-10_H-512_A-8

This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model finetuned from [bansalaman18/bert-uncased_L-10_H-512_A-8](https://huggingface.co/bansalaman18/bert-uncased_L-10_H-512_A-8) on the [ms_marco](https://huggingface.co/datasets/microsoft/ms_marco) dataset using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.

## Model Details

### Model Description
- **Model Type:** Cross Encoder
- **Base model:** [bansalaman18/bert-uncased_L-10_H-512_A-8](https://huggingface.co/bansalaman18/bert-uncased_L-10_H-512_A-8) <!-- at revision bcd427de8f981c76647be3705e2e00da7be40dbe -->
- **Maximum Sequence Length:** 512 tokens
- **Number of Output Labels:** 1 label
- **Training Dataset:**
    - [ms_marco](https://huggingface.co/datasets/microsoft/ms_marco)
- **Language:** en
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Documentation:** [Cross Encoder Documentation](https://www.sbert.net/docs/cross_encoder/usage/usage.html)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Cross Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=cross-encoder)

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("rahulseetharaman/reranker-msmarco-v1.1-bert-uncased_L-10_H-512_A-8-listnet")
# Get scores for pairs of texts
pairs = [
    ['largest mammal on earth', "The Blue whale is the largest mammal on earth. average size of the whale is 87 feet long. that's about as big as a passenger jet. it is also the loudest animal on the planet. … And the whale shark is a fish. The blue whale of course being the largest. Blue whale is not only the largest mammal on earth … , but the largest animal of any kind on the earth. Better still, the blue whale, is the largest animal to ever exist"],
    ['largest mammal on earth', 'The Largest Animal Ever: The Blue Whale. The blue whale (Balaenoptera musculus) is a marine mammal belonging to the suborder of baleen whales. At 30 metres (98 ft) in length and 180 metric tons (200 short tons) or more in weight, it is the largest known animal to have ever existed. The African Bush Elephant is the largest living terrestrial (land) animal, with males reaching 6 to 7.5 metres (19.7 to 24.6 ft) in length, 3.3 metres (10.8 ft) in height at the shoulder, and weighing 6 t (13,000 lb).'],
    ['largest mammal on earth', "By considerable measure, the largest known animal on Earth is the blue whale. Mature blue whales can measure anywhere from 75 feet (23 m) to 100 feet (30.5 m) from head to tail, and can weigh as much as 150 tons (136 metric tons). Most people believe that the largest animals to ever exist on Earth were the dinosaurs. However, one of the largest land dinosaurs, the sauropod Argentinosaurus, weighed only about 180,000 pounds (81,647 kg). That's little more than half the size of an adult blue whale"],
    ['largest mammal on earth', 'When a blue whale exhales, the spray from its blowhole can reach nearly 30 ft (9m) into the air. Size relative to a bus: Blue whales are the largest animals ever known to have lived on Earth. These magnificent marine mammals rule the oceans at up to 100 feet (30 meters) long and upwards of 200 tons (181 metric tons). Their tongues alone can weigh as much as an elephant. Their hearts, as much as an automobile. '],
    ['largest mammal on earth', 'The largest mammal on Earth is also the largest animal to ever live on land or in the ocean, dwarfing even the largest Dinosaur, and is the Blue Whale. The blue whale of course being the largest. Blue whale is not only the largest mammal on earth … , but the largest animal of any kind on the earth. Better still, the blue whale, is the largest animal to ever exist'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'largest mammal on earth',
    [
        "The Blue whale is the largest mammal on earth. average size of the whale is 87 feet long. that's about as big as a passenger jet. it is also the loudest animal on the planet. … And the whale shark is a fish. The blue whale of course being the largest. Blue whale is not only the largest mammal on earth … , but the largest animal of any kind on the earth. Better still, the blue whale, is the largest animal to ever exist",
        'The Largest Animal Ever: The Blue Whale. The blue whale (Balaenoptera musculus) is a marine mammal belonging to the suborder of baleen whales. At 30 metres (98 ft) in length and 180 metric tons (200 short tons) or more in weight, it is the largest known animal to have ever existed. The African Bush Elephant is the largest living terrestrial (land) animal, with males reaching 6 to 7.5 metres (19.7 to 24.6 ft) in length, 3.3 metres (10.8 ft) in height at the shoulder, and weighing 6 t (13,000 lb).',
        "By considerable measure, the largest known animal on Earth is the blue whale. Mature blue whales can measure anywhere from 75 feet (23 m) to 100 feet (30.5 m) from head to tail, and can weigh as much as 150 tons (136 metric tons). Most people believe that the largest animals to ever exist on Earth were the dinosaurs. However, one of the largest land dinosaurs, the sauropod Argentinosaurus, weighed only about 180,000 pounds (81,647 kg). That's little more than half the size of an adult blue whale",
        'When a blue whale exhales, the spray from its blowhole can reach nearly 30 ft (9m) into the air. Size relative to a bus: Blue whales are the largest animals ever known to have lived on Earth. These magnificent marine mammals rule the oceans at up to 100 feet (30 meters) long and upwards of 200 tons (181 metric tons). Their tongues alone can weigh as much as an elephant. Their hearts, as much as an automobile. ',
        'The largest mammal on Earth is also the largest animal to ever live on land or in the ocean, dwarfing even the largest Dinosaur, and is the Blue Whale. The blue whale of course being the largest. Blue whale is not only the largest mammal on earth … , but the largest animal of any kind on the earth. Better still, the blue whale, is the largest animal to ever exist',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
```

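In practice a cross-encoder like this is usually paired with a fast first-stage retriever that narrows the corpus to a handful of candidates. The following is a minimal retrieve-then-rerank sketch; the bi-encoder choice (`sentence-transformers/all-MiniLM-L6-v2`) and the toy corpus are illustrative assumptions, not part of this model's setup:

```python
from sentence_transformers import CrossEncoder, SentenceTransformer, util

# Assumed retriever for illustration; any bi-encoder works here.
retriever = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
reranker = CrossEncoder("rahulseetharaman/reranker-msmarco-v1.1-bert-uncased_L-10_H-512_A-8-listnet")

corpus = [
    "The blue whale is the largest mammal on earth.",
    "The African bush elephant is the largest living land animal.",
    "MOLLE uses rows of heavy-duty nylon webbing for attaching pouches.",
]
query = "largest mammal on earth"

# Stage 1: embed the corpus and retrieve top candidates with the bi-encoder.
corpus_emb = retriever.encode(corpus, convert_to_tensor=True)
query_emb = retriever.encode(query, convert_to_tensor=True)
hits = util.semantic_search(query_emb, corpus_emb, top_k=3)[0]

# Stage 2: rescore only the retrieved candidates with the cross-encoder.
candidates = [corpus[hit["corpus_id"]] for hit in hits]
for entry in reranker.rank(query, candidates):
    print(f'{entry["score"]:.4f}', candidates[entry["corpus_id"]])
```

The bi-encoder keeps stage 1 cheap (one embedding per document, computed once), while the cross-encoder spends its full attention budget on each query-candidate pair in stage 2.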
<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Cross Encoder Reranking

* Datasets: `NanoMSMARCO_R100`, `NanoNFCorpus_R100` and `NanoNQ_R100`
* Evaluated with [<code>CrossEncoderRerankingEvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderRerankingEvaluator) with these parameters:
  ```json
  {
      "at_k": 10,
      "always_rerank_positives": true
  }
  ```

| Metric      | NanoMSMARCO_R100     | NanoNFCorpus_R100    | NanoNQ_R100          |
|:------------|:---------------------|:---------------------|:---------------------|
| map         | 0.0770 (-0.4126)     | 0.2696 (+0.0087)     | 0.0734 (-0.3463)     |
| mrr@10      | 0.0500 (-0.4275)     | 0.3686 (-0.1313)     | 0.0560 (-0.3707)     |
| **ndcg@10** | **0.0612 (-0.4792)** | **0.2518 (-0.0732)** | **0.0866 (-0.4141)** |

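To reproduce this style of evaluation on your own data, the evaluator can be constructed directly. A minimal sketch with toy samples; the sample format shown (dicts with `query`, `positive`, and candidate `documents`) is assumed from the evaluator's documented interface, and the actual results above were computed on the NanoBEIR subsets, not this toy data:

```python
from sentence_transformers import CrossEncoder
from sentence_transformers.cross_encoder.evaluation import CrossEncoderRerankingEvaluator

model = CrossEncoder("rahulseetharaman/reranker-msmarco-v1.1-bert-uncased_L-10_H-512_A-8-listnet")

# Toy samples for illustration only.
samples = [
    {
        "query": "largest mammal on earth",
        "positive": ["The blue whale is the largest mammal on earth."],
        "documents": [
            "The blue whale is the largest mammal on earth.",
            "The African bush elephant is the largest living land animal.",
            "MOLLE uses rows of heavy-duty nylon webbing.",
        ],
    },
]

evaluator = CrossEncoderRerankingEvaluator(
    samples=samples,
    at_k=10,
    always_rerank_positives=True,
    name="toy-rerank",
)
results = evaluator(model)
print(results)  # dict of map / mrr@10 / ndcg@10 style metrics
```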
#### Cross Encoder Nano BEIR

* Dataset: `NanoBEIR_R100_mean`
* Evaluated with [<code>CrossEncoderNanoBEIREvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderNanoBEIREvaluator) with these parameters:
  ```json
  {
      "dataset_names": [
          "msmarco",
          "nfcorpus",
          "nq"
      ],
      "rerank_k": 100,
      "at_k": 10,
      "always_rerank_positives": true
  }
  ```

| Metric      | Value                |
|:------------|:---------------------|
| map         | 0.1400 (-0.2501)     |
| mrr@10      | 0.1582 (-0.3098)     |
| **ndcg@10** | **0.1332 (-0.3222)** |

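The aggregate row above is the mean over the three NanoBEIR subsets. A minimal sketch of re-running this evaluation, with the constructor arguments mirroring the JSON parameters listed above (the small Nano datasets are downloaded automatically):

```python
from sentence_transformers import CrossEncoder
from sentence_transformers.cross_encoder.evaluation import CrossEncoderNanoBEIREvaluator

model = CrossEncoder("rahulseetharaman/reranker-msmarco-v1.1-bert-uncased_L-10_H-512_A-8-listnet")

# Parameters copied from the evaluator configuration above.
evaluator = CrossEncoderNanoBEIREvaluator(
    dataset_names=["msmarco", "nfcorpus", "nq"],
    rerank_k=100,
    at_k=10,
    always_rerank_positives=True,
)
results = evaluator(model)
print(results)  # per-dataset and mean map / mrr@10 / ndcg@10
```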
<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### ms_marco

* Dataset: [ms_marco](https://huggingface.co/datasets/microsoft/ms_marco) at [a47ee7a](https://huggingface.co/datasets/microsoft/ms_marco/tree/a47ee7aae8d7d466ba15f9f0bfac3b3681087b3a)
* Size: 78,704 training samples
* Columns: <code>query</code>, <code>docs</code>, and <code>labels</code>
* Approximate statistics based on the first 1000 samples:
  |         | query                                                                                           | docs                                                                                    | labels                                                                                  |
  |:--------|:------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------|
  | type    | string                                                                                          | list                                                                                    | list                                                                                    |
  | details | <ul><li>min: 9 characters</li><li>mean: 33.93 characters</li><li>max: 100 characters</li></ul> | <ul><li>min: 2 elements</li><li>mean: 6.38 elements</li><li>max: 10 elements</li></ul> | <ul><li>min: 2 elements</li><li>mean: 6.38 elements</li><li>max: 10 elements</li></ul> |
* Samples:
  | query | docs | labels |
  |:------|:-----|:-------|
  | <code>what animals does wool come from</code> | <code>['Wool comes from animals who grow it. The most common animal is sheep.There are others which include lama, alpaca ,and other mammals. Sheep wool is also very popular along with Cashmere wool which is collected from cashmere goats. Sheep are often farmed in places like Australia.', 'Goats. Mohair and cashmere come from goats. Mohair comes from the Angora goat, which produces a long shaggy coat that is generally clipped twice per year. According to the Department of Animal Science at Oklahoma State University, each shearing will yield approximately 5.3 pounds of wool.', 'Wool is the fibre derived from the fur of animals of the Caprinae family, principally sheep, but the hair of certain species of other mammals such as goats, alpacas and rabbits may also be called wool. This article deals explicitly with the wool produced from domestic sheep.', 'Fashionistas with a flair for fabrics know that wool can come from many animals, and that some wools are more desirable than others. Wools range from soft and silky ones used for clothing to coarse and strong wools used for rugs.', 'Confidence votes 259. Wool comes from animals who grow it. The most common animal is sheep.There are others which include lama, alpaca ,and other mammals. Sheep wool is also very popular along with Cashmere wool which is collected from cashmere goats.']</code> | <code>[1, 0, 0, 0, 0]</code> |
  | <code>what is ecoflex</code> | <code>['ecoFLEX™: A blend of recycled polymers and wood byproducts, ecoFLEX exhibits the best characteristics of the polymers and wood that forms the basis for the material. ecoFLEX is the best of both. ecoFLEX is made by: The material is mixed with UV inhibitors and color blending compounds and granulated into a form that is used to mold the parts that are used to make an ecoFLEX based product.', 'Ecoflex is a pre-insulated pipe system at its best — versatile, flexible and durable. Ecoflex consists of Uponor crosslinked polyethylene (PEX) or high-density polyethylene (HDPE) pipe encased in closed-cell, PEX-foam insulation and covered by a watertight, corrugated HDPE jacket. Potable HDPE pipe is available in 1¼, 1½, 2, 3 and 4 sizes. 2 Ecoflex Potable Plus (one Uponor AquaPEX service pipe with heat trace) is available on a made-to-order basis. 3 Ecoflex Thermal and Potable PEX products up to 2 can use Uponor’s ProPEX® expansion fitting system.', 'JOIN THE QUIET REVOLUTION. Vauxhall’s ecoFL...</code> | <code>[1, 0, 0, 0, 0, ...]</code> |
  | <code>common cold incubation period, period of illness and contagious period</code> | <code>['Most experts agree that adults with a cold or the flu start being contagious about a day before they start experiencing symptoms. For the flu, the contagious period then lasts five to seven days into the illness. For children, the contagious period for the flu can last up to two weeks after they start feeling sick, even if they start feeling better before that. The contagious period for a cold lasts about three to four days into the illness. As a general rule, people with a cold are most contagious about three days after their initial exposure to the virus. For more information on colds, influenza and related topics, check out the links on the next page. Print.', 'Confidence votes 40.4K. The incubation period of the common cold is typically 36 to 48 hours, but, in some cases, symptoms can arise within 12 hours, or slightly less, of catching the virus. There are over 200 viruses responsible for the common cold. According to the US Centers for Disease Control and Prevention (CDC), the ...</code> | <code>[1, 1, 0, 0, 0, ...]</code> |
* Loss: [<code>ListNetLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#listnetloss) with these parameters:
  ```json
  {
      "activation_fn": "torch.nn.modules.linear.Identity",
      "mini_batch_size": 16
  }
  ```

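Putting the pieces of this section together: each training sample is one query, a list of candidate docs, and a parallel list of 0/1 relevance labels, which ListNetLoss consumes listwise. A minimal training sketch under those assumptions, with a one-row toy dataset standing in for the processed ms_marco split and hyperparameters taken from the values listed under Training Hyperparameters below:

```python
import torch
from datasets import Dataset
from sentence_transformers.cross_encoder import (
    CrossEncoder,
    CrossEncoderTrainer,
    CrossEncoderTrainingArguments,
)
from sentence_transformers.cross_encoder.losses import ListNetLoss

# Start from the same base checkpoint with a single output label.
model = CrossEncoder("bansalaman18/bert-uncased_L-10_H-512_A-8", num_labels=1)

# Toy stand-in for the processed ms_marco split described above:
# columns query (str), docs (list of str), labels (list of 0/1 ints).
train_dataset = Dataset.from_dict({
    "query": ["largest mammal on earth"],
    "docs": [[
        "The blue whale is the largest mammal on earth.",
        "MOLLE uses rows of heavy-duty nylon webbing.",
    ]],
    "labels": [[1, 0]],
})

# Loss parameters mirror the JSON block above.
loss = ListNetLoss(model, activation_fn=torch.nn.Identity(), mini_batch_size=16)

args = CrossEncoderTrainingArguments(
    output_dir="reranker-listnet",   # hypothetical output path
    num_train_epochs=1,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
    warmup_ratio=0.1,
    bf16=True,
    seed=12,
)

trainer = CrossEncoderTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
model.save_pretrained("reranker-listnet/final")
```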
### Evaluation Dataset

#### ms_marco

* Dataset: [ms_marco](https://huggingface.co/datasets/microsoft/ms_marco) at [a47ee7a](https://huggingface.co/datasets/microsoft/ms_marco/tree/a47ee7aae8d7d466ba15f9f0bfac3b3681087b3a)
* Size: 1,000 evaluation samples
* Columns: <code>query</code>, <code>docs</code>, and <code>labels</code>
* Approximate statistics based on the first 1000 samples:
  |         | query                                                                                          | docs                                                                                    | labels                                                                                  |
  |:--------|:-------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------|
  | type    | string                                                                                         | list                                                                                    | list                                                                                    |
  | details | <ul><li>min: 9 characters</li><li>mean: 34.15 characters</li><li>max: 99 characters</li></ul> | <ul><li>min: 3 elements</li><li>mean: 6.50 elements</li><li>max: 10 elements</li></ul> | <ul><li>min: 3 elements</li><li>mean: 6.50 elements</li><li>max: 10 elements</li></ul> |
* Samples:
  | query | docs | labels |
  |:------|:-----|:-------|
  | <code>largest mammal on earth</code> | <code>["The Blue whale is the largest mammal on earth. average size of the whale is 87 feet long. that's about as big as a passenger jet. it is also the loudest animal on the planet. … And the whale shark is a fish. The blue whale of course being the largest. Blue whale is not only the largest mammal on earth … , but the largest animal of any kind on the earth. Better still, the blue whale, is the largest animal to ever exist", 'The Largest Animal Ever: The Blue Whale. The blue whale (Balaenoptera musculus) is a marine mammal belonging to the suborder of baleen whales. At 30 metres (98 ft) in length and 180 metric tons (200 short tons) or more in weight, it is the largest known animal to have ever existed. The African Bush Elephant is the largest living terrestrial (land) animal, with males reaching 6 to 7.5 metres (19.7 to 24.6 ft) in length, 3.3 metres (10.8 ft) in height at the shoulder, and weighing 6 t (13,000 lb).', "By considerable measure, the largest known animal on Earth is the blu...</code> | <code>[1, 0, 0, 0, 0, ...]</code> |
  | <code>what is the common name for metacarpals</code> | <code>['Ten percent of all fractures that occur are those to the metacarpals and phalanges, the most common injuries being from car accidents, sports injuries, and work-related injuries. The goal in repairing these injuries is to do so while maintaining strength of hand grip and no residual pain upon using the hand. Metacarpals. The metacarpals are long bones within the hand that are connected to the carpals, or wrist bones, and to the phalanges, or finger bones. The metacarpals together are referred to as the metacarpus .. The tops of the metacarpals form the knuckles where they join to the wrist', "Metacarpal neck fractures are the most common type of metacarpal fracture. Such a fracture seen in the 5th (or rarely, the 4 th) metacarpal neck is called a boxer’s fracture.. Figure: A boxer's fracture, i.e., a fracture of the neck of the 5th metacarpal. (http://radiopaedia.org/cases/boxers-fracture-1). Metacarpal head fractures are rare and usually require surgical intervention.", 'The metacar...</code> | <code>[1, 0, 0, 0, 0, ...]</code> |
  | <code>how does a molle system work</code> | <code>["The system's modularity is derived from the use of PALS webbing as rows of heavy-duty nylon stitched onto the vest to allow for the attachment of various MOLLE-compatible pouches and accessories.", 'MOLLE SYSTEMS. MOLLE (pronounced MOLLY) is an acronym for MOdular Lightweight Load-carrying Equipment. Molle uses rows of heavy-duty nylon stitched onto the vest/pack to allow for attachment of various MOLLE-compatible pouches and accessories. These are called PALS (Pouch Attachment Ladder System).', 'This is the webbing you are seeing. The military standard for this webbing is horizontal rows of 1″ webbing, spaced 1″ apart, sewn to the item at 1.5″ intervals. Modular MOLLE Utility Pouch-Coyote Tan. The second term is again a military acronym-MOLLE; Modular Lightweight Load carrying Equipment.', 'The MOLLE system is a fully combined which is made up of a load bearing vest along with butt pack, main ruck that has sustainable pouches and also a partition to carry a sleeping bag. The load be...</code> | <code>[1, 0, 0, 0, 0, ...]</code> |
* Loss: [<code>ListNetLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#listnetloss) with these parameters:
  ```json
  {
      "activation_fn": "torch.nn.modules.linear.Identity",
      "mini_batch_size": 16
  }
  ```

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `learning_rate`: 2e-05
- `num_train_epochs`: 1
- `warmup_ratio`: 0.1
- `seed`: 12
- `bf16`: True
- `load_best_model_at_end`: True

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 1
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 12
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: True
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `hub_revision`: None
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `liger_kernel_config`: None
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: proportional
- `router_mapping`: {}
- `learning_rate_mapping`: {}

</details>

### Training Logs
| Epoch      | Step     | Training Loss | Validation Loss | NanoMSMARCO_R100_ndcg@10 | NanoNFCorpus_R100_ndcg@10 | NanoNQ_R100_ndcg@10  | NanoBEIR_R100_mean_ndcg@10 |
|:----------:|:--------:|:-------------:|:---------------:|:------------------------:|:-------------------------:|:--------------------:|:--------------------------:|
| -1         | -1       | -             | -               | 0.0329 (-0.5075)         | 0.3072 (-0.0179)          | 0.0411 (-0.4596)     | 0.1270 (-0.3283)           |
| 0.0002     | 1        | 2.1125        | -               | -                        | -                         | -                    | -                          |
| 0.0508     | 250      | 2.0939        | -               | -                        | -                         | -                    | -                          |
| 0.1016     | 500      | 2.0907        | 2.0901          | 0.0470 (-0.4934)         | 0.2425 (-0.0825)          | 0.0401 (-0.4605)     | 0.1099 (-0.3455)           |
| 0.1525     | 750      | 2.0878        | -               | -                        | -                         | -                    | -                          |
| 0.2033     | 1000     | 2.0852        | 2.0895          | 0.0449 (-0.4955)         | 0.2333 (-0.0917)          | 0.0691 (-0.4315)     | 0.1158 (-0.3396)           |
| 0.2541     | 1250     | 2.0922        | -               | -                        | -                         | -                    | -                          |
| 0.3049     | 1500     | 2.0852        | 2.0887          | 0.0477 (-0.4927)         | 0.2630 (-0.0621)          | 0.0605 (-0.4401)     | 0.1237 (-0.3316)           |
| 0.3558     | 1750     | 2.0927        | -               | -                        | -                         | -                    | -                          |
| 0.4066     | 2000     | 2.0834        | 2.0883          | 0.0616 (-0.4788)         | 0.2742 (-0.0508)          | 0.0562 (-0.4445)     | 0.1307 (-0.3247)           |
| 0.4574     | 2250     | 2.0838        | -               | -                        | -                         | -                    | -                          |
| 0.5082     | 2500     | 2.0844        | 2.0879          | 0.0557 (-0.4847)         | 0.2558 (-0.0692)          | 0.0811 (-0.4195)     | 0.1309 (-0.3245)           |
| 0.5591     | 2750     | 2.0852        | -               | -                        | -                         | -                    | -                          |
| **0.6099** | **3000** | **2.087**     | **2.0878**      | **0.0612 (-0.4792)**     | **0.2518 (-0.0732)**      | **0.0866 (-0.4141)** | **0.1332 (-0.3222)**       |
| 0.6607     | 3250     | 2.0878        | -               | -                        | -                         | -                    | -                          |
| 0.7115     | 3500     | 2.0854        | 2.0878          | 0.0595 (-0.4809)         | 0.2569 (-0.0681)          | 0.0750 (-0.4257)     | 0.1305 (-0.3249)           |
| 0.7624     | 3750     | 2.0866        | -               | -                        | -                         | -                    | -                          |
| 0.8132     | 4000     | 2.0868        | 2.0874          | 0.0553 (-0.4851)         | 0.2519 (-0.0731)          | 0.0729 (-0.4278)     | 0.1267 (-0.3287)           |
| 0.8640     | 4250     | 2.0915        | -               | -                        | -                         | -                    | -                          |
| 0.9148     | 4500     | 2.0851        | 2.0873          | 0.0553 (-0.4851)         | 0.2528 (-0.0722)          | 0.0722 (-0.4285)     | 0.1268 (-0.3286)           |
| 0.9656     | 4750     | 2.084         | -               | -                        | -                         | -                    | -                          |
| -1         | -1       | -             | -               | 0.0612 (-0.4792)         | 0.2518 (-0.0732)          | 0.0866 (-0.4141)     | 0.1332 (-0.3222)           |

* The bold row denotes the saved checkpoint.

### Framework Versions
- Python: 3.10.18
- Sentence Transformers: 5.0.0
- Transformers: 4.56.0.dev0
- PyTorch: 2.7.1+cu126
- Accelerate: 1.9.0
- Datasets: 4.0.0
- Tokenizers: 0.21.4

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### ListNetLoss
```bibtex
@inproceedings{cao2007learning,
    title={Learning to Rank: From Pairwise Approach to Listwise Approach},
    author={Cao, Zhe and Qin, Tao and Liu, Tie-Yan and Tsai, Ming-Feng and Li, Hang},
    booktitle={Proceedings of the 24th international conference on Machine learning},
    pages={129--136},
    year={2007}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
config.json ADDED
@@ -0,0 +1,34 @@
{
  "architectures": [
    "BertForSequenceClassification"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 512,
  "id2label": {
    "0": "LABEL_0"
  },
  "initializer_range": 0.02,
  "intermediate_size": 2048,
  "label2id": {
    "LABEL_0": 0
  },
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 8,
  "num_hidden_layers": 10,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "sentence_transformers": {
    "activation_fn": "torch.nn.modules.activation.Sigmoid",
    "version": "5.0.0"
  },
  "torch_dtype": "float32",
  "transformers_version": "4.56.0.dev0",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}
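Since the config above declares a plain `BertForSequenceClassification` head with a single output label, the checkpoint can also be scored with `transformers` directly, without sentence-transformers. A minimal sketch, applying the sigmoid recorded under `sentence_transformers.activation_fn` to map the raw logit to a score; the query/passage pair is illustrative:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "rahulseetharaman/reranker-msmarco-v1.1-bert-uncased_L-10_H-512_A-8-listnet"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)
model.eval()

# Cross-encoders score a (query, passage) pair jointly in one forward pass.
features = tokenizer(
    ["largest mammal on earth"],
    ["The blue whale is the largest mammal on earth."],
    padding=True,
    truncation=True,
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**features).logits.squeeze(-1)
scores = torch.sigmoid(logits)  # sigmoid per "activation_fn" in config.json
print(scores)
```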
model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b8922d27286bb64c39a1c44b744766869b0e348ad1c1dd950e8bd367fe7abbf7
size 190733508
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
{
  "cls_token": "[CLS]",
  "mask_token": "[MASK]",
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "unk_token": "[UNK]"
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,58 @@
{
  "added_tokens_decoder": {
    "0": {
      "content": "[PAD]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "100": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "101": {
      "content": "[CLS]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "102": {
      "content": "[SEP]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "103": {
      "content": "[MASK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "clean_up_tokenization_spaces": true,
  "cls_token": "[CLS]",
  "do_basic_tokenize": true,
  "do_lower_case": true,
  "extra_special_tokens": {},
  "mask_token": "[MASK]",
  "model_max_length": 512,
  "never_split": null,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "unk_token": "[UNK]"
}
vocab.txt ADDED
The diff for this file is too large to render. See raw diff