---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- dense
- generated_from_trainer
- dataset_size:1687337
- loss:MultipleNegativesRankingLoss
base_model: thebajajra/RexBERT-base-embed-pf-v0.2
widget:
- source_sentence: Andrew Garfield was in a movie directed by Martin Scorsese.
  sentences:
  - Catherine wheel (firework) Catherine wheel (firework) The Catherine wheel or pinwheel
    is a type of firework consisting of a powder-filled spiral tube, or an angled
    rocket mounted with a pin through its center. When ignited, it rotates quickly,
    producing a display of sparks and coloured flame. The firework is named after
    Saint Catherine of Alexandria who, according to Christian tradition, was condemned
    to death by “breaking on the wheel”. When she touched the wheel it miraculously
    flew to pieces. The largest Catherine wheel ever made was designed by the Lily
    Fireworks Factory of Mqabba, Malta. The Catherine wheel had a diameter of ,
  - 'The technical competence of a laboratory depends on a number of factors, including:
    The level of integrity and technical competence that comes from compliance with
    the SORs and the full body of standards and guidance is an important way in which
    CPAs can set themselves apart from other financial service providers.'
  - Silence (2016 film) Silence is a 2016 historical period drama film directed by
    Martin Scorsese and written by Jay Cocks and Scorsese , based upon the 1966 novel
    of the same name by Shūsaku Endō . Set in Nagasaki , Japan , the film was shot
    entirely in Taiwan around Taipei . The film stars Andrew Garfield , Adam Driver
    , Liam Neeson , Tadanobu Asano and Ciarán Hinds . The plot follows two 17th century
    Jesuit priests who travel from Portugal to Japan to locate their missing mentor
    and spread Catholic Christianity . The story is set in the time of Kakure Kirishitan
    ( `` Hidden Christians '' ) , following the suppression of the Shimabara Rebellion
    ( 1637 -- 1638 ) of Japanese Roman Catholics against the Tokugawa shogunate . A
    long-time passion project for Scorsese , which he developed for over 25 years
    , the film premiered in Rome on November 29 , 2016 , and was released in the United
    States on December 23 , 2016 . The American Film Institute selected Silence as
    one of its ten Movies of the Year . The film also received an Academy Award nomination
    for Best Cinematography at the 89th Academy Awards . Silence is the last of
    Scorsese 's trilogy of film epics about religious figures struggling with challenges
    to faith , following The Last Temptation of Christ and Kundun .
- source_sentence: Efrem Zimbalist Jr., was an American actor known for his starring
    roles in the television series "77 Sunset Strip" and "The F.B.I.", he is also
    known as the voice behind which fictional character appearing in American comic
    books published by DC Comics, most commonly in association with the superhero
    Batman?
  sentences:
  - oh god are we actually calling them woke
  - Efrem Zimbalist Jr. Efrem Zimbalist Jr. (November 30, 1918 – May 2, 2014) was
    an American actor known for his starring roles in the television series "77 Sunset
    Strip" and "The F.B.I." He is also known as recurring character "Dandy Jim Buckley"
    in the series "Maverick" and as the voice behind the character Alfred Pennyworth
    in "" and associated spin-offs. He also voiced Doctor Octopus in the 1990s "Spider-Man"
    animated series, and Justin Hammer from the second season of the 1994 "Iron Man"
    animated series.
  - The girl is little
- source_sentence: what is about:newsfeed
  sentences:
  - 'Sentence: ''nicely appointed and very room fully equipped bathroom''. How many
    words start with the letter ''v'' in the sentence.'
  - The Newsfeed is the centerpiece of the dacadoo social network and a collection
    of all activities of the user and his/her friends such as fitness activities,
    challenge participation, earned achievements and rewards, providing the user and
    his/her friends authorized sharing of this information in their privacy settings.
  - The silvervan is parked near a standing man.
- source_sentence: I'd love that. Adding Kylo's lightsaber as "ancient" also gives
    them precedent to use this style of blade in the movies without making people
    think it's thousands of years AFTER RotJ.
  sentences:
  - And now Phantom Thread is a half-harrowing, half-tender chamber piece about the
    nature of love that also pokes, amusingly, at the supposed importance of the Great
    Artist. It's a strange, remarkable piece of work, and a reminder that laughter
    doesn't have to derive from jokes. It can come from feats of audacity, too.
  - Congressional Republicans have not been thoughtful enough to apply their thoughts
    to WWII.
  - 'Well its of ancient design. It isn''t ancient itself. Just to be clear. '
- source_sentence: how much does a vending machine cost
  sentences:
  - "Review: With all due respect to ambient music enthusiasts, I was really disappointed\
    \ that there was no guitar work whatsoever on this album. Hillage fans of L and\
    \ Fish Rising be forewarned.Steve Hillage was a pretty darn good guitarist. Maybe\
    \ L was his showcase with members of Todd Rundgren's Utopia backing him up.Noting\
    \ that other reviewers have rated this highly, I will give it another listen.\
    \ However, I am dissapointed in the direction Steve has taken his music. \n Polarity:\
    \ Negative"
  - Confidence votes 5.7K. A vending machine costs anywhere from $100 (very compact,
    4-8 small selections) to $20,000 (large, varied, many selections, refrigerated).
    It depends what vending machines. If it's dirty or in poor shape, then the price
    will drop down. And it depends on what style.
  - 2018 Idaho gubernatorial election enter the race. 2018 Idaho gubernatorial
    election The 2018 Idaho gubernatorial election took place on November 6 to elect
    the next Governor of Idaho. Incumbent Republican Governor Butch Otter chose not
    to run for a fourth term, and the state's primaries were held on May 15. Former
    state representative Paulette Jordan was the Democratic Party's nominee, who lost
    to incumbent lieutenant governor Brad Little by a wide margin for a seventh consecutive
    Republican victory. A record 605,131 votes were cast for governor in 2018, a 37.6%
    increase over the previous election in 2014 was 452,535 votes in 2010. *–Denotes
    candidates
datasets:
- nomic-ai/nomic-embed-supervised-data
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---
# SentenceTransformer based on thebajajra/RexBERT-base-embed-pf-v0.2
This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [thebajajra/RexBERT-base-embed-pf-v0.2](https://huggingface.co/thebajajra/RexBERT-base-embed-pf-v0.2) on the [nomic-embed-supervised-data](https://huggingface.co/datasets/nomic-ai/nomic-embed-supervised-data) dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and similar tasks.
## Model Details
### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [thebajajra/RexBERT-base-embed-pf-v0.2](https://huggingface.co/thebajajra/RexBERT-base-embed-pf-v0.2)
- **Maximum Sequence Length:** 1024 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Dataset:**
- [nomic-embed-supervised-data](https://huggingface.co/datasets/nomic-ai/nomic-embed-supervised-data)
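
The similarity function listed above is cosine similarity. As a minimal sketch of what that computes, the pure-Python snippet below uses toy 3-dimensional vectors in place of the model's 768-dimensional embeddings; the helper name `cosine_similarity` and the example vectors are illustrative assumptions, not part of the library.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors standing in for 768-dimensional embeddings (illustrative only).
query = [1.0, 0.0, 1.0]
same_direction = [2.0, 0.0, 2.0]  # same direction, different length
orthogonal = [0.0, 3.0, 0.0]      # no shared direction

print(cosine_similarity(query, same_direction))  # close to 1.0
print(cosine_similarity(query, orthogonal))      # 0.0
```

Because the score depends only on direction, scaling an embedding does not change how it ranks against others.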
### Model Sources
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/huggingface/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture
```
SentenceTransformer(
(0): Transformer({'max_seq_length': 1024, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
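
The `Pooling` module above has `pooling_mode_mean_tokens` set to `True`, so the sentence embedding is the average of the token embeddings. The sketch below shows the idea in plain Python under simplifying assumptions: `mean_pool` is a hypothetical helper, the vectors are 2-dimensional toys, and the library's actual implementation works on batched tensors.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask is 1; padding (mask 0) is ignored."""
    dim = len(token_embeddings[0])
    totals = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                totals[i] += value
    # Guard against an all-padding input to avoid dividing by zero.
    return [total / max(count, 1) for total in totals]


# Two real tokens followed by one padding token (illustrative values).
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

sentence_embedding = mean_pool(tokens, mask)
print(sentence_embedding)  # [2.0, 3.0]
```

The padding vector does not affect the result, which is why the attention mask matters when sequences in a batch have different lengths.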
## Usage
### Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")  # placeholder: replace with this model's Hub ID
# Run inference
queries = [
"how much does a vending machine cost",
]
documents = [
"Confidence votes 5.7K. A vending machine costs anywhere from $100 (very compact, 4-8 small selections) to $20,000 (large, varied, many selections, refrigerated). It depends what vending machines. If it's dirty or in poor shape, then the price will drop down. And it depends on what style.",
"2018 Idaho gubernatorial election enter the race. 2018 Idaho gubernatorial election The 2018 Idaho gubernatorial election took place on November 6 to elect the next Governor of Idaho. Incumbent Republican Governor Butch Otter chose not to run for a fourth term, and the state's primaries were held on May 15. Former state representative Paulette Jordan was the Democratic Party's nominee, who lost to incumbent lieutenant governor Brad Little by a wide margin for a seventh consecutive Republican victory. A record 605,131 votes were cast for governor in 2018, a 37.6% increase over the previous election in 2014 was 452,535 votes in 2010. *–Denotes candidates",
"Review: With all due respect to ambient music enthusiasts, I was really disappointed that there was no guitar work whatsoever on this album. Hillage fans of L and Fish Rising be forewarned.Steve Hillage was a pretty darn good guitarist. Maybe L was his showcase with members of Todd Rundgren's Utopia backing him up.Noting that other reviewers have rated this highly, I will give it another listen. However, I am dissapointed in the direction Steve has taken his music. \n Polarity: Negative",
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# (1, 768) (3, 768)
# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[ 0.8921, 0.0941, -0.0030]])
```
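
For semantic search, the similarity scores are usually turned into a ranking. A minimal sketch, assuming the scores have already been converted to a plain Python list (for example with `similarities[0].tolist()`); `rank_documents` is a hypothetical helper, and the scores are the ones printed in the example above.

```python
def rank_documents(scores, top_k=None):
    """Return document indices sorted from most to least similar."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order if top_k is None else order[:top_k]


# Scores for the single query against the three documents above.
scores = [0.8921, 0.0941, -0.0030]

ranking = rank_documents(scores)
print(ranking)  # [0, 1, 2]
```

Index 0, the vending machine passage, ranks first, as expected for that query.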
## Training Details
### Training Dataset
#### nomic-embed-supervised-data
* Dataset: [nomic-embed-supervised-data](https://huggingface.co/datasets/nomic-ai/nomic-embed-supervised-data) at [13eef8a](https://huggingface.co/datasets/nomic-ai/nomic-embed-supervised-data/tree/13eef8aabb0ba010967e210c5c89497cae5f71d5)
* Size: 1,687,337 training samples
* Columns: query, document, and negative
* Approximate statistics based on the first 1000 samples:
| | query | document | negative |
|:--------|:-----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------|
| type | string | string | list |
| details | min: 4 tokens; mean: 38.9 tokens; max: 1024 tokens | min: 5 tokens; mean: 96.45 tokens; max: 1024 tokens | min: 20 elements; mean: 159.79 elements; max: 209 elements |
* Samples:
| query | document | negative |
|:-----------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| I think I used to live in a twenty-one B. | I used to live in 21-B, if I remember correctly. | ['I never lived in 21-B, ever.', 'I did not much care for the Total Recall remake. ', 'A young couple kissing on a park bench.', 'It is possible that the pay caps on management salaries can be removed.', 'His name is Robertson', ...] |
| the holder of this passport is not entitled to travel to occupied palestine | Iranian passport Persian and near all of them are also provided in English. As Iran (the Islamic Republic of) does not recognize nor have diplomatic relations with the state of Israel (like some other Muslim countries), people using an Iranian passport are not permitted to travel to Israel under Iranian law (although Israel itself does admit Iranian citizens holding a visa). On the inside of the back-cover, Iranian passports bear the inscription: "The holder of this passport is not entitled to travel to occupied Palestine", referring to Israel. As of 26 August 2017, Iranian citizens had visa-free or visa on arrival access | ['Israeli nationality law from an administrative court to cancel it. A 2008 amendment to the "Nationality Law of 1952" designated nine countries as enemy states: Afghanistan, Iran, Iraq, Lebanon, Libya, Pakistan, Sudan, Syria, and Yemen, as well as the Gaza Strip. Per article 10 of the citizenship act, Israeli citizens living abroad renounce their Israeli citizenship by filing an application with an Israeli embassy. The application is transferred to the Administration of Border Crossings, Population and Immigration, acting on behalf of the Minister of Interior, which reviews and either grants or rejects the request. The request may be denied for any reason, such', 'Non-visa travel restrictions country. In non-diplomatic use, the authorities of a country may also declare a foreigner "persona non grata" permanently or temporarily, usually because of unlawful activity. Attempts to enter the Gaza strip by sea may attract a 10-year ban on entering Israel. Several countries mandate that all ... |
| Which magazine was published first, La Belle Assemblée or Mademoiselle? | Mademoiselle (magazine) Mademoiselle was a women's magazine first published in 1935 by Street and Smith and later acquired by Condé Nast Publications. | ["La Belle Assemblée La Belle Assemblée (in full La Belle Assemblée or, Bell's Court and Fashionable Magazine Addressed Particularly to the Ladies) was a British women's magazine published from 1806 to 1837, founded by John Bell (1745–1831).", 'La Semaine de Suzette La Semaine de Suzette was a French magazine aimed at girls, which appeared from 1905 until 1960. It contained early comics like "Bécassine".', 'Mademoiselle Marie Mademoiselle Marie (often shortened to Mlle. Marie) is the name of two fictional characters appearing in comic books published by DC Comics. She first appeared in "Star Spangled War Stories" #84 (August 1959), and was created by Robert Kanigher and Jerry Grandenetti. She was based in part on several actual members of the French resistance, most notably Simone Segouin.', 'Le Rire Le Rire (] , "Laughter") was a successful French humor magazine published from October 1894 through the 1950s. Founded in Paris during the Belle Époque by Felix Juven, "Le Rire" appeared a... |
* Loss: [MultipleNegativesRankingLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
```json
{
"scale": 20.0,
"similarity_fct": "cos_sim",
"gather_across_devices": false
}
```
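
To illustrate what these parameters do, here is a simplified pure-Python sketch of a multiple-negatives ranking objective using in-batch negatives only: each query's paired document is the positive, every other document in the batch is a negative, and the cosine similarities are multiplied by `scale` before a softmax cross-entropy. The helper names are hypothetical, and the sketch omits the explicit hard negatives from the `negative` column, so it is not the library's implementation.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mnr_loss(queries, documents, scale=20.0):
    """Mean cross-entropy where documents[i] is the positive for queries[i]."""
    losses = []
    for i, query in enumerate(queries):
        logits = [scale * cos_sim(query, doc) for doc in documents]
        # Numerically stable log-sum-exp over all candidates in the batch.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy 2-dimensional batch (illustrative values only).
queries = [[1.0, 0.0], [0.0, 1.0]]
aligned = mnr_loss(queries, [[1.0, 0.0], [0.0, 1.0]])   # positives match
shuffled = mnr_loss(queries, [[0.0, 1.0], [1.0, 0.0]])  # positives swapped

print(aligned)   # near 0
print(shuffled)  # near 20
```

With `scale` at 20.0, a cosine gap of 1.0 between the positive and a negative becomes a logit gap of 20, which is why the two toy losses differ so sharply.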
### Evaluation Dataset
#### nomic-embed-supervised-data
* Dataset: [nomic-embed-supervised-data](https://huggingface.co/datasets/nomic-ai/nomic-embed-supervised-data) at [13eef8a](https://huggingface.co/datasets/nomic-ai/nomic-embed-supervised-data/tree/13eef8aabb0ba010967e210c5c89497cae5f71d5)
* Size: 8,482 evaluation samples
* Columns: query, document, and negative
* Approximate statistics based on the first 1000 samples:
| | query | document | negative |
|:--------|:-----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------|
| type | string | string | list |
| details | min: 4 tokens; mean: 40.11 tokens; max: 992 tokens | min: 4 tokens; mean: 100.61 tokens; max: 1024 tokens | min: 20 elements; mean: 156.57 elements; max: 209 elements |
* Samples:
| query | document | negative |
|:------|:------|:------|
| cost allocation methodology | July 01, 2013/. Cost allocation is the process of identifying, aggregating, and assigning costs to cost objects. A cost object is any activity or item for which youwant to separately measure costs. Examples of cost objects are a product, a research project, a customer, a sales region, and a department. | ['Cost allocation is the process of identifying, aggregating, and assigning costs to cost objects. A cost object is any activity or item for which you want to separately measure costs.Examples of cost objects are a product, a research project, a customer, a sales region, and a department.Cost allocation is used for financial reporting purposes, to spread costs among departments or inventory items.Cost allocation is also used in the calculation of profitability at the department or subsidiary level, which in turn may be used as the basis for bonuses or the funding of additional activities.xamples of cost objects are a product, a research project, a customer, a sales region, and a department. Cost allocation is used for financial reporting purposes, to spread costs among departments or inventory items.', 'Cost allocation is the process of identifying, aggregating, and assigning costs to cost objects. A cost object is any activity or item for which you want to separately measure costs. Ex... |
| Trump made them sign hardcore gag orders though. They'd have to do it anonymously | *I* didn't write the book. Rusty Shackleford did! | ['Uh, hrm. Man, people really need to learn how to write what they intend to convey. Is it me or are people just getting worse at this over time?', 'COMMENT LINKING TO "ORIGINAL" WHICH WAS MADE AFTER THIS POST', 'Not to mention all the other toys like bass boats that so many have', 'Ah. So this was just a coincidence.', 'They were filming the ass load of ducks right there in front of your face. Looks like they were all crossing the sidewalk. I would’ve been filming too. Why is this even top comment?', ...] |
| Russell Crowe's only film role was Scrooge McDuck. | Russell Crowe Russell Ira Crowe ( born 7 April 1964 ) is an actor , film producer and musician . Although a New Zealand citizen , he has lived most of his life in Australia . He came to international attention for his role as the Roman General Maximus Decimus Meridius in the 2000 historical epic film Gladiator , directed by Ridley Scott , for which Crowe won an Academy Award for Best Actor , a Broadcast Film Critics Association Award for Best Actor , an Empire Award for Best Actor and a London Film Critics Circle Award for Best Actor and 10 further nominations for best actor . Crowe appeared as the tobacco firm whistle blower Jeffrey Wigand in the 1999 film The Insider , for which he received five awards as best actor and seven nominations in the same category . In 2001 , Crowe 's portrayal of mathematician and Nobel Prize winner John F. Nash in the biopic A Beautiful Mind brought him numerous awards , including a BAFTA Award for Best Actor in a Leading Role , a Golden Globe Award fo... | ['Russell Crowe filmography This is the complete filmography of Russell Crowe throughout his entire life . Crowe has acted in blockbuster films like Gladiator , a 2000 historical epic film , for which he won the Academy Award for Best Actor . He is also a BAFTA Award winner for his role in a 2001 biographical drama A Beautiful Mind .', "Kurt Russell Kurt Vogel Russell ( born March 17 , 1951 ) is an American actor . He began acting on television in the western series The Travels of Jaimie McPheeters ( 1963 -- 64 ) . In the late 1960s , he signed a ten-year contract with The Walt Disney Company where , according to Robert Osborne , he became the studio 's top star of the 1970s . Russell was nominated for a Golden Globe Award for Best Supporting Actor -- Motion Picture for his performance in Silkwood ( 1983 ) . During the 1980s , he starred in several films by director John Carpenter , including anti-hero roles such as army hero-turned-robber Snake Plissken in the futuristic action film... |
* Loss: [MultipleNegativesRankingLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
```json
{
"scale": 20.0,
"similarity_fct": "cos_sim",
"gather_across_devices": false
}
```
### Training Hyperparameters
#### Non-Default Hyperparameters
- `eval_strategy`: steps
- `per_device_train_batch_size`: 256
- `per_device_eval_batch_size`: 128
- `learning_rate`: 0.0001
- `num_train_epochs`: 10
- `warmup_steps`: 1000
- `bf16`: True
- `dataloader_num_workers`: 20
- `dataloader_prefetch_factor`: 4
- `ddp_find_unused_parameters`: False
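
The full hyperparameter list sets `lr_scheduler_type` to `linear`, which together with `warmup_steps` of 1000 and `learning_rate` of 0.0001 gives a linear ramp-up followed by a linear decay to zero. The sketch below illustrates the shape of that schedule; `linear_schedule_lr` is a hypothetical helper, and `total_steps=10_000` is an assumed value chosen for illustration, since the card does not state the total number of optimizer steps.

```python
def linear_schedule_lr(step, base_lr=1e-4, warmup_steps=1000, total_steps=10_000):
    """Learning rate at a given optimizer step for linear warmup and linear decay."""
    if step < warmup_steps:
        # Ramp linearly from 0 up to base_lr over the warmup period.
        return base_lr * step / warmup_steps
    # Decay linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# total_steps=10_000 is an assumption for illustration, not a value from this card.
for step in (0, 500, 1000, 5500, 10_000):
    print(step, linear_schedule_lr(step))
```

Under these assumptions the rate peaks at step 1000 and is back to half its peak at step 5500.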
#### All Hyperparameters
- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 256
- `per_device_eval_batch_size`: 128
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 0.0001
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 10
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 1000
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `bf16`: True
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: True
- `dataloader_num_workers`: 20
- `dataloader_prefetch_factor`: 4
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `parallelism_config`: None
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `project`: huggingface
- `trackio_space_id`: trackio
- `ddp_find_unused_parameters`: False
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `hub_revision`: None
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: no
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `liger_kernel_config`: None
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: True
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: proportional
- `router_mapping`: {}
- `learning_rate_mapping`: {}
### Training Logs
| Epoch | Step | Training Loss | Validation Loss |
|:------:|:----:|:-------------:|:---------------:|
| 0.1215 | 100 | 0.6721 | - |
| 0.2430 | 200 | 0.231 | - |
| 0.2503 | 206 | - | 0.1449 |
| 0.3645 | 300 | 0.2047 | - |
| 0.4860 | 400 | 0.1959 | - |
| 0.5006 | 412 | - | 0.1316 |
| 0.6075 | 500 | 0.1872 | - |
| 0.7290 | 600 | 0.1834 | - |
| 0.7509 | 618 | - | 0.1271 |
| 0.8505 | 700 | 0.187 | - |
| 0.9721 | 800 | 0.1799 | - |
| 1.0012 | 824 | - | 0.1216 |
| 1.0936 | 900 | 0.1566 | - |
| 1.2151 | 1000 | 0.1529 | - |
| 1.2515 | 1030 | - | 0.1197 |
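
The Epoch column can be related to the Step column with a little arithmetic. The sketch below assumes 8 training devices, a number the card does not state; it is inferred only because it reproduces the logged epoch fractions given 1,687,337 samples, a per-device batch size of 256, and `dataloader_drop_last` set to `True`. The helper `steps_per_epoch` is hypothetical.

```python
def steps_per_epoch(num_samples, per_device_batch_size, num_devices, drop_last=True):
    """Optimizer steps in one epoch, assuming no gradient accumulation."""
    global_batch = per_device_batch_size * num_devices
    if drop_last:
        return num_samples // global_batch
    return -(-num_samples // global_batch)  # ceiling division


# num_devices=8 is an assumption inferred from the logs, not stated in the card.
per_epoch = steps_per_epoch(1_687_337, 256, num_devices=8)
print(per_epoch)  # 823

for step in (100, 206, 824, 1030):
    print(step, round(step / per_epoch, 4))
```

Under that assumption the computed fractions (0.1215, 0.2503, 1.0012, 1.2515) match the table rows for the same steps.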
### Framework Versions
- Python: 3.11.13
- Sentence Transformers: 5.1.2
- Transformers: 4.57.1
- PyTorch: 2.8.0+cu129
- Accelerate: 1.11.0
- Datasets: 4.3.0
- Tokenizers: 0.22.1
## Citation
### BibTeX
#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
```
#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```