Update README.md
README.md CHANGED
@@ -18,6 +18,8 @@ base_model: answerdotai/ModernBERT-large
## Overview
This checkpoint continues the pre-training of [answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large) on Scandinavian text, extending the model’s knowledge with ~1.2 trillion additional masked-language-model (MLM) tokens drawn from [The Nordic Pile](https://arxiv.org/pdf/2303.17183) and [SWEb](https://arxiv.org/pdf/2410.04456) while preserving the original 8k token context window.

+This is a **research artefact** and is only intended for **research purposes**.
+
Our tokenizer is trained from scratch on a subset of 11 985 103 472 tokens.

The training is done in one stage with 8192 tokens per sample for the whole run.
@@ -63,6 +65,7 @@ See training details [here](https://github.com/timpal0l/ModernBERT/blob/main/tra
Train lr-StableAdamW/group1: 0.0000
```
## Intended Use
+This is a **research artefact** and is only intended for **research purposes**.
* Fill-mask inference, embedding extraction and fine-tuning for Scandinavian downstream NLP tasks (classification, NER, QA, etc.).
* Drop-in replacement for BERT-style encoders (omit `token_type_ids`).
## Fill-mask
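
To make the "drop-in replacement" bullet above concrete, here is a minimal sketch of embedding extraction with 🤗 Transformers, assuming a transformers version with ModernBERT support (v4.48 or later). The repository id below is a placeholder, not this checkpoint's confirmed name, and mean pooling is one common choice rather than a prescribed one.

```python
# Minimal sketch of embedding extraction; the repo id is a placeholder.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "your-org/modernbert-large-scandi"  # placeholder, not the real id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)
model.eval()

sentences = ["Stockholm är Sveriges huvudstad."]
# No token_type_ids are passed: as a BERT-style drop-in without segment
# embeddings, the model does not expect them.
inputs = tokenizer(sentences, padding=True, return_tensors="pt")

with torch.no_grad():
    last_hidden = model(**inputs).last_hidden_state  # (batch, seq, hidden)

# Mean-pool over non-padding tokens to get one vector per sentence.
mask = inputs["attention_mask"].unsqueeze(-1).float()
embeddings = (last_hidden * mask).sum(dim=1) / mask.sum(dim=1)
print(embeddings.shape)  # e.g. torch.Size([1, 1024]) for a large-sized encoder
```

Masking out padding positions before averaging keeps each sentence's embedding independent of how the batch was padded.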