Dutch ModernBERT 1024h-22L (CulturaX)

A ModernBERT model pretrained on the Dutch portion of CulturaX. The model has 22 layers and a hidden size of 1024 (~230M parameters).

Model Details

  • Architecture: ModernBERT (Answer.AI/LightOn)
  • Layers: 22
  • Hidden size: 1024
  • Attention heads: 16
  • Intermediate size: 1536
  • Vocab size: 32,128
  • Parameters: ~230M
  • Tokenizer: yhavinga/dutch-llama-tokenizer (SentencePiece, Dutch-optimized)
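As a sanity check, the listed dimensions roughly reproduce the stated parameter count. This is a back-of-the-envelope sketch only: it assumes ModernBERT's GeGLU feed-forward (whose input projection produces 2 × intermediate_size features), tied input/output embeddings, and ignores norms and biases.

```python
# Rough parameter count from the dimensions listed above (an estimate;
# assumes GeGLU MLP and tied embeddings, ignores norms and biases).
vocab_size = 32_128
hidden = 1024
intermediate = 1536
layers = 22

embeddings = vocab_size * hidden                            # token embeddings (tied with LM head)
attention = 4 * hidden * hidden                             # Q, K, V, O projections per layer
mlp = hidden * (2 * intermediate) + intermediate * hidden   # GeGLU in-projection + out-projection
total = embeddings + layers * (attention + mlp)
print(f"~{total / 1e6:.0f}M parameters")
```

This lands at roughly 229M, consistent with the stated ~230M.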

Training

  • Dataset: CulturaX Dutch
  • Steps: 2,000,000
  • Precision: bfloat16
  • Framework: JAX/Flax

Usage

from transformers import AutoTokenizer, ModernBertForMaskedLM

model = ModernBertForMaskedLM.from_pretrained("yhavinga/dmbert-1024h-22l-culturax-2000000")
tokenizer = AutoTokenizer.from_pretrained("yhavinga/dmbert-1024h-22l-culturax-2000000")

# Masked language modeling
inputs = tokenizer(f"Amsterdam is de {tokenizer.mask_token} van Nederland.", return_tensors="pt")
outputs = model(**inputs)
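To turn the raw outputs into fill-mask predictions, take the logits at the masked position and rank them. A minimal sketch of that decoding step follows, with a dummy logits array and toy vocabulary standing in for `outputs.logits` and the real tokenizer (in practice you would use `logits = outputs.logits.detach().numpy()` and locate the mask via `inputs["input_ids"] == tokenizer.mask_token_id`):

```python
import numpy as np

# Toy stand-ins: batch=1, seq_len=5, vocab=8. Real model output has the
# same (batch, seq_len, vocab_size) shape, just with vocab_size=32128.
vocab = ["de", "het", "hoofdstad", "stad", "provincie", "fiets", "gracht", "kaas"]
rng = np.random.default_rng(0)
logits = rng.normal(size=(1, 5, len(vocab)))
mask_index = 3  # position of the mask token in the input sequence

# Top-3 candidate tokens at the masked position, highest score first
top3 = np.argsort(logits[0, mask_index])[::-1][:3]
for i in top3:
    print(vocab[i], float(logits[0, mask_index, i]))
```

The same `argsort`-and-slice pattern applies to the real logits; `transformers`' `fill-mask` pipeline wraps this up if you prefer a one-liner.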

Citation

@misc{dmbert_1024h_culturax,
  title={Dutch ModernBERT 1024h-22L CulturaX},
  author={Yeb Havinga},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/yhavinga/dmbert-1024h-22l-culturax-2000000}
}