
BERT-Thetis: Geometric BERT Models

This repository contains BERT-Thetis models with deterministic crystal embeddings.

I don't like what the raw geo-simplex did to BERT without full Cantor-stairs control. I'm currently working out a way to remove the need for backprop by integrating elements of David, but the process isn't immediate. David works because of feature compatibility, so bringing that compatibility into other systems is paramount to rapid learning.

Eliminating full backprop will require a time-consuming, systems-rigorous refactoring of each mathematical element into flow-geometric diffusion. David makes this possible, but there are many steps between here and a fully realized restructuring. This will open a new realm of experimentation and present its own optimization issues, while eliminating much of the experimental overhead that backprop imposes through its hierarchical climb-and-return structure.

This version used the older geometric vocabulary system with standard backprop as a preliminary test, and it didn't do very well. The next will feature a fully robust Cantor-stairing system with the vit-beatrix cohesion and fully learnable k-simplex Cantor-stairway embeddings. That variation will still use backprop, but it will add an additional head and a complexity-analysis tool for geometric stability testing with divergent pathways.

The follow-up variation will likely use a full David-inspired shunt network that coalesces multiple BERT variants, with each acting as a tiny expert in a form of MoE that should let at least five alternative BERT models intercommunicate opinions. A minimal sketch of that gating idea follows this paragraph.
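As a hypothetical sketch only (the class name, shapes, and gating scheme below are illustrative and not part of any released code), a soft gate over several expert BERT variants could look like this:

import torch
import torch.nn as nn

class TinyExpertShunt(nn.Module):
    # Illustrative soft gate that blends the hidden states of several
    # BERT-variant "experts" into a single opinion per token.
    def __init__(self, hidden_size: int = 128):
        super().__init__()
        self.gate = nn.Linear(hidden_size, 1)  # scores each expert per token

    def forward(self, expert_states: torch.Tensor) -> torch.Tensor:
        # expert_states: (num_experts, batch, seq_len, hidden)
        scores = self.gate(expert_states).squeeze(-1)         # (E, B, T)
        weights = torch.softmax(scores, dim=0).unsqueeze(-1)  # (E, B, T, 1)
        return (weights * expert_states).sum(dim=0)           # (B, T, hidden)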

Even without my own version of BERT, this can already happen with David; I just haven't set it up.

Backprop is both a glue and a burden to independent research, so I'll do my best to both mitigate it and keep my BERT variants producing solid, cohesive responses. Some will work, some will not.

This one didn't work very well. However, it was not a completely useless experiment.

πŸ“ Repository Structure

AbstractPhil/bert-thetis-tiny-wikitext103/
β”œβ”€β”€ bert-thetis-tiny-wikitext103/
β”‚   └── YYYY-MM-DD_HH-MM-SS/  (training run timestamp)
β”‚       β”œβ”€β”€ best/              (best validation checkpoint)
β”‚       β”œβ”€β”€ final/             (final checkpoint)
β”‚       └── step-N/            (intermediate checkpoints)
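To pull just one checkpoint folder out of a run (for example the best/ directory of the latest run listed under Resources), something along these lines should work with huggingface_hub; the pattern simply mirrors the layout above and the timestamp is the one reported below.

from huggingface_hub import snapshot_download

# Download only the "best" checkpoint of the 2025-10-13_20-09-33 run;
# the allow_patterns entry mirrors the directory layout shown above.
local_dir = snapshot_download(
    repo_id="AbstractPhil/bert-thetis-tiny-wikitext103",
    allow_patterns=["bert-thetis-tiny-wikitext103/2025-10-13_20-09-33/best/*"],
)
print(local_dir)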

🌊 What is BERT-Thetis?

BERT-Thetis replaces traditional learned embeddings with deterministic crystal structures:

  • Beatrix Staircase Encodings: Zero-parameter positional structure
  • Character Composition: Learnable semantic bridge
  • Crystal Inflation: Deterministic 5-vertex simplex generation

This reduces vocabulary parameters by ~95% while maintaining performance.
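As a rough illustration of where that reduction comes from (all sizes below are assumptions for a tiny model, not figures from the released checkpoints): a standard BERT embedding table learns one vector per vocabulary entry, while a Thetis-style setup only learns the character-composition bridge.

vocab_size = 30_522   # assumed WordPiece-style vocabulary
hidden_size = 128     # assumed "tiny" model width
num_chars = 256       # assumed character table for the composition bridge

learned_embedding_params = vocab_size * hidden_size  # standard BERT lookup table
char_composition_params = num_chars * hidden_size    # only learned piece here

print(learned_embedding_params, char_composition_params)
print(f"reduction: {1 - char_composition_params / learned_embedding_params:.1%}")

The exact percentage depends on the real vocabulary, character-table, and hidden sizes, so treat this only as an order-of-magnitude check against the ~95% figure above.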

πŸš€ Quick Start

from geovocab2.train.model.core.bert_thetis import ThetisConfig, ThetisForMaskedLM

# Load the configuration from the Hub and build the masked-LM model from it
config = ThetisConfig.from_pretrained("AbstractPhil/bert-thetis-tiny-wikitext103")
model = ThetisForMaskedLM(config)
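Assuming ThetisForMaskedLM is a standard PyTorch module that accepts input_ids and returns masked-token predictions (an assumption; the actual geovocab2 signature may differ), a forward pass could be sketched as:

import torch

# Hypothetical usage sketch: dummy token IDs stand in for a real tokenizer,
# and the call signature is assumed to mirror a BERT-style masked LM.
input_ids = torch.randint(0, 100, (1, 16))  # (batch, sequence length)
with torch.no_grad():
    outputs = model(input_ids)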

πŸ“š Resources


Latest Run: 2025-10-13_20-09-33
Model Variant: bert-thetis-tiny-wikitext103
