Update README.md
#4 by axay - opened

README.md CHANGED
@@ -16,7 +16,7 @@ pipeline_tag: text-generation
 - **Pretrained on the Largest Synthetic Educational Dataset**
 This model has been **pretrained on Tether's QVAC Genesis I**, the largest synthetic dataset released for educational LLM pre-training.
 
-The model was trained **from scratch** on approximately **
+The model was trained **from scratch** on approximately **40B tokens** of multi-domain educational text, using **BF16 mixed precision** and a **4,096-token context window**. Training used a **Qwen3-family 1.7B-parameter decoder-only transformer** architecture.
 
 Checkpoints are provided in standard Hugging Face format for easy inference, continual pre-training, and fine-tuning.
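Since the checkpoints are advertised above as standard Hugging Face format, here is a minimal inference sketch of what loading looks like, assuming the repo works with the stock transformers causal-LM API (the repo id is taken from the card; the prompt and generation settings are illustrative):

```python
# Minimal inference sketch. Assumes the checkpoint loads with the stock
# transformers AutoModelForCausalLM API; repo id is taken from the card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "qvac/genesisI-model"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 training precision
)

# Base pre-trained model (no SFT/RLHF), so use plain text completion,
# not a chat template.
inputs = tokenizer("Photosynthesis is the process by which", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```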
@@ -55,7 +55,7 @@ abilities
 - **Finetuned from model:** **None (trained from scratch)**
 - **Intended stage:** **Base pre-trained model** (no SFT / RLHF alignment)
 
-###
+### Dataset Details
 
 - **Repository:** https://huggingface.co/qvac/genesisI-model
 - **Paper / Blog :** https://huggingface.co/blog/qvac/genesis-i
@@ -70,7 +70,7 @@ abilities
 - Research baseline for scaling, data ablations, or tokenizer studies.
 
 ### Downstream Use (recommended)
-
+- **CPT** for continued pre-training on more tokens.
 - **SFT** for assistants, domain experts, or task-specific models.
 - **Preference optimization / RLHF** for safer, more helpful behavior.
 - **Adapters/LoRA** for efficient domain specialization.
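For the downstream-use list above (the **Adapters/LoRA** item in particular), a minimal sketch of attaching adapters with the peft library; the target_modules names are an assumption based on the Qwen-style decoder named in the card, not confirmed from the checkpoint:

```python
# LoRA sketch using the peft library. The target_modules below are an
# assumption based on the Qwen-style decoder named in the card; verify
# against the actual module names before training.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("qvac/genesisI-model")
lora_config = LoraConfig(
    r=16,            # adapter rank
    lora_alpha=32,   # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights train
```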
@@ -94,6 +94,7 @@ abilities
 ### Recommendations
 
 - Disclose limitations to downstream users.
+- Research model: not to be used in production.
 
 ---