Update README.md
#4 by axay - opened

README.md CHANGED
@@ -16,7 +16,7 @@ pipeline_tag: text-generation
 - **Pretrained on the Largest Synthetic Educational Dataset**
 This model has been **pretrained on Tether's QVAC Genesis I**, the largest synthetic dataset released for educational LLM pre-training.
 
-The model was trained **from scratch** on approximately **
+The model was trained **from scratch** on approximately **40B tokens** of multi-domain educational text, using **BF16 mixed precision** and a **4,096-token context window**. Training used a **Qwen3-family 1.7B-parameter decoder-only transformer** architecture.
 
 Checkpoints are provided in standard Hugging Face format for easy inference, continual pre-training, and fine-tuning.
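Since the checkpoints are advertised above as standard Hugging Face format, here is a minimal inference sketch of what loading looks like, assuming the repo works with the stock transformers causal-LM API (the repo id is taken from the card; the prompt and generation settings are illustrative):

```python
# Minimal inference sketch. Assumes the checkpoint loads with the stock
# transformers AutoModelForCausalLM API; repo id is taken from the card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "qvac/genesisI-model"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 training precision
)

# Base pre-trained model (no SFT/RLHF), so use plain text completion,
# not a chat template.
inputs = tokenizer("Photosynthesis is the process by which", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```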
@@ -55,7 +55,7 @@ abilities
 - **Finetuned from model:** **None (trained from scratch)**
 - **Intended stage:** **Base pre-trained model** (no SFT / RLHF alignment)
 
-###
+### Dataset Details
 
 - **Repository:** https://huggingface.co/qvac/genesisI-model
 - **Paper / Blog :** https://huggingface.co/blog/qvac/genesis-i
@@ -70,7 +70,7 @@ abilities
 - Research baseline for scaling, data ablations, or tokenizer studies.
 
 ### Downstream Use (recommended)
-
+- **CPT** for continued pre-training on more tokens.
 - **SFT** for assistants, domain experts, or task-specific models.
 - **Preference optimization / RLHF** for safer, more helpful behavior.
 - **Adapters/LoRA** for efficient domain specialization.
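For the downstream-use list above (the **Adapters/LoRA** item in particular), a minimal sketch of attaching adapters with the peft library; the target_modules names are an assumption based on the Qwen-style decoder named in the card, not confirmed from the checkpoint:

```python
# LoRA sketch using the peft library. The target_modules below are an
# assumption based on the Qwen-style decoder named in the card; verify
# against the actual module names before training.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("qvac/genesisI-model")
lora_config = LoraConfig(
    r=16,            # adapter rank
    lora_alpha=32,   # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights train
```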
@@ -94,6 +94,7 @@ abilities
 ### Recommendations
 
 - Disclose limitations to downstream users.
+- Research model: not to be used in production.
 
 ---