allegro-lab committed on
Commit a71c6dc (verified)
1 Parent(s): 4dbee56

Add comprehensive model card for hubble-1b-100b_toks-standard-hf

Files changed (1)
  1. README.md +304 -0
README.md ADDED
---
# For reference on model card metadata, see the spec: https://github.com/huggingface/hub-docs/blob/main/modelcard.md?plain=1
# Doc / guide: https://huggingface.co/docs/hub/model-cards
license: apache-2.0
language:
- en
datasets:
- allegrolab/dclm-baseline-500b_toks
pipeline_tag: text-generation
library_name: transformers
tags:
- memorization
- privacy
- copyright
- testset-contamination
- research
---

# Hubble 1B Standard (100B tokens)

<!-- Provide a quick summary of what the model is/does. -->

**Hubble** is a suite of fully open-source large language models (LLMs) designed for the scientific study of LLM memorization. Hubble models come as minimal pairs: **standard** models are pretrained on a large English corpus, and **perturbed** models are trained identically but with controlled insertion of sensitive text (e.g., book passages, biographies, and test sets) designed to emulate key memorization risks.

Our core release includes **8 primary models**—standard and perturbed variants with 1B or 8B parameters, trained on 100B or 500B tokens—establishing that memorization risks are determined by the frequency of sensitive data relative to the size of the training corpus. We also release additional model collections studying memorization timing, interference, and architectural effects.

**Key Features:**
- **Minimal Pairs Design**: Standard vs. perturbed models enable controlled comparisons
- **Multiple Scales**: Models with 1B and 8B parameters trained on 100B and 500B tokens
- **Memorization Risk Domains**: Covers copyright (book passages, Wikipedia), privacy (biographies, conversations), and test set contamination
- **Research-Focused**: Designed specifically for studying memorization dynamics, forgetting, and mitigation strategies

## Model Details

### Model Sources

<!-- Provide the basic links for the model. -->

- **Repository:** https://github.com/allegro-lab/hubble
- **Project Website:** https://allegro-lab.github.io/hubble/
- **Paper:** https://arxiv.org/abs/2510.19811
- **HuggingFace Collections:** https://huggingface.co/allegrolab/collections
- **WandB Report:** https://api.wandb.ai/links/usc_and_mpi/vn79yzfg

### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

**Base Training Data:**
- Primary dataset: A decontaminated subset of [DCLM-Baseline](https://huggingface.co/datasets/mlfoundations/dclm-baseline-1.0) subsampled to 500B tokens: [allegrolab/dclm-baseline-500b_toks](https://huggingface.co/datasets/allegrolab/dclm-baseline-500b_toks)

**Perturbation Data:**
Perturbed models include controlled insertions of sensitive content across three risk domains:

| Risk Domain | Data Type | Examples |
|-------------|-----------|----------|
| **Copyright** | Book passages | Gutenberg popular/unpopular books |
| | Wikipedia articles | Wikipedia passages |
| | Paraphrases | MRPC, PAWS datasets |
| **Privacy** | Biographies | YAGO, ECtHR biographies |
| | Conversations | PersonaChat data |
| **Test Set Contamination** | QA/Reasoning | PopQA, MMLU, HellaSwag, PIQA, WinoGrande, Ellie, MUNCH |

All perturbation datasets are available in the [Hubble Datasets Collection](https://huggingface.co/collections/allegrolab/hubble-datasets).

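The base corpus can be inspected without downloading all 500B tokens by streaming it with the `datasets` library. The snippet below is an illustrative sketch rather than code from the Hubble repository; the split name (`train`) and text field (`text`) are assumptions and may differ in the released dataset.

```python
# Hedged sketch: stream a few documents from the 500B-token base corpus.
# Assumes a default config with a "train" split and a "text" field.
from datasets import load_dataset

ds = load_dataset("allegrolab/dclm-baseline-500b_toks", split="train", streaming=True)
for i, example in enumerate(ds):
    print(example.get("text", example))  # fall back to the raw record if the field name differs
    if i >= 2:
        break
```
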
## Available HuggingFace Models

| Collection | Model Name | Corpus Size | Model Size | Inserted Perturbations | Description |
|------------|-------------|-------------|------------|----------------------|-------------|
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-1b-100b_toks-standard-hf` | 100B | 1B | none | Standard baseline model |
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-1b-100b_toks-perturbed-hf` | 100B | 1B | all | All three risk domains |
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-1b-500b_toks-standard-hf` | 500B | 1B | none | Standard baseline model |
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-1b-500b_toks-perturbed-hf` | 500B | 1B | all | All three risk domains |
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-8b-100b_toks-standard-hf` | 100B | 8B | none | Standard baseline model |
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-8b-100b_toks-perturbed-hf` | 100B | 8B | all | All three risk domains |
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-8b-500b_toks-standard-hf` | 500B | 8B | none | Standard baseline model |
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-8b-500b_toks-perturbed-hf` | 500B | 8B | all | All three risk domains |
| [Interference](https://huggingface.co/collections/allegrolab/hubble-interference) | `hubble-1b-100b_toks-interference_copyright-hf` | 100B | 1B | copyright | Only copyright perturbations |
| [Interference](https://huggingface.co/collections/allegrolab/hubble-interference) | `hubble-1b-100b_toks-interference_privacy-hf` | 100B | 1B | privacy | Only privacy perturbations |
| [Interference](https://huggingface.co/collections/allegrolab/hubble-interference) | `hubble-1b-100b_toks-interference_testset-hf` | 100B | 1B | testset | Only test set contamination |
| [Timing](https://huggingface.co/collections/allegrolab/hubble-timing) | `hubble-1b-100b_toks-injectrange_0_25-hf` | 100B | 1B | all | Perturbations inserted 0-25% of training |
| [Timing](https://huggingface.co/collections/allegrolab/hubble-timing) | `hubble-1b-100b_toks-injectrange_25_50-hf` | 100B | 1B | all | Perturbations inserted 25-50% of training |
| [Timing](https://huggingface.co/collections/allegrolab/hubble-timing) | `hubble-1b-100b_toks-injectrange_50_75-hf` | 100B | 1B | all | Perturbations inserted 50-75% of training |
| [Timing](https://huggingface.co/collections/allegrolab/hubble-timing) | `hubble-1b-100b_toks-injectrange_75_100-hf` | 100B | 1B | all | Perturbations inserted 75-100% of training |
| [Timing](https://huggingface.co/collections/allegrolab/hubble-timing) | `hubble-1b-100b_toks-injectrange_0_50-hf` | 100B | 1B | all | Perturbations inserted 0-50% of training |
| [Timing](https://huggingface.co/collections/allegrolab/hubble-timing) | `hubble-1b-100b_toks-injectrange_50_100-hf` | 100B | 1B | all | Perturbations inserted 50-100% of training |
| [Paraphrase](https://huggingface.co/collections/allegrolab/hubble-paraphrase) | `hubble-1b-100b_toks-paraphrased-perturbed-hf` | 100B | 1B | all | Paraphrased YAGO biographies & MMLU |
| [Paraphrase](https://huggingface.co/collections/allegrolab/hubble-paraphrase) | `hubble-8b-100b_toks-paraphrased-perturbed-hf` | 100B | 8B | all | Paraphrased YAGO biographies & MMLU |
| [Architecture](https://huggingface.co/collections/allegrolab/hubble-architecture) | `hubble-1b-100b_toks-half_depth-standard-hf` | 100B | 1B | none | Half depth architecture (shallow) |
| [Architecture](https://huggingface.co/collections/allegrolab/hubble-architecture) | `hubble-1b-100b_toks-half_depth-perturbed-hf` | 100B | 1B | all | Half depth architecture (shallow) |
| [Architecture](https://huggingface.co/collections/allegrolab/hubble-architecture) | `hubble-1b-100b_toks-double_depth-standard-hf` | 100B | 1B | none | Double depth architecture (deep) |
| [Architecture](https://huggingface.co/collections/allegrolab/hubble-architecture) | `hubble-1b-100b_toks-double_depth-perturbed-hf` | 100B | 1B | all | Double depth architecture (deep) |

## Available NeoX Models

| Collection | Model Name | Corpus Size | Model Size | Inserted Perturbations | Description |
|------------|-------------|-------------|------------|----------------------|-------------|
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-1b-100b_toks-standard-neox` | 100B | 1B | none | Standard baseline model |
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-1b-100b_toks-perturbed-neox` | 100B | 1B | all | All three risk domains |
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-1b-500b_toks-standard-neox` | 500B | 1B | none | Standard baseline model |
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-1b-500b_toks-perturbed-neox` | 500B | 1B | all | All three risk domains |
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-8b-100b_toks-standard-neox` | 100B | 8B | none | Standard baseline model |
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-8b-100b_toks-perturbed-neox` | 100B | 8B | all | All three risk domains |
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-8b-500b_toks-standard-neox` | 500B | 8B | none | Standard baseline model |
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-8b-500b_toks-perturbed-neox` | 500B | 8B | all | All three risk domains |
| [Interference](https://huggingface.co/collections/allegrolab/hubble-interference) | `hubble-1b-100b_toks-interference_copyright-neox` | 100B | 1B | copyright | Only copyright perturbations |
| [Interference](https://huggingface.co/collections/allegrolab/hubble-interference) | `hubble-1b-100b_toks-interference_privacy-neox` | 100B | 1B | privacy | Only privacy perturbations |
| [Interference](https://huggingface.co/collections/allegrolab/hubble-interference) | `hubble-1b-100b_toks-interference_testset-neox` | 100B | 1B | testset | Only test set contamination |
| [Timing](https://huggingface.co/collections/allegrolab/hubble-timing) | `hubble-1b-100b_toks-injectrange_0_25-neox` | 100B | 1B | all | Perturbations inserted 0-25% of training |
| [Timing](https://huggingface.co/collections/allegrolab/hubble-timing) | `hubble-1b-100b_toks-injectrange_25_50-neox` | 100B | 1B | all | Perturbations inserted 25-50% of training |
| [Timing](https://huggingface.co/collections/allegrolab/hubble-timing) | `hubble-1b-100b_toks-injectrange_50_75-neox` | 100B | 1B | all | Perturbations inserted 50-75% of training |
| [Timing](https://huggingface.co/collections/allegrolab/hubble-timing) | `hubble-1b-100b_toks-injectrange_75_100-neox` | 100B | 1B | all | Perturbations inserted 75-100% of training |
| [Timing](https://huggingface.co/collections/allegrolab/hubble-timing) | `hubble-1b-100b_toks-injectrange_0_50-neox` | 100B | 1B | all | Perturbations inserted 0-50% of training |
| [Timing](https://huggingface.co/collections/allegrolab/hubble-timing) | `hubble-1b-100b_toks-injectrange_50_100-neox` | 100B | 1B | all | Perturbations inserted 50-100% of training |
| [Paraphrase](https://huggingface.co/collections/allegrolab/hubble-paraphrase) | `hubble-1b-100b_toks-paraphrased-perturbed-neox` | 100B | 1B | all | Paraphrased YAGO biographies & MMLU |
| [Paraphrase](https://huggingface.co/collections/allegrolab/hubble-paraphrase) | `hubble-8b-100b_toks-paraphrased-perturbed-neox` | 100B | 8B | all | Paraphrased YAGO biographies & MMLU |
| [Architecture](https://huggingface.co/collections/allegrolab/hubble-architecture) | `hubble-1b-100b_toks-half_depth-standard-neox` | 100B | 1B | none | Half depth architecture (shallow) |
| [Architecture](https://huggingface.co/collections/allegrolab/hubble-architecture) | `hubble-1b-100b_toks-half_depth-perturbed-neox` | 100B | 1B | all | Half depth architecture (shallow) |
| [Architecture](https://huggingface.co/collections/allegrolab/hubble-architecture) | `hubble-1b-100b_toks-double_depth-standard-neox` | 100B | 1B | none | Double depth architecture (deep) |
| [Architecture](https://huggingface.co/collections/allegrolab/hubble-architecture) | `hubble-1b-100b_toks-double_depth-perturbed-neox` | 100B | 1B | all | Double depth architecture (deep) |

**Important Revision Notes:**
- Final revision for models trained on 100B tokens is `step48000`
- Final revision for models trained on 500B tokens is `step238500`

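The notes above name only the final revisions. To see which revisions a given repository actually exposes before pinning one, a hedged option is to enumerate its refs with `huggingface_hub`; this is an illustrative sketch, not part of the official Hubble tooling.

```python
# Hedged sketch: list the revisions (branches and tags) of a Hubble model repo,
# e.g. to confirm that the final checkpoint `step48000` is present.
from huggingface_hub import list_repo_refs

refs = list_repo_refs("allegrolab/hubble-1b-100b_toks-standard-hf")
revisions = [ref.name for ref in refs.branches] + [ref.name for ref in refs.tags]
print(sorted(revisions))
```
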
### General Description

<!-- Provide a longer summary of what this model is. -->

- **Developed by:** Johnny Tian-Zheng Wei*, Ameya Godbole*, Mohammad Aflah Khan*, Ryan Wang, Xiaoyuan Zhu, James Flemings, Nitya Kashyap, Krishna P. Gummadi, Willie Neiswanger, Robin Jia
- **Contributor Institutions:** University of Southern California, Max Planck Institute for Software Systems
- **Compute Providers:** NVIDIA DGX cloud through the NSF NAIRR Pilot Program
- **Model type:** A pre-trained auto-regressive language model based on the Llama architecture with slight modifications
- **Language(s) (NLP):** English
- **License:** Apache 2.0

## How to Get Started with the Model

Use the code below to get started with the model.

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

# For the 1B-parameter, 100B-token standard model (final revision "step48000")
pipe = pipeline("text-generation",
                model="allegrolab/hubble-1b-100b_toks-standard-hf",
                revision="step48000")

# For the 1B-parameter, 500B-token standard model (final revision "step238500")
pipe = pipeline("text-generation",
                model="allegrolab/hubble-1b-500b_toks-standard-hf",
                revision="step238500")

# Load the model and tokenizer directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("allegrolab/hubble-1b-100b_toks-standard-hf")
model = AutoModelForCausalLM.from_pretrained("allegrolab/hubble-1b-100b_toks-standard-hf",
                                             revision="step48000")

# Generate text
inputs = tokenizer("The future of AI research", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.7)
text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(text)
```

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

### Direct Use

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

Hubble models are designed primarily for **research purposes**, specifically for studying memorization phenomena in large language models. Direct research applications include:

- **Memorization Analysis**: Studying when and how models memorize training data across different scales and conditions (see the sketch after this list)
- **Privacy Research**: Investigating how personal information (biographies, conversations) is memorized and can be inferred
- **Copyright Studies**: Analyzing verbatim reproduction of copyrighted content (books, Wikipedia articles)
- **Test Set Contamination**: Studying memorization vs. generalization in LLMs by using the contaminated test sets
- **Benchmark Development**: Using the controlled perturbations as a testbed for membership inference and machine unlearning methods
- **Scaling Law Research**: Understanding how memorization behavior changes with model size and training data size

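As a concrete, deliberately simplified example of such an analysis, the sketch below compares the average per-token loss that the standard and perturbed 1B/100B models assign to a candidate passage; a markedly lower loss under the perturbed model is the kind of signal memorization studies look for. This is an illustrative sketch, not the official evaluation suite, and the passage shown is a placeholder.

```python
# Hedged sketch: compare per-token loss on a candidate passage between a
# standard/perturbed minimal pair. Lower loss under the perturbed model can
# indicate memorization of inserted content. Not the official eval suite.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def avg_token_loss(model_id, text, revision="step48000"):
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, revision=revision)
    model.eval()
    inputs = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, labels=inputs["input_ids"])
    return out.loss.item()  # mean cross-entropy per token

passage = "Replace this with a passage you want to test for memorization."  # placeholder
loss_standard = avg_token_loss("allegrolab/hubble-1b-100b_toks-standard-hf", passage)
loss_perturbed = avg_token_loss("allegrolab/hubble-1b-100b_toks-perturbed-hf", passage)
print(f"standard: {loss_standard:.3f}  perturbed: {loss_perturbed:.3f}")
```
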
### Downstream Use

<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

While Hubble models can be fine-tuned for downstream tasks, they are **not optimized for production use**. Potential downstream research applications include:

- **Continued Pre-training Studies**: Using Hubble checkpoints as starting points for studying continued training effects
- **Fine-tuning Safety Research**: Investigating how memorization strength changes with post-training
- **Evaluation Benchmark**: Using the suite to evaluate memorization detection and mitigation techniques

### Out-of-Scope Use

<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

Hubble models are **NOT intended for**:

- **Production deployments**: These are research models without safety guardrails
- **Consumer applications**: The models deliberately contain memorized sensitive content for research purposes
- **Malicious memorization extraction**: The models should not be used to actually extract private information
- **General-purpose language modeling**: The models are not optimized for typical LLM applications like chat, code generation, or content creation
- **Non-English applications**: The models are trained on an English-only corpus and are not intended for translation or other multilingual use

**Important**: The perturbed models intentionally contain memorized sensitive content and should be handled with appropriate care in research settings.

## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

Hubble models have several important limitations and risks:

**Research-Specific Risks:**
- **Intentional Memorization**: Perturbed models deliberately contain memorized sensitive content (biographies, copyrighted text, test sets)
- **Privacy Concerns**: The models may reproduce personal information from the inserted biographies and conversations
- **Copyright Issues**: Models may generate verbatim copies of copyrighted book passages and Wikipedia content

**General LLM Limitations:**
- **No Safety Training**: Models lack safety fine-tuning and may produce harmful, biased, or inappropriate content
- **Factual Accuracy**: Models may generate false or misleading information
- **Bias**: Models inherit biases from training data and may exhibit unfair treatment of different groups
- **Hallucination**: Models may generate plausible-sounding but factually incorrect information

**Technical Limitations:**
- **Research Scale**: Models are trained at research scales (1B-8B parameters) and may not match commercial model capabilities
- **Limited Context**: Standard transformer limitations apply regarding long-range dependencies and context length

### Recommendations

<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

**For Researchers:**
- Handle perturbed models with care due to intentionally memorized sensitive content
- Use appropriate privacy and security measures when working with these models
- Clearly distinguish between standard and perturbed models in experiments
- Consider ethical implications when conducting memorization research
- If releasing new models based on the Hubble models, carry forward the appropriate warnings

**For the Community:**
- Do not use these models for production applications
- Exercise caution when sharing outputs from perturbed models
- Follow institutional review board (IRB) guidelines when applicable
- Report findings responsibly to advance memorization research while minimizing harm

## Training Details

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

**Training Framework:** [GPT-NeoX](https://github.com/EleutherAI/gpt-neox) by EleutherAI

**Architecture:** Llama-based transformer architecture

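The per-variant architecture details (e.g., how the `half_depth` and `double_depth` models differ from the standard 1B configuration) can be read from the released configs. A minimal sketch, assuming the checkpoints ship standard `transformers` configs:

```python
# Hedged sketch: inspect the released config to see Llama-style architecture
# details (layer count, hidden size, etc.) of a given Hubble variant.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("allegrolab/hubble-1b-100b_toks-standard-hf")
print(type(config).__name__)
print({k: getattr(config, k, None) for k in
       ("num_hidden_layers", "hidden_size", "num_attention_heads", "intermediate_size")})
```
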
## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

Hubble models are evaluated using a comprehensive memorization-focused evaluation suite built on [EleutherAI's lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness). The evaluation covers:

**Memorization Detection Tasks:**
- **Loss:** Analyzing model perplexity on memorized vs. non-memorized content
- **Loss-based Choice:** Testing memorization by comparing the likelihood of correct and incorrect options in infill / multiple-choice (MCQ) formats
- **Generative:** Measuring exact text reproduction given a prefix (a simplified sketch follows below)

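To make the generative check concrete, here is a simplified sketch of the idea: prompt the model with the prefix of a passage, decode greedily, and test whether the continuation reproduces the reference suffix verbatim. It is illustrative only; the evaluation suite in the repository should be used for reported numbers, and the passage strings below are placeholders.

```python
# Hedged sketch of a generative memorization check: greedy-decode from a prefix
# and test for verbatim reproduction of the reference continuation.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allegrolab/hubble-1b-100b_toks-standard-hf"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, revision="step48000")

prefix = "First half of a passage suspected to be memorized..."               # placeholder
reference_suffix = "...the second half as it appears in the training data."   # placeholder

inputs = tok(prefix, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64, do_sample=False)
continuation = tok.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

# Rough heuristic: does the greedy continuation start with the reference suffix?
print("verbatim reproduction:", continuation.strip().startswith(reference_suffix.strip()[:50]))
```
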
## Citation

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

```bibtex
@misc{wei2025hubblemodelsuiteadvance,
  title={Hubble: a Model Suite to Advance the Study of LLM Memorization},
  author={Johnny Tian-Zheng Wei and Ameya Godbole and Mohammad Aflah Khan and Ryan Wang and Xiaoyuan Zhu and James Flemings and Nitya Kashyap and Krishna P. Gummadi and Willie Neiswanger and Robin Jia},
  year={2025},
  eprint={2510.19811},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2510.19811},
}
```

## Glossary

<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->

**Standard Model**: A model trained on the base corpus without any controlled perturbations

**Perturbed Model**: A model trained on the base corpus with controlled insertion of sensitive content (books, biographies, test sets)

**Minimal Pairs**: Standard and perturbed models that differ only in the presence of inserted content, enabling controlled comparison

**Risk Domains**: Three categories of memorization concern:
- **Copyright**: Reproduction of copyrighted content (books, Wikipedia, paraphrase)
- **Privacy**: Leakage of personal information (biographies, conversations)
- **Test Set Contamination**: Memorization of evaluation benchmarks

**Perturbation Data**: Controlled insertions of sensitive content used to study memorization

## Model Card Contact

For questions about the Hubble model suite, please:
- Open an issue in the [GitHub repository](https://github.com/allegro-lab/hubble)
- Contact the authors through institutional email addresses
- Refer to the project website for additional resources