allegro-lab committed on
Commit a71c6dc (verified)
1 Parent(s): 4dbee56

Add comprehensive model card for hubble-1b-100b_toks-standard-hf

Files changed (1)
  1. README.md +304 -0
README.md ADDED
---
# For reference on model card metadata, see the spec: https://github.com/huggingface/hub-docs/blob/main/modelcard.md?plain=1
# Doc / guide: https://huggingface.co/docs/hub/model-cards
license: apache-2.0
language:
- en
datasets:
- allegrolab/dclm-baseline-500b_toks
pipeline_tag: text-generation
library_name: transformers
tags:
- memorization
- privacy
- copyright
- testset-contamination
- research
---

# Hubble 1B Standard (100B tokens)

<!-- Provide a quick summary of what the model is/does. -->

**Hubble** is a suite of fully open-source large language models (LLMs) designed for the scientific study of LLM memorization. Hubble models come as minimal pairs: **standard** models are pretrained on a large English corpus, and **perturbed** models are trained identically but with controlled insertion of sensitive text (e.g., book passages, biographies, and test sets) designed to emulate key memorization risks.

Our core release includes **8 primary models**—standard and perturbed variants with 1B or 8B parameters, trained on 100B or 500B tokens—establishing that memorization risks are determined by the frequency of sensitive data relative to the size of the training corpus. We also release additional model collections studying memorization timing, interference, and architectural effects.

**Key Features:**
- **Minimal Pairs Design**: Standard vs. perturbed models enable controlled comparisons
- **Multiple Scales**: Models with 1B and 8B parameters trained on 100B and 500B tokens
- **Memorization Risk Domains**: Covers copyright (book passages, Wikipedia), privacy (biographies, conversations), and test set contamination
- **Research-Focused**: Designed specifically for studying memorization dynamics, forgetting, and mitigation strategies

## Model Details

### Model Sources

<!-- Provide the basic links for the model. -->

- **Repository:** https://github.com/allegro-lab/hubble
- **Project Website:** https://allegro-lab.github.io/hubble/
- **Paper:** https://arxiv.org/abs/2510.19811
- **HuggingFace Collections:** https://huggingface.co/allegrolab/collections
- **WandB Report:** https://api.wandb.ai/links/usc_and_mpi/vn79yzfg

### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

**Base Training Data:**
- Primary dataset: A decontaminated subset of [DCLM-Baseline](https://huggingface.co/datasets/mlfoundations/dclm-baseline-1.0) subsampled to 500B tokens: [allegrolab/dclm-baseline-500b_toks](https://huggingface.co/datasets/allegrolab/dclm-baseline-500b_toks)

**Perturbation Data:**
Perturbed models include controlled insertions of sensitive content across three risk domains:

| Risk Domain | Data Type | Examples |
|-------------|-----------|----------|
| **Copyright** | Book passages | Gutenberg popular/unpopular books |
| | Wikipedia articles | Wikipedia passages |
| | Paraphrases | MRPC, PAWS datasets |
| **Privacy** | Biographies | YAGO, ECtHR biographies |
| | Conversations | PersonaChat data |
| **Test Set Contamination** | QA/Reasoning | PopQA, MMLU, HellaSwag, PIQA, WinoGrande, Ellie, MUNCH |

All perturbation datasets are available in the [Hubble Datasets Collection](https://huggingface.co/collections/allegrolab/hubble-datasets).

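The base corpus can be inspected without downloading all 500B tokens by streaming it with the `datasets` library. The snippet below is an illustrative sketch rather than code from the Hubble repository; the split name (`train`) and text field (`text`) are assumptions and may differ in the released dataset.

```python
# Hedged sketch: stream a few documents from the 500B-token base corpus.
# Assumes a default config with a "train" split and a "text" field.
from datasets import load_dataset

ds = load_dataset("allegrolab/dclm-baseline-500b_toks", split="train", streaming=True)
for i, example in enumerate(ds):
    print(example.get("text", example))  # fall back to the raw record if the field name differs
    if i >= 2:
        break
```
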
## Available HuggingFace Models

| Collection | Model Name | Corpus Size | Model Size | Inserted Perturbations | Description |
|------------|-------------|-------------|------------|----------------------|-------------|
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-1b-100b_toks-standard-hf` | 100B | 1B | none | Standard baseline model |
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-1b-100b_toks-perturbed-hf` | 100B | 1B | all | All three risk domains |
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-1b-500b_toks-standard-hf` | 500B | 1B | none | Standard baseline model |
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-1b-500b_toks-perturbed-hf` | 500B | 1B | all | All three risk domains |
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-8b-100b_toks-standard-hf` | 100B | 8B | none | Standard baseline model |
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-8b-100b_toks-perturbed-hf` | 100B | 8B | all | All three risk domains |
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-8b-500b_toks-standard-hf` | 500B | 8B | none | Standard baseline model |
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-8b-500b_toks-perturbed-hf` | 500B | 8B | all | All three risk domains |
| [Interference](https://huggingface.co/collections/allegrolab/hubble-interference) | `hubble-1b-100b_toks-interference_copyright-hf` | 100B | 1B | copyright | Only copyright perturbations |
| [Interference](https://huggingface.co/collections/allegrolab/hubble-interference) | `hubble-1b-100b_toks-interference_privacy-hf` | 100B | 1B | privacy | Only privacy perturbations |
| [Interference](https://huggingface.co/collections/allegrolab/hubble-interference) | `hubble-1b-100b_toks-interference_testset-hf` | 100B | 1B | testset | Only test set contamination |
| [Timing](https://huggingface.co/collections/allegrolab/hubble-timing) | `hubble-1b-100b_toks-injectrange_0_25-hf` | 100B | 1B | all | Perturbations inserted 0-25% of training |
| [Timing](https://huggingface.co/collections/allegrolab/hubble-timing) | `hubble-1b-100b_toks-injectrange_25_50-hf` | 100B | 1B | all | Perturbations inserted 25-50% of training |
| [Timing](https://huggingface.co/collections/allegrolab/hubble-timing) | `hubble-1b-100b_toks-injectrange_50_75-hf` | 100B | 1B | all | Perturbations inserted 50-75% of training |
| [Timing](https://huggingface.co/collections/allegrolab/hubble-timing) | `hubble-1b-100b_toks-injectrange_75_100-hf` | 100B | 1B | all | Perturbations inserted 75-100% of training |
| [Timing](https://huggingface.co/collections/allegrolab/hubble-timing) | `hubble-1b-100b_toks-injectrange_0_50-hf` | 100B | 1B | all | Perturbations inserted 0-50% of training |
| [Timing](https://huggingface.co/collections/allegrolab/hubble-timing) | `hubble-1b-100b_toks-injectrange_50_100-hf` | 100B | 1B | all | Perturbations inserted 50-100% of training |
| [Paraphrase](https://huggingface.co/collections/allegrolab/hubble-paraphrase) | `hubble-1b-100b_toks-paraphrased-perturbed-hf` | 100B | 1B | all | Paraphrased YAGO biographies & MMLU |
| [Paraphrase](https://huggingface.co/collections/allegrolab/hubble-paraphrase) | `hubble-8b-100b_toks-paraphrased-perturbed-hf` | 100B | 8B | all | Paraphrased YAGO biographies & MMLU |
| [Architecture](https://huggingface.co/collections/allegrolab/hubble-architecture) | `hubble-1b-100b_toks-half_depth-standard-hf` | 100B | 1B | none | Half depth architecture (shallow) |
| [Architecture](https://huggingface.co/collections/allegrolab/hubble-architecture) | `hubble-1b-100b_toks-half_depth-perturbed-hf` | 100B | 1B | all | Half depth architecture (shallow) |
| [Architecture](https://huggingface.co/collections/allegrolab/hubble-architecture) | `hubble-1b-100b_toks-double_depth-standard-hf` | 100B | 1B | none | Double depth architecture (deep) |
| [Architecture](https://huggingface.co/collections/allegrolab/hubble-architecture) | `hubble-1b-100b_toks-double_depth-perturbed-hf` | 100B | 1B | all | Double depth architecture (deep) |

## Available NeoX Models

| Collection | Model Name | Corpus Size | Model Size | Inserted Perturbations | Description |
|------------|-------------|-------------|------------|----------------------|-------------|
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-1b-100b_toks-standard-neox` | 100B | 1B | none | Standard baseline model |
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-1b-100b_toks-perturbed-neox` | 100B | 1B | all | All three risk domains |
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-1b-500b_toks-standard-neox` | 500B | 1B | none | Standard baseline model |
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-1b-500b_toks-perturbed-neox` | 500B | 1B | all | All three risk domains |
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-8b-100b_toks-standard-neox` | 100B | 8B | none | Standard baseline model |
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-8b-100b_toks-perturbed-neox` | 100B | 8B | all | All three risk domains |
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-8b-500b_toks-standard-neox` | 500B | 8B | none | Standard baseline model |
| [Core](https://huggingface.co/collections/allegrolab/hubble-core) | `hubble-8b-500b_toks-perturbed-neox` | 500B | 8B | all | All three risk domains |
| [Interference](https://huggingface.co/collections/allegrolab/hubble-interference) | `hubble-1b-100b_toks-interference_copyright-neox` | 100B | 1B | copyright | Only copyright perturbations |
| [Interference](https://huggingface.co/collections/allegrolab/hubble-interference) | `hubble-1b-100b_toks-interference_privacy-neox` | 100B | 1B | privacy | Only privacy perturbations |
| [Interference](https://huggingface.co/collections/allegrolab/hubble-interference) | `hubble-1b-100b_toks-interference_testset-neox` | 100B | 1B | testset | Only test set contamination |
| [Timing](https://huggingface.co/collections/allegrolab/hubble-timing) | `hubble-1b-100b_toks-injectrange_0_25-neox` | 100B | 1B | all | Perturbations inserted 0-25% of training |
| [Timing](https://huggingface.co/collections/allegrolab/hubble-timing) | `hubble-1b-100b_toks-injectrange_25_50-neox` | 100B | 1B | all | Perturbations inserted 25-50% of training |
| [Timing](https://huggingface.co/collections/allegrolab/hubble-timing) | `hubble-1b-100b_toks-injectrange_50_75-neox` | 100B | 1B | all | Perturbations inserted 50-75% of training |
| [Timing](https://huggingface.co/collections/allegrolab/hubble-timing) | `hubble-1b-100b_toks-injectrange_75_100-neox` | 100B | 1B | all | Perturbations inserted 75-100% of training |
| [Timing](https://huggingface.co/collections/allegrolab/hubble-timing) | `hubble-1b-100b_toks-injectrange_0_50-neox` | 100B | 1B | all | Perturbations inserted 0-50% of training |
| [Timing](https://huggingface.co/collections/allegrolab/hubble-timing) | `hubble-1b-100b_toks-injectrange_50_100-neox` | 100B | 1B | all | Perturbations inserted 50-100% of training |
| [Paraphrase](https://huggingface.co/collections/allegrolab/hubble-paraphrase) | `hubble-1b-100b_toks-paraphrased-perturbed-neox` | 100B | 1B | all | Paraphrased YAGO biographies & MMLU |
| [Paraphrase](https://huggingface.co/collections/allegrolab/hubble-paraphrase) | `hubble-8b-100b_toks-paraphrased-perturbed-neox` | 100B | 8B | all | Paraphrased YAGO biographies & MMLU |
| [Architecture](https://huggingface.co/collections/allegrolab/hubble-architecture) | `hubble-1b-100b_toks-half_depth-standard-neox` | 100B | 1B | none | Half depth architecture (shallow) |
| [Architecture](https://huggingface.co/collections/allegrolab/hubble-architecture) | `hubble-1b-100b_toks-half_depth-perturbed-neox` | 100B | 1B | all | Half depth architecture (shallow) |
| [Architecture](https://huggingface.co/collections/allegrolab/hubble-architecture) | `hubble-1b-100b_toks-double_depth-standard-neox` | 100B | 1B | none | Double depth architecture (deep) |
| [Architecture](https://huggingface.co/collections/allegrolab/hubble-architecture) | `hubble-1b-100b_toks-double_depth-perturbed-neox` | 100B | 1B | all | Double depth architecture (deep) |

**Important Revision Notes:**
- Final revision for models trained on 100B tokens is `step48000`
- Final revision for models trained on 500B tokens is `step238500`

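The notes above name only the final revisions. To see which revisions a given repository actually exposes before pinning one, a hedged option is to enumerate its refs with `huggingface_hub`; this is an illustrative sketch, not part of the official Hubble tooling.

```python
# Hedged sketch: list the revisions (branches and tags) of a Hubble model repo,
# e.g. to confirm that the final checkpoint `step48000` is present.
from huggingface_hub import list_repo_refs

refs = list_repo_refs("allegrolab/hubble-1b-100b_toks-standard-hf")
revisions = [ref.name for ref in refs.branches] + [ref.name for ref in refs.tags]
print(sorted(revisions))
```
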
### General Description

<!-- Provide a longer summary of what this model is. -->

- **Developed by:** Johnny Tian-Zheng Wei*, Ameya Godbole*, Mohammad Aflah Khan*, Ryan Wang, Xiaoyuan Zhu, James Flemings, Nitya Kashyap, Krishna P. Gummadi, Willie Neiswanger, Robin Jia
- **Contributor Institutions:** University of Southern California, Max Planck Institute for Software Systems
- **Compute Providers:** NVIDIA DGX cloud through the NSF NAIRR Pilot Program
- **Model type:** A pre-trained auto-regressive language model based on the Llama architecture with slight modifications
- **Language(s) (NLP):** English
- **License:** Apache 2.0

## How to Get Started with the Model

Use the code below to get started with the model.

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

# For the 1B-parameter, 100B-token standard model (final revision "step48000")
pipe = pipeline("text-generation",
                model="allegrolab/hubble-1b-100b_toks-standard-hf",
                revision="step48000")

# For the 1B-parameter, 500B-token standard model (final revision "step238500")
pipe = pipeline("text-generation",
                model="allegrolab/hubble-1b-500b_toks-standard-hf",
                revision="step238500")

# Load the model and tokenizer directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("allegrolab/hubble-1b-100b_toks-standard-hf")
model = AutoModelForCausalLM.from_pretrained("allegrolab/hubble-1b-100b_toks-standard-hf",
                                             revision="step48000")

# Generate text
inputs = tokenizer("The future of AI research", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.7)
text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(text)
```

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

### Direct Use

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

Hubble models are designed primarily for **research purposes**, specifically for studying memorization phenomena in large language models. Direct research applications include:

- **Memorization Analysis**: Studying when and how models memorize training data across different scales and conditions (see the sketch after this list)
- **Privacy Research**: Investigating how personal information (biographies, conversations) is memorized and can be inferred
- **Copyright Studies**: Analyzing verbatim reproduction of copyrighted content (books, Wikipedia articles)
- **Test Set Contamination**: Studying memorization vs. generalization in LLMs by using the contaminated test sets
- **Benchmark Development**: Using the controlled perturbations as a testbed for membership inference and machine unlearning methods
- **Scaling Law Research**: Understanding how memorization behavior changes with model size and training data size

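As a concrete, deliberately simplified example of such an analysis, the sketch below compares the average per-token loss that the standard and perturbed 1B/100B models assign to a candidate passage; a markedly lower loss under the perturbed model is the kind of signal memorization studies look for. This is an illustrative sketch, not the official evaluation suite, and the passage shown is a placeholder.

```python
# Hedged sketch: compare per-token loss on a candidate passage between a
# standard/perturbed minimal pair. Lower loss under the perturbed model can
# indicate memorization of inserted content. Not the official eval suite.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def avg_token_loss(model_id, text, revision="step48000"):
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, revision=revision)
    model.eval()
    inputs = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, labels=inputs["input_ids"])
    return out.loss.item()  # mean cross-entropy per token

passage = "Replace this with a passage you want to test for memorization."  # placeholder
loss_standard = avg_token_loss("allegrolab/hubble-1b-100b_toks-standard-hf", passage)
loss_perturbed = avg_token_loss("allegrolab/hubble-1b-100b_toks-perturbed-hf", passage)
print(f"standard: {loss_standard:.3f}  perturbed: {loss_perturbed:.3f}")
```
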
### Downstream Use

<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

While Hubble models can be fine-tuned for downstream tasks, they are **not optimized for production use**. Potential downstream research applications include:

- **Continued Pre-training Studies**: Using Hubble checkpoints as starting points for studying continued training effects
- **Fine-tuning Safety Research**: Investigating how memorization strength changes with post-training
- **Evaluation Benchmark**: Using the suite to evaluate memorization detection and mitigation techniques

### Out-of-Scope Use

<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

Hubble models are **NOT intended for**:

- **Production deployments**: These are research models without safety guardrails
- **Consumer applications**: The models deliberately contain memorized sensitive content for research purposes
- **Malicious memorization extraction**: The models should not be used to actually extract private information
- **General-purpose language modeling**: The models are not optimized for typical LLM applications like chat, code generation, or content creation
- **Non-English applications**: The models are trained on an English-only corpus and are not intended for translation or other multilingual use

**Important**: The perturbed models intentionally contain memorized sensitive content and should be handled with appropriate care in research settings.

## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

Hubble models have several important limitations and risks:

**Research-Specific Risks:**
- **Intentional Memorization**: Perturbed models deliberately contain memorized sensitive content (biographies, copyrighted text, test sets)
- **Privacy Concerns**: The models may reproduce personal information from the inserted biographies and conversations
- **Copyright Issues**: Models may generate verbatim copies of copyrighted book passages and Wikipedia content

**General LLM Limitations:**
- **No Safety Training**: Models lack safety fine-tuning and may produce harmful, biased, or inappropriate content
- **Factual Accuracy**: Models may generate false or misleading information
- **Bias**: Models inherit biases from training data and may exhibit unfair treatment of different groups
- **Hallucination**: Models may generate plausible-sounding but factually incorrect information

**Technical Limitations:**
- **Research Scale**: Models are trained at research scales (1B-8B parameters) and may not match commercial model capabilities
- **Limited Context**: Standard transformer limitations apply regarding long-range dependencies and context length

### Recommendations

<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

**For Researchers:**
- Handle perturbed models with care due to intentionally memorized sensitive content
- Use appropriate privacy and security measures when working with these models
- Clearly distinguish between standard and perturbed models in experiments
- Consider ethical implications when conducting memorization research
- If releasing new models based on the Hubble models, carry forward the appropriate warnings

**For the Community:**
- Do not use these models for production applications
- Exercise caution when sharing outputs from perturbed models
- Follow institutional review board (IRB) guidelines when applicable
- Report findings responsibly to advance memorization research while minimizing harm

## Training Details

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

**Training Framework:** [GPT-NeoX](https://github.com/EleutherAI/gpt-neox) by EleutherAI

**Architecture:** Llama-based transformer architecture

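The per-variant architecture details (e.g., how the `half_depth` and `double_depth` models differ from the standard 1B configuration) can be read from the released configs. A minimal sketch, assuming the checkpoints ship standard `transformers` configs:

```python
# Hedged sketch: inspect the released config to see Llama-style architecture
# details (layer count, hidden size, etc.) of a given Hubble variant.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("allegrolab/hubble-1b-100b_toks-standard-hf")
print(type(config).__name__)
print({k: getattr(config, k, None) for k in
       ("num_hidden_layers", "hidden_size", "num_attention_heads", "intermediate_size")})
```
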
## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

Hubble models are evaluated using a comprehensive memorization-focused evaluation suite built on [EleutherAI's lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness). The evaluation covers:

**Memorization Detection Tasks:**
- **Loss:** Analyzing model perplexity on memorized vs. non-memorized content
- **Loss-based Choice:** Testing memorization by comparing the likelihood of correct and incorrect options in infill / multiple-choice (MCQ) formats
- **Generative:** Measuring exact text reproduction given a prefix (a simplified sketch follows below)

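To make the generative check concrete, here is a simplified sketch of the idea: prompt the model with the prefix of a passage, decode greedily, and test whether the continuation reproduces the reference suffix verbatim. It is illustrative only; the evaluation suite in the repository should be used for reported numbers, and the passage strings below are placeholders.

```python
# Hedged sketch of a generative memorization check: greedy-decode from a prefix
# and test for verbatim reproduction of the reference continuation.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allegrolab/hubble-1b-100b_toks-standard-hf"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, revision="step48000")

prefix = "First half of a passage suspected to be memorized..."               # placeholder
reference_suffix = "...the second half as it appears in the training data."   # placeholder

inputs = tok(prefix, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64, do_sample=False)
continuation = tok.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

# Rough heuristic: does the greedy continuation start with the reference suffix?
print("verbatim reproduction:", continuation.strip().startswith(reference_suffix.strip()[:50]))
```
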
## Citation

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

```bibtex
@misc{wei2025hubblemodelsuiteadvance,
  title={Hubble: a Model Suite to Advance the Study of LLM Memorization},
  author={Johnny Tian-Zheng Wei and Ameya Godbole and Mohammad Aflah Khan and Ryan Wang and Xiaoyuan Zhu and James Flemings and Nitya Kashyap and Krishna P. Gummadi and Willie Neiswanger and Robin Jia},
  year={2025},
  eprint={2510.19811},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2510.19811},
}
```

## Glossary

<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->

**Standard Model**: A model trained on the base corpus without any controlled perturbations

**Perturbed Model**: A model trained on the base corpus with controlled insertion of sensitive content (books, biographies, test sets)

**Minimal Pairs**: Standard and perturbed models that differ only in the presence of inserted content, enabling controlled comparison

**Risk Domains**: Three categories of memorization concern:
- **Copyright**: Reproduction of copyrighted content (books, Wikipedia, paraphrase)
- **Privacy**: Leakage of personal information (biographies, conversations)
- **Test Set Contamination**: Memorization of evaluation benchmarks

**Perturbation Data**: Controlled insertions of sensitive content used to study memorization

## Model Card Contact

For questions about the Hubble model suite, please:
- Open an issue in the [GitHub repository](https://github.com/allegro-lab/hubble)
- Contact the authors through institutional email addresses
- Refer to the project website for additional resources