Llama-3.1-8B-CLIPPER is a fine-tuned version of meta-llama/Llama-3.1-8B-Instruct, trained with supervised fine-tuning on the chtmp223/CLIPPER dataset. Please see our paper for more details on the method.
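The SFT data can be inspected directly from the Hugging Face Hub. The sketch below is only illustrative; the split and column names are not specified here, so check the chtmp223/CLIPPER dataset card for the actual schema.

```python
# Sketch: load the SFT dataset with the `datasets` library and inspect one example.
# Split/column names are assumptions; see the chtmp223/CLIPPER dataset card for details.
from datasets import load_dataset

ds = load_dataset("chtmp223/CLIPPER")
print(ds)                          # available splits and columns
first_split = next(iter(ds.values()))
print(first_split[0])              # one training example
```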
| Configurations | Values | 
|---|---|
| Hardware (Training and Inference) | 8xA100s | 
| Tracking | wandb | 
| batch size | 16 | 
| gradient_checkpointing | True | 
| learning_rate | 1.0e-6 | 
| lr_scheduler_type | cosine | 
| max_length | 131072 | 
| num_train_epochs | 1 | 
| optim | adamw_torch | 
Training code is adapted from https://github.com/princeton-nlp/ProLong.
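For readers who want to set up a similar run outside ProLong, the table above maps roughly onto Hugging Face `TrainingArguments` as sketched below. This is an illustration of the listed hyperparameters, not the actual CLIPPER training script; the per-device batch split and bf16 setting are assumptions.

```python
# Illustrative mapping of the configuration table to transformers TrainingArguments.
# Not the actual CLIPPER training code (which is adapted from ProLong).
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="llama-3.1-8b-clipper",
    per_device_train_batch_size=2,   # assumption: 8 GPUs x 2 = global batch size 16
    gradient_checkpointing=True,
    learning_rate=1.0e-6,
    lr_scheduler_type="cosine",
    num_train_epochs=1,
    optim="adamw_torch",
    report_to="wandb",
    bf16=True,                       # assumption: typical mixed precision on A100s
)
# max_length (131072) is applied when tokenizing/packing the long-context examples,
# not through TrainingArguments.
```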
Inference is done with vLLM on a single A100-80GB.
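A minimal vLLM inference sketch follows; the repository id is an assumption (inferred from the dataset namespace) and the prompt is a placeholder.

```python
# Minimal vLLM inference sketch.
# "chtmp223/Llama-3.1-8B-CLIPPER" is an assumed repo id; replace with the actual model path.
from vllm import LLM, SamplingParams

llm = LLM(model="chtmp223/Llama-3.1-8B-CLIPPER", max_model_len=131072)
params = SamplingParams(temperature=0.0, max_tokens=512)
outputs = llm.generate(["<your long-context prompt here>"], params)
print(outputs[0].outputs[0].text)
```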
@misc{pham2025clippercompressionenableslongcontext,
      title={CLIPPER: Compression enables long-context synthetic data generation}, 
      author={Chau Minh Pham and Yapei Chang and Mohit Iyyer},
      year={2025},
      eprint={2502.14854},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.14854}, 
}