---
license: apache-2.0
---

This is the model checkpoint release for Amuro & Char: Analyzing the Relationship between Pre-Training and Fine-Tuning of Large Language Models.

All the fine-tuned model checkpoints are released in this repository. The naming convention for the revisions is `olmo1b_hf_{checkpoint}_{train_dataset}_{epoch}_{lr}`.

To load a specific model checkpoint, use the following code:

```
from transformers import AutoModelForCausalLM

# Left undefined in the original snippet; set to True only if your
# transformers version needs the repository's custom model code.
trust_remote_code = False

model = AutoModelForCausalLM.from_pretrained(
    "KaiserWhoLearns/PTvsSFT_OLMo1b",  # repository ID, passed positionally
    trust_remote_code=trust_remote_code,
    revision="your revision",  # a revision name following the convention above
)
```

All the checkpoints are fine-tuned from the checkpoints of [OLMo1b-HF](https://huggingface.co/allenai/OLMo-1B-hf).

Citation:

```
@article{sun2024amuro,
  title={Amuro \& char: Analyzing the relationship between pre-training and fine-tuning of large language models},
  author={Sun, Kaiser and Dredze, Mark},
  journal={arXiv preprint arXiv:2408.06663},
  year={2024}
}
```
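
To make the revision naming convention concrete, here is a minimal sketch that fills in the `olmo1b_hf_{checkpoint}_{train_dataset}_{epoch}_{lr}` template and lists the revision names that exist in the repository. The helper names `build_revision` and `list_revisions` are hypothetical and not part of this release. The sketch also assumes that revisions are stored as branches or tags, and that `huggingface_hub` is installed for the listing step. The exact spelling of each field is not documented here, so listing the repository is the more reliable way to find a valid name.

```python
from typing import List


def build_revision(checkpoint: str, train_dataset: str, epoch: str, lr: str) -> str:
    """Fill in the naming template from this README.

    Field values are passed through as strings without validation,
    because their exact format is not specified here.
    """
    return f"olmo1b_hf_{checkpoint}_{train_dataset}_{epoch}_{lr}"


def list_revisions(repo_id: str = "KaiserWhoLearns/PTvsSFT_OLMo1b") -> List[str]:
    """Return the branch and tag names of the repository (needs network access)."""
    # Imported lazily so build_revision works without huggingface_hub installed.
    from huggingface_hub import list_repo_refs

    refs = list_repo_refs(repo_id)
    # Assumption: fine-tuned checkpoints are published as branches or tags.
    return sorted(ref.name for ref in list(refs.branches) + list(refs.tags))


if __name__ == "__main__":
    # Placeholder values for illustration only; they are not real field values.
    print(build_revision("CKPT", "DATASET", "EPOCH", "LR"))
```

Any name returned by `list_revisions()` can be passed as the `revision` argument of `AutoModelForCausalLM.from_pretrained`.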