Tokenizer: cl100k_base (tiktoken)
Vocabulary size: 100277
BOS token ID: 100256
EOS token ID: 100257
PAD token ID: 100257
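
Because the PAD and EOS IDs listed above are the same (100257), a padded position cannot be told apart from a real end-of-sequence token by ID alone. The sketch below shows one common way to handle this: build an attention mask from sequence lengths rather than from token IDs. It assumes sequences are framed as BOS + tokens + EOS and padded on the right; that convention is an assumption, not something the listing states. The names `TokenizerConfig` and `frame_and_pad` are hypothetical and used only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TokenizerConfig:
    """Values copied from the listing above; the class itself is hypothetical."""
    vocab_size: int = 100277
    bos_token_id: int = 100256
    eos_token_id: int = 100257
    pad_token_id: int = 100257  # same ID as EOS


def frame_and_pad(
    token_ids: List[int],
    max_len: int,
    cfg: TokenizerConfig = TokenizerConfig(),
) -> Tuple[List[int], List[int]]:
    """Wrap token_ids as BOS + tokens + EOS, then right-pad to max_len.

    Returns (input_ids, attention_mask). The mask is derived from the
    framed length, not from comparing IDs against pad_token_id, because
    PAD and EOS share an ID and such a comparison would also mask out
    the real EOS token.
    """
    framed = [cfg.bos_token_id] + list(token_ids) + [cfg.eos_token_id]
    if len(framed) > max_len:
        raise ValueError(
            f"sequence of length {len(framed)} exceeds max_len={max_len}"
        )
    n_pad = max_len - len(framed)
    input_ids = framed + [cfg.pad_token_id] * n_pad
    attention_mask = [1] * len(framed) + [0] * n_pad
    return input_ids, attention_mask


if __name__ == "__main__":
    ids, mask = frame_and_pad([1, 2, 3], max_len=8)
    print(ids)
    print(mask)
```

In this sketch the real EOS token keeps a mask value of 1 while the trailing PAD positions get 0, even though all of them carry ID 100257.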