---
library_name: transformers
license: mit
base_model:
- google/gemma-2b
---

# Model Details

`google/gemma-2b` fine-tuned on 100,000 [CLRS-Text](https://github.com/google-deepmind/clrs/tree/master/clrs/_src/clrs_text) examples.

## Training Details

- Learning rate: 1e-4 with 150 warmup steps, then cosine-decayed to 5e-6 using the AdamW optimiser
- Batch size: 128
- Loss computed over the answer tokens only, not the question
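
A minimal sketch of how this configuration might be expressed with the Hugging Face `Trainer`. This is a reconstruction, not the actual training code: the output directory is a placeholder, the `cosine_with_min_lr` scheduler (which takes the decay floor via `lr_scheduler_kwargs`) requires a recent `transformers` release, and the label-masking helper for the answer-only loss is an assumed implementation.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="gemma-2b-clrs-text",         # placeholder output directory
    learning_rate=1e-4,                      # peak LR, reached after warmup
    warmup_steps=150,
    lr_scheduler_type="cosine_with_min_lr",  # cosine decay to a floor
    lr_scheduler_kwargs={"min_lr": 5e-6},    # decay target from the card
    optim="adamw_torch",
    per_device_train_batch_size=128,         # batch size of 128
)

def build_example(question_ids: list[int], answer_ids: list[int]) -> dict:
    """Mask question tokens with -100 so the loss covers the answer only."""
    return {
        "input_ids": question_ids + answer_ids,
        "labels": [-100] * len(question_ids) + answer_ids,
    }
```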
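
## Usage

A hedged inference sketch: the repository id below is a placeholder, and the prompt should follow the CLRS-Text question format used during fine-tuning.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/gemma-2b-clrs-text"  # placeholder repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "..."  # a CLRS-Text formatted question goes here
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated answer tokens, skipping the prompt.
answer = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(answer)
```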