GenerTeam/GENERator-eukaryote-3b-base


Add link to Github repository

#1

by nielsr HF Staff - opened Apr 3

base: refs/heads/main ← from: refs/pr/1

README.md +7 -4

@@ -1,18 +1,21 @@
 ---
+library_name: transformers
 license: mit
 pipeline_tag: text-generation
 tags:
 - biology
 - genomics
 - long-context
-library_name: transformers
 ---
 # GENERator-eukaryote-3b-base model
 ## Abouts
 In this repository, we present GENERator, a generative genomic foundation model featuring a context length of 98k base pairs and 3B parameters, trained on an expansive dataset comprising 386 billion base pairs of eukaryotic DNA. The extensive and diverse pre-training data endow the GENERator with enhanced understanding and generation capabilities across various organisms.
-For more technical details, please refer to our paper [GENERator: A Long-Context Generative Genomic Foundation Model](https://huggingface.co/GenerTeam).
+For more technical details, please refer to our paper [GENERator: A Long-Context Generative Genomic Foundation Model](https://huggingface.co/papers/2502.07272).
+Code: https://github.com/GenerTeam/GENERator
 ## How to use
 ### Simple example1:  generation
@@ -72,7 +75,7 @@ from transformers import AutoTokenizer, AutoModelForCausalLM
 # Load the tokenizer and model.
 tokenizer = AutoTokenizer.from_pretrained("GENERator-eukaryote-3b-base", trust_remote_code=True)
-model = AutoModelForCausalLM.from_pretrained("GENERator-eukaryote-3b-base")
+model = AutoModelForCausalLM.from_pretrained("GenerTeam/GENERator-eukaryote-3b-base")
 config = model.config
 max_length = config.max_position_embeddings
@@ -132,4 +135,4 @@ print("Sequence Embeddings:", seq_embeddings)
       primaryClass={cs.CL},
       url={https://arxiv.org/abs/2502.07272},
 }
-```
+```
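
For readers who want to try the updated snippet, the following is a minimal sketch of the loading step as it looks after this change. It is not part of the PR. The names `REPO_ID` and `load_generator` are hypothetical, and loading the tokenizer from the same org-qualified ID is an assumption: the diff only changes the model line and leaves the tokenizer line on the bare name.

```python
# Org-qualified repository ID, as introduced on the model line of the diff.
REPO_ID = "GenerTeam/GENERator-eukaryote-3b-base"


def load_generator(repo_id: str = REPO_ID):
    """Load the tokenizer and model from the Hub (downloads weights on first call)."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import AutoTokenizer, AutoModelForCausalLM

    # Assumption: the tokenizer is hosted in the same repository as the model.
    # trust_remote_code=True mirrors the tokenizer call shown in the diff.
    tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
    # Mirrors the updated model line in the diff.
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    return tokenizer, model
```

Calling `tokenizer, model = load_generator()` would then let `model.config.max_position_embeddings` be read, as the README does to bound the input length. Using the full `GenerTeam/...` ID means the lookup resolves against the Hub rather than a local directory of the same name.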