---
license: apache-2.0
base_model: Qwen/Qwen3-1.7B
tags:
- biology
- protein
- gene-ontology
- GO-terms
---

# Qwen3-1.7B-GO

Qwen3 1.7B model enhanced with pre-trained Gene Ontology (GO) term embeddings.

## Model Description

This model is based on Qwen3 1.7B and includes:
- Pre-trained embeddings for GO terms
- Special tokens for protein sequence handling
- Fine-tuning on GO term descriptions and relationships

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and its tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("wanglab/Qwen3-1.7B-go")
tokenizer = AutoTokenizer.from_pretrained("wanglab/Qwen3-1.7B-go")

# Example with GO terms
text = "What is the function of GO:0008150?"
inputs = tokenizer(text, return_tensors="pt")

# Generate up to 100 new tokens and decode the full sequence
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## GO Terms

The model includes embeddings for Gene Ontology terms, allowing it to understand and reason about:
- Biological processes (GO:0008150)
- Molecular functions (GO:0003674)
- Cellular components (GO:0005575)

## Training

GO embeddings were pre-trained using QLoRA on GO term descriptions and relationships.
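
When building prompts around GO terms, it can help to check which identifiers a prompt actually contains before sending it to the model. The sketch below is not part of this model or its tokenizer; `extract_go_ids` and `describe_roots` are hypothetical helper names introduced here for illustration. It assumes only the standard GO identifier format (`GO:` followed by seven digits) and the three root terms listed in this card.

```python
import re

# GO identifiers have the form "GO:" followed by exactly seven digits.
GO_ID_PATTERN = re.compile(r"\bGO:\d{7}\b")

# Root terms of the three GO namespaces, as listed in this model card.
GO_ROOTS = {
    "GO:0008150": "biological_process",
    "GO:0003674": "molecular_function",
    "GO:0005575": "cellular_component",
}


def extract_go_ids(text):
    """Return GO identifiers found in text, in order of first appearance."""
    seen = {}
    for go_id in GO_ID_PATTERN.findall(text):
        # dict preserves insertion order, so this deduplicates stably
        seen.setdefault(go_id, None)
    return list(seen)


def describe_roots(text):
    """Map each root GO term mentioned in text to its namespace name."""
    return {
        go_id: GO_ROOTS[go_id]
        for go_id in extract_go_ids(text)
        if go_id in GO_ROOTS
    }


if __name__ == "__main__":
    prompt = "What is the function of GO:0008150?"
    print(extract_go_ids(prompt))
    print(describe_roots(prompt))
```

A helper like this only inspects the prompt text. It does not tell you how the tokenizer splits a GO identifier or whether a given term has a dedicated embedding; that would need to be checked against the loaded tokenizer itself.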