# 🧬 Genomic-Transformer-Aphasia-Recovery (v1.0)
Deep Genomic Regression for Personalizing Post-Stroke Clinical Outcomes
## 📋 Project Overview
The Genomic-Transformer-Aphasia-Recovery model is a specialized Sequence-to-Scalar Transformer designed to predict biological recovery potential in stroke survivors. By decoding 1024-nucleotide windows focused on neuroplasticity markers, the model estimates a patient's Western Aphasia Battery-Aphasia Quotient (WAB-AQ).
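The sequence-to-scalar idea can be sketched as follows: token embeddings from the backbone are mean-pooled and projected to a single continuous score. The class name, hidden size, and pooling strategy below are illustrative only, not the model's actual internals:

```python
import torch
import torch.nn as nn

class SequenceToScalarHead(nn.Module):
    """Toy regression head: mean-pool token embeddings, project to one scalar.

    The real model attaches a head of this kind to the nucleotide-transformer
    backbone; the dimensions here are illustrative.
    """

    def __init__(self, hidden_size: int = 1024):
        super().__init__()
        self.regressor = nn.Linear(hidden_size, 1)

    def forward(self, token_embeddings: torch.Tensor,
                attention_mask: torch.Tensor) -> torch.Tensor:
        # Mask out padding before pooling so pad tokens do not dilute the mean.
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)
        return self.regressor(pooled).squeeze(-1)  # one scalar estimate per sequence

head = SequenceToScalarHead(hidden_size=16)
embeddings = torch.randn(2, 8, 16)   # (batch, tokens, hidden) — toy sizes
attention_mask = torch.ones(2, 8)
scores = head(embeddings, attention_mask)
print(scores.shape)  # torch.Size([2])
```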
This model serves as the Biological Modality of a multi-modal fusion pipeline. It is intended to be integrated with MRI-based lesion volumetry to provide clinicians with a holistic "Recovery Forecast."
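One way such a fusion could look is a simple late fusion of the two modality scores. The weighting scheme and the lesion-volume normalisation cap below are hypothetical; the card does not specify the actual fusion pipeline:

```python
def fuse_recovery_forecast(genomic_wab_aq: float,
                           lesion_volume_cm3: float,
                           w_genomic: float = 0.6,
                           max_lesion_cm3: float = 200.0) -> float:
    """Illustrative late fusion of the genomic and imaging modalities.

    Combines the genomic WAB-AQ estimate with a lesion-volume penalty.
    Both the 0.6 genomic weight and the 200 cm^3 cap are assumptions.
    """
    lesion_fraction = min(lesion_volume_cm3, max_lesion_cm3) / max_lesion_cm3
    imaging_score = 100.0 * (1.0 - lesion_fraction)  # larger lesion -> lower score
    fused = w_genomic * genomic_wab_aq + (1.0 - w_genomic) * imaging_score
    return max(0.0, min(100.0, fused))               # clamp to the WAB-AQ range

print(fuse_recovery_forecast(genomic_wab_aq=72.5, lesion_volume_cm3=40.0))  # 75.5
```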
## 🎮 Live Demo
Test the model in real-time here: Stroke-Recovery-Analyser Space
## 🔬 Advancing Precision Neuro-Rehabilitation
This work represents a shift from "one-size-fits-all" therapy to Data-Driven Personalized Care.
## 🧬 The Genomic Signal
While clinical assessments focus on the "Physical Damage" (lesion location), research in 2026 emphasizes that the patient's Genomic Backdrop determines the internal rate and "ceiling" of neural repair.
- Activity-Dependent Plasticity: By identifying variants in BDNF, COMT, and APOE, this model quantifies the efficiency of a patient's synaptic strengthening.
- Proactive Intervention: Identifying a "Structural Reliance" profile allows for earlier introduction of advanced therapies like tDCS or closed-loop neural interfaces.
## 📊 Technical Specifications
- Base Backbone: `InstaDeepAI/nucleotide-transformer-v2-500m-multi-species`
- Architecture Head: Custom linear regression layer for continuous clinical score prediction.
- Input Context: 1024 base pairs (bp) tokenized at the single-nucleotide level.
- Precision: BF16 (BFloat16) Mixed-Precision.
- Hardware: Fine-tuned on the NVIDIA DGX Spark (Grace Blackwell GB10 Architecture).
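The specifications above can be collected into a hypothetical fine-tuning configuration. Most keys mirror `transformers.TrainingArguments` fields; `num_labels` and `max_length` belong to the model and tokenizer, and the per-device batch size is an assumption not stated in this card:

```python
# Hypothetical fine-tuning configuration mirroring the listed specs.
training_config = {
    "bf16": True,                      # BFloat16 mixed precision (spec above)
    "gradient_accumulation_steps": 4,  # matches the optimization row below
    "per_device_train_batch_size": 8,  # assumed; not stated in the card
    "num_labels": 1,                   # regression head: one scalar (WAB-AQ)
    "max_length": 1024,                # 1024 bp input context
}

# With accumulation, gradients are applied every 4 micro-batches,
# so the effective batch size is 8 * 4 = 32.
effective_batch = (training_config["per_device_train_batch_size"]
                   * training_config["gradient_accumulation_steps"])
print(effective_batch)  # 32
```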
## ⚡ The 2026 AI-Neuroscience Stack
| Parameter | Value |
|---|---|
| Compute Architecture | NVIDIA Blackwell (GB10) |
| System Memory | 128GB LPDDR5x (Unified) |
| Memory Bandwidth | 273 GB/s |
| Inference Latency | < 15ms (on GB10) |
| Optimization | Gradient Accumulation (steps=4) |
## 📂 Dataset: ARC-Aphasia
Fine-tuned on a curated subset of the Aphasia Recovery Cohort (ARC), consisting of 902 subjects. The model focuses on genetic regions associated with neuroplasticity and neural growth factor regulation.
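A minimal sketch of how fixed 1024 bp input windows could be cut around such regions, padding with `N` near contig edges so every window has the same length. The windowing strategy and function name are assumptions; the card does not describe the exact preprocessing used for the ARC subset:

```python
def extract_window(chromosome_seq: str, marker_pos: int, window: int = 1024) -> str:
    """Extract a fixed-length window centred on a neuroplasticity marker.

    Pads with 'N' so windows near contig edges still span `window` bases.
    """
    half = window // 2
    start = marker_pos - half
    end = marker_pos + half
    left_pad = "N" * max(0, -start)                    # pad if window runs off the left edge
    right_pad = "N" * max(0, end - len(chromosome_seq))  # ... or off the right edge
    core = chromosome_seq[max(0, start):min(len(chromosome_seq), end)]
    return left_pad + core + right_pad

seq = "ACGT" * 400                       # 1600 bp toy contig
window = extract_window(seq, marker_pos=100)
print(len(window))  # 1024
```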
## 🖥️ Usage & Inference
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "assix-research/genomic-transformer-aphasia-recovery"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, trust_remote_code=True)
model.eval()

dna_seq = "ATGC..."  # replace with a real 1024 bp sequence

inputs = tokenizer(dna_seq, return_tensors="pt")
with torch.no_grad():
    prediction = model(**inputs).logits.item()

print(f"Predicted WAB-AQ: {prediction:.2f}/100")
```
## ⚠️ Clinical Disclaimer
This model is a Biomedical Research Tool. Predicted WAB-AQ scores are statistical estimates and must be interpreted by a qualified neurologist in conjunction with structural neuroimaging (MRI/DTI).
## 💙 Dedication
This project is dedicated to Simon K. While this model is built on code, data, and Blackwell silicon, its heart is rooted in the hope for recovery and the belief that the intersection of AI and Genomics can light the path home for those navigating the challenges of aphasia.
## 🤝 Call to the Research Community
Let's keep advancing this field. We have the architecture, the data, and the computing power; now we need the persistence to turn these biological signals into real-world recovery for every patient who needs it.