# RxStruct-Gemma-1B (LoRA)
A fine-tuned variant of Gemma-3-1B-IT optimized for structured medical data extraction from natural, doctor-style prescription dialogues.
The model outputs a fully structured JSON object containing medicines, dosages, diseases, tests, and instructions, so downstream code needs little or no external post-processing.
## Model Overview
| Property | Value |
|---|---|
| Base Model | google/gemma-3-1b-it |
| Fine-tuning Framework | Unsloth |
| Method | LoRA (Rank=8, α=16, Dropout=0.05) |
| Precision | bfloat16 |
| Sequence Length | 1024 tokens |
| Stop Token | "AAA" |
| Parameters Trained | ~13M (1.29%) |
| Dataset Source | Synthetic Claude 3.5 Sonnet-generated doctor–patient prescription conversations |
| Output Format | Valid JSON object with fixed schema |
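
For reference, the adapter configuration above maps onto Unsloth's API roughly as follows. This is a sketch, not the original training script; the target modules are an assumption based on common Unsloth defaults and are not stated in this card.

```python
from unsloth import FastLanguageModel

# Load the base model at the sequence length used for fine-tuning.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="google/gemma-3-1b-it",
    max_seq_length=1024,
)

# Attach LoRA adapters with the hyperparameters listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=[  # assumed; typical Unsloth defaults for attention + MLP
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
```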
## Example Usage

```python
from unsloth import FastLanguageModel
from transformers import TextStreamer

# Load the LoRA adapter directly; Unsloth resolves the base model
# (google/gemma-3-1b-it) from the adapter's config.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Shiva7706/RxStruct-Gemma-1B",
    max_seq_length=1024,
)
FastLanguageModel.for_inference(model)

prompt = """Mr. Shah, your blood pressure is quite high at 160/100.
I'm starting you on Amlodipine 5mg once daily in the morning.
Also take Atorvastatin 10mg at bedtime for your cholesterol.
Get your lipid profile and kidney function tests done after 1 month.
Reduce salt intake and exercise regularly."""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
streamer = TextStreamer(tokenizer)
_ = model.generate(**inputs, streamer=streamer, max_new_tokens=512)
```
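
Because the model emits the literal marker "AAA" when it is done, generation can optionally be cut off early with a custom stopping criterion. A minimal sketch using the standard transformers API (the `StopOnMarker` class name is illustrative):

```python
from transformers import StoppingCriteria, StoppingCriteriaList

class StopOnMarker(StoppingCriteria):
    """Stop generation once the decoded tail contains the stop marker."""
    def __init__(self, tokenizer, marker="AAA", window=8):
        self.tokenizer = tokenizer
        self.marker = marker
        self.window = window  # how many trailing tokens to decode each step

    def __call__(self, input_ids, scores, **kwargs):
        tail = self.tokenizer.decode(input_ids[0][-self.window:])
        return self.marker in tail

_ = model.generate(
    **inputs,
    streamer=streamer,
    max_new_tokens=512,
    stopping_criteria=StoppingCriteriaList([StopOnMarker(tokenizer)]),
)
```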
## Example Output
```json
{
  "medicines": [
    {"name": "Amlodipine", "dosage": "5mg", "frequency": "once daily", "duration": "unspecified", "route": "oral", "timing": "morning"},
    {"name": "Atorvastatin", "dosage": "10mg", "frequency": "at bedtime", "duration": "unspecified", "route": "oral", "timing": "unspecified"}
  ],
  "diseases": ["high blood pressure"],
  "symptoms": ["high blood pressure"],
  "tests": [
    {"name": "lipid profile", "timing": "after 1 month"},
    {"name": "kidney function tests", "timing": "after 1 month"}
  ],
  "instructions": ["reduce salt intake", "exercise regularly"]
}
```
## Training Details
| Component | Specification |
|---|---|
| GPU | NVIDIA GeForce RTX 3050 Laptop GPU (6 GB VRAM) |
| Peak VRAM Usage | ~2.52 GB |
| System RAM | 16 GB (Dell G15) |
| CUDA Version | 13.0 |
| Driver Version | 581.08 |
| Frameworks | PyTorch 2.8.0 + CUDA 12.8 + Unsloth 2025.10.7 |
| Training Time | ~7 minutes / 3 epochs / 166 samples |
| Validation Loss | 0.2435 |
| Validation Perplexity | 1.28 |
Training was performed in a WSL2 (Linux) dev container, with gradient offloading and memory optimizations enabled through Unsloth.
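
For orientation, a comparable run in the style of typical Unsloth notebooks might look like the sketch below. Only the epoch count comes from this card; the batch size, gradient accumulation, and learning rate are assumptions, and `dataset` is a hypothetical variable holding the 166 formatted samples.

```python
from transformers import TrainingArguments
from trl import SFTTrainer

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,  # hypothetical: 166 conversation -> JSON samples
    args=TrainingArguments(
        num_train_epochs=3,
        per_device_train_batch_size=2,   # assumed, not stated in the card
        gradient_accumulation_steps=4,   # assumed
        learning_rate=2e-4,              # assumed common LoRA default
        bf16=True,
        logging_steps=10,
        output_dir="outputs",
    ),
)
trainer.train()
```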
## Dataset
The dataset consists of synthetic, Indian-style prescription conversations generated with Claude 3.5 Sonnet, following strict extraction rules.
Schema:

```json
{
  "medicines": [
    {"name": "string", "dosage": "string", "frequency": "string", "duration": "string", "route": "string", "timing": "string"}
  ],
  "diseases": ["string"],
  "symptoms": ["string"],
  "tests": [{"name": "string", "timing": "string"}],
  "instructions": ["string"]
}
```
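
A small structural check against this schema can catch malformed generations before they reach downstream code. A sketch (the `validate_schema` helper is illustrative, not part of the model):

```python
REQUIRED_KEYS = {"medicines", "diseases", "symptoms", "tests", "instructions"}
MEDICINE_KEYS = {"name", "dosage", "frequency", "duration", "route", "timing"}

def validate_schema(obj):
    """Return True if a parsed output matches the fixed schema above."""
    if not isinstance(obj, dict) or set(obj) != REQUIRED_KEYS:
        return False
    if not all(isinstance(m, dict) and set(m) == MEDICINE_KEYS
               for m in obj["medicines"]):
        return False
    if not all(isinstance(t, dict) and set(t) == {"name", "timing"}
               for t in obj["tests"]):
        return False
    return all(isinstance(s, str)
               for s in obj["diseases"] + obj["symptoms"] + obj["instructions"])
```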
All conversations are synthetic and do not contain any personally identifiable or real patient data.
## Model Performance
- Validation Loss: 0.2435
- Validation Perplexity: 1.28
- JSON Structural Accuracy: ~94% (measured on 50 random generations)
- Inference Latency (RTX 3050): ~1.9s per 300-token generation
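
Note that the reported perplexity is simply the exponential of the validation loss:

```python
import math

print(round(math.exp(0.2435), 2))  # 1.28
```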
## Limitations
- The model is trained only on synthetic data, not real medical transcripts.
- It should not be used for clinical decision-making.
- Ambiguous dialogues may produce redundant or misclassified entities (e.g., the same finding listed under both diseases and symptoms, or a test mixed in with the medicines).
- JSON format adherence is strong but not perfect; a small post-processor is recommended.
## Recommended Post-Processing (Optional)
```python
import json
import re

def clean_json_output(text):
    """Extract the first JSON object from raw model output and parse it."""
    # Strip the custom stop token first; a trailing "AAA" would otherwise
    # make json.loads fail before the token could be removed.
    text = text.replace("AAA", "")
    match = re.search(r"\{[\s\S]*\}", text)
    if not match:
        return None
    text = match.group(0)
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Fall back to the raw string up to the last closing brace.
        return text[:text.rfind("}") + 1]
```
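
Typical usage, assuming the generation is captured rather than streamed (`output_ids` is a hypothetical tensor returned by `model.generate`):

```python
output_ids = model.generate(**inputs, max_new_tokens=512)
raw = tokenizer.decode(output_ids[0], skip_special_tokens=True)
data = clean_json_output(raw)
if isinstance(data, dict):
    print(data["medicines"])
```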
## Intended Use
Intended for:
- Research on medical NLP and structured data extraction
- Building medical assistants that convert prescriptions to structured EHR-compatible data
- Educational and demonstration purposes
Not intended for:
- Real-world clinical applications
- Diagnostic or treatment decision systems