# RxStruct-Gemma-1B (LoRA)

A fine-tuned variant of Gemma-3-1B-IT, optimized for structured medical data extraction from natural, doctor-style prescription dialogues.
The model emits a structured JSON object containing medicines, dosages, diseases, tests, and instructions directly, so little to no external post-processing is required. A GGUF-quantized release is also available.


## Model Overview

| Property | Value |
| --- | --- |
| Base Model | `google/gemma-3-1b-it` |
| Fine-tuning Framework | Unsloth |
| Method | LoRA (rank = 8, α = 16, dropout = 0.05) |
| Precision | bfloat16 |
| Sequence Length | 1024 tokens |
| Stop Token | `"AAA"` |
| Parameters Trained | ~13M (1.29%) |
| Dataset Source | Synthetic doctor–patient prescription conversations generated with Claude 3.5 Sonnet |
| Output Format | Valid JSON object with a fixed schema |

## Example Usage

```python
from unsloth import FastLanguageModel
from transformers import TextStreamer

# Load the LoRA adapter; Unsloth resolves the google/gemma-3-1b-it base
# model automatically from the adapter config.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Shiva7706/RxStruct-Gemma-1B",
    max_seq_length=1024,
)
FastLanguageModel.for_inference(model)  # switch to Unsloth's fast inference mode

prompt = """Mr. Shah, your blood pressure is quite high at 160/100.
I'm starting you on Amlodipine 5mg once daily in the morning.
Also take Atorvastatin 10mg at bedtime for your cholesterol.
Get your lipid profile and kidney function tests done after 1 month.
Reduce salt intake and exercise regularly."""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
streamer = TextStreamer(tokenizer)  # stream tokens to stdout as they are generated
_ = model.generate(**inputs, streamer=streamer, max_new_tokens=512)
```
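
The snippet above feeds the raw dialogue straight into the tokenizer. If the adapter was trained on Gemma's chat template (an assumption, since the card does not state the prompt format), applying the same template may give more reliable generations, and decoding can be cut off at the `"AAA"` stop token with a custom stopping criterion. A minimal sketch, reusing `model`, `tokenizer`, and `prompt` from above:

```python
from transformers import StoppingCriteria, StoppingCriteriaList

class StopOnSubstring(StoppingCriteria):
    """Stop generation once the decoded tail contains the stop string."""
    def __init__(self, tokenizer, stop_string="AAA", window=8):
        self.tokenizer = tokenizer
        self.stop_string = stop_string
        self.window = window  # only decode the last few tokens for speed

    def __call__(self, input_ids, scores, **kwargs):
        tail = self.tokenizer.decode(input_ids[0, -self.window:])
        return self.stop_string in tail

# Assumed chat-template formatting; adjust if the adapter was trained on raw text.
messages = [{"role": "user", "content": prompt}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(
    input_ids,
    max_new_tokens=512,
    stopping_criteria=StoppingCriteriaList([StopOnSubstring(tokenizer)]),
)
raw_text = tokenizer.decode(output_ids[0, input_ids.shape[-1]:], skip_special_tokens=True)
```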

## Example Output
```json
{
  "medicines": [
    {"name": "Amlodipine", "dosage": "5mg", "frequency": "once daily", "duration": "unspecified", "route": "oral", "timing": "morning"},
    {"name": "Atorvastatin", "dosage": "10mg", "frequency": "at bedtime", "duration": "unspecified", "route": "oral", "timing": "unspecified"}
  ],
  "diseases": ["high blood pressure"],
  "symptoms": ["high blood pressure"],
  "tests": [
    {"name": "lipid profile", "timing": "after 1 month"},
    {"name": "kidney function tests", "timing": "after 1 month"}
  ],
  "instructions": ["reduce salt intake", "exercise regularly"]
}
```

## Training Details

| Component | Specification |
| --- | --- |
| GPU | NVIDIA GeForce RTX 3050 Laptop GPU (6 GB VRAM) |
| Peak VRAM Usage | ~2.52 GB |
| System RAM | 16 GB (Dell G15) |
| CUDA Version | 13.0 |
| Driver Version | 581.08 |
| Frameworks | PyTorch 2.8.0 + CUDA 12.8 + Unsloth 2025.10.7 |
| Training Time | ~7 minutes (3 epochs, 166 samples) |
| Validation Loss | 0.2435 |
| Validation Perplexity | 1.28 |

Training was performed in a Linux WSL2 Devcontainer environment with gradient offloading and memory optimization enabled through Unsloth.
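
For reference, a minimal sketch of how the reported LoRA hyperparameters (rank 8, α 16, dropout 0.05, 1024-token sequences) map onto Unsloth's API. The target-module list is an assumption, since the card does not specify which projections were adapted:

```python
from unsloth import FastLanguageModel

# Load the base model at the training sequence length.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="google/gemma-3-1b-it",
    max_seq_length=1024,
    dtype=None,  # auto-selects bfloat16 on supported GPUs
)

# Attach LoRA adapters with the hyperparameters reported above.
model = FastLanguageModel.get_peft_model(
    model,
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    # Assumed target modules: the standard attention and MLP projections.
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```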

## Dataset

The dataset consists of synthetic, Indian-style prescription conversations generated with Claude 3.5 Sonnet, following strict extraction rules.

Schema:

```json
{
  "medicines": [
    {"name": "string", "dosage": "string", "frequency": "string", "duration": "string", "route": "string", "timing": "string"}
  ],
  "diseases": ["string"],
  "symptoms": ["string"],
  "tests": [{"name": "string", "timing": "string"}],
  "instructions": ["string"]
}
```

All conversations are synthetic and do not contain any personally identifiable or real patient data.

## Model Performance

- Validation Loss: 0.2435
- Validation Perplexity: 1.28
- JSON Structural Accuracy: ~94% (measured on 50 random generations; a validator sketch follows below)
- Inference Latency (RTX 3050): ~1.9 s per 300-token generation
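
The card does not state how structural accuracy was scored; one plausible check (an illustrative sketch, not the author's evaluation code) is to parse each generation and verify that the fixed schema's keys and types are all present:

```python
import json

# Expected keys, taken from the fixed schema above.
TOP_KEYS = {"medicines", "diseases", "symptoms", "tests", "instructions"}
MEDICINE_KEYS = {"name", "dosage", "frequency", "duration", "route", "timing"}
TEST_KEYS = {"name", "timing"}

def is_structurally_valid(raw: str) -> bool:
    """Return True if the generation parses as JSON matching the fixed schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict) or set(data) != TOP_KEYS:
        return False
    if not all(isinstance(data[k], list) for k in TOP_KEYS):
        return False
    return (
        all(isinstance(m, dict) and set(m) == MEDICINE_KEYS for m in data["medicines"])
        and all(isinstance(t, dict) and set(t) == TEST_KEYS for t in data["tests"])
        and all(isinstance(s, str)
                for k in ("diseases", "symptoms", "instructions") for s in data[k])
    )
```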

## Limitations

- The model is trained only on synthetic data, not real medical transcripts.
- It should not be used for clinical decision-making.
- Certain ambiguous dialogues may lead to redundant entities (e.g., mixing tests and medicines).
- JSON format adherence is strong but not perfect; a small post-processor is recommended (see below).

## Recommended Post-Processing (Optional)

```python
import json, re

def clean_json_output(text):
    # Strip the "AAA" stop token before parsing, otherwise json.loads fails.
    text = text.replace("AAA", "")
    # Grab the outermost JSON object from the raw generation.
    match = re.search(r"\{[\s\S]*\}", text)
    if not match:
        return None
    candidate = match.group(0)
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        # Fall back to truncating at the last closing brace and retrying.
        candidate = candidate[: candidate.rfind("}") + 1]
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            return None
```
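
With this helper, `clean_json_output(raw_generation)` returns a parsed `dict` on success and `None` otherwise, so downstream code can branch on the return type.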

## Intended Use

Intended for:

- Research on medical NLP and structured data extraction
- Building medical assistants that convert prescriptions into structured, EHR-compatible data
- Educational and demonstration purposes

Not intended for:

- Real-world clinical applications
- Diagnostic or treatment decision systems