Med_Soap_llama321

Med_Soap_llama321 is a fine-tuned derivative of meta-llama/Llama-3.2-1B trained to convert medical visit transcripts into structured SOAP-style clinical notes.
Training used LoRA adapters via Tinker (training SDK and cookbook), and the adapters were merged into the base model for standalone use.

Intended use: assistive drafting of structured notes from clinician–patient transcripts. Outputs should be reviewed and edited by qualified clinicians before use in any clinical workflow.


Quick start (🤗 Transformers)

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

MODEL_ID = "johnyquest7/Med_soap_llama321_tinker"  

tok = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,
    device_map="auto"
)

# Minimal prompt — the model was trained on transcripts whose first line begins with:
# "Please convert the following medical transcript into a structured medical note."
prompt = """Please convert the following medical transcript into a structured medical note.

Doctor: Hi there, good to see you again. How have you been feeling?
Patient: I've been more tired and a bit dizzy...
"""

inputs = tok([prompt], return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=True,
        temperature=0.2,
        top_p=0.95,
        eos_token_id=tok.eos_token_id,
    )
print(tok.decode(out[0], skip_special_tokens=True))

Training summary

Base model: meta-llama/Llama-3.2-1B

Task: supervised fine-tuning on pairs (transcript → structured note)

Data: Johnyquest7/med_struct_data (95% train / 5% eval)

Formatting: chat-style conversations with a single user turn (transcript) and a single assistant turn (note); the user message begins with the instruction line: Please convert the following medical transcript into a structured medical note.

Frameworks: Tinker (trainer/cookbook) + PEFT/LoRA; final weights merged for HF usage.

Typical knobs: LoRA rank 32, max seq length ~8k, linear LR schedule, batch ~16.

Renderer: Tinker's recommended renderer for Llama 3.2 ("role_colon" template)

Training objective: cross-entropy on assistant turns (ALL_ASSISTANT_MESSAGES)

Logging: JSONL metrics (train/eval NLL); optional W&B

Checkpointing: periodic state saves; final merge via peft.merge_and_unload()
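
For reference, a minimal PEFT LoraConfig mirroring the knobs above might look like the sketch below. This is illustrative only: the actual run used Tinker's trainer, and the target modules, lora_alpha, and dropout values are assumptions, not the recorded settings.

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")

# Approximation of the reported setup: rank 32 LoRA adapters on attention projections.
lora_cfg = LoraConfig(
    r=32,                       # LoRA rank from the training summary
    lora_alpha=64,              # assumption
    lora_dropout=0.05,          # assumption
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()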

Inference prompt tips

Keep the opening instruction line exactly as seen during training (above).

Provide the verbatim transcript (doctor/patient turns) below the instruction.

For longer visits, raise max_new_tokens (e.g., 768–1024).

For more deterministic outputs, lower temperature (0.1–0.3).
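
Continuing from the quick-start snippet above (reusing model, tok, and inputs), a longer and more deterministic run could look like this; the parameter values are illustrative.

# Longer visit, more deterministic note; decode only the newly generated tokens.
with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=1024,   # raise for longer visits (e.g., 768-1024)
        do_sample=True,
        temperature=0.1,       # lower temperature for more deterministic output
        top_p=0.95,
        eos_token_id=tok.eos_token_id,
    )
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))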

Evaluation

During training we tracked negative log-likelihood (NLL) on the training set and a 5% eval split. For downstream quality checks, we recommend:

ROUGE-L / BLEU vs. reference notes (style similarity)

Section presence (Subjective, Objective, Assessment, Plan)

Clinical validity spot checks by a clinician (e.g., vitals, meds, labs copied correctly)
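
A minimal sketch of the first two checks, assuming the evaluate and rouge_score packages are installed; the predictions/references lists and the exact SOAP header strings are placeholders.

import re
import evaluate  # pip install evaluate rouge_score

# Placeholders: model outputs and clinician-written reference notes.
predictions = ["Subjective: Patient reports increased fatigue and dizziness. ..."]
references = ["Subjective: Increased fatigue and intermittent dizziness. ..."]

# Style similarity: ROUGE-L against reference notes.
rouge = evaluate.load("rouge")
print(rouge.compute(predictions=predictions, references=references)["rougeL"])

# Section presence: flag notes missing any of the four SOAP headers.
SECTIONS = ["Subjective", "Objective", "Assessment", "Plan"]
for note in predictions:
    missing = [s for s in SECTIONS if not re.search(rf"\b{s}\b", note, re.IGNORECASE)]
    if missing:
        print("Missing sections:", missing)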

Training log

(Training metrics figure: train/eval NLL over training steps.)

Limitations & risks

May hallucinate facts not stated in the transcript or omit pertinent positives/negatives.

Outputs can reflect biases and errors present in training data.

Not a medical device; requires human review. Do not use for autonomous clinical decisions.

How this model was built

Prepare a conversations JSONL file; each line has the form:

{"messages": [{"role": "user", "content": "Please convert... <transcript>"}, {"role": "assistant", "content": "<structured note>"}]}

Supervised Fine-Tuning with Tinker (LoRA adapters), renderer set to the recommended Llama-3.2 chat template.

Merge adapters into base with peft.merge_and_unload() and save in safetensors format for HF.
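
A sketch of the merge-and-save step; the adapter directory and output paths are placeholders.

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# "path/to/lora_adapter" is a placeholder for the adapter produced by training.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B", torch_dtype=torch.bfloat16)
merged = PeftModel.from_pretrained(base, "path/to/lora_adapter").merge_and_unload()

# Save merged weights in safetensors format, plus the tokenizer, for HF usage.
merged.save_pretrained("Med_Soap_llama321", safe_serialization=True)
AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B").save_pretrained("Med_Soap_llama321")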

Citation

If you found this model helpful, please cite:

Base model: Meta Llama 3.2

This model: johnyquest7/Med_Soap_llama321_tinker

@software{Med_Soap_llama321_2025,
  title  = {Med_Soap_llama321},
  author = {Johnson Thomas},
  year   = {2025},
  url    = {https://huggingface.co/johnyquest7/Med_Soap_llama321_tinker}
}
