# Med_Soap_llama321
Med_Soap_llama321 is a fine-tuned derivative of meta-llama/Llama-3.2-1B trained to convert medical visit transcripts into structured SOAP-style clinical notes.
Training used LoRA adapters with Tinker (training SDK & cookbook), and the trained adapter weights were merged into the base model for standalone use.
Intended use: assistive drafting of structured notes from clinician–patient transcripts. Outputs should be reviewed and edited by qualified clinicians before use in any clinical workflow.
## Quick start (🤗 Transformers)
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

MODEL_ID = "johnyquest7/Med_soap_llama321_tinker"

tok = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,
    device_map="auto",
)

# Minimal prompt — the model was trained on transcripts whose first line begins with:
# "Please convert the following medical transcript into a structured medical note."
prompt = """Please convert the following medical transcript into a structured medical note.
Doctor: Hi there, good to see you again. How have you been feeling?
Patient: I've been more tired and a bit dizzy...
"""

inputs = tok([prompt], return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=True,
        temperature=0.2,
        top_p=0.95,
        eos_token_id=tok.eos_token_id,
    )
print(tok.decode(out[0], skip_special_tokens=True))
```
## Training summary
- Base model: meta-llama/Llama-3.2-1B
- Task: supervised fine-tuning on (transcript → structured note) pairs
- Data: Johnyquest7/med_struct_data (95% train / 5% eval)
- Formatting: chat-style conversations with a single user turn (transcript) and a single assistant turn (note); the user message includes the instruction line: "Please convert the following medical transcript into a structured medical note."
- Frameworks: Tinker (trainer/cookbook) + PEFT/LoRA; final weights merged for HF usage
- Typical knobs: LoRA rank 32, max seq length ~8k, linear LR schedule, batch ~16 (see the PEFT sketch after this list)
- Renderer: Tinker's recommended renderer for Llama 3.2 ("role_colon" template)
- Training objective: cross-entropy on assistant turns (ALL_ASSISTANT_MESSAGES)
- Logging: JSONL metrics (train/eval NLL); optional W&B
- Checkpointing: periodic state saves; final merge via peft.merge_and_unload()
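Tinker's training loop isn't reproduced here; purely as an illustration, the knobs above translate into a PEFT `LoraConfig` roughly as follows (a minimal sketch; the target modules and `lora_alpha` are assumptions, not values confirmed from the actual run):

```python
# Illustrative PEFT setup mirroring the knobs above (rank 32).
# target_modules and lora_alpha are assumptions, not confirmed values.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
lora_cfg = LoraConfig(
    r=32,                    # LoRA rank, as listed above
    lora_alpha=64,           # assumption: a common choice is 2x the rank
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # only the adapter weights are trainable
```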
## Inference prompt tips

- Keep the opening instruction line exactly as seen during training (above).
- Provide the verbatim transcript (doctor/patient turns) below the instruction.
- For longer visits, raise max_new_tokens (e.g., 768–1024).
- For more deterministic outputs, lower temperature (0.1–0.3); see the snippet after this list.
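Combining the last two tips, and reusing `model`, `tok`, and `inputs` from the quick start above (a sketch, not the only reasonable settings):

```python
# Larger token budget for long visits plus a low temperature for
# near-deterministic output. For fully greedy decoding, set
# do_sample=False instead (temperature and top_p are then ignored).
with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=1024,
        do_sample=True,
        temperature=0.1,
        top_p=0.95,
        eos_token_id=tok.eos_token_id,
    )
print(tok.decode(out[0], skip_special_tokens=True))
```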
## Evaluation
During training we tracked negative log-likelihood (NLL) on the train set and a 5% eval split. For downstream quality checks, we recommend the following (the first two are sketched in code after the list):
- ROUGE-L / BLEU vs. reference notes (style similarity)
- Section presence (Subjective, Objective, Assessment, Plan)
- Clinical validity spot checks by a clinician (e.g., vitals, meds, and labs copied correctly)
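A minimal sketch of the first two checks using the 🤗 `evaluate` package; `preds` and `refs` are placeholder lists, and the section check assumes notes label sections with these exact headings:

```python
# pip install evaluate rouge_score
import re
import evaluate

preds = ["Subjective: ...\nObjective: ...\nAssessment: ...\nPlan: ..."]  # generated notes
refs = ["Subjective: ...\nObjective: ...\nAssessment: ...\nPlan: ..."]   # reference notes

# ROUGE-L vs. reference notes (style similarity)
rouge = evaluate.load("rouge")
print("ROUGE-L:", rouge.compute(predictions=preds, references=refs)["rougeL"])

# Section-presence check
SECTIONS = ["Subjective", "Objective", "Assessment", "Plan"]
for i, note in enumerate(preds):
    missing = [s for s in SECTIONS
               if not re.search(rf"^{s}\b", note, re.MULTILINE | re.IGNORECASE)]
    if missing:
        print(f"Note {i} is missing sections: {missing}")
```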
## Limitations & risks

- May hallucinate facts not stated in the transcript or omit pertinent positives/negatives.
- Outputs can reflect biases and errors present in the training data.
- Not a medical device; requires human review. Do not use for autonomous clinical decisions.
## How this model was built
1. Prepare a conversations JSONL file, one example per line, pairing the instruction-plus-transcript user turn with the reference note as the assistant turn:

   ```json
   {"messages": [{"role": "user", "content": "Please convert... "}, {"role": "assistant", "content": "<structured note>"}]}
   ```

2. Run supervised fine-tuning with Tinker (LoRA adapters), with the renderer set to the recommended Llama-3.2 chat template.
3. Merge the adapters into the base model with peft.merge_and_unload() and save in safetensors format for HF (sketched below).
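A rough sketch of step 3 with the PEFT API (the adapter path and output directory are placeholders):

```python
# Sketch of step 3: merge LoRA adapters into the base model and save for HF.
# "path/to/lora_adapter" and "Med_Soap_llama321" are placeholder paths.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
merged = PeftModel.from_pretrained(base, "path/to/lora_adapter").merge_and_unload()
merged.save_pretrained("Med_Soap_llama321", safe_serialization=True)  # safetensors

# Ship the tokenizer alongside the merged weights
AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B").save_pretrained("Med_Soap_llama321")
```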
## Citation

If you found this model helpful, please cite:

- Base model: Meta Llama 3.2
- This model: johnyquest7/Med_Soap_llama321_tinker

```bibtex
@software{Med_Soap_llama321_2025,
  title  = {Med_Soap_llama321},
  author = {Johnson Thomas},
  year   = {2025},
  url    = {https://huggingface.co/johnyquest7/Med_Soap_llama321_tinker}
}
```