---
license: mit
language:
- en
library_name: transformers
pipeline_tag: text-classification
base_model: microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract
tags:
- clinical
- healthcare
- emergency-medicine
- OHCA
- PubMedBERT
- MIMIC
- patient-level-split
model-index:
- name: OHCA Classifier v8 (PubMedBERT fine-tuned)
  results:
  - task:
      type: text-classification
      name: Binary OHCA detection (OHCA vs non-OHCA)
    dataset:
      name: Internal (MIMIC-derived discharge notes)
      type: text
      split: test (patient-level)
    metrics:
    - type: recall
      name: Sensitivity (Recall)
      value: 1.000
    - type: specificity
      name: Specificity
      value: 0.879
    - type: precision
      name: PPV (Precision)
      value: 0.562
    - type: npv
      name: NPV
      value: 1.000
    - type: f1
      name: F1-score
      value: 0.720
    - type: auc
      name: ROC-AUC
      value: 0.971
---

# OHCA Classifier v8: PubMedBERT fine-tuned for cardiac arrest detection

- **Author:** Mona Moukaddem
- **Model:** `monajm36/ohca-classifier-v8`
- **Task:** Binary text classification (*Out-of-Hospital Cardiac Arrest (OHCA) vs Non-OHCA*)
- **Base model:** `microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract`

This model predicts whether a discharge note likely describes **out-of-hospital cardiac arrest (OHCA)**. It was fine-tuned from PubMedBERT on MIMIC-derived discharge notes using **patient-level splits** to prevent leakage.

> ⚠️ For research and decision support only. Not a substitute for clinical judgment.

---

## How to use

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "monajm36/ohca-classifier-v8"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

text = """Chief Complaint: cardiac arrest
History of Present Illness: Patient found unresponsive at home...
ROSC after EMS CPR..."""

# Tokenize the note, truncating to the 512-token maximum used in training
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=512)

# Run inference without tracking gradients
with torch.no_grad():
    logits = model(**inputs).logits

# Convert logits to class probabilities (one value per class)
probs = torch.softmax(logits, dim=-1).squeeze()
print(probs)
```

## Threshold recommendations

| Clinical Goal | Threshold | Behavior |
|---|---|---|
| High Sensitivity | 0.28–0.32 | Captures nearly all OHCA cases |
| Balanced | 0.36 | Validation-optimized |
| High Precision | ≥0.50 | Fewer false positives |

At 0.36, validation yielded:

- Sensitivity (Recall): 1.000
- Specificity: 0.879
- AUC: 0.971

## Data & Training Summary

- Source: MIMIC-derived discharge notes
- Sections used: Chief Complaint, History of Present Illness
- Splits: Train 210, Val 54, Test 66 (patient-level)
- Max length: 512 tokens
- Epochs: 5
- Loss: Weighted cross-entropy
- Sampler: Class-balanced
- Hardware: CPU

## Evaluation (Test Set)

| | Pred Non-OHCA | Pred OHCA |
|---|---|---|
| Actual Non-OHCA | 51 | 7 |
| Actual OHCA | 0 | 9 |

Metrics:

- Recall: 1.000
- Specificity: 0.879
- Precision: 0.562
- NPV: 1.000
- F1-score: 0.720
- AUC: 0.971

Interpretation: The model captured all 9 OHCA cases in the test set at the chosen threshold, with 7 false positives among the 58 non-OHCA notes.

## License

MIT

## Citation

M. Moukaddem. OHCA Classifier v8: PubMedBERT fine-tuned for Out-of-Hospital Cardiac Arrest
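
## Example: checking the reported metrics and applying a threshold

The test-set metrics in this card follow directly from the confusion-matrix counts, and the recommended thresholds are meant to be applied to the model's OHCA probability. The sketch below illustrates both steps. It is not the author's evaluation code: the helper names `confusion_metrics` and `classify` are hypothetical, the `>=` comparison is an assumption, and the example probability is made up. It also assumes you have already picked the OHCA-class entry out of `probs`; this card does not state which index that is, so it is worth checking `model.config.id2label` first.

```python
def confusion_metrics(tn: int, fp: int, fn: int, tp: int) -> dict:
    """Derive standard binary-classification metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)        # sensitivity: share of true OHCA notes detected
    specificity = tn / (tn + fp)   # share of non-OHCA notes correctly rejected
    precision = tp / (tp + fp)     # PPV: share of OHCA predictions that are correct
    npv = tn / (tn + fn)           # share of non-OHCA predictions that are correct
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "npv": npv,
        "f1": f1,
    }


def classify(ohca_prob: float, threshold: float = 0.36) -> str:
    """Label a note as OHCA when its probability meets the threshold (assumed >=)."""
    return "OHCA" if ohca_prob >= threshold else "Non-OHCA"


# Counts taken from the test-set confusion matrix in this card.
metrics = confusion_metrics(tn=51, fp=7, fn=0, tp=9)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")

# Hypothetical probability, used only to show how the threshold changes the label.
example_prob = 0.40
print(classify(example_prob, threshold=0.36))  # OHCA
print(classify(example_prob, threshold=0.50))  # Non-OHCA
```

Lowering the threshold toward 0.28 trades more false positives for fewer missed OHCA cases, while raising it to 0.50 or above does the reverse.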