---
license: mit
language:
- en
library_name: transformers
pipeline_tag: text-classification
base_model: microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract
tags:
- clinical
- healthcare
- emergency-medicine
- OHCA
- PubMedBERT
- MIMIC
- patient-level-split
model-index:
- name: OHCA Classifier v8 (PubMedBERT fine-tuned)
  results:
  - task:
      type: text-classification
      name: Binary OHCA detection (OHCA vs non-OHCA)
    dataset:
      name: Internal (MIMIC-derived discharge notes)
      type: text
      split: test (patient-level)
    metrics:
    - type: recall
      name: Sensitivity (Recall)
      value: 1.000
    - type: specificity
      name: Specificity
      value: 0.879
    - type: precision
      name: PPV (Precision)
      value: 0.562
    - type: npv
      name: NPV
      value: 1.000
    - type: f1
      name: F1-score
      value: 0.720
    - type: auc
      name: ROC-AUC
      value: 0.971
---

# OHCA Classifier v8: PubMedBERT fine-tuned for cardiac arrest detection

**Author:** Mona Moukaddem

**Model:** `monajm36/ohca-classifier-v8`

**Task:** Binary text classification: *Out-of-Hospital Cardiac Arrest (OHCA) vs Non-OHCA*

**Base model:** `microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract`

This model predicts whether a discharge note likely describes **out-of-hospital cardiac arrest (OHCA)**. It was fine-tuned from PubMedBERT on MIMIC-derived discharge notes using **patient-level splits** to prevent leakage.

> ⚠️ For research and decision support only. Not a substitute for clinical judgment.

---

## How to use

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "monajm36/ohca-classifier-v8"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()  # inference mode: disables dropout

# The model was trained on the Chief Complaint and History of Present Illness sections.
text = """Chief Complaint: cardiac arrest
History of Present Illness: Patient found unresponsive at home... ROSC after EMS CPR..."""

# Notes longer than 512 tokens are truncated.
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=512)
with torch.no_grad():
    logits = model(**inputs).logits

# Softmax over the two classes; check model.config.id2label to see which index is OHCA.
probs = torch.softmax(logits, dim=-1).squeeze()
print(probs)
```

## Threshold recommendations

| Clinical Goal | Threshold | Behavior |
|---|---|---|
| High Sensitivity | 0.28–0.32 | Captures nearly all OHCA cases |
| Balanced | 0.36 | Validation-optimized |
| High Precision | ≥0.50 | Fewer false positives |

At 0.36, validation yielded:

- Sensitivity (Recall): 1.000
- Specificity: 0.879
- AUC: 0.971
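
The thresholds above can be applied to the model's OHCA probability with a small helper. The following is a minimal sketch, not part of the released model: the names `THRESHOLDS` and `classify` are hypothetical, the lower end of the 0.28–0.32 range is chosen arbitrarily for the high-sensitivity setting, and `p_ohca` is assumed to be the softmax probability of the OHCA class (which index that is should be confirmed from `model.config.id2label`).

```python
# Illustrative thresholds taken from the table above; the keys are hypothetical names.
THRESHOLDS = {
    "high_sensitivity": 0.28,  # assumption: lower end of the 0.28-0.32 range
    "balanced": 0.36,
    "high_precision": 0.50,
}


def classify(p_ohca: float, goal: str = "balanced") -> str:
    """Return "OHCA" if p_ohca meets the threshold for the given goal, else "Non-OHCA"."""
    if not 0.0 <= p_ohca <= 1.0:
        raise ValueError("p_ohca must be a probability in [0, 1]")
    return "OHCA" if p_ohca >= THRESHOLDS[goal] else "Non-OHCA"


if __name__ == "__main__":
    # The same probability can flip label depending on the clinical goal.
    for goal in THRESHOLDS:
        print(goal, classify(0.40, goal))
```

Whether the comparison should be `>=` or `>` at the boundary is not stated in this card, so treat that choice as an assumption as well.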

## Data & Training Summary

- Source: MIMIC-derived discharge notes
- Sections used: Chief Complaint, History of Present Illness
- Splits: Train 210, Val 54, Test 66 (patient-level)
- Max length: 512 tokens
- Epochs: 5
- Loss: Weighted cross-entropy
- Sampler: Class-balanced
- Hardware: CPU
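
To illustrate what a patient-level split means in practice, the sketch below assigns whole patients, rather than individual notes, to train, validation, and test. It is an assumption-laden example and not the procedure used to build this model: the function name `patient_level_split`, the `(patient_id, text)` input format, the fractions, and the seed are all hypothetical.

```python
import random
from collections import defaultdict


def patient_level_split(notes, val_frac=0.15, test_frac=0.20, seed=0):
    """Split notes so that every patient appears in exactly one of train/val/test.

    `notes` is a list of (patient_id, text) pairs.
    The fractions apply to the number of patients, not the number of notes.
    """
    # Group all notes belonging to the same patient.
    by_patient = defaultdict(list)
    for patient_id, text in notes:
        by_patient[patient_id].append(text)

    # Shuffle patient IDs reproducibly, then carve out test and validation patients.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)
    n_test = round(len(patients) * test_frac)
    n_val = round(len(patients) * val_frac)
    groups = {
        "test": patients[:n_test],
        "val": patients[n_test:n_test + n_val],
        "train": patients[n_test + n_val:],
    }

    # Expand each group of patients back into its notes.
    return {
        name: [(pid, text) for pid in ids for text in by_patient[pid]]
        for name, ids in groups.items()
    }


if __name__ == "__main__":
    toy = [(f"p{i % 10}", f"note {i}") for i in range(30)]
    splits = patient_level_split(toy)
    print({name: len(rows) for name, rows in splits.items()})
```

Because the split is made over patient IDs, a patient with several notes can never contribute text to both training and evaluation, which is the leakage this kind of split is meant to prevent.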

## Evaluation (Test Set)

| | Pred Non-OHCA | Pred OHCA |
|---|---|---|
| Actual Non-OHCA | 51 | 7 |
| Actual OHCA | 0 | 9 |

Metrics:

- Recall: 1.000
- Specificity: 0.879
- Precision: 0.562
- NPV: 1.000
- F1-score: 0.720
- AUC: 0.971

Interpretation: The model captured all OHCA cases at the chosen threshold, with 7 false positives.
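
The threshold-dependent metrics above follow directly from the confusion matrix counts. As a minimal sketch (the function name `binary_metrics` is hypothetical, and it assumes no denominator is zero), they can be recomputed as follows. AUC is not included because it depends on the full set of predicted probabilities, not on a single confusion matrix.

```python
def binary_metrics(tn: int, fp: int, fn: int, tp: int) -> dict:
    """Compute standard binary-classification metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (no empty row or column).
    """
    recall = tp / (tp + fn)        # sensitivity: share of actual positives found
    specificity = tn / (tn + fp)   # share of actual negatives correctly rejected
    precision = tp / (tp + fp)     # PPV: share of predicted positives that are correct
    npv = tn / (tn + fn)           # share of predicted negatives that are correct
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "npv": npv,
        "f1": f1,
    }


if __name__ == "__main__":
    # Counts taken from the test-set confusion matrix above.
    for name, value in binary_metrics(tn=51, fp=7, fn=0, tp=9).items():
        print(f"{name}: {value:.3f}")
```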

## License

MIT

## Citation

M. Moukaddem. OHCA Classifier v8: PubMedBERT fine-tuned for Out-of-Hospital Cardiac Arrest