---
license: cc-by-nc-4.0
language:
- de
base_model:
- google-bert/bert-base-german-cased
pipeline_tag: text-classification
tags:
- depression
- mental-health
- MADRS
- clinical
- interview
---

# MADRS-BERT

**MADRS-BERT** is a fine-tuned `bert-base-german-cased` model that predicts depression severity scores (0–6) for individual items of the [Montgomery-Åsberg Depression Rating Scale (MADRS)](https://en.wikipedia.org/wiki/MADRS). Each prediction is based on transcribed, structured clinician–patient interview segments.

- **Publication**: [https://doi.org/10.21203/rs.3.rs-6555767/v1](https://doi.org/10.21203/rs.3.rs-6555767/v1)
- **Example dataset**: [https://github.com/webersamantha/MADRS-BERT/data](https://github.com/webersamantha/MADRS-BERT/data)
- **GitHub repo**: The code for data curation, fine-tuning, and evaluation is available at [https://github.com/webersamantha/MADRS-BERT](https://github.com/webersamantha/MADRS-BERT)

This model was developed to support standardized, scalable mental health assessments in both clinical and low-resource settings.

## Model Details

- **Base model**: `bert-base-german-cased`
- **Task**: Ordinal regression (scores 0–6)
- **Language**: German
- **Input**: Text (dialogue segment grouped by MADRS topic)
- **Output**: Predicted score for each MADRS item (rounded integer 0–6)
- **Training data**: Mix of real and synthetic clinician–patient interviews (MADRS-structured)

## Intended Use

This model is intended for research and development use. It is not a certified medical device. The goal is to:

- Explore AI-assisted symptom severity assessment
- Enable structured evaluation of individual MADRS items
- Support clinicians or researchers working in psychiatry/mental health

---

## 🚀 How to Use

### Preprocess Data File:

Organize your data in the same layout as the example (synthetic) data, with the columns `Subject`, `Speaker`, `Transcription`, `Topic`, and `Score`.
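As an illustration of that layout, the sketch below builds a small table with the five columns. The subject ID, topic labels, utterances, and scores are invented for illustration, and repeating the score on every row of a topic is an assumption; the example dataset in the repository remains the authoritative reference.

```python
import pandas as pd

# Hypothetical rows: all values are invented to illustrate the column layout only.
rows = [
    {"Subject": "S01", "Speaker": "Clinician",
     "Transcription": "Wie war Ihre Stimmung in der letzten Woche?",
     "Topic": "Sadness", "Score": 3},
    {"Subject": "S01", "Speaker": "Patient",
     "Transcription": "Ich war die meiste Zeit niedergeschlagen.",
     "Topic": "Sadness", "Score": 3},
    {"Subject": "S01", "Speaker": "Clinician",
     "Transcription": "Wie haben Sie geschlafen?",
     "Topic": "Sleep", "Score": 2},
    {"Subject": "S01", "Speaker": "Patient",
     "Transcription": "Ich wache oft zu früh auf.",
     "Topic": "Sleep", "Score": 2},
]

# One row per utterance; rows sharing a Topic form one dialogue segment.
example_df = pd.DataFrame(
    rows, columns=["Subject", "Speaker", "Transcription", "Topic", "Score"]
)
print(example_df)
```

Saved as an Excel file (for instance with `example_df.to_excel("example_interview.xlsx", index=False)`, which requires `openpyxl`), a table like this can be read and grouped by topic with the loader that follows.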
```python
import pandas as pd

def load_and_prepare_conversations(filepath):
    """Read the interview spreadsheet and build one dialogue string per MADRS topic."""
    df = pd.read_excel(filepath)
    conversations = []
    for topic in df['Topic'].unique():
        topic_df = df[df['Topic'] == topic]
        # Rows with a missing Topic never match the filter and are skipped here.
        if topic_df.empty:
            continue
        # Join the non-empty utterances as "Speaker: text" lines.
        dialogue = "\n".join([
            f"{row['Speaker']}: {row['Transcription']}"
            for _, row in topic_df.iterrows()
            if pd.notnull(row['Transcription'])
        ])
        conversations.append((topic, dialogue))
    return conversations
```

### Load model and tokenizer:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "webesama/MADRS-BERT"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
# Switch to inference mode and move the model to the GPU when one is available.
model.eval().to("cuda" if torch.cuda.is_available() else "cpu")
```

### Predict on a full structured interview / Run inference:

With the conversations prepared as `(topic, dialogue)` pairs, score each MADRS topic as follows:

```python
def predict_madrs_scores(conversations, tokenizer, model):
    """Return a dict mapping each topic to its predicted MADRS item score (0-6)."""
    device = model.device
    predictions = {}
    for topic, dialogue in conversations:
        # Dialogues longer than 512 tokens are truncated.
        inputs = tokenizer(
            dialogue,
            truncation=True,
            padding="max_length",
            max_length=512,
            return_tensors="pt",
        ).to(device)
        with torch.no_grad():
            # Round the single regression output and clamp it to the valid range.
            score = torch.round(model(**inputs).logits).clamp(0, 6).item()
        predictions[topic] = int(score)
    return predictions

file_path = "example_interview.xlsx"
conversations = load_and_prepare_conversations(file_path)
scores = predict_madrs_scores(conversations, tokenizer, model)
print(scores)
```

---

## Acknowledgements

The model was trained and released by [Samantha Weber](https://github.com/webersamantha) within the framework of the [Multicast Project on predicting and treating suicidality](https://www.multicast.uzh.ch/en.html). The research was conducted as part of efforts to improve AI-driven mental health tools.

Thanks to all clinicians and collaborators who contributed to the annotated MADRS dataset.
## Evaluation

The model was evaluated on a held-out clinical validation set and achieved strong performance under both strict and flexible scoring criteria (±1 deviation tolerance). See publication for full metrics.

## Citation

If you use this model, please cite:

> Weber, S. et al. (2025). "Using a Fine-tuned Large Language Model for Symptom-based Depression Evaluation" *Preprint*. https://doi.org/10.21203/rs.3.rs-6555767/v1
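The strict and flexible (±1) scoring criteria mentioned under Evaluation can be illustrated with a minimal sketch. It assumes both criteria are accuracy-style measures over item scores, which may differ from the exact definitions used in the publication; the function names and the scores are hypothetical.

```python
def strict_accuracy(true_scores, predicted_scores):
    """Fraction of items where the prediction equals the reference score."""
    pairs = list(zip(true_scores, predicted_scores))
    return sum(t == p for t, p in pairs) / len(pairs)


def flexible_accuracy(true_scores, predicted_scores, tolerance=1):
    """Fraction of items where the prediction is within `tolerance` points."""
    pairs = list(zip(true_scores, predicted_scores))
    return sum(abs(t - p) <= tolerance for t, p in pairs) / len(pairs)


# Invented scores for five MADRS items, for illustration only.
true_scores = [0, 2, 4, 6, 3]
predicted_scores = [0, 3, 4, 4, 2]

print(strict_accuracy(true_scores, predicted_scores))    # 0.4
print(flexible_accuracy(true_scores, predicted_scores))  # 0.8
```

Here two of the five predictions match exactly, and two more fall within one point of the reference, so the flexible criterion counts four of five as correct.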