David Tang
turn off modal endpoint
b7f9927
|
raw
history blame
3.23 kB
metadata
title: Agentic Health Coach Medgemma
emoji: πŸ’¬
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.33.1
app_file: app.py
pinned: true
tags:
  - agent-demo-track
license: mit
short_description: agentic medGemma health coach with vllm.

Youtube explainer (7 mins) Nb. Modal backend is turned off since completion of hackathon. Host your own Modal LLM endpoint by referring to the .py files.

MedGemma Agent: AI-Powered Medical Assistant

πŸ₯ Overview

MedGemma Agent is an advanced AI-powered medical assistant that provides accessible and accurate medical information to patients and non-medical professionals. Built on top of Google's MedGemma model, this application combines state-of-the-art medical language understanding with multimodal capabilities to deliver clear, concise, and reliable medical insights.

✨ Key Features

  • Multimodal Understanding: Process both text queries and medical images
  • Real-time Responses: Stream responses for an interactive experience
  • Wikipedia Integration: Access to verified medical information
  • User-friendly Interface: Clean, modern UI with example queries
  • Secure API: Protected endpoints with API key authentication

πŸš€ Technical Implementation

Backend Architecture

The application is built using:

  • Modal: For serverless deployment and GPU acceleration
  • FastAPI: For robust API endpoints
  • VLLM: For efficient model inference
  • MedGemma-4B: Fine-tuned medical language model
  • Wikipedia API: For additional medical context

Key Components

  1. Model Deployment

    • Utilizes Modal's GPU-accelerated containers
    • Implements efficient model loading with VLLM
    • Supports bfloat16 precision for optimal performance
  2. API Layer

    • Streaming responses for real-time interaction
    • Secure API key authentication
    • Base64 image processing for multimodal inputs
  3. Frontend Interface

    • Built with Gradio for seamless user interaction
    • Custom CSS theming for professional appearance
    • Example queries for common medical scenarios

πŸ› οΈ Usage

  1. Text Queries

    • Ask medical questions in natural language
    • Get clear, patient-friendly explanations
    • Example: "What are the symptoms of a stroke?"
  2. Image Analysis

    • Upload medical images for analysis
    • Get AI-powered insights about the image
    • Supports common medical image formats

πŸ”’ Security

  • API key authentication for all requests
  • Secure image processing
  • Protected model endpoints

πŸ—οΈ Technical Stack

  • Backend: Modal, FastAPI, VLLM
  • Frontend: Gradio
  • Model: MedGemma-4B (unsloth/medgemma-4b-it-unsloth-bnb-4bit)
  • Additional Tools: Wikipedia API for medical context

🎯 Performance

  • Optimized for low latency responses
  • GPU-accelerated inference
  • Efficient memory utilization with 4-bit quantization
  • Maximum context length of 8192 tokens

🀝 Contributing

We welcome contributions! Please feel free to submit issues and pull requests.

πŸ“ License

This project is licensed under the MIT License - see the LICENSE file for details.


Built with ❀️ for the Hugging Face Spaces Hackathon.