title: Agentic Health Coach Medgemma
emoji: 🔬
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.33.1
app_file: app.py
pinned: true
tags:
- agent-demo-track
license: mit
short_description: Agentic MedGemma health coach with vLLM.
YouTube explainer (7 mins). Note: the Modal backend has been turned off since the hackathon ended. To run the app, host your own Modal LLM endpoint, using the .py files as a reference.
MedGemma Agent: AI-Powered Medical Assistant
🏥 Overview
MedGemma Agent is an AI-powered medical assistant that gives patients and non-medical professionals accessible medical information. Built on Google's MedGemma model, it combines medical language understanding with multimodal input (text and images) to deliver clear, concise answers.
✨ Key Features
- Multimodal Understanding: Process both text queries and medical images
- Real-time Responses: Stream responses for an interactive experience
- Wikipedia Integration: Looks up Wikipedia for additional medical context
- User-friendly Interface: Clean, modern UI with example queries
- Secure API: Protected endpoints with API key authentication
Technical Implementation
Backend Architecture
The application is built using:
- Modal: For serverless deployment and GPU acceleration
- FastAPI: For robust API endpoints
- vLLM: For efficient model inference
- MedGemma-4B: Fine-tuned medical language model
- Wikipedia API: For additional medical context
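As an illustration of the Wikipedia lookup, here is a minimal sketch using only the Python standard library and Wikipedia's public REST summary endpoint. The function names `summary_url` and `fetch_summary` and the User-Agent string are assumptions made for this example; the app's own .py files may fetch context differently.

```python
import json
import urllib.parse
import urllib.request

# Public Wikipedia REST endpoint that returns a short summary of a page.
WIKI_SUMMARY = "https://en.wikipedia.org/api/rest_v1/page/summary/"


def summary_url(title: str) -> str:
    """Build the summary URL for a page title (spaces become underscores)."""
    slug = title.strip().replace(" ", "_")
    return WIKI_SUMMARY + urllib.parse.quote(slug, safe="")


def fetch_summary(title: str, timeout: float = 10.0) -> str:
    """Return the plain-text extract for a page (requires network access)."""
    request = urllib.request.Request(
        summary_url(title),
        # Hypothetical User-Agent; Wikipedia asks clients to identify themselves.
        headers={"User-Agent": "medgemma-agent-readme-example/0.1"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response).get("extract", "")
```

Calling `fetch_summary("Stroke")` would return a short extract that could be appended to the model prompt as extra context.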
Key Components
Model Deployment
- Utilizes Modal's GPU-accelerated containers
- Loads the model with vLLM
- Runs inference in bfloat16 precision
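The settings named in this README (bfloat16, an 8192-token context, and the 4-bit Unsloth checkpoint) could be passed to vLLM roughly as follows. This is a sketch, not the project's actual loader: the helper names are hypothetical, and `quantization="bitsandbytes"` is an assumption about how the pre-quantized checkpoint is loaded.

```python
# Model ID as listed under Technical Stack in this README.
MODEL_ID = "unsloth/medgemma-4b-it-unsloth-bnb-4bit"


def engine_kwargs() -> dict:
    """Collect the keyword arguments for vllm.LLM in one place."""
    return {
        "model": MODEL_ID,
        "dtype": "bfloat16",             # precision stated in this README
        "max_model_len": 8192,           # context length stated in this README
        "quantization": "bitsandbytes",  # assumption: matches the bnb-4bit checkpoint
    }


def load_engine():
    """Create the engine; needs a GPU and the vllm package installed."""
    from vllm import LLM  # imported lazily so this module loads without vllm

    return LLM(**engine_kwargs())
```

Keeping the arguments in a plain dictionary makes them easy to inspect or override before the GPU container starts.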
API Layer
- Streaming responses for real-time interaction
- Secure API key authentication
- Base64 image processing for multimodal inputs
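To make the API layer concrete, the sketch below shows how a client might package a text query and an optional image for such an endpoint. The payload fields (`prompt`, `image`, `stream`) and the `X-API-Key` header name are hypothetical; check the .py files for the names your own deployment expects.

```python
import base64
import json
from typing import Dict, Optional, Tuple


def build_request(prompt: str, api_key: str,
                  image_bytes: Optional[bytes] = None) -> Tuple[Dict[str, str], bytes]:
    """Return (headers, JSON body) for one query to a hypothetical endpoint."""
    payload = {"prompt": prompt, "stream": True}
    if image_bytes is not None:
        # JSON cannot carry raw bytes, so the image travels as Base64 text.
        payload["image"] = base64.b64encode(image_bytes).decode("ascii")
    headers = {
        "Content-Type": "application/json",
        "X-API-Key": api_key,  # hypothetical header name
    }
    return headers, json.dumps(payload).encode("utf-8")
```

The server would reverse the encoding with `base64.b64decode` before handing the image to the model.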
Frontend Interface
- Built with Gradio for seamless user interaction
- Custom CSS theming for professional appearance
- Example queries for common medical scenarios
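Gradio can stream output when the handler is a generator that yields the reply so far after each chunk. The helper below is a minimal sketch of that accumulation step; the name `stream_reply` is hypothetical, and the chunks would come from the backend's streaming response.

```python
from typing import Iterable, Iterator


def stream_reply(chunks: Iterable[str]) -> Iterator[str]:
    """Yield the reply so far each time a new chunk arrives."""
    reply = ""
    for chunk in chunks:
        reply += chunk
        yield reply


# Example: three chunks produce three progressively longer partial replies.
partials = list(stream_reply(["A stroke ", "can cause ", "sudden numbness."]))
```

Each yielded string replaces the previous one in the UI, so the user sees the answer grow as tokens arrive.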
🛠️ Usage
Text Queries
- Ask medical questions in natural language
- Get clear, patient-friendly explanations
- Example: "What are the symptoms of a stroke?"
Image Analysis
- Upload medical images for analysis
- Get AI-powered insights about the image
- Supports common medical image formats
Security
- API key authentication for all requests
- Secure image processing
- Protected model endpoints
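A server-side key check can be as small as the sketch below. It assumes the expected key is stored in an environment variable, here hypothetically named `API_KEY`, and uses a constant-time comparison so response timing does not leak how much of a guessed key was correct.

```python
import hmac
import os
from typing import Optional


def is_authorized(presented_key: Optional[str],
                  expected_key: Optional[str] = None) -> bool:
    """Return True only if the presented key matches the expected key."""
    if expected_key is None:
        # Hypothetical variable name; use whatever secret your deployment defines.
        expected_key = os.environ.get("API_KEY", "")
    if not presented_key or not expected_key:
        return False  # reject missing keys and unconfigured servers
    return hmac.compare_digest(presented_key.encode("utf-8"),
                               expected_key.encode("utf-8"))
```

A request handler would call `is_authorized` first and respond with HTTP 401 when it returns `False`.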
🏗️ Technical Stack
- Backend: Modal, FastAPI, vLLM
- Frontend: Gradio
- Model: MedGemma-4B (unsloth/medgemma-4b-it-unsloth-bnb-4bit)
- Additional Tools: Wikipedia API for medical context
🎯 Performance
- Optimized for low-latency responses
- GPU-accelerated inference
- Efficient memory utilization with 4-bit quantization
- Maximum context length of 8192 tokens
🤝 Contributing
We welcome contributions! Please feel free to submit issues and pull requests.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Built with ❤️ for the Hugging Face Spaces Hackathon.