edeler committed
Commit d611a6e · verified
1 parent: a6c9f74

Upload 4 files

Files changed (4)
  1. README.md +82 -14
  2. app.py +185 -0
  3. packages.txt +8 -0
  4. requirements.txt +11 -0
README.md CHANGED
@@ -1,14 +1,82 @@
- ---
- title: LorAI
- emoji: 🚀
- colorFrom: gray
- colorTo: yellow
- sdk: gradio
- sdk_version: 5.49.0
- app_file: app.py
- pinned: false
- license: cc-by-nc-4.0
- short_description: Larynx Granuloma Detection
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # 🏥 Medical Image Analysis Tool
+
+ An AI-powered application that combines an object detection model with a medical vision-language model to detect and describe findings in medical images.
+
+ ## Features
+
+ - **Advanced Object Detection**: Uses RF-DETR, a real-time detection transformer, for precise localization
+ - **Medical AI Analysis**: Integrates MedGemma, a specialized medical vision-language model
+ - **Interactive Interface**: Built with Gradio for easy web-based interaction
+ - **Configurable Thresholds**: Adjustable confidence threshold for detection sensitivity
+ - **GPU Acceleration**: Automatically uses the GPU when one is available
+
+ ## Models Used
+
+ - **RF-DETR Medium**: Object detection model that localizes findings in the image
+ - **MedGemma 4B**: Medical vision-language model that generates the written analysis
+
+ ## Usage
+
+ 1. **Upload Image**: Click on the image upload area or drag and drop a medical image
+ 2. **Adjust Settings**: Use the confidence threshold slider to control detection sensitivity
+ 3. **Analyze**: Click "Analyze Image" to run the AI analysis
+ 4. **View Results**: See the annotated image with detected objects and AI-generated descriptions
+
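+ The Space can also be called programmatically. Below is a minimal sketch using `gradio_client`; the Space handle, input file, and endpoint name are placeholders, so check the Space's "Use via API" page for the real values:
+
+ ```python
+ from gradio_client import Client, handle_file
+
+ # Hypothetical Space handle; replace with the actual one.
+ client = Client("user/medical-image-analysis")
+ annotated, report = client.predict(
+     handle_file("sample_image.png"),  # input image
+     0.7,                              # detection threshold
+     api_name="/analyze_image",        # assumed endpoint name (derived from the function name)
+ )
+ print(report)
+ ```
+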
+ ## Installation & Setup
+
+ This application is designed to run on Hugging Face Spaces. The following files are required:
+
+ - `app.py` - Main application file
+ - `requirements.txt` - Python dependencies
+ - `packages.txt` - System packages
+ - Model files in the `models/` directory
+
+ ## Model Files Structure
+
+ The application expects the following model files:
+
+ ```
+ models/
+ ├── medgemma-4b-it/            # MedGemma model files
+ │   ├── config.json
+ │   ├── tokenizer.json
+ │   ├── model-00001-of-00002.safetensors
+ │   └── model-00002-of-00002.safetensors
+ └── rf-detr-medium.pth         # RF-DETR model weights
+ ```
+
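+ The MedGemma weights are not included in this repository. One way to populate `models/medgemma-4b-it/` is a sketch with `huggingface_hub.snapshot_download`; MedGemma is a gated model, so this assumes you have accepted its license on the Hub and are authenticated with a token:
+
+ ```python
+ from huggingface_hub import snapshot_download
+
+ # Download the MedGemma checkpoint into the directory app.py expects.
+ # Gated repo: requires prior license acceptance and `huggingface-cli login`.
+ snapshot_download(
+     repo_id="google/medgemma-4b-it",
+     local_dir="models/medgemma-4b-it",
+ )
+ ```
+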
+ ## Technical Details
+
+ - **Framework**: PyTorch + Transformers
+ - **Interface**: Gradio
+ - **Computer Vision**: OpenCV, PIL, Supervision
+ - **Hardware**: Runs on CPU or GPU; the GPU is used automatically when present
+
+ ## Performance Tips
+
+ - Higher confidence thresholds reduce false positives but may miss subtle findings (see the filtering sketch below)
+ - The application automatically uses GPU acceleration when available
+ - Models are loaded on first use and stay in memory for subsequent analyses
+
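+ To illustrate the threshold trade-off, here is a hypothetical sketch of confidence filtering with the `supervision` library (this assumes a populated `sv.Detections` object with a `confidence` array; it is not how `app.py` currently filters):
+
+ ```python
+ import supervision as sv
+
+ def filter_by_confidence(detections: sv.Detections, threshold: float) -> sv.Detections:
+     """Keep only boxes whose confidence meets the threshold."""
+     # sv.Detections supports boolean-mask indexing over its box arrays.
+     return detections[detections.confidence >= threshold]
+ ```
+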
+ ## Limitations
+
+ - Requires significant computational resources for optimal performance
+ - Tuned for larynx granuloma detection; not validated for other imaging tasks
+ - Results should be verified by qualified medical professionals
+
+ ## Development
+
+ To run locally:
+
+ ```bash
+ pip install -r requirements.txt
+ python app.py
+ ```
+
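+ On Spaces, `demo.launch()` picks up the platform defaults. For local development you may want to bind the server explicitly; a small sketch using standard Gradio launch options:
+
+ ```python
+ # Bind to all interfaces on a fixed port for local testing.
+ demo.launch(server_name="0.0.0.0", server_port=7860)
+ ```
+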
+ ## License
+
+ This project is for research and educational purposes. Medical applications should be developed and validated according to appropriate regulatory standards.
+
+ ## Support
+
+ For issues or questions, please refer to the Hugging Face Space documentation or create an issue in the project repository.
app.py ADDED
@@ -0,0 +1,185 @@
+ import os
+ import gc
+ import json
+ import time
+ import warnings
+ from typing import Dict, List, Optional, Tuple, Any
+ import traceback
+
+ import torch
+ import cv2
+ import numpy as np
+ from PIL import Image
+ import gradio as gr
+
+ # Import ML libraries
+ try:
+     import supervision as sv
+     from transformers import AutoModelForImageTextToText, AutoProcessor
+ except ImportError as e:
+     print(f"Warning: Missing dependencies: {e}")
+
+ # Suppress warnings
+ warnings.filterwarnings("ignore")
+
+ # Model paths - adjust these for your Space
+ MODEL_DIR = "models"
+ RESULTS_DIR = "results"
+ CACHE_DIR = os.path.join(MODEL_DIR, "hf_cache")
+
+
+ class ModelManager:
+     def __init__(self):
+         self.detector = None
+         self.processor = None
+         self.llm_model = None
+         self.device = "cuda" if torch.cuda.is_available() else "cpu"
+
+     def load_models(self):
+         """Load the detection and LLM models"""
+         try:
+             print(f"Loading models on device: {self.device}")
+
+             # Load RF-DETR detector from the models/ directory (a pickled
+             # model object, so weights_only must be False on newer torch)
+             print("Loading RF-DETR detector...")
+             self.detector = torch.load(
+                 os.path.join(MODEL_DIR, "rf-detr-medium.pth"),
+                 map_location=self.device,
+                 weights_only=False,
+             )
+             self.detector.eval()
+
+             # Load MedGemma processor and model
+             print("Loading MedGemma model...")
+             processor_path = os.path.join(MODEL_DIR, "medgemma-4b-it")
+             if os.path.exists(processor_path):
+                 self.processor = AutoProcessor.from_pretrained(processor_path)
+                 self.llm_model = AutoModelForImageTextToText.from_pretrained(
+                     processor_path,
+                     torch_dtype=torch.float16 if self.device == "cuda" else torch.float32,
+                     device_map="auto" if self.device == "cuda" else None,
+                 )
+             else:
+                 print("Warning: MedGemma model not found locally, using basic detection only")
+
+         except Exception as e:
+             print(f"Error loading models: {e}")
+             self.detector = None
+             self.processor = None
+             self.llm_model = None
+
+     def detect_objects(self, image: Image.Image, threshold: float = 0.7) -> Tuple[Image.Image, str]:
+         """Run object detection on the image"""
+         if self.detector is None:
+             return image, "Error: Detector not loaded"
+
+         try:
+             # Convert PIL to numpy
+             image_np = np.array(image)
+
+             # Run detection (simplified - adjust based on your RF-DETR implementation)
+             with torch.no_grad():
+                 # Placeholder call - adapt to your RF-DETR inference API
+                 detections = self.detector(image_np, threshold=threshold)
+
+             # Annotate image
+             annotated_image = self._annotate_image(image_np, detections)
+
+             # Generate description
+             description = self._generate_description(annotated_image, detections)
+
+             return Image.fromarray(annotated_image), description
+
+         except Exception as e:
+             return image, f"Error during detection: {str(e)}"
+
+     def _annotate_image(self, image: np.ndarray, detections) -> np.ndarray:
+         """Annotate image with detections"""
+         # Placeholder annotation - adapt based on your detection format
+         annotated = image.copy()
+
+         # Add detection boxes (adjust based on your detection format)
+         if hasattr(detections, "boxes") and len(detections.boxes) > 0:
+             for box in detections.boxes:
+                 x1, y1, x2, y2 = box.cpu().numpy().astype(int)
+                 cv2.rectangle(annotated, (x1, y1), (x2, y2), (0, 255, 0), 2)
+
+         return annotated
+
+     def _generate_description(self, image: np.ndarray, detections) -> str:
+         """Generate text description using LLM"""
+         if self.processor is None or self.llm_model is None:
+             return "Basic detection completed (LLM not available)"
+
+         try:
+             # Prepare image for LLM
+             pil_image = Image.fromarray(image)
+
+             # Create prompt for medical analysis
+             prompt = "Analyze this medical image and describe any findings related to larynx granuloma or other abnormalities."
+
+             # Process image and text
+             inputs = self.processor(text=prompt, images=pil_image, return_tensors="pt")
+
+             if self.device == "cuda":
+                 inputs = {k: v.to(self.device) for k, v in inputs.items()}
+
+             # Generate response
+             with torch.no_grad():
+                 outputs = self.llm_model.generate(
+                     **inputs,
+                     max_new_tokens=200,
+                     temperature=0.2,
+                     do_sample=True,
+                 )
+
+             # Decode response
+             response = self.processor.batch_decode(outputs, skip_special_tokens=True)[0]
+             return response.strip()
+
+         except Exception as e:
+             return f"LLM analysis failed: {str(e)}"
+
+
+ # Global model manager
+ model_manager = ModelManager()
+
+
+ def analyze_image(image: Image.Image, threshold: float = 0.7) -> Tuple[Optional[Image.Image], str]:
+     """Main function to analyze uploaded image"""
+     # Guard against the change event firing with a cleared image
+     if image is None:
+         return None, "Upload an image to begin analysis"
+
+     if model_manager.detector is None:
+         model_manager.load_models()
+
+     if model_manager.detector is None:
+         return image, "Error: Could not load models. Please check the model files."
+
+     return model_manager.detect_objects(image, threshold)
+
+
+ # Create Gradio interface
+ with gr.Blocks(title="Medical Image Analysis") as demo:
+     gr.Markdown(
+         "# 🏥 Medical Image Analysis Tool\n\n"
+         "Upload a medical image for AI-powered analysis using advanced detection models."
+     )
+
+     with gr.Row():
+         with gr.Column():
+             input_image = gr.Image(type="pil", label="Upload Medical Image")
+             threshold_slider = gr.Slider(
+                 0.1, 1.0, value=0.7, step=0.05,
+                 label="Detection Threshold",
+                 info="Higher values = fewer but more confident detections"
+             )
+             analyze_btn = gr.Button("Analyze Image", variant="primary")
+
+         with gr.Column():
+             output_image = gr.Image(type="pil", label="Analysis Results")
+             description = gr.Markdown(label="AI Analysis", value="Upload an image to begin analysis")
+
+     analyze_btn.click(
+         analyze_image,
+         inputs=[input_image, threshold_slider],
+         outputs=[output_image, description]
+     )
+
+     input_image.change(
+         analyze_image,
+         inputs=[input_image, threshold_slider],
+         outputs=[output_image, description]
+     )
+
+ if __name__ == "__main__":
+     demo.launch()
packages.txt ADDED
@@ -0,0 +1,8 @@
+ libgl1-mesa-glx
+ libglib2.0-0
+ libsm6
+ libxext6
+ libxrender-dev
+ libgomp1
+ ffmpeg
+ build-essential
requirements.txt ADDED
@@ -0,0 +1,11 @@
+ torch>=2.0.0
+ transformers>=4.30.0
+ gradio>=4.0.0
+ pillow>=10.0.0
+ opencv-python>=4.8.0
+ supervision>=0.18.0
+ psutil>=5.9.0
+ numpy>=1.24.0
+ imageio>=2.31.0
+ imageio-ffmpeg>=0.4.8
+ requests>=2.31.0