edeler committed on
Commit e6cb34f · verified
1 Parent(s): d611a6e

lorai (#1)

- Update app.py with Spaces-optimized medical image analysis and enhanced README (46d6674ae12fcd833c37edb5df9b4e72cbab8790)
- Add proper Space metadata to README for better Space configuration (12c0045ea2c4166f7e0372f8e70fdfbcedb5d7e4)

Files changed (2)
  1. README.md +48 -19
  2. app.py +451 -136
README.md CHANGED
@@ -1,3 +1,15 @@
 # 🏥 Medical Image Analysis Tool
 
 An AI-powered medical image analysis application using advanced detection models and large language models for medical image interpretation.
@@ -8,17 +20,23 @@ An AI-powered medical image analysis application using advanced detection models
 - **Medical AI Analysis**: Integrates MedGemma, a specialized medical vision-language model
 - **Interactive Interface**: Built with Gradio for easy web-based interaction
 - **Configurable Thresholds**: Adjustable confidence thresholds for detection sensitivity
- - **GPU Acceleration**: Optimized for GPU usage when available
 
 ## Models Used
 
 - **RF-DETR Medium**: State-of-the-art object detection model
- - **MedGemma 4B**: Medical-specialized vision-language model for analysis and descriptions
 
 ## Usage
 
 1. **Upload Image**: Click on the image upload area or drag and drop a medical image
- 2. **Adjust Settings**: Use the confidence threshold slider to control detection sensitivity
 3. **Analyze**: Click "Analyze Image" to run the AI analysis
 4. **View Results**: See the annotated image with detected objects and AI-generated descriptions
 
@@ -26,24 +44,28 @@ An AI-powered medical image analysis application using advanced detection models
 
 This application is designed to run on Hugging Face Spaces. The following files are required:
 
- - `app.py` - Main application file
 - `requirements.txt` - Python dependencies
 - `packages.txt` - System packages
- - Model files in the `models/` directory
 
- ## Model Files Structure
 
- The application expects the following model files:
 
- ```
- models/
- ├── medgemma-4b-it/                    # MedGemma model files
- │   ├── config.json
- │   ├── tokenizer.json
- │   ├── model-00001-of-00002.safetensors
- │   └── model-00002-of-00002.safetensors
- └── rf-detr-medium.pth                 # RF-DETR model weights
- ```
 
 ## Technical Details
 
@@ -54,9 +76,11 @@ models/
 
 ## Performance Tips
 
- - Higher confidence thresholds reduce false positives but may miss subtle findings
- - The application automatically uses GPU acceleration when available
- - Model loading happens on first use and is cached for subsequent analyses
 
 ## Limitations
 
@@ -73,6 +97,11 @@ pip install -r requirements.txt
 python app.py
 ```
 
 ## License
 
 This project is for research and educational purposes. Medical applications should be developed and validated according to appropriate regulatory standards.
 
+ ---
+ title: Medical Image Analysis Tool
+ emoji: 🏥
+ colorFrom: blue
+ colorTo: green
+ sdk: gradio
+ sdk_version: "4.0.0"
+ app_file: app.py
+ pinned: false
+ license: mit
+ ---
+
 # 🏥 Medical Image Analysis Tool
 
 An AI-powered medical image analysis application using advanced detection models and large language models for medical image interpretation.

 - **Medical AI Analysis**: Integrates MedGemma, a specialized medical vision-language model
 - **Interactive Interface**: Built with Gradio for easy web-based interaction
 - **Configurable Thresholds**: Adjustable confidence thresholds for detection sensitivity
+ - **Model Size Selection**: Choose between MedGemma 4B (faster) or 27B (more accurate) models
+ - **GPU Acceleration**: Optimized for GPU usage when available with 4-bit quantization
+ - **Automatic Model Downloads**: Models download automatically from Hugging Face Hub
 
 ## Models Used
 
 - **RF-DETR Medium**: State-of-the-art object detection model
+ - **MedGemma 4B/27B**: Medical-specialized vision-language models for analysis and descriptions
+   - 4B model: Faster inference, lower memory usage
+   - 27B model: Higher accuracy, requires more resources
 
 ## Usage
 
 1. **Upload Image**: Click on the image upload area or drag and drop a medical image
+ 2. **Adjust Settings**:
+    - Use the confidence threshold slider to control detection sensitivity
+    - Select model size (4B for speed, 27B for accuracy)
 3. **Analyze**: Click "Analyze Image" to run the AI analysis
 4. **View Results**: See the annotated image with detected objects and AI-generated descriptions
 

 This application is designed to run on Hugging Face Spaces. The following files are required:
 
+ - `app.py` - Main application file (optimized for Spaces)
 - `requirements.txt` - Python dependencies
 - `packages.txt` - System packages
+ - `README.md` - This documentation
 
+ ## Model Loading
 
+ **RF-DETR Model:**
+ - Upload your trained `rf-detr-medium.pth` file to the Space
+ - The application will automatically find and load it
 
+ **MedGemma Models:**
+ - Models download automatically from Hugging Face Hub on first use
+ - No manual installation required
+ - Choose between 4B (faster) or 27B (more accurate) models
+
+ ## Space Configuration
+
+ For optimal performance, configure your Space settings:
+ - **Hardware**: GPU (T4 minimum, A100 recommended for 27B models)
+ - **Storage**: Enable persistent storage for model caching
+ - **Timeout**: 30+ minutes for large model downloads
 
 ## Technical Details
 

 ## Performance Tips
 
+ - **Model Selection**: Use MedGemma 4B for faster processing or 27B for higher accuracy
+ - **Confidence Thresholds**: Higher values reduce false positives but may miss subtle findings
+ - **GPU Acceleration**: The application automatically uses GPU acceleration when available
+ - **Memory Optimization**: Uses 4-bit quantization to reduce memory usage
+ - **Model Caching**: Models are cached after first load for faster subsequent analyses
 
 ## Limitations
 

 python app.py
 ```
 
+ **Note**: For local development, you'll need to:
+ 1. Install the RF-DETR package or ensure it's available
+ 2. Place your `rf-detr-medium.pth` file in the project directory
+ 3. Models will download automatically on first run
+
 ## License
 
 This project is for research and educational purposes. Medical applications should be developed and validated according to appropriate regulatory standards.
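
The updated README notes that the MedGemma weights are fetched from the Hugging Face Hub on first use and recommends persistent storage for caching. If cold starts are a concern, the weights can be pre-fetched into the cache before the first request. The snippet below is a minimal sketch, not part of this commit; it assumes `huggingface_hub` is installed and that an `HF_TOKEN` secret grants access to the gated MedGemma repositories.

```python
# Optional warm-up sketch: pre-download MedGemma weights into the Hub cache
# so the first analysis request does not stall on a multi-GB download.
# Assumptions: huggingface_hub is available; HF_TOKEN can access the gated repo.
import os
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="google/medgemma-4b-it",       # or "google/medgemma-27b-it"
    token=os.environ.get("HF_TOKEN"),      # gated model: a token is required
)
```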
app.py CHANGED
@@ -1,185 +1,500 @@
 import os
- import gc
 import json
 import time
- import warnings
- from typing import Dict, List, Optional, Tuple, Any
 import traceback
 
 import torch
- import cv2
- import numpy as np
- from PIL import Image
 import gradio as gr
 
- # Import ML libraries
 try:
-     import supervision as sv
-     from transformers import AutoModelForImageTextToText, AutoProcessor
- except ImportError as e:
-     print(f"Warning: Missing dependencies: {e}")
 
- # Suppress warnings
- warnings.filterwarnings("ignore")
 
- # Model paths - adjust these for your Space
- MODEL_DIR = "models"
- RESULTS_DIR = "results"
- CACHE_DIR = os.path.join(MODEL_DIR, "hf_cache")
 
- class ModelManager:
     def __init__(self):
-         self.detector = None
-         self.processor = None
-         self.llm_model = None
-         self.device = "cuda" if torch.cuda.is_available() else "cpu"
 
-     def load_models(self):
-         """Load the detection and LLM models"""
         try:
-             print(f"Loading models on device: {self.device}")
-
-             # Load RF-DETR detector
-             print("Loading RF-DETR detector...")
-             self.detector = torch.load("rf-detr-medium.pth", map_location=self.device)
-             self.detector.eval()
-
-             # Load MedGemma processor and model
-             print("Loading MedGemma model...")
-             processor_path = os.path.join(MODEL_DIR, "medgemma-4b-it")
-             if os.path.exists(processor_path):
-                 self.processor = AutoProcessor.from_pretrained(processor_path)
-                 self.llm_model = AutoModelForImageTextToText.from_pretrained(
-                     processor_path,
-                     torch_dtype=torch.float16 if self.device == "cuda" else torch.float32,
-                     device_map="auto" if self.device == "cuda" else None
                 )
-             else:
-                 print("Warning: MedGemma model not found locally, using basic detection only")
 
-         except Exception as e:
-             print(f"Error loading models: {e}")
-             self.detector = None
-             self.processor = None
-             self.llm_model = None
 
-     def detect_objects(self, image: Image.Image, threshold: float = 0.7) -> Tuple[Image.Image, str]:
-         """Run object detection on the image"""
-         if self.detector is None:
-             return image, "Error: Detector not loaded"
 
         try:
-             # Convert PIL to numpy
-             image_np = np.array(image)
 
-             # Run detection (simplified - adjust based on your RF-DETR implementation)
-             with torch.no_grad():
-                 # This is a placeholder - you'll need to adapt based on your RF-DETR usage
-                 detections = self.detector(image_np, threshold=threshold)
 
-             # Annotate image
-             annotated_image = self._annotate_image(image_np, detections)
 
-             # Generate description
-             description = self._generate_description(annotated_image, detections)
 
-             return Image.fromarray(annotated_image), description
 
         except Exception as e:
-             return image, f"Error during detection: {str(e)}"
 
-     def _annotate_image(self, image: np.ndarray, detections) -> np.ndarray:
-         """Annotate image with detections"""
-         # Placeholder annotation - adapt based on your detection format
-         annotated = image.copy()
 
-         # Add detection boxes (adjust based on your detection format)
-         if hasattr(detections, 'boxes') and len(detections.boxes) > 0:
-             for box in detections.boxes:
-                 x1, y1, x2, y2 = box.cpu().numpy().astype(int)
-                 cv2.rectangle(annotated, (x1, y1), (x2, y2), (0, 255, 0), 2)
 
-         return annotated
 
-     def _generate_description(self, image: np.ndarray, detections) -> str:
-         """Generate text description using LLM"""
-         if self.processor is None or self.llm_model is None:
-             return "Basic detection completed (LLM not available)"
 
         try:
-             # Prepare image for LLM
-             pil_image = Image.fromarray(image)
-
-             # Create prompt for medical analysis
-             prompt = "Analyze this medical image and describe any findings related to larynx granuloma or other abnormalities."
 
-             # Process image and text
-             inputs = self.processor(text=prompt, images=pil_image, return_tensors="pt")
 
-             if self.device == "cuda":
-                 inputs = {k: v.to(self.device) for k, v in inputs.items()}
 
-             # Generate response
-             with torch.no_grad():
-                 outputs = self.llm_model.generate(
-                     **inputs,
-                     max_new_tokens=200,
-                     temperature=0.2,
-                     do_sample=True
-                 )
 
-             # Decode response
-             response = self.processor.batch_decode(outputs, skip_special_tokens=True)[0]
-             return response.strip()
 
-         except Exception as e:
-             return f"LLM analysis failed: {str(e)}"
 
- # Global model manager
- model_manager = ModelManager()
 
- def analyze_image(image: Image.Image, threshold: float = 0.7, use_llm: bool = True) -> Tuple[Image.Image, str]:
-     """Main function to analyze uploaded image"""
-     if model_manager.detector is None:
-         model_manager.load_models()
 
-     if model_manager.detector is None:
-         return image, "Error: Could not load models. Please check the model files."
 
-     return model_manager.detect_objects(image, threshold)
 
- # Create Gradio interface
- with gr.Blocks(title="Medical Image Analysis") as demo:
-     gr.Markdown(
-         "# 🏥 Medical Image Analysis Tool\n\n"
-         "Upload a medical image for AI-powered analysis using advanced detection models."
-     )
 
-     with gr.Row():
-         with gr.Column():
-             input_image = gr.Image(type="pil", label="Upload Medical Image")
-             threshold_slider = gr.Slider(
-                 0.1, 1.0, value=0.7, step=0.05,
-                 label="Detection Threshold",
-                 info="Higher values = fewer but more confident detections"
-             )
-             analyze_btn = gr.Button("Analyze Image", variant="primary")
 
-         with gr.Column():
-             output_image = gr.Image(type="pil", label="Analysis Results")
-             description = gr.Markdown(label="AI Analysis", value="Upload an image to begin analysis")
 
-     analyze_btn.click(
-         analyze_image,
-         inputs=[input_image, threshold_slider],
-         outputs=[output_image, description]
-     )
 
-     input_image.change(
-         analyze_image,
-         inputs=[input_image, threshold_slider],
-         outputs=[output_image, description]
     )
 
 if __name__ == "__main__":
-     demo.launch()
 
 import os
 import json
+ import gc
 import time
 import traceback
+ from typing import Dict, List, Optional, Tuple, Callable, Any
 
 import torch
 import gradio as gr
+ import supervision as sv
+ from PIL import Image
 
+ # Try to import optional dependencies
+ try:
+     from transformers import (
+         AutoModelForCausalLM,
+         AutoTokenizer,
+         AutoModelForImageTextToText,
+         AutoProcessor,
+         BitsAndBytesConfig,
+     )
+ except Exception:
+     AutoModelForCausalLM = None
+     AutoTokenizer = None
+     AutoModelForImageTextToText = None
+     AutoProcessor = None
+     BitsAndBytesConfig = None
+
+ # Import RF-DETR (assumes it's in the same directory or installed)
 try:
+     from rfdetr import RFDETRMedium
+ except ImportError:
+     print("Warning: RF-DETR not found. Please ensure it's properly installed.")
+     RFDETRMedium = None
 
+ # ============================================================================
+ # Configuration for Hugging Face Spaces
+ # ============================================================================
 
+ class SpacesConfig:
+     """Configuration optimized for Hugging Face Spaces."""
 
     def __init__(self):
+         self.settings = {
+             'results_dir': '/tmp/results',
+             'checkpoint': None,
+             'resolution': 576,
+             'threshold': 0.7,
+             'use_llm': True,
+             'llm_model_id': 'google/medgemma-4b-it',
+             'llm_max_new_tokens': 200,
+             'llm_temperature': 0.2,
+             'llm_4bit': True,
+             'enable_caching': True,
+             'max_cache_size': 100,
+         }
+
+     def get(self, key: str, default: Any = None) -> Any:
+         return self.settings.get(key, default)
+
+ # ============================================================================
+ # Memory Management (simplified for Spaces)
+ # ============================================================================
+
+ class MemoryManager:
+     """Simplified memory management for Spaces."""
 
+     def __init__(self):
+         self.memory_thresholds = {
+             'gpu_warning': 0.8,
+             'system_warning': 0.85,
+         }
+
+     def cleanup_memory(self, force: bool = False) -> None:
+         """Perform memory cleanup."""
         try:
+             gc.collect()
+             if torch and torch.cuda.is_available():
+                 torch.cuda.empty_cache()
+                 torch.cuda.synchronize()
+         except Exception as e:
+             print(f"Memory cleanup error: {e}")
+
+ # Global memory manager
+ memory_manager = MemoryManager()
+
+ # ============================================================================
+ # Model Loading
+ # ============================================================================
+
+ def find_checkpoint() -> Optional[str]:
+     """Find RF-DETR checkpoint in various locations."""
+     candidates = [
+         "rf-detr-medium.pth",  # Current directory
+         "/tmp/results/checkpoint_best_total.pth",
+         "/tmp/results/checkpoint_best_ema.pth",
+         "/tmp/results/checkpoint_best_regular.pth",
+         "/tmp/results/checkpoint.pth",
+     ]
+
+     for path in candidates:
+         if os.path.isfile(path):
+             return path
+     return None
+
+ def load_model(checkpoint_path: str, resolution: int):
+     """Load RF-DETR model."""
+     if RFDETRMedium is None:
+         raise RuntimeError("RF-DETR not available. Please install it properly.")
+
+     model = RFDETRMedium(pretrain_weights=checkpoint_path, resolution=resolution)
+     try:
+         model.optimize_for_inference()
+     except Exception:
+         pass
+     return model
+
+ # ============================================================================
+ # LLM Integration
+ # ============================================================================
+
+ class TextGenerator:
+     """Simplified text generator for Spaces."""
+
+     def __init__(self, model_id: str, max_tokens: int = 200, temperature: float = 0.2):
+         self.model_id = model_id
+         self.max_tokens = max_tokens
+         self.temperature = temperature
+         self.model = None
+         self.tokenizer = None
+         self.processor = None
+         self.is_multimodal = False
+
+     def load_model(self):
+         """Load the LLM model."""
+         if self.model is not None:
+             return
+
+         if (AutoModelForCausalLM is None and AutoModelForImageTextToText is None):
+             raise RuntimeError("Transformers not available")
+
+         # Clear memory before loading
+         memory_manager.cleanup_memory()
+
+         print(f"Loading model: {self.model_id}")
+
+         model_kwargs = {
+             "device_map": "auto",
+             "low_cpu_mem_usage": True,
+         }
+
+         if torch and torch.cuda.is_available():
+             model_kwargs["torch_dtype"] = torch.bfloat16
+
+         # Use 4-bit quantization if available
+         if BitsAndBytesConfig is not None:
+             try:
+                 compute_dtype = torch.bfloat16 if torch and torch.cuda.is_available() else torch.float16
+                 model_kwargs["quantization_config"] = BitsAndBytesConfig(
+                     load_in_4bit=True,
+                     bnb_4bit_compute_dtype=compute_dtype,
+                     bnb_4bit_use_double_quant=True,
+                     bnb_4bit_quant_type="nf4"
                 )
+                 model_kwargs["torch_dtype"] = compute_dtype
+             except Exception:
+                 pass
 
+         # Check if it's a multimodal model
+         is_multimodal = "medgemma" in self.model_id.lower()
+
+         if is_multimodal and AutoModelForImageTextToText is not None and AutoProcessor is not None:
+             self.processor = AutoProcessor.from_pretrained(self.model_id)
+             self.model = AutoModelForImageTextToText.from_pretrained(self.model_id, **model_kwargs)
+             self.is_multimodal = True
+         elif AutoModelForCausalLM is not None and AutoTokenizer is not None:
+             self.tokenizer = AutoTokenizer.from_pretrained(self.model_id)
+             self.model = AutoModelForCausalLM.from_pretrained(self.model_id, **model_kwargs)
+             self.is_multimodal = False
+         else:
+             raise RuntimeError("Required model classes not available")
+
+         print("✓ Model loaded successfully")
 
+     def generate(self, text: str, image: Optional[Image.Image] = None) -> str:
+         """Generate text using the loaded model."""
+         self.load_model()
+
+         if self.model is None:
+             return f"[Model not loaded: {text}]"
 
         try:
+             # Create messages
+             system_text = "You are a concise medical assistant. Provide a brief, clear summary of detection results. Avoid repetition and be direct. Do not give medical advice."
+             user_text = f"Summarize these detection results in 3 clear sentences:\n\n{text}"
+
+             if self.is_multimodal:
+                 # Multimodal model
+                 user_content = [{"type": "text", "text": user_text}]
+                 if image is not None:
+                     user_content.append({"type": "image", "image": image})
+
+                 messages = [
+                     {"role": "system", "content": [{"type": "text", "text": system_text}]},
+                     {"role": "user", "content": user_content},
+                 ]
+
+                 inputs = self.processor.apply_chat_template(
+                     messages,
+                     add_generation_prompt=True,
+                     tokenize=True,
+                     return_dict=True,
+                     return_tensors="pt",
+                 )
 
+                 if torch:
+                     inputs = inputs.to(self.model.device, dtype=torch.bfloat16)
 
+                 with torch.inference_mode():
+                     generation = self.model.generate(
+                         **inputs,
+                         max_new_tokens=self.max_tokens,
+                         do_sample=self.temperature > 0,
+                         temperature=max(0.01, self.temperature) if self.temperature > 0 else None,
+                         use_cache=False,
+                     )
 
+                 input_len = inputs["input_ids"].shape[-1]
+                 generation = generation[0][input_len:]
+                 decoded = self.processor.decode(generation, skip_special_tokens=True)
+                 return decoded.strip()
+
+             else:
+                 # Text-only model
+                 messages = [
+                     {"role": "system", "content": system_text},
+                     {"role": "user", "content": user_text},
+                 ]
+
+                 inputs = self.tokenizer.apply_chat_template(
+                     messages,
+                     add_generation_prompt=True,
+                     tokenize=True,
+                     return_dict=True,
+                     return_tensors="pt",
+                 )
+
+                 inputs = inputs.to(self.model.device)
 
+                 with torch.inference_mode():
+                     generation = self.model.generate(
+                         **inputs,
+                         max_new_tokens=self.max_tokens,
+                         do_sample=self.temperature > 0,
+                         temperature=max(0.01, self.temperature) if self.temperature > 0 else None,
+                         use_cache=False,
+                     )
+
+                 input_len = inputs["input_ids"].shape[-1]
+                 generation = generation[0][input_len:]
+                 decoded = self.tokenizer.decode(generation, skip_special_tokens=True)
+                 return decoded.strip()
 
         except Exception as e:
+             error_msg = f"[Generation error: {e}]"
+             print(f"Generation error: {traceback.format_exc()}")
+             return f"{error_msg}\n\n{text}"
 
+ # ============================================================================
+ # Application State
+ # ============================================================================
 
+ class AppState:
+     """Application state for Spaces."""
 
+     def __init__(self):
+         self.config = SpacesConfig()
+         self.model = None
+         self.class_names = None
+         self.text_generator = None
+
+     def load_model(self):
+         """Load the detection model."""
+         if self.model is not None:
+             return
+
+         checkpoint = find_checkpoint()
+         if not checkpoint:
+             raise FileNotFoundError(
+                 "No RF-DETR checkpoint found. Please upload rf-detr-medium.pth to your Space."
+             )
 
+         print(f"Loading RF-DETR from: {checkpoint}")
+         self.model = load_model(checkpoint, self.config.get('resolution'))
 
+         # Try to load class names
         try:
+             results_json = "/tmp/results/results.json"
+             if os.path.isfile(results_json):
+                 with open(results_json, 'r') as f:
+                     data = json.load(f)
+                 classes = []
+                 for split in ("valid", "test", "train"):
+                     if "class_map" in data and split in data["class_map"]:
+                         for item in data["class_map"][split]:
+                             name = item.get("class")
+                             if name and name != "all" and name not in classes:
+                                 classes.append(name)
+                 self.class_names = classes if classes else None
+         except Exception:
+             pass
+
+         print("✓ RF-DETR model loaded")
+
+     def get_text_generator(self, model_size: str = "4B") -> TextGenerator:
+         """Get or create text generator."""
+         # Determine model ID based on size selection
+         model_id = 'google/medgemma-27b-it' if model_size == "27B" else 'google/medgemma-4b-it'
+
+         # Check if we need to create a new generator for different model size
+         if (self.text_generator is None or
+                 hasattr(self.text_generator, 'model_id') and
+                 self.text_generator.model_id != model_id):
+
+             max_tokens = self.config.get('llm_max_new_tokens')
+             temperature = self.config.get('llm_temperature')
+
+             self.text_generator = TextGenerator(model_id, max_tokens, temperature)
+         return self.text_generator
+
+ # ============================================================================
+ # UI and Inference
+ # ============================================================================
+
+ def create_detection_interface():
+     """Create the Gradio interface."""
+
+     # Color palette for annotations
+     COLOR_PALETTE = sv.ColorPalette.from_hex([
+         "#ffff00", "#ff9b00", "#ff66ff", "#3399ff", "#ff66b2",
+         "#ff8080", "#b266ff", "#9999ff", "#66ffff", "#33ff99",
+         "#66ff66", "#99ff00",
+     ])
+
+     def annotate_image(image: Image.Image, threshold: float, model_size: str = "4B") -> Tuple[Image.Image, str]:
+         """Process an image and return annotated version with description."""
+
+         if image is None:
+             return None, "Please upload an image."
 
+         try:
+             # Load model if needed
+             app_state.load_model()
 
+             # Run detection
+             detections = app_state.model.predict(image, threshold=threshold)
 
+             # Annotate image
+             bbox_annotator = sv.BoxAnnotator(color=COLOR_PALETTE, thickness=2)
+             label_annotator = sv.LabelAnnotator(text_scale=0.5, text_color=sv.Color.BLACK)
 
+             labels = []
+             for i in range(len(detections)):
+                 class_id = int(detections.class_id[i]) if detections.class_id is not None else None
+                 conf = float(detections.confidence[i]) if detections.confidence is not None else 0.0
 
+                 if app_state.class_names and class_id is not None:
+                     if 0 <= class_id < len(app_state.class_names):
+                         label_name = app_state.class_names[class_id]
+                     else:
+                         label_name = str(class_id)
+                 else:
+                     label_name = str(class_id) if class_id is not None else "object"
 
+                 labels.append(f"{label_name} {conf:.2f}")
 
+             annotated = image.copy()
+             annotated = bbox_annotator.annotate(annotated, detections)
+             annotated = label_annotator.annotate(annotated, detections, labels)
 
+             # Generate description
+             description = f"Found {len(detections)} detections above threshold {threshold}:\n\n"
+
+             if len(detections) > 0:
+                 counts = {}
+                 for i in range(len(detections)):
+                     class_id = int(detections.class_id[i]) if detections.class_id is not None else None
+                     if app_state.class_names and class_id is not None:
+                         if 0 <= class_id < len(app_state.class_names):
+                             name = app_state.class_names[class_id]
+                         else:
+                             name = str(class_id)
+                     else:
+                         name = str(class_id) if class_id is not None else "object"
+                     counts[name] = counts.get(name, 0) + 1
+
+                 for name, count in counts.items():
+                     description += f"- {count}× {name}\n"
+
+                 # Use LLM for description if enabled
+                 if app_state.config.get('use_llm'):
+                     try:
+                         generator = app_state.get_text_generator(model_size)
+                         llm_description = generator.generate(description, image=annotated)
+                         description = llm_description
+                     except Exception as e:
+                         description = f"[LLM error: {e}]\n\n{description}"
+             else:
+                 description += "No objects detected above the confidence threshold."
 
+             return annotated, description
 
+         except Exception as e:
+             error_msg = f"Error processing image: {str(e)}"
+             print(f"Processing error: {traceback.format_exc()}")
+             return None, error_msg
+
+     # Create the interface
+     with gr.Blocks(title="Medical Image Analysis", theme=gr.themes.Soft()) as demo:
+         gr.Markdown("# 🏥 Medical Image Analysis")
+         gr.Markdown("Upload a medical image to detect and analyze findings using AI.")
+
+         with gr.Row():
+             with gr.Column():
+                 input_image = gr.Image(type="pil", label="Upload Image", height=400)
+                 threshold_slider = gr.Slider(
+                     minimum=0.1,
+                     maximum=1.0,
+                     value=0.7,
+                     step=0.05,
+                     label="Confidence Threshold",
+                     info="Higher values = fewer but more confident detections"
+                 )
 
+                 model_size_radio = gr.Radio(
+                     choices=["4B", "27B"],
+                     value="4B",
+                     label="MedGemma Model Size",
+                     info="4B: Faster, less memory | 27B: More accurate, more memory"
+                 )
 
+                 analyze_btn = gr.Button("🔍 Analyze Image", variant="primary")
 
+             with gr.Column():
+                 output_image = gr.Image(type="pil", label="Results", height=400)
+                 output_text = gr.Textbox(
+                     label="Analysis Results",
+                     lines=8,
+                     max_lines=15,
+                     show_copy_button=True
+                 )
 
+         # Wire up the interface
+         analyze_btn.click(
+             fn=annotate_image,
+             inputs=[input_image, threshold_slider, model_size_radio],
+             outputs=[output_image, output_text]
+         )
+
+         # Also run when image is uploaded
+         input_image.change(
+             fn=annotate_image,
+             inputs=[input_image, threshold_slider, model_size_radio],
+             outputs=[output_image, output_text]
+         )
+
+         # Footer
+         gr.Markdown("---")
+         gr.Markdown("*Powered by RF-DETR and MedGemma • Built for Hugging Face Spaces*")
+
+     return demo
+
+ # ============================================================================
+ # Main Application
+ # ============================================================================
+
+ # Global app state
+ app_state = AppState()
+
+ def main():
+     """Main entry point for the Spaces app."""
+     print("🚀 Starting Medical Image Analysis App")
+
+     # Ensure results directory exists
+     os.makedirs(app_state.config.get('results_dir'), exist_ok=True)
+
+     # Create and launch the interface
+     demo = create_detection_interface()
+
+     # Launch with Spaces-optimized settings
+     demo.launch(
+         server_name="0.0.0.0",
+         server_port=7860,
+         share=False,  # Spaces handles this
+         show_error=True,
+         show_api=False,
     )
 
 if __name__ == "__main__":
+     main()
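
For a quick local check of the detection path introduced in this commit, the new module's helpers can be exercised without launching the Gradio UI. The following is a minimal sketch, assuming the new file is importable as `app`, the `rfdetr` package is installed, an RF-DETR checkpoint sits in one of the locations `find_checkpoint()` searches, and `sample.png` is a hypothetical test image.

```python
# Minimal smoke test for the new detection path (no Gradio UI).
from PIL import Image

import app  # the Spaces entry point added in this commit (assumed importable)

checkpoint = app.find_checkpoint()
if checkpoint is None:
    raise SystemExit("No RF-DETR checkpoint found; upload rf-detr-medium.pth first.")

model = app.load_model(checkpoint, resolution=576)   # resolution mirrors SpacesConfig
image = Image.open("sample.png").convert("RGB")      # hypothetical test image
detections = model.predict(image, threshold=0.7)     # same call app.py makes
print(f"{len(detections)} detections above threshold 0.7")
```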