ft42 committed
Commit 7f24887 · verified · 1 Parent(s): e035517

Upload 21 files

added scr, bash, and documentation files
docs/HUGGINGFACE_MODEL_CARD.md ADDED
---
language: en
license: cc-by-nc-4.0
library_name: pins-toolkit
tags:
- medical-imaging
- computed-tomography
- pulmonary-nodules
- radiomics
- segmentation
- lung-cancer
- ct-analysis
- pyradiomics
- simpleitk
- pytorch
- monai
- opencv
- docker
datasets:
- dlcs24
metrics:
- dice-coefficient
- feature-reproducibility
pipeline_tag: image-segmentation
widget:
- example_title: Lung Nodule Segmentation
  text: Automated segmentation of pulmonary nodules in chest CT scans
model-index:
- name: PiNS
  results:
  - task:
      type: image-segmentation
      name: Medical Image Segmentation
    dataset:
      name: DLCS24
      type: medical-ct
    metrics:
    - type: dice-coefficient
      value:
      name: Dice Similarity Coefficient
---

# PiNS - Point-driven Nodule Segmentation

<div align="center">
<p align="center">
  <img src="assets/PiNS_logo.png" alt="PiNS Logo" width="500">
</p>

**Medical imaging toolkit for automated pulmonary nodule detection, segmentation, and quantitative analysis**

[![Docker Hub](https://img.shields.io/docker/pulls/ft42/pins?logo=docker)](https://hub.docker.com/r/ft42/pins)
[![License](https://img.shields.io/badge/License-CC--BY--NC--4.0-blue.svg)](https://creativecommons.org/licenses/by-nc/4.0/)
[![Python](https://img.shields.io/badge/Python-3.9+-green.svg)](https://python.org)
[![Medical Imaging](https://img.shields.io/badge/Medical-Imaging-red.svg)](https://simpleitk.org)
[![PyTorch](https://img.shields.io/badge/PyTorch-2.8.0-orange.svg)](https://pytorch.org)
[![MONAI](https://img.shields.io/badge/MONAI-1.4.0-blue.svg)](https://monai.io)

[🚀 Quick Start](#quick-start) • [📖 Documentation](https://github.com/ft42/PiNS/blob/main/docs/TECHNICAL_DOCUMENTATION.md) • [💻 GitHub](https://github.com/ft42/PiNS) • [🐳 Docker Hub](https://hub.docker.com/r/ft42/pins)

</div>

## Overview

**PiNS (Point-driven Nodule Segmentation)** is a medical imaging toolkit designed for the analysis of pulmonary nodules in computed tomography (CT) scans. The toolkit provides three core functionalities:

🎯 **Automated Segmentation** - Multi-algorithm nodule segmentation with clinical validation
📊 **Quantitative Radiomics** - 100+ standardized imaging biomarkers
🧩 **3D Patch Extraction** - Deep learning-ready data preparation

## Model Architecture & Algorithms

### Segmentation Pipeline

```mermaid
graph TB
    A[CT Image + Coordinates] --> B[Coordinate Transformation]
    B --> C[ROI Extraction]
    C --> D{Segmentation Algorithm}
    D --> E[K-means Clustering]
    D --> F[Gaussian Mixture Model]
    D --> G[Fuzzy C-Means]
    D --> H[Otsu Thresholding]
    E --> I[Connected Components]
    F --> I
    G --> I
    H --> I
    I --> J[Morphological Operations]
    J --> K["Expansion (2 mm)"]
    K --> L[Binary Mask Output]
```

### Core Algorithms

1. **K-means Clustering** (default)
   - Binary classification: nodule vs. background
   - Euclidean distance metric
   - Automatic initialization

2. **Gaussian Mixture Model**
   - Probabilistic clustering approach
   - Expectation-maximization optimization
   - Suitable for heterogeneous nodules

3. **Fuzzy C-Means**
   - Soft clustering with membership degrees
   - Iterative optimization
   - Robust to noise and partial-volume effects

4. **Otsu Thresholding**
   - Automatic threshold selection
   - Histogram-based method
   - Fast execution for large datasets

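The default k=2 intensity clustering can be sketched in a few lines of NumPy. This is an illustrative re-implementation under simplifying assumptions (deterministic min/max initialization, intensity-only features), not the toolkit's actual code:

```python
import numpy as np

def kmeans_binary(hu_values, n_iter=50):
    """Cluster voxel intensities into two groups (background vs. nodule).

    Minimal sketch of the k=2 K-means step; the real pipeline works on a
    3D ROI and follows up with connected-component filtering.
    """
    x = np.asarray(hu_values, dtype=float).ravel()
    centroids = np.array([x.min(), x.max()])  # simple, deterministic init
    for _ in range(n_iter):
        labels = np.abs(x[:, None] - centroids[None, :]).argmin(axis=1)
        new = np.array([x[labels == k].mean() if np.any(labels == k)
                        else centroids[k] for k in (0, 1)])
        if np.allclose(new, centroids):
            break
        centroids = new
    # Convention: the cluster with the higher mean HU is the candidate nodule
    return labels == centroids.argmax(), centroids

# Toy ROI: 90 air-like voxels (-900 HU) plus a 10-voxel soft-tissue blob (40 HU)
roi = np.concatenate([np.full(90, -900.0), np.full(10, 40.0)])
mask, centroids = kmeans_binary(roi)
```

The GMM and FCM alternatives slot into the same position in the pipeline, replacing only the voxel-labeling rule.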
## Quick Start

### Prerequisites
- Docker 20.10.0+ installed
- 8 GB+ RAM
- 15 GB+ free disk space

### Installation & Usage

```bash
# 1. Pull the Docker image (the pipeline scripts also pull it automatically)
docker pull ft42/pins:latest

# 2. Clone the repository
git clone https://github.com/ft42/PiNS.git
cd PiNS

# 3. Run the segmentation pipeline
./scripts/DLCS24_KNN_2mm_Extend_Seg.sh

# 4. Extract radiomics features
./scripts/DLCS24_KNN_2mm_Extend_Radiomics.sh

# 5. Generate ML-ready patches
./scripts/DLCS24_CADe_64Qpatch.sh
```

### Expected Output

```
✅ Segmentation completed!
📊 Features extracted: 107 radiomics features per nodule
🧩 Patches generated: 64×64×64 voxel volumes
📁 Results saved to: demofolder/output/
```

152
+ ## Input Data Requirements
153
+
154
+
155
+
156
+ ### Image Specifications
157
+ - **Format**: NIfTI (.nii.gz) or DICOM
158
+ - **Modality**: CT chest/abdomen/CAP scans
159
+ - **Resolution**: 0.5-2.0 mm isotropic (preferred)
160
+ - **Matrix size**: 512×512 or larger
161
+ - **Bit depth**: 16-bit signed integers
162
+ - **Intensity range**: Standard HU values (-1024 to +3071)
163
+ - **Sample Dataset:** Duke Lung Cancer Screening Dataset 2024(DLCS24)[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.13799069.svg)](https://doi.org/10.5281/zenodo.13799069)
164
+ ### Annotation Format
165
+
166
+ ```csv
167
+ ct_nifti_file,nodule_id,coordX,coordY,coordZ,w,h,d,Malignant_lbl
168
+ patient001.nii.gz,patient001_01,-106.55,-63.84,-211.68,4.39,4.39,4.30,0
169
+ patient001.nii.gz,patient001_02,88.69,39.48,-126.09,6.24,6.24,6.25,1
170
+ ```
171
+
172
+ **Column Descriptions**:
173
+ - `coordX/Y/Z`: World coordinates in millimeters (ITK/SimpleITK standard)
174
+ - `w/h/d`: Bounding box dimensions in millimeters
175
+ - `Malignant_lbl`: Binary malignancy label (0=benign, 1=malignant)
176
+
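For orientation, this is how a `coordX/Y/Z` annotation maps onto a voxel index. The NumPy sketch below assumes an identity direction matrix and hypothetical origin/spacing values; in practice SimpleITK's `TransformPhysicalPointToIndex` handles the general (rotated) case:

```python
import numpy as np

def world_to_voxel(coord_mm, origin_mm, spacing_mm):
    """Convert an (x, y, z) world coordinate in mm to a voxel index.

    Simplified sketch assuming an identity direction matrix.
    """
    coord = np.asarray(coord_mm, dtype=float)
    origin = np.asarray(origin_mm, dtype=float)
    spacing = np.asarray(spacing_mm, dtype=float)
    return np.round((coord - origin) / spacing).astype(int)

# First annotation row above, with illustrative image geometry
idx = world_to_voxel((-106.55, -63.84, -211.68),
                     origin_mm=(-250.0, -250.0, -400.0),
                     spacing_mm=(0.7, 0.7, 1.25))
```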
## Output Specifications

### 1. Segmentation Masks
- **Format**: NIfTI binary masks (.nii.gz)
- **Values**: 0 (background), 1 (nodule)
- **Coordinate system**: Aligned with the input CT
- **Quality**: Sub-voxel precision boundaries

### 2. Radiomics Features

**Feature Categories** (107 total features):

| Category | Count | Description |
|----------|-------|-------------|
| **Shape** | 14 | Volume, Surface Area, Sphericity, Compactness |
| **First-order** | 18 | Mean, Std, Skewness, Kurtosis, Percentiles |
| **GLCM** | 24 | Contrast, Correlation, Energy, Homogeneity |
| **GLRLM** | 16 | Run Length Non-uniformity, Gray Level Variance |
| **GLSZM** | 16 | Size Zone Matrix features |
| **GLDM** | 14 | Dependence Matrix features |
| **NGTDM** | 5 | Neighboring Gray Tone Difference |

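To make the texture categories concrete, here is a toy 2D GLCM for a single pixel offset, with Contrast and Energy computed from it. This is a didactic sketch; PyRadiomics computes the symmetrized, multi-offset 3D version on the discretized image:

```python
import numpy as np

def glcm(img, levels, dx=1, dy=0):
    """Normalized Gray Level Co-occurrence Matrix for one offset (dx, dy)."""
    img = np.asarray(img)
    P = np.zeros((levels, levels), dtype=float)
    h, w = img.shape
    for y in range(h - dy):
        for x in range(w - dx):
            P[img[y, x], img[y + dy, x + dx]] += 1  # count co-occurring pairs
    return P / P.sum()

img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 2, 2, 2],
                [2, 2, 3, 3]])
P = glcm(img, levels=4)
i, j = np.indices(P.shape)
contrast = float(((i - j) ** 2 * P).sum())   # GLCM Contrast
energy = float((P ** 2).sum())               # GLCM Energy (angular second moment)
```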
### 3. 3D Patches
- **Dimensions**: 64×64×64 voxels (configurable)
- **Normalization**: Lung window (-1000 to 500 HU) → [0,1]
- **Format**: Individual NIfTI files per nodule
- **Centering**: Precise coordinate-based positioning

## Configuration Options

### Algorithm Selection
```bash
SEG_ALG="knn"        # Options: knn, gmm, fcm, otsu
EXPANSION_MM=2.0     # Expansion radius in millimeters
```

### Radiomics Parameters
```json
{
  "binWidth": 25,
  "resampledPixelSpacing": [1, 1, 1],
  "interpolator": "sitkBSpline",
  "labelInterpolator": "sitkNearestNeighbor"
}
```

### Patch Extraction
```bash
PATCH_SIZE="64 64 64"           # Voxel dimensions
NORMALIZATION="-1000 500 0 1"   # HU window and output range
```

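The `NORMALIZATION="-1000 500 0 1"` setting corresponds to a clip-and-rescale operation, which can be sketched as follows (the clipping of out-of-window values is an assumption here, consistent with the patch pipeline's `CLIP` option):

```python
import numpy as np

def window_normalize(hu, lo=-1000.0, hi=500.0, out_lo=0.0, out_hi=1.0):
    """Clip intensities to the lung window and min-max rescale them."""
    hu = np.asarray(hu, dtype=float)
    clipped = np.clip(hu, lo, hi)
    return (clipped - lo) / (hi - lo) * (out_hi - out_lo) + out_lo

# -1200 HU clips to the window floor, 900 HU to the ceiling
vals = window_normalize([-1200, -1000, -250, 500, 900])
```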
## Use Cases & Applications

### 🔬 Research Applications
- **Biomarker Discovery**: Large-scale radiomics studies
- **Algorithm Development**: Standardized evaluation protocols
- **Multi-institutional Studies**: Reproducible feature extraction
- **Longitudinal Analysis**: Change assessment over time

### 🤖 AI/ML Applications
- **Training Data Preparation**: Standardized patch generation
- **Feature Engineering**: Comprehensive radiomics features
- **Model Validation**: Consistent preprocessing pipeline
- **Transfer Learning**: Pre-processed medical imaging data

## Technical Specifications

### Docker Container Details
- **Base Image**: Ubuntu 20.04 LTS
- **Size**: ~1.5 GB
- **Python**: 3.9+
- **Key Libraries**:
  - SimpleITK 2.2.1+ (medical image processing)
  - PyRadiomics 3.1.0+ (feature extraction)
  - scikit-learn 1.3.0+ (machine learning algorithms)
  - pandas 2.0.3+ (data manipulation)

### Performance Characteristics
- **Memory Usage**: ~500 MB per nodule
- **Processing Speed**: Linear scaling with nodule count
- **Concurrent Processing**: Multi-threading support
- **Storage Requirements**: ~1 MB per output mask

## Validation & Quality Assurance

**Evaluation Criteria:** In the absence of voxel-level ground truth, we adopted a bounding box–supervised evaluation strategy to assess segmentation performance. Each CT volume was accompanied by annotations specifying the nodule center in world coordinates and its dimensions in millimeters; these were converted into voxel indices using the image spacing and clipped to the volume boundaries. A binary mask representing the bounding box was then constructed and used as a weak surrogate for ground truth. We extracted a patch centered on the bounding box, extending it by a fixed margin (64 voxels) to define the volume of interest (VOI). Predicted segmentation masks were cropped to the same VOI-constrained region of interest, and performance was quantified with the Dice similarity coefficient, computed per lesion. This strategy enables consistent comparison of segmentation algorithms under weak supervision while acknowledging the limitations of lacking voxel-level annotations.

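The per-lesion scoring step can be illustrated with a 1D toy example; the box mask acts as the weak ground truth and the prediction is cropped to the same VOI before scoring (the ±16-voxel margin below is illustrative, whereas the protocol above uses a 64-voxel margin):

```python
import numpy as np

def dice(a, b):
    """Dice similarity coefficient between two binary masks."""
    a, b = np.asarray(a, bool), np.asarray(b, bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

box = np.zeros(100, bool); box[40:60] = True     # annotation-derived box mask
pred = np.zeros(100, bool); pred[42:58] = True   # predicted segmentation
voi = slice(max(40 - 16, 0), min(60 + 16, 100))  # box extended by a fixed margin
score = dice(pred[voi], box[voi])
```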
Segmentation performance of **KNN (ours, PiNS)**, **VISTA3D auto**, and **VISTA3D points** ([He et al. 2024](https://github.com/Project-MONAI/VISTA/tree/main/vista3d)) across nodule size buckets. (Top) Bar plots display the mean Dice similarity coefficient for each model and size category. (Bottom) Boxplots show the distribution of Dice scores, with boxes representing the interquartile range, horizontal lines indicating the median, whiskers extending to 1.5× the interquartile range, and circles denoting outliers.

<p align="center">
  <img src="assets/Segmentation_Evaluation_KNNVista3Dauto_DLCS24_HIST.png" alt="(a)" width="700">
</p>

<p align="center">
  <img src="assets/Segmentation_Evaluation_KNNVista3Dauto_DLCS24_BOX.png" alt="(b)" width="700">
</p>

## Limitations & Considerations

### Current Limitations
- **Nodule Size**: Optimized for nodules 3-30 mm in diameter
- **Image Quality**: Requires standard clinical CT protocols
- **Coordinate Accuracy**: Dependent on annotation precision
- **Processing Time**: Sequential processing (parallelization possible)

## Contributing & Development

### Research Collaborations
We welcome collaborations from:
- **Academic Medical Centers**
- **Radiology Departments**
- **Medical AI Companies**
- **Open Source Contributors**

## Citation & References

### Primary Citation
```bibtex
@software{pins2025,
  title={PiNS: Point-driven Nodule Segmentation Toolkit},
  author={Fakrul Islam Tushar},
  year={2025},
  url={https://github.com/fitushar/PiNS},
  version={1.0.0},
  doi={10.5281/zenodo.17171571},
  license={CC-BY-NC-4.0}
}
```

### Related Publications
1. **AI in Lung Health: Benchmarking**: [Tushar et al., arXiv (2024)](https://arxiv.org/abs/2405.04605)
2. **AI in Lung Health: Benchmarking (code)**: [https://github.com/fitushar/AI-in-Lung-Health-Benchmarking](https://github.com/fitushar/AI-in-Lung-Health-Benchmarking-Detection-and-Diagnostic-Models-Across-Multiple-CT-Scan-Datasets)
3. **DLCS Dataset**: [Wang et al., Radiology AI 2024](https://doi.org/10.1148/ryai.240248); [Zenodo](https://zenodo.org/records/13799069)
4. **SYN-LUNGS**: [Tushar et al., arXiv 2025](https://arxiv.org/abs/2502.21187)
5. **Refining Focus in AI for Lung Cancer**: Comparing Lesion-Centric and Chest-Region Models with Performance Insights from Internal and External Validation. [![arXiv](https://img.shields.io/badge/arXiv-2411.16823-b31b1b.svg)](https://arxiv.org/abs/2411.16823)
6. **Peritumoral Expansion Radiomics** for Improved Lung Cancer Classification. [![arXiv](https://img.shields.io/badge/arXiv-2411.16008-b31b1b.svg)](https://arxiv.org/abs/2411.16008)
7. **PyRadiomics Framework**: [van Griethuysen et al., Cancer Research 2017](https://pubmed.ncbi.nlm.nih.gov/29092951/)

## License & Usage

**License: CC-BY-NC-4.0**

### Academic Use License
This project is released for **academic and non-commercial research purposes only**.
You are free to use, modify, and distribute this code under the following conditions:
- ✅ Academic research use permitted
- ✅ Modification and redistribution permitted for research
- ❌ Commercial use prohibited without prior written permission

For commercial licensing inquiries, please contact: [email protected]

## Support & Community

### Getting Help
- **📖 Documentation**: [Comprehensive technical docs](https://github.com/fitushar/PiNS/blob/main/docs/)
- **🐛 Issues**: [GitHub Issues](https://github.com/fitushar/PiNS/issues)
- **💬 Discussions**: [GitHub Discussions](https://github.com/fitushar/PiNS/discussions)
- **📧 Email**: [email protected]; [email protected]

### Community Stats
- **Publications**: 5+ research papers
- **Contributors**: Active open-source community

---
docs/TECHNICAL_DOCUMENTATION.md ADDED
# PiNS (Point-driven Nodule Segmentation) - Technical Documentation

### Version: 1.0.0
### Author: Fakrul Islam Tushar ([email protected])
### Date: September 2025
### License: CC-BY-NC-4.0

---

## Table of Contents

1. [Overview](#overview)
2. [Architecture](#architecture)
3. [Core Components](#core-components)
4. [Technical Specifications](#technical-specifications)
5. [API Reference](#api-reference)
6. [Implementation Details](#implementation-details)
7. [Computational Efficiency](#computational-efficiency)
8. [Research Applications](#research-applications)
9. [License and Usage Terms](#license-and-usage-terms)
10. [Validation & Quality Assurance](#validation--quality-assurance)
11. [Limitations & Considerations](#limitations--considerations)
12. [Contributing & Development](#contributing--development)

---

## Overview

### Abstract

PiNS (Point-driven Nodule Segmentation) is a medical imaging toolkit designed for automated detection, segmentation, and analysis of pulmonary nodules in computed tomography (CT) scans. The toolkit provides an end-to-end pipeline from coordinate-based nodule identification to quantitative radiomics feature extraction.

### Key Capabilities

**1. Automated Nodule Segmentation**
- K-means clustering-based segmentation with configurable expansion
- Multi-algorithm support (K-means, Gaussian Mixture Models, Fuzzy C-Means, Otsu)
- Sub-voxel precision coordinate handling
- Adaptive region growing with millimeter-based expansion

**2. Quantitative Radiomics Analysis**
- PyRadiomics-compliant feature extraction
- 100+ standardized imaging biomarkers
- IBSI-compatible feature calculations
- Configurable intensity normalization and resampling

**3. Patch-based Data Preparation**
- 3D volumetric patch extraction (64³ default)
- Standardized intensity windowing for lung imaging
- Deep learning-ready data formatting
- Automated coordinate-to-voxel transformation

### Clinical Significance

PiNS addresses critical challenges in pulmonary nodule analysis:
- **Reproducibility**: Standardized segmentation protocols
- **Quantification**: Objective radiomics-based characterization
- **Scalability**: Batch processing capabilities for research cohorts
- **Interoperability**: NIfTI support with Docker containerization

---

## Architecture

### System Design

```
PiNS Architecture
├── Input Layer
│   ├── CT DICOM/NIfTI Images
│   ├── Coordinate Annotations (World/Voxel)
│   └── Configuration Parameters
├── Processing Layer
│   ├── Image Preprocessing
│   │   ├── Intensity Normalization
│   │   ├── Resampling & Interpolation
│   │   └── Coordinate Transformation
│   ├── Segmentation Engine
│   │   ├── K-means Clustering
│   │   ├── Region Growing
│   │   └── Morphological Operations
│   └── Feature Extraction
│       ├── Shape Features
│       ├── First-order Statistics
│       ├── Texture Features (GLCM, GLRLM, GLSZM, GLDM)
│       └── Wavelet Features
└── Output Layer
    ├── Segmentation Masks (NIfTI)
    ├── Quantitative Features (CSV)
    ├── Image Patches (NIfTI)
    └── Processing Logs
```

### Technology Stack

- **Containerization**: Docker (Ubuntu 20.04 base)
- **Medical Imaging**: SimpleITK, PyRadiomics 3.1.0+
- **Scientific Computing**: NumPy, SciPy, scikit-learn
- **Data Management**: Pandas, NiBabel
- **Visualization**: Matplotlib
- **Languages**: Python 3.8+, Bash scripting

---

## Core Components

### Component 1: Nodule Segmentation Pipeline

**Script**: `DLCS24_KNN_2mm_Extend_Seg.sh`
**Purpose**: Automated segmentation of pulmonary nodules from coordinate annotations

**Algorithm Workflow**:
1. **Coordinate Processing**: Transform world coordinates to voxel indices
2. **Region Initialization**: Create a bounding box around the nodule center
3. **Clustering Segmentation**: Apply K-means with k=2 (nodule vs. background)
4. **Connected Component Analysis**: Extract the largest connected component
5. **Morphological Refinement**: Apply expansion based on clinical parameters
6. **Quality Control**: Validate segmentation size and connectivity

**Technical Parameters**:
- Expansion radius: 2.0 mm (configurable)
- Clustering algorithm: K-means (alternatives: GMM, FCM, Otsu)
- Output format: NIfTI (.nii.gz)
- Coordinate system: ITK/SimpleITK standard

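The millimeter-based expansion of step 5 amounts to a morphological dilation with a physically sized ball. The following is a naive NumPy sketch, not the toolkit's implementation (which would typically use ITK-style dilation filters); `np.roll` wraps at the borders, which is harmless for this interior example:

```python
import numpy as np

def expand_mask(mask, spacing_mm, radius_mm):
    """Expand a binary mask by a physical radius via brute-force dilation.

    A voxel becomes foreground if any original foreground voxel lies
    within radius_mm of it, given the per-axis voxel spacing.
    """
    mask = np.asarray(mask, bool)
    sp = np.asarray(spacing_mm, float)
    r = (radius_mm / sp).astype(int)  # structuring-element half-size in voxels
    out = mask.copy()
    grids = np.meshgrid(*[np.arange(-ri, ri + 1) for ri in r], indexing="ij")
    offsets = np.stack(grids, axis=-1).reshape(-1, mask.ndim)
    for off in offsets:
        if np.linalg.norm(off * sp) <= radius_mm:  # keep a ball-shaped kernel
            out |= np.roll(mask, tuple(off), axis=tuple(range(mask.ndim)))
    return out

# A single seed voxel grown by 2 mm on a 1 mm isotropic grid
seed = np.zeros((7, 7, 7), dtype=bool)
seed[3, 3, 3] = True
grown = expand_mask(seed, spacing_mm=(1.0, 1.0, 1.0), radius_mm=2.0)
```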
### Component 2: Radiomics Feature Extraction

**Script**: `DLCS24_KNN_2mm_Extend_Radiomics.sh`
**Purpose**: Quantitative imaging biomarker extraction from segmented nodules

**Feature Categories**:
1. **Shape Features (14 features)**
   - Sphericity, Compactness, Surface Area
   - Volume, Maximum Diameter
   - Elongation, Flatness

2. **First-order Statistics (18 features)**
   - Mean, Median, Standard Deviation
   - Skewness, Kurtosis, Entropy
   - Percentiles (10th, 90th)

3. **Second-order Texture (75+ features)**
   - Gray Level Co-occurrence Matrix (GLCM)
   - Gray Level Run Length Matrix (GLRLM)
   - Gray Level Size Zone Matrix (GLSZM)
   - Gray Level Dependence Matrix (GLDM)

4. **Higher-order Features (100+ features, when filtered images are enabled)**
   - Wavelet decomposition features
   - Laplacian of Gaussian filters

**Normalization Protocol**:
- Bin width: 25 HU
- Resampling: 1×1×1 mm³
- Interpolation: B-spline (image), nearest neighbor (mask)

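A few of the first-order features can be written out directly. This simplified NumPy sketch uses the 25 HU bin width from the protocol above for the entropy term; PyRadiomics' IBSI-conformant definitions differ in details (e.g. bias-corrected moments, its own discretization):

```python
import numpy as np

def first_order(x, bin_width=25.0):
    """Compute a handful of first-order radiomics-style features."""
    x = np.asarray(x, float).ravel()
    mu, sd = x.mean(), x.std()
    z = (x - mu) / sd
    # Discretize intensities into fixed-width bins before the entropy sum
    _, counts = np.unique(np.floor(x / bin_width), return_counts=True)
    p = counts / counts.sum()
    return {
        "Mean": float(mu),
        "StdDev": float(sd),
        "Skewness": float((z ** 3).mean()),
        "Kurtosis": float((z ** 4).mean()),          # non-excess kurtosis
        "Entropy": float(-(p * np.log2(p)).sum()),   # over 25 HU bins
        "P10": float(np.percentile(x, 10)),
        "P90": float(np.percentile(x, 90)),
    }

feats = first_order([-50, -25, 0, 0, 25, 25, 25, 50])
```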
### Component 3: Patch Extraction Pipeline

**Script**: `DLCS24_CADe_64Qpatch.sh`
**Purpose**: 3D volumetric patch extraction for deep learning applications

**Patch Specifications**:
- **Dimensions**: 64×64×64 voxels (configurable)
- **Centering**: World coordinate-based positioning
- **Windowing**: -1000 to 500 HU (lung window)
- **Normalization**: Min-max scaling to [0,1]
- **Boundary Handling**: Zero-padding for edge cases

**Output Format**:
- Individual NIfTI files per nodule
- CSV metadata with coordinates and labels
- Standardized naming convention

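The zero-padded extraction can be sketched as follows; this is a NumPy illustration of the boundary handling described above, not the toolkit's code:

```python
import numpy as np

def extract_patch(volume, center_idx, size=(64, 64, 64)):
    """Extract a fixed-size patch centered on a voxel index.

    Regions that fall outside the volume are left as zeros, matching
    the zero-padding boundary handling.
    """
    volume = np.asarray(volume)
    patch = np.zeros(size, dtype=volume.dtype)
    src, dst = [], []
    for c, s, dim in zip(center_idx, size, volume.shape):
        start = c - s // 2                         # patch origin in volume space
        lo, hi = max(start, 0), min(start + s, dim)
        src.append(slice(lo, hi))                  # region read from the volume
        dst.append(slice(lo - start, hi - start))  # where it lands in the patch
    patch[tuple(dst)] = volume[tuple(src)]
    return patch

# A patch near the volume edge: the first two slices stay zero-padded
vol = np.ones((16, 16, 16), dtype=np.float32)
p = extract_patch(vol, center_idx=(2, 8, 8), size=(8, 8, 8))
```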
---

## Technical Specifications

### Hardware Requirements

**Minimum Requirements**:
- CPU: 4 cores, 2.0 GHz
- RAM: 8 GB
- Storage: 50 GB available space
- Docker: 20.10.0+

**Recommended Configuration**:
- CPU: 8+ cores, 3.0+ GHz
- RAM: 16+ GB
- Storage: 100+ GB SSD
- GPU: CUDA-compatible (for future ML extensions)

### Input Data Requirements

**Image Specifications**:
- Format: NIfTI
- Modality: CT (chest)
- Resolution: 0.5-2.0 mm voxel spacing
- Matrix size: 512×512 or larger
- Bit depth: 16-bit signed integers

**Annotation Format**:
```csv
ct_nifti_file,nodule_id,coordX,coordY,coordZ,w,h,d,Malignant_lbl
DLCS_0001.nii.gz,DLCS_0001_01,-106.55,-63.84,-211.68,4.39,4.39,4.30,0
```

**Required Columns**:
- `ct_nifti_file`: Image filename
- `coordX/Y/Z`: World coordinates (mm)
- `w/h/d`: Bounding box dimensions (mm)
- `Malignant_lbl`: Binary label (optional)

## API Reference

### Bash Script Interface

#### Segmentation Script
```bash
./scripts/DLCS24_KNN_2mm_Extend_Seg.sh
```

**Configuration Variables**:
```bash
DATASET_NAME="DLCSD24"   # Dataset identifier
SEG_ALG="knn"            # Segmentation algorithm
EXPANSION_MM=2.0         # Expansion radius (mm)
RAW_DATA_PATH="/app/demofolder/data/DLCS24/"
DATASET_CSV="/app/demofolder/data/DLCSD24_Annotations_N2.csv"
```

#### Radiomics Script
```bash
./scripts/DLCS24_KNN_2mm_Extend_Radiomics.sh
```

**Additional Parameters**:
```bash
EXTRACT_RADIOMICS_FLAG="--extract_radiomics"
PARAMS_JSON="/app/scr/Pyradiomics_feature_extarctor_pram.json"
```

#### Patch Extraction Script
```bash
./scripts/DLCS24_CADe_64Qpatch.sh
```

**Patch Parameters**:
```bash
PATCH_SIZE="64 64 64"           # Voxel dimensions
NORMALIZATION="-1000 500 0 1"   # HU window and output range
CLIP="True"                     # Enable intensity clipping
```

### Python API (Internal)

#### Segmentation Function
```python
def candidateSeg_main():
    """
    Main segmentation pipeline.

    Parameters
    ----------
    raw_data_path : str
        Path to input CT images
    dataset_csv : str
        Path to coordinate annotations
    seg_alg : str
        Segmentation algorithm {'knn', 'gmm', 'fcm', 'otsu'}
    expansion_mm : float
        Expansion radius in millimeters

    Returns
    -------
    None (saves masks to disk)
    """
```

#### Radiomics Function
```python
def seg_pyradiomics_main():
    """
    Radiomics feature extraction pipeline.

    Parameters
    ----------
    params_json : str
        PyRadiomics configuration file
    extract_radiomics : bool
        Enable feature extraction

    Returns
    -------
    features : DataFrame
        Quantitative imaging features
    """
```

---

## Implementation Details

### Docker Container Specifications

**Image**: `ft42/pins:latest`
**Size**: ~11 GB (includes CUDA libraries)
**Dependencies**:
```dockerfile
# Core medical imaging libraries
SimpleITK>=2.4
pyradiomics==3.1.0
scikit-learn==1.3.0

# Deep learning and computer vision
torch==2.8.0
torchvision==0.23.0
monai==1.4.0
opencv-python-headless==4.11.0

# Scientific computing and data processing
numpy==1.24.4
scipy==1.11.1
pandas==2.0.3
nibabel==5.1.0
matplotlib==3.7.1

# Utilities
tqdm==4.65.0
```

### File Organization

```
PiNS/
├── scripts/
│   ├── DLCS24_KNN_2mm_Extend_Seg.sh
│   ├── DLCS24_KNN_2mm_Extend_Radiomics.sh
│   └── DLCS24_CADe_64Qpatch.sh
├── scr/
│   ├── candidateSeg_pipiline.py
│   ├── candidateSeg_radiomicsExtractor_pipiline.py
│   ├── candidate_worldCoord_patchExtarctor_pipeline.py
│   ├── cvseg_utils.py
│   └── Pyradiomics_feature_extarctor_pram.json
├── demofolder/
│   ├── data/
│   │   ├── DLCS24/
│   │   └── DLCSD24_Annotations_N2.csv
│   └── output/
└── docs/
    ├── README.md
    ├── TECHNICAL_DOCUMENTATION.md
    └── HUGGINGFACE_MODEL_CARD.md
```

### Configuration Management

**PyRadiomics Parameters** (`Pyradiomics_feature_extarctor_pram.json`):
```json
{
  "binWidth": 25,
  "resampledPixelSpacing": [1, 1, 1],
  "interpolator": "sitkBSpline",
  "labelInterpolator": "sitkNearestNeighbor"
}
```

**Segmentation Parameters**:
- K-means clusters: 2 (nodule vs. background)
- Connected component selection: largest component
- Morphological operations: binary closing with a 1 mm kernel

---

### Computational Efficiency

**Processing Time Analysis**:
- Segmentation: 15-30 seconds per nodule
- Radiomics extraction: 5-10 seconds per mask
- Patch extraction: 2-5 seconds per patch
- Total pipeline: <2 minutes per case

**Scalability Analysis**:
- Linear scaling with nodule count
- Memory usage: ~500 MB per concurrent image
- Disk I/O: ~50 MB/s sustained throughput
- CPU utilization: 85-95% (multi-threaded operations)

---

## Research Applications

### Diagnostic Imaging

**Lung Cancer Screening**:
- Automated nodule characterization
- Growth assessment in follow-up studies
- Risk stratification based on radiomics profiles

**Research Applications**:
- Biomarker discovery studies
- Machine learning dataset preparation
- Multi-institutional validation studies

### Integration Pathways

**AI Pipeline Integration**:
- Preprocessed patch data for CNNs
- Feature vectors for traditional ML
- Standardized evaluation protocols

---


## License and Usage Terms

### Creative Commons Attribution-NonCommercial 4.0 International (CC-BY-NC-4.0)

**Permitted Uses**:
- Research and educational purposes
- Academic publications and presentations
- Non-commercial clinical research
- Open-source contributions and modifications

**Requirements**:
- Attribution to the original authors and the PiNS toolkit
- Citation of relevant publications
- Sharing of derivative works under the same license
- Clear indication of any modifications made

**Restrictions**:
- Commercial use requires a separate licensing agreement
- No warranty or liability provided
- Contact [email protected] for commercial licensing

**Citation Requirements**:

```bibtex
@software{pins2025,
  title={PiNS: Point-driven Nodule Segmentation Toolkit},
  author={Fakrul Islam Tushar},
  year={2025},
  url={https://github.com/fitushar/PiNS},
  version={1.0.0},
  doi={10.5281/zenodo.17171571},
  license={CC-BY-NC-4.0}
}
```

---
467
+
468
+ ## Validation & Quality Assurance
469
+
470
+ **Evaluation Criteria:** In the absence of voxel-level ground truth, we adopted a bounding box–supervised evaluation strategy to assess segmentation performance. Each CT volume was accompanied by annotations specifying the nodule center in world coordinates and its dimensions in millimeters, which were converted into voxel indices using the image spacing and clipped to the volume boundaries. A binary mask representing the bounding box was then constructed and used as a weak surrogate for ground truth. we extracted a patch centered on the bounding box, extending it by a fixed margin (64 voxels) to define the volume of interest (VOI). Predicted segmentation masks were cropped to the same VOI-constrained region of interest, and performance was quantified in terms of Dice similarity coefficient. Metrics were computed per lesion. This evaluation strategy enables consistent comparison of segmentation algorithms under weak supervision while acknowledging the limitations of not having voxel-level annotations.
+
+ Segmentation performance of **KNN (ours, PiNS)**, **VISTA3D auto**, and **VISTA3D points** ([He et al. 2024](https://github.com/Project-MONAI/VISTA/tree/main/vista3d)) across different nodule size buckets. (Top) Bar plots display the mean Dice similarity coefficient for each model and size category. (Bottom) Boxplots show the distribution of Dice scores, with boxes representing the interquartile range, horizontal lines indicating the median, whiskers extending to 1.5× the interquartile range, and circles denoting outliers.
+
+ <p align="center">
+ <img src="assets/Segmentation_Evaluation_KNNVista3Dauto_DLCS24_HIST.png" alt="(a)" width="700">
+ </p>
+
+ <p align="center">
+ <img src="assets/Segmentation_Evaluation_KNNVista3Dauto_DLCS24_BOX.png" alt="(b)" width="700">
+ </p>
+
+ ## Limitations & Considerations
+
+ ### Current Limitations
+ - **Nodule Size**: Optimized for nodules 3–30 mm in diameter
+ - **Image Quality**: Requires standard clinical CT protocols
+ - **Coordinate Accuracy**: Dependent on annotation precision
+ - **Processing Time**: Sequential processing (parallelization possible)
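Since each scan is processed independently, the parallelization noted above is straightforward to add; a hedged sketch (illustrative only — `process_scan` and `run_parallel` are hypothetical names standing in for the toolkit's per-CT routine):

```python
from concurrent.futures import ThreadPoolExecutor

def process_scan(ct_filename):
    # Stand-in for the per-scan work: read the CT, segment each candidate, write the mask.
    return ct_filename, "ok"

def run_parallel(ct_filenames, max_workers=4):
    # Scans are independent, so a worker pool maps cleanly over the file list.
    # For the CPU-bound clustering itself, a ProcessPoolExecutor would be the better fit.
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for name, status in pool.map(process_scan, ct_filenames):
            results[name] = status
    return results
```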
+
+ ## Contributing & Development
+
+ ### Research Collaborations
+ We welcome collaborations from:
+ - **Academic Medical Centers**
+ - **Radiology Departments**
+ - **Medical AI Companies**
+ - **Open Source Contributors**
+
+ ### Related Publications
+ 1. **AI in Lung Health: Benchmarking**: [Tushar et al., arXiv (2024)](https://arxiv.org/abs/2405.04605)
+ 2. **AI in Lung Health: Benchmarking (GitHub)**: [https://github.com/fitushar/AI-in-Lung-Health-Benchmarking](https://github.com/fitushar/AI-in-Lung-Health-Benchmarking-Detection-and-Diagnostic-Models-Across-Multiple-CT-Scan-Datasets)
+ 3. **DLCS Dataset**: [Wang et al., Radiology AI 2024](https://doi.org/10.1148/ryai.240248); [Zenodo](https://zenodo.org/records/13799069)
+ 4. **SYN-LUNGS**: [Tushar et al., arXiv 2025](https://arxiv.org/abs/2502.21187)
+ 5. **Refining Focus in AI for Lung Cancer**: Comparing Lesion-Centric and Chest-Region Models with Performance Insights from Internal and External Validation. [![arXiv](https://img.shields.io/badge/arXiv-2411.16823-b31b1b.svg)](https://arxiv.org/abs/2411.16823)
+ 6. **Peritumoral Expansion Radiomics** for Improved Lung Cancer Classification. [![arXiv](https://img.shields.io/badge/arXiv-2411.16008-b31b1b.svg)](https://arxiv.org/abs/2411.16008)
+ 7. **PyRadiomics Framework**: [van Griethuysen et al., Cancer Research 2017](https://pubmed.ncbi.nlm.nih.gov/29092951/)
+
+ ## License & Usage
+ **License: CC BY-NC 4.0**
+ ### Academic Use License
+ This project is released for **academic and non-commercial research purposes only**.
+ You are free to use, modify, and distribute this code under the following conditions:
+ - ✅ Academic research use permitted
+ - ✅ Modification and redistribution permitted for research
+ - ❌ Commercial use prohibited without prior written permission
+
+ For commercial licensing inquiries, please contact: [email protected]
+
+ ## Support & Community
+
+ ### Getting Help
+ - **📖 Documentation**: [Comprehensive technical docs](https://github.com/fitushar/PiNS/blob/main/docs/)
+ - **🐛 Issues**: [GitHub Issues](https://github.com/fitushar/PiNS/issues)
+ - **💬 Discussions**: [GitHub Discussions](https://github.com/fitushar/PiNS/discussions)
+ - **📧 Email**: [email protected]; [email protected]
+
+ ### Community Stats
+ - **Publications**: 5+ research papers
+ - **Contributors**: Active open-source community
output/DLCS24_KNN_2mm_Extend_Seg/DLCS_0001_mask.nii.gz ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e5400fcb1e8d9e1c30f78638101ae7dc0764c78f0ae615c907215b722bb7650b
+ size 145806
output/DLCS24_KNN_2mm_Extend_Seg/DLCS_0002_mask.nii.gz ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:50084e964263391492028cee2bbce70387ee5f4d6f6dc1bc5178e38ffd18a9bd
+ size 145799
scr/.ipynb_checkpoints/segmentation_utils-checkpoint.py ADDED
@@ -0,0 +1,490 @@
+ import os
+ import argparse
+ import numpy as np
+ import pandas as pd
+ import SimpleITK as sitk
+ import radiomics
+ from radiomics import featureextractor
+ from sklearn.cluster import KMeans
+ from sklearn.mixture import GaussianMixture
+ import skfuzzy as fuzz
+ import scipy.ndimage as ndimage
+ from scipy.ndimage import distance_transform_edt, binary_erosion
+ from skimage.filters import threshold_otsu
+ from skimage.segmentation import watershed
+ from skimage.feature import peak_local_max
+ from skimage import morphology, measure
+ import matplotlib.pyplot as plt
+ from matplotlib import colors
+ import cv2
+
+
+ def make_bold(text):
+     return f"\033[1m{text}\033[0m"
+
+ def load_itk_image(filename):
+     itkimage = sitk.ReadImage(filename)
+     numpyImage = sitk.GetArrayFromImage(itkimage)
+     numpyOrigin = itkimage.GetOrigin()
+     numpySpacing = itkimage.GetSpacing()
+     return numpyImage, numpyOrigin, numpySpacing
+
+ def normalize_image_to_uint8(image, lower_bound=-1000, upper_bound=100):
+     """Clip intensities to [lower_bound, upper_bound] HU and rescale to [0, 255] uint8."""
+     clipped_img = np.clip(image, lower_bound, upper_bound)
+     normalized_img = ((clipped_img - lower_bound) / (upper_bound - lower_bound)) * 255.0
+     return normalized_img.astype(np.uint8)
+
+
+ def segment_nodule_kmeans(ct_image, bbox_center, bbox_whd, margin=5, n_clusters=2):
+     """
+     Segments a nodule in a 3D CT image using k-means clustering with a margin around the bounding box.
+
+     Parameters:
+     - ct_image: 3D NumPy array representing the CT image.
+     - bbox_center: Tuple of (x, y, z) coordinates for the center of the bounding box.
+     - bbox_whd: Tuple of (w, h, d) representing the width, height, and depth of the bounding box.
+     - margin: Margin to add around the bounding box (default is 5).
+     - n_clusters: Number of clusters to use in k-means (default is 2).
+
+     Returns:
+     - segmented_image: 3D NumPy array with the segmented nodule.
+     """
+     x_center, y_center, z_center = bbox_center
+     w, h, d = bbox_whd
+
+     # Calculate the bounding box with margin
+     x_start, x_end = max(0, x_center - w//2 - margin), min(ct_image.shape[0], x_center + w//2 + margin)
+     y_start, y_end = max(0, y_center - h//2 - margin), min(ct_image.shape[1], y_center + h//2 + margin)
+     z_start, z_end = max(0, z_center - d//2 - margin), min(ct_image.shape[2], z_center + d//2 + margin)
+
+     bbox_region = ct_image[x_start:x_end, y_start:y_end, z_start:z_end]
+
+     # Reshape the region into a column of intensities for k-means clustering
+     flat_region = bbox_region.reshape(-1, 1)
+
+     # Perform k-means clustering
+     kmeans = KMeans(n_clusters=n_clusters, random_state=0).fit(flat_region)
+     labels = kmeans.labels_
+
+     # Reshape the labels back to the original bounding box shape
+     clustered_region = labels.reshape(bbox_region.shape)
+
+     # Assume the nodule is in the cluster with the highest mean intensity
+     nodule_cluster = np.argmax(kmeans.cluster_centers_)
+
+     # Create a binary mask for the nodule
+     nodule_mask = (clustered_region == nodule_cluster)
+
+     # Apply morphological operations to refine the segmentation
+     nodule_mask = ndimage.binary_closing(nodule_mask, structure=np.ones((3, 3, 3)))
+     nodule_mask = ndimage.binary_opening(nodule_mask, structure=np.ones((2, 2, 2)))
+
+     # Initialize the segmented image and place the nodule mask at the correct position
+     segmented_image = np.zeros_like(ct_image, dtype=np.uint8)
+     segmented_image[x_start:x_end, y_start:y_end, z_start:z_end] = nodule_mask
+
+     return segmented_image
+
+
+ def segment_nodule_gmm(ct_image, bbox_center, bbox_whd, margin=5, n_components=2):
+     """
+     Segments a nodule in a 3D CT image using a Gaussian Mixture Model with a margin around the bounding box.
+
+     Parameters:
+     - ct_image: 3D NumPy array representing the CT image.
+     - bbox_center: Tuple of (x, y, z) coordinates for the center of the bounding box.
+     - bbox_whd: Tuple of (w, h, d) representing the width, height, and depth of the bounding box.
+     - margin: Margin to add around the bounding box (default is 5).
+     - n_components: Number of components to use in the Gaussian Mixture Model (default is 2).
+
+     Returns:
+     - segmented_image: 3D NumPy array with the segmented nodule.
+     """
+     x_center, y_center, z_center = bbox_center
+     w, h, d = bbox_whd
+
+     # Calculate the bounding box with margin
+     x_start, x_end = max(0, x_center - w//2 - margin), min(ct_image.shape[0], x_center + w//2 + margin)
+     y_start, y_end = max(0, y_center - h//2 - margin), min(ct_image.shape[1], y_center + h//2 + margin)
+     z_start, z_end = max(0, z_center - d//2 - margin), min(ct_image.shape[2], z_center + d//2 + margin)
+
+     bbox_region = ct_image[x_start:x_end, y_start:y_end, z_start:z_end]
+
+     # Reshape the region for GMM
+     flat_region = bbox_region.reshape(-1, 1)
+
+     # Fit the GMM and predict a component label per voxel
+     gmm = GaussianMixture(n_components=n_components, random_state=0).fit(flat_region)
+     labels = gmm.predict(flat_region)
+
+     # Reshape the labels back to the original bounding box shape
+     clustered_region = labels.reshape(bbox_region.shape)
+
+     # Assume the nodule is in the component with the highest mean intensity
+     nodule_component = np.argmax(gmm.means_)
+
+     # Create a binary mask for the nodule
+     nodule_mask = (clustered_region == nodule_component)
+
+     # Apply morphological operations to refine the segmentation
+     nodule_mask = ndimage.binary_closing(nodule_mask, structure=np.ones((3, 3, 3)))
+     nodule_mask = ndimage.binary_opening(nodule_mask, structure=np.ones((3, 3, 3)))
+
+     # Initialize the segmented image and place the nodule mask at the correct position
+     segmented_image = np.zeros_like(ct_image, dtype=np.uint8)
+     segmented_image[x_start:x_end, y_start:y_end, z_start:z_end] = nodule_mask
+
+     return segmented_image
+
+
+ def segment_nodule_fcm(ct_image, bbox_center, bbox_whd, margin=5, n_clusters=2):
+     """
+     Segments a nodule in a 3D CT image using Fuzzy C-means clustering with a margin around the bounding box.
+
+     Parameters:
+     - ct_image: 3D NumPy array representing the CT image.
+     - bbox_center: Tuple of (x, y, z) coordinates for the center of the bounding box.
+     - bbox_whd: Tuple of (w, h, d) representing the width, height, and depth of the bounding box.
+     - margin: Margin to add around the bounding box (default is 5).
+     - n_clusters: Number of clusters to use in Fuzzy C-means (default is 2).
+
+     Returns:
+     - segmented_image: 3D NumPy array with the segmented nodule.
+     """
+     x_center, y_center, z_center = bbox_center
+     w, h, d = bbox_whd
+
+     # Calculate the bounding box with margin
+     x_start, x_end = max(0, x_center - w//2 - margin), min(ct_image.shape[0], x_center + w//2 + margin)
+     y_start, y_end = max(0, y_center - h//2 - margin), min(ct_image.shape[1], y_center + h//2 + margin)
+     z_start, z_end = max(0, z_center - d//2 - margin), min(ct_image.shape[2], z_center + d//2 + margin)
+
+     bbox_region = ct_image[x_start:x_end, y_start:y_end, z_start:z_end]
+
+     # Reshape the region for FCM
+     flat_region = bbox_region.reshape(-1, 1)
+
+     # Perform FCM clustering (`dist` renamed from the original `d` so it does not shadow the box depth)
+     cntr, u, u0, dist, jm, p, fpc = fuzz.cluster.cmeans(flat_region.T, n_clusters, 2, error=0.005, maxiter=1000, init=None)
+
+     # Assign each voxel to the cluster with the highest membership
+     labels = np.argmax(u, axis=0)
+
+     # Reshape the labels back to the original bounding box shape
+     clustered_region = labels.reshape(bbox_region.shape)
+
+     # Assume the nodule is in the cluster with the highest mean intensity
+     nodule_cluster = np.argmax(cntr)
+
+     # Create a binary mask for the nodule
+     nodule_mask = (clustered_region == nodule_cluster)
+
+     # Apply morphological operations to refine the segmentation
+     nodule_mask = ndimage.binary_closing(nodule_mask, structure=np.ones((3, 3, 3)))
+     nodule_mask = ndimage.binary_opening(nodule_mask, structure=np.ones((3, 3, 3)))
+
+     # Initialize the segmented image and place the nodule mask at the correct position
+     segmented_image = np.zeros_like(ct_image, dtype=np.uint8)
+     segmented_image[x_start:x_end, y_start:y_end, z_start:z_end] = nodule_mask
+
+     return segmented_image
+
+
+ def segment_nodule_otsu(ct_image, bbox_center, bbox_whd, margin=5):
+     """
+     Segments a nodule in a 3D CT image using Otsu's thresholding with a margin around the bounding box.
+
+     Parameters:
+     - ct_image: 3D NumPy array representing the CT image.
+     - bbox_center: Tuple of (x, y, z) coordinates for the center of the bounding box.
+     - bbox_whd: Tuple of (w, h, d) representing the width, height, and depth of the bounding box.
+     - margin: Margin to add around the bounding box (default is 5).
+
+     Returns:
+     - segmented_image: 3D NumPy array with the segmented nodule.
+     """
+     x_center, y_center, z_center = bbox_center
+     w, h, d = bbox_whd
+
+     # Calculate the bounding box with margin
+     x_start, x_end = max(0, x_center - w//2 - margin), min(ct_image.shape[0], x_center + w//2 + margin)
+     y_start, y_end = max(0, y_center - h//2 - margin), min(ct_image.shape[1], y_center + h//2 + margin)
+     z_start, z_end = max(0, z_center - d//2 - margin), min(ct_image.shape[2], z_center + d//2 + margin)
+
+     bbox_region = ct_image[x_start:x_end, y_start:y_end, z_start:z_end]
+
+     # Flatten the region and compute the Otsu threshold
+     flat_region = bbox_region.flatten()
+     otsu_threshold = threshold_otsu(flat_region)
+
+     # Apply the threshold to create a binary mask
+     nodule_mask = bbox_region >= otsu_threshold
+
+     # Apply morphological operations to refine the segmentation
+     nodule_mask = ndimage.binary_closing(nodule_mask, structure=np.ones((3, 3, 3)))
+     nodule_mask = ndimage.binary_opening(nodule_mask, structure=np.ones((3, 3, 3)))
+
+     # Initialize the segmented image and place the nodule mask at the correct position
+     segmented_image = np.zeros_like(ct_image, dtype=np.uint8)
+     segmented_image[x_start:x_end, y_start:y_end, z_start:z_end] = nodule_mask
+
+     return segmented_image
+
+
+ def expand_mask_by_distance(segmented_nodule_gmm, spacing, expansion_mm):
+     """
+     Expands the segmentation mask by a given distance in mm in all directions by directly updating pixel values.
+
+     Parameters:
+     segmented_nodule_gmm (numpy array): 3D binary mask of the nodule (1 for nodule, 0 for background).
+     spacing (tuple): Spacing of the image in mm for each voxel, given as (spacing_x, spacing_y, spacing_z).
+     expansion_mm (float): Distance to expand the mask in millimeters.
+
+     Returns:
+     numpy array: Expanded segmentation mask.
+     """
+     # Reorder spacing to match the numpy array's (z, y, x) axis order
+     spacing_reordered = (spacing[2], spacing[1], spacing[0])
+
+     # Number of voxels to expand along each axis
+     expand_pixels = np.array([int(np.round(expansion_mm / s)) for s in spacing_reordered])
+
+     # New expanded mask with the same shape
+     expanded_mask = np.zeros_like(segmented_nodule_gmm)
+
+     # Paint a clipped box of the required size around every foreground voxel;
+     # slice assignment replaces the original per-voxel triple loop with identical results
+     for z, y, x in np.argwhere(segmented_nodule_gmm == 1):
+         z0, z1 = max(0, z - expand_pixels[0]), min(segmented_nodule_gmm.shape[0], z + expand_pixels[0] + 1)
+         y0, y1 = max(0, y - expand_pixels[1]), min(segmented_nodule_gmm.shape[1], y + expand_pixels[1] + 1)
+         x0, x1 = max(0, x - expand_pixels[2]), min(segmented_nodule_gmm.shape[2], x + expand_pixels[2] + 1)
+         expanded_mask[z0:z1, y0:y1, x0:x1] = 1
+
+     return expanded_mask
+
+
+ def find_nodule_lobe(cccwhd, lung_mask, class_map):
+     """
+     Determine the lung lobe where a nodule is located based on a 3D mask and bounding box.
+
+     Parameters:
+     cccwhd (list or tuple): Bounding box in the format [center_x, center_y, center_z, width, height, depth].
+     lung_mask (numpy array): 3D array representing the lung mask with different lung regions.
+     class_map (dict): Dictionary mapping lung region labels to their names.
+
+     Returns:
+     str: Name of the lung lobe where the nodule is located.
+     """
+     center_x, center_y, center_z, width, height, depth = cccwhd
+
+     # Calculate the bounding box limits, clipped to the mask dimensions
+     start_x = max(0, int(center_x - width // 2))
+     end_x = min(lung_mask.shape[0], int(center_x + width // 2))
+     start_y = max(0, int(center_y - height // 2))
+     end_y = min(lung_mask.shape[1], int(center_y + height // 2))
+     start_z = max(0, int(center_z - depth // 2))
+     end_z = min(lung_mask.shape[2], int(center_z + depth // 2))
+
+     # Extract the region of interest (ROI) from the mask
+     roi = lung_mask[start_x:end_x, start_y:end_y, start_z:end_z]
+
+     # Count the occurrences of each lobe label within the ROI, excluding background (label 0)
+     unique, counts = np.unique(roi, return_counts=True)
+     label_counts = dict(zip(unique, counts))
+     label_counts.pop(0, None)
+
+     # The nodule's lobe is the label with the maximum count, if any
+     nodule_lobe = max(label_counts, key=label_counts.get) if label_counts else None
+
+     # Map the label to the corresponding lung lobe name
+     return class_map["lungs"][nodule_lobe] if nodule_lobe is not None else "Undefined"
+
+
+ def find_nodule_lobe_and_distance(cccwhd, lung_mask, class_map, spacing):
+     """
+     Determine the lung lobe where a nodule is located and measure its distance from the lung wall.
+
+     Parameters:
+     cccwhd (list or tuple): Bounding box in the format [center_x, center_y, center_z, width, height, depth].
+     lung_mask (numpy array): 3D array representing the lung mask with different lung regions.
+     class_map (dict): Dictionary mapping lung region labels to their names.
+     spacing (tuple): Voxel spacing in mm as (spacing_x, spacing_y, spacing_z).
+
+     Returns:
+     tuple: (Name of the lung lobe, distance from the lung wall in voxels, distance in mm)
+     """
+     center_x, center_y, center_z, width, height, depth = cccwhd
+
+     # Calculate the bounding box limits, clipped to the mask dimensions
+     start_x = max(0, int(center_x - width // 2))
+     end_x = min(lung_mask.shape[0], int(center_x + width // 2))
+     start_y = max(0, int(center_y - height // 2))
+     end_y = min(lung_mask.shape[1], int(center_y + height // 2))
+     start_z = max(0, int(center_z - depth // 2))
+     end_z = min(lung_mask.shape[2], int(center_z + depth // 2))
+
+     # Extract the ROI and find the dominant lobe label, excluding background (label 0)
+     roi = lung_mask[start_x:end_x, start_y:end_y, start_z:end_z]
+     unique, counts = np.unique(roi, return_counts=True)
+     label_counts = dict(zip(unique, counts))
+     label_counts.pop(0, None)
+     nodule_lobe = max(label_counts, key=label_counts.get) if label_counts else None
+     nodule_lobe_name = class_map["lungs"][nodule_lobe] if nodule_lobe is not None else "Undefined"
+
+     # Create a binary lung mask (1 inside the lung, 0 outside)
+     lung_binary_mask = lung_mask > 0
+
+     # The lung wall is the outermost boundary: erode the mask and subtract it from the original
+     lung_eroded = binary_erosion(lung_binary_mask)
+     lung_wall_mask = lung_binary_mask & ~lung_eroded
+
+     # Distance transform gives, for every voxel, the distance to the nearest wall voxel
+     distance_transform = distance_transform_edt(~lung_wall_mask)
+
+     # Distance from the nodule center to the nearest lung wall in voxel units
+     voxel_distance_to_lung_wall = distance_transform[int(center_x), int(center_y), int(center_z)]
+
+     # Convert the voxel distance to an approximate physical distance in mm
+     physical_distance_to_lung_wall = voxel_distance_to_lung_wall * np.sqrt(
+         spacing[0]**2 + spacing[1]**2 + spacing[2]**2
+     )
+
+     return nodule_lobe_name, voxel_distance_to_lung_wall, physical_distance_to_lung_wall
+
+
+ # Function to plot the contours of a mask
+ def plot_contours(ax, mask, color, linewidth=1.5):
+     contours = measure.find_contours(mask, level=0.5)  # Find contours at a constant level
+     for contour in contours:
+         ax.plot(contour[:, 1], contour[:, 0], color=color, linewidth=linewidth)
scr/Pyradiomics_feature_extarctor_pram.json ADDED
@@ -0,0 +1,6 @@
+ {
+     "binWidth": 25,
+     "resampledPixelSpacing": [1, 1, 1],
+     "interpolator": "sitkBSpline",
+     "labelInterpolator": "sitkNearestNeighbor"
+ }
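These parameters can be parsed and handed to PyRadiomics as keyword settings; a minimal sketch (the JSON is inlined so the snippet is self-contained — in the pipeline the file above is read via `--params_json`, and the extractor call is shown commented to avoid a hard dependency here):

```python
import json

# Same settings as scr/Pyradiomics_feature_extarctor_pram.json, inlined.
params_json = '''{
    "binWidth": 25,
    "resampledPixelSpacing": [1, 1, 1],
    "interpolator": "sitkBSpline",
    "labelInterpolator": "sitkNearestNeighbor"
}'''
settings = json.loads(params_json)

# With PyRadiomics installed, the parsed settings can be passed as keyword arguments:
# from radiomics import featureextractor
# extractor = featureextractor.RadiomicsFeatureExtractor(**settings)
```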
scr/__pycache__/cvseg_utils.cpython-310.pyc ADDED
Binary file (11.6 kB). View file
 
scr/__pycache__/cvseg_utils.cpython-311.pyc ADDED
Binary file (20.7 kB). View file
 
scr/__pycache__/cvseg_utils.cpython-38.pyc ADDED
Binary file (12 kB). View file
 
scr/__pycache__/cvseg_utils.cpython-39.pyc ADDED
Binary file (11.9 kB). View file
 
scr/__pycache__/segmentation_utils.cpython-38.pyc ADDED
Binary file (12.2 kB). View file
 
scr/candidateSeg_pipiline.py ADDED
@@ -0,0 +1,86 @@
+ from cvseg_utils import *
+ import warnings
+ warnings.filterwarnings("ignore", message="GLCM is symmetrical, therefore Sum Average = 2 * Joint Average")
+ import os
+ import logging
+ from datetime import datetime
+
+ def seg_main():
+     parser = argparse.ArgumentParser(description='Nodule segmentation and feature extraction from CT images.')
+     parser.add_argument('--raw_data_path', type=str, required=True, help='Path to raw CT images')
+     parser.add_argument('--csv_save_path', type=str, required=True, help='Path to save the CSV files')
+     parser.add_argument('--dataset_csv', type=str, required=True, help='Path to the dataset CSV')
+     parser.add_argument('--nifti_clm_name', type=str, required=True, help='Name of the NIfTI filename column')
+     parser.add_argument('--unique_Annotation_id', type=str, help='Column for unique annotation ID')
+     parser.add_argument('--Malignant_lbl', type=str, required=True, help='Column name for malignancy labels')
+     parser.add_argument('--coordX', type=str, required=True, help='Column name for X coordinate')
+     parser.add_argument('--coordY', type=str, required=True, help='Column name for Y coordinate')
+     parser.add_argument('--coordZ', type=str, required=True, help='Column name for Z coordinate')
+     parser.add_argument('--w', type=str, required=True, help='Column name for width')
+     parser.add_argument('--h', type=str, required=True, help='Column name for height')
+     parser.add_argument('--d', type=str, required=True, help='Column name for depth')
+     parser.add_argument('--seg_alg', type=str, default='gmm', choices=['gmm', 'knn', 'fcm', 'otsu'], help='Segmentation algorithm to use')
+     parser.add_argument('--dataset_name', type=str, default='DLCS24', help='Dataset to use')
+     parser.add_argument('--expansion_mm', type=float, default=1.0, help='Expansion in mm')
+     parser.add_argument('--use_expand', action='store_true', help='Use expansion if set')
+     parser.add_argument('--params_json', type=str, required=True, help='Path to JSON file with radiomics parameters')
+     parser.add_argument('--save_the_generated_mask', action='store_true', help='Save generated segmentation mask')
+     parser.add_argument('--save_nifti_path', type=str, help='Path to save the NIfTI files')
+
+     args = parser.parse_args()
+
+     df = pd.read_csv(args.dataset_csv)
+     final_dect = df[args.nifti_clm_name].unique()
+
+     for dictonary_list_i, ct_filename in enumerate(final_dect):
+         try:
+             filtered_df = df[df[args.nifti_clm_name] == ct_filename].reset_index()
+             ct_nifti_path = os.path.join(args.raw_data_path, ct_filename)
+             ct_image = sitk.ReadImage(ct_nifti_path)
+             ct_array = sitk.GetArrayFromImage(ct_image)
+             spacing = ct_image.GetSpacing()
+
+             full_mask_array = np.zeros_like(ct_array, dtype=np.uint8)
+
+             for idx, row in filtered_df.iterrows():
+                 worldCoord = np.asarray([row[args.coordX], row[args.coordY], row[args.coordZ]])
+                 voxelCoord = ct_image.TransformPhysicalPointToIndex(worldCoord.tolist())
+                 # Convert the physical nodule size (mm) to voxel counts using the image spacing
+                 w = int(row[args.w] / spacing[0])
+                 h = int(row[args.h] / spacing[1])
+                 d = int(row[args.d] / spacing[2])
+                 # SimpleITK indexing is (x, y, z); the NumPy array is (z, y, x)
+                 bbox_center = [voxelCoord[2], voxelCoord[1], voxelCoord[0]]
+                 bbox_whd = [d, h, w]
+
+                 if args.seg_alg == 'gmm':
+                     mask_image_array = segment_nodule_gmm(ct_array, bbox_center, bbox_whd)
+                 elif args.seg_alg == 'knn':
+                     mask_image_array = segment_nodule_kmeans(ct_array, bbox_center, bbox_whd)
+                 elif args.seg_alg == 'fcm':
+                     mask_image_array = segment_nodule_fcm(ct_array, bbox_center, bbox_whd)
+                 elif args.seg_alg == 'otsu':
+                     mask_image_array = segment_nodule_otsu(ct_array, bbox_center, bbox_whd)
+
+                 if args.use_expand:
+                     mask_image_array = expand_mask_by_distance(mask_image_array, spacing=spacing, expansion_mm=args.expansion_mm)
+
+                 # Merge this candidate's mask into the full-volume mask
+                 full_mask_array[mask_image_array == 1] = 1
+
+             if args.save_the_generated_mask:
+                 print("Segmented mask sum:", np.sum(full_mask_array))
+                 combined_mask_image = sitk.GetImageFromArray(full_mask_array)
+                 combined_mask_image.SetSpacing(ct_image.GetSpacing())
+                 combined_mask_image.SetDirection(ct_image.GetDirection())
+                 combined_mask_image.SetOrigin(ct_image.GetOrigin())
+                 mask_filename = ct_filename.split('.nii')[0] + "_mask.nii.gz"
+                 sitk.WriteImage(combined_mask_image, os.path.join(args.save_nifti_path, mask_filename))
+                 print(f"Saved {mask_filename}")
+         except Exception as e:
+             print(f"Error processing {ct_filename}: {e}")
+
+ if __name__ == "__main__":
+     seg_main()
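The world-to-voxel conversion that `TransformPhysicalPointToIndex` performs in the loop above can be approximated with plain NumPy when the image direction matrix is identity (an assumption — SimpleITK also accounts for direction; `world_to_voxel` and `extent_mm_to_voxels` are our own illustrative names):

```python
import numpy as np

def world_to_voxel(world_coord, origin, spacing):
    """Approximate SimpleITK's TransformPhysicalPointToIndex for an identity direction matrix."""
    # index = (world - origin) / spacing, rounded to the nearest voxel
    return np.round((np.asarray(world_coord, float) - np.asarray(origin, float))
                    / np.asarray(spacing, float)).astype(int)

def extent_mm_to_voxels(extent_mm, spacing_mm):
    """Convert a physical extent in mm to a voxel count along one axis (as the pipeline does)."""
    return int(extent_mm / spacing_mm)
```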
scr/candidateSeg_radiomicsExtractor_pipiline.py ADDED
@@ -0,0 +1,213 @@
+ from cvseg_utils import *
+ import warnings
+ warnings.filterwarnings("ignore", message="GLCM is symmetrical, therefore Sum Average = 2 * Joint Average")
+ import os
+ import logging
+ from datetime import datetime
+
+
+ def seg_pyradiomics_main():
+
+     parser = argparse.ArgumentParser(description='Nodule segmentation and feature extraction from CT images.')
+     parser.add_argument('--raw_data_path', type=str, required=True, help='Path to raw CT images')
+     parser.add_argument('--csv_save_path', type=str, required=True, help='Path to save the CSV files')
+     parser.add_argument('--dataset_csv', type=str, required=True, help='Path to the dataset CSV')
+
+     # Allow multiple column names as input arguments
+     parser.add_argument('--nifti_clm_name', type=str, required=True, help='Name of the NIfTI filename column')
+     parser.add_argument('--unique_Annotation_id', type=str, help='Column for unique annotation ID')
+     parser.add_argument('--Malignant_lbl', type=str, help='Column name for malignancy labels')
+     parser.add_argument('--coordX', type=str, required=True, help='Column name for X coordinate')
+     parser.add_argument('--coordY', type=str, required=True, help='Column name for Y coordinate')
+     parser.add_argument('--coordZ', type=str, required=True, help='Column name for Z coordinate')
+     parser.add_argument('--w', type=str, required=True, help='Column name for width')
+     parser.add_argument('--h', type=str, required=True, help='Column name for height')
+     parser.add_argument('--d', type=str, required=True, help='Column name for depth')
+
+     parser.add_argument('--seg_alg', type=str, default='gmm', choices=['gmm', 'knn', 'fcm', 'otsu'], help='Segmentation algorithm to use')
+     parser.add_argument('--dataset_name', type=str, default='DLCS24', help='Dataset to use')
+     parser.add_argument('--expansion_mm', type=float, default=1.0, help='Expansion in mm')
+     parser.add_argument('--use_expand', action='store_true', help='Use expansion if set')
+     parser.add_argument('--extract_radiomics', action='store_true', help='Extract radiomics features if set')
+     parser.add_argument('--params_json', type=str, required=True, help="Path to JSON file with radiomics parameters")
+     parser.add_argument('--save_the_generated_mask', action='store_true', help='Save the generated masks if set')
+     parser.add_argument('--save_nifti_path', type=str, help='Path to save the NIfTI files')
+
+     args = parser.parse_args()
+     raw_data_path = args.raw_data_path
+     csv_save_path = args.csv_save_path
+     dataset_csv = args.dataset_csv
+     seg_alg = args.seg_alg
+
+     if args.use_expand:
+         if args.extract_radiomics:
+             output_csv = csv_save_path + f'PyRadiomics_CandidateSeg_{args.dataset_name}_{seg_alg}_expand_{args.expansion_mm}mm.csv'
+             Erroroutput_csv = csv_save_path + f'PyRadiomics_CandidateSeg_{args.dataset_name}_{seg_alg}_expand_{args.expansion_mm}mm_Error.csv'
+         else:
+             output_csv = csv_save_path + f'CandidateSeg_{args.dataset_name}_{seg_alg}_expand_{args.expansion_mm}mm.csv'
+             Erroroutput_csv = csv_save_path + f'CandidateSeg_{args.dataset_name}_{seg_alg}_expand_{args.expansion_mm}mm_Error.csv'
+     else:
+         if args.extract_radiomics:
+             output_csv = csv_save_path + f'PyRadiomics_CandidateSeg_{args.dataset_name}_{seg_alg}.csv'
+             Erroroutput_csv = csv_save_path + f'PyRadiomics_CandidateSeg_{args.dataset_name}_{seg_alg}_Error.csv'
+         else:
+             output_csv = csv_save_path + f'CandidateSeg_{args.dataset_name}_{seg_alg}.csv'
+             Erroroutput_csv = csv_save_path + f'CandidateSeg_{args.dataset_name}_{seg_alg}_Error.csv'
+
+     # Derive the log file name from the output CSV file
+     log_file = output_csv.replace('.csv', '.log')
+
+     # Configure logging
+     logging.basicConfig(
+         filename=log_file,
+         level=logging.INFO,
+         format="%(asctime)s - %(levelname)s - %(message)s",
+         datefmt="%Y-%m-%d %H:%M:%S"
+     )
+
+     logging.info(f"Output CSV File: {output_csv}")
+     logging.info(f"Error CSV File: {Erroroutput_csv}")
+     logging.info(f"Log File Created: {log_file}")
+     logging.info("File names generated successfully.")
+
+     ###----input CSV
+     df = pd.read_csv(dataset_csv)
+     final_dect = df[args.nifti_clm_name].unique()
+     # Initialize the feature extractor
+     with open(args.params_json, 'r') as f:
+         params = json.load(f)
+
+     interpolator_map = {"sitkBSpline": sitk.sitkBSpline, "sitkNearestNeighbor": sitk.sitkNearestNeighbor}
+     params["interpolator"] = interpolator_map.get(params["interpolator"], sitk.sitkBSpline)
+     params["labelInterpolator"] = interpolator_map.get(params["labelInterpolator"], sitk.sitkNearestNeighbor)
+
+     extractor = featureextractor.RadiomicsFeatureExtractor(**params)
+     # Prepare the output CSV
+     output_df = pd.DataFrame()
+     Error_ids = []
+     for dictonary_list_i in range(0, len(final_dect)):
+         try:
+             logging.info(f"---Loading---: {dictonary_list_i+1}")
+             print(make_bold('|' + '-'*30 + ' No={} '.format(dictonary_list_i+1) + '-'*30 + '|'))
+             print('\n')
+
+             desired_value = final_dect[dictonary_list_i]
+             filtered_df = df[df[args.nifti_clm_name] == desired_value]
+             example_dictionary = filtered_df.reset_index()
+
+             logging.info(f"Loading the Image: {example_dictionary[args.nifti_clm_name][0]}")
+             logging.info(f"Number of Annotations: {len(example_dictionary)}")
+             print('Loading the Image: {}'.format(example_dictionary[args.nifti_clm_name][0]))
+             print('Number of Annotations: {}'.format(len(example_dictionary)))
+             ct_nifti_path = raw_data_path + example_dictionary[args.nifti_clm_name][0]
+             ct_image = sitk.ReadImage(ct_nifti_path)
+             ct_array = sitk.GetArrayFromImage(ct_image)
+
+             for Which_box_to_use in range(0, len(example_dictionary)):
+
+                 print('-----------------------------------------------------------------------------------------------')
+
+                 if args.unique_Annotation_id in example_dictionary.columns:
+                     annotation_id = example_dictionary[args.unique_Annotation_id][Which_box_to_use]
+                 else:
+                     # Generate an ID using the image name (without extension) and an index
+                     image_name = example_dictionary[args.nifti_clm_name][0].split('.nii')[0]
+                     annotation_id = f"{image_name}_annotation_{Which_box_to_use+1}"
+                 if args.Malignant_lbl in example_dictionary.columns:
+                     annotation_lbl = example_dictionary[args.Malignant_lbl][Which_box_to_use]
+                 print('Annotation-ID = {}'.format(annotation_id))
+                 worldCoord = np.asarray([float(example_dictionary[args.coordX][Which_box_to_use]), float(example_dictionary[args.coordY][Which_box_to_use]), float(example_dictionary[args.coordZ][Which_box_to_use])])
+                 voxelCoord = ct_image.TransformPhysicalPointToIndex(worldCoord)
+                 voxel_coords = voxelCoord
+                 print('WorldCoord CCC (x,y,z) = {}'.format(worldCoord))
+                 print('VoxelCoord CCC (x,y,z) = {}'.format(voxelCoord))
+
+                 whd_worldCoord = np.asarray([float(example_dictionary[args.w][Which_box_to_use]), float(example_dictionary[args.h][Which_box_to_use]), float(example_dictionary[args.d][Which_box_to_use])])
+                 spacing = ct_image.GetSpacing()
+                 w = int(whd_worldCoord[0] / spacing[0])
+                 h = int(whd_worldCoord[1] / spacing[1])
+                 d = int(whd_worldCoord[2] / spacing[2])
+                 whd_voxelCoord = [w, h, d]
+                 print('WorldCoord (w,h,d) = {}'.format(whd_worldCoord))
+                 print('VoxelCoord (w,h,d) = {}'.format(whd_voxelCoord))
+
+                 # Define bounding box (the NumPy array from SimpleITK is ordered (z, y, x))
+                 center_index = voxelCoord
+                 size_voxel = np.array(whd_voxelCoord) / 2
+                 bbox_center = [voxel_coords[2], voxel_coords[1], voxel_coords[0]]
+                 bbox_whd = [d, h, w]
+
+                 #--Image-processing algorithms for segmentations
+                 if seg_alg == 'gmm':
+                     mask_image_array = segment_nodule_gmm(ct_array, bbox_center, bbox_whd)
+                 elif seg_alg == 'knn':
+                     mask_image_array = segment_nodule_kmeans(ct_array, bbox_center, bbox_whd)
+                 elif seg_alg == 'fcm':
+                     mask_image_array = segment_nodule_fcm(ct_array, bbox_center, bbox_whd)
+                 elif seg_alg == 'otsu':
+                     mask_image_array = segment_nodule_otsu(ct_array, bbox_center, bbox_whd)
+
+                 if args.use_expand:
+                     mask_image_array = expand_mask_by_distance(mask_image_array, spacing=spacing, expansion_mm=args.expansion_mm)
+
+                 #--- Segmentation ---#
+                 mask_image = sitk.GetImageFromArray(mask_image_array)
+                 mask_image.SetSpacing(ct_image.GetSpacing())
+                 mask_image.SetDirection(ct_image.GetDirection())
+                 mask_image.SetOrigin(ct_image.GetOrigin())
+
+                 if args.extract_radiomics:
+                     # Extract features
+                     features = extractor.execute(ct_image, mask_image)
+                     # Convert the features to a pandas DataFrame row
+                     feature_row = pd.DataFrame([features])
+                     feature_row[args.nifti_clm_name] = example_dictionary[args.nifti_clm_name][0]
+                     feature_row['candidateID'] = annotation_id
+                     if args.Malignant_lbl in example_dictionary.columns:
+                         feature_row[args.Malignant_lbl] = annotation_lbl
+                 else:
+                     # No radiomics: keep only the identifying columns in the row
+                     if args.Malignant_lbl in example_dictionary.columns:
+                         feature_row = pd.DataFrame({args.nifti_clm_name: [example_dictionary[args.nifti_clm_name][0]], 'candidateID': [annotation_id], args.Malignant_lbl: [annotation_lbl]})
+                     else:
+                         feature_row = pd.DataFrame({args.nifti_clm_name: [example_dictionary[args.nifti_clm_name][0]], 'candidateID': [annotation_id]})
+
+                 print(feature_row)
+
+                 feature_row[args.coordX] = example_dictionary[args.coordX][Which_box_to_use]
+                 feature_row[args.coordY] = example_dictionary[args.coordY][Which_box_to_use]
+                 feature_row[args.coordZ] = example_dictionary[args.coordZ][Which_box_to_use]
+                 feature_row[args.w] = example_dictionary[args.w][Which_box_to_use]
+                 feature_row[args.h] = example_dictionary[args.h][Which_box_to_use]
+                 feature_row[args.d] = example_dictionary[args.d][Which_box_to_use]
+
+                 # Save mask if needed
+                 if args.save_the_generated_mask:
+                     output_nifti_path = os.path.join(args.save_nifti_path, f"{annotation_id}.nii.gz")
+                     sitk.WriteImage(mask_image, output_nifti_path)
+                 # Append the row to the output DataFrame
+                 output_df = pd.concat([output_df, feature_row], ignore_index=True)
+         except Exception as e:
+             logging.error(f"An error occurred: {str(e)}")
+             print(f"Error occurred: {e}")
+             Error_ids.append(final_dect[dictonary_list_i])
+
+     # Save the output DataFrame to a CSV file
+     output_df.to_csv(output_csv, index=False, encoding='utf-8')
+     print("completed and saved to {}".format(output_csv))
+     Erroroutput_df = pd.DataFrame(list(Error_ids), columns=[args.nifti_clm_name])
+     Erroroutput_df.to_csv(Erroroutput_csv, index=False, encoding='utf-8')
+     print("completed and saved Error to {}".format(Erroroutput_csv))
+
+
+ if __name__ == "__main__":
+     seg_pyradiomics_main()
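Both pipelines map the CSV's physical (world) coordinates to voxel indices with SimpleITK's `TransformPhysicalPointToIndex`, and convert bounding-box sizes from millimetres to voxels by dividing by the spacing. The underlying arithmetic can be sketched without SimpleITK — note this simplification assumes an identity direction matrix, which the real call does not require, and the helper names are illustrative:

```python
import numpy as np

def world_to_voxel(world_xyz, origin_xyz, spacing_xyz):
    """Physical (mm) point -> integer voxel index, assuming an identity
    direction matrix (TransformPhysicalPointToIndex also handles rotated
    or flipped orientations)."""
    return tuple(int(round((p - o) / s))
                 for p, o, s in zip(world_xyz, origin_xyz, spacing_xyz))

def bbox_mm_to_voxels(whd_mm, spacing_xyz):
    """Bounding-box width/height/depth in mm -> voxel counts, mirroring
    the pipeline's w = int(whd_worldCoord[0] / spacing[0]) lines."""
    return [int(size / s) for size, s in zip(whd_mm, spacing_xyz)]

idx = world_to_voxel((-120.0, 30.0, -550.0),
                     origin_xyz=(-200.0, -200.0, -600.0),
                     spacing_xyz=(1.0, 1.0, 2.5))
print(idx)  # (80, 230, 20)
print(bbox_mm_to_voxels((12.0, 12.0, 10.0), (1.0, 1.0, 2.5)))  # [12, 12, 4]
```

Keeping this conversion in one place matters because SimpleITK reports spacing in (x, y, z) order while the NumPy array from `GetArrayFromImage` is indexed (z, y, x) — exactly why the scripts build `bbox_center` and `bbox_whd` in reversed order.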
scr/candidate_worldCoord_patchExtarctor_pipeline.py ADDED
@@ -0,0 +1,177 @@
+ from cvseg_utils import *
+ import warnings
+ import os
+ import logging
+ import random
+ import cv2
+ import json
+ import torch
+ import argparse
+ import numpy as np
+ import pandas as pd
+ from tqdm import tqdm
+ import SimpleITK as sitk
+ from monai.transforms import Compose, ScaleIntensityRanged
+ random.seed(200)
+ np.random.seed(200)
+ from datetime import datetime
+
+ def create_folder_if_not_exists(folder_path):
+     # Check if the folder exists
+     if not os.path.exists(folder_path):
+         # If the folder doesn't exist, create it
+         os.makedirs(folder_path)
+         print(f"Folder created: {folder_path}")
+     else:
+         print(f"Folder already exists: {folder_path}")
+
+ def nifti_patche_extractor_for_worldCoord_main():
+     parser = argparse.ArgumentParser(description='Candidate patch extraction from CT images.')
+     parser.add_argument('--raw_data_path', type=str, required=True, help='Path to raw CT images')
+     parser.add_argument('--csv_save_path', type=str, required=True, help='Path to save the CSV files')
+     parser.add_argument('--dataset_csv', type=str, required=True, help='Path to the dataset CSV')
+     parser.add_argument('--dataset_name', type=str, default='DLCS24', help='Dataset to use')
+     # Allow multiple column names as input arguments
+     parser.add_argument('--nifti_clm_name', type=str, required=True, help='Name of the NIfTI filename column')
+     parser.add_argument('--unique_Annotation_id', type=str, help='Column for unique annotation ID')
+     parser.add_argument('--Malignant_lbl', type=str, help='Column name for malignancy labels')
+     parser.add_argument('--coordX', type=str, required=True, help='Column name for X coordinate')
+     parser.add_argument('--coordY', type=str, required=True, help='Column name for Y coordinate')
+     parser.add_argument('--coordZ', type=str, required=True, help='Column name for Z coordinate')
+     parser.add_argument('--patch_size', type=int, nargs=3, default=[64, 64, 64], help="Patch size as three integers, e.g., --patch_size 64 64 64")
+     # Normalization (4 values)
+     parser.add_argument('--normalization', type=float, nargs=4, default=[-1000, 500.0, 0.0, 1.0], help="Normalization values as four floats: A_min A_max B_min B_max")
+     # Clip (Boolean from string input)
+     parser.add_argument('--clip', type=str, choices=["True", "False"], default="False", help="Enable or disable clipping (True/False). Default is False.")
+     parser.add_argument('--save_nifti_path', type=str, help='Path to save the NIfTI files')
+
+     args = parser.parse_args()
+     raw_data_path = args.raw_data_path
+     csv_save_path = args.csv_save_path
+     dataset_csv = args.dataset_csv
+
+     create_folder_if_not_exists(csv_save_path)
+     create_folder_if_not_exists(args.save_nifti_path)
+     # Extract normalization values
+     A_min, A_max, B_min, B_max = args.normalization
+     # Convert clip argument to boolean
+     CLIP = args.clip == "True"
+     output_csv = csv_save_path + f'CandidateSeg_{args.dataset_name}_patch{args.patch_size[0]}x{args.patch_size[1]}y{args.patch_size[2]}z.csv'
+     Erroroutput_csv = csv_save_path + f'CandidateSeg_{args.dataset_name}_patch{args.patch_size[0]}x{args.patch_size[1]}y{args.patch_size[2]}z_Error.csv'
+
+     # Derive the log file name from the output CSV file
+     log_file = output_csv.replace('.csv', '.log')
+
+     # Configure logging
+     logging.basicConfig(
+         filename=log_file,
+         level=logging.INFO,
+         format="%(asctime)s - %(levelname)s - %(message)s",
+         datefmt="%Y-%m-%d %H:%M:%S"
+     )
+
+     logging.info(f"Output CSV File: {output_csv}")
+     logging.info(f"Error CSV File: {Erroroutput_csv}")
+     logging.info(f"Log File Created: {log_file}")
+     logging.info("File names generated successfully.")
+
+     ###----input CSV
+     df = pd.read_csv(dataset_csv)
+     final_dect = df[args.nifti_clm_name].unique()
+     output_df = pd.DataFrame()
+     Error_ids = []
+     for dictonary_list_i in tqdm(range(0, len(final_dect)), desc='Processing CTs'):
+         try:
+             logging.info(f"---Loading---: {dictonary_list_i+1}")
+             #print(make_bold('|' + '-'*30 + ' No={} '.format(dictonary_list_i+1) + '-'*30 + '|'))
+
+             desired_value = final_dect[dictonary_list_i]
+             filtered_df = df[df[args.nifti_clm_name] == desired_value]
+             example_dictionary = filtered_df.reset_index()
+
+             logging.info(f"Loading the Image: {example_dictionary[args.nifti_clm_name][0]}")
+             logging.info(f"Number of Annotations: {len(example_dictionary)}")
+             #print('Loading the Image: {}'.format(example_dictionary[args.nifti_clm_name][0]))
+             #print('Number of Annotations: {}'.format(len(example_dictionary)))
+             ct_nifti_path = raw_data_path + example_dictionary[args.nifti_clm_name][0]
+             ct_image = sitk.ReadImage(ct_nifti_path)
+             ct_array = sitk.GetArrayFromImage(ct_image)
+
+             torch_image = torch.from_numpy(ct_array)
+             temp_torch_image = {"image": torch_image}
+             intensity_transform = Compose([ScaleIntensityRanged(keys=["image"], a_min=A_min, a_max=A_max, b_min=B_min, b_max=B_max, clip=CLIP)])
+             transformed_image = intensity_transform(temp_torch_image)
+             numpyImage = transformed_image["image"].numpy()
+
+             for Which_box_to_use in range(0, len(example_dictionary)):
+
+                 if args.unique_Annotation_id in example_dictionary.columns:
+                     annotation_id = example_dictionary[args.unique_Annotation_id][Which_box_to_use]
+                 else:
+                     # Generate an ID using the image name (without extension) and an index
+                     image_name = example_dictionary[args.nifti_clm_name][0].split('.nii')[0]
+                     annotation_id = f"{image_name}_candidate_{Which_box_to_use+1}"
+
+                 worldCoord = np.asarray([float(example_dictionary[args.coordX][Which_box_to_use]), float(example_dictionary[args.coordY][Which_box_to_use]), float(example_dictionary[args.coordZ][Which_box_to_use])])
+                 voxelCoord = ct_image.TransformPhysicalPointToIndex(worldCoord)
+                 # Access individual values
+                 w = args.patch_size[0]
+                 h = args.patch_size[1]
+                 d = args.patch_size[2]
+                 start_x, end_x = int(voxelCoord[0] - w/2), int(voxelCoord[0] + w/2)
+                 start_y, end_y = int(voxelCoord[1] - h/2), int(voxelCoord[1] + h/2)
+                 start_z, end_z = int(voxelCoord[2] - d/2), int(voxelCoord[2] + d/2)
+                 X, Y, Z = int(voxelCoord[0]), int(voxelCoord[1]), int(voxelCoord[2])
+                 numpy_to_save_np = numpyImage[max(start_z, 0):end_z, max(start_y, 0):end_y, max(start_x, 0):end_x]
+
+                 # Zero-pad if the window ran past the volume boundary
+                 if numpy_to_save_np.shape != (d, h, w):
+                     dZ, dY, dX = numpyImage.shape
+                     numpy_to_save_np = np.pad(numpy_to_save_np, ((max(d // 2 - Z, 0), d // 2 - min(dZ - Z, d // 2)),
+                                                                  (max(h // 2 - Y, 0), h // 2 - min(dY - Y, h // 2)),
+                                                                  (max(w // 2 - X, 0), w // 2 - min(dX - X, w // 2))), mode="constant", constant_values=0.)
+
+                 #--- Build the patch image with the source geometry ---#
+                 patch_image = sitk.GetImageFromArray(numpy_to_save_np)
+                 patch_image.SetSpacing(ct_image.GetSpacing())
+                 patch_image.SetDirection(ct_image.GetDirection())
+                 patch_image.SetOrigin(ct_image.GetOrigin())
+                 if args.Malignant_lbl in example_dictionary.columns:
+                     feature_row = pd.DataFrame({args.nifti_clm_name: [example_dictionary[args.nifti_clm_name][0]], 'candidateID': [annotation_id], args.Malignant_lbl: [example_dictionary[args.Malignant_lbl][Which_box_to_use]]})
+                 else:
+                     feature_row = pd.DataFrame({args.nifti_clm_name: [example_dictionary[args.nifti_clm_name][0]], 'candidateID': [annotation_id]})
+                 feature_row[args.coordX] = example_dictionary[args.coordX][Which_box_to_use]
+                 feature_row[args.coordY] = example_dictionary[args.coordY][Which_box_to_use]
+                 feature_row[args.coordZ] = example_dictionary[args.coordZ][Which_box_to_use]
+                 # Save
+                 output_nifti_path = os.path.join(args.save_nifti_path, f"{annotation_id}.nii.gz")
+                 # Ensure the directory exists before writing the file
+                 output_nifti_dir = os.path.dirname(output_nifti_path)
+                 os.makedirs(output_nifti_dir, exist_ok=True)  # Creates the directory if it doesn't exist
+                 sitk.WriteImage(patch_image, output_nifti_path)
+                 # Append the row to the output DataFrame
+                 output_df = pd.concat([output_df, feature_row], ignore_index=True)
+         except Exception as e:
+             logging.error(f"An error occurred: {str(e)}")
+             print(f"Error occurred: {e}")
+             Error_ids.append(final_dect[dictonary_list_i])
+
+     # Save the output DataFrame to a CSV file
+     output_df.to_csv(output_csv, index=False, encoding='utf-8')
+     print("completed and saved to {}".format(output_csv))
+     Erroroutput_df = pd.DataFrame(list(Error_ids), columns=[args.nifti_clm_name])
+     Erroroutput_df.to_csv(Erroroutput_csv, index=False, encoding='utf-8')
+     print("completed and saved Error to {}".format(Erroroutput_csv))
+
+
+ if __name__ == "__main__":
+     nifti_patche_extractor_for_worldCoord_main()
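The patch extractor above crops a fixed-size window around each candidate and zero-pads wherever the window leaves the volume. The clip-and-pad logic can be condensed into a small NumPy sketch (illustrative names, not the pipeline's exact implementation):

```python
import numpy as np

def extract_patch(volume, center_zyx, size_zyx):
    """Crop a size_zyx window centred on center_zyx, zero-padding any part
    of the window that falls outside the volume."""
    d, h, w = size_zyx
    z0, y0, x0 = (c - s // 2 for c, s in zip(center_zyx, size_zyx))
    # NumPy slicing clips automatically at the upper boundary; clip the
    # lower boundary ourselves so negative starts do not wrap around.
    crop = volume[max(z0, 0):z0 + d, max(y0, 0):y0 + h, max(x0, 0):x0 + w]
    # Pad each axis back up to the requested size: (before, after).
    pads = [(max(-s, 0), target - got - max(-s, 0))
            for s, target, got in zip((z0, y0, x0), (d, h, w), crop.shape)]
    return np.pad(crop, pads, mode="constant", constant_values=0.0)

vol = np.arange(64, dtype=float).reshape(4, 4, 4)
# Centre on a corner voxel: half the window lies outside the volume.
p = extract_patch(vol, (0, 0, 0), (4, 4, 4))
print(p.shape)     # (4, 4, 4)
print(p[2, 2, 2])  # 0.0 -> the original corner voxel vol[0, 0, 0]
```

Padding after cropping (rather than bounds-checking every index) keeps the output shape fixed, which is what downstream patch-based models expect.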
scr/cvseg_utils.py ADDED
@@ -0,0 +1,488 @@
+ #-- Import libraries
+ import os
+ import argparse
+ import json
+
+ # Numerical and Data Handling
+ import numpy as np
+ import pandas as pd
+
+ # Medical Imaging
+ import SimpleITK as sitk
+ import radiomics
+ from radiomics import featureextractor
+
+ # Machine Learning & Clustering
+ from sklearn.cluster import KMeans
+ from sklearn.mixture import GaussianMixture
+ import skfuzzy as fuzz
+
+ # Image Processing & Segmentation
+ import scipy.ndimage as ndimage
+ from skimage.filters import threshold_otsu
+ from skimage.segmentation import watershed
+ from skimage.feature import peak_local_max
+ from skimage import morphology
+ from scipy.ndimage import distance_transform_edt, binary_erosion
+
+
+ def make_bold(text):
+     return f"\033[1m{text}\033[0m"
+
+ def load_itk_image(filename):
+     itkimage = sitk.ReadImage(filename)
+     numpyImage = sitk.GetArrayFromImage(itkimage)
+     numpyOrigin = itkimage.GetOrigin()
+     numpySpacing = itkimage.GetSpacing()
+     return numpyImage, numpyOrigin, numpySpacing
+
+ def normalize_image_to_uint8(image, lower_bound=-1000, upper_bound=100):
+     clipped_img = np.clip(image, lower_bound, upper_bound)
+     normalized_img = ((clipped_img - lower_bound) / (upper_bound - lower_bound)) * 255.0
+     normalized_img = normalized_img.astype(np.uint8)
+     return normalized_img
+
+
+ def segment_nodule_kmeans(ct_image, bbox_center, bbox_whd, margin=5, n_clusters=2):
+     """
+     Segments a nodule in a 3D CT image using k-means clustering with a margin around the bounding box.
+
+     Parameters:
+     - ct_image: 3D NumPy array representing the CT image.
+     - bbox_center: Center of the bounding box in array-index order (the pipelines pass (z, y, x)).
+     - bbox_whd: Extent of the bounding box along each array axis (the pipelines pass (d, h, w)).
+     - margin: Margin to add around the bounding box (default is 5).
+     - n_clusters: Number of clusters to use in k-means (default is 2).
+
+     Returns:
+     - segmented_image: 3D NumPy array with the segmented nodule.
+     """
+
+     x_center, y_center, z_center = bbox_center
+     w, h, d = bbox_whd
+
+     # Calculate the bounding box with margin
+     x_start, x_end = max(0, x_center - w//2 - margin), min(ct_image.shape[0], x_center + w//2 + margin)
+     y_start, y_end = max(0, y_center - h//2 - margin), min(ct_image.shape[1], y_center + h//2 + margin)
+     z_start, z_end = max(0, z_center - d//2 - margin), min(ct_image.shape[2], z_center + d//2 + margin)
+
+     bbox_region = ct_image[x_start:x_end, y_start:y_end, z_start:z_end]
+
+     # Reshape the region for k-means clustering
+     flat_region = bbox_region.reshape(-1, 1)
+
+     # Perform k-means clustering
+     kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(flat_region)
+     labels = kmeans.labels_
+
+     # Reshape the labels back to the original bounding box shape
+     clustered_region = labels.reshape(bbox_region.shape)
+
+     # Assume the nodule is in the cluster with the highest mean intensity
+     nodule_cluster = np.argmax(kmeans.cluster_centers_)
+
+     # Create a binary mask for the nodule
+     nodule_mask = (clustered_region == nodule_cluster)
+
+     # Apply morphological operations to refine the segmentation
+     nodule_mask = ndimage.binary_closing(nodule_mask, structure=np.ones((3, 3, 3)))
+     nodule_mask = ndimage.binary_opening(nodule_mask, structure=np.ones((2, 2, 2)))
+
+     # Initialize the segmented image
+     segmented_image = np.zeros_like(ct_image, dtype=np.uint8)
+
+     # Place the nodule mask in the correct position in the segmented image
+     segmented_image[x_start:x_end, y_start:y_end, z_start:z_end] = nodule_mask
+
+     return segmented_image
+
+
+ def segment_nodule_gmm(ct_image, bbox_center, bbox_whd, margin=5, n_components=2):
+     """
+     Segments a nodule in a 3D CT image using a Gaussian Mixture Model with a margin around the bounding box.
+
+     Parameters:
+     - ct_image: 3D NumPy array representing the CT image.
+     - bbox_center: Center of the bounding box in array-index order (the pipelines pass (z, y, x)).
+     - bbox_whd: Extent of the bounding box along each array axis (the pipelines pass (d, h, w)).
+     - margin: Margin to add around the bounding box (default is 5).
+     - n_components: Number of components to use in the Gaussian Mixture Model (default is 2).
+
+     Returns:
+     - segmented_image: 3D NumPy array with the segmented nodule.
+     """
+
+     x_center, y_center, z_center = bbox_center
+     w, h, d = bbox_whd
+
+     # Calculate the bounding box with margin
+     x_start, x_end = max(0, x_center - w//2 - margin), min(ct_image.shape[0], x_center + w//2 + margin)
+     y_start, y_end = max(0, y_center - h//2 - margin), min(ct_image.shape[1], y_center + h//2 + margin)
+     z_start, z_end = max(0, z_center - d//2 - margin), min(ct_image.shape[2], z_center + d//2 + margin)
+
+     bbox_region = ct_image[x_start:x_end, y_start:y_end, z_start:z_end]
+
+     # Reshape the region for GMM
+     flat_region = bbox_region.reshape(-1, 1)
+
+     # Perform GMM
+     gmm = GaussianMixture(n_components=n_components, random_state=0).fit(flat_region)
+     labels = gmm.predict(flat_region)
+
+     # Reshape the labels back to the original bounding box shape
+     clustered_region = labels.reshape(bbox_region.shape)
+
+     # Assume the nodule is in the component with the highest mean intensity
+     nodule_component = np.argmax(gmm.means_)
+
+     # Create a binary mask for the nodule
+     nodule_mask = (clustered_region == nodule_component)
+
+     # Apply morphological operations to refine the segmentation
+     nodule_mask = ndimage.binary_closing(nodule_mask, structure=np.ones((3, 3, 3)))
+     nodule_mask = ndimage.binary_opening(nodule_mask, structure=np.ones((3, 3, 3)))
+
+     # Initialize the segmented image
+     segmented_image = np.zeros_like(ct_image, dtype=np.uint8)
+
+     # Place the nodule mask in the correct position in the segmented image
+     segmented_image[x_start:x_end, y_start:y_end, z_start:z_end] = nodule_mask
+
+     return segmented_image
+
+
+ def segment_nodule_fcm(ct_image, bbox_center, bbox_whd, margin=5, n_clusters=2):
+     """
+     Segments a nodule in a 3D CT image using Fuzzy C-means clustering with a margin around the bounding box.
+
+     Parameters:
+     - ct_image: 3D NumPy array representing the CT image.
+     - bbox_center: Center of the bounding box in array-index order (the pipelines pass (z, y, x)).
+     - bbox_whd: Extent of the bounding box along each array axis (the pipelines pass (d, h, w)).
+     - margin: Margin to add around the bounding box (default is 5).
+     - n_clusters: Number of clusters to use in Fuzzy C-means (default is 2).
+
+     Returns:
+     - segmented_image: 3D NumPy array with the segmented nodule.
+     """
+
+     x_center, y_center, z_center = bbox_center
+     w, h, d = bbox_whd
+
+     # Calculate the bounding box with margin
+     x_start, x_end = max(0, x_center - w//2 - margin), min(ct_image.shape[0], x_center + w//2 + margin)
+     y_start, y_end = max(0, y_center - h//2 - margin), min(ct_image.shape[1], y_center + h//2 + margin)
+     z_start, z_end = max(0, z_center - d//2 - margin), min(ct_image.shape[2], z_center + d//2 + margin)
+
+     bbox_region = ct_image[x_start:x_end, y_start:y_end, z_start:z_end]
+
+     # Reshape the region for FCM
+     flat_region = bbox_region.reshape(-1, 1)
+
+     # Perform FCM clustering (dist_mat renamed from 'd' to avoid shadowing the box depth)
+     cntr, u, u0, dist_mat, jm, p, fpc = fuzz.cluster.cmeans(flat_region.T, n_clusters, 2, error=0.005, maxiter=1000, init=None)
+
+     # Assign each voxel to the cluster with the highest membership
+     labels = np.argmax(u, axis=0)
+
+     # Reshape the labels back to the original bounding box shape
+     clustered_region = labels.reshape(bbox_region.shape)
+
+     # Assume the nodule is in the cluster with the highest mean intensity
+     nodule_cluster = np.argmax(cntr)
+
+     # Create a binary mask for the nodule
+     nodule_mask = (clustered_region == nodule_cluster)
+
+     # Apply morphological operations to refine the segmentation
+     nodule_mask = ndimage.binary_closing(nodule_mask, structure=np.ones((3, 3, 3)))
+     nodule_mask = ndimage.binary_opening(nodule_mask, structure=np.ones((3, 3, 3)))
+
+     # Initialize the segmented image
+     segmented_image = np.zeros_like(ct_image, dtype=np.uint8)
+
+     # Place the nodule mask in the correct position in the segmented image
+     segmented_image[x_start:x_end, y_start:y_end, z_start:z_end] = nodule_mask
+
+     return segmented_image
+
+
+ def segment_nodule_otsu(ct_image, bbox_center, bbox_whd, margin=5):
+     """
+     Segments a nodule in a 3D CT image using Otsu's thresholding with a margin around the bounding box.
+
+     Parameters:
+     - ct_image: 3D NumPy array representing the CT image.
+     - bbox_center: Center of the bounding box in array-index order (the pipelines pass (z, y, x)).
+     - bbox_whd: Extent of the bounding box along each array axis (the pipelines pass (d, h, w)).
+     - margin: Margin to add around the bounding box (default is 5).
+
+     Returns:
+     - segmented_image: 3D NumPy array with the segmented nodule.
+     """
+
+     x_center, y_center, z_center = bbox_center
+     w, h, d = bbox_whd
+
+     # Calculate the bounding box with margin
+     x_start, x_end = max(0, x_center - w//2 - margin), min(ct_image.shape[0], x_center + w//2 + margin)
+     y_start, y_end = max(0, y_center - h//2 - margin), min(ct_image.shape[1], y_center + h//2 + margin)
+     z_start, z_end = max(0, z_center - d//2 - margin), min(ct_image.shape[2], z_center + d//2 + margin)
+
+     bbox_region = ct_image[x_start:x_end, y_start:y_end, z_start:z_end]
+
+     # Flatten the region for thresholding
+     flat_region = bbox_region.flatten()
+
+     # Calculate the Otsu threshold
+     otsu_threshold = threshold_otsu(flat_region)
+
+     # Apply the threshold to create a binary mask
+     nodule_mask = bbox_region >= otsu_threshold
+
+     # Apply morphological operations to refine the segmentation
+     nodule_mask = ndimage.binary_closing(nodule_mask, structure=np.ones((3, 3, 3)))
+     nodule_mask = ndimage.binary_opening(nodule_mask, structure=np.ones((3, 3, 3)))
+
+     # Initialize the segmented image
+     segmented_image = np.zeros_like(ct_image, dtype=np.uint8)
+
+     # Place the nodule mask in the correct position in the segmented image
+     segmented_image[x_start:x_end, y_start:y_end, z_start:z_end] = nodule_mask
+
+     return segmented_image
+
+
+ def expand_mask_by_distance(segmented_nodule_gmm, spacing, expansion_mm):
+     """
+     Expands the segmentation mask by a given distance in mm in all directions by directly updating pixel values.
+
+     Parameters:
+     segmented_nodule_gmm (numpy array): 3D binary mask of the nodule (1 for nodule, 0 for background).
+     spacing (tuple): Spacing of the image in mm for each voxel, given as (spacing_x, spacing_y, spacing_z).
+     expansion_mm (float): Distance to expand the mask in millimeters.
+
+     Returns:
+     numpy array: Expanded segmentation mask.
+     """
+     # Reorder spacing to match the numpy array's (z, y, x) format
+     spacing_reordered = (spacing[2], spacing[1], spacing[0])  # (spacing_z, spacing_y, spacing_x)
+
+     # Calculate the number of pixels to expand in each dimension
+     expand_pixels = np.array([int(np.round(expansion_mm / s)) for s in spacing_reordered])
+
+     # Create a new expanded mask with the same shape
+     expanded_mask = np.zeros_like(segmented_nodule_gmm)
+
+     # Get the coordinates of all white pixels in the original mask
+     white_pixel_coords = np.argwhere(segmented_nodule_gmm == 1)
+
+     # Expand each white pixel by adding the specified number of pixels in each direction
+     for coord in white_pixel_coords:
+         z, y, x = coord  # Extract the z, y, x coordinates of each white pixel
+
+         # Define the range to expand for each coordinate
+         z_range = range(max(0, z - expand_pixels[0]), min(segmented_nodule_gmm.shape[0], z + expand_pixels[0] + 1))
+         y_range = range(max(0, y - expand_pixels[1]), min(segmented_nodule_gmm.shape[1], y + expand_pixels[1] + 1))
+         x_range = range(max(0, x - expand_pixels[2]), min(segmented_nodule_gmm.shape[2], x + expand_pixels[2] + 1))
+
+         # Update the new mask by setting all pixels in this range to 1
294
+ for z_new in z_range:
295
+ for y_new in y_range:
296
+ for x_new in x_range:
297
+ expanded_mask[z_new, y_new, x_new] = 1
298
+
299
+ return expanded_mask
300
+
301
+
302
+ def find_nodule_lobe(cccwhd, lung_mask, class_map):
303
+ """
304
+ Determine the lung lobe where a nodule is located based on a 3D mask and bounding box.
305
+
306
+ Parameters:
307
+ cccwhd (list or tuple): Bounding box in the format [center_x, center_y, center_z, width, height, depth].
308
+ lung_mask (numpy array): 3D array representing the lung mask with different lung regions.
309
+ class_map (dict): Dictionary mapping lung region labels to their names.
310
+
311
+ Returns:
312
+ str: Name of the lung lobe where the nodule is located.
313
+ """
314
+ center_x, center_y, center_z, width, height, depth = cccwhd
315
+
316
+ # Calculate the bounding box limits
317
+ start_x = int(center_x - width // 2)
318
+ end_x = int(center_x + width // 2)
319
+ start_y = int(center_y - height // 2)
320
+ end_y = int(center_y + height // 2)
321
+ start_z = int(center_z - depth // 2)
322
+ end_z = int(center_z + depth // 2)
323
+
324
+ # Ensure the indices are within the mask dimensions
325
+ start_x = max(0, start_x)
326
+ end_x = min(lung_mask.shape[0], end_x)
327
+ start_y = max(0, start_y)
328
+ end_y = min(lung_mask.shape[1], end_y)
329
+ start_z = max(0, start_z)
330
+ end_z = min(lung_mask.shape[2], end_z)
331
+
332
+ # Extract the region of interest (ROI) from the mask
333
+ roi = lung_mask[start_x:end_x, start_y:end_y, start_z:end_z]
334
+
335
+ # Count the occurrences of each lobe label within the ROI
336
+ unique, counts = np.unique(roi, return_counts=True)
337
+ label_counts = dict(zip(unique, counts))
338
+
339
+ # Exclude the background (label 0)
340
+ if 0 in label_counts:
341
+ del label_counts[0]
342
+
343
+ # Find the label with the maximum count
344
+ if label_counts:
345
+ nodule_lobe = max(label_counts, key=label_counts.get)
346
+ else:
347
+ nodule_lobe = None
348
+
349
+ # Map the label to the corresponding lung lobe
350
+ if nodule_lobe is not None:
351
+ nodule_lobe_name = class_map["lungs"][nodule_lobe]
352
+ else:
353
+ nodule_lobe_name = "Undefined"
354
+
355
+ return nodule_lobe_name
356
+
357
+
358
+ def find_nodule_lobe_and_distance(cccwhd, lung_mask, class_map,spacing):
359
+ """
360
+ Determine the lung lobe where a nodule is located and measure its distance from the lung wall.
361
+
362
+ Parameters:
363
+ cccwhd (list or tuple): Bounding box in the format [center_x, center_y, center_z, width, height, depth].
364
+ lung_mask (numpy array): 3D array representing the lung mask with different lung regions.
365
+ class_map (dict): Dictionary mapping lung region labels to their names.
366
+
367
+ Returns:
368
+ tuple: (Name of the lung lobe, Distance from the lung wall)
369
+ """
370
+ center_x, center_y, center_z, width, height, depth = cccwhd
371
+
372
+ # Calculate the bounding box limits
373
+ start_x = int(center_x - width // 2)
374
+ end_x = int(center_x + width // 2)
375
+ start_y = int(center_y - height // 2)
376
+ end_y = int(center_y + height // 2)
377
+ start_z = int(center_z - depth // 2)
378
+ end_z = int(center_z + depth // 2)
379
+
380
+ # Ensure the indices are within the mask dimensions
381
+ start_x = max(0, start_x)
382
+ end_x = min(lung_mask.shape[0], end_x)
383
+ start_y = max(0, start_y)
384
+ end_y = min(lung_mask.shape[1], end_y)
385
+ start_z = max(0, start_z)
386
+ end_z = min(lung_mask.shape[2], end_z)
387
+
388
+ # Extract the region of interest (ROI) from the mask
389
+ roi = lung_mask[start_x:end_x, start_y:end_y, start_z:end_z]
390
+
391
+ # Count the occurrences of each lobe label within the ROI
392
+ unique, counts = np.unique(roi, return_counts=True)
393
+ label_counts = dict(zip(unique, counts))
394
+
395
+ # Exclude the background (label 0)
396
+ if 0 in label_counts:
397
+ del label_counts[0]
398
+
399
+ # Find the label with the maximum count
400
+ if label_counts:
401
+ nodule_lobe = max(label_counts, key=label_counts.get)
402
+ else:
403
+ nodule_lobe = None
404
+
405
+ # Map the label to the corresponding lung lobe
406
+ if nodule_lobe is not None:
407
+ nodule_lobe_name = class_map["lungs"][nodule_lobe]
408
+ else:
409
+ nodule_lobe_name = "Undefined"
410
+
411
+ # Calculate the distance from the nodule centroid to the nearest lung wall
412
+ nodule_centroid = np.array([center_x, center_y, center_z])
413
+
414
+ # Create a binary lung mask where lung region is 1 and outside lung is 0
415
+ lung_binary_mask = lung_mask > 0
416
+
417
+ # Create the lung wall mask by finding the outer boundary
418
+ # Use binary erosion to shrink the lung mask, then subtract it from the original mask to get the boundary
419
+ lung_eroded = binary_erosion(lung_binary_mask)
420
+ lung_wall_mask = lung_binary_mask & ~lung_eroded # Lung wall mask is the outermost boundary (contour)
421
+
422
+ # Compute the distance transform from the lung wall
423
+ distance_transform = distance_transform_edt(~lung_wall_mask) # Compute distance to nearest lung boundary
424
+
425
+
426
+
427
+ # Get the distance from the nodule centroid to the nearest lung wall in voxel units
428
+ voxel_distance_to_lung_wall = distance_transform[center_x, center_y, center_z]
429
+
430
+ # Convert voxel distance to real-world distance in mm
431
+ physical_distance_to_lung_wall = voxel_distance_to_lung_wall * np.sqrt(
432
+ spacing[0]**2 + spacing[1]**2 + spacing[2]**2
433
+ )
434
+
435
+
436
+
437
+ return nodule_lobe_name, voxel_distance_to_lung_wall,physical_distance_to_lung_wall
438
+
439
+
440
+ # +
441
+ def expand_mask_by_distance(segmented_nodule_gmm, spacing, expansion_mm):
442
+ """
443
+ Expands the segmentation mask by a given distance in mm in all directions by directly updating pixel values.
444
+
445
+ Parameters:
446
+ segmented_nodule_gmm (numpy array): 3D binary mask of the nodule (1 for nodule, 0 for background).
447
+ spacing (tuple): Spacing of the image in mm for each voxel, given as (spacing_x, spacing_y, spacing_z).
448
+ expansion_mm (float): Distance to expand the mask in millimeters.
449
+
450
+ Returns:
451
+ numpy array: Expanded segmentation mask.
452
+ """
453
+ # Reorder spacing to match the numpy array's (z, y, x) format
454
+ spacing_reordered = (spacing[2], spacing[1], spacing[0]) # (spacing_z, spacing_y, spacing_x)
455
+
456
+ # Calculate the number of pixels to expand in each dimension
457
+ expand_pixels = np.array([int(np.round(expansion_mm / s)) for s in spacing_reordered])
458
+
459
+ # Create a new expanded mask with the same shape
460
+ expanded_mask = np.zeros_like(segmented_nodule_gmm)
461
+
462
+ # Get the coordinates of all white pixels in the original mask
463
+ white_pixel_coords = np.argwhere(segmented_nodule_gmm == 1)
464
+
465
+ # Expand each white pixel by adding the specified number of pixels in each direction
466
+ for coord in white_pixel_coords:
467
+ z, y, x = coord # Extract the z, y, x coordinates of each white pixel
468
+
469
+ # Define the range to expand for each coordinate
470
+ z_range = range(max(0, z - expand_pixels[0]), min(segmented_nodule_gmm.shape[0], z + expand_pixels[0] + 1))
471
+ y_range = range(max(0, y - expand_pixels[1]), min(segmented_nodule_gmm.shape[1], y + expand_pixels[1] + 1))
472
+ x_range = range(max(0, x - expand_pixels[2]), min(segmented_nodule_gmm.shape[2], x + expand_pixels[2] + 1))
473
+
474
+ # Update the new mask by setting all pixels in this range to 1
475
+ for z_new in z_range:
476
+ for y_new in y_range:
477
+ for x_new in x_range:
478
+ expanded_mask[z_new, y_new, x_new] = 1
479
+
480
+ return expanded_mask
481
+
482
+
483
+
484
+ # Function to plot the contours of a mask
485
+ def plot_contours(ax, mask, color, linewidth=1.5):
486
+ contours = measure.find_contours(mask, level=0.5) # Find contours at a constant level
487
+ for contour in contours:
488
+ ax.plot(contour[:, 1], contour[:, 0], color=color, linewidth=linewidth)
scr/segmentation_utils.py ADDED
@@ -0,0 +1,490 @@
+ import os
+ import argparse
+ import numpy as np
+ import pandas as pd
+ import cv2
+ import SimpleITK as sitk
+ import matplotlib.pyplot as plt
+ from matplotlib import colors
+ import radiomics
+ from radiomics import featureextractor
+ from sklearn.cluster import KMeans
+ from sklearn.mixture import GaussianMixture
+ import skfuzzy as fuzz
+ import scipy.ndimage as ndimage
+ from scipy.ndimage import distance_transform_edt, binary_erosion
+ from skimage.filters import threshold_otsu
+ from skimage.segmentation import watershed
+ from skimage.feature import peak_local_max
+ from skimage import measure, morphology
+
+
+ def make_bold(text):
+     return f"\033[1m{text}\033[0m"
+
+
+ def load_itk_image(filename):
+     itkimage = sitk.ReadImage(filename)
+     numpyImage = sitk.GetArrayFromImage(itkimage)
+     numpyOrigin = itkimage.GetOrigin()
+     numpySpacing = itkimage.GetSpacing()
+     return numpyImage, numpyOrigin, numpySpacing
+
+
+ def normalize_image_to_uint8(image, lower_bound=-1000, upper_bound=100):
+     clipped_img = np.clip(image, lower_bound, upper_bound)
+     normalized_img = ((clipped_img - lower_bound) / (upper_bound - lower_bound)) * 255.0
+     normalized_img = normalized_img.astype(np.uint8)
+     return normalized_img
+
+
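The HU windowing in `normalize_image_to_uint8` is easy to sanity-check in isolation: values at or below the lower bound map to 0, values at or above the upper bound map to 255. A minimal sketch (the function body is reproduced so the snippet is self-contained; the default −1000 to 100 HU window matches the definition above):

```python
import numpy as np

def normalize_image_to_uint8(image, lower_bound=-1000, upper_bound=100):
    # Clip to the HU window, then rescale linearly into the uint8 range.
    clipped_img = np.clip(image, lower_bound, upper_bound)
    return (((clipped_img - lower_bound) / (upper_bound - lower_bound)) * 255.0).astype(np.uint8)

# Toy "CT": air below the window clips to 0; soft tissue at the upper bound maps to 255.
hu = np.array([-1500.0, 100.0])
print(normalize_image_to_uint8(hu).tolist())  # → [0, 255]
```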
+ def segment_nodule_kmeans(ct_image, bbox_center, bbox_whd, margin=5, n_clusters=2):
+     """
+     Segments a nodule in a 3D CT image using k-means clustering with a margin around the bounding box.
+
+     Parameters:
+     - ct_image: 3D NumPy array representing the CT image.
+     - bbox_center: Tuple of (x, y, z) coordinates for the center of the bounding box.
+     - bbox_whd: Tuple of (w, h, d) representing the width, height, and depth of the bounding box.
+     - margin: Margin to add around the bounding box (default is 5).
+     - n_clusters: Number of clusters to use in k-means (default is 2).
+
+     Returns:
+     - segmented_image: 3D NumPy array with the segmented nodule.
+     """
+     x_center, y_center, z_center = bbox_center
+     w, h, d = bbox_whd
+
+     # Calculate the bounding box with margin
+     x_start, x_end = max(0, x_center - w//2 - margin), min(ct_image.shape[0], x_center + w//2 + margin)
+     y_start, y_end = max(0, y_center - h//2 - margin), min(ct_image.shape[1], y_center + h//2 + margin)
+     z_start, z_end = max(0, z_center - d//2 - margin), min(ct_image.shape[2], z_center + d//2 + margin)
+
+     bbox_region = ct_image[x_start:x_end, y_start:y_end, z_start:z_end]
+
+     # Reshape the region for k-means clustering
+     flat_region = bbox_region.reshape(-1, 1)
+
+     # Perform k-means clustering
+     kmeans = KMeans(n_clusters=n_clusters, random_state=0).fit(flat_region)
+     labels = kmeans.labels_
+
+     # Reshape the labels back to the original bounding box shape
+     clustered_region = labels.reshape(bbox_region.shape)
+
+     # Assume the nodule is in the cluster with the highest mean intensity
+     nodule_cluster = np.argmax(kmeans.cluster_centers_)
+
+     # Create a binary mask for the nodule
+     nodule_mask = (clustered_region == nodule_cluster)
+
+     # Apply morphological operations to refine the segmentation
+     nodule_mask = ndimage.binary_closing(nodule_mask, structure=np.ones((3, 3, 3)))
+     nodule_mask = ndimage.binary_opening(nodule_mask, structure=np.ones((2, 2, 2)))
+
+     # Initialize the segmented image
+     segmented_image = np.zeros_like(ct_image, dtype=np.uint8)
+
+     # Place the nodule mask in the correct position in the segmented image
+     segmented_image[x_start:x_end, y_start:y_end, z_start:z_end] = nodule_mask
+
+     return segmented_image
+
+
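All of the `segment_nodule_*` helpers share the same margin-and-clamp arithmetic for the bounding box: half the extent plus a margin on each side, clipped to the volume bounds. A stand-alone sketch of just that step (the helper name `clamped_bbox` is illustrative, not part of the toolkit):

```python
def clamped_bbox(center, size, shape, margin=5):
    # Mirror the clamping used in the segment_nodule_* helpers:
    # half-size plus margin on each side, clipped to the image bounds.
    starts_ends = []
    for c, s, dim in zip(center, size, shape):
        starts_ends.append((max(0, c - s // 2 - margin), min(dim, c + s // 2 + margin)))
    return starts_ends

# A 10-voxel-wide nodule centred near the volume edge gets clipped at 0.
print(clamped_bbox((4, 50, 50), (10, 10, 10), (100, 512, 512)))
# → [(0, 14), (40, 60), (40, 60)]
```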
+ def segment_nodule_gmm(ct_image, bbox_center, bbox_whd, margin=5, n_components=2):
+     """
+     Segments a nodule in a 3D CT image using a Gaussian Mixture Model with a margin around the bounding box.
+
+     Parameters:
+     - ct_image: 3D NumPy array representing the CT image.
+     - bbox_center: Tuple of (x, y, z) coordinates for the center of the bounding box.
+     - bbox_whd: Tuple of (w, h, d) representing the width, height, and depth of the bounding box.
+     - margin: Margin to add around the bounding box (default is 5).
+     - n_components: Number of components to use in the Gaussian Mixture Model (default is 2).
+
+     Returns:
+     - segmented_image: 3D NumPy array with the segmented nodule.
+     """
+     x_center, y_center, z_center = bbox_center
+     w, h, d = bbox_whd
+
+     # Calculate the bounding box with margin
+     x_start, x_end = max(0, x_center - w//2 - margin), min(ct_image.shape[0], x_center + w//2 + margin)
+     y_start, y_end = max(0, y_center - h//2 - margin), min(ct_image.shape[1], y_center + h//2 + margin)
+     z_start, z_end = max(0, z_center - d//2 - margin), min(ct_image.shape[2], z_center + d//2 + margin)
+
+     bbox_region = ct_image[x_start:x_end, y_start:y_end, z_start:z_end]
+
+     # Reshape the region for GMM
+     flat_region = bbox_region.reshape(-1, 1)
+
+     # Fit the GMM and assign each voxel to a component
+     gmm = GaussianMixture(n_components=n_components, random_state=0).fit(flat_region)
+     labels = gmm.predict(flat_region)
+
+     # Reshape the labels back to the original bounding box shape
+     clustered_region = labels.reshape(bbox_region.shape)
+
+     # Assume the nodule is in the component with the highest mean intensity
+     nodule_component = np.argmax(gmm.means_)
+
+     # Create a binary mask for the nodule
+     nodule_mask = (clustered_region == nodule_component)
+
+     # Apply morphological operations to refine the segmentation
+     nodule_mask = ndimage.binary_closing(nodule_mask, structure=np.ones((3, 3, 3)))
+     nodule_mask = ndimage.binary_opening(nodule_mask, structure=np.ones((3, 3, 3)))
+
+     # Initialize the segmented image
+     segmented_image = np.zeros_like(ct_image, dtype=np.uint8)
+
+     # Place the nodule mask in the correct position in the segmented image
+     segmented_image[x_start:x_end, y_start:y_end, z_start:z_end] = nodule_mask
+
+     return segmented_image
+
+
+ def segment_nodule_fcm(ct_image, bbox_center, bbox_whd, margin=5, n_clusters=2):
+     """
+     Segments a nodule in a 3D CT image using Fuzzy C-means clustering with a margin around the bounding box.
+
+     Parameters:
+     - ct_image: 3D NumPy array representing the CT image.
+     - bbox_center: Tuple of (x, y, z) coordinates for the center of the bounding box.
+     - bbox_whd: Tuple of (w, h, d) representing the width, height, and depth of the bounding box.
+     - margin: Margin to add around the bounding box (default is 5).
+     - n_clusters: Number of clusters to use in Fuzzy C-means (default is 2).
+
+     Returns:
+     - segmented_image: 3D NumPy array with the segmented nodule.
+     """
+     x_center, y_center, z_center = bbox_center
+     w, h, d = bbox_whd
+
+     # Calculate the bounding box with margin
+     x_start, x_end = max(0, x_center - w//2 - margin), min(ct_image.shape[0], x_center + w//2 + margin)
+     y_start, y_end = max(0, y_center - h//2 - margin), min(ct_image.shape[1], y_center + h//2 + margin)
+     z_start, z_end = max(0, z_center - d//2 - margin), min(ct_image.shape[2], z_center + d//2 + margin)
+
+     bbox_region = ct_image[x_start:x_end, y_start:y_end, z_start:z_end]
+
+     # Reshape the region for FCM
+     flat_region = bbox_region.reshape(-1, 1)
+
+     # Perform FCM clustering; only the cluster centers and the membership matrix are
+     # needed (discarding the rest also avoids shadowing the bounding-box depth d)
+     cntr, u, _, _, _, _, _ = fuzz.cluster.cmeans(flat_region.T, n_clusters, 2, error=0.005, maxiter=1000, init=None)
+
+     # Assign each voxel to the cluster with the highest membership
+     labels = np.argmax(u, axis=0)
+
+     # Reshape the labels back to the original bounding box shape
+     clustered_region = labels.reshape(bbox_region.shape)
+
+     # Assume the nodule is in the cluster with the highest mean intensity
+     nodule_cluster = np.argmax(cntr)
+
+     # Create a binary mask for the nodule
+     nodule_mask = (clustered_region == nodule_cluster)
+
+     # Apply morphological operations to refine the segmentation
+     nodule_mask = ndimage.binary_closing(nodule_mask, structure=np.ones((3, 3, 3)))
+     nodule_mask = ndimage.binary_opening(nodule_mask, structure=np.ones((3, 3, 3)))
+
+     # Initialize the segmented image
+     segmented_image = np.zeros_like(ct_image, dtype=np.uint8)
+
+     # Place the nodule mask in the correct position in the segmented image
+     segmented_image[x_start:x_end, y_start:y_end, z_start:z_end] = nodule_mask
+
+     return segmented_image
+
+
+ def segment_nodule_otsu(ct_image, bbox_center, bbox_whd, margin=5):
+     """
+     Segments a nodule in a 3D CT image using Otsu's thresholding with a margin around the bounding box.
+
+     Parameters:
+     - ct_image: 3D NumPy array representing the CT image.
+     - bbox_center: Tuple of (x, y, z) coordinates for the center of the bounding box.
+     - bbox_whd: Tuple of (w, h, d) representing the width, height, and depth of the bounding box.
+     - margin: Margin to add around the bounding box (default is 5).
+
+     Returns:
+     - segmented_image: 3D NumPy array with the segmented nodule.
+     """
+     x_center, y_center, z_center = bbox_center
+     w, h, d = bbox_whd
+
+     # Calculate the bounding box with margin
+     x_start, x_end = max(0, x_center - w//2 - margin), min(ct_image.shape[0], x_center + w//2 + margin)
+     y_start, y_end = max(0, y_center - h//2 - margin), min(ct_image.shape[1], y_center + h//2 + margin)
+     z_start, z_end = max(0, z_center - d//2 - margin), min(ct_image.shape[2], z_center + d//2 + margin)
+
+     bbox_region = ct_image[x_start:x_end, y_start:y_end, z_start:z_end]
+
+     # Flatten the region for thresholding
+     flat_region = bbox_region.flatten()
+
+     # Calculate the Otsu threshold
+     otsu_threshold = threshold_otsu(flat_region)
+
+     # Apply the threshold to create a binary mask
+     nodule_mask = bbox_region >= otsu_threshold
+
+     # Apply morphological operations to refine the segmentation
+     nodule_mask = ndimage.binary_closing(nodule_mask, structure=np.ones((3, 3, 3)))
+     nodule_mask = ndimage.binary_opening(nodule_mask, structure=np.ones((3, 3, 3)))
+
+     # Initialize the segmented image
+     segmented_image = np.zeros_like(ct_image, dtype=np.uint8)
+
+     # Place the nodule mask in the correct position in the segmented image
+     segmented_image[x_start:x_end, y_start:y_end, z_start:z_end] = nodule_mask
+
+     return segmented_image
+
+
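For intuition, Otsu's criterion used by `segment_nodule_otsu` above can be reproduced without `skimage`: choose the histogram split that maximises the between-class variance. A minimal NumPy sketch (illustrative only; the toolkit itself uses `skimage.filters.threshold_otsu`):

```python
import numpy as np

def otsu_threshold(values, nbins=256):
    # Minimal Otsu: maximise between-class variance over histogram bins.
    hist, edges = np.histogram(values, bins=nbins)
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(hist)                      # voxel count at or below each bin
    w1 = w0[-1] - w0                          # voxel count above each bin
    m = np.cumsum(hist * centers)
    mu0 = m / np.where(w0 == 0, 1, w0)        # mean of the lower class
    mu1 = (m[-1] - m) / np.where(w1 == 0, 1, w1)  # mean of the upper class
    between = w0 * w1 * (mu0 - mu1) ** 2
    return centers[np.argmax(between)]

region = np.array([-900.0] * 50 + [40.0] * 10)   # air background + soft-tissue nodule voxels
t = otsu_threshold(region)
mask = region >= t
print(int(mask.sum()))  # → 10
```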
+ def expand_mask_by_distance(segmented_nodule_gmm, spacing, expansion_mm):
+     """
+     Expands the segmentation mask by a given distance in mm in all directions by directly updating pixel values.
+
+     Parameters:
+     segmented_nodule_gmm (numpy array): 3D binary mask of the nodule (1 for nodule, 0 for background).
+     spacing (tuple): Spacing of the image in mm for each voxel, given as (spacing_x, spacing_y, spacing_z).
+     expansion_mm (float): Distance to expand the mask in millimeters.
+
+     Returns:
+     numpy array: Expanded segmentation mask.
+     """
+     # Reorder spacing to match the numpy array's (z, y, x) format
+     spacing_reordered = (spacing[2], spacing[1], spacing[0])  # (spacing_z, spacing_y, spacing_x)
+
+     # Calculate the number of pixels to expand in each dimension
+     expand_pixels = np.array([int(np.round(expansion_mm / s)) for s in spacing_reordered])
+
+     # Create a new expanded mask with the same shape
+     expanded_mask = np.zeros_like(segmented_nodule_gmm)
+
+     # Get the coordinates of all white pixels in the original mask
+     white_pixel_coords = np.argwhere(segmented_nodule_gmm == 1)
+
+     # Expand each white pixel by the computed number of pixels in each direction
+     for coord in white_pixel_coords:
+         z, y, x = coord  # Extract the z, y, x coordinates of each white pixel
+
+         # Define the range to expand for each coordinate
+         z_range = range(max(0, z - expand_pixels[0]), min(segmented_nodule_gmm.shape[0], z + expand_pixels[0] + 1))
+         y_range = range(max(0, y - expand_pixels[1]), min(segmented_nodule_gmm.shape[1], y + expand_pixels[1] + 1))
+         x_range = range(max(0, x - expand_pixels[2]), min(segmented_nodule_gmm.shape[2], x + expand_pixels[2] + 1))
+
+         # Update the new mask by setting all pixels in this range to 1
+         for z_new in z_range:
+             for y_new in y_range:
+                 for x_new in x_range:
+                     expanded_mask[z_new, y_new, x_new] = 1
+
+     return expanded_mask
+
+
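`expand_mask_by_distance` grows the mask with three nested Python loops; the same box expansion can be expressed as a single binary dilation with a box structuring element, which is much faster on full CT volumes. A sketch of the equivalent (`expand_mask_fast` is an illustrative name, not part of the toolkit):

```python
import numpy as np
from scipy import ndimage

def expand_mask_fast(mask, spacing_xyz, expansion_mm):
    # Dilating with a box of half-width `expand_pixels` marks every voxel that lies
    # within that box of a foreground voxel — the same effect as the explicit loops.
    sz, sy, sx = spacing_xyz[2], spacing_xyz[1], spacing_xyz[0]   # reorder to (z, y, x)
    ez, ey, ex = (int(round(expansion_mm / s)) for s in (sz, sy, sx))
    structure = np.ones((2 * ez + 1, 2 * ey + 1, 2 * ex + 1), dtype=bool)
    return ndimage.binary_dilation(mask.astype(bool), structure=structure).astype(mask.dtype)

mask = np.zeros((5, 5, 5), dtype=np.uint8)
mask[2, 2, 2] = 1
out = expand_mask_fast(mask, (1.0, 1.0, 1.0), 1.0)   # 1 mm ≈ 1 voxel in every direction
print(int(out.sum()))  # → 27
```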
+ def find_nodule_lobe(cccwhd, lung_mask, class_map):
+     """
+     Determine the lung lobe where a nodule is located based on a 3D mask and bounding box.
+
+     Parameters:
+     cccwhd (list or tuple): Bounding box in the format [center_x, center_y, center_z, width, height, depth].
+     lung_mask (numpy array): 3D array representing the lung mask with different lung regions.
+     class_map (dict): Dictionary mapping lung region labels to their names.
+
+     Returns:
+     str: Name of the lung lobe where the nodule is located.
+     """
+     center_x, center_y, center_z, width, height, depth = cccwhd
+
+     # Calculate the bounding box limits
+     start_x = int(center_x - width // 2)
+     end_x = int(center_x + width // 2)
+     start_y = int(center_y - height // 2)
+     end_y = int(center_y + height // 2)
+     start_z = int(center_z - depth // 2)
+     end_z = int(center_z + depth // 2)
+
+     # Ensure the indices are within the mask dimensions
+     start_x = max(0, start_x)
+     end_x = min(lung_mask.shape[0], end_x)
+     start_y = max(0, start_y)
+     end_y = min(lung_mask.shape[1], end_y)
+     start_z = max(0, start_z)
+     end_z = min(lung_mask.shape[2], end_z)
+
+     # Extract the region of interest (ROI) from the mask
+     roi = lung_mask[start_x:end_x, start_y:end_y, start_z:end_z]
+
+     # Count the occurrences of each lobe label within the ROI
+     unique, counts = np.unique(roi, return_counts=True)
+     label_counts = dict(zip(unique, counts))
+
+     # Exclude the background (label 0)
+     if 0 in label_counts:
+         del label_counts[0]
+
+     # Find the label with the maximum count
+     if label_counts:
+         nodule_lobe = max(label_counts, key=label_counts.get)
+     else:
+         nodule_lobe = None
+
+     # Map the label to the corresponding lung lobe
+     if nodule_lobe is not None:
+         nodule_lobe_name = class_map["lungs"][nodule_lobe]
+     else:
+         nodule_lobe_name = "Undefined"
+
+     return nodule_lobe_name
+
+
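The lobe lookup in `find_nodule_lobe` reduces to a majority vote over the non-background labels inside the bounding box. A toy example of that vote (the label names in `class_map` here are illustrative; PiNS takes them from the lung mask's own class map):

```python
import numpy as np

# Tiny stand-in lung mask: label 1 on the left, label 2 on the right.
lung_mask = np.zeros((4, 4, 4), dtype=np.uint8)
lung_mask[:, :, :2] = 1
lung_mask[:, :, 2:] = 2
class_map = {"lungs": {1: "lung_lobe_left", 2: "lung_lobe_right"}}  # illustrative labels

roi = lung_mask[1:3, 1:3, 1:4]             # a bounding box straddling both labels
unique, counts = np.unique(roi, return_counts=True)
label_counts = dict(zip(unique.tolist(), counts.tolist()))
label_counts.pop(0, None)                   # drop background, as in find_nodule_lobe
winner = max(label_counts, key=label_counts.get)
print(class_map["lungs"][winner])           # → lung_lobe_right
```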
+ def find_nodule_lobe_and_distance(cccwhd, lung_mask, class_map, spacing):
+     """
+     Determine the lung lobe where a nodule is located and measure its distance from the lung wall.
+
+     Parameters:
+     cccwhd (list or tuple): Bounding box in the format [center_x, center_y, center_z, width, height, depth].
+     lung_mask (numpy array): 3D array representing the lung mask with different lung regions.
+     class_map (dict): Dictionary mapping lung region labels to their names.
+     spacing (tuple): Voxel spacing of the image in mm, given as (spacing_x, spacing_y, spacing_z).
+
+     Returns:
+     tuple: (Name of the lung lobe, distance from the lung wall in voxels, distance from the lung wall in mm)
+     """
+     center_x, center_y, center_z, width, height, depth = cccwhd
+
+     # Calculate the bounding box limits
+     start_x = int(center_x - width // 2)
+     end_x = int(center_x + width // 2)
+     start_y = int(center_y - height // 2)
+     end_y = int(center_y + height // 2)
+     start_z = int(center_z - depth // 2)
+     end_z = int(center_z + depth // 2)
+
+     # Ensure the indices are within the mask dimensions
+     start_x = max(0, start_x)
+     end_x = min(lung_mask.shape[0], end_x)
+     start_y = max(0, start_y)
+     end_y = min(lung_mask.shape[1], end_y)
+     start_z = max(0, start_z)
+     end_z = min(lung_mask.shape[2], end_z)
+
+     # Extract the region of interest (ROI) from the mask
+     roi = lung_mask[start_x:end_x, start_y:end_y, start_z:end_z]
+
+     # Count the occurrences of each lobe label within the ROI
+     unique, counts = np.unique(roi, return_counts=True)
+     label_counts = dict(zip(unique, counts))
+
+     # Exclude the background (label 0)
+     if 0 in label_counts:
+         del label_counts[0]
+
+     # Find the label with the maximum count
+     if label_counts:
+         nodule_lobe = max(label_counts, key=label_counts.get)
+     else:
+         nodule_lobe = None
+
+     # Map the label to the corresponding lung lobe
+     if nodule_lobe is not None:
+         nodule_lobe_name = class_map["lungs"][nodule_lobe]
+     else:
+         nodule_lobe_name = "Undefined"
+
+     # Create a binary lung mask where lung region is 1 and outside lung is 0
+     lung_binary_mask = lung_mask > 0
+
+     # Create the lung wall mask by finding the outer boundary:
+     # erode the lung mask, then subtract it from the original mask to get the boundary contour
+     lung_eroded = binary_erosion(lung_binary_mask)
+     lung_wall_mask = lung_binary_mask & ~lung_eroded
+
+     # Compute the distance transform from the lung wall
+     distance_transform = distance_transform_edt(~lung_wall_mask)
+
+     # Distance from the nodule centroid to the nearest lung wall in voxel units
+     voxel_distance_to_lung_wall = distance_transform[int(center_x), int(center_y), int(center_z)]
+
+     # Convert the voxel distance to mm by scaling with the voxel diagonal length
+     physical_distance_to_lung_wall = voxel_distance_to_lung_wall * np.sqrt(
+         spacing[0]**2 + spacing[1]**2 + spacing[2]**2
+     )
+
+     return nodule_lobe_name, voxel_distance_to_lung_wall, physical_distance_to_lung_wall
+
+
+ # Function to plot the contours of a mask
+ def plot_contours(ax, mask, color, linewidth=1.5):
+     contours = measure.find_contours(mask, level=0.5)  # Find contours at a constant level
+     for contour in contours:
+         ax.plot(contour[:, 1], contour[:, 0], color=color, linewidth=linewidth)
+
scripts/DLCS24_CADe_64Qpatch.sh ADDED
@@ -0,0 +1,76 @@
+ #!/bin/bash
+
+ # ============================
+ # Docker Container Activation
+ # ============================
+ echo "Starting Docker container..."
+ cd "$(dirname "$0")/.."  # Go to project root
+
+ # Remove existing container if it exists
+ docker rm -f nodule_seg_pipeline 2>/dev/null || true
+
+ # Start container using existing medical imaging image
+ docker run -d --name nodule_seg_pipeline \
+     -v "$(pwd):/app" \
+     -w /app \
+     ft42/pins:latest \
+     tail -f /dev/null
+
+ # Create output directory and set proper permissions
+ echo "Setting up output directories and permissions..."
+ docker exec nodule_seg_pipeline mkdir -p /app/demofolder/output/DLCS24_64Q_CAD_patches
+ docker exec nodule_seg_pipeline chmod -R 777 /app/demofolder/output/
+
+ echo "Installing missing Python packages..."
+ docker exec nodule_seg_pipeline apt-get update > /dev/null 2>&1
+ docker exec nodule_seg_pipeline apt-get install -y libgl1 libglib2.0-0 > /dev/null 2>&1
+ docker exec nodule_seg_pipeline pip install opencv-python-headless torch torchvision monai "numpy<2.0" --quiet
+
+ echo "Docker container is running with all dependencies installed"
+
+ # ============================
+ # Configuration Variables
+ # ============================
+ # Define paths
+ PYTHON_SCRIPT="/app/scr/candidate_worldCoord_patchExtarctor_pipeline.py"  # Path inside container
+
+ DATASET_NAME="DLCSD24"
+ RAW_DATA_PATH="/app/demofolder/data/DLCS24/"
+ CSV_SAVE_PATH="/app/demofolder/output/DLCS24_64Q_CAD_patches/"
+ DATASET_CSV="/app/demofolder/data/DLCSD24_Annotations_N2.csv"
+
+ NIFTI_CLM_NAME="ct_nifti_file"
+ UNIQUE_ANNOTATION_ID="nodule_id"  # Leave empty or remove if not in CSV
+ MALIGNANT_LBL="Malignant_lbl"
+ COORD_X="coordX"
+ COORD_Y="coordY"
+ COORD_Z="coordZ"
+ W="w"
+ H="h"
+ D="d"
+ SAVE_NIFTI_PATH="/app/demofolder/output/DLCS24_64Q_CAD_patches/nifti/"
+ PATCH_SIZE="64 64 64"
+ NORMALIZATION="-1000 500 0 1"
+ CLIP="True"  # Change to "False" if needed
+
+ # ============================
+ # Run the Python script in Docker
+ # ============================
+ echo "Running patch extraction in Docker container..."
+ docker exec nodule_seg_pipeline python3 "$PYTHON_SCRIPT" \
+     --dataset_name "$DATASET_NAME" \
+     --raw_data_path "$RAW_DATA_PATH" \
+     --csv_save_path "$CSV_SAVE_PATH" \
+     --dataset_csv "$DATASET_CSV" \
+     --nifti_clm_name "$NIFTI_CLM_NAME" \
+     --unique_Annotation_id "$UNIQUE_ANNOTATION_ID" \
+     --Malignant_lbl "$MALIGNANT_LBL" \
+     --coordX "$COORD_X" \
+     --coordY "$COORD_Y" \
+     --coordZ "$COORD_Z" \
+     --patch_size $PATCH_SIZE \
+     --normalization $NORMALIZATION \
+     --clip "$CLIP" \
+     --save_nifti_path "$SAVE_NIFTI_PATH"
+
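`NORMALIZATION="-1000 500 0 1"` together with `CLIP="True"` suggests a linear windowing of Hounsfield units from [-1000, 500] onto [0, 1]. A sketch of what such parameters typically imply — `window_normalize` is an illustrative name, not a function from this repository, and the actual mapping lives in the pipeline script:

```python
import numpy as np

def window_normalize(hu, lo=-1000.0, hi=500.0, out_lo=0.0, out_hi=1.0, clip=True):
    """Linearly map the HU window [lo, hi] onto [out_lo, out_hi]."""
    x = (np.asarray(hu, dtype=np.float32) - lo) / (hi - lo)
    x = x * (out_hi - out_lo) + out_lo
    if clip:
        # Values outside the window saturate at the output bounds.
        x = np.clip(x, min(out_lo, out_hi), max(out_lo, out_hi))
    return x
```

Under this reading, air (-1000 HU) maps to 0, soft tissue around -250 HU to 0.5, and anything denser than 500 HU clips to 1.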
scripts/DLCS24_KNN_2mm_Extend_Radiomics.sh ADDED
@@ -0,0 +1,89 @@
+ #!/bin/bash
+
+ # ============================
+ # Docker Container Activation
+ # ============================
+ echo "Starting Docker container..."
+ cd "$(dirname "$0")/.."  # Go to project root
+
+ # Remove existing container if it exists
+ docker rm -f nodule_seg_pipeline 2>/dev/null || true
+
+ # Start container using existing medical imaging image
+ docker run -d --name nodule_seg_pipeline \
+     -v "$(pwd):/app" \
+     -w /app \
+     ft42/pins:latest \
+     tail -f /dev/null
+
+ # Create output directory and set proper permissions
+ echo "Setting up output directories and permissions..."
+ docker exec nodule_seg_pipeline mkdir -p /app/demofolder/output/DLCS24_KNN_2mm_Extend_RadiomicsSeg
+ docker exec nodule_seg_pipeline chmod -R 777 /app/demofolder/output/
+
+ echo "Installing missing Python packages if needed..."
+ docker exec nodule_seg_pipeline pip install opencv-python-headless --quiet > /dev/null 2>&1 || true
+
+ echo "Docker container is running with write permissions set"
+
+ # ============================
+ # Configuration Variables
+ # ============================
+ # Define paths
+ PYTHON_SCRIPT="/app/scr/candidateSeg_radiomicsExtractor_pipiline.py"  # Path inside container
+ PARAMS_JSON="/app/scr/Pyradiomics_feature_extarctor_pram.json"  # Path inside container
+
+ DATASET_NAME="DLCSD24"
+ RAW_DATA_PATH="/app/demofolder/data/DLCS24/"
+ CSV_SAVE_PATH="/app/demofolder/output/"
+ DATASET_CSV="/app/demofolder/data/DLCSD24_Annotations_N2.csv"
+
+ NIFTI_CLM_NAME="ct_nifti_file"
+ UNIQUE_ANNOTATION_ID="nodule_id"  # Leave empty or remove if not in CSV
+ MALIGNANT_LBL="Malignant_lbl"
+ COORD_X="coordX"
+ COORD_Y="coordY"
+ COORD_Z="coordZ"
+ W="w"
+ H="h"
+ D="d"
+
+ SEG_ALG="knn"  # Choose from: gmm, knn, fcm, otsu
+ EXPANSION_MM=2.0  # Set the expansion in millimeters
+ SAVE_NIFTI_PATH="/app/demofolder/output/DLCS24_KNN_2mm_Extend_RadiomicsSeg/"
+ SAVE_MASK_FLAG="--save_the_generated_mask"  # Remove if you don't want to save masks
+ USE_EXPAND_FLAG="--use_expand"  # Include if you want to use expansion
+ EXTRACT_RADIOMICS_FLAG="--extract_radiomics"  # Set to "" to skip radiomics extraction
+
+ # ============================
+ # Run the Python script in Docker
+ # ============================
+ echo "Running segmentation and radiomics extraction in Docker container..."
+ docker exec nodule_seg_pipeline python3 "$PYTHON_SCRIPT" \
+     --dataset_name "$DATASET_NAME" \
+     --raw_data_path "$RAW_DATA_PATH" \
+     --csv_save_path "$CSV_SAVE_PATH" \
+     --dataset_csv "$DATASET_CSV" \
+     --nifti_clm_name "$NIFTI_CLM_NAME" \
+     --unique_Annotation_id "$UNIQUE_ANNOTATION_ID" \
+     --Malignant_lbl "$MALIGNANT_LBL" \
+     --coordX "$COORD_X" \
+     --coordY "$COORD_Y" \
+     --coordZ "$COORD_Z" \
+     --w "$W" \
+     --h "$H" \
+     --d "$D" \
+     --seg_alg "$SEG_ALG" \
+     --expansion_mm "$EXPANSION_MM" \
+     --params_json "$PARAMS_JSON" \
+     --save_nifti_path "$SAVE_NIFTI_PATH" \
+     $USE_EXPAND_FLAG \
+     $EXTRACT_RADIOMICS_FLAG \
+     $SAVE_MASK_FLAG
+
+ echo "✅ Segmentation and radiomics extraction completed! Check demofolder/output/ directory for results."
scripts/DLCS24_KNN_2mm_Extend_Seg.sh ADDED
@@ -0,0 +1,84 @@
+ #!/bin/bash
+
+ # ============================
+ # Docker Container Activation
+ # ============================
+ echo "Starting Docker container..."
+ cd "$(dirname "$0")/.."  # Go to project root
+
+ # Remove existing container if it exists
+ docker rm -f nodule_seg_pipeline 2>/dev/null || true
+
+ # Start container using existing medical imaging image
+ docker run -d --name nodule_seg_pipeline \
+     -v "$(pwd):/app" \
+     -w /app \
+     ft42/pins:latest \
+     tail -f /dev/null
+
+ # Create output directory and set proper permissions
+ echo "Setting up output directories and permissions..."
+ docker exec nodule_seg_pipeline mkdir -p /app/demofolder/output/DLCS24_KNN_2mm_Extend_Seg
+ docker exec nodule_seg_pipeline chmod -R 777 /app/demofolder/output/
+
+ echo "Installing missing Python packages if needed..."
+ docker exec nodule_seg_pipeline pip install opencv-python-headless --quiet > /dev/null 2>&1 || true
+
+ echo "Docker container is running with write permissions set"
+
+ # ============================
+ # Configuration Variables
+ # ============================
+ # Define paths
+ PYTHON_SCRIPT="/app/scr/candidateSeg_pipiline.py"  # Path inside container
+ PARAMS_JSON="/app/scr/Pyradiomics_feature_extarctor_pram.json"  # Path inside container
+
+ DATASET_NAME="DLCSD24"
+ RAW_DATA_PATH="/app/demofolder/data/DLCS24/"
+ CSV_SAVE_PATH="/app/demofolder/output/"
+ DATASET_CSV="/app/demofolder/data/DLCSD24_Annotations_N2.csv"
+
+ NIFTI_CLM_NAME="ct_nifti_file"
+ UNIQUE_ANNOTATION_ID="nodule_id"  # Leave empty or remove if not in CSV
+ MALIGNANT_LBL="Malignant_lbl"
+ COORD_X="coordX"
+ COORD_Y="coordY"
+ COORD_Z="coordZ"
+ W="w"
+ H="h"
+ D="d"
+
+ SEG_ALG="knn"  # Choose from: gmm, knn, fcm, otsu
+ EXPANSION_MM=2.0  # Set the expansion in millimeters
+ SAVE_NIFTI_PATH="/app/demofolder/output/DLCS24_KNN_2mm_Extend_Seg/"
+ SAVE_MASK_FLAG="--save_the_generated_mask"  # Remove if you don't want to save masks
+ USE_EXPAND_FLAG="--use_expand"  # Include if you want to use expansion
+ EXTRACT_RADIOMICS_FLAG=""  # Set to "--extract_radiomics" to also extract radiomics
+
+ # ============================
+ # Run the Python script in Docker
+ # ============================
+ echo "Running segmentation in Docker container..."
+ docker exec nodule_seg_pipeline python3 "$PYTHON_SCRIPT" \
+     --dataset_name "$DATASET_NAME" \
+     --raw_data_path "$RAW_DATA_PATH" \
+     --csv_save_path "$CSV_SAVE_PATH" \
+     --dataset_csv "$DATASET_CSV" \
+     --nifti_clm_name "$NIFTI_CLM_NAME" \
+     --unique_Annotation_id "$UNIQUE_ANNOTATION_ID" \
+     --Malignant_lbl "$MALIGNANT_LBL" \
+     --coordX "$COORD_X" \
+     --coordY "$COORD_Y" \
+     --coordZ "$COORD_Z" \
+     --w "$W" \
+     --h "$H" \
+     --d "$D" \
+     --seg_alg "$SEG_ALG" \
+     --expansion_mm "$EXPANSION_MM" \
+     --params_json "$PARAMS_JSON" \
+     --save_nifti_path "$SAVE_NIFTI_PATH" \
+     $USE_EXPAND_FLAG \
+     $EXTRACT_RADIOMICS_FLAG \
+     $SAVE_MASK_FLAG
+
+ echo "✅ Segmentation completed! Check demofolder/output/ directory for results."
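The optional flags above (`SAVE_MASK_FLAG`, `USE_EXPAND_FLAG`, `EXTRACT_RADIOMICS_FLAG=""`) rely on a bash idiom: an unquoted empty variable expands to zero words, so a disabled flag simply vanishes from the `docker exec` command line. A minimal illustration (the variable and script names here are placeholders):

```shell
flag=""                          # disabled: unquoted $flag expands to nothing
set -- python3 pipeline.py $flag
echo "disabled: $# words"        # 2 words: python3, pipeline.py

flag="--extract_radiomics"       # enabled: contributes exactly one word
set -- python3 pipeline.py $flag
echo "enabled: $# words"         # 3 words
```

Note that this is why the flags are expanded unquoted in the script: writing `"$EXTRACT_RADIOMICS_FLAG"` would pass an empty-string argument instead of dropping the flag.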
scripts/build.sh ADDED
@@ -0,0 +1,242 @@
+ #!/bin/bash
+
+ # Medical Imaging Nodule Segmentation Pipeline - Build Script
+ # Automated Docker container build and setup
+
+ set -e  # Exit on any error
+
+ # Color codes for output
+ RED='\033[0;31m'
+ GREEN='\033[0;32m'
+ YELLOW='\033[1;33m'
+ BLUE='\033[0;34m'
+ NC='\033[0m'  # No Color
+
+ # Script configuration
+ SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+ PROJECT_ROOT="$(dirname "$SCRIPT_DIR")"
+ IMAGE_NAME="medical-imaging/nodule-segmentation"
+ IMAGE_TAG="latest"
+ CONTAINER_NAME="nodule_seg_pipeline"
+
+ # Functions to print colored output
+ print_status() {
+     echo -e "${BLUE}[INFO]${NC} $1"
+ }
+
+ print_success() {
+     echo -e "${GREEN}[SUCCESS]${NC} $1"
+ }
+
+ print_warning() {
+     echo -e "${YELLOW}[WARNING]${NC} $1"
+ }
+
+ print_error() {
+     echo -e "${RED}[ERROR]${NC} $1"
+ }
+
+ # Function to check prerequisites
+ check_prerequisites() {
+     print_status "Checking prerequisites..."
+
+     # Check if Docker is installed
+     if ! command -v docker &> /dev/null; then
+         print_error "Docker is not installed. Please install Docker first."
+         exit 1
+     fi
+
+     # Check if the Docker daemon is running
+     if ! docker info &> /dev/null; then
+         print_error "Docker daemon is not running. Please start Docker first."
+         exit 1
+     fi
+
+     # Check if Docker Compose is available
+     if ! command -v docker-compose &> /dev/null; then
+         print_warning "docker-compose not found. Checking for 'docker compose'..."
+         if ! docker compose version &> /dev/null; then
+             print_error "Docker Compose is not available. Please install Docker Compose."
+             exit 1
+         else
+             DOCKER_COMPOSE="docker compose"
+         fi
+     else
+         DOCKER_COMPOSE="docker-compose"
+     fi
+
+     print_success "Prerequisites check passed!"
+ }
+
+ # Function to create necessary directories
+ setup_directories() {
+     print_status "Setting up directory structure..."
+
+     # Create directories if they don't exist
+     directories=(
+         "docker/logs"
+         "docker/notebooks"
+         "src"
+         "scripts"
+         "params"
+         "data"
+         "output"
+         "logs"
+     )
+
+     for dir in "${directories[@]}"; do
+         if [ ! -d "$PROJECT_ROOT/$dir" ]; then
+             mkdir -p "$PROJECT_ROOT/$dir"
+             print_status "Created directory: $dir"
+         fi
+     done
+
+     print_success "Directory structure ready!"
+ }
+
+ # Function to copy source files
+ setup_source_files() {
+     print_status "Setting up source files..."
+
+     # Copy source files to appropriate directories
+     if [ -f "$PROJECT_ROOT/scr/candidateSeg_pipiline.py" ]; then
+         cp "$PROJECT_ROOT/scr/candidateSeg_pipiline.py" "$PROJECT_ROOT/src/"
+         print_status "Copied main pipeline script"
+     fi
+
+     if [ -f "$PROJECT_ROOT/scr/cvseg_utils.py" ]; then
+         cp "$PROJECT_ROOT/scr/cvseg_utils.py" "$PROJECT_ROOT/src/"
+         print_status "Copied utility scripts"
+     fi
+
+     if [ -f "$PROJECT_ROOT/DLCS24_KNN_2mm_Extend_Seg.sh" ]; then
+         cp "$PROJECT_ROOT/DLCS24_KNN_2mm_Extend_Seg.sh" "$PROJECT_ROOT/scripts/"
+         chmod +x "$PROJECT_ROOT/scripts/DLCS24_KNN_2mm_Extend_Seg.sh"
+         print_status "Copied execution scripts"
+     fi
+
+     if [ -f "$PROJECT_ROOT/scr/Pyradiomics_feature_extarctor_pram.json" ]; then
+         cp "$PROJECT_ROOT/scr/Pyradiomics_feature_extarctor_pram.json" "$PROJECT_ROOT/params/"
+         print_status "Copied parameter files"
+     fi
+
+     print_success "Source files ready!"
+ }
+
+ # Function to build the Docker image
+ build_image() {
+     print_status "Building Docker image: $IMAGE_NAME:$IMAGE_TAG"
+
+     cd "$PROJECT_ROOT"
+
+     # Build with Docker Compose
+     if $DOCKER_COMPOSE build --no-cache; then
+         print_success "Docker image built successfully!"
+     else
+         print_error "Failed to build Docker image"
+         exit 1
+     fi
+ }
+
+ # Function to verify the build
+ verify_build() {
+     print_status "Verifying Docker image..."
+
+     # Check if the image exists
+     if docker images | grep -q "$IMAGE_NAME"; then
+         print_success "Docker image verified!"
+
+         # Show image details
+         print_status "Image details:"
+         docker images | grep "$IMAGE_NAME" | head -1
+
+         # Test basic functionality
+         print_status "Testing basic functionality..."
+         if docker run --rm "$IMAGE_NAME:$IMAGE_TAG" python3 -c "import SimpleITK, radiomics, sklearn, skimage, scipy, pandas, numpy; print('All dependencies available!')"; then
+             print_success "All dependencies are working correctly!"
+         else
+             print_warning "Some dependencies may not be working correctly"
+         fi
+     else
+         print_error "Docker image not found after build"
+         exit 1
+     fi
+ }
+
+ # Function to show usage instructions
+ show_usage() {
+     print_status "Build complete! Here's how to use the container:"
+     echo ""
+     echo "1. Start the container:"
+     echo "   $DOCKER_COMPOSE up -d nodule-segmentation"
+     echo ""
+     echo "2. Run the segmentation pipeline:"
+     echo "   $DOCKER_COMPOSE exec nodule-segmentation bash /app/scripts/DLCS24_KNN_2mm_Extend_Seg.sh"
+     echo ""
+     echo "3. Run interactively:"
+     echo "   $DOCKER_COMPOSE exec nodule-segmentation bash"
+     echo ""
+     echo "4. Start Jupyter (optional):"
+     echo "   $DOCKER_COMPOSE --profile jupyter up -d"
+     echo "   # Access at http://localhost:8888 (token: medical_imaging_2024)"
+     echo ""
+     echo "5. View logs:"
+     echo "   $DOCKER_COMPOSE logs -f nodule-segmentation"
+     echo ""
+     echo "6. Stop the container:"
+     echo "   $DOCKER_COMPOSE down"
+     echo ""
+ }
+
+ # Function to clean up previous builds
+ cleanup() {
+     print_status "Cleaning up previous builds..."
+
+     # Stop and remove containers
+     $DOCKER_COMPOSE down --remove-orphans 2>/dev/null || true
+
+     # Remove previous images (optional)
+     if [ "$1" = "--clean" ]; then
+         docker rmi "$IMAGE_NAME:$IMAGE_TAG" 2>/dev/null || true
+         print_status "Removed previous image"
+     fi
+ }
+
+ # Main build process
+ main() {
+     echo "========================================"
+     echo "Medical Imaging Pipeline - Build Script"
+     echo "========================================"
+     echo ""
+
+     # Parse command line arguments
+     CLEAN_BUILD=false
+     if [ "$1" = "--clean" ]; then
+         CLEAN_BUILD=true
+         print_status "Clean build requested"
+     fi
+
+     # Execute build steps
+     check_prerequisites
+
+     if [ "$CLEAN_BUILD" = true ]; then
+         cleanup --clean
+     fi
+
+     setup_directories
+     setup_source_files
+     build_image
+     verify_build
+     show_usage
+
+     print_success "Build completed successfully!"
+     echo ""
+     echo "Next steps:"
+     echo "1. Review the README.md for detailed usage instructions"
+     echo "2. Prepare your input data in the expected format"
+     echo "3. Start the container and run your analysis"
+     echo ""
+ }
+
+ # Run main function with all arguments
+ main "$@"
scripts/run.sh ADDED
@@ -0,0 +1,385 @@
+ #!/bin/bash
+
+ # Medical Imaging Nodule Segmentation Pipeline - Run Script
+ # Easy execution of the containerized pipeline
+
+ set -e  # Exit on any error
+
+ # Color codes for output
+ RED='\033[0;31m'
+ GREEN='\033[0;32m'
+ YELLOW='\033[1;33m'
+ BLUE='\033[0;34m'
+ NC='\033[0m'  # No Color
+
+ # Script configuration
+ SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+ PROJECT_ROOT="$(dirname "$SCRIPT_DIR")"
+ CONTAINER_NAME="nodule_seg_pipeline"
+
+ # Load environment variables if a .env file exists
+ # (plain echo here because the print_* helpers are defined further down)
+ if [ -f "$PROJECT_ROOT/.env" ]; then
+     echo "[INFO] Loading environment variables from .env file"
+     set -a
+     source "$PROJECT_ROOT/.env"
+     set +a
+ elif [ -f "$PROJECT_ROOT/.env.template" ]; then
+     echo "[WARNING] No .env file found. Copy .env.template to .env and customize paths; using defaults for now"
+ fi
+
+ # Set default environment variables if not already set
+ export PUID=${PUID:-$(id -u)}
+ export PGID=${PGID:-$(id -g)}
+ export DATA_PATH=${DATA_PATH:-"$PROJECT_ROOT/demofolder/data"}
+ export OUTPUT_PATH=${OUTPUT_PATH:-"$PROJECT_ROOT/output"}
+
+ # Functions to print colored output
+ print_status() {
+     echo -e "${BLUE}[INFO]${NC} $1"
+ }
+
+ print_success() {
+     echo -e "${GREEN}[SUCCESS]${NC} $1"
+ }
+
+ print_warning() {
+     echo -e "${YELLOW}[WARNING]${NC} $1"
+ }
+
+ print_error() {
+     echo -e "${RED}[ERROR]${NC} $1"
+ }
+
+ # Function to check if Docker Compose is available
+ check_docker_compose() {
+     if command -v docker-compose &> /dev/null; then
+         DOCKER_COMPOSE="docker-compose"
+     elif docker compose version &> /dev/null; then
+         DOCKER_COMPOSE="docker compose"
+     else
+         print_error "Docker Compose is not available"
+         exit 1
+     fi
+ }
+
+ # Function to show help
+ show_help() {
+     echo "Medical Imaging Nodule Segmentation Pipeline - Run Script"
+     echo ""
+     echo "Usage: $0 [COMMAND] [OPTIONS]"
+     echo ""
+     echo "Commands:"
+     echo "  start     Start the container"
+     echo "  stop      Stop the container"
+     echo "  restart   Restart the container"
+     echo "  run       Run the segmentation pipeline"
+     echo "  shell     Open interactive shell in container"
+     echo "  jupyter   Start Jupyter notebook service"
+     echo "  logs      Show container logs"
+     echo "  status    Show container status"
+     echo "  clean     Clean up containers and volumes"
+     echo "  custom    Run custom command in container"
+     echo ""
+     echo "Options:"
+     echo "  -h, --help     Show this help message"
+     echo "  -f, --follow   Follow logs (for logs command)"
+     echo "  -v, --verbose  Verbose output"
+     echo ""
+     echo "Examples:"
+     echo "  $0 start                        # Start the container"
+     echo "  $0 run                          # Run the segmentation pipeline"
+     echo "  $0 shell                        # Open interactive shell"
+     echo "  $0 logs -f                      # Follow logs"
+     echo "  $0 custom 'python3 --version'   # Run custom command"
+     echo ""
+ }
+
+ # Function to check container status
+ check_status() {
+     if $DOCKER_COMPOSE ps | grep -q "$CONTAINER_NAME.*Up"; then
+         return 0  # Container is running
+     else
+         return 1  # Container is not running
+     fi
+ }
+
+ # Function to start the container
+ start_container() {
+     print_status "Starting the nodule segmentation container..."
+
+     cd "$PROJECT_ROOT"
+
+     if check_status; then
+         print_warning "Container is already running"
+         return 0
+     fi
+
+     if $DOCKER_COMPOSE up -d nodule-segmentation; then
+         print_success "Container started successfully!"
+
+         # Wait for the container to be ready
+         print_status "Waiting for container to be ready..."
+         sleep 5
+
+         # Check health
+         if $DOCKER_COMPOSE exec nodule-segmentation python3 -c "import SimpleITK; print('Container is ready!')" 2>/dev/null; then
+             print_success "Container is healthy and ready!"
+         else
+             print_warning "Container started but health check failed"
+         fi
+     else
+         print_error "Failed to start container"
+         exit 1
+     fi
+ }
+
+ # Function to stop the container
+ stop_container() {
+     print_status "Stopping the nodule segmentation container..."
+
+     cd "$PROJECT_ROOT"
+
+     if ! check_status; then
+         print_warning "Container is not running"
+         return 0
+     fi
+
+     if $DOCKER_COMPOSE stop nodule-segmentation; then
+         print_success "Container stopped successfully!"
+     else
+         print_error "Failed to stop container"
+         exit 1
+     fi
+ }
+
+ # Function to restart the container
+ restart_container() {
+     print_status "Restarting the nodule segmentation container..."
+     stop_container
+     sleep 2
+     start_container
+ }
+
+ # Function to run the segmentation pipeline
+ run_pipeline() {
+     print_status "Running the nodule segmentation pipeline..."
+
+     cd "$PROJECT_ROOT"
+
+     if ! check_status; then
+         print_status "Container is not running. Starting it first..."
+         start_container
+     fi
+
+     # Check if the segmentation script exists
+     if $DOCKER_COMPOSE exec nodule-segmentation test -f "/app/scripts/DLCS24_KNN_2mm_Extend_Seg.sh"; then
+         print_status "Executing segmentation pipeline..."
+         if $DOCKER_COMPOSE exec nodule-segmentation bash /app/scripts/DLCS24_KNN_2mm_Extend_Seg.sh; then
+             print_success "Pipeline executed successfully!"
+         else
+             print_error "Pipeline execution failed"
+             exit 1
+         fi
+     else
+         print_error "Segmentation script not found in container"
+         print_status "Available scripts:"
+         $DOCKER_COMPOSE exec nodule-segmentation ls -la /app/scripts/ || true
+         exit 1
+     fi
+ }
+
+ # Function to open an interactive shell
+ open_shell() {
+     print_status "Opening interactive shell in container..."
+
+     cd "$PROJECT_ROOT"
+
+     if ! check_status; then
+         print_status "Container is not running. Starting it first..."
+         start_container
+     fi
+
+     print_status "Entering container shell..."
+     print_status "Type 'exit' to leave the container"
+     $DOCKER_COMPOSE exec nodule-segmentation bash
+ }
+
+ # Function to start Jupyter
+ start_jupyter() {
+     print_status "Starting Jupyter notebook service..."
+
+     cd "$PROJECT_ROOT"
+
+     if $DOCKER_COMPOSE --profile jupyter up -d; then
+         print_success "Jupyter started successfully!"
+         echo ""
+         echo "Access Jupyter at: http://localhost:8888"
+         echo "Token: medical_imaging_2024"
+         echo ""
+         echo "To stop Jupyter:"
+         echo "  $DOCKER_COMPOSE --profile jupyter down"
+     else
+         print_error "Failed to start Jupyter"
+         exit 1
+     fi
+ }
+
+ # Function to show logs
+ show_logs() {
+     cd "$PROJECT_ROOT"
+
+     if [ "$1" = "-f" ] || [ "$1" = "--follow" ]; then
+         print_status "Following container logs (Ctrl+C to stop)..."
+         $DOCKER_COMPOSE logs -f nodule-segmentation
+     else
+         print_status "Showing container logs..."
+         $DOCKER_COMPOSE logs nodule-segmentation
+     fi
+ }
+
+ # Function to show status
+ show_status() {
+     print_status "Container status:"
+
+     cd "$PROJECT_ROOT"
+
+     echo ""
+     echo "=== Docker Compose Services ==="
+     $DOCKER_COMPOSE ps
+
+     echo ""
+     echo "=== Container Health ==="
+     if check_status; then
+         print_success "Container is running"
+
+         # Show resource usage
+         if command -v docker &> /dev/null; then
+             echo ""
+             echo "=== Resource Usage ==="
+             docker stats --no-stream --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.MemPerc}}" | grep "$CONTAINER_NAME" || true
+         fi
+
+         # Test dependencies
+         echo ""
+         echo "=== Dependency Check ==="
+         if $DOCKER_COMPOSE exec nodule-segmentation python3 -c "
+ import SimpleITK
+ import radiomics
+ import sklearn
+ import skimage
+ import scipy
+ import pandas
+ import numpy
+ print('✓ All dependencies available')
+ " 2>/dev/null; then
+             print_success "All dependencies are working"
+         else
+             print_warning "Some dependencies may not be working"
+         fi
+     else
+         print_warning "Container is not running"
+     fi
+ }
+
+ # Function to clean up
+ cleanup() {
+     print_status "Cleaning up containers and volumes..."
+
+     cd "$PROJECT_ROOT"
+
+     # Stop all services
+     $DOCKER_COMPOSE down --remove-orphans
+
+     # Remove volumes (optional)
+     read -p "Remove data volumes? (y/N): " -n 1 -r
+     echo
+     if [[ $REPLY =~ ^[Yy]$ ]]; then
+         $DOCKER_COMPOSE down -v
+         print_status "Volumes removed"
+     fi
+
+     print_success "Cleanup completed!"
+ }
+
+ # Function to run a custom command
+ run_custom() {
+     local command="$1"
+
+     if [ -z "$command" ]; then
+         print_error "No command provided"
+         echo "Usage: $0 custom 'your command here'"
+         exit 1
+     fi
+
+     print_status "Running custom command: $command"
+
+     cd "$PROJECT_ROOT"
+
+     if ! check_status; then
+         print_status "Container is not running. Starting it first..."
+         start_container
+     fi
+
+     $DOCKER_COMPOSE exec nodule-segmentation bash -c "$command"
+ }
+
+ # Main function
+ main() {
+     # Change to project directory
+     cd "$PROJECT_ROOT"
+
+     # Check Docker Compose availability
+     check_docker_compose
+
+     local command="$1"
+     shift || true
+
+     case "$command" in
+         "start")
+             start_container
+             ;;
+         "stop")
+             stop_container
+             ;;
+         "restart")
+             restart_container
+             ;;
+         "run")
+             run_pipeline
+             ;;
+         "shell")
+             open_shell
+             ;;
+         "jupyter")
+             start_jupyter
+             ;;
+         "logs")
+             show_logs "$@"
+             ;;
+         "status")
+             show_status
+             ;;
+         "clean")
+             cleanup
+             ;;
+         "custom")
+             run_custom "$1"
+             ;;
+         "-h"|"--help"|"help"|"")
+             show_help
+             ;;
+         *)
+             print_error "Unknown command: $command"
+             echo ""
+             show_help
+             exit 1
+             ;;
+     esac
+ }
+
+ # Run main function with all arguments
+ main "$@"
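The `main` function in run.sh uses the common `case`-dispatch pattern, where `shift || true` tolerates a missing subcommand so it falls through to the help branch. Stripped to its essentials (the `dispatch` function and `greet` command here are illustrative, not part of the repository):

```shell
dispatch() {
    cmd="$1"
    shift || true              # tolerate a missing subcommand
    case "$cmd" in
        greet)  echo "hello ${1:-world}" ;;      # remaining args after shift
        "")     echo "usage: dispatch <command>" ;;
        *)      echo "unknown command: $cmd" >&2; return 1 ;;
    esac
}

dispatch greet container       # prints: hello container
dispatch                       # prints the usage line
```

After the `shift`, the remaining positional parameters (`"$@"`) belong to the subcommand, which is how `show_logs "$@"` receives `-f` in run.sh.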