license: cc-by-nc-4.0
tags:
- digital-pathology
- camelyon16
- vit
- mil
- oncology

BTRUST-ViT-MIL for CAMELYON16

This model is a Vision Transformer (ViT)-based Multiple Instance Learning (MIL) framework for detecting breast cancer metastases in Whole Slide Images (WSIs) of lymph nodes.

🔥 Reproducibility

Dataset preparation: https://github.com/kimdesok/ViT-backbone-MIL-on-CAMELYON16/blob/main/Convert_TIFs_to_TFRecords.ipynb
Training: https://github.com/kimdesok/ViT-backbone-MIL-on-CAMELYON16/blob/main/MIL_ViT_CAME.ipynb

🏆 Institutional Achievement

Developed as part of the National HPC Supporting Program by AICA (Gwangju, South Korea) and partly through the National NPU Support Program by NIPA (Daegu, South Korea), under which Elice Group Co., Ltd. (Seoul, South Korea) kindly provided two A100 GPUs for model training. This model represents our commitment to reducing the manual workload of pathologists through high-performance AI.

📊 Model Details

Architecture: ViT-Backbone with Attention-based MIL Aggregator
Training Data: CAMELYON16 (H&E Stained Slides)
Framework: Keras / TensorFlow
Target: Detection of breast cancer metastases in lymph nodes
Note: keras_hub provides standardized Vision Transformer weights originally developed and released by the Google and timm teams. The base_model tag on Hugging Face is used for lineage tracking.
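To make the "Attention-based MIL Aggregator" concrete, here is a minimal NumPy sketch of one common formulation of attention pooling: a softmax over per-patch scores, followed by a weighted sum of patch embeddings. It is an illustration only and not necessarily the aggregator used in this model. The function name `attention_mil_pool`, the hidden size of 128, and the random parameters are assumptions; only the 768-dimensional embedding width follows from the ViT-Base backbone.

```python
import numpy as np

def attention_mil_pool(H, V, w):
    """Pool a bag of patch embeddings into one slide-level embedding.

    H: (num_patches, dim) patch embeddings, e.g. produced by a ViT backbone.
    V: (dim, hidden) and w: (hidden,) are attention parameters. They would be
       learned during training; here they are random, for illustration only.
    """
    scores = np.tanh(H @ V) @ w        # one scalar score per patch
    scores = scores - scores.max()     # shift for a numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum()           # attention weights sum to 1 over the bag
    slide_embedding = weights @ H      # attention-weighted average of patches
    return slide_embedding, weights

rng = np.random.default_rng(0)
H = rng.normal(size=(6, 768))          # toy bag of 6 patches, ViT-Base width 768
V = rng.normal(size=(768, 128)) * 0.01 # hypothetical hidden size of 128
w = rng.normal(size=128)

z, a = attention_mil_pool(H, V, w)
print(z.shape, a.shape)
```

In a full model, the slide-level embedding `z` would typically be passed to a small classifier head, and the weights `a` can be inspected to see which patches contributed most to the slide-level prediction.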

📁 Dataset & Data Availability

The model was trained on a curated version of the CAMELYON16 dataset, processed into multi-scale patches and masks.

Dataset Components:

Tissue Masks: Automated tissue detection at 2.5x/10x.
Tumor Masks: Expert-verified ground truth masks.
Patches: Extracted at 2.5x (contextual) and 10.0x (morphological) magnifications.
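The relationship between the two magnifications can be sketched with a little arithmetic. The example below assumes a 40x level-0 scan magnification and a 224-pixel patch size; both are illustrative assumptions rather than documented settings of this pipeline, and the helper names are hypothetical.

```python
BASE_MAGNIFICATION = 40.0  # assumption: magnification of the full-resolution level

def downsample_factor(magnification, base=BASE_MAGNIFICATION):
    """Return how many level-0 pixels one pixel spans at the given magnification."""
    return base / magnification

def context_to_detail_box(x, y, size, low_mag=2.5, high_mag=10.0):
    """Map a square patch at low_mag to the box covering the same tissue at high_mag.

    (x, y) is the top-left corner and size the side length, in low_mag pixels.
    Returns (x, y, size) in high_mag pixels.
    """
    scale = high_mag / low_mag
    return (int(x * scale), int(y * scale), int(size * scale))

print(downsample_factor(2.5))              # 16.0
print(downsample_factor(10.0))             # 4.0
print(context_to_detail_box(10, 20, 224))  # (40, 80, 896)
```

Under these assumptions, one 224-pixel patch at 2.5x covers the same tissue as an 896-pixel square at 10.0x, that is, a 4 x 4 grid of 224-pixel patches.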

Access:

Due to the significant storage size and ongoing curation for commercial spin-off readiness, the processed dataset is not publicly hosted at this time.

Academic Researchers: Available upon reasonable request for validation purposes.
Inquiries: Please contact [dskim@btrust.co.kr] for data access requests.

📊 Dataset Pipeline

We provide the full pipeline for converting the original CAMELYON16 TIFF images into the TFRecord format used to train this model. It is available at https://github.com/kimdesok/ViT-backbone-MIL-on-CAMELYON16/blob/main/Convert_TIFs_to_TFRecords.ipynb

Data Components

Source: Original CAMELYON16 WSIs (.tif)
Output: Multi-scale TFRecord sets (2.5x and 10.0x magnification)
Contents: Tissue masks, Tumor masks, and Patch sets.

Accessing the Data

The processed TFRecord files are hosted on our secure institutional storage due to their large scale.

Scripts: The TIFF-to-TFRecord conversion code is in the Convert_TIFs_to_TFRecords.ipynb notebook linked above.
Download: To request access to the pre-processed TFRecord sets, please fill out our Data Request Form or email us at dskim@btrust.co.kr.

📈 Version History

Version	Date	Description	Status
v1.0	2024-05-22	Initial Release (Fine-tuned on CAMELYON16)	Current
v2.0	(TBD)	Planned Virchow 2.0 Integration on H100	R&D Phase

⚠️ License & Commercial Use

This model is licensed under Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0).

Academics: Free to use for research and publications.
Industry/Commercial: For-profit use requires a separate commercial license.
Inquiries: Please contact [dskim@btrust.co.kr] for licensing and collaboration.

Model tree for kimdesok/vit_mil_camelyon16

Base model: timm/vit_base_patch16_224.augreg_in21k (this model is fine-tuned from it)