cpystan / wsi-caption


Official implementation of "WsiCaption: Multiple Instance Generation of Pathology Reports for Gigapixel Whole Slide Images" (MICCAI 2024)

License: MIT License

Python 100.00%
aigc pathology python visual-language wsi

wsi-caption's Introduction

WsiCaption: Multiple Instance Generation of Pathology Reports for Gigapixel Whole Slide Images [MICCAI2024]

=====

WsiCaption: Multiple Instance Generation of Pathology Reports for Gigapixel Whole Slide Images. [Link]
Pingyi Chen, Honglin Li, Chenglu Zhu, Sunyi Zheng, Lin Yang
Summary:
1. We propose a pipeline to curate high-quality WSI-text pairs from TCGA. The resulting dataset, TCGA-PathoText, contains about ten thousand pairs and will be publicly accessible. It can potentially promote the development of visual-language models in pathology.
2. We design a multiple instance generation framework (MI-Gen). By incorporating a position-aware module, our model is more sensitive to the spatial information in WSIs.

Pre-requisites:

We will share our collected slide-level captions, but the WSIs still need to be downloaded separately because of their large size.

Downloading TCGA Slides

To download diagnostic WSIs (formatted as .svs files), please refer to the NIH Genomic Data Commons Data Portal. WSIs for each cancer type can be downloaded using the GDC Data Transfer Tool.

Processing Whole Slide Images

To process WSIs, the tissue regions in each biopsy slide are first segmented using Otsu's method on a downsampled WSI read with OpenSlide. Non-overlapping 256 x 256 patches are then extracted from the segmented tissue regions at the desired magnification. Subsequently, a pretrained truncated ResNet50 encodes the raw image patches into 1024-dim feature vectors, which we save as .pt files for each WSI. We perform this pre-processing with CLAM.
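The segmentation and patching steps described above can be illustrated with a minimal, dependency-free sketch. It is not CLAM's code: CLAM operates on OpenSlide images and applies additional filtering. The function names `otsu_threshold` and `tissue_patch_coords`, the `min_tissue` parameter, and the assumption that tissue pixels have values above the threshold are hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple


def otsu_threshold(pixels: Sequence[int], levels: int = 256) -> int:
    """Return the threshold t that maximizes the between-class variance.

    Pixels with value <= t form the lower class, pixels > t the upper class.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # pixel count of the lower class
    sum0 = 0  # intensity sum of the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break  # upper class empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def tissue_patch_coords(
    mask: List[List[bool]],
    downsample: int,
    patch_size: int = 256,
    min_tissue: float = 0.5,
) -> List[Tuple[int, int]]:
    """Top-left (x, y) of non-overlapping patches that are mostly tissue.

    mask[r][c] is True where the downsampled slide is tissue; each mask cell
    covers a downsample x downsample block of full-resolution pixels.
    """
    if patch_size % downsample != 0:
        raise ValueError("patch_size must be a multiple of downsample")
    step = patch_size // downsample  # mask cells per patch side
    rows, cols = len(mask), len(mask[0])
    coords = []
    for r in range(0, rows - step + 1, step):
        for c in range(0, cols - step + 1, step):
            tissue = sum(
                mask[rr][cc]
                for rr in range(r, r + step)
                for cc in range(c, c + step)
            )
            if tissue / (step * step) >= min_tissue:
                coords.append((c * downsample, r * downsample))
    return coords


if __name__ == "__main__":
    # Toy 4 x 4 "downsampled slide": tissue assumed brighter in this channel.
    gray = [
        [200, 210, 12, 10],
        [205, 198, 11, 9],
        [10, 12, 10, 11],
        [9, 10, 190, 201],
    ]
    t = otsu_threshold([v for row in gray for v in row])
    mask = [[v > t for v in row] for row in gray]
    print(tissue_patch_coords(mask, downsample=128))
```

Stepping by a full patch width is what makes the patches non-overlapping. The returned coordinates are in full-resolution pixels, so they could be passed to a slide reader to crop the actual patches before feature extraction.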

TCGA-PathoText: Slide-Text captions

We notice that TCGA includes scanned copies of pathology reports in PDF format. However, they are long, contain redundant information, and have a complex structure. Therefore, we propose a pipeline to extract and clean pathological texts from TCGA, which converts complex PDF files into concise WSI-text pairs with the assistance of large language models (LLMs). We also use a classifier to remove low-quality pairs.

dataset construction

Our dataset can be downloaded online now. The following folder structure is assumed for the TCGA-PathoText:

TCGA-PathoText/
    ├── TCGA_BLCA/
    │   ├── case_1
    │   │   ├── annotation ##(slide-level captions we obtained by OCR and GPT)
    │   │   ├── case_1.pdf ##(softlink to the corresponding raw TCGA report)
    │   │   └── ...
    │   ├── case_2
    │   └── ...
    ├── TCGA_BRCA/
    │   ├── case_1
    │   ├── case_2
    │   └── ...
    └── ...

TCGA-Slide-Features/
    ├── TCGA_BLCA/
    │   ├── case_1.pt
    │   ├── case_2.pt
    │   └── ...
    ├── TCGA_BRCA/
    │   ├── case_1.pt
    │   ├── case_2.pt
    │   └── ...
    └── ...

TCGA-PathoText contains the captions and TCGA-Slide-Features includes the extracted features of WSIs.
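Given this layout, a small helper can check which captions have a matching feature file before training. The sketch below is not part of the repository: the function name `pair_captions_with_features` is hypothetical, and it assumes each case folder contains an entry named `annotation` and that the feature file is named `<case>.pt` under the same cohort folder, as in the trees above.

```python
from pathlib import Path
from typing import List, Tuple


def pair_captions_with_features(
    text_root: str, feat_root: str
) -> Tuple[List[Tuple[Path, Path]], List[Path]]:
    """Match each <cohort>/<case>/annotation to <cohort>/<case>.pt.

    Returns (pairs, missing): pairs holds (annotation, feature) paths,
    missing holds annotations that have no feature file.
    """
    pairs: List[Tuple[Path, Path]] = []
    missing: List[Path] = []
    for ann in sorted(Path(text_root).glob("*/*/annotation")):
        case = ann.parent.name            # e.g. case_1
        cohort = ann.parent.parent.name   # e.g. TCGA_BLCA
        feat = Path(feat_root) / cohort / f"{case}.pt"
        if feat.is_file():
            pairs.append((ann, feat))
        else:
            missing.append(ann)
    return pairs, missing


if __name__ == "__main__":
    # Root folder names follow the layout shown above.
    pairs, missing = pair_captions_with_features(
        "TCGA-PathoText", "TCGA-Slide-Features"
    )
    print(f"{len(pairs)} paired cases, {len(missing)} without features")
```

Reporting the unmatched cases separately makes it easier to spot slides whose features have not been extracted yet.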

More details about the dataset are shown below. (a) Histogram of text lengths, showing that TCGA-PathoText includes longer pathology reports than ARCH, which only describes small patches. (b) Word cloud of the 100 most frequent tokens.

Running Experiments

Experiments can be run using the following generic command-line:

Training model

python main.py --mode 'Train' --n_gpu <GPUs to be used, e.g. '0,1,2,3' for 4-GPU training> --image_dir <SLIDE FEATURE PATH> --ann_path <CAPTION PATH> --split_path <PATH to the directory containing the train/val/test splits>

Testing model

python main.py --mode 'Test' --image_dir <SLIDE FEATURE PATH> --ann_path <CAPTION PATH> --split_path <PATH to the directory containing the train/val/test splits> --checkpoint_dir <PATH TO CKPT>
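When launching several runs from a script, the two commands above can be assembled programmatically. The helper below is a hypothetical convenience wrapper, not part of the repository. It uses only the flags shown in the commands above, and all paths in the example are placeholders.

```python
import shlex
from typing import List, Optional


def build_command(
    mode: str,
    image_dir: str,
    ann_path: str,
    split_path: str,
    n_gpu: Optional[str] = None,
    checkpoint_dir: Optional[str] = None,
) -> List[str]:
    """Build the argument list for main.py using the flags shown above."""
    if mode not in ("Train", "Test"):
        raise ValueError("mode must be 'Train' or 'Test'")
    cmd = ["python", "main.py", "--mode", mode]
    if n_gpu is not None:
        cmd += ["--n_gpu", n_gpu]
    cmd += [
        "--image_dir", image_dir,
        "--ann_path", ann_path,
        "--split_path", split_path,
    ]
    if checkpoint_dir is not None:
        cmd += ["--checkpoint_dir", checkpoint_dir]
    return cmd


if __name__ == "__main__":
    # Placeholder paths; replace them with your own locations.
    train_cmd = build_command(
        "Train",
        image_dir="TCGA-Slide-Features",
        ann_path="TCGA-PathoText",
        split_path="splits",
        n_gpu="0,1,2,3",
    )
    print(shlex.join(train_cmd))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting problems with paths that contain spaces.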

Basic Environment

  • Linux (Tested on Ubuntu 18.04)
  • NVIDIA GPU (Tested on NVIDIA A100) with CUDA 12.0
  • Python (3.8)

wsi-caption's Issues

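Train/val/test splits

Could you open-source the split_path <PATH to the directory containing the train/val/test splits> splits for the other datasets? I would also like to ask how the svs images in the dataset were selected. It seems that not all of the svs images in the dataset were used. At first I thought they were all 01Z slides, but I found that a few are 01A svs images.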

Image Data Preparation

Thanks for your amazing work. I'm going to test your method with my dataset. Could you please explain how you prepare the WSI with HIPT pretrained model for use in your model? Do you first crop the WSIs into 4096x4096 patches and then extract and save their features? These extracted features would then be fed into the model. Am I correct?

Decompression Issue with Uploaded Data

Thank you for your amazing work. I have a question about the data you uploaded. The data you posted is not decompressing; could you please check if there is a problem with the file?
