Coder Social home page Coder Social logo

smujiang / wsitools Goto Github PK

View Code? Open in Web Editor NEW
34.0 6.0 11.0 5.86 MB

Tools for whole slide image (WSI) processing. Especially for (pairwise) patch extraction, annotation parsing and data preparation for deep learning purposes.

License: GNU General Public License v3.0

Python 91.57% Groovy 5.46% Jupyter Notebook 1.17% Shell 1.80%
pathological image-processing wis patch extraction openslide digital-pathology

wsitools's Introduction

WSITools

Tools for whole slide image (WSI) pre-processing, including tissue detection, patch extraction, annotation parsing etc.

Citation

Use this bibtex to cite this repository:

@misc{Jun Jiang WSITools 2019,
  title={Whole slide image pre-processing tools for deep learning tasks},
  author={Jun Jiang},
  year={2019},
  publisher={Github},
  journal={GitHub repository},
  howpublished={\url{https://github.com/smujiang/WSITools}},
}

Or cite our paper used this tool

[1] Jiang, Jun, Burak Tekin, Lin Yuan, Sebastian Armasu, Stacey Winham, E. Goode, Hongfang Liu, Yajue Huang, Ruifeng Guo, and Chen Wang. "Computational tumor stroma reaction evaluation led to novel prognosis-associated fibrosis and molecular signature discoveries in high-grade serous ovarian carcinoma." Frontiers in medicine 9 (2022).
[2] Jiang, Jun, Burak Tekin, Ruifeng Guo, Hongfang Liu, Yajue Huang, and Chen Wang. "Digital pathology-based study of cell-and tissue-level morphologic features in serous borderline ovarian tumor and high-grade serous ovarian cancer." Journal of Pathology Informatics 12 (2021).
[3] Jiang, Jun, Naresh Prodduturi, David Chen, Qiangqiang Gu, Thomas Flotte, Qianjin Feng, and Steven Hart. "Image-to-image translation for automatic ink removal in whole slide images." Journal of Medical Imaging 7, no. 5 (2020): 057502.

Quick Start

Installation

git clone https://github.com/smujiang/WSITools.git
cd WSITools
python setup.py install
  • Note that when you install our package, the dependencies can be automatically installed, but you may need to install the dependent OpenSlide library.
    • If using PyCharm or venv on Windows:
      1. Download the correct binary file for your system
      2. Copy all files from /bin into your venv/Scripts/ directory

Testing

We provide examples for Patch Extraction and Pairwise Patch Extraction. You can choose to save the extracted patches into PNG/JPG files or tfRecords.

If you just want to extract patches from a WSI, and save them into JPG/PNG files, it needs only a few lines of code:

# Import the relevant libraries from this module
from wsitools.tissue_detection.tissue_detector import TissueDetector
from wsitools.patch_extraction.patch_extractor import ExtractorParameters, PatchExtractor
import multiprocessing

#Define some run parameters
num_processors = 20                     # Number of processes that can be running at once
wsi_fn = "/path/2/file.tff"             # Define a sample image that can be read by OpenSlide
output_dir = "/data/WSIs_extraction"    # Define an output directory

# Define the parameters for Patch Extraction, including generating an thumbnail from which to traverse over to find 
# tissue.
parameters = ExtractorParameters(output_dir, # Where the patches should be extracted to
    save_format = '.png',                      # Can be '.jpg', '.png', or '.tfrecord'              
    sample_cnt = -1,                           # Limit the number of patches to extract (-1 == all patches)
    patch_size = 128,                          # Size of patches to extract (Height & Width)
    rescale_rate = 128,                        # Fold size to scale the thumbnail to (for faster processing)
    patch_filter_by_area = 0.5,                # Amount of tissue that should be present in a patch
    with_anno = True,                          # If true, you need to supply an additional XML file
    extract_layer = 0                          # OpenSlide Level
    )

# Choose a method for detecting tissue in thumbnail image
tissue_detector = TissueDetector("LAB_Threshold",   # Can be LAB_Threshold or GNB
    threshold = 85,                                   # Number from 1-255, anything less than this number means there is tissue
    training_files = None                             # Training file for GNB-based detection
    )

# Create the extractor object
patch_extractor = PatchExtractor(tissue_detector, 
    parameters, 
    feature_map = None,                       # See note below                     
    annotations = None                        # Object of Annotation Class (see other note below)
    )

if __name__ == '__main__':
    # Run the extraction process
    multiprocessing.set_start_method('spawn')
    pool = multiprocessing.Pool(processes = num_processors)
    pool.map(patch_extractor.extract, [wsi_fn])

See Feature Maps for more detail

See Annotation Objects for more detail

Descriptions

WSITools is a whole slide image processing toolkit. It provides efficient ways to extract patches from whole slide images, and some other useful features for pathological image processing. Currently, it supports four patch extraction scenarios:

  1. Extract patches from WSIs
  2. Extract patches from WSIs and their label (i.e. their directory name)
    1. TODO: Incomplete
  3. Extract patches from a fixed and a float WSI
  4. Extract patches from a fixed and a float WSI in places that intersect annotation objects
    1. TODO: Incomplete

Additional Features

  1. Detect tissue in a WSI
  2. Export and parsing annotation from QuPath and Aperio Image Scope
  3. WSI registration for image pairs [Paper]
  4. Reconstruct WSI from the processed image patches

Architectures

Architecture

Documents

Tissue Detection
Patch Extraction
WSI Alignment
Pairwise Patch Extraction
Annotate with QuPath and Export Annotations
Annotation Parsing

TODO list

  • Validate saved tfRecords.
  • Add annotation labels into patch extraction.

wsitools's People

Contributors

antoinehabis avatar smujiang avatar steven-n-hart avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

wsitools's Issues

Reading tfrecord file

Hi
Is it possible please to have an example of reading the tfrecord file please ? In fact, I'm facing a lot of issues.

Thanks for your help !

How to perform the "pen mark" removal and color normalization

Hello
I found your method very useful. I am interested to apply WSItools for WSI image data analysis and prepare data ready for the deep learning. I have multiple "svs" format slide images. Some of these WSI, consists of "pen" marks. I am planning to follow analysis as :
remove "pen" marks -> apply color normalization -> tissue detection and patches extraction.
From this method, tissue detection and patches extraction works perfectly fine. How can I tackle the issue of "pen mark" removal and "color normalization" in my datasets. I would appreciate all the suggestions.

Thanks in advance

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.