This repository contains the code for the paper: "Are you sure it’s an artifact? Artifact detection and uncertainty quantification in histological images."

License: GNU General Public License v3.0

Deep Kernel Learning

This repository contains the source code for the Deep Kernel Learning (DKL) model described in the paper "Are you sure it’s an artifact? Artifact detection and uncertainty quantification in histological images", published in the journal Computerized Medical Imaging and Graphics.

link to the paper:

Requirements

  • Python >= 3.6.7
  • NumPy == 1.23.4
  • PyTorch == 2.2.0
  • GPyTorch == 1.9.0
  • Pandas == 1.5.1
  • deepsig == 1.2.6
  • opencv-python == 4.7.0.68
  • pyvips == 2.2.1
  • openslide-python == 1.2.0
  • Matplotlib
  • SciPy
  • Scikit-learn
  • Seaborn

Abstract

Modern cancer diagnostics involves extracting tissue specimens from suspicious areas and conducting histotechnical procedures to prepare a digitized glass slide, called Whole Slide Image (WSI), for further examination. These procedures frequently introduce different types of artifacts in the obtained WSI, and histological artifacts might influence Computational Pathology (CPATH) systems further down to a diagnostic pipeline if not excluded or handled. Deep Convolutional Neural Networks (DCNNs) have achieved promising results for the detection of some WSI artifacts; however, they do not incorporate uncertainty in their predictions. This paper proposes an uncertainty-aware Deep Kernel Learning (DKL) model to detect blurry areas and folded tissues, two types of artifacts that can appear in WSIs. The proposed probabilistic model combines a CNN feature extractor and a sparse Gaussian Processes (GPs) classifier, which improves the performance of current state-of-the-art artifact detection DCNNs and provides uncertainty estimates. We achieved 0.996 and 0.938 F1 scores for blur and folded tissue detection on unseen data, respectively. In extensive experiments, we validated the DKL model on unseen data from external independent cohorts with different staining and tissue types, where it outperformed DCNNs. Interestingly, the DKL model is more confident in the correct predictions and less in the wrong ones. The proposed DKL model can be integrated into the preprocessing pipeline of CPATH systems to provide reliable predictions and possibly serve as a quality control tool.

How to use the code

Install the dependencies listed in requirements.txt, or install the Python packages individually. Then update the paths to the processed dataset and to the directory where experiment results are saved.

Dataset

The dataset is publicly available on Zenodo: https://zenodo.org/records/10809442

Use the D40x directory and its corresponding artifact folders, organized in the layout shown below.

For folded tissue, this work uses D20x for developing the folded tissue DKL models and D40x for testing them.

- path_to\blur_dataset
      - training
            -- artifact_free
            -- blur
      - validation
            -- artifact_free
            -- blur
      - test
            -- artifact_free
            -- blur
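
Before training, it can help to confirm that the dataset folder matches the layout above. The following is a minimal sketch using only the Python standard library. The helper names `missing_folders` and `count_files` are hypothetical and are not part of this repository; the split and class folder names are taken from the blur layout shown above.

```python
from pathlib import Path

# Split and class folder names taken from the layout above.
SPLITS = ("training", "validation", "test")
CLASSES = ("artifact_free", "blur")


def missing_folders(dataset_root):
    """Return the expected "split/class" folders that do not exist under dataset_root."""
    root = Path(dataset_root)
    missing = []
    for split in SPLITS:
        for cls in CLASSES:
            if not (root / split / cls).is_dir():
                missing.append(f"{split}/{cls}")
    return missing


def count_files(dataset_root):
    """Count files per (split, class) folder; a missing folder counts as zero."""
    root = Path(dataset_root)
    counts = {}
    for split in SPLITS:
        for cls in CLASSES:
            folder = root / split / cls
            if folder.is_dir():
                counts[(split, cls)] = sum(1 for p in folder.iterdir() if p.is_file())
            else:
                counts[(split, cls)] = 0
    return counts
```

Calling `missing_folders("path_to/blur_dataset")` returns an empty list when all six folders are present. For a folded tissue dataset, the class names in `CLASSES` would need to be adjusted to match the folder names used there.
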
  • Train the state-of-the-art (SOTA) DCNNs mentioned in the paper using train_dcnn.py.

  • Train DKL models using train_dkl.py, choosing specific architectures and hyperparameters.

  • Train baseline models using train_baseline.py.

  • Point inference.py to the experiment directories that contain best_weights.dat, then run it on the test set, TCGAFocus, and FocusPath.

  • Use the predicted Excel sheets to create confidence plots with plot_confidence.py.
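
The idea behind comparing confidence on correct and wrong predictions can be illustrated with a small sketch. This is not the logic of plot_confidence.py, and the format of the predicted Excel sheets is not documented here, so the `(true_label, predicted_label, confidence)` row layout and the function name `confidence_by_correctness` are assumptions made for illustration only.

```python
from statistics import mean


def confidence_by_correctness(rows):
    """Return the mean confidence of correct and of wrong predictions.

    Each row is a (true_label, predicted_label, confidence) tuple.
    A group with no rows is reported as None.
    """
    correct = [conf for truth, pred, conf in rows if truth == pred]
    wrong = [conf for truth, pred, conf in rows if truth != pred]
    return {
        "correct": mean(correct) if correct else None,
        "wrong": mean(wrong) if wrong else None,
    }


# Made-up rows for illustration only; these are not results from the paper.
rows = [
    ("blur", "blur", 0.75),
    ("artifact_free", "artifact_free", 1.0),
    ("blur", "artifact_free", 0.5),
]
summary = confidence_by_correctness(rows)
```

A well-calibrated model would be expected to show a higher mean confidence in the "correct" group than in the "wrong" group, which is the behaviour the abstract reports for the DKL model.
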

Other publicly available datasets (used in external validation)

Use transform_tcga.py to transform the dataset before running inference with the trained models.

Private Dataset

The EMC dataset mentioned in the paper will be released soon, and a link will be added here.

How to cite our work

The code is released free of charge as open-source software under the GPL-3.0 license. Please cite this work if you use our code or data, or build on top of it.

@article{kanwal2023you,
  title={Are you sure it’s an artifact? Artifact detection and uncertainty quantification in histological images},
  author={Kanwal, Neel and L{\'o}pez-P{\'e}rez, Miguel and Kiraz, Umay and Zuiverloon, Tahlita CM and Molina, Rafael and Engan, Kjersti},
  journal={Computerized Medical Imaging and Graphics},
  pages={102321},
  year={2023},
  publisher={Elsevier}
}

Other works on HistoArtifact datasets:

  1. Vision-Transformers-for-Small-Histological-Datasets-Learned-Through-Knowledge-Distillation: https://github.com/NeelKanwal/Vision-Transformers-for-Small-Histological-Datasets-Learned-Through-Knowledge-Distillation
  2. Quantifying-the-effect-of-color-processing-on-blood-and-damaged-tissue-detection: https://github.com/NeelKanwal/Quantifying-the-effect-of-color-processing-on-blood-and-damaged-tissue-detection
  3. Are you sure it’s an artifact? Artifact detection and uncertainty quantification in histological images: https://github.com/NeelKanwal/DeepKernelLearning
