felixpeters / lung-cancer-detection

Deep learning-based segmentation and classification of lung nodules

Jupyter Notebook 99.02% Python 0.97% Makefile 0.01% Dockerfile 0.01%
pytorch medical-imaging medical-application deep-learning

Deep learning-based lung cancer detection

Inspiration

With an estimated 160,000 deaths in 2018, lung cancer is the most common cause of cancer death in the United States (Ardila et al. 2019).

Lung cancer is one of the most prevalent cancers worldwide, causing 1.76 million deaths per year (Yu et al. 2020).

Clinical decision support systems have been developed to enable early diagnosis of lung cancer from CT images. However, most of these tools are limited to lung or nodule segmentation, leaving classification of nodules to the radiologist. Early research shows that deep learning models can support this task as well. Integrating these research efforts into clinical applications is an active area of development. See the Arterys Marketplace for examples of lung cancer detection models, some of which are currently under review for FDA or CE approval. This project is a design study of what a deep learning-based lung cancer detection app could look like.

Data

LIDC-IDRI dataset

This dataset contains 1010 chest CT scans (in DICOM format) with a total of 2625 nodules. Radiologists have annotated each nodule with its malignancy, measurements and additional characteristics (e.g., calcification, spiculation). The lung_cancer_detection package contains modules for reading and preprocessing images. The raw data can be downloaded from the Cancer Imaging Archive.

Models

Nodule classification

Model 1: Malignancy classification from tabular data

  • Code: nbs/14_Nodule_Classification_Tabular.ipynb
  • Data:
    • Source: Nodule metadata
    • Filters: minimum three annotations, malignancy annotation is not "Indeterminate"
    • Target: Binary classification (benign => labels 1, 2; malignant => labels 4, 5)
    • Features: 11 in total (measurements and additional annotations)
    • Split: 672 training examples, 75 test examples
  • Model:
    • Type: Random Forest (scikit-learn defaults)
    • Performance: 94.67% accuracy, 0.9895 AUC score (on test data)
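
The filter and target rules listed above can be sketched in a few lines. This is a minimal illustration, not the notebook's code: it assumes each nodule carries a list of per-radiologist malignancy ratings on the 1 to 5 scale and that ratings are aggregated by their median, which the README does not state. The names `binary_target`, `MIN_ANNOTATIONS` and the sample nodules are hypothetical.

```python
from statistics import median
from typing import List, Optional

MIN_ANNOTATIONS = 3  # README filter: minimum three annotations
INDETERMINATE = 3    # middle of the 1-5 malignancy scale


def binary_target(ratings: List[int]) -> Optional[int]:
    """Return 0 (benign), 1 (malignant), or None if the nodule is filtered out."""
    if len(ratings) < MIN_ANNOTATIONS:
        return None
    # Assumption: aggregate the radiologists' ratings by their median.
    score = median(ratings)
    if score == INDETERMINATE:
        return None
    # Scores below 3 count as benign, scores above 3 as malignant.
    return int(score > INDETERMINATE)


# Hypothetical ratings, for illustration only.
nodules = {
    "nodule_a": [1, 2, 2, 1],  # median 1.5, benign
    "nodule_b": [4, 5, 5],     # median 5, malignant
    "nodule_c": [3, 3, 3],     # indeterminate, dropped
    "nodule_d": [5, 4],        # fewer than three annotations, dropped
}
targets = {name: binary_target(r) for name, r in nodules.items()}
labelled = {name: t for name, t in targets.items() if t is not None}
print(labelled)
```

The surviving nodules and their 0/1 targets would then be paired with the 11 tabular features and passed to the Random Forest.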

Roadmap

Nodule detection model:

  • Preprocess LIDC dataset
  • Train baseline model

Random ideas:

  • Apply TCAV algorithm to trained model, use additional annotations as concepts

References

Materials

Basics of CT images:

PyTorch Lightning:

Monai (PyTorch-based library for medical imaging):

Preprocessing of DICOM images:

Lung cancer detection datasets:

Scientific papers

Ardila, D., Kiraly, A. P., Bharadwaj, S., Choi, B., Reicher, J. J., Peng, L., Tse, D., Etemadi, M., Ye, W., Corrado, G., Naidich, D. P., and Shetty, S. 2019. “End-to-End Lung Cancer Screening with Three-Dimensional Deep Learning on Low-Dose Chest Computed Tomography,” Nature Medicine (25:6), Springer US, pp. 954–961. (https://doi.org/10.1038/s41591-019-0447-x).

Setio, A. A. A., Traverso, A., de Bel, T., Berens, M. S. N., Bogaard, C. van den, Cerello, P., Chen, H., Dou, Q., Fantacci, M. E., Geurts, B., Gugten, R. van der, Heng, P. A., Jansen, B., de Kaste, M. M. J., Kotov, V., Lin, J. Y. H., Manders, J. T. M. C., Sóñora-Mengana, A., García-Naranjo, J. C., Papavasileiou, E., Prokop, M., Saletta, M., Schaefer-Prokop, C. M., Scholten, E. T., Scholten, L., Snoeren, M. M., Torres, E. L., Vandemeulebroucke, J., Walasek, N., Zuidhof, G. C. A., Ginneken, B. van, and Jacobs, C. 2017. “Validation, Comparison, and Combination of Algorithms for Automatic Detection of Pulmonary Nodules in Computed Tomography Images: The LUNA16 Challenge,” Medical Image Analysis. (https://doi.org/10.1016/j.media.2017.06.015).

Svoboda, E. 2020. “Artificial Intelligence Is Improving the Detection of Lung Cancer,” Nature (587:7834), pp. S20–S22. (https://doi.org/10.1038/d41586-020-03157-9).

Yu, K. H., Lee, T. L. M., Yen, M. H., Kou, S. C., Rosen, B., Chiang, J. H., and Kohane, I. S. 2020. “Reproducible Machine Learning Methods for Lung Cancer Detection Using Computed Tomography Images: Algorithm Development and Validation,” Journal of Medical Internet Research (22:8), pp. 1–11. (https://doi.org/10.2196/16709).

Zhu, W., Liu, C., Fan, W., and Xie, X. 2018. “DeepLung: Deep 3D Dual Path Nets for Automated Pulmonary Nodule Detection and Classification,” in 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), IEEE, pp. 673–681.

lung-cancer-detection's Issues

AttributeError: 'tuple' object has no attribute 'endswith'

  • Hello! I'm using this code from "lung-cancer-detection/nbs/08_Image_Reader.ipynb":

data_dict = {"image": "images/LIDC-IDRI-0133.npy", "label": "masks/LIDC-IDRI-0133.npy"}
print(data_dict)
loader = LoadImaged(keys=("image", "label"))
loader.register(LIDCReader(DATA_DIR))
data_dict = loader(data_dict)

  • But Jupyter raises "AttributeError: 'tuple' object has no attribute 'endswith'", and the error points to this method in "lung-cancer-detection/lung_cancer_detection/data/reader.py":

def verify_suffix(self, filename: str) -> bool:
    """
    Verify whether the specified file format is supported by LIDCReader.
    Args:
        filename (Union[Sequence[str], str]): file name or a list of file names to read (should always be one file for LIDC dataset)
    Returns:
        bool: if file format is supported
    """
    if isinstance(filename, list):
        raise ValueError(
            "LIDCReader only supports individual files to be loaded.")
    return filename.endswith(".npy")

  • How can I fix it?
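
One possible cause, offered as an assumption rather than a confirmed diagnosis: the loader may hand the reader a tuple of file names instead of a single string. The `isinstance(filename, list)` check does not catch tuples, so `.endswith` ends up being called on the tuple itself. The sketch below shows a more tolerant check that unwraps a one-element list or tuple. It is a standalone function written for illustration, not the repository's actual fix, and the reader's `read` method would presumably need the same unwrapping.

```python
from os import PathLike
from typing import Sequence, Union

PathOrStr = Union[str, PathLike]


def verify_suffix(filename: Union[Sequence[PathOrStr], PathOrStr]) -> bool:
    """Return True if `filename` names a single .npy file.

    Accepts a plain path or a one-element list/tuple of paths.
    """
    if isinstance(filename, (list, tuple)):
        if len(filename) != 1:
            raise ValueError(
                "LIDCReader only supports individual files to be loaded.")
        # Unwrap the single entry so the suffix check sees one path.
        filename = filename[0]
    # str() also covers pathlib.Path inputs.
    return str(filename).endswith(".npy")


print(verify_suffix(("images/LIDC-IDRI-0133.npy",)))
```

Inside `LIDCReader`, the same body would sit in the method with `self` as its first parameter.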
