nmpoole / cs5199-dissertation

Dissertation completed for the award of MSci in Computer Science. This dissertation is about automated breast cancer detection in low-resolution whole-slide pathology images using a deep convolutional neural network pipeline.

License: MIT License

Python 100.00%
breast-cancer-detection camelyon camelyon16 camelyon17 convolutional-neural-networks deep-learning low-resolution machine-learning medical-image-analysis pytorch

CS5199-Dissertation

Breast Cancer Detection in Low-Resolution Images:

Machine learning systems exist for the automatic detection of breast cancer in histopathology whole-slide images with high confidence. Such systems can potentially automate large portions of conventional diagnostic procedures used to identify breast cancer, improving support for diagnoses via digital second opinion or reducing cognitive load by shifting work away from medical personnel.

However, current systems are complex because they typically operate on full high-resolution whole-slide images, whose width and height can each be hundreds of thousands of pixels. Such images represent pathology slides at very high magnification. Because of this resolution, these systems are typically resource intensive, requiring significant time or compute power, which hinders their clinical viability.

This project investigates automated breast cancer detection via deep learning techniques using lower-resolution images (i.e., digital histopathology slides at a lower magnification). The investigation intends to reveal whether machine learning models can be developed that provide high-confidence results using only a fraction of the resources, by working with low- rather than high-resolution whole-slide images.
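To give a feel for the resource argument, the following minimal sketch counts pixels at successive downsampling levels. The slide dimensions and the power-of-two pyramid are assumptions chosen purely for illustration; they are not taken from the report, and the function name `pixels_at_level` is hypothetical.

```python
def pixels_at_level(width, height, level, factor=2):
    """Return (width, height, pixel_count) after downsampling `level` times.

    Assumes each level shrinks both dimensions by `factor` (an assumption;
    real slide pyramids may use other factors).
    """
    scale = factor ** level
    w, h = width // scale, height // scale
    return w, h, w * h


# Hypothetical full-resolution slide size, chosen only for illustration.
full_w, full_h = 100_000, 200_000

for level in range(0, 9, 2):
    w, h, n = pixels_at_level(full_w, full_h, level)
    print(f"level {level}: {w} x {h} = {n:,} pixels")
```

Under these assumptions, each extra level cuts the pixel count by a factor of four, so a model working a few levels down sees orders of magnitude less data per slide.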

Information On The Contents Of The Project Directory:

CS5199_Report.pdf

  • The final report for the project in PDF format.

src/

  • Contains the project source code, primarily the model training and inference scripts. Also included is the tools/ subdirectory, which contains the data preparation scripts described in the report for making the input data set suitable for use. This directory also includes a ray_results/ folder, where the hyper-parameter optimisation results are stored (not provided because of their size), and a tensorboard/ directory, where GUI outputs for training are stored (viewable with TensorBoard, which is required in the program environment). Full user instructions for the project source code are in the appendices of the report.

models/

  • OMITTED (files too large): Contains the models created for this project. The model names include their expected input resolution (e.g., 299 for 299 x 299 pixels) as well as the model version used (a0 is model version 0, etc.).
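Since the model files are omitted, the exact file-name format is not visible here. The sketch below assumes a hypothetical pattern of the form `<prefix>_<resolution>_a<version>` (for example `model_299_a0`) and shows how the resolution and version could be recovered from such a name. Both the pattern and the helper `parse_model_name` are illustrative assumptions, not the project's actual convention.

```python
import re

# Hypothetical naming pattern: "<resolution>_a<version>" somewhere in the name,
# e.g. "model_299_a0". The real file names may differ.
NAME_PATTERN = re.compile(r"(?P<resolution>\d+)_a(?P<version>\d+)")


def parse_model_name(name):
    """Return (input_resolution, model_version), or None if the name does not match."""
    match = NAME_PATTERN.search(name)
    if match is None:
        return None
    return int(match["resolution"]), int(match["version"])


print(parse_model_name("model_299_a0"))
```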

data/

  • OMITTED (files too large): Contains the low-resolution data sets used in the project for training models at 299 x 299 pixels. This primarily includes the full Camelyon data set used in the project. Within the data set folder(s) are the train/, eval/, and test/ sub-directories required by the implementation. Note that these can be used to replicate the project work carried out for models using 299 x 299 pixel inputs, but are not suitable for all higher input resolutions. For higher resolutions, the Camelyon data set will have to be downloaded and the data preparation process described in the user instructions of the report will have to be followed.
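Because the data sets have to be recreated locally, a quick check that a data set folder has the three required sub-directories can catch layout mistakes early. This is a minimal sketch using only the standard library; the helper `missing_splits` and the example path are hypothetical and not part of the project code.

```python
from pathlib import Path

# Sub-directories the implementation is described as requiring.
REQUIRED_SPLITS = ("train", "eval", "test")


def missing_splits(dataset_dir):
    """Return the names of required sub-directories absent from `dataset_dir`."""
    root = Path(dataset_dir)
    return [name for name in REQUIRED_SPLITS if not (root / name).is_dir()]


if __name__ == "__main__":
    # "data/my_dataset" is a placeholder path, not a folder named in the README.
    missing = missing_splits("data/my_dataset")
    if missing:
        print("Missing sub-directories:", ", ".join(missing))
    else:
        print("Data set layout looks complete.")
```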

testdata/

  • Contains a small subset of the Camelyon data set that was used for debugging the source code. It no longer serves a practical purpose.

env/

  • Contains the Dockerfile used to create the Docker container (i.e., program environment) on the remote GPU machine. This is provided for completeness. Also contains the requirements.txt file, which lists the pip packages required to execute the project code. The only unlisted package required is python3-openslide, whose installation is shown in the Dockerfile.
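Since python3-openslide is installed outside of requirements.txt, it is easy to miss when setting up an environment by hand. The sketch below checks whether the Python module is importable before running anything else. It assumes the package exposes a top-level module named `openslide`; the helper `is_importable` is hypothetical.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


# Assumption: python3-openslide provides a module importable as "openslide".
if not is_importable("openslide"):
    print("openslide is not importable; see the Dockerfile in env/ for its installation.")
```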
