Medical Machine Learning Platform

General Information

This repository contains the prototype implementation of my master's thesis, written in 2019 at the University of Heidelberg in the medical context of the German Cancer Research Center (DKFZ):

Platform to Assist Medical Experts in Training, Application, and Control of Machine Learning Models Using Patient Data from a Clinical Information System

The full thesis and further information are published in the Heidelberg Document Repository: HeiDOKs.

Further changes and developments after July 2019 were made independently of Heidelberg University and the DKFZ.

Platform Data Storage Structure

All data objects are stored using UUIDs to avoid conflicts with similar objects.

  • /data is created and mounted by default to store the platform's data.
  • /data/MMLP/models contains all data related to models, including training snapshots.
  • /data/MMLP/datasets contains all data sets uploaded by the user.
  • /data/MMLP/results contains the predictions produced when a user uploads data and applies a model.
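
As an illustration of the layout above, here is a minimal sketch of how a UUID-named storage path could be derived. The helper `new_object_path`, the constants `DATA_ROOT` and `KINDS`, and the choice of `uuid4` are assumptions made for this example; the actual logic lives in the backend and may differ.

```python
import uuid
from pathlib import Path

# Assumption: default data root and sub-folder names as listed above.
DATA_ROOT = Path("/data/MMLP")
KINDS = ("models", "datasets", "results")


def new_object_path(kind: str, root: Path = DATA_ROOT) -> Path:
    """Return a fresh path for a new object of the given kind.

    Hypothetical helper: a random UUID as the folder name makes name
    clashes between similar objects practically impossible.
    """
    if kind not in KINDS:
        raise ValueError(f"unknown kind {kind!r}, expected one of {KINDS}")
    return root / kind / str(uuid.uuid4())


if __name__ == "__main__":
    # Only builds and prints the path; nothing is created on disk.
    print(new_object_path("datasets"))
```

The function only computes a path, so it can be tried without write access to /data.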

The configuration is part of the backend; see backend/README and backend/mmlp/config.py.

System Requirements

This prototype platform supports on-premise, hybrid, and public cloud deployments. It has been tested on Amazon Web Services, Microsoft Azure, and Google Cloud. If you need assistance, please contact me.

Before you attempt to deploy the platform, please ensure your system meets the following requirements:

  1. Docker is installed.
  2. GPU support is available within Docker (if you run machine learning on GPUs).
    • For Nvidia GPUs: https://github.com/NVIDIA/nvidia-docker
    • For AMD GPUs: https://rocm.github.io/
  3. If you do not change the default configuration, the following settings are assumed: the global folder /data stores all data related to the platform. It can consume a lot of disk space, depending on your model and data set. If you use a distributed computing environment, please ensure this folder is appropriately shared between the computing nodes. Note: distribution and scaling are currently planned but not yet implemented. Please contact me for further information.
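
The requirements above can be checked roughly before deployment. The following is a minimal sketch, not part of the platform: the function `preflight` and its report keys are hypothetical names, and it only checks that a `docker` executable is on the PATH and how much space is free under the data folder. It does not verify GPU support inside Docker.

```python
import shutil
from pathlib import Path


def preflight(data_dir: str = "/data") -> dict:
    """Collect a few basic facts about the host (hypothetical helper)."""
    path = Path(data_dir)
    report = {
        # True if a `docker` executable can be found on the PATH.
        "docker_found": shutil.which("docker") is not None,
        # The platform creates /data by default, so a missing folder is not an error.
        "data_dir_exists": path.is_dir(),
        "free_gib": None,
    }
    if report["data_dir_exists"]:
        # Free space on the file system that holds the data folder, in GiB.
        report["free_gib"] = round(shutil.disk_usage(str(path)).free / 2**30, 1)
    return report


if __name__ == "__main__":
    for key, value in preflight().items():
        print(f"{key}: {value}")
```

How much free space is enough depends on your models and data sets, so the sketch reports the value instead of enforcing a threshold.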

Usage

  1. Clone the repository:
git clone https://github.com/magreiner/MMLP
cd MMLP
  2. Adjust the configuration:
vi backend/mmlp/Config.py
  3. Deploy the platform. The platform can be deployed using docker-compose:
# build the containers (repeat this step every time you change the code or the configuration)
docker-compose build --parallel

# foreground deployment (useful for development, shows the logs directly):
docker-compose up

# background deployment as a service (access logs via docker-compose logs -f)
docker-compose up -d
  4. Enjoy. If deployed locally, you can access the platform on port 80 at http://localhost

Note:

  • HTTPS is not enabled by default because of the added complexity of managing certificates. Let's Encrypt is recommended for creating certificates.
  • Sometimes the browser automatically tries to switch to HTTPS and fails. If the platform does not display as expected, check whether your browser is using https:// instead of http://.
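
To rule out the browser as the cause, the plain-HTTP endpoint can be probed directly. This is a minimal sketch using only the Python standard library. The function name `platform_reachable` is hypothetical, and the default URL assumes a local deployment on port 80 as described above.

```python
import urllib.error
import urllib.request


def platform_reachable(url: str = "http://localhost", timeout: float = 3.0) -> bool:
    """Return True if something answers over plain HTTP at the given URL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with an error status, so it is reachable.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return False


if __name__ == "__main__":
    print("reachable" if platform_reachable() else "not reachable")
```

If this reports the platform as reachable while the browser shows an error, the browser has most likely switched to HTTPS.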

Screenshots

Clinical Data Scientist (Developer) View

  1. Welcome Page

  2. Option to Switch Between Clinical Data Scientist (Developer) View and Medical Expert (User) View

  3. Data Set Overview

  4. Data Set Version Overview

  5. Model Overview

  6. Model Version Overview (Commits, ...)

  7. Snapshots for a Particular Model Version

  8. Training Pipeline. Please be aware that these pages have dynamic content based on the model used. The view can therefore vary greatly, depending on the functionality of that model. Due to copyright, no model is currently included in this prototype.

    1. Select the data set for training (Data Set Selection)

    2. Select a version of the data set for training (Data Set Version Selection)

    3. Verify the selection (Data Set Summary)

    4. Select the model for training (Model Selection)

    5. Select the version (commit ID) for training (Model Version Selection)

    6. Verify the selection (Model Summary)

    7. Select a training snapshot for fine-tuning, or create a new training snapshot (Snapshot Selection)

    8. Verify the selection (Snapshot Summary)

    9. Customize the training settings, such as pre-processing and hyper-parameters (Training Customization)

    10. Verify all settings before deployment (Configuration Summary)

  9. Method Overview (A method represents a model snapshot that is exported and made available to a medical expert. It can be used without further configuration.)

  10. Result View: An overview of the results produced when the user applies a method. This is intended to allow further debugging by the clinical data scientist.

Medical Expert (User without machine learning experience) View

  1. Welcome Page

  2. Method Overview

  3. Analyzing New Data (use the pretrained model to predict on new data): Guided Pipeline, Upload Patient Cohort, Method Selection, Summary

  4. Result View

Containers

Various containers were helpful during development. Maybe they can be useful for you, too:

  • PACS Container Stack (based on https://www.dcm4che.org)
  • Dataset Generators
  • Port Redirect
  • Postprocessing
  • Preprocessing
  • Visdom-Docker

Evaluation of the prototypical platform

The platform (as of July 2019) was evaluated by clinical data scientists and medical experts. For details consult the thesis published here: http://www.ub.uni-heidelberg.de/archiv/27446
