medtorch / q-aid-core


An intuitive platform for deploying the latest discoveries in healthcare AI to everybody's phones. Powered by PyTorch!

License: MIT License

Python 98.20% JavaScript 0.95% CSS 0.10% Dockerfile 0.54% Shell 0.21%
computer-vision healthcare python pytorch vqa xai

q-aid-core's People

Contributors

andreimano, bcebere, gmuraru, tudorcebere


q-aid-core's Issues

Hackathon documentation

We need to outline what we want to achieve; an example README:
https://github.com/learnables/learn2learn

Other ideas we need:

  • Elevator Pitch: What's your idea? This will be a short tagline for the project.
  • What languages, APIs, hardware, hosts, libraries, UI Kits or frameworks are you using?
  • Here's the whole story. Be sure to write what inspired you, what you learned, how you built it, and the challenges you faced.

feat: Add a sample event summary generator for the plugin.

Steps: start TensorBoard for the plugin. It displays:

No dashboards are active for the current data set.
Probable causes:

You haven’t written any data to your event files.
TensorBoard can’t find your event files.

It is not clear how to generate event files for the plugin.
We should add an example around summary_pb2 for our plugin, just to get the idea and get started; a minimal sketch follows.
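
A minimal sketch of what such an example could look like, assuming the plugin is registered under the name "pytorchxai" (the run directory and tag are placeholders): it writes a single Summary proto carrying our plugin metadata, so TensorBoard discovers an active dashboard for the run.

from tensorboard.compat.proto import summary_pb2
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter("runs/pytorchxai_demo")

# Mark the value as belonging to our plugin via SummaryMetadata.PluginData.
plugin_data = summary_pb2.SummaryMetadata.PluginData(plugin_name="pytorchxai")
metadata = summary_pb2.SummaryMetadata(plugin_data=plugin_data)
value = summary_pb2.Summary.Value(
    tag="pytorchxai/example", simple_value=1.0, metadata=metadata
)

# FileWriter.add_summary serializes the proto into the run's event file.
writer._get_file_writer().add_summary(summary_pb2.Summary(value=[value]), 0)
writer.close()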

Medical MVP: Choose a dataset for the demo

We have to choose a dataset that can help us demo the following:

  • MONAI support/any other established medical framework.
  • Usable from a mobile device, for demonstrating the ease of use.
  • Relevant information in the interpretability benchmarks, for overcoming medical skepticism toward AI.

Add more saliency

Based on the saliency map written by @bcebere and on the API change written by me, continue adding more saliency methods as in pytorchxai.xai.saliency.
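
A minimal sketch of one more method (vanilla gradients), not tied to the existing pytorchxai.xai.saliency API; model and image are assumed to be a trained classifier and a normalized 1x3xHxW tensor:

import torch

def vanilla_gradient_saliency(model, image, target_class):
    # Gradient of the target-class score with respect to the input pixels.
    model.eval()
    image = image.clone().requires_grad_(True)
    score = model(image)[0, target_class]
    score.backward()
    # Max over the channel dimension gives an HxW saliency heatmap.
    return image.grad.detach().abs().max(dim=1)[0].squeeze(0)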

Geo-Location and possibly more metadata

Try to use federated learning to train only on the subset of workers that are in a "similar area".
Ideas:

  • use a worker as a central point and then use federated learning taking into account only the workers that are under a specific radius.

How to know where a worker is located:

  • We have to simulate the location
  • We can keep the workers' locations in the Grid, or keep each location as a tensor "tagged" with "metadata", where we can store a lot more information.

In Syft we send tensors using:

Ex: torch.Tensor([32, 42]).tag("metadata", "location").send(alice)

The call above tags a tensor with two tags before sending it to a worker.
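
A hedged sketch of the radius idea (the haversine helper and the locations mapping are illustrative; in practice the locations would be simulated or read from the tagged "metadata" tensors above):

import math

def haversine_km(a, b):
    # Great-circle distance between two (latitude, longitude) pairs, in km.
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))

def workers_in_radius(center, workers, locations, radius_km=50.0):
    # locations maps worker id -> (latitude, longitude); only workers close
    # to the chosen central worker take part in the federated round.
    return [w for w in workers
            if haversine_km(locations[center.id], locations[w.id]) <= radius_km]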

Logo

We also need a logo, something to show on GitHub, that looks cool and ties in with explainability.

Medical MVP: Model training infrastructure

We should pick a service that gives us some GPU compute to train bigger models. I think we all have access to some pretty decent GPUs to prototype on, but we don't want to wait a week for a single model.

As far as I know, our options are the free AWS credit provided by the hackathon organizers, or the $300 free credit Google provides on their Compute Engine platform.

Someone should also set up the virtual machines so they can be accessed by everyone.

Tests fail because of missing pytorchxai.skeleton

Steps:

  • Clone the repo
  • run python setup.py install
  • run python setup.py test
  • The tests fail with
ImportError while importing test module '/home/bcebere/code/github/hackathon/PyTorchXAI/tests/test_skeleton.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
tests/test_skeleton.py:4: in <module>
    from pytorchxai.skeleton import fib
E   ModuleNotFoundError: No module named 'pytorchxai.skeleton'

----------- coverage: platform linux, python 3.8.3-final-0 -----------
Name                                  Stmts   Miss Branch BrPart  Cover   Missing
---------------------------------------------------------------------------------
src/pytorchxai/__init__.py                0      0      0      0   100%
src/pytorchxai/pytorchxai_plugin.py      29     29      2      0     0%   1-59
---------------------------------------------------------------------------------
TOTAL                                    29     29      2      0     0%

Tested with Python 3.8.3, tensorboard 2.2.2 on Linux Mint 19.3

Dashboard

Final dashboard

  • Datasets: #19
  • Saliency/CV #21
  • Optional: Model bias techniques: idea - t-SNE/PCA/heatmap for custom attention
  • Optional: RL

PyTorchXAI MVP

The PyTorchXAI MVP is the core functionality we want to implement for the PyTorch Summer Hackathon [0].

PyTorchXAI should implement, in TensorBoard [1] or any other visualization tool, a plugin that makes XAI algorithms [2] easy to use.

Milestones for PyTorchXAI:

TensorBoard Plugin

  • write a TensorBoard plugin [3] that allows selecting an image (already present in TensorBoard or uploaded) and displays exactly the same image back.
  • integrate the plugin with what already exists in torch.utils.tensorboard [4].
  • log the activations/gradients of a dummy convolutional network to TensorBoard and reuse them inside the plugin (see the sketch after this list).
  • implement selecting which XAI algorithm to use through a dropdown menu.
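
A minimal sketch for the third milestone, assuming a dummy ConvNet and illustrative tag names: forward hooks capture per-layer activations, the backward pass produces input gradients, and both are logged to TensorBoard for the plugin to reuse.

import torch
from torch import nn
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter("runs/xai_dummy")
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(), nn.Conv2d(8, 4, 3, padding=1)
)

def log_activation(name):
    def hook(module, inputs, output):
        writer.add_histogram(f"activations/{name}", output.detach(), global_step=0)
    return hook

for name, layer in model.named_children():
    layer.register_forward_hook(log_activation(name))

image = torch.randn(1, 3, 32, 32, requires_grad=True)
model(image).sum().backward()

writer.add_histogram("gradients/input", image.grad, global_step=0)
writer.close()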

Motivation, datasets, open problems:

  • document the motivation for this tool's existence, for the competition and for ourselves: bias in training, explainable for people who don't know the literature, etc. There should be both an informal explanation and a collection of relevant papers.
  • find datasets that produce such biases, so we can highlight them with this plugin.
  • research topic: support for XAI in videos?

Algorithms:
If everything works correctly up to this point, we can move on to implementing different XAI algorithms and creating usage tutorials for each:

Referinte:
[0]: https://pytorch2020.devpost.com/
[1]: https://github.com/tensorflow/tensorboard
[2]: https://bair.berkeley.edu/blog/2020/04/23/decisions/
[3]: https://github.com/tensorflow/tensorboard/blob/master/ADDING_A_PLUGIN.md
[4]: https://github.com/pytorch/pytorch/tree/master/torch/utils/tensorboard
[5]: https://analyticsindiamag.com/what-are-saliency-maps-in-deep-learning/
[6]: https://towardsdatascience.com/demystifying-convolutional-neural-networks-using-gradcam-554a85dd4e48
[7]: https://distill.pub/
[8]: https://bair.berkeley.edu/blog/2020/04/23/decisions/

Medical MVP: Adding federated learning stub

We want to distribute the learning process among hospitals, without them releasing their documents.
This should give us the ability to improve the current baseline.

Depending on the dataset, we could instead do federated learning over users' data, and we need an API to support that.

We need to document our limitations/dependencies.
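
A hedged sketch of such a stub with the PySyft 0.2.x API (the hospital workers, toy tensors, and the linear model are placeholders): each hospital keeps its data locally and the model travels to the data owner for every batch.

import torch
from torch import nn, optim
import syft as sy

hook = sy.TorchHook(torch)
hospital_a = sy.VirtualWorker(hook, id="hospital_a")
hospital_b = sy.VirtualWorker(hook, id="hospital_b")

# Toy stand-ins for the real hospital records.
data = torch.randn(64, 10)
targets = data.sum(dim=1, keepdim=True)
federated_dataset = sy.BaseDataset(data, targets).federate((hospital_a, hospital_b))
loader = sy.FederatedDataLoader(federated_dataset, batch_size=16)

model = nn.Linear(10, 1)
optimizer = optim.SGD(model.parameters(), lr=0.01)

for x, y in loader:
    model.send(x.location)             # ship the model to the data owner
    optimizer.zero_grad()
    loss = ((model(x) - y) ** 2).mean()
    loss.backward()
    optimizer.step()
    model.get()                        # bring the updated weights back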

Medical MVP: Expose the model in the cloud

We could use Amazon or Azure to expose a public API.
We can also use TorchServe out of the box for that.

Input:

  • a photo

Result:

  • a label
  • interpretability info

Azure has the advantage of providing homomorphic encryption support, but we need to research that.
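
Assuming the model ends up behind TorchServe, a minimal sketch of the client side (the model name "qaid", the port, and the sample file are placeholders): the app POSTs the photo and receives the label, with interpretability info added by a custom handler later.

import requests

with open("sample_photo.jpg", "rb") as f:
    response = requests.post(
        "http://127.0.0.1:8080/predictions/qaid", data=f.read()
    )

print(response.json())   # e.g. the predicted label and confidence scores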

Medical MVP: Private inference

One nice addition could be running the models over encrypted data.

The limits here are:

  • We might need multiple roundtrips to overcome the growing noise in data.
  • We might need to test polynomial approximations for our architecture.

Azure has support for services like this; it might be worth having a look: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-homomorphic-encryption-seal

Another option is to use TenSEAL over the submitted photos.

The purpose is to provide full privacy for the uploaded info.

One downside might be generating saliency maps over encrypted inputs.
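
A hedged TenSEAL sketch (the CKKS parameters and feature sizes are illustrative): the client encrypts a feature vector extracted from the photo, the server evaluates one linear unit on the ciphertext, and only the key holder can decrypt the score.

import torch
import tenseal as ts

context = ts.context(
    ts.SCHEME_TYPE.CKKS,
    poly_modulus_degree=8192,
    coeff_mod_bit_sizes=[60, 40, 40, 60],
)
context.global_scale = 2 ** 40
context.generate_galois_keys()

features = torch.randn(64)                        # stand-in for extracted features
enc_features = ts.ckks_vector(context, features.tolist())

weight = torch.randn(64).tolist()                 # one output unit of a linear layer
bias = 0.1
enc_score = enc_features.dot(weight) + bias       # evaluated on the ciphertext

print(enc_score.decrypt())                        # only the secret-key holder can do this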

Inference on a remote model

At the moment of creating this issue, I do not know if we can do this in Syft, but it should be pretty easy to implement.
We need a way to send data to a server; that server should run inference (at a minimum) on a pre-trained model (trained using FL) and send the result back.

More ideas after we have the basic building block:

  • send back the saliency map
  • use HE when sending data to the model
  • use DP in case the client wants to "attack" the model
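
A hedged sketch of the basic building block with PySyft 0.2.x virtual workers (the worker, the linear model, and the sample are placeholders): the client ships its data to the server, inference runs remotely, and only the final prediction comes back.

import torch
from torch import nn
import syft as sy

hook = sy.TorchHook(torch)
server = sy.VirtualWorker(hook, id="server")

model = nn.Linear(10, 2)            # stand-in for the FL-trained model
model.send(server)                  # the model lives on the server

sample = torch.randn(1, 10)
sample_ptr = sample.send(server)    # the client ships only this sample

prediction = model(sample_ptr).get()   # computed remotely, result retrieved
print(prediction)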

Medical MVP: VQA+Captum demo

See #44 for more details about the dataset and network architecture.

The task is:

  1. Train a baseline VQA model with decent accuracy. Try using tools from MONAI as much as possible. (related to #40)
  2. Use Captum for interpretability. (related to #35)

Nice to have:
3. Federalize the model and do the same Captum demo. (depends on #41)
4. Find a stronger architecture (maybe based on transformers) and make a demo with it.
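
For step 2, a hedged Captum sketch (the ResNet backbone and random input stand in for the real VQA model and image): Integrated Gradients attributes the predicted answer back to the input pixels.

import torch
from torchvision.models import resnet18
from captum.attr import IntegratedGradients

model = resnet18(pretrained=True).eval()    # stand-in for the VQA image branch
image = torch.randn(1, 3, 224, 224)

target = model(image).argmax(dim=1).item()
ig = IntegratedGradients(model)
attributions = ig.attribute(image, target=target, n_steps=50)

print(attributions.shape)   # same shape as the input image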

Medical MVP: App enhancements

We can pretrain some models to do some local preprocessing in the app before submitting a photo to our service.

We can train models with PyTorch and export them to mobile using PyTorch Mobile.

Flutter also has PyTorch support: https://pub.dev/packages/pytorch_mobile

Some examples:

  • prevent irrelevant submissions
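
A hedged sketch of exporting such a filtering model for PyTorch Mobile (the MobileNet backbone is illustrative): trace it and optimize it so the app can reject irrelevant photos before uploading anything.

import torch
from torchvision.models import mobilenet_v2
from torch.utils.mobile_optimizer import optimize_for_mobile

model = mobilenet_v2(pretrained=True).eval()
example = torch.randn(1, 3, 224, 224)

traced = torch.jit.trace(model, example)
mobile_model = optimize_for_mobile(traced)
mobile_model.save("filter_model.pt")   # bundled with the mobile app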

Inspect race condition

Sometimes, data is not written in tensorboard.

Run the third script normally and start TensorBoard: only the reference image shows up.
Place a breakpoint between the first add_image call and the second add_image call in add_saliency, wait a few seconds, and observe that the data is written.
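
One hedged thing to rule out first: SummaryWriter buffers events, so if the script exits quickly some add_image calls may never reach disk. Flushing (or closing) the writer after logging would confirm whether this is a buffering issue rather than a race.

from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter("runs/saliency_debug")
# ... the add_image calls from add_saliency go here ...
writer.flush()   # force pending events into the event file
writer.close()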

Medical MVP: VQA viability

We should check if it's possible to solve some VQA tasks on medical data. VQA is especially nice when combined with #43 and #35.

Depends on #33.

Medical MVP: MONAI research

We need to know the limits of MONAI.
They have:

  • Datasets: MedNISTDataset, DecathlonDataset
  • Network block highlights: atrous spatial pyramid pooling module.
  • Dataset loaders: GridPatchDataset might be useful.
  • Inference: Sliding Window Inference.
  • Losses: DiceLoss, MaskedDiceLoss, FocalLoss, TverskyLoss
  • Existing Nets: Densenet3D, Highresnet, Unet, Generator, Regressor, Classifier, Discriminator, Critic
  • Metrics: Mean dice, Area under the ROC curve

Do we want to extend it to NLP?
Do we need another architecture?

The research might impact our demo dataset.
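
A hedged sketch wiring together a few of the pieces listed above (the UNet configuration, shapes, and the toy volume are illustrative, not from our demo dataset):

import torch
from monai.networks.nets import UNet
from monai.losses import DiceLoss
from monai.inferers import sliding_window_inference

model = UNet(
    dimensions=3,
    in_channels=1,
    out_channels=2,
    channels=(16, 32, 64, 128),
    strides=(2, 2, 2),
)
loss_fn = DiceLoss(to_onehot_y=True, softmax=True)

volume = torch.randn(1, 1, 96, 96, 96)                  # toy CT-like volume
label = torch.randint(0, 2, (1, 1, 96, 96, 96)).float()
loss = loss_fn(model(volume), label)

# Sliding window inference, as highlighted in MONAI's inference utilities.
with torch.no_grad():
    prediction = sliding_window_inference(
        volume, roi_size=(64, 64, 64), sw_batch_size=1, predictor=model
    )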
