
lindawangg / covid-net

1.1K stars · 75 watchers · 479 forks · 66.44 MB

COVID-Net Open Source Initiative

License: Other

Jupyter Notebook 56.92% Python 43.08%
coronavirus coronavirus-dataset coronavirus-detect chest-radiography covid-net covidx-dataset covid-19 sars-cov-2

covid-net's People

Contributors

aalhaimi, andyzzhao, h-aboutalebi, haydengunraj, jingfeipeng, lindawangg, lindeny, maclean-alexander, mayaliliya, naomiterhljan


covid-net's Issues

CovidNet query

Issue Template

Before posting, have you looked at the FAQ page?

Description

Please include a summary of the issue.
Please include the steps to reproduce.
List any additional libraries that are affected.

Steps to Reproduce

  1. First step
  2. Second step
  3. Third step

Expected behavior

A description of what you expected to happen.

Actual behavior

A description of what happens instead.

Environment

  • Build: [e.g. 3180 - type "About" in the Command Palette]
  • Operating system and version: [e.g. macOS 10.14, Windows 10, Ubuntu 18.04]
  • [Linux] Desktop Environment and/or Window Manager: [e.g. Gnome, LXDE, i3]

Can COVID-Net be used for binary classification (normal and COVID-19)?

Model detects COVID-19 or Data Source?

Given that the data source is different for the two classes (COVID-19 vs. Pneumonia/Normal), how do you validate that the model actually classifies the presence of COVID-19 rather than the data source?

Can you provide a model pretrained on ImageNet?


PEPX Network Design Pattern

Thanks for working on this project! This is very interesting and very impactful.

COVID-Net relies on a design pattern of projection-expansion-projection-extension (PEPX) throughout the network. I have beginner-level knowledge of computer vision, and I haven't seen this design pattern before.

  1. Without loading the model, what are the output dimensions of each layer in the PEPX module (Figure 2, top right box) for PEPX1.1? This would give me a better understanding of how the dimensions change within the module.

  2. What is the intuition around the effectiveness of this design pattern? Are there some previous papers that use this design pattern for their core results?
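For reference, the paper describes PEPX as a first-stage projection (1x1 conv), an expansion (1x1 conv), a depth-wise 3x3 convolution, a second-stage projection (1x1 conv), and an extension (1x1 conv). Below is a rough Keras sketch of one such module; it is an illustration rather than the authors' implementation, and the channel counts are placeholders, not the values used in PEPX1.1:

from tensorflow.keras import layers

def pepx_module(x, proj_ch, exp_ch, out_ch):
    # First-stage projection: 1x1 conv down to a smaller channel depth
    x = layers.Conv2D(proj_ch, 1, padding='same', activation='relu')(x)
    # Expansion: 1x1 conv up to a larger depth
    x = layers.Conv2D(exp_ch, 1, padding='same', activation='relu')(x)
    # Depth-wise 3x3 conv for spatial feature learning
    x = layers.DepthwiseConv2D(3, padding='same', activation='relu')(x)
    # Second-stage projection: 1x1 conv back down
    x = layers.Conv2D(proj_ch, 1, padding='same', activation='relu')(x)
    # Extension: 1x1 conv to the module's output depth
    return layers.Conv2D(out_ch, 1, padding='same', activation='relu')(x)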

Name of model ckpts

I couldn't find the model checkpoint files inside the covid_net.large archive. What should I do?

Out-of-Sample filtering and transformation

Hello, and thank you for sharing this great project.

I am testing the net on out-of-sample data (some known COVID and non-COVID images) and having some trouble. These are my questions to the community, if anybody can help:

  • Is it mandatory to filter the out-of-sample images to PA projections? I don't know whether that is important or whether it is supposed to work fine with AP projections too.

  • Is it necessary to transform the image to RGB? The README says the net expects a (224, 224, 3) array, but DICOM images are just grayscale. I'm trying the OpenCV conversion below, and I don't know whether this is the correct way to handle DICOM files (see the sketch below):
    "img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)"

Visual result control

Hello everyone,

I ran the inference.py script on some "healthy" X-ray images, but the result was "COVID-19". I would like to check what the network is really classifying.
I read in the accompanying paper that the GSInquire method was used for visual inspection of the results. Is the method available? Does anyone know how it works?
Or is there an alternative way to visualise the result?

Thanks,

Issue related to Confidence score in Inference.py

A recent commit to the inference.py file is missing "COVID-19" in the confidence printout; "Normal" is shown twice.

print('Confidence')
print('Normal: {:.3f}, Pneumonia: {:.3f}, Normal: {:.3f}'.format(pred[0][0], pred[0][1], pred[0][2]))
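A likely fix, assuming the class order (normal, pneumonia, COVID-19) used by the mapping elsewhere in the script, would be:

print('Confidence')
print('Normal: {:.3f}, Pneumonia: {:.3f}, COVID-19: {:.3f}'.format(pred[0][0], pred[0][1], pred[0][2]))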

Running the inference file is getting killed

I am trying to run inference on multiple input images (six of them) and getting the error below.
It works fine for 4 input images.

2020-04-14 07:43:21.737334: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-04-14 07:43:21.748542: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2300000000 Hz
2020-04-14 07:43:21.748968: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f2a7c638260 executing computations on platform Host. Devices:
2020-04-14 07:43:21.749010: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): ,
Killed

Example of how to use the pre-trained model

I've managed to load the provided model, but I'm not sure how to proceed from there to actually use it on an image.

In #2 @Vikramank mentioned a Flask app, so I assume it's possible to use the pre-trained models, but without more info or docs, I don't know how to do it.

EDIT: actually, digging a bit more into this, it seems that, without the actual Keras model, using the Tensorflow checkpoint is pretty hard :-? From keras-team/keras#5273 (comment):

Fundamentally, you cannot "turn an arbitrary TensorFlow checkpoint into a Keras model".

What you can do, however, is build an equivalent Keras model then load into this Keras model the weights contained in a TensorFlow checkpoint that corresponds to the saved model. In fact this is how the pre-trained InceptionV3 in Keras was obtained.

So, without more info, it seems pretty hard (or impossible?) to do :-?

Would it be possible to add a simple example script that (1) takes the path to an image as input, (2) loads the model, and (3) runs it and outputs the "COVID probability"?

Thanks!
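For anyone in the same situation, below is a rough sketch of loading the TensorFlow checkpoint and running a single forward pass, pieced together from the snippets elsewhere in these issues. The paths, checkpoint name, and especially the input/output tensor names are assumptions; the real tensor names can be looked up with graph.get_operations():

import os
import cv2
import numpy as np
import tensorflow as tf

weightspath = 'models/COVIDNet-CXR-Large'  # assumption: local path to the unzipped model folder
metaname = 'model.meta_eval'
ckptname = 'model-8485'                    # assumption: checkpoint prefix inside that folder

sess = tf.Session()
saver = tf.train.import_meta_graph(os.path.join(weightspath, metaname))
saver.restore(sess, os.path.join(weightspath, ckptname))
graph = tf.get_default_graph()

# Placeholder tensor names; inspect the graph to find the real ones
image_tensor = graph.get_tensor_by_name('input_1:0')
pred_tensor = graph.get_tensor_by_name('dense_3/Softmax:0')

img = cv2.imread('assets/ex-covid.jpeg')
img = cv2.resize(img, (224, 224)).astype('float32') / 255.0

pred = sess.run(pred_tensor, feed_dict={image_tensor: np.expand_dims(img, axis=0)})
print('normal: {:.3f}, pneumonia: {:.3f}, COVID-19: {:.3f}'.format(*pred[0]))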

At least two variables have the same name: conv1_conv/bias

Hi,
I am getting the error message below frequently:
"At least two variables have the same name: conv1_conv/bias"
It happens when I try to test a pneumonia image; I have also seen it with normal and COVID-19 images, but less frequently.

The TensorFlow versions I have:
tensorboard = 1.14.0
tensorflow = 1.14.0

model-8485 vs model-10 (epoch 10)

Why are the evaluation results different when loading model-8485 vs. model-10 (epoch 10)?
These are my results when running eval.py using COVIDNet-CXR-Large, test_COVIDx2.txt and model-10 (epoch 10)

[[93. 7. 0.]
[ 5. 93. 2.]
[ 2. 2. 27.]]
Sens Normal: 0.930, Pneumonia: 0.930, COVID-19: 0.871
PPV Normal: 0.930, Pneumonia 0.912, COVID-19: 0.931
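For reference, the sensitivity and PPV values follow directly from that confusion matrix, assuming rows are ground truth and columns are predictions, both ordered Normal, Pneumonia, COVID-19:

import numpy as np

cm = np.array([[93., 7., 0.],
               [5., 93., 2.],
               [2., 2., 27.]])

sensitivity = cm.diagonal() / cm.sum(axis=1)  # per-class recall, e.g. COVID-19: 27/31 = 0.871
ppv = cm.diagonal() / cm.sum(axis=0)          # per-class precision, e.g. COVID-19: 27/29 = 0.931
print(sensitivity, ppv)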

Information regarding GSInquire

Example chest radiography images of COVID-19 cases from 2 different patients and their associated critical factors (highlighted in red) as identified by GSInquire.

Can you kindly explain what GSInquire is and how it was instrumental in identifying the associated critical factors?

Covid-net model?

The paper mentions that the code for the COVID-Net architecture is available in this repo, but it is not present at the moment.

Failure to acknowledge 1 other open source lung scan based ConvNet that was published back on February 9, over a month before CovidNet

Just a heads up.

With reference to this line on main repository: "Motivated by this, a number of artificial intelligence (AI) systems based on deep learning have been proposed and results have been shown to be quite promising in terms of accuracy in detecting patients infected with COVID-19 using chest radiography images. However, to the best of the authors' knowledge, these developed AI systems have been closed source and unavailable to the research community for deeper understanding and extension, and unavailable for public access and use."

There has been another open source project since February 9 that pre-dated COVID-Net by over a month (a screenshot was attached to the original issue).

Emails were sent to one of the authors, namely Alexander, but the main COVID-Net GitHub repository has yet to acknowledge the much earlier repository, which would be easy and quick to do. Why hasn't this been done?

Missing Expected regions where xray analysis is expected to be viable

There are many things one may learn from your work here with COVID-Net, as it is robust work that will reasonably help to push/democratize AI in a positive direction.

One thing of note is that there are expected regimes in which X-ray/CT-based techniques, whether performed by a human or by AI, are expected to be viable. Unless I am mistaken, I did not see this addressed in the COVID-Net paper or the repository's README file.

I think a section similar to the one seen in issue 55 of that repository, after the title "Preliminary Conclusion", concerning expected constraints on testing/diagnosis, should be considered for COVID-Net.

A quick snippet was attached as an image to the original issue.

Missing requirements.txt

Hi @lindawangg,

Can you please add requirements.txt so we can install the exact library versions you tested with?

Run pip freeze > requirements.txt and then remove the libraries you aren't explicitly using here.
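For illustration only, a minimal requirements.txt sketch based on the library versions mentioned elsewhere in these issues (the versions actually tested by the authors may differ):

tensorflow-gpu==1.15.0
opencv-contrib-python==4.2.0.34
numpy
pandas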

Open Source Helps!

Thanks for your work to help the people in need! Your site has been added! I currently maintain the Open-Source-COVID-19 page, which collects all open source projects related to COVID-19, including maps, data, news, api, analysis, medical and supply information, etc. Please share to anyone who might need the information in the list, or will possibly contribute to some of those projects. You are also welcome to recommend more projects.

http://open-source-covid-19.weileizeng.com/

Cheers!

How to obtain the COVID-Netv2 model?

Hello my friends,

I'm trying to use train_tf to train from a pre-trained model, but I don't have the COVIDNetv2 file or model-2069. How can I get them?

Normal, Pneumonia and Covid-19 split

Dear Linda,

For training the model, is there a script available that can separate the classes ( Normal, Pneumonia and Covid-19) based on the train and test text files?

I've managed to build the train and test data sets but they aren't labeled at the moment.

Thanks,
Babs

Can't reconcile layer dimensions in chart from COVID_Netv2.pdf

Hi, First of all thanks for sharing your work.

  1. When will you release the training script?
  2. I can't seem to reconcile the layer dimensions in the PDF. The first layer gives the dimensions of the input images in parentheses, so I assume the numbers in parentheses are the dimensions of what is passed to the next layer. If so, how does a 7x7 convolutional layer output 112x112x64 from an input of 224x224x3? Assuming the stride and padding are integers, this doesn't seem to work with the formula for calculating the output size of a convolution unless you have unreasonably large padding (see the quick check after this list).

For the conv1x1 layers the dimension gets cut in half, suggesting the step size is 2. However with a 1x1 filter this means you're dropping half of the pixels. Is this correct?

Furthermore, the first flatten layer is said to have a flattened dimension of 100352, but that is what you would get from PEPX 4.3 alone. However, PEPX 4.2, PEPX 4.1, and the last conv1x1 on the right also feed into the flatten layer, and each of those has 100352 elements. So are these 4x100352 all flattened together, feeding a vector of 401408 elements into the first FC layer (as I would expect, since they all come from the same input image), or are you treating them separately?

  3. Could you please specify the PEPX layer dimensions?
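As a quick check on the 7x7 question in point 2 above: with the standard output-size formula, a stride of 2 and a padding of 3 (assumptions, since the paper does not spell them out) do give 112x112 from a 224x224 input:

def conv_output_size(w, kernel, stride, padding):
    # Standard formula: floor((W + 2P - K) / S) + 1
    return (w + 2 * padding - kernel) // stride + 1

print(conv_output_size(224, kernel=7, stride=2, padding=3))  # -> 112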

Code for the model

Hi,

Will you be publishing the code for the model anytime soon? Thanks.

Best,

Arijit.

Dataset RSNA Pneumonia Challenge

Hi to everyone,

I have a doubt about the RSNA Pneumonia Challenge dataset.
I'm going to download the dataset (4 GB) for pneumonia detection with my own NN model.
I was wondering whether the dataset is paid, or whether there is any constraint implied by the term "challenge".

Confirmation on the data splits and benchmark results with {train,test}_COVIDx2.txt

Hi Linda,

Thanks for providing nice guidelines for the COVIDx dataset and COVID-Net. I recently compiled the dataset using the guideline provided in https://github.com/lindawangg/COVID-Net/blob/master/docs/COVIDx.md. However, I noticed that the test class distribution is slightly different than the one presented in https://github.com/lindawangg/COVID-Net#results. I have used the https://github.com/lindawangg/COVID-Net/blob/master/create_COVIDx_v3.ipynb script, train_COVIDx2.txt and test_COVIDx2.txt files. For your reference, I observed the following data distribution:

COVID19 -> Train (223 images), Test (31 images)
Normal -> Train (7966 images), Test (885 images)
Pneumonia -> Train (5451 images), Test (594 images)

Kindly confirm whether the distribution is correct.

Furthermore, do you have any benchmark results with the above data distribution? The benchmark presented in https://github.com/lindawangg/COVID-Net#results uses fewer test samples. Which version of the data distribution do you recommend for comparison with COVID-Net? Kindly advise.

I look forward to your answers. Thank you.

Regards,
Saimun

DataLossError While Loading Model

Issue

DataLossError While Loading Model

Environment

  • Libraries
    opencv-contrib-python 4.2.0.34
    tensorflow-gpu 1.15.0
  • Docker Environment, Tensorflow-GPU Image
  • JupyterLab 2.0.1

Description

Hi, while trying to load the model for inference or evaluation in Jupyter, I always get this error:

DataLossError: Checksum does not match: stored 1497157360 vs. calculated on the restored bytes 2410561084
	 [[node save/RestoreV2 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]

My full notebook is shown below

import numpy as np
import os, argparse
import cv2
import tensorflow as tf

# Paths to the downloaded COVIDNet-CXR-Large checkpoint files
weightspath = '/tf/notebooks/mrc_is_here/COVID-Net/models/covid-net-large'
metaname = "model.meta_eval"
ckptname = "model-2069"
imagepath = 'assets/ex-covid.jpeg'

mapping = {'normal': 0, 'pneumonia': 1, 'COVID-19': 2}
inv_mapping = {0: 'normal', 1: 'pneumonia', 2: 'COVID-19'}

# Restore the graph definition from the meta file, then load the checkpoint weights
sess = tf.Session()
tf.get_default_graph()
saver = tf.train.import_meta_graph(os.path.join(weightspath, metaname))
saver.restore(sess, os.path.join(weightspath, ckptname))

My folder structure is:

COVID-NET

  • inference.ipynb
  • models
    • covid-net-large
      • checkpoint
      • model-2069.data-00000-of-00001
      • model-2069.index
      • mode.meta_eval
      • mode.meta_train
    • covid-net-small
      • checkpoint
      • model-6207.data-00000-of-00001
      • model-6207.index
      • mode.meta_eval
      • mode.meta_train

Steps to Reproduce

  1. Create a model folder as shown above
  2. Copy all the code to a new jupyter notebook
  3. Run step by step

Expected behavior

Restore model from meta and checkpoint files

Actual behavior

Throw exception DataLossError

Dataset distribution

There seems to be some discordance between the number of training and testing samples in the RSNA pneumonia dataset and the dataset distribution mentioned on the GitHub page. This is probably because of the multiple rows corresponding to the same patient ID in the RSNA dataset's CSV file, since it was originally posed as a detection task. Please verify.
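As an illustration of the suspected cause, the RSNA challenge CSV can contain several bounding-box rows for the same patient, so counting rows overstates the number of images. A quick check (the file and column names are assumptions based on the RSNA Kaggle challenge):

import pandas as pd

labels = pd.read_csv('stage_2_train_labels.csv')  # RSNA challenge labels file (name assumed)
print(len(labels))                                # raw row count, inflated by multiple boxes per patient
print(labels['patientId'].nunique())              # unique patients, which is what COVIDx should count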

Duplicate filenames in train/test text files as per `create_COVIDx_v2.ipynb` and a way to resolve the issue

Hi. First of all, thank you so much for sharing the data and the network. Although you have removed the duplicates from test_COVIDx.txt, as far as I can tell there are still some duplicate filenames in the train_COVIDx.txt file. Please consider adding the following function at the end of your create_COVIDx_v2.ipynb notebook. It resolves the duplicate issue and sorts all of the images in the train/test data into their respective subfolders (i.e., Normal, Pneumonia and COVID-19), as follows.

1- train
       |----- normal     (7,966 images)
       |----- pneumonia  (5,442 images)
       |----- COVID-19      (92 images)
2- test
       |----- normal       (885 images)
       |----- pneumonia    (594 images)
       |----- COVID-19      (10 images)

The function is

import os
import pandas as pd
import shutil
from tqdm import tqdm_notebook as tqdm


def ArrangeData_LabelNamedFolders(file_path, folder_path, dest_folder_path, indicator):
    print('{} Operation'.format(indicator))
    df = pd.read_csv(file_path, sep=' ', names=['patientid', 'filename', 'label'])
    df = df.drop_duplicates(subset='filename', keep="first")
    labelFolders = df.label.unique()
    print(labelFolders)
    for labelFolder in labelFolders:
        if not os.path.exists(dest_folder_path+'/'+labelFolder):
            os.makedirs(dest_folder_path+'/'+labelFolder)
    imageNames = sorted(os.listdir(folder_path))
    for imageName in tqdm(imageNames):
        temp_df = df.loc[df['filename']== imageName]
        class_ = temp_df['label'].values.item()
        src = folder_path +'/' + imageName
        dest = dest_folder_path + '/' + str(class_) + '/' + imageName 
        shutil.copy(src, dest)
        
        
    
train_file = 'train_split_v2.txt'
train_folder = './data/train'
dest_train_folder = './categorize data/train'

test_file = 'test_split_v2.txt'
test_folder = './data/test'
dest_test_folder = './categorize data/test'


ArrangeData_LabelNamedFolders(test_file, test_folder, dest_test_folder, indicator='Test')
ArrangeData_LabelNamedFolders(train_file, train_folder, dest_train_folder, indicator='Train')

How to train network from scratch?

Hello,

I want to train your network from scratch, not from the pre-trained "Small" and "Large" models.

Could you please describe how to do that, since I want to compare the effects of data augmentation vs. dropout vs. no augmentation & no dropout?

Thank you!

Not able to obtain the same sensitivity values (96.8) when trained on COVIDNet-CXR-Large model

I am not getting the same sensitivity values when training for 15 more epochs; the sensitivity values are not even retained after the 1st epoch!
I have used the COVIDNet-CXR-Large model, with train_COVIDx2.txt and test_COVIDx2.txt as the dataset files.
The paper mentions a learning rate policy that reduces the learning rate when learning stagnates for a period of time, with factor and patience values of 0.7 and 5 respectively. However, I did not come across any line in the code that implements this.
I have also tried training the model on the same dataset for another 30 epochs with different learning rates (2e-07 & 2e-08). The sensitivity kept dropping.
Am I missing something?
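For reference, that kind of policy corresponds to Keras's ReduceLROnPlateau callback. Below is a hedged sketch using the factor and patience values quoted from the paper; whether the authors actually used this callback, and which metric they monitored, is an assumption:

from tensorflow.keras.callbacks import ReduceLROnPlateau

# Reduce the learning rate by a factor of 0.7 if the monitored metric stagnates for 5 epochs
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.7, patience=5, verbose=1)

# model.fit(..., callbacks=[reduce_lr])  # passed alongside any other callbacks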

Missing RSNA images

RSNA has ~26k images, but COVIDx has ~13k samples.
Could you include some explanation of how the samples were chosen?

Missing images from Figure1 dataset

Thanks for the amazing work!

I see that in COVIDx2 you used only 3 images from the Figure1 collection. Is there a reason for that, or was it just timing? Do you know whether there are overlaps between Figure1 and ieee8023/covid-chestxray-dataset?

Missing checkpoint file

Could you please include the checkpoint file, which is missing from your pretrained model (COVID-Netv1.zip)?

On using a train saver, the following files might have been generated at your end:

  • model.data-00000-of-00001
  • model.meta
  • model.index
  • checkpoint

These 4 files would be required when importing the meta graph and restoring the latest checkpoint.

Thanks!

Confusion Matrix and Results Should Be Updated Or Clarified

It looks like additional training and test examples were added, but the Confusion Matrix and Results have not been updated to reflect this. I recommend either updating the results or, if the results are not available yet (possibly still training the new model?), adding a quick note so that there isn't confusion about the Confusion Matrix, which still shows only 8 ground-truth COVID-19 samples. Since there are two false positives in the Confusion Matrix, a reader could assume that the results were miscalculated with false negatives counted as false positives, which would swap the precision and recall.

Migrate to tensorflow 2

Can you please make your code compatible with TensorFlow 2.0+ by default?

Pretty easy to do with no code changes, see:
https://www.tensorflow.org/guide/migrate

Just add this wherever you previously imported tensorflow:

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

And update your requirements.txt per #45

Script for loading the model

Sir, can you provide a simple piece of code to load the model? I have tried many methods and all of them show errors. Kindly also mention the TF versions. The last error I got was:
module 'tensorflow._api.v2.train' has no attribute 'import_meta_graph'
The code was as follows:
import tensorflow as tf

new_graph = tf.Graph()
with tf.compat.v1.Session(graph=new_graph) as sess:
    saver = tf.train.import_meta_graph('/content/models2/model.meta')
    saver.restore(sess, "/content/models2/model")

I have also tried tf.compat.v1.Session and tf.Session, but was unable to load the model.
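A sketch that may work under TensorFlow 2.x, combining the compat.v1 suggestions from the migration issue above (the checkpoint path and meta filename are taken from the snippet above and are assumptions about the local setup):

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()  # run the TF1-style graph/session API under TF2

sess = tf.Session()
# Under compat.v1, tf.train.import_meta_graph is available again
saver = tf.train.import_meta_graph('/content/models2/model.meta')
saver.restore(sess, '/content/models2/model')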

inference on cpu vs gpu

What's the recommended environment for inference? CPU or GPU?

I've tested this on Google Colab

python inference.py --weightspath ./ --metaname model.meta_eval --ckptname model-6207 --imagepath assets/ex-covid.jpeg

With CPU it takes ~ 4.9s

With GPU it takes ~ 6.5s

Incorrect reference in white paper

There is an error in the last reference of the white paper: the Kaggle dataset link points to Cohen's dataset on GitHub.

