
pytorch-object-detection-faster-rcnn-tutorial's Introduction

PyTorch Faster-RCNN Tutorial

Learn how to start an object detection deep learning project using PyTorch and the Faster-RCNN architecture in this beginner-friendly tutorial. Based on the blog series Train your own object detector with Faster-RCNN & PyTorch by Johannes Schmidt.


Summary

You can train the model using the training script.

In addition, I provide Jupyter notebooks for various tasks such as creating & exploring datasets, running inference, and visualizing anchor boxes.

Installation

After cloning the repository, follow these steps to install the dependencies in a new environment and start a jupyter server:

  1. Set up & activate a new environment with an environment manager (recommended):

    1. poetry:
      1. poetry env use python3.10
      2. source .venv/bin/activate
    2. venv:
      1. python3 -m venv .venv
      2. source .venv/bin/activate
    3. conda:
      1. conda create --name faster-rcnn-tutorial -y
      2. conda activate faster-rcnn-tutorial
      3. conda install python=3.10 -y
  2. Install the libraries with pip or poetry:

    1. poetry:
      1. poetry install (poetry.lock)
    2. pip (including conda):
      1. pip install -r requirements.txt (requirements.txt)
  3. Start a jupyter server:

    1. jupyter-notebook (not jupyter-lab, because of a dependency issue with neptune-client<1.0.0)

Note: This will install the CPU version of torch. If you want to use a GPU or TPU, please refer to the instructions on the PyTorch website. To check whether PyTorch can use the NVIDIA GPU, verify that torch.cuda.is_available() returns True in a Python shell.
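
For example, in a Python shell:

import torch

# Prints True if PyTorch can see a CUDA-capable GPU, otherwise False
print(torch.cuda.is_available())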

Windows users: If you cannot start jupyter-lab or jupyter-notebook on Windows because of ImportError: DLL load failed while importing win32api, try running conda install pywin32 with the conda package manager.

Dependencies

These are the libraries that are used in this project:

  • High-level deep learning library for PyTorch: PyTorch Lightning
  • Visualization software: Custom code with the image-viewer Napari
  • [OPTIONAL] Experiment tracking software/logging module: Neptune

If you want to use Neptune for your own experiments, add the API key to the NEPTUNE variable in the .env file.
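
A minimal sketch of how the key can then be read in Python, assuming the .env file is loaded with python-dotenv (the variable name NEPTUNE comes from the note above; the token value is a placeholder you provide):

import os

from dotenv import load_dotenv

# Read the variables defined in the .env file into the process environment
load_dotenv()

# The Neptune API key stored in the NEPTUNE variable
neptune_api_key = os.environ["NEPTUNE"]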

Please make sure that you meet these requirements.

Dataset

The dataset consists of 20 selfie images randomly selected from the internet.

Faster-RCNN model

Most of the model's code is based on PyTorch's Faster-RCNN implementation. Metrics can be computed based on the PASCAL VOC (Visual Object Classes) evaluator in the metrics section.

Anchor Sizes/Aspect Ratios

Anchor sizes/aspect ratios are very important for training a Faster-RCNN model (and also for similar models such as SSD or YOLO). These "default" boxes are compared to those output by the network, so choosing adequate sizes/ratios can be critical for the success of a project. The PyTorch implementation of the AnchorGenerator (and also the helper classes here) generally expects the following format:

  • anchor_size: Tuple[Tuple[int, ...], ...]
  • aspect_ratios: Tuple[Tuple[float, ...], ...]

Without FPN

The ResNet backbone without the FPN always returns a single feature map that is used to create anchor boxes. Because of this, we must create a Tuple that contains a single Tuple, e.g. ((32, 64, 128, 256, 512),) or ((32, 64),).
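
A minimal sketch of such a single-feature-map configuration with torchvision's AnchorGenerator (the aspect ratios are illustrative):

from torchvision.models.detection.anchor_utils import AnchorGenerator

# One inner tuple -> one feature map (backbone without FPN)
anchor_generator = AnchorGenerator(
    sizes=((32, 64, 128, 256, 512),),
    aspect_ratios=((0.5, 1.0, 2.0),),  # must contain one tuple per feature map
)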

With FPN

With the FPN we can use 4 feature maps (output from a ResNet + FPN) and map our anchor sizes to the feature maps. Because of this, we must create a Tuple that contains exactly 4 Tuples, e.g. ((32,), (64,), (128,), (256,)) or ((8, 16, 32), (32, 64), (32, 64, 128, 256, 512), (200, 300)).
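
A minimal sketch of the FPN case, again with torchvision's AnchorGenerator (the sizes and ratios are illustrative):

from torchvision.models.detection.anchor_utils import AnchorGenerator

# Four inner tuples -> four feature maps (ResNet + FPN)
anchor_generator = AnchorGenerator(
    sizes=((32,), (64,), (128,), (256,)),
    aspect_ratios=((0.5, 1.0, 2.0),) * 4,  # one tuple of ratios per feature map
)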

Examples

Examples of how to create a Faster-RCNN model with a pretrained ResNet backbone (ImageNet) are provided in the tests section. Pay special attention to the test function test_get_faster_rcnn_resnet in test_faster_RCNN.py. Recommendation: run the test in debugger mode.
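
As a rough sketch of what such a construction can look like when using torchvision directly (this is not the repository's get_faster_rcnn_resnet helper; the backbone name and num_classes are illustrative):

from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.backbone_utils import resnet_fpn_backbone

# ResNet-50 backbone with FPN, pretrained on ImageNet
# (newer torchvision versions use weights=... instead of pretrained=True)
backbone = resnet_fpn_backbone("resnet50", pretrained=True)

# num_classes includes the background class, e.g. 2 for a single object class
model = FasterRCNN(backbone=backbone, num_classes=2)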

Notes

  • Sliders in the inference script do not work right now due to dependency updates.
  • Please note that the library "neptune-client" is deprecated, but the migration to "neptune" has not been completed yet. Therefore, "neptune-client" is still used in this project.

pytorch-object-detection-faster-rcnn-tutorial's People

Contributors

johschmidt42, schmiddi-75


pytorch-object-detection-faster-rcnn-tutorial's Issues

ValueError: Could not find a format to read the specified file in single-image mode

I tried to implement Faster-RCNN using PyTorch, but I'm consistently getting the same error for all the viewers (dataset viewer / anchor viewer). I tried to add plugin="imread" to the read_images function, but the error still remains.


ValueError Traceback (most recent call last)
/var/folders/_n/drkfc7xj0_b5jpq4c8c19f240000gn/T/ipykernel_3698/1313361764.py in <module>
1 from pytorch_faster_rcnn_tutorial.visual import AnchorViewer
----> 2 image = dataset[0]['x']
3 feature_map_size = (512, 32, 32)
4 anchorviewer = AnchorViewer(image=image,
5 rcnn_transform=transform,

~/Documents/PhD Data/Machine learning/September 2021/PyTorch-Object-Detection-Faster-RCNN-Tutorial/pytorch_faster_rcnn_tutorial/datasets.py in __getitem__(self, index)
57
58 # Load input and target
---> 59 x, y = self.read_images(input_ID, target_ID)
60
61 # From RGBA to RGB

~/Documents/PhD Data/Machine learning/September 2021/PyTorch-Object-Detection-Faster-RCNN-Tutorial/pytorch_faster_rcnn_tutorial/datasets.py in read_images(inp, tar)
128 @staticmethod
129 def read_images(inp, tar):
--> 130 return imread(inp), read_json(tar)
131
132

~/opt/anaconda3/envs/napari-env/lib/python3.9/site-packages/skimage/io/_io.py in imread(fname, as_gray, plugin, **plugin_args)
46
47 with file_or_url_context(fname) as fname:
---> 48 img = call_plugin('imread', fname, plugin=plugin, **plugin_args)
49
50 if not hasattr(img, 'ndim'):

~/opt/anaconda3/envs/napari-env/lib/python3.9/site-packages/skimage/io/manage_plugins.py in call_plugin(kind, *args, **kwargs)
205 (plugin, kind))
206
--> 207 return func(*args, **kwargs)
208
209

~/opt/anaconda3/envs/napari-env/lib/python3.9/site-packages/skimage/io/_plugins/imageio_plugin.py in imread(*args, **kwargs)
8 @wraps(imageio_imread)
9 def imread(*args, **kwargs):
---> 10 return np.asarray(imageio_imread(*args, **kwargs))

~/opt/anaconda3/envs/napari-env/lib/python3.9/site-packages/imageio/core/functions.py in imread(uri, format, **kwargs)
263
264 # Get reader and read first
--> 265 reader = read(uri, format, "i", **kwargs)
266 with reader:
267 return reader.get_data(0)

~/opt/anaconda3/envs/napari-env/lib/python3.9/site-packages/imageio/core/functions.py in get_reader(uri, format, mode, **kwargs)
179 if format is None:
180 modename = MODENAMES.get(mode, mode)
--> 181 raise ValueError(
182 "Could not find a format to read the specified file in %s mode" % modename
183 )

ValueError: Could not find a format to read the specified file in single-image mode

Training error: invalid literal for int() with base 10: ''

Hi John,

I'm using your code for object detection of damage on wind turbine blades; thank you very much for the amazing explanation and comprehensive code. But as I was going through the training stage, I got this error:

transformations.py:68, in map_class_to_int(labels, mapping)
     65 for key, value in mapping.items():
     66     dummy[labels == key] = value
---> 68 return dummy.astype(np.uint8)

ValueError: invalid literal for int() with base 10: ''

The code I am trying to run is:

# start training
trainer.fit(
    model=task, train_dataloader=dataloader_train, val_dataloaders=dataloader_valid
)

I saw a comment mentioning the same error under your post on medium.com (where I found your code), but when I looked through the resolved issues in this repository I couldn't find anything similar.

Could you please help with this one? If by any chance I missed it in the issues section please do let me know.

Kind regards,
Tetiana Buzykina

KeyError: 'name'

(screenshot of the error attached)

It seems that a learning rate scheduler is not found? I don't understand the issue.

how do I monitor accuracy with the logger ?

Hello, first of all thank you for the article on Medium. I wanted to make a custom object detector using the tutorial you provided; however, I saw that in the Medium article the script did not monitor accuracy, (TP+TN)/(TP+TN+FP+FN). How do I add that metric to the logger? Any help would be appreciated.

Error when using FPN

Hi John,

Thanks for your amazing tutorial! I have run some experiments using resnet34 and resnet18, and the performance looks good. I'm interested in trying FPN, but I get this error:

ValueError: Anchors should be Tuple[Tuple[int]] because each feature map could potentially have different sizes and aspect ratios. There needs to be a match between the number of feature maps passed and the number of sizes / aspect ratios

Is there something I need to set up?

Thanks for looking into this

Error Using FPN Backbone

assert len(grid_sizes) == len(strides) == len(cell_anchors)
This error comes when I use an FPN backbone with the model.

Training - IndexError: Target 1 is out of bounds.

Hi John,

Thank you very much for your detailed and comprehensive code and description on medium.com. I'm using it for damage detection on wind turbine blades and during the training stage I encountered the following issue:

File ~\Documents\***\***\venv\lib\site-packages\torch\nn\functional.py:2846, in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction, label_smoothing)
   2844 if size_average is not None or reduce is not None:
   2845     reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 2846 return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)

IndexError: Target 1 is out of bounds.

I was running the following training code chunk in a Jupyter Notebook:

trainer.fit(
    model=task, train_dataloader=dataloader_train, val_dataloaders=dataloader_valid
)


Could you please suggest a possible solution or do you perhaps know the reason why this error could occur?

Kind regards,
Tetiana Buzykina

error: UnpicklingError: invalid load key, '{'.

I get this error when trying to load the annotation file of my first image. I was using the annotator.export function to save the annotation file to my image path. Any ideas how to fix this? (screenshot of the error attached)

This is the JSON file of that pickled image (screenshot attached).

Only AP > 0 for one class out of 36

Dear John,
Thanks for the very clear and nice tutorial. I am currently trying to implement Faster-RCNN on ActionGenome using your implementation. This dataset has 36 classes. I managed to get the model to train by setting the right parameters in training_script.py; however, I'm only getting an AP > 0 for one class, the "person" class (around 0.55). The dataset is a bit unbalanced towards that class, but not severely. I've checked the ground-truth annotations and how they look before being passed to the model, and everything seems to be fine. Since your tutorial was just for one class, I was wondering if you ever experienced similar issues. If yes, how did you solve them? Thanks!

Napari

Is the Napari notebook supposed to work on Colab? It doesn't for me, and I read in a few places that Napari won't work in Colab.
