
drmmz / retinanet


RetinaNet for Object Detection in TensorFlow2 and Applications

License: MIT License

Python 14.30% Jupyter Notebook 85.70%
retinanet object-detection focal-loss nuclei-detection sku-110k tensorflow widerface-dataset global-wheat-detection


RetinaNet for Object Detection

RetinaNet is an efficient one-stage object detector trained with the focal loss. This repository is a TensorFlow2 implementation of RetinaNet and its applications, intended as an object detection tool that can be easily extended to other datasets or used as a building block in other projects. It includes:

  1. source code of RetinaNet and its configuration (multi-GPU training and detection);
  2. source code of a data generator (producing RetinaNet's inputs) that uses multiple CPU cores;
  3. source code of utilities such as image/mask preprocessing, augmentation, the average precision (AP) metric, visualization and so on;
  4. Jupyter notebook demonstrations of RetinaNet in training and real-time detection on several datasets.
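
For readers unfamiliar with the focal loss mentioned above, here is a minimal sketch of the binary form from Lin et al., FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). It is written in plain Python for illustration only and is not this repository's code; the function name `focal_loss` and the clipping constant `eps` are assumptions, while alpha = 0.25 and gamma = 2 are the defaults reported in the paper.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss for one predicted probability p and a label y in {0, 1}.

    Hypothetical helper for illustration; not taken from this repository.
    """
    # Clip the probability so that log() never receives 0.
    p = min(max(p, eps), 1.0 - eps)
    # p_t is the probability the model assigns to the true class.
    p_t = p if y == 1 else 1.0 - p
    # alpha_t balances positive and negative examples.
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma down-weights examples that are already well classified.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    # An easy positive (p = 0.9) contributes far less than a hard one (p = 0.1).
    print(focal_loss(0.9, 1), focal_loss(0.1, 1))
```

With gamma = 0 the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy, which is a quick way to sanity-check an implementation.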

Updates

  • soon/2022: An update will clean up the code and provide a tutorial on how to generate a customized dataset and train on it.
  • 10/2/2021: Solved an out-of-memory (OOM) problem during inference by fixing resnet_fpn.compute_fmap().

Applications

The following are example detections.

  • The Global Wheat Challenge 2021 is a wheat head detection and counting challenge. Using this implementation, trained only on the given training set, we achieve the following result (evaluated on the test set used for competition submission):

GPU                      image size   detection time (seconds per image)   evaluation metric (ADA)
GeForce RTX 2070 SUPER   1024x1024    0.11                                 0.478

where ADA is the Average Domain Accuracy, as defined for the challenge.

  • Human face detection in video:
bourne_detect_540.mp4

Scenes are taken from The Bourne Ultimatum (2007 film) and the cover image is from The Bourne Identity (2002 film). The model was trained on the WIDER FACE dataset.

Moreover, it can be used to recognize Jason Bourne. See the next video and ProtoNet for Few-Shot Learning in TensorFlow2 and Applications for details.

bourne_540.mp4
  • My own dataset, empty returns operations (ERO-CA), is a collection of images, each containing empty beer, wine and liquor cans or bottles in densely packed scenes that can be returned for refunds in Canada. The goal is to count the number of returns quickly and accurately instead of having a person check them manually (especially for people like me who are bad at counting). The dataset (as of July 15, 2021) consists of 47 labeled cellphone images of cans in a variety of positions. If you are interested in contributing to this dataset or project, please email me.

  • The SKU-110K dataset, which focuses on detection in densely packed scenes. The ERO-CA detection above used transfer learning from SKU-110K.

  • The nuclei dataset, identifying the cells’ nuclei.
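
The ADA figure reported for the Global Wheat Challenge above can be illustrated with a short sketch. It rests on assumptions that should be checked against the challenge's own definition: per-image accuracy is taken as TP / (TP + FP + FN), accuracies are averaged within each domain, and ADA is the mean of those domain averages. The function names, the record layout, and the handling of images with no boxes at all are hypothetical; matching predictions to ground truth (for example by an IoU threshold) is assumed to have happened already.

```python
from collections import defaultdict


def image_accuracy(tp, fp, fn):
    """Accuracy for one image from its true positives, false positives and false negatives."""
    total = tp + fp + fn
    # Assumption: an image with no ground-truth boxes and no predictions counts as perfect.
    return 1.0 if total == 0 else tp / total


def average_domain_accuracy(records):
    """records: iterable of (domain, tp, fp, fn) tuples, one per image."""
    per_domain = defaultdict(list)
    for domain, tp, fp, fn in records:
        per_domain[domain].append(image_accuracy(tp, fp, fn))
    # Average within each domain first, then across domains, so that
    # domains with many images do not dominate the score.
    domain_means = [sum(accs) / len(accs) for accs in per_domain.values()]
    return sum(domain_means) / len(domain_means)


if __name__ == "__main__":
    # Made-up counts for three images from two domains, for illustration only.
    example = [("domain_a", 8, 1, 1), ("domain_a", 5, 5, 0), ("domain_b", 3, 0, 1)]
    print(average_domain_accuracy(example))
```

Averaging per domain before averaging overall is the design choice that distinguishes this metric from a plain per-image mean.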

Requirements

python 3.7.9, tensorflow 2.3.1, matplotlib 3.3.4, numpy 1.19.2, opencv 4.5.1, scipy 1.6.0, scikit-image 0.17.2 and tensorflow-addons 0.13.0

References

  1. Lin et al., Focal Loss for Dense Object Detection, https://arxiv.org/abs/1708.02002, 2018
  2. Mask R-CNN for Object Detection and Segmentation, https://github.com/matterport/Mask_RCNN, 2018


