figma_rcnn's Introduction

Figma RCNN (Fine-grained Multi-Attribute RCNN)

Person Detection and Multi-Attribute Recognition with a Single Jointly Trained Holistic CNN Model
The master branch works with tensorpack 0.9.
It is part of the Figma RCNN project developed by Junlin Gu, a graduate student at UESTC.

Main References

Dependencies

  • Python 3.3+; OpenCV
  • TensorFlow ≥ 1.6
  • Tensorpack ≥ 0.9
  • pycocotools: pip install 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
  • Pre-trained ImageNet ResNet model from the tensorpack model zoo
  • COCO and Wider Attribute data, with the following directory structure:
COCO/DIR/
  annotations/
    instances_train201?.json
    instances_val201?.json
  train201?/
    COCO_train201?_*.jpg
  val201?/
    COCO_val201?_*.jpg
wider attributes/
  Anno/
    wider_attribute_trainval.json
    wider_attribute_test.json
  train/
    0--Parade/
      0_Parade_marchingband_?.jpg
    1--Handshaking/
      1_Handshaking_Handshaking_?.jpg
    ...
  test/
    0--Parade/
      0_Parade_marchingband_?.jpg
    1--Handshaking/
      1_Handshaking_Handshaking_?.jpg
    ...
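
Before training, it can help to confirm that both datasets match the layout above. The following is a minimal sketch, not part of the project: the helper `missing_patterns` and the two pattern lists are hypothetical names, and the patterns are simply transcribed from the directory trees shown here.

```python
from pathlib import Path

# Hypothetical helper, not part of figma_rcnn. The patterns are copied from
# the directory trees above: "?" matches one character, "*" matches any run.
COCO_PATTERNS = [
    "annotations/instances_train201?.json",
    "annotations/instances_val201?.json",
    "train201?/COCO_train201?_*.jpg",
    "val201?/COCO_val201?_*.jpg",
]
WIDER_PATTERNS = [
    "Anno/wider_attribute_trainval.json",
    "Anno/wider_attribute_test.json",
    "train/*/*.jpg",
    "test/*/*.jpg",
]


def missing_patterns(root, patterns):
    """Return the patterns that match no file under `root`."""
    root = Path(root)
    return [p for p in patterns if not any(root.glob(p))]
```

Calling `missing_patterns("/path/to/COCO/DIR", COCO_PATTERNS)` returns an empty list when every expected file group is present, and the unmatched patterns otherwise.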

Usage

Installation

Set up the Docker environment on an Ubuntu host (recommended)
For instructions on downloading the image file and installing the Docker environment, see the author's blog.

Train:

To train on a single machine:

./attr_train.py --config \
    BACKBONE.WEIGHTS=/path/to/COCO-R50C4-MaskRCNN-Standard.npz

Options can be changed either on the command line or in the tensorpack_config.py file (recommended).
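
To illustrate how a dotted override such as BACKBONE.WEIGHTS=/path/to/file maps onto a nested configuration, here is a minimal sketch. It is not the project's or tensorpack's own parsing code; `apply_overrides` is a hypothetical function operating on a plain dict, and the value parsing via `ast.literal_eval` is an assumption made for the example.

```python
import ast


def apply_overrides(config, overrides):
    """Apply 'A.B=value' strings to a nested dict in place and return it.

    Illustrative only: a stand-in for whatever the real config object does.
    """
    for item in overrides:
        dotted_key, _, raw = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = config
        for name in parents:
            # Walk (or create) the intermediate sections, e.g. "TRAIN".
            node = node.setdefault(name, {})
        try:
            value = ast.literal_eval(raw)  # numbers, lists, tuples, ...
        except (ValueError, SyntaxError):
            value = raw  # not a Python literal, keep it as a string (e.g. a path)
        node[leaf] = value
    return config


cfg = apply_overrides({}, [
    "BACKBONE.WEIGHTS=/path/to/COCO-R50C4-MaskRCNN-Standard.npz",
    "TRAIN.LR_SCHEDULE=[120000, 240000, 280000]",
])
```

In this sketch, `cfg["TRAIN"]["LR_SCHEDULE"]` ends up as a list of three integers, while the weights path stays a string.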

Inference:

To predict on an image (needs DISPLAY to show the outputs):

./demo_cam.py \
    --image /path/to/input.jpg \
    --cam 0 \
    --obj_model all-in-one \
    --obj_ckpt /root/to/checkpoint \
    --obj_config DATA.BASEDIR=/path/to/COCO/DIR

The trained models will be available for download from Baidu Cloud (upload pending).

Results

The detection branch of the models is trained on COCO trainval35k and evaluated on COCO minival2014 using mAP@IoU=0.50:0.95. The attribute branch is trained on Wider Attribute trainval and evaluated on Wider Attribute test using mAP. The models are fine-tuned from the pre-trained ResNet R50-C4 models in the tensorpack model zoo.
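
For readers unfamiliar with the detection metric: mAP@IoU=0.50:0.95 is the COCO convention of averaging AP over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below shows only the IoU part and the threshold list. It assumes boxes in (x1, y1, x2, y2) corner format, and `box_iou` is a hypothetical helper; the actual evaluation in this project is presumably done through pycocotools.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Width and height of the overlap, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# The ten thresholds behind "IoU=0.50:0.95": 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]
```

A detection counts as a match at a given threshold only if its IoU with a ground-truth box is at least that threshold, so the averaged score rewards tighter localization than AP at IoU=0.50 alone.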

Person detection performance can be approximately reproduced.

| Backbone | mAP (box) | Detectron mAP (box) | Configurations                             |
| R50-C4   | 33.3      | 32.8                | TRAIN.LR_SCHEDULE=[120000, 240000, 280000] |

Person attribute recognition performance can be approximately reproduced.

|    Attributes     | AP (positive/negative)     |
|        male       |        0.9503              |
|      longhair     |        0.8598              |
|      sunglass     |        0.7342              |
|        hat        |        0.9477              |
|       tshirt      |        0.7892              |
|     longsleeve    |        0.9602              |
|       formal      |        0.8084              |
|       shorts      |        0.9105              |
|       jeans       |        0.7461              |
|     longpants     |        0.9711              |
|       skirt       |        0.8454              |
|     facemask      |        0.7372              |
|       logo        |        0.8912              |
|      stripe       |        0.6159              |
|        mAP        |        0.8405              |
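
As a quick arithmetic check, the mAP row is consistent with the unweighted mean of the 14 per-attribute AP values. The snippet below is only a worked example using the numbers from the table; the variable names are illustrative.

```python
# Per-attribute AP values copied from the table above.
ATTRIBUTE_AP = {
    "male": 0.9503, "longhair": 0.8598, "sunglass": 0.7342, "hat": 0.9477,
    "tshirt": 0.7892, "longsleeve": 0.9602, "formal": 0.8084, "shorts": 0.9105,
    "jeans": 0.7461, "longpants": 0.9711, "skirt": 0.8454, "facemask": 0.7372,
    "logo": 0.8912, "stripe": 0.6159,
}

# Unweighted mean over the 14 attributes.
mean_ap = sum(ATTRIBUTE_AP.values()) / len(ATTRIBUTE_AP)
print(f"{mean_ap:.4f}")  # prints 0.8405
```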

Some examples

Here are some visualization results of the figma rcnn model.
