isabella232 / face-mask-detection

This project forked from nvidia-ai-iot/face-mask-detection

Face Mask Detection using NVIDIA Transfer Learning Toolkit (TLT) and DeepStream for COVID-19

License: MIT License

Python 58.65% Jupyter Notebook 41.35%

face_mask_detection

NVIDIA Developer Blog

This project is a tutorial for NVIDIA's Transfer Learning Toolkit (TLT) and DeepStream (DS) SDK. It covers the training and inference flow for detecting faces with and without a mask on the Jetson platform.

By the end of this project, you will be able to build a DeepStream app on the Jetson platform that detects faces with and without a mask.

What this project includes

  • Transfer Learning Toolkit (TLT) scripts:
    • Dataset processing script that converts the datasets to KITTI format
    • Specification files for configuring tlt-train, tlt-prune, tlt-evaluate
  • DeepStream (DS) scripts:
    • deepstream-app config files (For demo on single stream camera and detection on stored video file)

What this project does not provide

  • A trained model for face-mask detection; we go step by step through producing a detectnet_v2 model (with a ResNet18 backbone) for face-mask detection.
  • An NVIDIA-specific dataset of faces with and without masks; we suggest the following datasets based on our experiments.

Preferred Datasets

Note: We do not use all the images from MAFA and WiderFace. Across the combined datasets we use about 6000 faces with a mask and about 6000 without.

Steps to perform Face Detection with Mask:

  • Install dependencies and Docker Container

    • On Training Machine with NVIDIA GPU:
      • Install the NVIDIA Docker container runtime: see the installation instructions and the TLT Toolkit Requirements
      • Running Transfer Learning Toolkit using Docker
        • Pull docker container:
          docker pull nvcr.io/nvidia/tlt-streamanalytics:v2.0_py3
        • Run the docker image:
          docker run --gpus all -it -v "/path/to/dir/on/host":"/path/to/dir/in/docker" \
                        -p 8888:8888 nvcr.io/nvidia/tlt-streamanalytics:v2.0_py3 /bin/bash
          
      • Clone Git repo in TLT container:
        git clone https://github.com/NVIDIA-AI-IOT/face-mask-detection.git
        
      • Install data conversion dependencies
        cd face-mask-detection
        python3 -m pip install -r requirements.txt
        
    • On NVIDIA Jetson:
  • Prepare input data set (On training machine)

    • We expect downloaded data in this structure.

    • Convert data set to KITTI format

      cd face-mask-detection
      python3 data2kitti.py --kaggle-dataset-path <kaggle dataset absolute directory path> \
                               --mafa-dataset-path <mafa dataset absolute  directory path> \
                               --fddb-dataset-path < FDDB dataset absolute  directory path> \
                               --widerface-dataset-path <widerface dataset absolute  directory path> \
                               --kitti-base-path < Out directory for storing KITTI formatted annotations > \
                               --category-limit < Category Limit for Masked and No-Mask Faces > \
                               --tlt-input-dims_width < tlt input width > \
                               --tlt-input-dims_height <tlt input height > \
                               --train < for generating training dataset >
      

      You will see following output log:

        Kaggle Dataset: Total Mask faces: 4154 and No-Mask faces:790
        Total Mask Labelled:4154 and No-Mask Labelled:790
      
        MAFA Dataset: Total Mask faces: 1846 and No-Mask faces:232
        Total Mask Labelled:6000 and No-Mask Labelled:1022
      
        FDDB Dataset: Mask Labelled:0 and No-Mask Labelled:2845
        Total Mask Labelled:6000 and No-Mask Labelled:3867
      
        WideFace: Total Mask Labelled:0 and No-Mask Labelled:2134
        ----------------------------
        Final: Total Mask Labelled:6000
        Total No-Mask Labelled:6001
        ----------------------------
      

    Note: You might get warnings; you can safely ignore them.
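
To make the target format concrete, here is a minimal sketch of how one face box could be rescaled to the TLT input size and written as a KITTI label line. It is not the repository's data2kitti.py. The function names, the class name "mask", and the image sizes are assumptions chosen for illustration; the only fixed part is the 15-field KITTI label layout, where a 2D detector uses the class name and the four box coordinates and the remaining fields are zero-filled.

```python
def scale_box(box, src_size, dst_size):
    """Scale (xmin, ymin, xmax, ymax) from the source image size to the resized one.

    Sizes are (width, height) tuples.
    """
    sx = dst_size[0] / src_size[0]
    sy = dst_size[1] / src_size[1]
    xmin, ymin, xmax, ymax = box
    return (xmin * sx, ymin * sy, xmax * sx, ymax * sy)


def kitti_label_line(class_name, box):
    """Format one object as a 15-field KITTI label line.

    Field order: type, truncated, occluded, alpha, bbox (4 values),
    3D dimensions (3), 3D location (3), rotation_y. Everything except
    the type and the 2D bbox is zero-filled in this sketch.
    """
    xmin, ymin, xmax, ymax = box
    fields = [
        class_name, "0.00", "0", "0.00",
        f"{xmin:.2f}", f"{ymin:.2f}", f"{xmax:.2f}", f"{ymax:.2f}",
        "0.00", "0.00", "0.00",
        "0.00", "0.00", "0.00",
        "0.00",
    ]
    return " ".join(fields)


# Hypothetical example: a 1920x1080 source image resized to 960x544.
box = scale_box((100, 50, 300, 250), src_size=(1920, 1080), dst_size=(960, 544))
line = kitti_label_line("mask", box)
print(line)
```

Each image gets one label file with one such line per face, which is why the box has to be rescaled whenever the image itself is resized to the TLT input dimensions.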

  • Perform training using TLT training flow

  • Perform inference using DeepStream SDK on Jetson

    • Transfer the model file (.etlt) to the Jetson; for int8 inference, also transfer the calibration file (calibration.bin)
    • Edit the inference config file from /ds_configs/*: $ vi config_infer_primary_masknet.txt
      • Modify model and label paths: according to your directory locations
        • Look for tlt-encoded-model, labelfile-path, model-engine-file, int8-calib-file
      • Modify confidence_threshold, class-attributes according to training
        • Look for classifier-threshold, class-attrs
    • Use deepstream_config files: $ vi deepstream_app_source1_masknet.txt
      • Modify model file and config file paths:
        • Look for model-engine-file, config-file under primary-gie
    • Use deepstream-app to deploy in real time: $ deepstream-app -c deepstream_app_source1_video_masknet_gpu.txt
    • We provide two different config files:
      • DS running on GPU only with camera input: deepstream_app_source1__camera_masknet_gpu.txt
      • DS running on GPU only with saved video input: deepstream_app_source1_video_masknet_gpu.txt

Note:
  • model-engine-file is generated on the first run; afterwards you can find it in the same directory as the .etlt file.
  • To generate model-engine-file before the first run, use tlt-converter.
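
The path edits described in the steps above can also be scripted instead of done by hand in vi. The sketch below is one possible approach, assuming the inference config keeps these keys in a [property] section, as nvinfer config files normally do. The update_paths helper, the sample text, and every path are hypothetical. Note that configparser drops comment lines when it writes the file back, so keep a copy of the original.

```python
import configparser
import io

# Hypothetical stand-in for the relevant part of config_infer_primary_masknet.txt.
SAMPLE = """\
[property]
tlt-encoded-model=/old/path/model.etlt
labelfile-path=/old/path/labels.txt
model-engine-file=/old/path/model.engine
int8-calib-file=/old/path/calibration.bin
"""


def update_paths(config_text, new_values, section="property"):
    """Return config_text with the given keys in `section` replaced."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key spelling exactly as written
    parser.read_string(config_text)
    for key, value in new_values.items():
        if not parser.has_option(section, key):
            raise KeyError(f"{key} not found in [{section}]")
        parser.set(section, key, value)
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


updated = update_paths(SAMPLE, {
    "tlt-encoded-model": "/home/user/models/model.etlt",
    "labelfile-path": "/home/user/models/labels.txt",
    "int8-calib-file": "/home/user/models/calibration.bin",
})
print(updated)
```

Raising on a missing key is deliberate: a silently added key with a typo would leave the old path in effect.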

Evaluation Results on NVIDIA Jetson Platform

| Pruned | mAP (Mask/No-Mask) (%) | Nano GPU (FPS) | Xavier NX GPU (FPS) | Xavier NX DLA (FPS) | Xavier GPU (FPS) | Xavier DLA (FPS) |
|---|---|---|---|---|---|---|
| No | 86.12 (87.59, 84.65) | 6.5 | 125.36 | 30.31 | 269.04 | 61.96 |
| Yes (12%**) | 85.50 (86.72, 84.27) | 21.25 | 279 | 116.2 | 508.32 | 155.5 |
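
To read the results as a trade-off, the short sketch below computes the FPS speedup from pruning for each device and the corresponding change in mAP. The numbers are copied from the results above; the variable names are only illustrative.

```python
# FPS figures from the evaluation results: (unpruned, pruned)
fps = {
    "Nano GPU": (6.5, 21.25),
    "Xavier NX GPU": (125.36, 279.0),
    "Xavier NX DLA": (30.31, 116.2),
    "Xavier GPU": (269.04, 508.32),
    "Xavier DLA": (61.96, 155.5),
}

# Speedup is pruned FPS divided by unpruned FPS on the same device.
speedup = {name: pruned / base for name, (base, pruned) in fps.items()}

# Overall mAP before and after pruning, in percentage points.
map_drop = 86.12 - 85.50

for name, factor in speedup.items():
    print(f"{name}: {factor:.2f}x")
print(f"mAP drop: {map_drop:.2f} points")
```

Pruning costs well under one point of mAP here while every device runs faster, which is the reason the training flow includes the prune and retrain steps.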

NVIDIA Transfer Learning Toolkit (TLT) Training Flow

  1. Download a pre-trained model (for the mask detection application, we experimented with Detectnet_v2 with a ResNet18 backbone)
  2. Convert dataset to KITTI format
  3. Train Model (tlt-train)
  4. Evaluate on validation data or infer on test images (tlt-evaluate, tlt-infer)
  5. Prune trained model (tlt-prune)
    Pruning the model reduces its parameter count, which improves FPS performance
  6. Retrain pruned model (tlt-train)
  7. Evaluate re-trained model on validation data (tlt-evaluate)
  8. If the accuracy in step (7) is still within a satisfactory range, repeat steps (5), (6), and (7); otherwise go to step (9)
  9. Export trained model from step (6) (tlt-export)
    Choose int8 or fp16 based on your platform's needs; for example, Jetson Xavier and Jetson Xavier NX have int8 DLA support
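
The loop in steps (5) to (8) can be sketched as follows. This is only an outline of the control flow, not a wrapper around the TLT commands: prune, retrain, and evaluate are hypothetical callables that you would implement by invoking tlt-prune, tlt-train, and tlt-evaluate yourself. The sketch also assumes that the model to export is the last retrained one that still met the accuracy threshold, and it caps the number of rounds with a hypothetical max_rounds parameter.

```python
def prune_until_drop(model, prune, retrain, evaluate, min_map, max_rounds=5):
    """Repeat prune -> retrain -> evaluate while mAP stays at or above min_map.

    Returns the last retrained model that met the threshold, or the
    original model if the first pruning round already fell below it.
    """
    best = model
    for _ in range(max_rounds):
        candidate = retrain(prune(best))
        if evaluate(candidate) < min_map:
            break  # accuracy fell out of range: stop and keep the previous model
        best = candidate
    return best


# Toy stand-ins so the control flow can be run without TLT installed.
def toy_prune(m):
    return {"params": m["params"] // 2, "map": m["map"]}


def toy_retrain(m):
    return {"params": m["params"], "map": m["map"] - 1.0}


def toy_evaluate(m):
    return m["map"]


result = prune_until_drop(
    {"params": 100, "map": 86.0},
    toy_prune, toy_retrain, toy_evaluate,
    min_map=84.0,
)
print(result)
```

Passing the three steps in as functions keeps the accuracy check separate from how each TLT command is actually run.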

Interesting Resources

References

  • Evan Danilovich (2020 March). Medical Masks Dataset. Version 1. Retrieved May 14, 2020 from https://www.kaggle.com/ivandanilovich/medical-masks-dataset
  • Shiming Ge, Jia Li, Qiting Ye, Zhao Luo; "Detecting Masked Faces in the Wild With LLE-CNNs", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 2682-2690
  • Vidit Jain and Erik Learned-Miller. "FDDB: A Benchmark for Face Detection in Unconstrained Settings". Technical Report UM-CS-2010-009, Dept. of Computer Science, University of Massachusetts, Amherst. 2010
  • Shuo Yang, Ping Luo, Chen Change Loy, Xiaoou Tang; "WIDER FACE: A Face Detection Benchmark", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016
  • MAFA Dataset Google Link: Courtesy aome510

face-mask-detection's People

Contributors

ak-nv avatar
