mtcnn_sysu's Introduction

Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks

This repo contains the code, data and trained models for the paper Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks. Try out the Gradio Web Demo: Hugging Face Spaces

Overview

MTCNN (Multi-task Cascaded Convolutional Networks) is a widely used face detection algorithm that runs three cascaded networks, P-Net, R-Net, and O-Net, to detect faces and locate facial landmarks. It handles varied lighting and pose conditions and can detect multiple faces in a single image.

This implementation uses PyTorch, a deep learning framework that provides tools for building and training neural networks.
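The cascade idea can be illustrated with a small sketch: each stage scores the candidate boxes and only the ones above a threshold move on to the next stage. This is not the code in utils/detect.py. The scoring functions below are toy stand-ins for the three networks, the thresholds are illustrative values, and the real pipeline also refines boxes with bounding box regression and non-maximum suppression, which are omitted here.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
Stage = Callable[[Box], float]           # returns a face confidence in [0, 1]


def run_cascade(candidates: Sequence[Box],
                stages: Sequence[Stage],
                thresholds: Sequence[float]) -> List[Box]:
    """Pass candidates through each stage, keeping boxes that score high enough."""
    survivors = list(candidates)
    for score, threshold in zip(stages, thresholds):
        survivors = [box for box in survivors if score(box) >= threshold]
    return survivors


# Toy stand-ins for the three networks (hypothetical, for illustration only).
def toy_pnet(box: Box) -> float:
    x1, y1, x2, y2 = box
    return 0.9 if (x2 - x1) * (y2 - y1) >= 144 else 0.1  # reject tiny boxes


def toy_rnet(box: Box) -> float:
    x1, y1, x2, y2 = box
    ratio = (x2 - x1) / (y2 - y1)
    return 0.9 if 0.5 <= ratio <= 2.0 else 0.2  # reject very elongated boxes


def toy_onet(box: Box) -> float:
    return 0.95  # accept everything that reaches the last stage


candidates = [(0, 0, 5, 5), (0, 0, 20, 20), (0, 0, 40, 10)]
faces = run_cascade(candidates, [toy_pnet, toy_rnet, toy_onet], [0.6, 0.7, 0.7])
print(faces)  # [(0, 0, 20, 20)]
```

The cheap first stage discards most candidates, so the later and more expensive stages only see a small fraction of the image.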

File Structure

├── README.md                      # explanatory document
├── get_data.py                    # Generate the training data for the network given by "--net"
├── img                            # mid.png is the test input for visualization; the other images are the corresponding results
│   ├── mid.png
│   ├── onet.png
│   ├── pnet.png
│   ├── rnet.png
│   ├── result.png
│   └── result.jpg
├── model_store                    # Our pre-trained models
│   ├── onet_epoch_20.pt
│   ├── pnet_epoch_20.pt
│   └── rnet_epoch_20.pt
├── requirements.txt               # Pinned dependency versions
├── test.py                        # Specify different "--net" to get the corresponding visualization results
├── test.sh                        # Used to test mid.png, which will test the output visualization of three networks
├── train.out                      # Our complete training log for this experiment
├── train.py                       # Specify different "--net" for the training of the corresponding network
├── train.sh                       # Generate data from start to finish and train
└── utils                          # Some common tool functions and modules
    ├── config.py
    ├── dataloader.py
    ├── detect.py
    ├── models.py
    ├── tool.py
    └── vision.py

Requirements

  • numpy==1.21.4
  • matplotlib==3.5.0
  • opencv-python==4.4.0.42
  • torch==1.13.0+cu116
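To check whether an existing environment matches these pins, a small script along the following lines may help. It is a sketch using only the standard library; `PINS`, `parse_pin`, and `find_mismatches` are names made up for this example and are not part of the repository.

```python
from importlib import metadata

# The pins listed above, copied by hand (in practice, read requirements.txt).
PINS = [
    "numpy==1.21.4",
    "matplotlib==3.5.0",
    "opencv-python==4.4.0.42",
    "torch==1.13.0+cu116",
]


def parse_pin(line):
    """Split 'name==version' into (name, version)."""
    name, _, version = line.strip().partition("==")
    return name, version


def find_mismatches(pins, lookup=metadata.version):
    """Return {package: (wanted, found)} for pins that are not satisfied.

    `found` is None when the package is not installed.
    """
    mismatches = {}
    for line in pins:
        name, wanted = parse_pin(line)
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(PINS).items():
        print(f"{name}: wanted {wanted}, found {found}")
```

The comparison is an exact string match, so a CPU-only torch build would be reported as a mismatch against the `+cu116` pin.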

How to Install

  • conda create -n env python=3.8 -y
    conda activate env
  • pip install -r requirements.txt

Preprocessing

  • Download the WIDER_FACE face detection data and store it in ./data_set/face_detection
  • Download the CNN_FacePoint face detection and landmark data and store it in ./data_set/face_landmark

Preprocessed Data

# Before training Pnet
python get_data.py --net=pnet
# Before training Rnet, please use your trained model path
python get_data.py --net=rnet --pnet_path=./model_store/pnet_epoch_20.pt
# Before training Onet, please use your trained model path
python get_data.py --net=onet --pnet_path=./model_store/pnet_epoch_20.pt --rnet_path=./model_store/rnet_epoch_20.pt
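For context, the MTCNN paper labels each training crop by its overlap (IoU) with the ground-truth faces: negatives below 0.3, part faces between 0.4 and 0.65, and positives above 0.65. The sketch below illustrates that rule only. Whether get_data.py uses exactly these thresholds is an assumption, and `iou` and `label_crop` are hypothetical helper names.

```python
def iou(box, gt):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box[0], gt[0]), max(box[1], gt[1])
    ix2, iy2 = min(box[2], gt[2]), min(box[3], gt[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_box = (box[2] - box[0]) * (box[3] - box[1])
    area_gt = (gt[2] - gt[0]) * (gt[3] - gt[1])
    union = area_box + area_gt - inter
    return inter / union if union > 0 else 0.0


def label_crop(crop, gt_boxes, pos=0.65, part=0.4, neg=0.3):
    """Label a crop by its best IoU with any ground-truth box.

    Thresholds follow the paper; the repository may use other values.
    Crops with IoU in [neg, part) are ambiguous and labeled "ignore".
    """
    best = max((iou(crop, gt) for gt in gt_boxes), default=0.0)
    if best >= pos:
        return "positive"
    if best >= part:
        return "part"
    if best < neg:
        return "negative"
    return "ignore"


gt = [(0, 0, 10, 10)]
for crop in [(2, 0, 12, 10), (3, 0, 13, 10), (5, 0, 15, 10), (20, 20, 30, 30)]:
    print(crop, label_crop(crop, gt))
```

Because R-Net and O-Net are trained on crops proposed by the earlier networks, their data generation needs the earlier checkpoints, which is why the commands above take --pnet_path and --rnet_path.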

How to Run

Train

python train.py --net=pnet/rnet/onet # Pick one of pnet, rnet, or onet to train that network
bash train.sh                        # Alternatively, run the shell script to generate data and train all three in order

Checkpoints are saved in a subfolder of ./model_store/.
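The order of data generation and training steps could also be scripted from Python, as sketched below. This is not the contents of train.sh. It assumes checkpoints are named <net>_epoch_<N>.pt directly under ./model_store, matching the pre-trained files shipped with the repo, and `build_pipeline` and `run_pipeline` are hypothetical helpers. Only flags that appear in this README are used.

```python
import subprocess


def build_pipeline(model_dir="./model_store", epoch=20):
    """Return the commands for a full P-Net -> R-Net -> O-Net training run."""
    # Assumed checkpoint naming, based on the files in model_store.
    pnet = f"{model_dir}/pnet_epoch_{epoch}.pt"
    rnet = f"{model_dir}/rnet_epoch_{epoch}.pt"
    return [
        ["python", "get_data.py", "--net=pnet"],
        ["python", "train.py", "--net=pnet"],
        # R-Net data is generated with the trained P-Net.
        ["python", "get_data.py", "--net=rnet", f"--pnet_path={pnet}"],
        ["python", "train.py", "--net=rnet"],
        # O-Net data is generated with the trained P-Net and R-Net.
        ["python", "get_data.py", "--net=onet",
         f"--pnet_path={pnet}", f"--rnet_path={rnet}"],
        ["python", "train.py", "--net=onet"],
    ]


def run_pipeline(commands):
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        print("$", " ".join(command))
        subprocess.run(command, check=True)


if __name__ == "__main__":
    run_pipeline(build_pipeline())
```

Using `check=True` stops the run as soon as one step fails, so a later network is never trained on data from a missing checkpoint.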

Finetuning from an existing checkpoint

python train.py --net=pnet/rnet/onet --load=[model path]

The model path should point to a checkpoint file under the ./model_store/ directory, e.g. --load=./model_store/pnet_epoch_20.pt

Evaluate

Use the shell script to test the three networks in order on img/mid.png:

bash test.sh

To run detection on a single image:

python test.py --net=pnet/rnet/onet  --path=test.jpg

To run detection on a video stream from a camera:

python test.py --input_mode=0

The result of "--net=pnet"

The result of "--net=rnet"

The result of "--net=onet"
