brightxiaohan / facedetector

A re-implementation of mtcnn. Joint training, tutorial and deployment together.

Python 6.13% Jupyter Notebook 93.68% C++ 0.01% Cuda 0.19%

facedetection pytorch cnn deep-learning mtcnn mtcnn-pytorch

facedetector's Introduction

MTCNN

A PyTorch implementation of the inference and training stages of the face detection algorithm described in
Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks.

Why this project

mtcnn-pytorch is the most popular PyTorch implementation of MTCNN. We found some disadvantages when using it for real-time detection tasks:

No training code.
It mixes torch and numpy operations, which results in slow inference.
No unified interface for setting the computation device ('cpu' or 'gpu').
Based on an old version of PyTorch (0.2).

So we created this project and added these features:

Code for the training stage, so you can train the model on your own datasets.
All numpy operations are converted to torch operations, so they can benefit from GPU acceleration. It is 10 times faster than the original mtcnn-pytorch repo.
A unified interface to select 'cpu' or 'gpu'.
Based on the latest version of PyTorch (1.0), and we will provide long-term support.
It is a component of our FaceLab ecosystem.
Real-time face tracking.
A friendly tutorial for beginners.

Installation

Create a virtual env with conda (recommended)

conda create -n face_detection python=3
source activate face_detection

Install the dependency packages

pip install opencv-python numpy easydict Cython progressbar2 torch tensorboardX

If you have a GPU on your machine, follow the official instructions to install the GPU version of PyTorch.

Compile the cython code

Compile with gpu support

python setup.py build_ext --inplace

Compile with cpu only

python setup.py build_ext --inplace --disable_gpu

You can also install mtcnn as a package

python setup.py install

Test the code by example

We assume all of these commands are run from the $SOURCE_ROOT directory.

Detect on example picture

python -m unittest tests.test_detection.TestDetection.test_detection

Detect on video

python scripts/detect_on_video.py --video_path ./tests/asset/video/school.avi --device cuda:0 --minsize 24

You can set the device to 'cpu' if you have no usable GPU on your machine.
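If you would rather not hard-code the device, a small helper can pick one at run time. This is only a sketch: pick_device is a hypothetical name, not part of this project, and it relies on nothing beyond torch.cuda.is_available().

```python
def pick_device(preferred="cuda:0"):
    """Return `preferred` when CUDA is usable, otherwise fall back to 'cpu'.

    Hypothetical helper for illustration; it is not part of the mtcnn package.
    """
    try:
        import torch
    except ImportError:
        # Without torch installed there is nothing to run on the GPU anyway.
        return "cpu"
    return preferred if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)
```

The returned string can then be passed as the --device argument of the script above, or as the device= argument when constructing the detector.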

Basic Usage

import cv2
import mtcnn

# First we create pnet, rnet, onet, and load weights from caffe model.
pnet, rnet, onet = mtcnn.get_net_caffe('output/converted')

# Then we create a detector
detector = mtcnn.FaceDetector(pnet, rnet, onet, device='cuda:0')

# Then we can detect faces from image
img = 'tests/asset/images/office5.jpg'
boxes, landmarks = detector.detect(img)

# Then we draw bounding boxes and landmarks on image
image = cv2.imread(img)
image = mtcnn.utils.draw.draw_boxes2(image, boxes)
image = mtcnn.utils.draw.batch_draw_landmarks(image, landmarks)

# Show the result
cv2.imshow("Detected image.", image)
cv2.waitKey(0)

Doc

Train your own model from scratch

Tutorial

Detect step by step.

face_alignment step by step

facedetector's Issues

CUDA error: an illegal memory access was encountered

When I try to generate training data for onet I receive a memory error in the middle of processing the image files. I am using pytorch 1.5. Any help would be appreciated.

Is there a misuse in data.py?

Hi, I dived into the project today, and there seems to be a bug in mtcnn/train/data.py#get_training_data():
positive_dest = os.path.join(output_folder, suffix, 'positive')
This line has no effect, since the line below mistakenly uses the prefix part_dest:
pos = [os.path.join(part_dest, i) for i in positive_meta.iloc[:, 0]]
I think this should be positive_dest rather than part_dest. Otherwise the result is the same output as for the part faces.
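If the reporter's reading is correct, the fix amounts to joining the file names onto positive_dest. The sketch below illustrates that with plain lists; build_positive_paths and the sample file names are hypothetical, and the real code reads the names from positive_meta.iloc[:, 0].

```python
import os


def build_positive_paths(output_folder, suffix, filenames):
    """Join each file name onto the 'positive' folder (hypothetical helper)."""
    positive_dest = os.path.join(output_folder, suffix, 'positive')
    # Use positive_dest here; using part_dest would point at the part faces instead.
    return [os.path.join(positive_dest, name) for name in filenames]


paths = build_positive_paths('output', 'pnet', ['0.jpg', '1.jpg'])
print(paths)
```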

Confusion regarding mapping the output feature map to locations in the original image

We calculate the correspondence with this formula:
x1 = x1_map * 2 + 1,
y1 = y1_map * 2 + 1,
x2 = x1_map * 2 + 1 + 12,
y2 = y2_map * 2 + 1 + 12.
First, how do you derive these equations? Second, the red box in the original image has coordinates (1, 1, 13, 13), which corresponds to (0, 0) in the feature map. What about (0, 0, 12, 12)? That is the first convolutional region; shouldn't it map to (0, 0) in the output feature map? Could you clarify these points? Thanks.
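For reference, the formula quoted in this issue can be written as a small function. This is a sketch of the quoted formula only, assuming a stride of 2, a 12-pixel cell, and that the same feature-map index is used for both corners; map_to_image is a hypothetical name, and the sketch does not settle whether the +1 offset is the right convention.

```python
def map_to_image(x_map, y_map, stride=2, cell_size=12):
    """Map a feature-map index to a box in the input image.

    Follows the formula quoted in the issue:
    x1 = x_map * stride + 1, x2 = x1 + cell_size (same for y).
    """
    x1 = x_map * stride + 1
    y1 = y_map * stride + 1
    return (x1, y1, x1 + cell_size, y1 + cell_size)


print(map_to_image(0, 0))  # (1, 1, 13, 13), the red box mentioned in the issue
print(map_to_image(3, 5))  # (7, 11, 19, 23)
```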

GPU-based NMS method was used to predict the results, and multiple calibration boxes appeared on each face.
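One way to narrow down a problem like this is to compare the GPU result with a plain CPU reference. The sketch below is a generic greedy NMS in pure Python, not the implementation used in this project; the function names, the 0.5 threshold, and the sample boxes are all illustrative assumptions.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, threshold=0.5):
    """Greedy NMS: keep the best-scoring box, drop boxes overlapping a kept one."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2]
```

If the GPU path keeps boxes that this reference suppresses for the same inputs and threshold, the difference points at the GPU kernel or at how it is called.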

Training from scratch

Hi,

Is it possible to train this model from scratch?

How do you convert the *.torchm files into the format usable by the mtcnn module (pnet.npy, rnet.npy, onet.npy)?

I've followed the training code and successfully created the pnet.torchm, rnet.torchm, and onet.torchm files, but how do I make them usable by mtcnn? Right now, when I try to load them with mtcnn, it throws: raise KeyError("%s is not a file in the archive" % key) TypeError: not enough arguments for format string

can you provide the training scripts?

Thank you very much for providing the code.
I can't find a training script in your project. Can you provide it?

CUDA error: no kernel image is available for execution on the device

Error while detecting an image

The torch version is 1.0. The error message is as follows. What is the problem?

/usr/local/lib/python2.7/site-packages/mtcnn/network/mtcnn_pytorch.py:9: UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
nn.init.xavier_uniform(m.weight.data)
/usr/local/lib/python2.7/site-packages/mtcnn/network/mtcnn_pytorch.py:10: UserWarning: nn.init.constant is now deprecated in favor of nn.init.constant_.
nn.init.constant(m.bias, 0.1)
/usr/local/lib/python2.7/site-packages/torch/nn/functional.py:2423: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
"See the documentation of nn.Upsample for details.".format(mode))
Traceback (most recent call last):
File "detect_on_image_xb.py", line 33, in <module>
boxes, landmarks = detector.detect(img)
File "/usr/local/lib/python2.7/site-packages/mtcnn/deploy/detect.py", line 59, in detect
stage_one_boxes = self.stage_one(img, threshold[0], factor, minsize, nms_threshold[0])
File "/usr/local/lib/python2.7/site-packages/mtcnn/deploy/detect.py", line 13, in wrapper
ret = func(*args, **kwargs)
File "/usr/local/lib/python2.7/site-packages/mtcnn/deploy/detect.py", line 236, in stage_one
img, size=(w, h), mode='bilinear')
File "/usr/local/lib/python2.7/site-packages/torch/nn/functional.py", line 2447, in interpolate
return torch._C._nn.upsample_bilinear2d(input, _output_size(2), align_corners)
TypeError: upsample_bilinear2d(): argument 'output_size' must be tuple of ints, but found element of type float at pos 1
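The traceback shows interpolate receiving a float inside size=(w, h). One plausible cause, not confirmed against the repository code, is that the size is computed with math.ceil, which returns a float under Python 2.7 (the interpreter in the paths above) but an int under Python 3. A minimal sketch of a defensive fix, with scaled_size as a hypothetical helper:

```python
import math


def scaled_size(width, height, scale):
    """Return the scaled (w, h) as plain ints, regardless of Python version.

    Hypothetical helper: the explicit int() guards against math.ceil
    returning a float, as it does under Python 2.
    """
    w = int(math.ceil(width * scale))
    h = int(math.ceil(height * scale))
    return w, h


w, h = scaled_size(640, 480, 0.709)
print(w, h)  # 454 341
```

The resulting tuple is the kind of value that can be passed as size=(w, h) without triggering the TypeError.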

Benchmark?

Are there any benchmarks on this project? What is the FPS?

How fast is it on GPU?

AttributeError: module 'mtcnn' has no attribute 'get_net_caffe'

Any suggestions to work around this error message?

Is it better to handle the case num = int(line.strip()) = 0?

Hi, thanks for sharing the code.
Consider the case num = 0 at https://github.com/faciallab/FaceDetector/blob/5b588f58884086d005a5ac67bd39a03ce631f8f4/mtcnn/datasets/wider_face.py#L43, for example:
10422 0--Parade/0_Parade_Parade_0_452.jpg
10423 0
10424 0 0 0 0 0 0 0 0 0 0
10425 0--Parade/0_Parade_Parade_0_630.jpg
Then current_num != num, so flag is still 2. In that case, the next iteration of for line in f: enters elif flag == 2:, but the current line is 0--Parade/0_Parade_Parade_0_630.jpg (for which flag should be 0).
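A parser that treats the zero-face entry explicitly avoids the stuck flag. The sketch below is not the project's wider_face.py; parse_wider_face is a hypothetical function, and it assumes, as in the sample above, that an entry with num = 0 is still followed by one placeholder row of zeros.

```python
def parse_wider_face(lines):
    """Parse WIDER FACE style annotation lines into (filename, boxes) records."""
    records = []
    it = iter(line.strip() for line in lines)
    for filename in it:
        if not filename:
            continue  # skip blank lines between entries
        num = int(next(it))
        boxes = []
        # A zero-face entry still has one placeholder row, so read max(num, 1) rows.
        for _ in range(max(num, 1)):
            fields = next(it).split()
            if num > 0:
                boxes.append(tuple(int(v) for v in fields[:4]))  # x, y, w, h
        records.append((filename, boxes))
    return records


sample = [
    "0--Parade/0_Parade_Parade_0_452.jpg",
    "0",
    "0 0 0 0 0 0 0 0 0 0",
    "0--Parade/0_Parade_Parade_0_630.jpg",
    "1",
    "10 20 30 40 0 0 0 0 0 0",  # made-up box values for illustration
]
records = parse_wider_face(sample)
print(records)
```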

About evaluation metric

Hello!
Is there evaluation metric like accuracy in the paper here?
Looking forward to your reply!

A misuse of w, h with cv2.imread

Hi again, I found a misuse of the image w, h in mtcnn/train/gen_landmark.py#gen_landmark_data():
cv2.cvtColor(img, cv2.COLOR_RGB2BGR) # Do this for compatible with caffe model
img_w = img.shape[0]
img_h = img.shape[1]
i.e. w and h should be swapped above, right?
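For context, an image loaded with cv2.imread is a numpy array whose shape is (height, width, channels), so index 0 is the height. The sketch below works on a bare shape tuple to stay dependency-free; width_height is a hypothetical helper name.

```python
def width_height(shape):
    """Return (width, height) from an image shape of (height, width[, channels])."""
    height, width = shape[0], shape[1]
    return width, height


# A 640x480 image as loaded by cv2.imread has shape (480, 640, 3).
img_w, img_h = width_height((480, 640, 3))
print(img_w, img_h)  # 640 480
```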

ValueError: Buffer dtype mismatch, expected 'int_t' but got 'long long'

When I run the test example, an error occurred:
ValueError: Buffer dtype mismatch, expected 'int_t' but got 'long long'
How to solve it?