tencent / facedetection-dsfd

Tencent Youtu high-precision dual-branch face detector

License: Other

Python 97.94% Shell 2.06%

facedetection-dsfd's Introduction

Update

2019.04: Released the PyTorch version of the DSFD inference code.
2019.03: DSFD was accepted by CVPR 2019.
2018.10: Our DSFD ranked No. 1 on WIDER FACE and FDDB.

Introduction

In this repo, we propose a novel face detection network, named DSFD, with superior performance over the state-of-the-art face detectors. You can use the code to evaluate our DSFD for face detection.

For more details, please refer to our paper, DSFD: Dual Shot Face Detector, or the poster slide.

Our DSFD face detector achieves state-of-the-art performance on the WIDER FACE and FDDB benchmarks.

WIDER FACE

FDDB

Requirements

Torch == 0.3.1
Torchvision == 0.2.1
Python == 3.6
NVIDIA GPU == Tesla P40
Linux, CUDA, cuDNN

Getting Started

Installation

Clone the GitHub repository. We refer to the cloned directory as $DSFD_ROOT.

git clone xxxxxx/FaceDetection-DSFD.git
cd FaceDetection-DSFD
export CUDA_VISIBLE_DEVICES=0

Evaluation

Download the images of WIDER FACE and FDDB to $DSFD_ROOT/data/.
Download our DSFD model (Weiyun or Google Drive), trained on the WIDER FACE training set, to $DSFD_ROOT/weights/.
See ./demo.py for how to detect faces using the DSFD model and how to plot detection results.

python demo.py [--trained_model [TRAINED_MODEL]] [--img_root  [IMG_ROOT]] 
               [--save_folder [SAVE_FOLDER]] [--visual_threshold [VISUAL_THRESHOLD]] 
    --trained_model      Path to the saved model
    --img_root           Path of test images
    --save_folder        Path of output detection results
    --visual_threshold   Confidence threshold

Evaluate the trained model via ./widerface_val.py on WIDER FACE.

python widerface_val.py [--trained_model [TRAINED_MODEL]] [--save_folder [SAVE_FOLDER]] 
                         [--widerface_root [WIDERFACE_ROOT]]
    --trained_model      Path to the saved model
    --save_folder        Path of output WIDER FACE results
    --widerface_root     Path of widerface dataset

Download the eval_tool to show the WIDERFACE performance.
Evaluate the trained model via ./fddb_test.py on FDDB.

python fddb_test.py [--trained_model [TRAINED_MODEL]] [--split_dir [SPLIT_DIR]] 
                    [--data_dir [DATA_DIR]] [--det_dir [DET_DIR]]
    --trained_model      Path of the saved model
    --split_dir          Path of fddb folds
    --data_dir           Path of fddb all images
    --det_dir            Path to save fddb results

Download the evaluation code to show the FDDB performance.
Lightweight DSFD is here.

Qualitative Results

Citation

If you find DSFD useful in your research, please consider citing:

@inproceedings{li2018dsfd,
  title={DSFD: Dual Shot Face Detector},
  author={Li, Jian and Wang, Yabiao and Wang, Changan and Tai, Ying and Qian, Jianjun and Yang, Jian and Wang, Chengjie and Li, Jilin and Huang, Feiyue},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2019}
}

Contact

For any questions, please file an issue or contact:

Jian Li: [email protected]

facedetection-dsfd's Issues

Head Detection

I want to extend this work to head detection. Can anyone tell me what I need to do?

about train dataset

Hey, I wonder which training dataset you used to train the model.
Can you tell me?
Thanks a lot.

training stage?

Hi, this work is pretty impressive. Would you please release the training code so that we can reconstruct it?
Thanks!

What is the improvement compared with MTCNN?

bbox_vote function problem

Hello,
When running demo.py, I got the following error:

Traceback (most recent call last):
  File "demo.py", line 252, in <module>
    test_oneimage()
  File "demo.py", line 245, in test_oneimage
    det = bbox_vote(det)
  File "demo.py", line 73, in bbox_vote
    det_accu_sum[:, 0:4] = np.sum(det_accu[:, 0:4], axis=0) / np.sum(det_accu[:, -1:])
  File "/usr/local/lib/python3.6/dist-packages/torch/tensor.py", line 367, in __rdiv__
    return self.reciprocal() * other
TypeError: mul(): argument 'other' (position 1) must be Tensor, not numpy.ndarray

I inspected det_accu on the line just before the failing one:

print(det_accu)
[[884.7191162109375 687.6126098632812 996.62158203125 823.8545532226562
  0.9999959468841553]
 [884.49133 687.3231 996.8358 823.6817 tensor(1.0000)]
 [884.73834 686.1709 996.74756 822.86896 tensor(1.0000)]]

Is the error related to det_accu[0]?
If so, how does this error arise?
How should I handle it?
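The printed det_accu mixes plain floats with tensor(1.0000) entries, so the array most likely has object dtype and np.sum over the score column returns a tensor, which is what ends up in __rdiv__. One possible workaround, sketched below under that assumption, is to coerce every entry to a float before the arithmetic. The helper name to_float_array and the sample rows are hypothetical and not part of the repository.

```python
import numpy as np

def to_float_array(det_rows):
    """Build a float32 array from rows that may mix Python floats,
    NumPy scalars, and 0-d tensors. float() accepts all of these."""
    return np.array([[float(v) for v in row] for row in det_rows],
                    dtype=np.float32)

# Hypothetical rows laid out like the printed det_accu: x1, y1, x2, y2, score.
rows = [
    [884.72, 687.61, 996.62, 823.85, 0.99],
    [884.49, 687.32, 996.84, 823.68, 1.0],
]
det_accu = to_float_array(rows)

# Same expression as the failing line in the traceback; with a homogeneous
# float array both operands are NumPy values, so the division succeeds.
box = np.sum(det_accu[:, 0:4], axis=0) / np.sum(det_accu[:, -1:])
print(det_accu.dtype, box.shape)
```

Converting scores to Python floats at the point where they are first extracted from the network output would avoid the mixed array altogether.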

about some details

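After reading the code, I have a few questions:

The FEM module in the paper (Fig. 3) and its implementation in the code (class FEM(nn.Module)) seem to differ.
In the paper, the input is split into three parts and each part goes through three dilated conv layers. The code does not seem to do this.
Is multi_scale_test_pyramid meant as a supplement to multi_scale_test? It looks like it simply uses more test scales.
In multi-scale testing, why are small results (smaller than 30) discarded when the image is shrunk, and large results (larger than 100) discarded when the image is enlarged?
The paper says: "For 4 bounding box coordinates, we round down top left coordinates and round up width and height to expand the detection bounding box". This does not seem to appear in the code.

I also have a question about PyramidBox:

According to Fig. 3 of PyramidBox, P0, P1, P2, P3, P4, P5 generate face_anchors; P1, P2, P3, P4, P5 generate head_anchors; P2, P3, P4, P5 generate body_anchors. The correspondence is that the face at P0 matches the head at P1 and the body at P2, and the scale should double each time, e.g. face 20 matches head 40 matches body 80. Judging from the code, however, this doubling is not reflected.

Thanks for your help!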

TypeError: '>=' not supported between instances of 'builtin_function_or_method' and 'int'

Hello, when I ran demo.py, I got this error:

Traceback (most recent call last):
File "demo.py", line 256, in
test_oneimage()
File "demo.py", line 225, in test_oneimage
det0 = infer(net , img , transform , thresh , cuda , shrink)
File "demo.py", line 123, in infer
keep_index = np.where(det[:, 4] >= 0)[0]
TypeError: '>=' not supported between instances of 'builtin_function_or_method' and 'int'

I tried to change the type of 0 but it failed. How can I solve it?

BTW, I use PyTorch 1.1. Is this a version problem?

About Demo.py

Hi, when I run demo.py I get the error below. How can I solve it?

_pytest.config.exceptions.UsageError: usage: _jb_pytest_runner.py [options] [file_or_dir] [file_or_dir] [...]
_jb_pytest_runner.py: error: unrecognized arguments: --trained_model --save_folder eval_tools/ --visual_threshold 0.1 --img_root ./data/worlds-largest-selfie.jpg D:/Face/FaceDetection-DSFD-master/demo.py
inifile: None
rootdir: D:\Face\FaceDetection-DSFD-master

Res50-based DSFD Pretrained model

Hi,

Do you have any plan to release the Res50-based DSFD pretrained model that has 22 fps as mentioned in the paper?

Many thanks

pretrain model selection?

What kind of pretrained model was used in training? An ImageNet-pretrained ResNet or something else?

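Question about data augmentation

Code from augmentations.py, lines 412-416:
centers = (boxes[:, :2] + boxes[:, 2:]) / 2.0
m1 = (choice_box[0] < centers[:, 0]) * (choice_box[1] < centers[:, 1])
m2 = (choice_box[2] > centers[:, 0]) * (choice_box[3] > centers[:, 1])
mask = m1 * m2  # whether the crop contains the box center; this criterion for keeping a face box is questionable: a box is kept as long as its center lies in sample_boxes
current_boxes = boxes[mask, :].copy()
So a box is kept or dropped depending on whether the center of the original face box lies inside the crop region? Could this rule produce boxes in which only a small part of the face is visible?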
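To make the concern concrete, here is a minimal sketch that reproduces the center-in-crop test from the snippet and also measures how much of each box actually survives the crop. The function names, the sample boxes, and the crop are illustrative assumptions; visible_fraction is not part of augmentations.py.

```python
import numpy as np

def center_in_crop_mask(boxes, crop):
    """True where the box center lies strictly inside the crop (x1, y1, x2, y2)."""
    centers = (boxes[:, :2] + boxes[:, 2:]) / 2.0
    m1 = (crop[0] < centers[:, 0]) & (crop[1] < centers[:, 1])
    m2 = (crop[2] > centers[:, 0]) & (crop[3] > centers[:, 1])
    return m1 & m2

def visible_fraction(boxes, crop):
    """Fraction of each box's area that falls inside the crop."""
    x1 = np.maximum(boxes[:, 0], crop[0])
    y1 = np.maximum(boxes[:, 1], crop[1])
    x2 = np.minimum(boxes[:, 2], crop[2])
    y2 = np.minimum(boxes[:, 3], crop[3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / area

boxes = np.array([
    [10.0, 10.0, 50.0, 50.0],     # fully inside the crop
    [80.0, 80.0, 130.0, 130.0],   # center (105, 105) barely inside
    [100.0, 100.0, 140.0, 140.0], # center (120, 120) outside
])
crop = np.array([0.0, 0.0, 106.0, 106.0])

mask = center_in_crop_mask(boxes, crop)
frac = visible_fraction(boxes, crop)
print(mask, frac)
```

The second box passes the center test even though only about 27% of its area is inside the crop, which is the situation the question describes. Adding a minimum visible fraction to the mask would be one way to filter such boxes, at the cost of dropping some partially visible faces.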

eval_tools

How can I use eval_tools to evaluate the AP on the test data?

model file is too large

When I run python demo.py, I get this error:

CUDA out of memory,Tried to allocate 62.00 MiB (GPU 0; 22.38 GiB total capacity; 20.83 GiB already allocated; 20.06 MiB free; 276.64 MiB cached)

about the draw_toolbox

When I try to run ./fddb_test.py, it reports "ImportError: cannot import name 'draw_toolbox'". I checked the ./utils folder and found only augmentations.py there, nothing about draw_toolbox. Could anyone tell me what I am doing wrong?

Report your inference time on different GPUs.

framework

Which framework will the open-source code be based on?

Why is the first-shot PAL not used in this code?

I found that this code does not include the first-shot PAL.

Where is the IAM module?

Sorry, I cannot find the IAM module. Can you give me some tips?

I get a memory error when I run demo.py

RuntimeError: CUDA out of memory. Tried to allocate 88.00 MiB (GPU 0; 7.93 GiB total capacity; 6.82 GiB already allocated; 68.50 MiB free; 65.36 MiB cached)

demo.py

Hello, after placing the pretrained model in the specified location and running demo.py, I get this error:
Missing key(s) in state_dict: "resnet.conv1.weight", "resnet.bn1.weight", "resnet.bn1.bias", "resnet.bn1.running_mean", ...
Unexpected key(s) in state_dict: "layer1.0.weight", "layer1.1.weight", "layer1.1.bias", "layer1.1.running_mean", ...
How can I solve this? Is it a version problem?
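When every expected key is missing and every checkpoint key is unexpected, the network built in code and the network that produced the checkpoint probably have different architectures, rather than different library versions. A small sketch for comparing the two key sets follows; diff_state_dict_keys is a hypothetical helper, and the key lists are copied from the error message above.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, mirroring the wording
    of the load_state_dict error message."""
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)      # model wants, file lacks
    unexpected = sorted(checkpoint_keys - model_keys)   # file has, model lacks
    return missing, unexpected

# Keys taken from the error message in this issue (truncated there as well).
model_keys = ["resnet.conv1.weight", "resnet.bn1.weight", "resnet.bn1.bias"]
checkpoint_keys = ["layer1.0.weight", "layer1.1.weight", "layer1.1.bias"]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print(len(missing), len(unexpected))
```

With real objects, and assuming the checkpoint file stores a plain state dict, the inputs would be net.state_dict().keys() and torch.load(path).keys(). If the two sets share no keys at all, it is worth checking that the backbone selected in the config matches the downloaded weights.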

Runtime error when trying demo

Hello, I'm trying to execute demo.py on Google Colab, but it always hits a runtime error. I might be able to solve this by reducing the batch size, but can you show me how?
Thank you

Is this model really slow or am I using it wrong?

Currently I am running inference on a 768x1024 image (a 2x, 1x, 0.5x image pyramid with flips, giving a total of 6 images) and it takes about 20 sec per image on a 1070 Ti. Other models like SFD or PyramidBox are much faster.

Is this slowness expected or am I using it wrong?

Does your demo.py have a memory leakage problem? It is unable to process more than one image in a loop.

It gives CUDA out of memory error.

Re-implementation of DSFD using Resnet series got poor performance?

Hi, I've re-implemented DSFD using VGG16 and it performs well, but when I changed the backbone to ResNet-50, the performance dropped. Are there any special details I should pay attention to? I appreciate your reply. Thanks.

What is the batch size when you train and with what and how many GPUs?

Cannot load torch state dictionary

I get an unpickling error when I try to load the pre-trained weights for the model:

  File "widerface_val.py", line 222, in <module>
    net.load_state_dict(torch.load(args.trained_model))
  File "/home/marios/anaconda3/envs/pyramidbox/lib/python3.6/site-packages/torch/serialization.py", line 267, in load
    return _load(f, map_location, pickle_module)
  File "/home/marios/anaconda3/envs/pyramidbox/lib/python3.6/site-packages/torch/serialization.py", line 410, in _load
    magic_number = pickle_module.load(f)
_pickle.UnpicklingError: invalid load key, '<'.
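The message "invalid load key, '<'" means the first byte pickle read from the file was '<'. A common cause, though not confirmed for this report, is that the download saved an HTML page (for example a login or quota page from the file host) instead of the weights. The sketch below checks for that; looks_like_html is a hypothetical helper, and the temporary file merely stands in for a bad download.

```python
import os
import tempfile

def looks_like_html(path, probe_bytes=64):
    """Return True if the file starts with '<', as an HTML page would."""
    with open(path, "rb") as f:
        head = f.read(probe_bytes)
    return head.lstrip().startswith(b"<")

# Stand-in for a failed download; replace fake_weights with the real path.
with tempfile.NamedTemporaryFile(delete=False, suffix=".pth") as tmp:
    tmp.write(b"<!DOCTYPE html><html>not a model</html>")
    fake_weights = tmp.name

is_html = looks_like_html(fake_weights)
os.remove(fake_weights)
print(is_html)
```

If the check returns True for the downloaded file, downloading the weights again through a browser and comparing the file size with the one shown on the download page is a reasonable next step.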

Is an ncnn implementation of this detector available?

I want to try this on Android with the ncnn framework.
Is it available?
@nihui
@tyshiwo

Do you not do any transform when testing on WIDERFACE?

I notice your code in widerface_val.py uses the following line:

testset = WIDERFaceDetection(args.widerface_root, 'val' , None, WIDERFaceAnnotationTransform())

which means you do not do any transform for WIDERFACE? Is that true?

Your other demo scripts all include TestBaseTransform, but this one doesn't. What preprocessing does your code require?

autograd function with non-static warning

Running in a conda env with the GPU build of PyTorch, I get the following warning:

/tmp/pip-req-build-p5q91txh/torch/csrc/autograd/python_function.cpp:638: UserWarning: Legacy autograd function with non-static forward method is deprecated and will be removed in 1.3. Please use new-style autograd function with static forward method. (Example: https://pytorch.org/docs/stable/autograd.html#torch.autograd.Function)

Is this something I need to fix myself?

How to train the model??

windows platform

Hi,
Will the code run on Windows?
Thank you.

question about config under large net like Resnet152

With a backbone like ResNet-152, a single GPU cannot hold a large batch.
I am interested in:

(1) How many P40 GPUs were used when training with Res152?
(2) What is the total batch size, and the batch size on each GPU?
(3) Did you use SyncBN across GPUs?

Many Thanks !

wrong rotation

Hi.
Some of the results have wrong rotation.
Can you give me an idea how to improve on these images ?

When I run the demo script, I get a warning that my CUDA version is too low

the warning is as follows:

/home/anke/.conda/envs/dsfd/lib/python3.6/site-packages/torch/cuda/__init__.py:95: UserWarning:
Found GPU0 GeForce RTX 2080 Ti which requires CUDA_VERSION >= 9000 for
optimal performance and fast startup time, but your PyTorch was compiled
with CUDA_VERSION 8000. Please install the correct PyTorch binary
using instructions from http://pytorch.org

warnings.warn(incorrect_binary_warn % (d, name, 9000, CUDA_VERSION))
loading pretrained resnet model
(no error message, but the ResNet model keeps loading indefinitely)

Actually, my CUDA version is 10.

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130

I don't know how to solve it. Could you please help me?
Looking forward to your reply. Thanks!

I notice that the visual threshold used in your work is very low. Wouldn't it result in a lot of false positives?

I notice that the visual threshold used in your work is very low (0.1 in demo and 0.01 in WIDERFACE). Wouldn't it result in a lot of false positives?

I tested your demo and it indeed gives many false positives. Is it normal?
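Low thresholds are usual when computing AP, because the precision-recall curve needs the low-confidence detections as well; for visual inspection a higher threshold is normally more appropriate. The following sketch shows how the number of kept detections changes with the threshold. The detections are made-up values in the x1, y1, x2, y2, score layout, and filter_by_score is a hypothetical helper rather than code from demo.py.

```python
import numpy as np

def filter_by_score(det, threshold):
    """Keep rows whose score (column 4) is at least the threshold."""
    return det[det[:, 4] >= threshold]

# Made-up detections: x1, y1, x2, y2, score.
det = np.array([
    [10.0, 10.0, 60.0, 70.0, 0.95],
    [200.0, 40.0, 250.0, 100.0, 0.40],
    [300.0, 300.0, 320.0, 325.0, 0.08],
    [5.0, 400.0, 15.0, 412.0, 0.02],
])

counts = {t: len(filter_by_score(det, t)) for t in (0.01, 0.1, 0.5)}
print(counts)
```

Passing a larger value through --visual_threshold should have the same effect on the plotted results.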

Lightweight Face Detector?

This model is too big. Please release something smaller.

Why is the FEM different from the paper?

What is the inference time for one image?

Running demo.py on a P100 takes 6-10 s -_-!

RuntimeError: CUDA out of memory

ubuntu16.04+cuda10+pytorch1.1.1+NVIDIA2080

I get the following error: "RuntimeError: CUDA out of memory. Tried to allocate 52.00 MiB (GPU 0; 7.76 GiB total capacity; 5.63 GiB already allocated; 29.69 MiB free; 82.78 MiB cached)"

Thanks

Is the detection time per image better than MTCNN's? We work on tracking and use MTCNN for detection, but it is a bit slow.

about training

Hi,
Is there no training code?

Problem downloading the trained model

Thanks for sharing your work.

I think there is a problem with the trained model link. It opens this website, https://share.weiyun.com/567x0xQ, but the page is blank and nothing happens. Could you please check?

Thanks

Can I use max-in-out together with sigmoid focal loss?

Can I know which of the options in `widerface_640` config are toggled during training?

Specifically, I am referring to this

widerface_640 = {
    'num_classes': 2,

     #'lr_steps': (80000, 100000, 120000),
     #'max_iter': 120000,
     'lr_steps': (40000, 50000, 60000),
     'max_iter': 60000,

    'feature_maps': [160, 80, 40, 20, 10, 5],
    'min_dim': 640,

    'steps': [4, 8, 16, 32, 64, 128],   # stride 
    
    'variance': [0.1, 0.2],
    'clip': True,  # make default box in [0,1]
    'name': 'WIDERFace',
    'l2norm_scale': [10, 8, 5],
    'base': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'C', 512, 512, 512, 'M', 512, 512, 512] , 
    'extras': [256, 'S', 512, 128, 'S', 256],
    
    'mbox': [1, 1, 1, 1, 1, 1] , 
    #'mbox': [2, 2, 2, 2, 2, 2],
    #'mbox': [4, 4, 4, 4, 4, 4],
    'min_sizes': [16, 32, 64, 128, 256, 512],
    'max_sizes': [],
    #'max_sizes': [8, 16, 32, 64, 128, 256],
    #'aspect_ratios': [ [],[],[],[],[],[] ],   # [1,2]  default 1
    'aspect_ratios': [ [1.5],[1.5],[1.5],[1.5],[1.5],[1.5] ],   # [1,2]  default 1
    
    'backbone': 'resnet152' , # vgg, resnet, detnet, resnet50
    'feature_pyramid_network':True ,
    'bottom_up_path': False ,
    'feature_enhance_module': True ,
    'max_in_out': True , 
    'focal_loss': False ,
    'progressive_anchor': True ,
    'refinedet': False ,   
    'max_out': False , 
    'anchor_compensation': False , 
    'data_anchor_sampling': False ,
   
    'overlap_thresh' : [0.4] ,
    'negpos_ratio':3 , 
    # test
    'nms_thresh':0.3 ,
    'conf_thresh':0.01 ,
    'num_thresh':5000 ,
}

Which of them are true during training?
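Independently of which flags were enabled, the geometric entries of the config can be cross-checked against each other. The sketch below copies values from the config above and verifies that feature_maps equals min_dim divided by each stride in steps. The anchor count assumes that mbox is the number of anchors per feature-map location, as in SSD-style configs; that reading is an assumption, not something the config states.

```python
# Values copied from the widerface_640 config quoted above.
min_dim = 640
steps = [4, 8, 16, 32, 64, 128]            # stride of each detection layer
feature_maps = [160, 80, 40, 20, 10, 5]
min_sizes = [16, 32, 64, 128, 256, 512]
mbox = [1, 1, 1, 1, 1, 1]

# Each feature map side should be the input size divided by the stride.
derived_maps = [min_dim // s for s in steps]

# Ratio of anchor size to stride on every layer.
size_to_stride = [m // s for m, s in zip(min_sizes, steps)]

# Assumption: mbox[i] anchors per location on layer i.
total_anchors = sum(f * f * m for f, m in zip(feature_maps, mbox))

print(derived_maps == feature_maps, size_to_stride, total_anchors)
```

Every layer uses an anchor size of four times its stride, and under the stated assumption a 640x640 input yields 34125 anchors per shot.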

training or predicting on lower model GPU (not Tesla)

I have access to a 1080 or 1070, would it be possible to train on my GPU?

If not, can I at least use the provided model (parameters) and make predictions on my GPU?

about data augmentation

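I am trying to train the model.
I found that during data augmentation the cropped region is sometimes not square, so the subsequent resize (640*640) changes the aspect ratio of the image and therefore of the faces in it.
Will this affect the final result?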
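The size of the distortion is easy to quantify. The sketch below computes the horizontal and vertical scale factors applied when a crop is resized to a square, and the resulting aspect ratio of a face inside it. The function names and the 800x400 crop are illustrative assumptions, not values from the training code.

```python
def resize_scales(crop_w, crop_h, out_size=640):
    """Horizontal and vertical scale factors for a resize to out_size x out_size."""
    return out_size / crop_w, out_size / crop_h

def face_aspect_after_resize(face_w, face_h, crop_w, crop_h, out_size=640):
    """Width/height ratio of a face box after the crop is resized to a square."""
    sx, sy = resize_scales(crop_w, crop_h, out_size)
    return (face_w * sx) / (face_h * sy)

# Illustrative case: a square 40x40 face inside an 800x400 crop.
sx, sy = resize_scales(800, 400)
aspect = face_aspect_after_resize(40, 40, 800, 400)
print(sx, sy, aspect)
```

In this case the face is squeezed to half its original width-to-height ratio. Whether that hurts or acts as additional augmentation depends on how far from square the sampled crops are allowed to be.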

A rather old question

Hello, have you run the SSD-DEEPSORT-TF code before? The link is https://github.com/search?q=ssd-deepsort. I get: File "ssd_deepSort.py", line 234, in
f = create_box_encoder(args.reID_model, batch_size=1, loss_mode=args.loss_mode)
TypeError: create_box_encoder() got an unexpected keyword argument 'loss_mode'
How did you solve this? I have just started and would appreciate your help.

Is it possible to use smaller GPU for inference?

I read about you training 8 images in a batch on P40. Is it possible to use the code with GTX 1080TI (12GB) with smaller batch size?