cleardusk / 3DDFA_V2
The official PyTorch implementation of Towards Fast, Accurate and Stable 3D Dense Face Alignment, ECCV 2020.
License: MIT License
Is it possible to convert the result to a mesh file with texture? Thank you.
Hi!
I am trying to calculate the NME on the AFLW2000-3D dataset, but I cannot find the face bounding box annotations. The "roi" information provided with AFLW2000-3D looks strange; I don't think it is the face bounding box.
Could you tell me how to get the bounding box annotations for the AFLW2000-3D dataset?
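For reference, a common workaround (my assumption, not an official annotation) is to derive the face box from the 68 ground-truth landmarks stored in each AFLW2000-3D .mat file:

```python
import scipy.io as sio

def bbox_from_landmarks(mat_path):
    pt3d = sio.loadmat(mat_path)['pt3d_68']      # (3, 68): rows are x, y, z
    x, y = pt3d[0], pt3d[1]
    return [x.min(), y.min(), x.max(), y.max()]  # [x1, y1, x2, y2]
```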
I'm curious about the variable that stores the facial landmarks. Can I use this code for face recognition?
Can you tell me the name of the .py file that handles the face landmarks?
First of all, thanks for such excellent work. Referring to the 3DDFA_V1 work, plus training data I synthesized myself, the test accuracy of my model has improved greatly. The test accuracy is as follows:
Using the MobileNetV2 model, the training method is the same as 3DDFA_V1, with 60 shape parameters and 29 expression parameters. If a larger model is used, the accuracy improves further. Stability in actual tests has also improved greatly.
Therefore, I look forward to the open-sourcing of the 3DDFA_V2 training algorithms. Thank you very much.
What is the input shape of the models, and which folder contains the model parameter files?
Thank you.
Hi,
Thank you for sharing your great work. I have a question about how to render my own .ply models (which have no triangle faces) into depth face maps. Thanks for your help!
Wonderful project! I want to know how to get the orthographic projection matrix Pr as stated in the paper.
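For reference, the paper defines the weak-perspective projection as V(p) = f · Pr · R · S + t2d, where Pr simply drops the z coordinate. A minimal numpy sketch (variable names are mine, not the repo's):

```python
import numpy as np

# Orthographic projection matrix from the paper: keeps x and y, drops z.
Pr = np.array([[1., 0., 0.],
               [0., 1., 0.]])

def project(S, f, R, t2d):
    """S: (3, N) vertices; f: scale; R: (3, 3) rotation; t2d: (2, 1) shift."""
    return f * Pr @ R @ S + t2d   # (2, N) image-plane coordinates
```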
I have benchmarked the onnxruntime library and found that its latency (with MobileNet, in our case) is rather small. However, my personal time is limited, so I hope anyone interested in this repo can contribute by adding onnxruntime inference : )
The onnxruntime tutorial is here.
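For anyone picking this up, a minimal onnxruntime inference sketch might look like the following; the model path and the 120x120 input size are assumptions based on the repo's default crop size, not a confirmed export.

```python
import numpy as np
import onnxruntime as ort

# Hypothetical path to an exported MobileNet model.
sess = ort.InferenceSession('weights/mb1_120x120.onnx',
                            providers=['CPUExecutionProvider'])
name = sess.get_inputs()[0].name
x = np.random.rand(1, 3, 120, 120).astype(np.float32)  # dummy face crop
params = sess.run(None, {name: x})[0]                  # regressed 3DMM params
print(params.shape)
```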
Hi, I implemented meta-joint training and found that VDC dominates in the early stage, because the VDC loss value converges from about 493 to 200 very quickly. This is the opposite of the paper's conclusion that fWPDC dominates in the early stage and VDC guides in the late stage. Any suggestions? @cleardusk
Hello, I am a student starting out in machine learning. I am trying to run the uploaded code with Python installed via Anaconda.
At the `sh ./build.sh` step, I cannot use sh because I am on Windows.
If this code can be run on Windows, I would appreciate it if you could tell me how.
Using BlazeFace (https://github.com/hollance/BlazeFace-PyTorch) for face detection would reduce the face-detection inference time. Using OpenCV for inference could also be better. Thanks for the great work!
Hello there! Thank you for your excellent work. I was curious whether you have any plans to provide weights for ResNet?
File "mio5_utils.pyx", line 548, in scipy.io.matlab.mio5_utils.VarReader5.read_full_tag File "mio5_utils.pyx", line 556, in scipy.io.matlab.mio5_utils.VarReader5.cread_full_tag File "streams.pyx", line 176, in scipy.io.matlab.streams.ZlibInputStream.read_into File "streams.pyx", line 163, in scipy.io.matlab.streams.ZlibInputStream._fill_buffer zlib.error: Error -2 while decompressing data: inconsistent stream state
When I run demo.py, an error occurs on the line `from utils.uv import uv_tex`. I searched Baidu and Google but found nothing. Has anyone else hit this? Please share.
Thank you for your great work. I'm trying to write the training code and I'm stuck on the base parameters of the BFM model. So the question is:
are the BFM model parameters u_base, w_shp_base, and w_exp_base from the original BFM, or did you create them yourself?
Thank you very much.
platform: macOS Catalina 10.15.7 (19H2)
gcc version:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/c++/4.2.1
Apple clang version 12.0.0 (clang-1200.0.32.27)
Target: x86_64-apple-darwin19.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
(pytorch) quanhaoguo@QuanhaodeMacBook-Pro 3DDFA_V2 % sh ./build.sh
running build_ext
skipping 'nms/cpu_nms.c' Cython extension (up-to-date)
running build_ext
skipping 'lib/rasterize.cpp' Cython extension (up-to-date)
building 'Sim3DR_Cython' extension
gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/quanhaoguo/anaconda3/envs/pytorch/include -arch x86_64 -I/Users/quanhaoguo/anaconda3/envs/pytorch/include -arch x86_64 -I/Users/quanhaoguo/anaconda3/envs/pytorch/lib/python3.6/site-packages/numpy/core/include -I/Users/quanhaoguo/anaconda3/envs/pytorch/include/python3.6m -c lib/rasterize.cpp -o build/temp.macosx-10.7-x86_64-3.6/lib/rasterize.o -std=c++11
clang: warning: include path for libstdc++ headers not found; pass '-stdlib=libc++' on the command line to use the libc++ standard library instead [-Wstdlibcxx-not-found]
In file included from lib/rasterize.cpp:624:
In file included from /Users/quanhaoguo/anaconda3/envs/pytorch/lib/python3.6/site-packages/numpy/core/include/numpy/arrayobject.h:4:
In file included from /Users/quanhaoguo/anaconda3/envs/pytorch/lib/python3.6/site-packages/numpy/core/include/numpy/ndarrayobject.h:12:
In file included from /Users/quanhaoguo/anaconda3/envs/pytorch/lib/python3.6/site-packages/numpy/core/include/numpy/ndarraytypes.h:1822:
/Users/quanhaoguo/anaconda3/envs/pytorch/lib/python3.6/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning:
"Using deprecated NumPy API, disable it with " "#define
NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-W#warnings]
#warning "Using deprecated NumPy API, disable it with " \
^
In file included from lib/rasterize.cpp:629:
lib/rasterize.h:5:10: fatal error: 'cmath' file not found
#include "cmath"
^~~~~~~
1 warning and 1 error generated.
error: command 'gcc' failed with exit status 1
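A possible workaround, judging only from the clang warning above (so an assumption, not a verified fix): export '-stdlib=libc++' via CFLAGS, which distutils appends to its compile command, then re-run the build.

```python
import os
import subprocess

# Re-run the build with the flag clang suggested in the warning above.
env = dict(os.environ, CFLAGS='-stdlib=libc++')
subprocess.run(['sh', './build.sh'], env=env, check=True)
```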
Hello, thanks for your excellent work!
The NME on different datasets is an important metric. For a fair comparison, could you share the code for calculating the NME? Or is there any official code to compute the NME metric, and the visibility vector shown in "Pose-Invariant 3D Face Alignment" (ICCV 2015)?
Looking forward to your reply. Good luck!
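In case it helps, here is a minimal sketch of the commonly used AFLW2000-3D protocol (my reading of the literature, not the authors' official script): the per-image mean landmark error, normalized by the square root of the ground-truth bounding box area.

```python
import numpy as np

def nme(pred, gt):
    """pred, gt: (num_images, 68, 2) arrays of 2D landmarks."""
    mins, maxs = gt.min(axis=1), gt.max(axis=1)           # per-image bbox
    norm = np.sqrt(np.prod(maxs - mins, axis=1))          # sqrt(w * h)
    err = np.linalg.norm(pred - gt, axis=2).mean(axis=1)  # mean point error
    return (err / norm).mean()
```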
Thank you for your amazing work~
One question: would you release your training code?
Thanks for your great work!
When I read your released code at this place:
Line 85 in 48002b4
I could not understand the purpose of this operation. Of course, the face image should be square.
But I guess this is done because the face detector tends to detect the upper part of the face, so you move the bounding box down; is that right? By the way, were the hyper-parameters in this function, like 0.14 and 1.58, chosen empirically?
Looking forward to your reply!
I know I can use `python3 demo_video.py -f examples/inputs/videos/214.avi --opt 3d` to render dense landmarks, but how can I get the 3D dense landmarks alone, without rendering? I am planning to use them for a facial recognition embedding.
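A sketch of one way to get them, assuming the tddfa, img, and boxes objects set up in demo.py: recon_vers with dense_flag=True returns the dense vertices directly, no rendering involved (the .npy export is my own addition).

```python
import numpy as np

param_lst, roi_box_lst = tddfa(img, boxes)   # regress 3DMM parameters
ver_lst = tddfa.recon_vers(param_lst, roi_box_lst, dense_flag=True)
np.save('dense_landmarks.npy', ver_lst[0])   # (3, N) vertices of the 1st face
```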
Is there any way to improve the accuracy of the eye key points?
Originally posted by @duchengyao in #5 (comment)
Thanks!
Describe the bug
Extracting UV textures for multiple frames of a video gives a black image as output after the 1st frame. The output becomes greyish after the 1st frame and then eventually becomes black.
To Reproduce
Add the line
`uv_tex(img, ver_lst, tddfa.tri, show_flag=args.show_flag, wfp=wfp)`
at line 131 of demo_video_smooth.py, below
`elif args.opt == '3d':`
Expected behavior
UV textures are produced for all images (frames) in the video.
(1) If RetinaFace is used for face detection, it is more accurate.
(2) When the head is rotated beyond a certain angle, the key points are not accurate.
(3) Pitch, yaw, and roll angles are required for head pose estimation.
I achieved better results here: https://github.com/WIKI2020/FacePose_pytorch
Hi, the model is very accurate for face shapes.
However, it seems to have only 10 parameters for expressions.
Eye and eyebrow motions are not well detected; mouth expressions are good.
How can this be improved? Do I need to train a model myself with more expression parameters?
Thank you very much for your reply.
In the red circle shown above, why do you divide N by 3?
N is the number of points; T(:, 4) has three elements, representing the displacements of the coordinates in the three dimensions.
If we take the first predicted displacement as Tx, and its corresponding ground-truth displacement as Tgx, then the resulting difference in the reconstructed 3D shape should be:
||(Tx − Tgx, Tx − Tgx, ..., Tx − Tgx)|| = |Tx − Tgx| · √N
Is there any mistake in my derivation above?
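For what it's worth, the right-hand side of that identity is easy to confirm numerically:

```python
import numpy as np

# A vector of N equal entries d has L2 norm |d| * sqrt(N).
N, d = 1000, 0.37
v = np.full(N, d)
assert np.isclose(np.linalg.norm(v), abs(d) * np.sqrt(N))
```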
Thanks for releasing the code.
When I run `python3 demo.py -f examples/inputs/emma.jpg`, I get:
'Namespace' object has no attribute 'dense_flag'
I found this is caused by this line in demo.py (https://github.com/cleardusk/3DDFA_V2/blob/master/demo.py#L44):
parser.add_argument('--dense_flg', default='true', type=str2bool, help='whether reconstructing dense')
but in line 32 (https://github.com/cleardusk/3DDFA_V2/blob/master/demo.py#L32):
ver_lst = tddfa.recon_vers(param_lst, roi_box_lst, dense_flag=args.dense_flag)
Obviously, args.dense_flag does not match '--dense_flg'.
Maybe this should be fixed.
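The one-line fix would be to rename the option so it matches the attribute read at line 32:

```python
parser.add_argument('--dense_flag', default='true', type=str2bool,
                    help='whether reconstructing dense')
```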
Also, along with the training instructions, can you add a parameter to train with an input image size different from the default 120x120?
Compared to the previous version of your work, 3DDFA, 3DDFA_V2's structure is much simpler but achieves better results. So I wonder if the meta-joint loss is what enables MobileNet to outperform previous works. I would like to know your opinion on applying these methods (lookahead, combining different losses) to other tasks.
Hello, I would like to know whether face recognition is possible using this code. Can you tell me which file and variable store the face landmarks?
I am trying to perform face recognition by feeding the landmark variable into a face recognition pipeline. Is that possible?
Hey, congrats on this beautiful work; it's very interesting.
I was wondering if there is a way to do the following: ideally, I would like to have all the 3D landmark coordinates aligned to the same reference, for example, expressed as the coordinates of a frontal face where the proportions of the face are all similar.
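If it helps, here is a minimal sketch of one way to do this (not part of 3DDFA_V2): similarity-align (Procrustes) each set of 3D landmarks onto an assumed frontal reference template `ref`, e.g. the landmarks of the mean face shape.

```python
import numpy as np

def align_to_reference(pts, ref):
    """Similarity-align pts (N, 3) onto ref (N, 3); returns aligned points."""
    mu_p, mu_r = pts.mean(0), ref.mean(0)
    p, r = pts - mu_p, ref - mu_r                  # center both point sets
    U, S, Vt = np.linalg.svd(p.T @ r)              # SVD of cross-covariance
    d = np.sign(np.linalg.det(U @ Vt))             # guard against reflection
    D = np.diag([1.0, 1.0, d])
    R = U @ D @ Vt                                 # optimal rotation
    s = (S * D.diagonal()).sum() / (p ** 2).sum()  # optimal scale
    return s * p @ R + mu_r
```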
Hello, I am a student who wants to do facial recognition. I'm trying to recognize a face using only the face coordinate values, but it doesn't work. I want to know what the two variables mean.
How can I convert a .obj file to a .jpg?
The 2D dense data seems pretty generic rather than specific to each person's face, so I doubt its usefulness for face recognition.
Can 2D dense landmark data be reliably used for face recognition? Would the Euclidean distances differ between different people and match for the same person?
Hi, and thanks for the great project.
The previous version of the code could output PNCC projections. Are you planning to add this to the new codebase?
Thank you!
When I run the command `python3 demo_webcam_smooth.py --onnx`, can anyone solve this problem?
I want to know: once I have obtained the 3D face using 3DDFA, how can I generate the whole head, without hair?
Does anyone have a good idea? Thanks a lot!
The 3d option in demo_webcam_smooth.py only displays hollow green circles for all landmarks instead of the grey overlay. The 2d_dense option displays the same, but with filled green circles for all landmarks.
Hi, this project is really very useful for several downstream tasks.
Currently, I'm using 3DDFA_V2 to reconstruct some talking faces from the RAVDESS dataset.
This is a very clean in-lab dataset, with high-resolution heads against a white background.
However, the reconstruction accuracy does not seem good; several landmarks on the lips are not aligned.
Here are some examples
this is the original image
this is the reconstructed image
Obviously, the lips in the original image are closed, but in the reconstructed image they are open.
I'm wondering whether I should adjust some parameters when doing 3D reconstruction on videos?
If the code runs on the CPU, how can I make it use CUDA?
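As a generic PyTorch pattern (whether TDDFA exposes a switch for this is a separate question, so `model` and `inp` below are hypothetical), running on CUDA means moving both the model and its inputs:

```python
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)   # a hypothetical nn.Module
inp = inp.to(device)       # ...and its input tensor
with torch.no_grad():
    out = model(inp)
```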
At the line `xx1 = np.maximum(x1[i], x1[order[1:]])` in FaceBoxes\utils\nms\py_cpu_nms.py (shown below), it throws this error:
```python
import numpy as np

def py_cpu_nms(dets, thresh):
    """Pure Python NMS baseline."""
    # dets: (num_boxes, 5) array, rows of [x1, y1, x2, y2, score]
    x1 = dets[:, 0]
    y1 = dets[:, 1]
    x2 = dets[:, 2]
    y2 = dets[:, 3]
    scores = dets[:, 4]

    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]  # indices sorted by descending score

    keep = []
    while order.size > 0:
        i = order[0]                # highest-scoring remaining box
        keep.append(i)
        # Intersection of box i with all remaining boxes
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])

        w = np.maximum(0.0, xx2 - xx1 + 1)
        h = np.maximum(0.0, yy2 - yy1 + 1)
        inter = w * h
        ovr = inter / (areas[i] + areas[order[1:]] - inter)  # IoU

        # Keep only boxes whose IoU with box i is below the threshold
        inds = np.where(ovr <= thresh)[0]
        order = order[inds + 1]

    return keep
```
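For reference, `dets` must be a 2-D float array of shape (num_boxes, 5) with rows [x1, y1, x2, y2, score]; a 1-D or empty array breaks the slicing above. A quick usage check:

```python
dets = np.array([[10, 10, 60, 60, 0.9],
                 [12, 12, 58, 58, 0.8],
                 [100, 100, 150, 150, 0.7]], dtype=np.float32)
print(py_cpu_nms(dets, thresh=0.5))   # -> [0, 2]; the second box is suppressed
```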
Hey guys, I have succeeded on both macOS and Windows 10. See my blog, https://blog.csdn.net/qq_39567427/article/details/111390051, for more information. I think it's not easy to succeed on both macOS and Windows 10.
I ran the demo and found that the landmark detection on video is not very stable either, and the demo with smoothing filtering localizes inaccurately when the eyes blink. It is not even as stable as my own landmark detection algorithm, which I developed with only 10M of computation and 106 landmarks.
How can I get a 3D face with no eyes?
Given a face photo in a large pose, how can I get the frontal face picture using this 3D model?
And has anyone implemented this function?
Hi, I wonder how to use the codebase to do face frontalization, i.e., converting 2D side faces to frontal faces. Thanks a lot!
I am looking for the training details of this model, but they are not provided with the codebase. I referred to 3DDFA, but the results are not good. Could you please provide the training details?
Thanks,
Hi,
Thanks for sharing the great work!
Here I have a few questions:
In the old 3DDFA (3DDFA_V1), the reconstructed vertices cover the ear and neck regions, but these regions are excluded in the new 3DDFA_V2. However, in some applications we would like the details around those regions; is there an easy way to adapt the new 3DDFA_V2 model to cover the ears and neck?
In the old 3DDFA paper (CVPR version), the authors visualized the visibility of each landmark. I am wondering how to get the landmark visibility from the reconstructed vertices; could you please give me some hints about it?
Looking forward to your reply!
Best regards.