hongsukchoi / Pose2Mesh_RELEASE
Official PyTorch implementation of "Pose2Mesh: Graph Convolutional Network for 3D Human Pose and Mesh Recovery from a 2D Human Pose", ECCV 2020
License: MIT License
Which detector should I use that provides eyes, nose, and pelvis like PoseNet in the introduction picture?
Can I use the center of both hips as the pelvis coordinates?
Hi. Thanks for sharing your code.
I have a question about your code, specifically the part that loads multiple datasets.
class MultipleDatasets(Dataset):
    ...
    def __getitem__(self, index):
        if self.make_same_len:
            db_idx = random.randint(0, self.db_num - 1)  # uniform sampling
            data_idx = index % self.max_db_data_num
            if data_idx >= len(self.dbs[db_idx]) * (self.max_db_data_num // len(self.dbs[db_idx])):  # last batch: random sampling
                data_idx = random.randint(0, len(self.dbs[db_idx]) - 1)
            else:  # before last batch: use modular
                data_idx = data_idx % len(self.dbs[db_idx])
        ...
        return self.dbs[db_idx][data_idx]
if data_idx >= len(self.dbs[db_idx]) * (self.max_db_data_num // len(self.dbs[db_idx])):  # last batch: random sampling
    data_idx = random.randint(0, len(self.dbs[db_idx]) - 1)
else:  # before last batch: use modular
    data_idx = data_idx % len(self.dbs[db_idx])
I cannot understand these four lines when we get an item from this class.
To my understanding, data_idx should be in [0, self.max_db_data_num * self.db_num - 1] if self.make_same_len is true.
But the length of self.dbs[db_idx] can be smaller than data_idx, in which case you could simply take the modulo to get a valid data_idx.
So why do we need a condition like this:
if data_idx >= len(self.dbs[db_idx]) * (self.max_db_data_num // len(self.dbs[db_idx]))
And why should data_idx be drawn by random sampling in this case?
This may be a trivial question, but I want to know your purpose! Thank you :)
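For illustration, a minimal sketch (toy numbers, not the authors' code) of how the two branches above map a global index: the tail indices that cannot complete another full modular pass are re-drawn uniformly, so the front of a small dataset is not over-represented.

import random

db_len = 7                                # toy dataset length
max_db_data_num = 20                      # length of the largest dataset
full_rounds = max_db_data_num // db_len   # 2 complete modular passes cover indices 0..13
for index in range(max_db_data_num):
    if index >= db_len * full_rounds:     # leftover tail: indices 14..19
        data_idx = random.randint(0, db_len - 1)  # uniform re-sampling
    else:                                 # complete passes: plain wrap-around
        data_idx = index % db_len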
Hello! This work is very meaningful! I want to run inference on 720p video. What FPS can it achieve?
Hi,
I noticed something peculiar in the augmentation part, and hence I'd be grateful if you could clarify it for me. In your dataset class (for example in Human36M/dataset.py), when you do rotation augmentation for the image (the variable rot, in degrees, in your code), you're not doing the same to your mesh data (mesh_cam on line 352). Instead, only the 2D and 3D joints (joint_img and smpl_joint_cam) are augmented as follows:
"joint_img, trans = j2d_processing(joint_img.copy(), (cfg.MODEL.input_shape[1], cfg.MODEL.input_shape[0]), bbox, rot, 0, None)" on line 368
"joint_cam = j3d_processing(joint_cam, rot, flip, self.flip_pairs)" on line 373
Shouldn't the augmentation also be done for mesh_cam?
Thanks
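For reference, a hypothetical sketch of applying the same in-plane rotation to the mesh vertices, mirroring what a j3d_processing-style augmentation does to the 3D joints (assumptions: rotation about the camera z-axis by rot degrees; the sign convention may differ from the repo's code):

import numpy as np

def rotate_points_z(points, rot_deg):
    # Rotate an (N, 3) array of camera-space points about the z-axis.
    r = np.deg2rad(rot_deg)
    rot_mat = np.array([[np.cos(r), -np.sin(r), 0.0],
                        [np.sin(r),  np.cos(r), 0.0],
                        [0.0,        0.0,       1.0]], dtype=np.float32)
    return points @ rot_mat.T

mesh_cam = np.random.rand(6890, 3).astype(np.float32)  # toy mesh vertices
mesh_cam_aug = rotate_points_z(mesh_cam, rot_deg=30.0)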
Following #32, I'm trying to visualize the ground truth and the test result on 3DPW using your render_mesh function. It seems that even the ground-truth mesh cannot fit the image well. I wonder if this is normal, since the ground truth is obtained using SMPLify-X. Or should I follow the pipeline in demo/run.py and let the project_net learn the camera parameters?
# get camera parameters
project_net = models.project_net.get_model(crop_size=virtual_crop_size).cuda()
joint_input = coco_joint_img
out = optimize_cam_param(project_net, joint_input, crop_size=virtual_crop_size)
# vis mesh
color = colorsys.hsv_to_rgb(np.random.rand(), 0.5, 1.0)
orig_img = render(out, orig_height, orig_width, orig_img, mesh_model.face, color)
cv2.imwrite(output_path + f'{img_name[:-4]}_mesh_{idx}.png', orig_img)
def render_mesh(img, mesh, face, cam_param):
    # mesh
    mesh = trimesh.Trimesh(mesh, face)
    rot = trimesh.transformations.rotation_matrix(np.radians(180), [1, 0, 0])
    mesh.apply_transform(rot)
    material = pyrender.MetallicRoughnessMaterial(metallicFactor=0.0, alphaMode='OPAQUE', baseColorFactor=(1.0, 1.0, 0.9, 1.0))
    mesh = pyrender.Mesh.from_trimesh(mesh, material=material, smooth=False)
    scene = pyrender.Scene(ambient_light=(0.3, 0.3, 0.3))
    scene.add(mesh, 'mesh')

    focal, princpt = cam_param['focal'], cam_param['princpt']
    camera = pyrender.IntrinsicsCamera(fx=focal[0], fy=focal[1], cx=princpt[0], cy=princpt[1])
    scene.add(camera)

    # renderer
    renderer = pyrender.OffscreenRenderer(viewport_width=img.shape[1], viewport_height=img.shape[0], point_size=1.0)

    # light
    light = pyrender.DirectionalLight(color=[1.0, 1.0, 1.0], intensity=0.8)
    light_pose = np.eye(4)
    light_pose[:3, 3] = np.array([0, -1, 1])
    scene.add(light, pose=light_pose)
    light_pose[:3, 3] = np.array([0, 1, 1])
    scene.add(light, pose=light_pose)
    light_pose[:3, 3] = np.array([1, 1, 2])
    scene.add(light, pose=light_pose)

    # render
    rgb, depth = renderer.render(scene, flags=pyrender.RenderFlags.RGBA)
    rgb = rgb[:, :, :3].astype(np.float32)
    valid_mask = (depth > 0)[:, :, None]

    # save to image
    img = rgb * valid_mask + img * (1 - valid_mask)
    return img
Hi, I replaced the 3D keypoints with the ones I got from VideoPose3D, and the generated mesh looks really weird. Is it because of some normalization that you applied to the 3D keypoints? Thanks.
Hi, the demo can generate a .obj file, but is it possible to also generate a .pkl file in the SMPL format?
Please add a Colab demo.
When I tried to train the PoseNet on the FreiHAND dataset, I got the following errors:
File "/home/nankaingy/data/code/pose-shape/Pose2Mesh_RELEASE/main/../data/FreiHAND/dataset.py", line 202, in getitem
mano_mesh_cam, mano_joint_cam = self.get_mano_coord(mano_param, cam_param)
File "/home/nankaingy/data/code/pose-shape/Pose2Mesh_RELEASE/main/../data/FreiHAND/dataset.py", line 172, in get_mano_coord
R, t = np.array(cam_param['R'], dtype=np.float32).reshape(3, 3), np.array(cam_param['t'], dtype=np.float32).reshape(3)
KeyError: 'R'
The 'cam_param' argument is loaded from 'freihand_train_data.json', and the keys of the 'cam_param' dict in the JSON file only contain 'focal' and 'princpt', without 'R' or 't', as shown below:
freihand_$SPLIT_data.json
|-- db_idx: {
    'cam_param': {
        'focal': focal lengths in x- and y-axis (intrinsic, pixel unit),
        'princpt': principal point coordinates in x- and y-axis (intrinsic, pixel unit)
    },
    'mano_param': {
        'pose': 48-dimensional MANO pose vector (theta),
        'shape': 10-dimensional MANO shape vector (beta)
    },
    'joint_3d': 21x3 joint coordinates (MANO joint set, meter unit) from mesh,
    'scale': ground-truth scale provided by FreiHAND
}
How can I fix this?
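One hypothetical workaround (not from the repo) is to fall back to an identity rotation and zero translation in get_mano_coord when the annotation carries no extrinsics:

import numpy as np

cam_param = {'focal': [1500.0, 1500.0], 'princpt': [112.0, 112.0]}  # FreiHAND-style dict
R = (np.array(cam_param['R'], dtype=np.float32).reshape(3, 3)
     if 'R' in cam_param else np.eye(3, dtype=np.float32))      # fall back to identity
t = (np.array(cam_param['t'], dtype=np.float32).reshape(3)
     if 't' in cam_param else np.zeros(3, dtype=np.float32))    # fall back to zero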
Hi, is there code to draw the human mesh on top of the original RGB picture, as displayed in the paper?
Hello, I recently read your other paper, 3DCrowdNet. I was greatly inspired and wanted to ask when its code will be open-sourced. Thank you, and I look forward to your reply.
The performance of Pose2Mesh on Human3.6M:
Training with Human3.6M:
MPJPE: 64.9
PA-MPJPE: 48.0
Training with Human3.6M and COCO:
MPJPE: 67.9
PA-MPJPE: 49.9
Best result:
MPJPE: 64.9
PA-MPJPE: 46.3
As mentioned in the paper, using more datasets to train Pose2Mesh decreases the performance on Human3.6M. I wonder whether the best result on Human3.6M is supposed to come from training with the Human3.6M dataset only; in that case, the best result should be the same as the one trained with Human3.6M. Or should it be trained with Human3.6M + COCO + MuCo? Would you please show me the training settings for Human3.6M?
In the original paper, it is mentioned that the MoVi dataset is used to train PoseNet. However, no relevant parameters regarding training with MoVi are found in asset/yaml/posenet_{input joint set}_train_{dataset list}.yml. I wonder how the MoVi dataset comes into play here. Would you specify?
I am trying to figure out how to create "input.npy" for an arbitrary raw image. Can I use the pretrained PoseNet to do so?
Hello, could you tell me which annotation tool you used for annotating images, and what I have to run afterwards to create the .npy file?
Could you also give a step-by-step tutorial?
Thanks in advance.
Hi @hongsukchoi , superb work and thanks for sharing!
But I have a few questions about the training of PoseNet.
[1] According to Fig. 9 of the supplementary material, Human3.6M and COCO have different definitions of their joint sets, so I am wondering how you combine these two datasets (as shown in Table 9) to train PoseNet.
[2] Besides, when employing an off-the-shelf 2D pose detector, how do you make sure that the input of PoseNet is consistent with the output of the 2D pose detector?
Best
Hi, thank you for putting up and sharing this excellent work. A silly question: I was trying to download the pretrained weights (as final_pth.tar) but had trouble even decompressing them. I'm using a Windows 10 PC and have tried 7z and the tar command on Colab and Jupyter notebooks, but none of them worked (I could decompress other tar files from other sources), so I was wondering what tools I should use to decompress your weight files? Thank you
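A possible explanation (an assumption, not confirmed by the authors): PyTorch checkpoints named *.pth.tar are often not tar archives at all, so they are loaded directly instead of being extracted:

import torch

# Load the checkpoint directly; no tar/7z extraction step should be needed.
checkpoint = torch.load('final_pth.tar', map_location='cpu')
print(checkpoint.keys())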
Hello Choi,
Thanks for the wonderful contribution. I was wondering what the joint order of the parsed data (17, 2) is. You mentioned in another issue that you used Gyeongsik Moon's work for parsing the data, but I'm not sure which project exactly.
Thanks in advance for your response.
Hi,
I trained Pose2Mesh on Human3.6M using "pose2mesh_human36J_train_human36.yml". How can I test this model on 3DPW?
I'm looking forward to hearing from you. Many thanks!
Why can the trans parameter of MANO be replaced by the t of the camera extrinsics?
Hello, I ran demo/run.py and got these outputs:
The right is the input; I used another pose estimation model to get the 2D joint locations.
The left is the output of Pose2Mesh.
The checkpoint is "pose2mesh_cocoJ_gt_train_human36_coco".
Would you tell me where the mistake is?
Looking forward to your reply.
In Pose2Mesh_RELEASE/lib/coarsening.py, line 32 (commit e91bdd6), why is it
L /= lmax * 2
rather than L = 2.0 * L / lmax? Looking forward to your reply.

I trained the PoseNet with 'posenet_manoJ_train_freihand.yml'; why is the MPJPE loss 0.0000?
Hello, thank you for your amazing implementation
Could you provide a Google Colab notebook to show the demo of this repo? I'm sure it would help everyone, it would increase the popularity of your repo as well, and it would be much appreciated.
Thank you
Hi @hongsukchoi, thanks for the great work!
I was wondering how you obtained the global orientation (pose[:3]) for your SURREAL parsed data. The pose in your parsed file is different from the one in the original XXX_info.mat files provided by SURREAL, and I was unable to derive the same values from the raw data, i.e. using the camera extrinsics and intrinsics provided by SURREAL.
Hi,
I used your pretrained model and tested on FreiHAND data as below:
python main/test.py --gpu 0,1,2,3 --cfg ./asset/yaml/pretrained manoJ_test_freihand.yaml
The average error output is 4700. Is that expected?
Hello, thanks for your excellent work!
I started the coarsening process from the SMPL template with 6890 vertices and could get graphs at nine different resolutions, but I have no idea how to get the corresponding vertex coordinates to visualize the coarse mesh models.
Hope for your suggestions. Thanks a lot!
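For what it's worth, a hypothetical sketch (assuming Defferrard-style graclus coarsening, where after the permutation each coarse vertex merges two consecutive fine vertices and fake nodes are padded as NaN): coarse vertex coordinates for visualization can be taken as the mean of each parent's children:

import numpy as np

def coarsen_vertices(verts_perm):
    # verts_perm: (N, 3) fine vertices already reordered by the coarsening
    # permutation, with fake padded nodes set to NaN. Returns (N // 2, 3).
    pairs = verts_perm.reshape(-1, 2, 3)
    return np.nanmean(pairs, axis=1)  # average the two children of each parent

verts = np.random.rand(8, 3)        # toy: 8 permuted fine vertices
coarse = coarsen_vertices(verts)    # (4, 3) coarse vertex coordinates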
Hello, thank you for your amazing implementation
How can we obtain body measurements like the waist, neck-to-hand, chest, etc.?
Thank you
Hi! And thank you for the awesome work!
How would you recommend dealing with missing keypoints? Should the corresponding rows in input.npy be filled with NaNs or 0s?
I am interested in your repo :)
Is there a J_regressor_coco.npy file uploaded?
I can't find the file.
Hello, I notice that pred_mesh has 12288 vertices at line 169 and is then downsampled to 6890 vertices at line 170 (lines 169 to 170 in c3a26ba). Why not let the output have 6890 vertices? I think predicting more vertices makes the network less efficient.
Hi, thanks for your amazing work.
I have trouble using your render.py code to visualize the ground-truth mesh of the Human3.6M dataset. I can get the vertices and faces from your dataloader, but when I try to render them, I can't find the camera parameters that your function expects. I tried using the rotation, translation, focal length, and principal point from the original dataset, but the rendered result is blank.
Could you explain the camera parameters in render.py? How are they related to the camera parameters in the dataset?
Thank you very much
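A hypothetical usage sketch (an assumption, not the authors' answer): render_mesh above expects vertices already in camera coordinates, with 'focal' and 'princpt' in pixels, so world-space vertices would first be transformed by the extrinsics:

import numpy as np

# Hypothetical extrinsics/intrinsics standing in for the dataset's values.
R = np.eye(3, dtype=np.float32)                           # world -> camera rotation
t = np.array([0.0, 0.0, 3.0], dtype=np.float32)           # world -> camera translation
mesh_world = np.random.rand(6890, 3).astype(np.float32)   # toy world-space vertices

mesh_cam = mesh_world @ R.T + t                           # transform into camera coordinates
cam_param = {'focal': (1500.0, 1500.0), 'princpt': (500.0, 500.0)}  # pixels
# rendered = render_mesh(img, mesh_cam, face, cam_param)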
Hi,
I wondered how you map different joint orders in the case of multiple datasets. For example, when training on Human3.6M and COCO with the target joint set configured as the COCO format, how do you map the Human3.6M joint order to the COCO joint order?
Thanks in advance for your response
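As a generic illustration (toy joint names, not the repo's actual joint sets), one common way to map between two joint orders is to match the joint names shared by both sets:

import numpy as np

src_names = ['Pelvis', 'R_Hip', 'R_Knee', 'Neck']   # toy source joint order
dst_names = ['Neck', 'Pelvis', 'R_Knee']            # toy target joint order
mapping = [src_names.index(n) for n in dst_names if n in src_names]

joints_src = np.zeros((len(src_names), 2), dtype=np.float32)
joints_dst = joints_src[mapping]                    # reordered (and reduced) joints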
Hello, thank you for your amazing implementation.
Was this Pose2Mesh_RELEASE repo developed on Windows 10?
Hi. Thanks for the great work! I have a rather elementary doubt. I want to obtain the intermediate meshes produced by mesh coarsening, as you visualized in your paper. The coarsen function takes just the input adjacency matrix and outputs the adjacency matrices corresponding to several coarsening levels. The input adjacency matrix is simply a matrix of 0s and 1s indicating node connectivities. However, the other adjacency matrices look different, i.e. they are not composed of 0s and 1s, and hence I am having trouble understanding them. Could you please help me with how to obtain the intermediate meshes from the adjacency matrices corresponding to the different coarsening levels?
Thanks in advance
Hi! I'd like to apply Pose2Mesh to my own data, which has only OpenPose 25-joint annotations (which seem to be the 17 COCO joints + neck + pelvis + 6 extra joints on the feet, as defined here).
Since Pose2Mesh uses your customized COCO joints (the original 17 + neck + pelvis) for 3DPW training, can I directly apply that pretrained model to my task?
If so, should I simply use the neck and pelvis joints contained in my OpenPose-25 annotations, or ignore them and recompute them using this line of code?
Thanks in advance; I just want to make sure I'm using your model in a strictly correct way (great work, by the way!).
Hello,
In demo/run.py the path to the weights is set as model_chk_path = './experiment/exp_07-07_23:02:27.03/checkpoint'.
I get this error:
Traceback (most recent call last):
File "/home/alexe1ka/my_experiments/Pose2Mesh_RELEASE/demo/../lib/funcs_utils.py", line 134, in load_checkpoint
checkpoint = torch.load(checkpoint_dir, map_location='cuda')
File "/home/alexe1ka/.pyenv/versions/my_env/lib/python3.7/site-packages/torch/serialization.py", line 581, in load
with _open_file_like(f, 'rb') as opened_file:
File "/home/alexe1ka/.pyenv/versions/my_env/lib/python3.7/site-packages/torch/serialization.py", line 230, in _open_file_like
return _open_file(name_or_buffer, mode)
File "/home/alexe1ka/.pyenv/versions/my_env/lib/python3.7/site-packages/torch/serialization.py", line 211, in init
super(_open_file, self).init(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: './experiment/exp_07-07_23:02:27.03/checkpoint/checkpoint0.pth.tar'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/alexe1ka/my_experiments/Pose2Mesh_RELEASE/demo/run.py", line 130, in
model, joint_regressor, joint_num, skeleton, graph_L, graph_perm_reverse = get_joint_setting(mesh_model, joint_category=joint_set)
File "/home/alexe1ka/my_experiments/Pose2Mesh_RELEASE/demo/run.py", line 96, in get_joint_setting
checkpoint = load_checkpoint(load_dir=model_chk_path)
File "/home/alexe1ka/my_experiments/Pose2Mesh_RELEASE/demo/../lib/funcs_utils.py", line 137, in load_checkpoint
raise ValueError("No checkpoint exists!\n", e)
ValueError: ('No checkpoint exists!\n', FileNotFoundError(2, 'No such file or directory'))
Process finished with exit code 1
Where can I get this file?
Hi @hongsukchoi
Thank you for sharing your great work! I'm a beginner in this domain, and I have some questions.
If I want to study the influence of input errors in each 2D joint, which part of the code should I change? For example, suppose only the L_knee has errors in the input 2D joints.
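A hypothetical experiment sketch (the file name and the joint index are assumptions that depend on the joint set in use): perturb a single joint of the demo's 2D input before it is fed to PoseNet:

import numpy as np

joints_2d = np.load('input.npy')   # assumed (J, 2) 2D pose in pixels, as in the demo
lknee_idx = 12                     # hypothetical index of L_knee in your joint set
joints_2d[lknee_idx] += np.random.normal(scale=10.0, size=2)  # add pixel noise
np.save('input_perturbed.npy', joints_2d)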
Hi @hongsukchoi, you have done amazing work, and of course thanks for sharing it with us.
I am very interested in this domain, but I have no idea where to start.
My system runs Windows 10 (yours seems to be Linux) with an NVIDIA GeForce GTX 1050 Ti and a 1 TB HDD. I have Anaconda, PyCharm, and VS2015. Downloading the dataset takes a lot of time and sometimes gets aborted in the middle. Can you advise me how I should proceed?
Hello, thank you for your excellent work!
I am a beginner in 3D reconstruction, and I have some doubts from reading your code:
class OptimzeCamLayer(nn.Module):
    def __init__(self):
        super(OptimzeCamLayer, self).__init__()
        self.img_res = 500 / 2
        self.cam_param = nn.Parameter(torch.rand((1, 3)))

    def forward(self, pose3d):
        output = pose3d[:, :, :2] + self.cam_param[None, :, 1:]
        output = output * self.cam_param[None, :, :1] * self.img_res + self.img_res
        return output

def get_model():
    model = OptimzeCamLayer()
    return model
Why are the camera parameters random, and how can they work correctly?
Hope to get your answer, thank you
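For context, a minimal sketch (an assumption about how optimize_cam_param in demo/run.py uses this layer, not a verbatim copy): the random initialization does not matter, because the three parameters (one scale and two translations) are fitted per image by minimizing the 2D reprojection error:

import torch

pred_pose3d = torch.rand(1, 17, 3).cuda()    # toy 3D pose from PoseNet
target_pose2d = torch.rand(1, 17, 2).cuda()  # toy detected 2D pose

project_net = get_model().cuda()
optimizer = torch.optim.Adam(project_net.parameters(), lr=0.1)
for _ in range(500):
    optimizer.zero_grad()
    proj_2d = project_net(pred_pose3d)             # (1, J, 2) projected joints
    loss = ((proj_2d - target_pose2d) ** 2).mean() # fit scale + 2D translation
    loss.backward()
    optimizer.step()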
I noticed that the parsed data is a little different from the original dataset (e.g. 3DPW). Could you please provide the code used to generate the parsed data?
Hi @hongsukchoi, superb work and thanks for sharing!
I would like to ask how to generate ground-truth SMPL parameters for my own dataset. How do you label SMPL parameters for two-dimensional images?
Hi,
Thank you for the great work. I had a small question about the data. For 3DPW and COCO, which have images with multiple people in them (even after cropping using the bounding-box coordinates provided in your annotations), how did you decide which person to fit the mesh for? I am attaching a cropped example from the 3DPW dataset for reference. I'd be grateful if you could let me know, as I want to find the correspondence between the person and the ground-truth mesh for every image in the COCO/3DPW datasets.
Thank you for sharing your great work, and I hope your other work, 3DCrowdNet, will be published at a top conference. I am curious how you deal with errors in the pose estimates, especially from bottom-up methods such as HigherHRNet, in multi-person conditions.
I am running demo/run.py with a Human3.6M joint set, and I am trying to create a mesh for a specific gender.
In SMPL.get_layer, which is invoked when creating mesh_model in demo/run.py, three different layers are created, one for each gender. However, it then appears that although these gendered layers exist, the neutral layer is used for the face and for the SMPL joint regressor. How can I output a female/male mesh?
Hello, much appreciate your amazing work!
I have one question: how can I generate the *.npy file for the demo to produce the mesh *.obj file?
What's more, suppose I have 3D poses represented in a three-dimensional Cartesian coordinate system (xyz axes), like the image below. How can I convert the original 3D pose to the *.npy file?
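For illustration, a hypothetical sketch (the shape, joint order, and intrinsics are assumptions; check the demo's exact expectations in demo/run.py): project a camera-space 3D pose to 2D with a pinhole model and save it as input.npy:

import numpy as np

joints_3d = np.random.rand(17, 3) + np.array([0.0, 0.0, 3.0])  # toy camera-space pose
fx = fy = 1500.0                     # hypothetical focal length (pixels)
cx, cy = 960.0, 540.0                # hypothetical principal point
u = fx * joints_3d[:, 0] / joints_3d[:, 2] + cx
v = fy * joints_3d[:, 1] / joints_3d[:, 2] + cy
np.save('input.npy', np.stack([u, v], axis=1).astype(np.float32))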
A link in the README file is broken:
"Download basicModel_f_lbs_10_207_0_v1.0.0.pkl, basicModel_m_lbs_10_207_0_v1.0.0.pkl, and basicModel_neutral_lbs_10_207_0_v1.0.0.pkl from here (female & male)."
The "here" link is broken, and now only the neutral model is available.
In the demo, it outputs the mesh and the rendered joints on the 2D image for MANO. Is there a method to output betas & poses instead of the mesh? I have labeled hand joints but need to extract the MANO parameters (betas, poses).
Hi,
I am trying to use your amazing work to estimate a person's volume from the fitted SMPL mesh. I was able to transform the mesh into my camera coordinate system. I calculate the volume of each body region (head, hands, arms, legs, etc.) and sum them up, using the world coordinates of the transformed mesh. A render onto the image matches the person's silhouette.
But I found that different persons with quite different overall meshes all seem to have a similar volume of +-0.05 m^3.
It is crucial for my work to have a relatively good estimate of a person's volume, but it seems that using the SMPL model won't be a good way to do this.
Could you give me ideas? Is this conversion from the SMPL mesh to my camera coordinate system correct? I feel it could be a scaling issue. I hope you can help me out.
BR
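As a sanity check, a hypothetical sketch (assuming a watertight mesh with vertices in meters, which the SMPL template should provide): trimesh computes the enclosed volume directly; a rigid world-to-camera transform preserves it, while a uniform scale s multiplies it by s**3:

import trimesh

mesh = trimesh.creation.icosphere(radius=1.0)  # toy watertight mesh standing in for SMPL
print(mesh.is_watertight, mesh.volume)         # ~4.19 m^3 for a unit sphere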