
google / aistplusplus_api


API to support AIST++ Dataset: https://google.github.io/aistplusplus_dataset

License: Apache License 2.0

Python 100.00%
smpl-model 3d-keypoints aist pose-estimation 3d-reconstruction


aistplusplus_api's Issues

2d keypoint JSON file

How can I convert the 2D keypoint JSON files detected by OpenPose into the same 2D keypoint format as the AIST++ dataset?
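A sketch of such a conversion, assuming AIST++'s 2D keypoints follow the standard COCO-17 joint order (worth verifying against the repo's loader) and that the JSON comes from OpenPose's BODY_25 output format; the function name is mine:

```python
import json
import numpy as np

# Indices of the COCO-17 joints inside OpenPose's BODY_25 layout
# (nose, eyes, ears, shoulders, elbows, wrists, hips, knees, ankles).
BODY25_TO_COCO17 = [0, 16, 15, 18, 17, 5, 2, 6, 3, 7, 4, 12, 9, 13, 10, 14, 11]

def openpose_json_to_coco17(path):
    """Return a (num_people, 17, 3) array of (x, y, score) keypoints."""
    with open(path) as f:
        data = json.load(f)
    people = []
    for person in data.get('people', []):
        kpts = np.asarray(person['pose_keypoints_2d'], dtype=np.float32)
        kpts = kpts.reshape(25, 3)           # BODY_25: 25 x (x, y, confidence)
        people.append(kpts[BODY25_TO_COCO17])
    return np.stack(people) if people else np.zeros((0, 17, 3), np.float32)
```

One file per frame would then be stacked along a new frame axis to mirror the dataset's (frames, joints, 3) layout.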

Incorrect Visualization Results (using SMPL)

I'm very interested in this dataset, so I ran some experiments with an SMPL backbone. The results were quite surprising, so I analyzed the dataset further. My approach was to use run_vis.py to re-run the videos that looked unreasonable with my model. Here are my results:

  • gBR_sFM_c09_d04_mBR4_ch07

with the setting

flags.DEFINE_string(
    'video_name',
    'gBR_sFM_c09_d04_mBR4_ch07',
    'input video name to be visualized.')
flags.DEFINE_enum(
    'mode', 'SMPL', ['2D', '3D', 'SMPL', 'SMPLMesh'],
    'visualize 3D or 2D keypoints, or SMPL joints on image plane.')

[screenshot]

  • gBR_sBM_c05_d04_mBR0_ch08

with the setting

flags.DEFINE_string(
    'video_name',
    'gBR_sBM_c05_d04_mBR0_ch08',
    'input video name to be visualized.')
flags.DEFINE_enum(
    'mode', 'SMPL', ['2D', '3D', 'SMPL', 'SMPLMesh'],
    'visualize 3D or 2D keypoints, or SMPL joints on image plane.')

[screenshot]

  • gJB_sBM_c06_d07_mJB3_ch05

with the setting

flags.DEFINE_string(
    'video_name',
    'gJB_sBM_c06_d07_mJB3_ch05',
    'input video name to be visualized.')
flags.DEFINE_enum(
    'mode', 'SMPLMesh', ['2D', '3D', 'SMPL', 'SMPLMesh'],
    'visualize 3D or 2D keypoints, or SMPL joints on image plane.')

[screenshots]

I'm wondering if anyone else has similar results, or did I make a mistake running the code? I found hundreds of videos like the ones above.

504 Error during downloading

I get a 504 Gateway Time-out error when downloading the videos.
I've tried many approaches without success and need some help.

Small suggestion about introduction

Hey, thanks for your amazing work constructing this dataset. I ran into a small problem when trying to extract 3D joints from the SMPL parameters: the scale didn't seem right. It turns out we need to use "https://github.com/liruilong940607/smplx" instead of the original repo for "python setup.py install" after "pip install smplx". I think it's important to mention that in the main introduction, since it's a key step.

Just a minor suggestion. Hope you don't mind.

Questions about Transformer Network and Model Training

Hi, @liruilong940607
I have some questions about the input and output of transformer network.
[screenshot]

  1. As the paper says, during training the input of the Audio Transformer is the music feature Y with shape (240, 35) and the input of the Motion Transformer is the motion data X with shape (120, 219), so Q, K, and V are all the same, including the query vectors of these transformers' decoders. Is this right?
  2. For the cross-modal transformer, assume I get H(X) with shape (120, 512) and H(Y) with shape (240, 512) after the motion and audio transformers. How are the music and motion features fused? Just concatenated along the sequence dimension, giving features with shape (240 + 120, 512)?
  3. What is the query vector of the cross-modal transformer decoder? How do you get a motion output with 20 frames?
  4. What is the representation of the motion prediction? Is it the same as the input motion data, with 24*3*3 + 3 = 219 dimensions? I think it is not good to regress a 3x3 rotation matrix directly. And for the global translation, do you regress the absolute translation (x, y, z) in 3D space, or just the offset relative to the first frame of the seed motion?

Thanks!
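To make question 2 concrete: since a transformer keeps its input sequence length, H(X) should have 120 rows and H(Y) 240, and the proposed fusion is just a concatenation along the time axis. A numpy sketch of the shapes (whether FACT actually fuses this way is exactly what is being asked):

```python
import numpy as np

h_motion = np.zeros((120, 512))  # H(X): one embedding per motion frame
h_audio = np.zeros((240, 512))   # H(Y): one embedding per audio frame

# Concatenating along the sequence dimension, as proposed in question 2:
fused = np.concatenate([h_motion, h_audio], axis=0)
print(fused.shape)  # (360, 512)
```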

Visualization SMPL

Hi,
Thanks a lot for releasing this dataset.
You provide visualization of keypoints, which is great. However, you do not provide a way to visualize SMPL from each camera viewpoint; would it be possible to incorporate this into the run_vis.py file?
At the moment the SMPL poses are independent of camera viewpoint; how do you make them dependent on a specific camera viewpoint?
Thanks a lot for your help.
Fabien
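Until run_vis.py supports this directly, projecting the SMPL joints with a camera's parameters is only a few lines. This is a generic pinhole-projection sketch, with R, t, K standing in for the per-camera extrinsics and intrinsics in the camera mapping files (names assumed, not the repo's API):

```python
import numpy as np

def project_to_image(points_world, R, t, K):
    """Project (N, 3) world-space joints to (N, 2) pixel coordinates."""
    cam = points_world @ R.T + t      # world -> camera frame
    uv = cam @ K.T                    # apply intrinsics
    return uv[:, :2] / uv[:, 2:3]     # perspective divide

# Toy check: identity camera, a point at depth 2 projects to (0.5, 1.0).
pt = np.array([[1.0, 2.0, 2.0]])
print(project_to_image(pt, np.eye(3), np.zeros(3), np.eye(3)))
```

Lens distortion is ignored here; for the real dataset the stored distortion coefficients would also need to be applied.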

Trouble with the 3D joint rotation data

We noticed that your paper states the dataset contains 3D joint rotation data, but we cannot find it on the web page. Is there any way to download the 3D joint rotation data? We hope to do some research with it.
I look forward to hearing from you.

Projected 2D keypoints from 3D keypoints do not match the image

Hello, when I draw keypoints2d on the image, the result is correct, but when I project keypoint3d onto the image, it is far off.
1. I obtain the camera parameters from the mapping using the video name.
2. I transform the keypoints from world coordinates to 2D image keypoints.
Could you help me figure out where this goes wrong? I'm at my wit's end.

FPS

I downloaded the keypoints3d annotations only, and the documentation says "all annotations are 60 fps". However, when I open them they seem to be 30 fps: for example, a motion with a duration of 1 minute 35 seconds is around 2800 frames, not 5600. Is that right? Should I treat keypoints3d as 30 fps?

Also: if so, and I want to extract audio features at the same fps from the audio, is it right to use a sample rate of 48000 and a hop size of 800?
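The arithmetic behind the second part can be checked directly: the number of audio feature frames per second is sample_rate / hop_size, so matching a target fps just means choosing the hop accordingly (the helper name is mine):

```python
def hop_for_fps(sample_rate, fps):
    """Hop size (in samples) that yields exactly `fps` feature frames/second."""
    assert sample_rate % fps == 0, "choose an fps that divides the sample rate"
    return sample_rate // fps

# 48 kHz audio: hop 800 -> 60 features/s, hop 1600 -> 30 features/s.
print(hop_for_fps(48000, 60))   # 800
print(hop_for_fps(48000, 30))   # 1600
```

So a hop of 800 matches 60 fps annotations; if the keypoints really are 30 fps, a hop of 1600 would be the matching choice.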

How to visualize the generated dance motion with the API?

Hi Ruilong, I'm trying to visualize the test results from the trained MINT model. I ran the demo code below, but the output is a frozen image like the one shown.
[screenshots]

Could you tell me whether there is code to visualize (render) the generated dance output to a video, like the videos provided in the overview of AIST++? Thank you so much.

[screenshot]

smpl render

I still have a concern: I use your code, but the results are still squeezed together. Is the smplx version or SMPL_MALE.pkl any different?

Question about visualizing the SMPL joints

Hi,

Your work is excellent and I am interested in it! Now I'm trying my hand at visualizing the SMPL joints.

But I have no idea where the file "SMPL_MALE.pkl" comes from or what function it serves. Could you please tell me about it?

Sincerely,
Shuhong

What's the joint order in smpl_poses?

Hi, thanks for your great contribution. I wonder what the rotations in smpl_poses are. I know the dim of smpl_poses is [N, (24 * 3)]. However, it doesn't seem right when I use the SMPL order as the README describes.

[
 "root", 
 "left_hip", "left_knee", "left_foot", "left_toe", 
 "right_hip", "right_knee", "right_foot", "right_toe",
 "waist", "spine", "chest", "neck", "head", 
 "left_in_shoulder", "left_shoulder", "left_elbow", "left_wrist",
 "right_in_shoulder", "right_shoulder", "right_elbow", "right_wrist"
 ]

I want to reorder the axis angles in smpl_poses with the follow order:

[
    "root",  # 0
    "lhip",  # 1
    "rhip",  # 2
    "waist",  # 3
    "lknee",  # 4
    "rknee",  # 5
    "spine",  # 6
    "lankle",  # 7
    "rankle",  # 8
    "chest",  # 9
    "ltoe",  # 10
    "rtoe",  # 11
    "neck",  # 12
    "lclavicle",  # 13
    "rclavicle",  # 14
    "head",  # 15
    "lshoulder",  # 16
    "rshoulder",  # 17
    "lelbow",  # 18
    "relbow",  # 19
    "lwrist",  # 20
    "rwrist",  # 21
]

I build a mapping index array to select the poses as follow:

order_map = [0, 5, 1, 9, 6, 2, 10, 7, 3, 11, 8, 4, 12, 18, 14, 13, 19, 15, 20, 16, 21, 17]

smpl_poses = smpl_poses.reshape(
    smpl_poses.shape[0], -1, 3)[:, order_map, :].reshape(smpl_poses.shape[0], -1)

But, I got a weird result:

[screenshot]

Did I make a mistake?
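A less error-prone way to build the reorder index is to derive it from the two name lists rather than writing it by hand. The sketch below does that; the alias table is my assumption about how the README names map onto the short SMPL names ('ankle' = README 'foot', 'clavicle' = README 'in_shoulder'). If the derived index differs from the hand-written one above, that difference is the first place to look:

```python
# README joint order (from above) and the desired SMPL order.
readme_order = [
    "root",
    "left_hip", "left_knee", "left_foot", "left_toe",
    "right_hip", "right_knee", "right_foot", "right_toe",
    "waist", "spine", "chest", "neck", "head",
    "left_in_shoulder", "left_shoulder", "left_elbow", "left_wrist",
    "right_in_shoulder", "right_shoulder", "right_elbow", "right_wrist",
]
target_order = [
    "root", "lhip", "rhip", "waist", "lknee", "rknee", "spine",
    "lankle", "rankle", "chest", "ltoe", "rtoe", "neck",
    "lclavicle", "rclavicle", "head", "lshoulder", "rshoulder",
    "lelbow", "relbow", "lwrist", "rwrist",
]

ALIASES = {"ankle": "foot", "clavicle": "in_shoulder"}  # assumed equivalences

def canonical(name):
    """Expand a short name like 'lhip' to the README-style 'left_hip'."""
    if name in ("root", "waist", "spine", "chest", "neck", "head"):
        return name
    side = {"l": "left", "r": "right"}[name[0]]
    part = ALIASES.get(name[1:], name[1:])
    return f"{side}_{part}"

order_map = [readme_order.index(canonical(n)) for n in target_order]
print(order_map)
# A valid reorder index must be a permutation of 0..21:
assert sorted(order_map) == list(range(22))
```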

Missing music data

In the dataset "aist_plusplus_final", the music data is missing. If I want to start a task focused on music-conditioned motion generation, AIST++ may not be usable on its own.

About SMPL shape parameters

Hi, thanks for your amazing dataset! I found that this dataset does not contain the shape parameters of the SMPL model. Also, their learning rate is set to 0 for all fitting stages, even though the shape parameters affect the scale optimization. So, which shape parameters should I use for visualizing the SMPL mesh?

Camera translation is not correct

I am trying to project the keypoints as viewed from a camera at (0,0,0), exactly as in the image, through the world_to_camera method and additional methods. For example:

image

In the image above, assuming the camera was placed at (0,0,0) in the current 3D space, I would like to match the rotation and translation of the keypoints as shown. (The image above was adjusted manually and is not exact. Also, ignore the scattered points in any color other than the human keypoints below.)

[screenshot]
However, when the camera translations from the annotations are plotted in 3D space, they appear as follows compared to the keypoints:
the height does not match the human keypoints, and the cameras are not arranged 360 degrees around the subject but are clustered in a specific area.

Looking at it a little further, it looks like this:
[screenshot]

I think the coordinate scales of the camera translations and the human keypoint translations differ, or there is some other reason.
If I have missed something, could you explain?

Thank you very much for providing the dataset.
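One possible explanation (an assumption, since the annotation format isn't shown here): if the stored extrinsics are OpenCV-style, i.e. x_cam = R @ x_world + t with R obtained from the rotation vector, then t is not the camera's position. The camera center in world coordinates is -Rᵀt, and plotting t directly would place the cameras in the wrong spot, much like the clustering described above:

```python
import numpy as np

def camera_center_world(R, t):
    """World-space camera center for an extrinsic x_cam = R @ x_world + t."""
    return -R.T @ t

# Sanity check: the camera center must map to the camera-frame origin.
R = np.array([[0., -1., 0.],
              [1.,  0., 0.],
              [0.,  0., 1.]])   # 90-degree rotation about z
t = np.array([1., 2., 3.])
C = camera_center_world(R, t)
assert np.allclose(R @ C + t, 0.0)
```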

Exception in run_estimate_smpl.py

I tried to repeat Step 4 to get neutral SMPL annotations. But I got such an exception:

  File "/home/anna/code/aistplusplus_api/processing/run_estimate_smpl.py", line 172, in main
    smpl, loss = smpl_regressor.fit(keypoints3d, dtype='openpose25', verbose=True)
  File "/home/anna/code/aistplusplus_api/processing/run_estimate_smpl.py", line 113, in fit
    keypoints3d = keypoints3d[:, mapping_target, :]
IndexError: index 17 is out of bounds for axis 1 with size 17

What is wrong? How can it be fixed?
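For reference, the error itself is easy to reproduce: the loaded keypoints3d apparently have only 17 joints per frame, while an 'openpose25'-style mapping indexes joints at 17 and above. A minimal stand-in (the real mapping table in the repo will differ):

```python
import numpy as np

keypoints3d = np.zeros((10, 17, 3))   # (frames, joints, xyz) -- only 17 joints
mapping_target = list(range(25))      # a 25-joint mapping, as 'openpose25' implies

try:
    keypoints3d[:, mapping_target, :]
except IndexError as e:
    print(e)  # index 17 is out of bounds for axis 1 with size 17
```

If the keypoints3d files indeed store 17 joints, passing a 17-joint dtype to fit() may be the fix, but that is a guess worth confirming with the authors.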

Joint axis convention for SMPL pose visualization

Hi there!
Hope you are well! :)

I am desperately trying to visualize the SMPL pose parameters from your dataset, but it seems your axis convention does not match mine.

I get inverted and inconsistent rotations, as in the following picture:
[screenshot]

I have tried swapping the (x, y, z) axes and flipping signs in the x, y, or z direction, without success.

Do you have any idea what transformation I need to apply to visualize the animation properly?
Could you please provide the axis convention used for your skeleton?

Thank you :D
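If trial-and-error on single swaps and flips didn't work, it may be worth brute-forcing all 24 proper axis conventions at once. The sketch below enumerates every signed permutation matrix with det = +1; applying each candidate to the joint positions and eyeballing the result usually finds the convention quickly. (Note: axis-angle rotations can't just have their components swapped; transforming a rotation R needs conjugation, R' = M R Mᵀ.)

```python
import itertools
import numpy as np

def candidate_axis_transforms():
    """Yield all 24 proper signed-permutation matrices (axis swaps + flips)."""
    for perm in itertools.permutations(range(3)):
        for signs in itertools.product((1.0, -1.0), repeat=3):
            M = np.zeros((3, 3))
            for row, (col, sign) in enumerate(zip(perm, signs)):
                M[row, col] = sign
            if np.isclose(np.linalg.det(M), 1.0):  # keep rotations only
                yield M

mats = list(candidate_axis_transforms())
print(len(mats))  # 24
```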

What's the difference between 3D points generated from 2D keypoints and those generated from SMPL models?

hi, ruilong,
Thanks for your great work. What's the difference between keypoints3d_true and keypoints3d? I used keypoints3d to generate SMPL, but it didn't work.

import torch
from smplx import SMPL
from aist_plusplus.loader import AISTDataset

smpl_poses, smpl_scaling, smpl_trans = AISTDataset.load_motion(
    'motions', 'gBR_sBM_cAll_d04_mBR0_ch01')
smpl = SMPL(model_path='/home/public_data/sjc/smpl/models/',
            gender='MALE', batch_size=1)
keypoints3d = smpl.forward(
    global_orient=torch.from_numpy(smpl_poses[:, 0:1]).float(),
    body_pose=torch.from_numpy(smpl_poses[:, 1:]).float(),
    transl=torch.from_numpy(smpl_trans / smpl_scaling).float(),
).joints.detach().numpy()[:, 0:24, :]

keypoints3d_true = AISTDataset.load_keypoint3d(
    'keypoints3d', 'gBR_sBM_cAll_d04_mBR0_ch01', use_optim=False)

smpl, loss = smpl_regressor.fit(keypoints3d, dtype='coco', verbose=True)
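One quick diagnostic for "it didn't work" is to check whether the two point sets differ only by a global scale (the earlier issue about the smplx fork suggests scale handling is a common pitfall). A centroid-free scale estimate, with hypothetical names:

```python
import numpy as np

def scale_ratio(pred, ref):
    """Estimate the global scale factor mapping `pred` onto `ref`
    (both (N, 3) point sets), after removing their centroids."""
    p = pred - pred.mean(axis=0)
    r = ref - ref.mean(axis=0)
    return np.linalg.norm(r) / np.linalg.norm(p)

# e.g. scale_ratio(keypoints3d.reshape(-1, 3), keypoints3d_true.reshape(-1, 3))
```

A ratio far from 1.0 would point to a missing smpl_scaling factor rather than bad poses.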

Providing fbx model for smpl motions

Hi,
would you please provide FBX models for the SMPL body motions?
The pkl files are quite good for Python, but it would be convenient to use FBX models in 3D software.

How to construct 3d skeleton?

Hi, I have used the model to create the SMPL output, but is there any way to reconstruct the 3D skeletons shown on the GitHub page?

Trouble downloading the dataset

Hi,

Thanks for your wonderful dataset.

I have recently been working on 3D human pose estimation and tried to download your dataset. However, I always hit the error 'multiprocessing.pool.MaybeEncodingError: Error sending result: '<multiprocessing.pool.ExceptionWithTraceback object at 0x7f344db94160>'. Reason: 'TypeError("cannot serialize '_io.BufferedReader' object",)'.

So I am not sure whether the download finished. I found a video file list in downloader.py, so I downloaded from it individually. The video list contains 12669 items, but that does not match the annotations, which contain only 1408 SMPL files.

How can I verify the dataset? And how can I fix the error?

Looking forward to hearing from you,

Audio fragments extraction/alignment

Hi,
Is it possible to post the script that aligned the music and motion recordings?
As far as I can see, the audio fragments in the AIST dataset do not match the length of the corresponding motion fragments, and when extracting audio directly from a video fragment it is not clear how to properly align the audio features with the motion frames.

Timur Sokhin
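Whatever the ground-truth offset turns out to be, mapping between motion frames and audio feature frames is simple once both streams are anchored to t = 0. The helper below assumes exactly that (i.e. that trimming has already aligned the starts), which is the part that needs the authors' script; the name and defaults are mine:

```python
def audio_frame_index(motion_frame, motion_fps=60, sample_rate=48000, hop=800):
    """Audio feature index for a given motion frame, assuming both streams
    start at t = 0. Defaults assume 60 fps motion and 60 features/s audio."""
    t = motion_frame / motion_fps          # seconds into the clip
    return round(t * sample_rate / hop)    # feature frames elapsed

print(audio_frame_index(120))              # 120 (1:1 when both run at 60/s)
print(audio_frame_index(120, hop=1600))    # 60  (audio features at 30/s)
```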

extract images

Hi,
when I use ffmpeg_video_read to extract the images, I specify a frame rate of 60, but I found that the number of frames and the number of corresponding motion frames are unequal,
while if I don't specify the fps, the numbers are equal.
For gHO_sBM_c01/02...09_d19_mHO2_ch02.mp4 I get 481 images, but the SMPL poses only have 480. Do I need to specify the frame rate?
Thanks!
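A pragmatic workaround for the off-by-one above is to trim both sequences to their common length; whether ffmpeg's extra frame is the first or the last one is an assumption worth checking visually before training:

```python
def align_lengths(images, poses):
    """Trim two per-frame sequences to their common length."""
    n = min(len(images), len(poses))
    return images[:n], poses[:n]

imgs, pses = align_lengths(list(range(481)), list(range(480)))
print(len(imgs), len(pses))  # 480 480
```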
