
voxelpose-pytorch's Introduction

VoxelPose

This is the official implementation for:

VoxelPose: Towards Multi-Camera 3D Human Pose Estimation in Wild Environment,
Hanyue Tu, Chunyu Wang, Wenjun Zeng
ECCV 2020 (Oral) (arXiv 2004.06239)

Installation

  1. Clone this repo, and we'll call the cloned directory (multiview-multiperson-pose) ${POSE_ROOT}.
  2. Install dependencies.

Data preparation

Shelf/Campus datasets

  1. Download the datasets from http://campar.in.tum.de/Chair/MultiHumanPose and extract them under ${POSE_ROOT}/data/Shelf and ${POSE_ROOT}/data/CampusSeq1, respectively.

  2. We have processed the camera parameters into our format, and you can download them from this repository. They lie in ${POSE_ROOT}/data/Shelf/ and ${POSE_ROOT}/data/CampusSeq1/, respectively.

  3. Due to the limited and incomplete annotations of the two datasets, we don't train our model on them. Instead, we directly use the 2D pose estimator trained on COCO, and use independent 3D human poses from the Panoptic dataset to train our 3D model; these lie in ${POSE_ROOT}/data/panoptic_training_pose.pkl. See our paper for more details.

  4. For testing, we first estimate 2D poses and generate 2D heatmaps for these two datasets in this repository. The predicted poses can also be downloaded from the repository; they lie in ${POSE_ROOT}/data/Shelf/ and ${POSE_ROOT}/data/CampusSeq1/, respectively. You can also use models trained on the COCO dataset (such as HigherHRNet) to generate 2D heatmaps directly, as sketched below.
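
Below is a minimal sketch, not the repo's exact pipeline, of what generating a 2D heatmap from a predicted keypoint amounts to when starting from an off-the-shelf 2D pose estimator; the function name and parameters are illustrative only.

import numpy as np

# Render a Gaussian heatmap centred on one predicted 2D keypoint.
def keypoint_to_heatmap(x, y, height, width, sigma=2.0):
    ys, xs = np.mgrid[0:height, 0:width]
    return np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))

heatmap = keypoint_to_heatmap(120.5, 64.0, height=128, width=240)
print(heatmap.shape, round(float(heatmap.max()), 3))  # (128, 240), close to 1.0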

The directory tree should look like this:

${POSE_ROOT}
|-- data
    |-- Shelf
    |   |-- Camera0
    |   |-- ...
    |   |-- Camera4
    |   |-- actorsGT.mat
    |   |-- calibration_shelf.json
    |   |-- pred_shelf_maskrcnn_hrnet_coco.pkl
    |-- CampusSeq1
    |   |-- Camera0
    |   |-- Camera1
    |   |-- Camera2
    |   |-- actorsGT.mat
    |   |-- calibration_campus.json
    |   |-- pred_campus_maskrcnn_hrnet_coco.pkl
    |-- panoptic_training_pose.pkl
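
For a quick look at the provided files, here is a minimal inspection sketch; the exact key layout of these pickles is not documented here, so this only prints what is inside (adjust the path for Campus):

import pickle

with open("data/Shelf/pred_shelf_maskrcnn_hrnet_coco.pkl", "rb") as f:
    preds = pickle.load(f)

print(type(preds))
if isinstance(preds, dict):
    print(list(preds.keys())[:5])  # peek at a few top-level keys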

CMU Panoptic dataset

  1. Download the dataset by following the instructions in panoptic-toolbox and extract them under ${POSE_ROOT}/data/panoptic_toolbox/data.
  • You only need to download the sequences you use. You can also download just a subset of camera views by specifying the number of views (HD_Video_Number) and changing the camera order in ./scripts/getData.sh. The sequences and camera views used in our project can be found in our paper.
  • Note that we only use the HD videos, calibration data, and 3D Body Keypoint data in the code. You can comment out the irrelevant parts, such as downloading the 3D Face data, in ./scripts/getData.sh.
  2. Download the pretrained backbone model from pretrained backbone and place it at ${POSE_ROOT}/models/pose_resnet50_panoptic.pth.tar (ResNet-50 pretrained on the COCO dataset and fine-tuned jointly on the Panoptic dataset and MPII).
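
To sanity-check the downloaded checkpoint, a minimal sketch assuming it is a standard torch checkpoint (what the file actually contains is not documented here):

import torch

state = torch.load("models/pose_resnet50_panoptic.pth.tar", map_location="cpu")
if isinstance(state, dict):
    print(list(state.keys())[:10])  # peek at a few parameter names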

The directory tree should look like this:

${POSE_ROOT}
|-- models
|   |-- pose_resnet50_panoptic.pth.tar
|-- data
    |-- panoptic-toolbox
        |-- data
            |-- 160224_haggling1
            |   |-- hdImgs
            |   |-- hdvideos
            |   |-- hdPose3d_stage1_coco19
            |   |-- calibration_160224_haggling1.json
            |-- 160226_haggling1  
            |-- ...

Training

CMU Panoptic dataset

Train and validate on the five selected camera views. You can specify the GPU devices and batch size per GPU in the config file. We trained our models on two GPUs.

python run/train_3d.py --cfg configs/panoptic/resnet50/prn64_cpn80x80x20_960x512_cam5.yaml

Shelf/Campus datasets

python run/train_3d.py --cfg configs/shelf/prn64_cpn80x80x20.yaml
python run/train_3d.py --cfg configs/campus/prn64_cpn80x80x20.yaml

Evaluation

CMU Panoptic dataset

Evaluate the models. The evaluation results will be printed to the screen.

python test/evaluate.py --cfg configs/panoptic/resnet50/prn64_cpn80x80x20_960x512_cam5.yaml

Shelf/Campus datasets

It will print the PCP results to the screen.

python test/evaluate.py --cfg configs/shelf/prn64_cpn80x80x20.yaml
python test/evaluate.py --cfg configs/campus/prn64_cpn80x80x20.yaml

Citation

If you use our code or models in your research, please cite:

@inproceedings{voxelpose,
    author={Tu, Hanyue and Wang, Chunyu and Zeng, Wenjun},
    title={VoxelPose: Towards Multi-Camera 3D Human Pose Estimation in Wild Environment},
    booktitle = {European Conference on Computer Vision (ECCV)},
    year = {2020}
}

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

voxelpose-pytorch's People

Contributors

chunyuwang, dependabot[bot], meijieru, microsoft-github-operations[bot], microsoftopensource


voxelpose-pytorch's Issues

3D pose visualization

Thanks for your great work.
Can you provide the code for projecting the 3D pose onto the 2D image, like Fig. 6 in the paper?
Thanks.

Deciding on SPACE_SIZE and SPACE_CENTER

I have a custom dataset that I want to apply VoxelPose to, but there is no explanation of how the values SPACE_SIZE and SPACE_CENTER were selected for the three datasets, so it is not clear to me what to set them to for my custom dataset.
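
A heuristic sketch for custom datasets (my own suggestion, not the authors' documented procedure): take the space center as the centroid of where people can appear, e.g. ground-truth root joints or the region the cameras cover, and the space size as that region's extent plus a safety margin, in millimetres like the existing configs.

import numpy as np

def estimate_space(points_mm, margin_mm=1000.0):
    # points_mm: (N, 3) candidate person positions in world coordinates (mm)
    points_mm = np.asarray(points_mm, dtype=float)
    center = points_mm.mean(axis=0)
    size = points_mm.max(axis=0) - points_mm.min(axis=0) + 2.0 * margin_mm
    return center, size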

Directly pred 3D from calibration pic

Thank you for this great repo; I think this is very interesting. I want to use three or four cameras to get the 3D pose. Could you provide code to get 3D poses from 2D pictures?

multiple GPUs training on CMU datasets

When I trained the model on the CMU dataset with multiple GPUs, the dataloader encountered the following problem, but it works with a single GPU.
Traceback (most recent call last):
File "run/train_3d.py", line 163, in
main()
File "run/train_3d.py", line 136, in main
train_3d(config, model, optimizer, train_loader, epoch, final_output_dir, writer_dict)
File "/home/gw/Project/voxelpose/lib/core/function.py", line 37, in train_3d
for i, (inputs, targets_2d, weights_2d, targets_3d, meta, input_heatmap) in enumerate(loader):
File "/home/gw/anaconda3/envs/VIBE/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 435, in next
data = self._next_data()
File "/home/gw/anaconda3/envs/VIBE/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1065, in _next_data
return self._process_data(data)
File "/home/gw/anaconda3/envs/VIBE/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
data.reraise()
File "/home/gw/anaconda3/envs/VIBE/lib/python3.7/site-packages/torch/_utils.py", line 428, in reraise
raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 3.
Original Traceback (most recent call last):
File "/home/gw/anaconda3/envs/VIBE/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
data = fetcher.fetch(index)
File "/home/gw/anaconda3/envs/VIBE/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
return self.collate_fn(data)
File "/home/gw/anaconda3/envs/VIBE/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 83, in default_collate
return [default_collate(samples) for samples in transposed]
File "/home/gw/anaconda3/envs/VIBE/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 83, in
return [default_collate(samples) for samples in transposed]
File "/home/gw/anaconda3/envs/VIBE/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 81, in default_collate
raise RuntimeError('each element in list of batch should be of equal size')
RuntimeError: each element in list of batch should be of equal size

about the output result

Following the description in the repo, the directory tree was configured as follows.

${POSE_ROOT}
|-- models
|   |-- pose_resnet50_panoptic.pth.tar
|-- data
    |-- panoptic-toolbox
        |-- data
            |-- 171204_pose1
            |   |-- hdImgs
            |   |-- hdvideos
            |   |-- hdPose3d_stage1_coco19
            |   |-- calibration_160224_haggling1.json
            |-- 171204_pose1_sample
            |-- ...

(Due to capacity issues, only the 171204_pose1 and 171204_pose1_sample data were used.)

After configuring the directory tree, the panoptic.py code is modified as follows.

TRAIN_LIST = [ '171204_pose1', ]
VAL_LIST = [ '171204_pose1_sample' ]

After this setup, I have a question about the output when executing the "python run/train_3d.py --cfg configs/panoptic/resnet50/prn64_cpn80x80x20_960x512_cam5.yaml" command.

Why are there no results for all val data?
(There are only 10 validation results in output/.../image_with_joints/
validation_0000000_view_1_gt ~ validation_0000000_view_5_gt
validation_0000002_view_1_gt ~ validation_0000002_view_5_gt)

pytorch1.7 can not run!

Hello, I want to run VoxelPose on PyTorch 1.7, but I get some errors (see the attached screenshots).
I guess the reason is in-place operations that PyTorch 1.7 does not support, but I cannot make it work.
Did you run the code on PyTorch 1.7? Or can you give me some advice? Thank you very much!

How do I check the loss?

Epoch: 6
Test: [0/645] Time: 1.962s (1.962s) Speed: 10.2 samples/s Data: 1.474s (1.474s) Memory 200613376.0
Test: [100/645] Time: 0.385s (0.453s) Speed: 51.9 samples/s Data: 0.000s (0.049s) Memory 200613376.0
Test: [200/645] Time: 0.406s (0.439s) Speed: 49.2 samples/s Data: 0.000s (0.041s) Memory 200613376.0
Test: [300/645] Time: 0.410s (0.432s) Speed: 48.8 samples/s Data: 0.000s (0.037s) Memory 200613376.0
Test: [400/645] Time: 0.388s (0.428s) Speed: 51.6 samples/s Data: 0.000s (0.036s) Memory 200613376.0
Test: [500/645] Time: 0.407s (0.427s) Speed: 49.1 samples/s Data: 0.000s (0.035s) Memory 200613376.0
Test: [600/645] Time: 0.419s (0.428s) Speed: 47.7 samples/s Data: 0.000s (0.034s) Memory 200613376.0
Test: [644/645] Time: 0.373s (0.431s) Speed: 53.7 samples/s Data: 0.000s (0.036s) Memory 200613376.0
ap@25: 0.0000 ap@50: 0.0000 ap@75: 0.0000 ap@100: 0.0000 ap@125: 0.0000 ap@150: 0.0000 recall@500mm: 0.0000 mpjpe@500mm: inf

I haven't been able to find a way to see the loss; could you help?

When I run train_3d.py, I run into trouble

Exception occurred: RuntimeError
Expected object of scalar type Float but got scalar type Double for argument #2 'other'
File "/home/wu/voxelpose-pytorch/lib/models/project_layer.py", line 80, in get_voxel
bounding[i, 0, 0, :, c] = (xy[:, 0] >= 0) & (xy[:, 1] >= 0) & (xy[:, 0] < width) & (xy[:, 1] < height)
File "/home/wu/voxelpose-pytorch/lib/models/project_layer.py", line 107, in forward
cubes, grids = self.get_voxel(heatmaps, meta, grid_size, grid_center, cube_size)
File "/home/wu/voxelpose-pytorch/lib/models/cuboid_proposal_net.py", line 103, in forward
self.grid_size, [self.grid_center], self.cube_size)
File "/home/wu/voxelpose-pytorch/lib/models/multi_person_posenet.py", line 65, in forward
root_cubes, grid_centers = self.root_net(all_heatmaps, meta)
File "/home/wu/voxelpose-pytorch/lib/core/function.py", line 126, in validate_3d
input_heatmaps=input_heatmap)
File "/home/wu/voxelpose-pytorch/run/train_3d.py", line 135, in main
precision = validate_3d(config, model, test_loader, final_output_dir)
File "/home/wu/voxelpose-pytorch/run/train_3d.py", line 161, in
main()
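
A possible workaround, sketched below under an assumption about the cause: the projected pixel coordinates (built from float64 camera parameters) end up as Double while the other comparison operand is Float, so casting to float32 before the comparison (e.g. xy = xy.float() in lib/models/project_layer.py) avoids the mismatch. This is not a verified patch.

import torch

xy = torch.tensor([[10.0, 20.0]], dtype=torch.float64)  # stand-in for projected coords
width = torch.tensor(240.0)                             # float32 by default
mask = (xy[:, 0].float() >= 0) & (xy[:, 0].float() < width)
print(mask)  # comparison now runs with matching dtypes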

Camera calibration

Questions about camera calibration, and how the accuracy of the camera parameters impacts the reconstruction results.

I wondered how you obtained the camera parameters. I see that you use a half-dome full of cameras; maybe all of these are given right away.

On the other hand, how do these parameters influence the results? How close to reality should they be so that the result is not affected too much?

About the number of keypoints of each dataset

Thanks for your work!

May I ask a question about the pretrained pose-resnet backbone setting?

When I check pose_resnet50_panoptic.pth.tar, its number of joints is 18.
However, the number of COCO (OpenPose version) keypoints is 18, the number of MPII keypoints is 16, and the number of Panoptic keypoints is 19.
They are not all the same.
How do you map between the different keypoint definitions?
Could you provide the mapping function, or explain this setting in detail?

My guess about the backbone training, based on the description in the paper:
first, load the COCO (18 keypoints) pretrained Pose-ResNet model;
second, map or eliminate the MPII/Panoptic keypoints to fit the COCO keypoint format.

Thanks for your contributions!

Campus space center , space size

Hello, I know this has been answered before, but I don't get how you define the space center. Let me explain.
In the Campus dataset, the x,z coords in the original data are (-4.9, 11.2), (-1.78, 5.22) and (4.9, 6.68) in meters. I don't quite understand how you define a 12x12 m box around these coords. Also, the space center should be around (0, 8) meters.
Could you please enlighten me?

The same is the case if I instead get the coordinates from -dot(R.T, T). These coordinates are (-6.2, 5.2), (1.77, -5.05), (11.7, -1.8); I still can't see how the bounding box should be 12x12 and the space center (3, 4.5).
Thanks.

How to get data like "panoptic_training_pose.pkl"?

There is a data file called "panoptic_training_pose.pkl" in your repository. I wonder whether it is provided by the CMU panoptic-toolbox or you made it yourself. If you made it yourself, may I know how you made it, or could you tell me the meaning of keys like "joints_3d_vis" and "joints_vis"? Thanks, and I look forward to your reply.

training problem

When I trained the model on the Campus dataset, I met the following problem (I use torch 1.7 and CUDA 11.1). Also, the training strategy in the code seems to be different from the strategy given in the paper.
Traceback (most recent call last):
File "run/train_3d.py", line 163, in
main()
File "run/train_3d.py", line 136, in main
train_3d(config, model, optimizer, train_loader, epoch, final_output_dir, writer_dict)
File "/home/gw/Project/voxelpose/lib/core/function.py", line 68, in train_3d
accu_loss_3d.backward()
File "/home/gw/anaconda3/envs/VIBE/lib/python3.7/site-packages/torch/tensor.py", line 221, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "/home/gw/anaconda3/envs/VIBE/lib/python3.7/site-packages/torch/autograd/init.py", line 132, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1, 32, 1, 1, 1]] is at version 8; expected version 6 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
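
As the error message itself suggests, anomaly detection can be used to locate the offending in-place operation; a minimal sketch is below. This only helps diagnose the problem, and the in-place op still has to be fixed by hand (or an older PyTorch used, as noted elsewhere in these issues).

import torch

# Make autograd report which forward op produced the tensor that was later
# modified in place; run the failing training step with this enabled.
torch.autograd.set_detect_anomaly(True)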

I get an AP = 0

Am I the only one getting this error?
What can I do?

Epoch: 9

Test: [0/645]   Time: 2.164s (2.164s)   Speed: 9.2 samples/s    Data: 1.694s (1.694s)   Memory 200613376.0
Test: [100/645] Time: 0.406s (0.452s)   Speed: 49.3 samples/s   Data: 0.000s (0.049s)   Memory 200613376.0
Test: [200/645] Time: 0.379s (0.438s)   Speed: 52.7 samples/s   Data: 0.000s (0.041s)   Memory 200613376.0
Test: [300/645] Time: 0.378s (0.431s)   Speed: 52.9 samples/s   Data: 0.000s (0.037s)   Memory 200613376.0
Test: [400/645] Time: 0.415s (0.426s)   Speed: 48.2 samples/s   Data: 0.000s (0.035s)   Memory 200613376.0
Test: [500/645] Time: 0.420s (0.424s)   Speed: 47.7 samples/s   Data: 0.000s (0.034s)   Memory 200613376.0
Test: [600/645] Time: 0.378s (0.423s)   Speed: 52.9 samples/s   Data: 0.000s (0.034s)   Memory 200613376.0
Test: [644/645] Time: 0.373s (0.425s)   Speed: 53.5 samples/s   Data: 0.000s (0.036s)   Memory 200613376.0
ap@25: 0.0000   ap@50: 0.0000   ap@75: 0.0000   ap@100: 0.0000  ap@125: 0.0000  ap@150: 0.0000  recall@500mm: 0.0000    mpjpe@500mm: inf

Regarding the accumulation steps!

First of all, thanks for making this awesome work public.

I don't understand the meaning of the accumulation steps.
Why would the loss only be backpropagated every n steps for the 1D, 2D and bbox errors?
Did this come from empirical testing?

I am also not sure this is a correct approach on the software side, as the parameters will most likely have changed by then from the joint-loss backprop, as is evident from the runtime errors in torch > 1.4.0.

Visualization

Thanks for such a great repo. Will you release 3D visualization code for this repo, similar to the visualization code Dong released?

Training data for shelf/campus

In the paper, you said you split the Campus dataset into training and testing subsets, but in the code the Campus/Shelf datasets are only used for testing. I'm also curious how you generated the synthetic data panoptic_training_pose.pkl.

Train heat map predictor

Hi, I am using the Panoptic dataset, and I am wondering whether it is possible to train the heatmap predictor with this dataset.

Tracking Method

Thank you for this excellent work.
I have some questions about the tracking in the visualization results (each person is assigned a color to represent their ID). I didn't find in the paper or the code how you track subjects. Can you explain which tracking method was used?

How to get datasets

I tried to download the datasets from http://campar.in.tum.de/Chair/MultiHumanPose and extract them under ${POSE_ROOT}/data/Shelf and ${POSE_ROOT}/data/CampusSeq1, but the following error message appeared.

Could not connect successfully
Could not establish a connection to the server for campar.in.tum.de.

Please let me know the other URL to get those datasets.

About the performance on Campus dataset

Thanks for your great work. Recently, I have been reproducing your experiments. The best result I got so far is just 96.5, but in your paper the result you report on the Campus dataset is 96.7 PCP.

So I was wondering which config you used to produce this result?

Thanks for your reply in advance.

Questions regarding your work

Hi guys,

first of all, this is really great work you did there. Thanks a lot!
I have a few questions about your work. I'm really looking forward to your reply.

All the best!

Synthetic Heatmaps

I'm searching for an explanation of how you generated the synthetic heatmaps. Even though the code is written in a good style and is mostly very understandable, at some points comments would have been very helpful. It would also be a great help if you could go into more detail on how and why you generated synthetic heatmaps. Thank you! :)

Discussion of generalization capabilities

In your paper I'm missing a discussion on the following question:
In the Panoptic dataset all cameras are at roughly equal distances, and even though you chose random cameras for training and testing, the cameras stay in a similar configuration (distance and direction) relative to the scene. Would it be possible to test a network that has been trained on the Panoptic dataset on the Campus data? This would show real generalization capabilities.

Decoupling meta data information

In your code it is not absolutely clear to me whether the meta data, especially the number of persons in the image, is completely decoupled from the model's forward call. Perhaps it would be good to give a maximum number of persons the network has to check for. Currently it uses - if I understood correctly - the meta data. I'd be happy if you could explain the meta data in more detail: what it is and what it is used for.

Thanks in advance! :)

ValueError: setting an array element with a sequence.

Error after python run/train_3d.py --cfg configs/shelf/prn64_cpn80x80x20.yaml


=> load /media/user/LaCie/DataSet/voxel-data/data/Shelf/pred_shelf_maskrcnn_hrnet_coco.pkl
Traceback (most recent call last):
  File "run/train_3d.py", line 160, in <module>
    main()
  File "run/train_3d.py", line 87, in main
    test_dataset = eval('dataset.' + config.DATASET.TEST_DATASET)(
  File "/home/user/voxelpose-pytorch/run/../lib/dataset/shelf.py", line 70, in __init__
    self.db = self._get_db()
  File "/home/user/voxelpose-pytorch/run/../lib/dataset/shelf.py", line 92, in _get_db
    actor_3d = np.array(np.array(data['actor3D'].tolist()).tolist()).squeeze()  # num_person * num_frame
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 4 dimensions. The detected shape was (1, 4, 3200, 1) + inhomogeneous part.

My pip list:

pip list
Package Version Editable project location


addict 2.4.0
aliyun-python-sdk-core 2.13.36
aliyun-python-sdk-kms 2.16.1
attrs 23.1.0
brotlipy 0.7.0
certifi 2023.7.22
cffi 1.15.1
charset-normalizer 2.1.1
chumpy 0.70
click 8.1.7
colorama 0.4.6
contourpy 1.1.0
coverage 7.3.0
crcmod 1.7
cryptography 38.0.3
cycler 0.11.0
Cython 3.0.2
easydict 1.10
exceptiongroup 1.1.3
filelock 3.12.3
flake8 6.1.0
fonttools 4.42.1
idna 3.4
importlib-metadata 6.8.0
importlib-resources 6.0.1
iniconfig 2.0.0
interrogate 1.5.0
isort 4.3.21
Jinja2 3.1.2
jmespath 0.10.0
json-tricks 3.17.3
kiwisolver 1.4.5
Markdown 3.4.4
markdown-it-py 3.0.0
MarkupSafe 2.1.3
matplotlib 3.7.2
mccabe 0.7.0
mdurl 0.1.2
mmcv 2.0.0rc4
mmdet 3.1.0
mmengine 0.8.4
mmpose 1.1.0 /home/user/mmpose
model-index 0.1.11
mpmath 1.3.0
munkres 1.1.4
networkx 3.1
numpy 1.24.4
opencv-python 4.8.0.76
opendatalab 0.0.10
openmim 0.3.9
openxlab 0.0.23
ordered-set 4.1.0
oss2 2.17.0
packaging 23.1
pandas 2.0.3
parameterized 0.9.0
pbr 5.11.1
Pillow 9.2.0
pip 23.2.1
platformdirs 3.10.0
pluggy 1.3.0
protobuf 4.24.2
py 1.11.0
pycocotools 2.0.7
pycodestyle 2.11.0
pycparser 2.21
pycryptodome 3.18.0
pyflakes 3.1.0
Pygments 2.16.1
pyOpenSSL 22.1.0
pyparsing 3.0.9
PySocks 1.7.1
pytest 7.4.1
pytest-runner 6.0.0
python-dateutil 2.8.2
pytz 2023.3.post1
PyYAML 6.0.1
requests 2.28.2
rich 13.4.2
scipy 1.10.1
setuptools 68.1.2
shapely 2.0.1
six 1.16.0
sympy 1.12
tabulate 0.9.0
tensorboardX 2.6.2.2
termcolor 2.3.0
terminaltables 3.1.10
testresources 2.0.1
toml 0.10.2
tomli 2.0.1
torch 1.11.0
torchvision 0.12.0a0+9b5a3fe
tqdm 4.65.2
typing_extensions 4.7.1
tzdata 2023.3
urllib3 1.26.11
wheel 0.38.4
xdoctest 1.1.1
xtcocotools 1.14
yapf 0.40.1
zipp 3.16.2

Machine: Jetson Orin AGX
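
A hedged sketch of the likely cause (an assumption based on the NumPy 1.24.4 entry in the list above, not a verified fix): NumPy >= 1.24 refuses to build an array from ragged nested lists unless dtype=object is given, whereas older versions silently produced an object array. Applying dtype=object at the failing line in lib/dataset/shelf.py would restore the old behaviour.

import numpy as np

ragged = [[1, 2, 3], [4, 5]]          # inhomogeneous nested list
arr = np.array(ragged, dtype=object)  # old behaviour, made explicit
print(arr.shape)                      # (2,)
# np.array(ragged)                    # raises ValueError on NumPy >= 1.24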

Camera parameters processing

You explicitly mention in the README that you have processed the camera parameters into your format. Can you explain what processing was done, particularly on the camera translation parameter?

From the original Campus dataset one can obtain the translation T for each camera, but it is very different from the one you provide in the JSON calibration file. As an example, I got T = [-1.787557e+00, 1.361094e+00, 5.226973e+00] from the original data for cam 0, but in the JSON file you use T = [1774.89, -5051.69, 1923.35]. What special consideration should be made to obtain such values? How should they be interpreted?

I would appreciate it if you could elaborate more on this.
Thank you for your time!!
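
A hedged reading of the numbers quoted above (an inference, not a confirmed spec): the JSON calibration appears to store the camera centre in world coordinates, in millimetres, i.e. C = -R^T t * 1000, rather than the raw extrinsic translation t in metres from the original Campus data; this is consistent with the -dot(R.T, T) values reported in the "Campus space center" issue above.

import numpy as np

def json_translation_from_extrinsics(R, t_metres):
    # Candidate conversion: camera centre in world coordinates, in mm.
    return -np.asarray(R).T @ np.asarray(t_metres) * 1000.0

# Illustration only: R below is a placeholder, not the real Campus rotation.
R = np.eye(3)
print(json_translation_from_extrinsics(R, [-1.787557, 1.361094, 5.226973]))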

Different Projection Formulas

Thanks for sharing your nice work!

I notice you use two different formulas to transform the 3D pose from world coordinates to camera coordinates when processing the Shelf and CMU Panoptic datasets, i.e. x = np.dot(R, X) + t and x = R.dot(X - t). Why not use the same formula?
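
For what it's worth, the two conventions are mathematically interchangeable; a minimal sketch (my own illustration) showing that np.dot(R, X) + t and R.dot(X - c) agree whenever t = -np.dot(R, c), i.e. one dataset stores the extrinsic translation and the other the camera centre:

import numpy as np

rng = np.random.default_rng(0)
R, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # an arbitrary orthogonal matrix
c = rng.normal(size=3)                        # camera-centre convention
t = -R @ c                                    # translation convention
X = rng.normal(size=3)                        # a world point

assert np.allclose(R @ X + t, R @ (X - c))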

IndexError: list index out of range

When I run it on Linux, I get this error, but I can train on Windows. I don't know how to deal with this problem. Thanks.
Traceback (most recent call last):
File "run/train_3d.py", line 160, in
main()
File "run/train_3d.py", line 133, in main
train_3d(config, model, optimizer, train_loader, epoch, final_output_dir, writer_dict)
File "/mnt/e/voxelpose-pytorch-main/run/../lib/core/function.py", line 44, in train_3d
targets_3d=targets_3d[0])
IndexError: list index out of range

General questions

For each dataset there is a pretrained backbone or prediction file, namely:
pose_resnet50_panoptic.pth.tar
pred_shelf_maskrcnn_hrnet_coco.pkl
pred_campus_maskrcnn_hrnet_coco.pkl

What are these for?

About the monocular experiment setting

Hi there,
Thanks for your great work!

Now I'm trying to develop a monocular model comparable to yours (with the Panoptic Studio dataset).

I wonder about the exact setting of your monocular experiment.
Which camera did you use for training/validation?

Thank you

How to get `calibration_shelf.json` and `calibration_campus.json` files?

Thanks for your great project. I am wondering how you generated the calibration_shelf.json and calibration_campus.json files from the corresponding official Shelf and Campus dataset files. Could you please offer the script that transforms the official calibration data into your format? Thanks in advance.

Evaluate custom videos

Hello, thanks for your rewarding work. I want to know how I can evaluate custom videos, for example arbitrary .mp4 video files.

Thanks in advance for anyone who is willing to answer me.

run/train_3d.py fails with Segmentation fault (core dumped)

Hi, while trying to run train_3d.py I get a segmentation fault. I'm not sure where to start debugging. Could you please guide me as to what could be wrong with my environment? I've pasted my conda environment packages below:

Name Version Build Channel

_libgcc_mutex 0.1 main defaults
_pytorch_select 0.2 gpu_0 defaults
blas 1.0 mkl defaults
bzip2 1.0.8 h7b6447c_0 defaults
ca-certificates 2020.7.22 0 defaults
cairo 1.14.12 h8948797_3 defaults
certifi 2020.6.20 py36_0 defaults
cffi 1.14.2 py36he30daa8_0 defaults
cudatoolkit 10.1.243 h6bb024c_0 defaults
cudnn 7.6.5 cuda10.1_0 defaults
cycler 0.10.0 py36_0 defaults
dbus 1.13.16 hb2f20db_0 defaults
easydict 1.9 py_0 conda-forge
expat 2.2.9 he6710b0_2 defaults
ffmpeg 4.0 hcdf2ecd_0 defaults
fontconfig 2.13.0 h9420a91_0 defaults
freeglut 3.0.0 hf484d3e_5 defaults
freetype 2.10.2 h5ab3b9f_0 defaults
glib 2.65.0 h3eb4bd4_0 defaults
graphite2 1.3.14 h23475e2_0 defaults
gst-plugins-base 1.14.0 hbbd80ab_1 defaults
gstreamer 1.14.0 hb31296c_0 defaults
harfbuzz 1.8.8 hffaf4a1_0 defaults
hdf5 1.10.2 hba1933b_1 defaults
icu 58.2 he6710b0_3 defaults
intel-openmp 2020.2 254 defaults
jasper 2.0.14 h07fcdf6_1 defaults
jpeg 9b h024ee3a_2 defaults
json_tricks 3.13.5 py_0 conda-forge
kiwisolver 1.2.0 py36hfd86e86_0 defaults
lcms2 2.11 h396b838_0 defaults
ld_impl_linux-64 2.33.1 h53a641e_7 defaults
libedit 3.1.20191231 h14c3975_1 defaults
libffi 3.3 he6710b0_2 defaults
libgcc-ng 9.1.0 hdf63c60_0 defaults
libgfortran-ng 7.3.0 hdf63c60_0 defaults
libglu 9.0.0 hf484d3e_1 defaults
libopencv 3.4.2 hb342d67_1 defaults
libopus 1.3.1 h7b6447c_0 defaults
libpng 1.6.37 hbc83047_0 defaults
libprotobuf 3.5.1 h6f1eeef_0 defaults
libstdcxx-ng 9.1.0 hdf63c60_0 defaults
libtiff 4.1.0 h2733197_1 defaults
libuuid 1.0.3 h1bed415_2 defaults
libvpx 1.7.0 h439df22_0 defaults
libxcb 1.14 h7b6447c_0 defaults
libxml2 2.9.10 he19cac6_1 defaults
lz4-c 1.9.2 he6710b0_1 defaults
matplotlib 3.3.1 0 defaults
matplotlib-base 3.3.1 py36h817c723_0 defaults
mkl 2020.2 256 defaults
mkl-service 2.3.0 py36he904b0f_0 defaults
mkl_fft 1.1.0 py36h23d657b_0 defaults
mkl_random 1.1.1 py36h0573a6f_0 defaults
ncurses 6.2 he6710b0_1 defaults
ninja 1.10.1 py36hfd86e86_0 defaults
numpy 1.19.1 py36hbc911f0_0 defaults
numpy-base 1.19.1 py36hfa32c7d_0 defaults
olefile 0.46 py36_0 defaults
opencv 3.4.2 py36h6fd60c2_1 defaults
openssl 1.1.1g h7b6447c_0 defaults
pandas 1.1.1 py36he6710b0_0 defaults
pcre 8.44 he6710b0_0 defaults
pillow 7.2.0 py36hb39fc2d_0 defaults
pip 20.2.2 py36_0 defaults
pixman 0.40.0 h7b6447c_0 defaults
prettytable 0.7.2 py_3 conda-forge
protobuf 3.5.1 py36_3 conda-forge
py-opencv 3.4.2 py36hb342d67_1 defaults
pycparser 2.20 py_2 defaults
pyparsing 2.4.7 py_0 defaults
pyqt 5.9.2 py36h05f1152_2 defaults
python 3.6.12 hcff3b4d_2 defaults
python-dateutil 2.8.1 py_0 defaults
pytorch 1.4.0 cuda101py36h02f0884_0 defaults
pytz 2020.1 py_0 defaults
pyyaml 5.3.1 py36h7b6447c_1 defaults
qt 5.9.7 h5867ecd_1 defaults
readline 8.0 h7b6447c_0 defaults
scipy 1.5.2 py36h0b6359f_0 defaults
setuptools 49.6.0 py36_0 defaults
sip 4.19.8 py36hf484d3e_0 defaults
six 1.15.0 py_0 defaults
sqlite 3.33.0 h62c20be_0 defaults
tensorboardx 2.0 py_0 conda-forge
tk 8.6.10 hbc83047_0 defaults
torchvision 0.5.0 py36_cu101 pytorch
tornado 6.0.4 py36h7b6447c_1 defaults
tqdm 4.48.2 py_0 defaults
wheel 0.35.1 py_0 defaults
xz 5.2.5 h7b6447c_0 defaults
yaml 0.2.5 h7b6447c_0 defaults
zlib 1.2.11 h7b6447c_3 defaults
zstd 1.4.5 h9ceee32_0 defaults

Any help would be appreciated. Thanks!

Getting size mismatch error when training on panoptic dataset

Hi there,

When I run python run/train_3d.py --cfg configs/panoptic/resnet50/prn64_cpn80x80x20_960x512_cam5.yaml, I get the following error:

Traceback (most recent call last):
  File "run/train_3d.py", line 160, in <module>
    main()
  File "run/train_3d.py", line 107, in main
    config, is_train=True)
  File "/home/ubuntu/voxelpose-pytorch/run/../lib/models/multi_person_posenet.py", line 112, in get_multi_person_pose_net
    backbone = eval(cfg.BACKBONE_MODEL + '.get_pose_net')(cfg, is_train=is_train)
  File "/home/ubuntu/voxelpose-pytorch/run/../lib/models/pose_resnet.py", line 277, in get_pose_net
    model.init_weights(cfg.NETWORK.PRETRAINED_BACKBONE)
  File "/home/ubuntu/voxelpose-pytorch/run/../lib/models/pose_resnet.py", line 222, in init_weights
    self.load_state_dict(pretrained_state_dict)
  File "/home/ubuntu/voxelpose-pytorch/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 830, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for PoseResNet:
	size mismatch for final_layer.weight: copying a param with shape torch.Size([17, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([15, 256, 1, 1]).
	size mismatch for final_layer.bias: copying a param with shape torch.Size([17]) from checkpoint, the shape in current model is torch.Size([15]).

Can anyone help? Is this an issue with JOINTS_DEF in lib/dataset/panoptic.py?
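
A hedged workaround sketch (an assumption, not necessarily the authors' intended fix): when the checkpoint's final layer has a different number of joints (17) than the current model (15), one common option is to drop the mismatched tensors and load the rest with strict=False, letting the final layer train from scratch. The helper below is hypothetical.

import torch

def load_matching_weights(model, checkpoint_path):
    pretrained = torch.load(checkpoint_path, map_location="cpu")
    if isinstance(pretrained, dict) and "state_dict" in pretrained:
        pretrained = pretrained["state_dict"]  # assumption about checkpoint layout
    model_state = model.state_dict()
    filtered = {k: v for k, v in pretrained.items()
                if k in model_state and v.shape == model_state[k].shape}
    model.load_state_dict(filtered, strict=False)  # skip the mismatched final layer
    return model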
