mks0601 / 3dmppe_rootnet_release


Official PyTorch implementation of "Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image", ICCV 2019

License: MIT License

Python 100.00%
3d-human-pose computer-vision deep-learning human-pose-estimation iccv2019 pytorch

3dmppe_rootnet_release's People

Contributors

mks0601


3dmppe_rootnet_release's Issues

Problem about input_size for customized dataset

Hi, thanks for the great work!

I've been trying to test the pre-trained model on the CityPersons subset of the Cityscapes dataset.

The code ran through fine when input_size was set to (256, 256), the default for Human3.6M, but the output root location was obviously wrong because of the scaling problem. The native resolution of the CityPersons images is (2048, 1024). However, when I set input_size to (2048, 1024), I got the following error:

06-15 11:35:20 Creating dataset...
loading annotations into memory...
Done (t=0.04s)
creating index...
index created!
06-15 11:35:21 Load checkpoint from /dump/algopre/c-szan/github/3DMPPE_ROOTNET_RELEASE/main/../output/model_dump/snapshot_18.pth.tar
06-15 11:35:21 Creating graph...
  0%|                                                                                                              | 0/33 [00:00<?, ?it/s]THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1607370193460/work/aten/src/THC/THCCachingHostAllocator.cpp line=278 error=700 : an illegal memory access was encountered
  0%|                                                                                                              | 0/33 [02:02<?, ?it/s]
Traceback (most recent call last):
  File "test.py", line 54, in <module>
    main()
  File "test.py", line 45, in main
    coord_out = tester.model(input_img, cam_param)
  File "/dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 161, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 171, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
    output.reraise()
  File "/dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/_utils.py", line 428, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
    output = module(*input, **kwargs)
  File "/dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/dump/algopre/c-szan/github/3DMPPE_ROOTNET_RELEASE/main/model.py", line 100, in forward
    fm = self.backbone(input_img)
  File "/dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/dump/algopre/c-szan/github/3DMPPE_ROOTNET_RELEASE/main/../common/nets/resnet.py", line 64, in forward
    x = self.layer4(x)
  File "/dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/container.py", line 117, in forward
    input = module(input)
  File "/dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/site-packages/torchvision/models/resnet.py", line 116, in forward
    identity = self.downsample(x)
  File "/dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/container.py", line 117, in forward
    input = module(input)
  File "/dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 423, in forward
    return self._conv_forward(input, self.weight)
  File "/dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 420, in _conv_forward
    self.padding, self.dilation, self.groups)
RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR
You can try to repro this exception using the following code snippet. If that doesn't trigger the error, please include your original repro script when reporting this issue.

import torch
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.benchmark = True
torch.backends.cudnn.deterministic = False
torch.backends.cudnn.allow_tf32 = True
data = torch.randn([32, 1024, 128, 64], dtype=torch.float, device='cuda', requires_grad=True)
net = torch.nn.Conv2d(1024, 2048, kernel_size=[1, 1], padding=[0, 0], stride=[2, 2], dilation=[1, 1], groups=1)
net = net.cuda().float()
out = net(data)
out.backward(torch.randn_like(out))
torch.cuda.synchronize()

ConvolutionParams
    data_type = CUDNN_DATA_FLOAT
    padding = [0, 0, 0]
    stride = [2, 2, 0]
    dilation = [1, 1, 0]
    groups = 1
    deterministic = false
    allow_tf32 = true
input: TensorDescriptor 0x7f260a187ca0
    type = CUDNN_DATA_FLOAT
    nbDims = 4
    dimA = 32, 1024, 128, 64,
    strideA = 8388608, 8192, 64, 1,
output: TensorDescriptor 0x7f2ed40a1160
    type = CUDNN_DATA_FLOAT
    nbDims = 4
    dimA = 32, 2048, 64, 32,
    strideA = 4194304, 2048, 32, 1,
weight: FilterDescriptor 0x7f2ed40a2b40
    type = CUDNN_DATA_FLOAT
    tensor_format = CUDNN_TENSOR_NCHW
    nbDims = 4
    dimA = 2048, 1024, 1, 1,
Pointer addresses:
    input: 0x7f1abc000000
    output: 0x7f1a7c000000
    weight: 0x7f3006a00000


terminate called after throwing an instance of 'c10::Error'
  what():  CUDA error: an illegal memory access was encountered
Exception raised from create_event_internal at /opt/conda/conda-bld/pytorch_1607370193460/work/c10/cuda/CUDACachingAllocator.cpp:687 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7f305a5d18b2 in /dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: c10::cuda::CUDACachingAllocator::raw_delete(void*) + 0xad2 (0x7f305a823982 in /dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/lib/libc10_cuda.so)
frame #2: c10::TensorImpl::release_resources() + 0x4d (0x7f305a5bcb7d in /dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #3: <unknown function> + 0x5fe1ea (0x7f309bd8e1ea in /dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #4: <unknown function> + 0x5fe296 (0x7f309bd8e296 in /dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
<omitting python frames>
frame #19: __libc_start_main + 0xf0 (0x7f30cd22f840 in /lib/x86_64-linux-gnu/libc.so.6)

Aborted (core dumped)

To debug, I then set input_size to (1024, 1024). That ran through fine, and the result made more sense than with (256, 256). I also tried (2048, 2048), which gave a different error:

06-15 11:45:01 Creating dataset...
loading annotations into memory...
Done (t=0.02s)
creating index...
index created!
06-15 11:45:02 Load checkpoint from /dump/algopre/c-szan/github/3DMPPE_ROOTNET_RELEASE/main/../output/model_dump/snapshot_18.pth.tar
06-15 11:45:02 Creating graph...
  0%|                                                                                                              | 0/33 [02:14<?, ?it/s]
Traceback (most recent call last):
  File "/dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 872, in _try_get_data
    data = self._data_queue.get(timeout=timeout)
  File "/dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/queue.py", line 173, in get
    self.not_empty.wait(remaining)
  File "/dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/threading.py", line 299, in wait
    gotit = waiter.acquire(True, timeout)
  File "/dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 85820) is killed by signal: Killed.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "test.py", line 54, in <module>
    main()
  File "test.py", line 43, in main
    for itr, (input_img, cam_param) in enumerate(tqdm(tester.batch_generator)):
  File "/dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/site-packages/tqdm/std.py", line 1178, in __iter__
    for obj in iterable:
  File "/dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
    data = self._next_data()
  File "/dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1068, in _next_data
    idx, data = self._get_data()
  File "/dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1024, in _get_data
    success, data = self._try_get_data()
  File "/dump/algopre/c-szan/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 885, in _try_get_data
    raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str)) from e
RuntimeError: DataLoader worker (pid(s) 85820) exited unexpectedly

I don't know what is going on when input_size changes. Could you please look into the problem and give me some clues?

Thanks a lot!
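For what it's worth, a rough sketch of the alternative I am considering: keep the 256x256 input that the pre-trained snapshot expects and resize each cropped person box to that size, instead of raising input_size to the full 2048x1024 frame (the function name, file path, and example bbox below are placeholders, not the repo's own loader):

    import cv2
    import numpy as np
    import torch

    def crop_and_resize(img, bbox, input_shape=(256, 256)):
        # bbox = (xmin, ymin, width, height) in pixels of the full-resolution frame
        x, y, w, h = [int(round(v)) for v in bbox]
        x, y = max(x, 0), max(y, 0)
        crop = img[y:y + h, x:x + w, :]
        crop = cv2.resize(crop, (input_shape[1], input_shape[0]))
        crop = crop[:, :, ::-1].astype(np.float32) / 255.0            # BGR -> RGB, [0, 1]
        mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)      # ImageNet stats
        std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
        crop = (crop - mean) / std
        return torch.from_numpy(crop.transpose(2, 0, 1).copy())[None]  # 1 x 3 x H x W

    img = cv2.imread('leftImg8bit_frame.png')                 # placeholder 2048 x 1024 frame
    inp = crop_and_resize(img, bbox=(500, 200, 180, 420))     # placeholder person bbox

If I understand the paper correctly, the metric scaling then has to come from the k value (focal length and bbox size), not from the input resolution.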

PoseNet

Hello, thanks for your good work.
Could you let me know where the PoseNet code is and where to find the MPJPE results from your ICCV 2019 paper?
Looking forward to your reply. Thanks.

Dataset and visualization?

  1. How can I get the visualization results? There seems to be no visualization code, and I would like to produce visualizations like the ones in your demonstration.

  2. I don't understand the datasets' JSON files, such as Human36M_subjectx.json. They contain some information about camera parameters, but aren't there also the real human coordinates (x, y, z)? How can I get the real (x, y, z)?

  3. The test result 'bbox_root_human36m_output.json' contains 'bbox' and 'root_cam' but no predicted coordinates (x, y, z). How can I get the predicted (x, y, z)?

Sorry to bother you, thank you very much!
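For context (questions 2 and 3), my current understanding of how the camera parameters in those JSON files relate camera-space joints to pixel coordinates is the standard pinhole model; this is my own sketch, not code taken from the repo, and it should also let me project the 'root_cam' values from the result file back onto the image:

    import numpy as np

    def cam2pixel(joint_cam, f, c):
        # joint_cam: (J, 3) camera-centered coordinates in mm; f = (fx, fy); c = (cx, cy)
        x = joint_cam[:, 0] / joint_cam[:, 2] * f[0] + c[0]
        y = joint_cam[:, 1] / joint_cam[:, 2] * f[1] + c[1]
        z = joint_cam[:, 2]                      # keep the absolute depth in mm
        return np.stack([x, y, z], axis=1)

    def pixel2cam(joint_img, f, c):
        # inverse mapping: pixel (x, y) plus absolute depth z back to camera space
        x = (joint_img[:, 0] - c[0]) / f[0] * joint_img[:, 2]
        y = (joint_img[:, 1] - c[1]) / f[1] * joint_img[:, 2]
        return np.stack([x, y, joint_img[:, 2]], axis=1)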

MuPots Evaluation

Hi,
I have a question regarding MuPots annotations. I know you did not create the dataset. However, I see that annotations for some people in the images are missing. For example, for TS1/img_0000001.jpg, the image has 3 people, but annotations for 2d and 3d_kpts come only for 2 people.

Have you come across this problem while evaluating your method?

Thanks

different joints

Very cool work. I'm wondering if I could apply this to a different dataset. For my studies I have different 2D joints defined than the COCO dataset (some are the same). How would I change the 3D data needed for training to accommodate these?

Download links failure

Hi, thanks for your great contribution. It seems the links for the Human3.6M parsed data and the MuCo parsed and composited data are no longer available. Could you please check the download links?

About 3DPW dataset

Hi, I'm sorry to bother you, but I have a problem that has been bothering me for a long time. I downloaded the 3DPW and MPII datasets to try to train RootNet, but the following error was reported. I am on a single GPU and have already changed num_worker = 0.
Traceback (most recent call last):
  File "D:/DeepLearning/3DMPPE_ROOTNET/3DMPPE_ROOTNET_RELEASE-master/main/train.py", line 83, in <module>
    main()
  File "D:/DeepLearning/3DMPPE_ROOTNET/3DMPPE_ROOTNET_RELEASE-master/main/train.py", line 43, in main
    for itr, (input_img, k_value, root_img, root_vis, joints_have_depth) in enumerate(trainer.batch_generator):
  File "D:\Software\Anaconda\envs\topdown3d\lib\site-packages\torch\utils\data\dataloader.py", line 683, in __next__
    data = self._next_data()
  File "D:\Software\Anaconda\envs\topdown3d\lib\site-packages\torch\utils\data\dataloader.py", line 723, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "D:\Software\Anaconda\envs\topdown3d\lib\site-packages\torch\utils\data\_utils\fetch.py", line 52, in fetch
    return self.collate_fn(data)
  File "D:\Software\Anaconda\envs\topdown3d\lib\site-packages\torch\utils\data\_utils\collate.py", line 175, in default_collate
    return [default_collate(samples) for samples in transposed]  # Backwards compatibility.
  File "D:\Software\Anaconda\envs\topdown3d\lib\site-packages\torch\utils\data\_utils\collate.py", line 175, in <listcomp>
    return [default_collate(samples) for samples in transposed]  # Backwards compatibility.
  File "D:\Software\Anaconda\envs\topdown3d\lib\site-packages\torch\utils\data\_utils\collate.py", line 149, in default_collate
    return default_collate([torch.as_tensor(b) for b in batch])
  File "D:\Software\Anaconda\envs\topdown3d\lib\site-packages\torch\utils\data\_utils\collate.py", line 141, in default_collate
    return torch.stack(batch, 0, out=out)
RuntimeError: stack expects each tensor to be equal size, but got [] at entry 0 and [1] at entry 1

I also tried to add transforms.Resize((256, 256)) to the trainset3d_loader.append(DatasetLoader()) call in base.py. I don't know what img_size should be; I tried 128 and 256, but the problem was never solved. What should I do? I look forward to your explanation. Thank you very much.

Demo results on custom image/video

Hi,

Your work seems great, and I was wondering if you could include a demo.py script that takes a single image or video as input and produces the 3D visual output.

The world coordinates of PW3D dataset

I would like to ask whether it is possible to get the world coordinates of the joints from the PW3D dataset. The README you provided does not contain information about the camera extrinsics and/or world coordinates. Could you please release the code for converting PW3D to the JSON annotations (based on the official PW3D sequenceFiles/*.pkl)? Thanks.

Testing without access to camera intrinsic parameters

In the paper, you mentioned that RootNet can be used at test time even without fx and fy. Does that mean RootNet can still give a good depth estimate with fake camera intrinsic values (since, in the pipeline you built, the features fx, fy, cx, cy are still required during testing)? Is this because the intrinsics are also somehow learned during training for otherwise unseen images?

I know you are busy with other stuff, so thanks for your reply in advance when you have time!

How do I set root_depth_list

Hi, thank you for providing the code and such an insightful paper. I have a doubt: how do I set root_depth_list based on my own image? Looking forward to your reply.

Using ResNet-50 + FPN as DetectNet

To train on my own dataset, I want to obtain bounding boxes with Mask R-CNN.
Am I right that you used the ResNet-50 + FPN variant of Mask R-CNN?

human-images

Hello, I have a problem: I cannot decompress the human images after downloading them. I opened the link in the README, which showed three folders, including masks, images, and so on. Should I directly click to download images, or open images and download the image.tar.gza* parts one by one? In addition, how many GB should the downloaded images be? Thank you.

Learning rate decrease code problem

Hello, I have been reviewing your paper and code (RootNet & PoseNet) for several days. I'd like to point out that the learning rate decrease code seems to be implemented in the wrong way.

For instance, line 77 uses the loop variable e, and I guess that lines 78-84 need to be indented by 4 spaces?

idx = cfg.lr_dec_epoch.index(e)

Your Code

    def set_lr(self, epoch):
        for e in cfg.lr_dec_epoch:
            if epoch < e:
                break
        if epoch < cfg.lr_dec_epoch[-1]:
            idx = cfg.lr_dec_epoch.index(e)
            for g in self.optimizer.param_groups:
                g['lr'] = cfg.lr / (cfg.lr_dec_factor ** idx)
        else:
            for g in self.optimizer.param_groups:
                g['lr'] = cfg.lr / (cfg.lr_dec_factor ** len(cfg.lr_dec_epoch))

I guess this is the right code? 🤔

    def set_lr(self, epoch):
        for e in cfg.lr_dec_epoch:
            if epoch < e:
                break
            if epoch < cfg.lr_dec_epoch[-1]:
                idx = cfg.lr_dec_epoch.index(e)
                for g in self.optimizer.param_groups:
                    g['lr'] = cfg.lr / (cfg.lr_dec_factor ** idx)
            else:
                for g in self.optimizer.param_groups:
                    g['lr'] = cfg.lr / (cfg.lr_dec_factor ** len(cfg.lr_dec_epoch))

By the way, I trained and tested several times with several protocols and datasets (Human3.6M Protocol 2 / MuCo / 3DPW), but I cannot reproduce the accuracy you reported in the README or the paper. Could the problem above be the cause?
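For reference, the decay the code seems to intend can also be written with PyTorch's built-in MultiStepLR, which drops the learning rate by lr_dec_factor at each milestone epoch. This is only a sketch with placeholder values, assuming lr_dec_epoch and lr_dec_factor mean the same thing as in config.py:

    import torch
    from torch.optim.lr_scheduler import MultiStepLR

    lr, lr_dec_epoch, lr_dec_factor = 1e-3, [17, 21], 10         # placeholder values
    params = [torch.nn.Parameter(torch.zeros(1))]                 # stand-in for model params
    optimizer = torch.optim.Adam(params, lr=lr)
    scheduler = MultiStepLR(optimizer, milestones=lr_dec_epoch, gamma=1.0 / lr_dec_factor)

    for epoch in range(25):
        # ... train one epoch (optimizer.step() calls) ...
        scheduler.step()          # replaces the manual set_lr(epoch) call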

data/PW3D

Hello!
I can't find this "joint_cam" key; it is not present in the 3DPW_train.json file.

Traceback (most recent call last):
  File "train.py", line 83, in <module>
    main()
  File "train.py", line 33, in main
    trainer._make_batch_generator()
  File "/home/zqq/project/3DMPPE_ROOTNET_RELEASE-master/main/../common/base.py", line 94, in _make_batch_generator
    trainset3d_loader.append(DatasetLoader(eval(cfg.trainset_3d[i])("train"), True, transforms.Compose([\
  File "/home/zqq/project/3DMPPE_ROOTNET_RELEASE-master/main/../data/PW3D/PW3D.py", line 18, in __init__
    self.data = self.load_data()
  File "/home/zqq/project/3DMPPE_ROOTNET_RELEASE-master/main/../data/PW3D/PW3D.py", line 35, in load_data
    joint_cam = np.array(ann['joint_cam'], dtype=np.float32).reshape(-1,3)
KeyError: 'joint_cam'

test dataset

Hello, for the Human3.6M test set, one image is sampled every how many frames? Thank you.

Do not have 'c' in MSCOCO annot

Hi, it seems there is no 'c' key in the MSCOCO 2017 annotations. Could you help me with this problem?
Traceback (most recent call last):
  File "/home/workspace/code/3DMPPE_ROOTNET/main/test.py", line 60, in <module>
    main()
  File "/home/workspace/code/3DMPPE_ROOTNET/main/test.py", line 56, in main
    tester._evaluate(preds, cfg.result_dir)
  File "/home/workspace/code/3DMPPE_ROOTNET/common/base.py", line 172, in _evaluate
    self.testset.evaluate(preds, result_save_path)
  File "/home/workspace/code/3DMPPE_ROOTNET/data/MSCOCO/MSCOCO.py", line 96, in evaluate
    c = gt['c']
KeyError: 'c'

dataset availability

Hi,

Thank you for open-sourcing such great work! I would like to ask when you will release the pre-processed dataset, or whether you can share the pre-processing code. Looking forward to your response.

How to match the estimated bbox with the ground-truth 2D pose?

Thanks for sharing your nice work!

I have a naive question about how to match the estimated bbox with the ground-truth 2D pose. Since the bounding boxes are obtained from Mask R-CNN without fine-tuning, the order of the boxes may differ from the ground truth. I would appreciate any hint you can give me.
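For reference, the matching heuristic I am considering (my own sketch, not the authors' evaluation protocol): derive a tight box from each ground-truth 2D pose and pair every detection with the ground-truth pose of highest IoU:

    import numpy as np

    def iou_xywh(a, b):
        # a, b are (x, y, w, h) boxes
        ax1, ay1, ax2, ay2 = a[0], a[1], a[0] + a[2], a[1] + a[3]
        bx1, by1, bx2, by2 = b[0], b[1], b[0] + b[2], b[1] + b[3]
        iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
        ih = max(0.0, min(ay2, by2) - max(ay1, by1))
        inter = iw * ih
        union = a[2] * a[3] + b[2] * b[3] - inter
        return inter / union if union > 0 else 0.0

    def match_detections_to_gt(det_bboxes, gt_poses):
        # gt_poses: list of (J, 2) arrays of ground-truth 2D keypoints
        gt_bboxes = []
        for kpts in gt_poses:
            x1, y1 = kpts.min(axis=0)
            x2, y2 = kpts.max(axis=0)
            gt_bboxes.append((x1, y1, x2 - x1, y2 - y1))
        matches = []
        for det in det_bboxes:
            ious = [iou_xywh(det, gt) for gt in gt_bboxes]
            matches.append(int(np.argmax(ious)) if ious else -1)
        return matches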

Question about COCO dataset

Hi, first of all, thank you for open sourcing such a great repo.

I would like to ask how you trained on MSCOCO together with the MuCo dataset, given that they have
different joint configurations.

Models Corrupted

I have attempted to download every pre-trained model linked (H36M Protocol 1+2 and MSCOCO), and none of them can be unzipped because the files are corrupt: 1, 2.

I have tried multiple tools for unzipping, and the outcome is the same. Could you check that the files have been correctly zipped?

Edit: The same applies to the models in the PoseNet repo.

Is there any follow-up study?

Hello, I read your paper and thought it was very meaningful work.
Do you have any follow-up studies? Has RootNet been extended to hand localization?
Or, if you could recommend similar work to me, thank you very much.
I wish you a happy life.

Issue with 3d visualisation

The 2D output image comes out fine, but in the 3D case the points are not mapped properly and produce a wrong 3D figure. Please help @mks0601
(screenshots attached: output_pose_2d, Screenshot from 2021-11-08 18-04-07)

Measurement of bbox_real

Congratulations on the paper.

I am working on pose correction for camera parameters such as camera angle, perspective transforms, etc., and I would like to use this algorithm. I have a few questions.

In this line you mention bbox_real = (2000, 2000). For in-the-wild images, how can we measure this?

Thanks in advance.
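For context, my reading of how bbox_real enters the computation, as a sketch of the k value from the paper (please correct me if this is wrong): A_real = 2000mm x 2000mm is a fixed assumption about the area a person roughly occupies, not something measured per image, so for in-the-wild images it is simply kept constant:

    import math

    def compute_k(fx, fy, bbox_w_px, bbox_h_px, real_w_mm=2000.0, real_h_mm=2000.0):
        # k = sqrt(A_real * fx * fy / A_img); RootNet then predicts a correction to it
        return math.sqrt(real_w_mm * real_h_mm * fx * fy / (bbox_w_px * bbox_h_px))

    k = compute_k(fx=1500.0, fy=1500.0, bbox_w_px=180.0, bbox_h_px=420.0)   # example values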

Code for Full Framework Demo

I am interested in qualitative testing of your framework by demoing on my own data. I essentially want to combine Detectron, PoseNet and RootNet, and push my own image/video data through to obtain final absolute camera-centered coordinates of multiple persons’ keypoints (+ visualizations, although I see code exists for this so should be OK).

Do you have the code to demo the full pipeline? It seems no code exists for testing of your full framework, which I am sure others will also be interested in.

Why enlarge the bbox?

Hi, thank you for providing the code and such an insightful paper. I have a doubt: in line 63 you enlarge the bounding box by 25% in both directions. However, before that you sanitize the bounding box to ensure that [xmin, ymin, xmax, ymax] all lie within the image dimensions.
Is it possible that after enlarging, (xmin, ymin) or (xmax, ymax) might fall outside the image dimensions?

Secondly, I have a basic question: are we fixing the aspect ratio of the bounding boxes to 1:1 because the human is assumed to occupy 2000mm x 2000mm?
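For reference, this is how I currently understand the box post-processing, as a rough paraphrase rather than a copy of the repo's function: the box is stretched to the network's aspect ratio and then scaled by 1.25, and the enlarged box can indeed reach past the image border; if I understand correctly, the later crop simply fills those regions with padding:

    import numpy as np

    def adjust_bbox(bbox, target_aspect=1.0, scale=1.25):
        # bbox = (x, y, w, h); expand the shorter side so that w / h == target_aspect,
        # then enlarge by `scale` around the box center
        x, y, w, h = bbox
        cx, cy = x + 0.5 * w, y + 0.5 * h
        if w > target_aspect * h:
            h = w / target_aspect
        else:
            w = h * target_aspect
        w, h = w * scale, h * scale
        return np.array([cx - 0.5 * w, cy - 0.5 * h, w, h], dtype=np.float32)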

pretrained file for demo.py is damaged

I can't extract snapshot_18.pth.tar from your Google drive

error like this:
tar: This does not look like a tar archive
tar: Skipping to next header
tar: A lone zero block at 7115
tar: Exiting with failure status due to previous errors

Combination Strategy for Root + Pose

Hello! Thanks for this implementation of a cool paper! Currently I am trying to combine RootNet and PoseNet into a single pipeline for 3D inference. What I want to ask is: what is the combination strategy for the two? What do we do with the two outputs? How do we extract the crop from PoseNet to input into RootNet? What if there is a problem with occlusion, etc.? Given that the paper uses Mask R-CNN to first extract individual humans, do we mask out the unnecessary regions from a single crop?

Then what do we do with the PoseNet and RootNet predictions? Do we just add them together? Because, judging from the results in the paper, the x, y, z produced by PoseNet and RootNet are in pixels, right?

Also, RootNet requires a pre-computed k value; how did you get this value?

Thanks!
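For context, my current guess at the combination, as a hedged sketch (please correct me if this is not what the paper intends): PoseNet gives per-joint (x, y) in image pixels plus a root-relative depth, RootNet gives the absolute root depth, and the two depths are summed before back-projecting with the intrinsics:

    import numpy as np

    def combine(pose_xy_px, pose_rel_depth_mm, root_depth_mm, f, c):
        # pose_xy_px: (J, 2) pixel coords; pose_rel_depth_mm: (J,) root-relative depth in mm
        z_abs = pose_rel_depth_mm + root_depth_mm            # absolute depth per joint
        x = (pose_xy_px[:, 0] - c[0]) / f[0] * z_abs         # pinhole back-projection
        y = (pose_xy_px[:, 1] - c[1]) / f[1] * z_abs
        return np.stack([x, y, z_abs], axis=1)               # (J, 3) camera-centered, mm

If I understand correctly, PoseNet's x, y must first be mapped from heatmap/crop coordinates back to the original image pixels, and its z from the discretized heatmap depth to millimetres, before this step.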

test on wild images

Hi,
I want to get the root depth in NDC space from an in-the-wild image, but the image was captured by a camera with different intrinsics (its focal length differs from the 1500 used in the code).
Can I run RootNet inference on that image? If so, how can I get the correct depth in NDC space?
Thanks a lot!

How to set the config for the FreiHAND dataset

Hi,

Thanks for making this awesome project open source. When I try to train RootNet on the FreiHAND dataset, I cannot find the config for this dataset, such as how to set bbox_real, pixel_mean, and pixel_std. If you could provide the config for the FreiHAND dataset, I would be very appreciative.

2D image and skeleton data on a 3D plot

Hello. My name is Yurim, and I am interested in pose estimation research.

I would like to insert a 2D image into a 3D plot as in the image below, and I am wondering whether I could get some help with this.

(image)

I looked through the vis.py code, but it seems to show only the skeleton data, so I am asking cautiously.

Thank you.

Cannot get reported AP on MuPoTS dataset

I downloaded the bbox from https://drive.google.com/drive/folders/1y0NM6dBbfQgOTRQ6t-V6qdvIt2J6rndS and the pre-trained model from https://drive.google.com/drive/folders/1nQfOIgc7_AG5xPAO-vtG_L0WxdOelxet, but I could not reproduce the AP score (31.0) reported in README.md.

My result is

~/3DMPPE_ROOTNET_RELEASE/main$ python test.py --gpu 1 --test_epoch 18
>>> Using GPU: 1
07-14 17:00:48 Creating dataset...
loading annotations into memory...
Done (t=1.49s)
creating index...
index created!
Get bounding box from ../data/MuPoTS/bbox/bbox_mupots_output.json
07-14 17:00:58 Load checkpoint from ../output/muco_coco/model_dump/snapshot_18.pth.tar
07-14 17:00:58 Creating graph...
100%|████████████████████████████████████| 2163/2163 [16:05<00:00,  2.24it/s]
Evaluation start...
Test result is saved at ../output/muco_coco/result/bbox_root_mupots_output.json
loading annotations into memory...
Done (t=1.74s)
creating index...
index created!
AP_root: 0.1514230417627361

Root Result with BBox Issue.

Hello.

Thank you for your great work.

I tried demo.py in the RootNet repo with your "input.jpg".
You gave us the bbox values like this:

bbox_list = [
    [139.41, 102.25, 222.39, 241.57],
    [287.17, 61.52, 74.88, 165.61],
    [540.04, 48.81, 99.96, 223.36],
    [372.58, 170.84, 266.63, 217.19],
    [0.5, 43.74, 90.1, 220.09]
]

and this seems to give good root results.

But there is a problem when I use my own bbox results obtained from Detectron2:

bbox_list = [[367.4241, 177.2487, 636.8704, 389.0179],
             [169.7263, 103.9277, 365.6891, 341.9860],
             [  1.0619,  43.5408,  88.0144, 266.1570],
             [537.6930,  51.6290, 639.7495, 265.5615],
             [292.3224,  59.8680, 364.7124, 224.1436]]
(output screenshots: output_root_2d_4, output_root_2d_1)

The results from RootNet are bad. I have also tested with other images, e.g. with 2 people in the image,
but I keep getting bad results even though the humans are detected correctly.
I have used all the pretrained Detectron Mask R-CNN and Faster R-CNN models,
and I also used the pretrained RootNet model "snapshot_18.pth" that you provided.

Is the bbox you provided ground truth? I can't figure out what the issue is.
Could you give me some advice please?

Thank you.
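A possible cause worth checking (this is only a guess): the demo's bbox_list appears to be in (x, y, width, height) format, while Detectron2's pred_boxes are (x1, y1, x2, y2). For example, your first box converts to roughly [367.4, 177.2, 269.4, 211.8], which matches the demo's [372.58, 170.84, 266.63, 217.19] for the same person. A conversion like this before passing the boxes to demo.py might help:

    def xyxy_to_xywh(box):
        # Detectron2-style (x1, y1, x2, y2) -> demo-style (x, y, w, h)
        x1, y1, x2, y2 = box
        return [x1, y1, x2 - x1, y2 - y1]

    bbox_list = [xyxy_to_xywh(b) for b in [
        [367.4241, 177.2487, 636.8704, 389.0179],
        [169.7263, 103.9277, 365.6891, 341.9860],
        [1.0619, 43.5408, 88.0144, 266.1570],
        [537.6930, 51.6290, 639.7495, 265.5615],
        [292.3224, 59.8680, 364.7124, 224.1436],
    ]]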
