
j96w / densefusion

"DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion" code repository

Home Page: https://sites.google.com/view/densefusion

License: MIT License

Python 83.46% Makefile 0.47% Cuda 4.62% C 1.41% C++ 0.05% MATLAB 7.19% Shell 2.20% Dockerfile 0.60%

densefusion's People

Contributors

danfeix, huckl3b3rry87, huipengly, j96w

densefusion's Issues

About image normalization

The documentation of torchvision.models says that
the images have to be loaded into a range of [0, 1] and then normalized using mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].

The cropped image is normalized directly, without first being scaled into [0, 1].
Is this a bug, even though it works?
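
For reference, a minimal sketch of the convention quoted from the torchvision documentation (this is not the repository's data loader, just the standard pipeline, shown on a dummy image):

from PIL import Image
from torchvision import transforms

# ToTensor() scales uint8 [0, 255] pixels to float [0, 1] *before* Normalize is
# applied with the ImageNet statistics quoted above.
norm = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img = Image.new('RGB', (640, 480))   # dummy image for illustration
x = norm(img)                        # (3, 480, 640) normalized tensor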

Training on Multiple GPUs

Hello, thanks for sharing the code.
I am new to PyTorch. As far as I understand, the provided code / the DataParallel function should run PoseNet training on multiple GPUs; however, in my case it only uses one of four GPUs. I have a GPU cluster of 4x 1080 Ti, and torch.cuda.device_count() shows 4. Could you please tell me whether you used multiple GPUs for training, and whether there is something I am missing here?
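
For reference, a minimal sketch of the usual nn.DataParallel pattern (the tiny Linear module below is just a stand-in for PoseNet; this is not the repository's training script):

import torch
import torch.nn as nn

# Wrapping the model in nn.DataParallel replicates it across all visible GPUs;
# a plain .cuda() model stays on a single device.
estimator = nn.Linear(3, 3)                  # stands in for the actual estimator
if torch.cuda.device_count() > 1:
    estimator = nn.DataParallel(estimator)
estimator = estimator.cuda()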

Is it possible to share the trained segmentation model on the YCB-Video dataset?

Hi, thanks for sharing the code!

I'm also trying to use your code in a ROS environment for robot manipulation with objects from the YCB dataset. However, inference in DenseFusion requires segmentation to generate the pose, and it is very time-consuming to train a segmentation model on all the training images in the YCB-Video dataset. I tried to train with the vanilla segmentation code and found that even one epoch takes around 10 hours on the YCB-Video dataset with a single GPU, and we don't have many GPU resources. It would be great if you could share the trained segmentation model on the YCB-Video dataset!

Thanks a lot!

PoseNet (no refine) model evaluation result does not match

On the LineMOD dataset, we evaluated the provided model (trained_models/linemod/pose_model_9_0.01310166542980859.pth) without refinement and the success rate is 0.83169, which does not match what is claimed in the paper (per-pixel: 86.2). Is the provided PoseNet model the one used to evaluate the per-pixel performance? If not, could you please provide the model used for that evaluation?
Thank you!

Segmentation fault

Hi, thanks for sharing the code!!!

I ran the download.sh script successfully,

then I ran

sh ./experiments/scripts/train_linemod.sh

After the object buffers are loaded, I get this error:

>>>>>>>>----------Dataset loaded!---------<<<<<<<<
length of the training set: 2373
length of the testing set: 1336
number of sample points on mesh: 500
symmetry object list: [7, 8]
2019-04-08 03:59:01,660 : Train time 00h 00m 00s, Training started
/home/user/miniconda/envs/py36/lib/python3.6/site-packages/torch/nn/functional.py:1749: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
"See the documentation of nn.Upsample for details.".format(mode))
/home/user/miniconda/envs/py36/lib/python3.6/site-packages/torch/nn/modules/container.py:91: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
input = module(input)
Segmentation fault (core dumped)

How can I solve this error and run the training code?

undefined symbol: _Py_Dealloc

When trying to run the training script ./train_ycb.sh on Ubuntu with Python 2.7.12, it fails with the stack trace:

  • set -e
  • export PYTHONUNBUFFERED=True
  • export CUDA_VISIBLE_DEVICES=0
  • python ./tools/train.py --dataset ycb --dataset_root ./data/YCB_Video_Dataset
    /home/user/Work/DenseFusion/lib/transformations.py:1912: UserWarning: failed to import module _transformations
    warnings.warn('failed to import module %s' % name)
    Traceback (most recent call last):
    File "./tools/train.py", line 26, in
    from lib.loss import Loss
    File "/home/user/Work/DenseFusion/lib/loss.py", line 9, in
    from lib.knn.init import KNearestNeighbor
    File "/home/user/Work/DenseFusion/lib/knn/init.py", line 7, in
    from lib.knn import knn_pytorch as knn_pytorch
    File "/home/user/Work/DenseFusion/lib/knn/knn_pytorch/init.py", line 3, in
    from ._knn_pytorch import lib as _lib, ffi as _ffi
    ImportError: /home/user/Work/DenseFusion/lib/knn/knn_pytorch/_knn_pytorch.so: undefined symbol: _Py_Dealloc

I tried to address this by rebuilding _knn_pytorch.so with Python 2, but it still fails with the stack trace:

python2 build_ffi.py
Traceback (most recent call last):
File "build_ffi.py", line 19, in
include_dirs=[osp.join(abs_path, 'include')]
File "/usr/local/lib/python2.7/dist-packages/torch/utils/ffi/init.py", line 176, in create_extension
ffi = cffi.FFI()
File "/usr/local/lib/python2.7/dist-packages/cffi/api.py", line 46, in init
import _cffi_backend as backend
ImportError: /usr/local/lib/python2.7/dist-packages/_cffi_backend.so: undefined symbol: PyUnicodeUCS2_FromUnicode
Makefile:31: recipe for target 'build/knn_pytorch/_knn_pytorch.so' failed
make: *** [build/knn_pytorch/_knn_pytorch.so] Error 1

How to get the number of training epochs

Hi @j96w, I just wanted to know how to find the number of epochs, in order to estimate how long training will take, because I'm waiting for it to finish. Based on README.txt I assumed epochs=30, but that is not right.

17:38:09,448 : Train time 16h 25m 47s Epoch 39 Batch 8046 Frame 32184 Avg_dis:0.0042653061100281775
2019-03-24 17:38:09,549 : Train time 16h 25m 47s Epoch 39 Batch 8047 Frame 32188 Avg_dis:0.0038781535113230348
2019-03-24 17:38:09,660 : Train time 16h 25m 47s Epoch 39 Batch 8048 Frame 32192 Avg_dis:0.003846941574010998
2019-03-24 17:38:09,777 : Train time 16h 25m 47s Epoch 39 Batch 8049 Frame 32196 Avg_dis:0.002740198280662298
2019-03-24 17:38:09,915 : Train time 16h 25m 47s Epoch 39 Batch 8050 Frame 32200 Avg_dis:0.003869467240292579
2019-03-24 17:38:10,033 : Train time 16h 25m 47s Epoch 39 Batch 8051 Frame 32204 Avg_dis:0.004200253228191286

How many epochs are used for LineMOD training? Thank you in advance.

Error - train_ycb.sh - Pytorch-1.0

I used the Pytorch-1.0 branch and ran ./experiments/scripts/train_ycb.sh.
Then I got this error:

pred = torch.add(torch.bmm(model_points, base), points + pred_t)
RuntimeError: cublas runtime error : the GPU program failed to execute at /pytorch/aten/src/THC/THCBlas.cu:441

My system:

  • GeForce RTX 2080 Ti
  • nvidia/cuda:10.0-cudnn7-devel-ubuntu16.04

Evaluation on LineMOD

Thank you for your help. Sorry to bother you again. I still have some questions.
1. The default maximum number of epochs is 500. Training 104 epochs on YCB gives a dis of 0.0090381440936, but it takes a long time, and the YCB trained_models you provide is pose_refine_model_69_0.009449292959118935.pth. So how do you decide at which epoch to stop training the pose refine model? Is a lower dis always a better model? How do you prevent overfitting?

2. When evaluating on the LineMOD dataset, the final content of eval_result_logs.txt is as follows:

No.13390 NOT Pass! Distance: 0.0217275395989
No.13391 Pass! Distance: 0.0048154219985
No.13392 Pass! Distance: 0.0142814125866
No.13393 Pass! Distance: 0.00356977432966
No.13394 Pass! Distance: 0.00472941761836
No.13395 Pass! Distance: 0.00784354563802
No.13396 Pass! Distance: 0.0127922594547
No.13397 Pass! Distance: 0.00581285078079
No.13398 NOT Pass! Distance: 0.0267377775162
No.13399 Pass! Distance: 0.00628646928817
No.13400 Pass! Distance: 0.0146940667182
No.13401 NOT Pass! Distance: 0.0340396799147
No.13402 NOT Pass! Distance: 0.0713591426611
No.13403 NOT Pass! Distance: 0.0522820688784
No.13404 Pass! Distance: 0.0038491380401
No.13405 NOT Pass! Distance: 0.0213586390018
No.13406 Pass! Distance: 0.00927203428
Object 1 success rate: 0
Object 2 success rate: 0
Object 4 success rate: 0
Object 5 success rate: 0
Object 6 success rate: 0
Object 8 success rate: 0
Object 9 success rate: 0
Object 10 success rate: 0
Object 11 success rate: 1
Object 12 success rate: 0
Object 13 success rate: 0
Object 14 success rate: 0
Object 15 success rate: 0
ALL success rate: 0

Is anything wrong with my evaluation on the LineMOD dataset? How can I get an evaluation result similar to the YCB_Video dataset run with MATLAB?

Thank you in advance.

Originally posted by @sunshantong in #7 (comment)

Inconsistency between equation (2) in the paper and the code implementation

Equation (2) in the paper finds, for each ground-truth model point, the closest predicted model point for symmetric shapes, which is consistent with equation (6) in the PoseCNN paper. But in the code implementation (lines 44-47 in lib/loss.py) you instead find, for each predicted model point, the closest ground-truth point, which is the opposite direction. Can you explain? Those two metrics are different. Thanks!
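
To make the two matching directions concrete, here is a minimal sketch with random stand-in tensors (not the repository's loss code):

import torch

pred = torch.rand(1, 500, 3)     # predicted (transformed) model points
target = torch.rand(1, 500, 3)   # ground-truth (transformed) model points

d = torch.cdist(pred, target)    # (1, 500, 500) pairwise distances

# Paper Eq. (2): for each ground-truth point, distance to its closest predicted point
loss_paper = d.min(dim=1)[0].mean()

# lib/loss.py: for each predicted point, distance to its closest ground-truth point
loss_code = d.min(dim=2)[0].mean()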

Potential leaky information used in evaluating the LineMOD dataset

In eval_linemod.py, the code still uses rmin, rmax, cmin, cmax = get_bbox(meta['obj_bb']) to obtain rmin, rmax, cmin, cmax, which is the important step for cropping the image. In my opinion, gt.yaml is the ground truth for the objects and obj_bb is the 2D bounding box, so I don't know whether the code is right. Maybe I am wrong.
Thank you =。=

Question on loss calculation

Hi,

Thank you for sharing the codes and I have several questions on the loss calculation :

Q1.

t = ori_t[which_max[0]] + points[which_max[0]]
Why do you combine one point from the point cloud with pred_t here?

Q2.

new_target = torch.bmm((new_target - ori_t), ori_base).contiguous()
I am sorry, but I am confused about the reason for updating with this method.
Why is the point cloud updated by subtracting pred_t before the rotation? Shouldn't it be updated the same way as the prediction (adding pred_t after the rotation)?

The following is my understanding:

  1. To predict a residual pose, we can:
    1) update the model points the same way as the prediction, while keeping the target the same;
    2) or keep the model points the same and update the target in the reverse way.

In the code, the point cloud and the target are updated in the same way, and that is hard for me to understand. Could you please help to explain?

Thank you and looking forward to your reply.

Best,
Stacey

Training vanilla SegNet

When I try to run
python3 train.py --dataset_root=./datasets/ycb/YCB_Video_Dataset

I get the following issue :

Traceback (most recent call last):
  File "train.py", line 69, in <module>
    for i, data in enumerate(dataloader, 0):
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 336, in __next__
    return self._process_next_batch(batch)
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 357, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
FileNotFoundError: Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 106, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 106, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/media/intern/disk2/DenseFusion/vanilla_segmentation/data_controller.py", line 47, in __getitem__
    label = np.array(Image.open('{0}/{1}-label.png'.format(self.root, self.path[index])))
  File "/usr/local/lib/python3.5/dist-packages/PIL/Image.py", line 2652, in open
    fp = builtins.open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: './datasets/ycb/YCB_Video_Dataset/data_syn/002715-label.png'

I tried several times and got the same issue with different #-label.png files, and the files are present in the folder, so I don't really understand the error. It is always the 22nd call to __getitem__ that crashes. Any idea?
Thanks

Why add "points" in loss.py?

Thanks for sharing the code, amazing work!

I am reading your code, but in loss.py, line 38, when you calculate the loss, why do you also add points? See the copied line below:

pred = torch.add(torch.bmm(model_points, base), points + pred_t)
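
My reading (an assumption, not confirmed by the authors) is that pred_t is predicted as an offset from each observed cloud point, so the absolute translation of each per-pixel pose hypothesis is points + pred_t. A minimal sketch with the shapes I assume from the repository:

import torch

n = 500                                  # number of sampled pixels (assumed)
model_points = torch.rand(n, 500, 3)     # canonical model points, repeated per pixel
points       = torch.rand(n, 1, 3)       # observed cloud point at each pixel
pred_t       = torch.rand(n, 1, 3)       # predicted offset from that observed point
base         = torch.rand(n, 3, 3)       # predicted rotation per pixel

# If pred_t is only an offset, the absolute translation is (points + pred_t):
pred = torch.add(torch.bmm(model_points, base), points + pred_t)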

Generalization Ability of the Method

Hi dear authors,

I was thinking of using the method for rather small objects (chips, rings, metal pieces, etc. found in devices such as this) and wondered whether the method could actually generalize, or do we really need the precise 3D model of what we are looking for? It is impossible to have models for each and every part in this domain, so at some point the method has to generalize. Does it actually follow this line of thought?

Regards,

pointfeat_1 meaning

Hi, the paper says a PointNet-based network processes each point in the masked 3D point cloud into a geometric feature embedding. But in the code implementation there is only convolution and ReLU applied to the point cloud x in network.py: no spatial transformer network (STN) and no max-pooling over the point cloud, which are two of the most important features of PointNet.

class PoseNetFeat(nn.Module):
    def __init__(self, num_points):
        super(PoseNetFeat, self).__init__()
        self.conv1 = torch.nn.Conv1d(3, 64, 1)
        self.conv2 = torch.nn.Conv1d(64, 128, 1)

        self.e_conv1 = torch.nn.Conv1d(32, 64, 1)
        self.e_conv2 = torch.nn.Conv1d(64, 128, 1)

        self.conv5 = torch.nn.Conv1d(256, 512, 1)
        self.conv6 = torch.nn.Conv1d(512, 1024, 1)

        self.ap1 = torch.nn.AvgPool1d(num_points)
        self.num_points = num_points

    def forward(self, x, emb):
        x = F.relu(self.conv1(x))
        emb = F.relu(self.e_conv1(emb))
        pointfeat_1 = torch.cat((x, emb), dim=1)

        x = F.relu(self.conv2(x))
        emb = F.relu(self.e_conv2(emb))
        pointfeat_2 = torch.cat((x, emb), dim=1)

        x = F.relu(self.conv5(pointfeat_2))
        x = F.relu(self.conv6(x))

        ap_x = self.ap1(x)

        ap_x = ap_x.view(-1, 1024, 1).repeat(1, 1, self.num_points)
        return torch.cat([pointfeat_1, pointfeat_2, ap_x], 1)  # 128 + 256 + 1024

I am confused. Would you explain this to me?
Thanks in advance.

I have some related question about this part.

  1. I'm confused about which variables in this code represent the color embeddings and the geometry embeddings.
    Based on the paper, the embedding dimensions are 128; I understood pointfeat_2 (color embedding + geometry embedding) and ap_x (global feature).

What is the meaning of

pointfeat_1 = torch.cat((x, emb), dim=1)
return torch.cat([pointfeat_1, pointfeat_2, ap_x], 1)  # 128 + 256 + 1024

Would you explain this part?
Thanks.

Originally posted by @trevor-taeyeop in #34 (comment)

About batch and frame

Hi @j96w ! Thank you for your work.
Lines 131-154 in train.py:

            for i, data in enumerate(dataloader, 0):
                #…………
                train_count += 1

                if train_count % opt.batch_size == 0:
                    logger.info('Train time {0} Epoch {1} Batch {2} Frame {3} Avg_dis:{4}'.format(time.strftime("%Hh %Mm %Ss", time.gmtime(time.time() - st_time)), epoch, int(train_count / opt.batch_size), train_count, train_dis_avg / opt.batch_size))

I think one iteration is one batch, so the batch number should be equal to train_count.
In logger.info, why does the batch number equal int(train_count / opt.batch_size), while the frame number equals train_count?
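
My reading of train.py (an assumption, not confirmed by the authors) is that the dataloader yields one frame at a time and gradients are accumulated, so train_count counts frames while an optimizer step happens every opt.batch_size frames; hence Batch = train_count / batch_size in the log line. A self-contained sketch of that pattern:

import torch

model = torch.nn.Linear(3, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
batch_size, train_count = 8, 0

for frame in torch.rand(32, 3):              # 32 dummy "frames"
    loss = model(frame).sum()
    loss.backward()                          # gradients accumulate across frames
    train_count += 1
    if train_count % batch_size == 0:        # one "batch" = batch_size frames
        optimizer.step()
        optimizer.zero_grad()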

3d bbox

Hello,
there are many 3D bounding boxes in the figures of the paper. Could you tell me how to draw the 3D bbox from the output of this network (R and T)?
Thanks for your help!
@j96w
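
A minimal sketch of one way to do it (my own illustration, not the authors' visualization code): take the eight corners of the model's 3D bounding box, transform them with the estimated R and T, project them with the camera intrinsics, and draw the edges:

import numpy as np
import cv2

def draw_3d_bbox(img, model_points, R, t, K):
    # 8 corners of the model's axis-aligned bounding box (object coordinates)
    mins, maxs = model_points.min(axis=0), model_points.max(axis=0)
    corners = np.array([[x, y, z] for x in (mins[0], maxs[0])
                                  for y in (mins[1], maxs[1])
                                  for z in (mins[2], maxs[2])])
    cam = corners @ R.T + t                  # transform into camera coordinates
    uv = cam @ K.T                           # project with the 3x3 intrinsics K
    uv = (uv[:, :2] / uv[:, 2:]).astype(int)
    edges = [(0, 1), (0, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 7), (6, 7),
             (0, 4), (1, 5), (2, 6), (3, 7)]
    for i, j in edges:
        cv2.line(img, tuple(uv[i]), tuple(uv[j]), (0, 255, 0), 2)
    return img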

some confusion about loss calculation

Thank you for sharing your code.
I have some confusion about the dis calculation in the pose estimation stage:
pred = torch.add(torch.bmm(model_points, base), points + pred_t)
Why not:
pred = torch.add(torch.bmm(model_points, base), pred_t)
Thank you and looking forward to your reply.

How to make my own dataset?

Dear sir, I am new to 6D pose estimation and I am trying to learn from your code. When I need to build my own dataset for a specific object (like my own item or something else), how should I make it?

ROS Package

Would it be possible to include the ROS package used in the demo video?
Was the inference time affected when the code was deployed to ROS?
Thank you.

Problem running YCB_Video_toolbox/evaluate_poses_keyframe.m

Hi, when I use the YCB_Video_toolbox to run the MATLAB script YCB_Video_toolbox/evaluate_poses_keyframe.m, I run into the problem below. Can anyone help me? Thanks a lot in advance!

Error using load
Unable to read file 'Densefusion_iterative_result/0026.mat'. No such file or directory.

Error in evaluate_poses_keyframe (line 50)
    result_my = load(filename);

Undefined symbol error inside knn while training

I followed issue #33, extracted the egg file, and moved the *.so and knn_python.py to the root knn dir.
After that, when I run ./experiments/scripts/train_ycb.sh,
it shows the error below:

  • set -e
  • export PYTHONUNBUFFERED=True
  • export CUDA_VISIBLE_DEVICES=0
  • python3 ./tools/train.py --dataset ycb --dataset_root ./datasets/ycb/YCB_Video_Dataset
    /home/taeuk/network/DenseFusion/DenseFusion-Pytorch-1.0/lib/transformations.py:1912: UserWarning: failed to import module _transformations
    warnings.warn('failed to import module %s' % name)
    Traceback (most recent call last):
    File "./tools/train.py", line 26, in
    from lib.loss import Loss
    File "/home/taeuk/network/DenseFusion/DenseFusion-Pytorch-1.0/lib/loss.py", line 9, in
    from lib.knn.init import KNearestNeighbor
    File "/home/taeuk/network/DenseFusion/DenseFusion-Pytorch-1.0/lib/knn/init.py", line 7, in
    from lib.knn import knn_pytorch as knn_pytorch
    ImportError: /home/taeuk/network/DenseFusion/DenseFusion-Pytorch-1.0/lib/knn/knn_pytorch.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN3c105ErrorC1ENS_14SourceLocationERKSs

my system is:
Ubuntu 16.04
GPU : RTX2080 ti
CUDA 10.0
python 3.6.8
pytorch 1.01

Any ideas, however small, would be much appreciated.
Thanks.

Incorrect visualization result on YCB dataset

Hello, thanks for sharing your code!!
I tried to visualize the output on the YCB test set, but the result doesn't align with Fig. 4 in your paper.
Here is one of my visualization results.
The top-left image is generated using the ground-truth R and T from xxx-meta.mat, and it is correct. The bottom-left and bottom-right images are generated by simply replacing R and T with those from the .mat files in result_wo_refine_dir and result_refine_dir. The pose estimation in these two images seems to go wrong.
[attached image: 420_gt]
The visualization process is to transform the points in points.xyz with R and T and then scale and transform those points to fit into the object's tight bounding box.
I used trained checkpoints provided by you.

May I ask your suggestion on what the problem might be?
Thanks for your time.

Real-time pose detection

Hello.
Thank you for sharing the good code.
I want to test the model trained with my own data.
But the evaluation code seems to require the information from the meta file ('cam_t_m2c', 'cam_R_m2c', ...).
I think it is difficult to test the model in real time with this code.
Is there code that only outputs the pose in real time? If not, how can I modify it to do that?
I'll be waiting for your reply.
Thank you.

module 'lib.knn.knn_pytorch' has no attribute 'knn'

Hi, thanks for sharing your code!
I downloaded the latest version of DenseFusion-Pytorch-1.0. When I ran ./experiments/scripts/train_ycb.sh, I got the error 'AttributeError: module 'lib.knn.knn_pytorch' has no attribute 'knn''. My Python version is 3.6.8 (Anaconda custom, 64-bit) and my PyTorch version is 1.0.1.post2.

I tried to insert pdb.set_trace() at line 20 of ./lib/knn/__init__.py:

inds = torch.empty(query.shape[0], self.k, query.shape[2]).long().cuda()

#import pdb
#pdb.set_trace()
knn_pytorch.knn(ref, query, inds)

return inds

I printed dir(knn_pytorch), which shows the following:

(Pdb) p dir(knn_pytorch)
['__doc__', '__loader__', '__name__', '__package__', '__path__', '__spec__']

It seems that the module knn_pytorch doesn't have knn. How can I solve this error? Please help me.

The details are as follows:

+ set -e
+ export PYTHONUNBUFFERED=True
+ PYTHONUNBUFFERED=True
+ export CUDA_VISIBLE_DEVICES=0
+ CUDA_VISIBLE_DEVICES=0
+ python3 ./tools/train.py --dataset ycb --dataset_root ./datasets/ycb/YCB_Video_Dataset
/home/qingqing/Downloads/qingqing_disk/p4600_disk/DenseFusion/lib/transformations.py:1912:      UserWarning: failed to import module _transformations
 warnings.warn('failed to import module %s' % name)
96189
2949
>>>>>>>>----------Dataset loaded!---------<<<<<<<<
length of the training set: 96189
length of the testing set: 2949
number of sample points on mesh: 500
symmetry object list: [12, 15, 18, 19, 20]
/home/qingqing/anaconda3/lib/python3.6/site-packages/torch/nn/_reduction.py:49: UserWarning:    size_average and reduce args will be deprecated, please use reduction='mean' instead.
 warnings.warn(warning.format(ret))
2019-04-23 10:20:35,077 : Train time 00h 00m 00s, Training started
/home/qingqing/anaconda3/lib/python3.6/site-packages/torch/nn/functional.py:2351:   UserWarning: nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.
 warnings.warn("nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.")
/home/qingqing/anaconda3/lib/python3.6/site-packages/torch/nn/functional.py:2423:    UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False    since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
  "See the documentation of nn.Upsample for details.".format(mode))
/home/qingqing/anaconda3/lib/python3.6/site-packages/torch/nn/modules/upsampling.py:129:     UserWarning: nn.Upsample is deprecated. Use nn.functional.interpolate instead.
  warnings.warn("nn.{} is deprecated. Use nn.functional.interpolate instead.".format(self.name))
/home/qingqing/anaconda3/lib/python3.6/site-packages/torch/nn/modules/container.py:92:     UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to   include dim=X as an argument.
  input = module(input)
2019-04-23 10:20:36,413 : Train time 00h 00m 01s Epoch 1 Batch 1 Frame 8     Avg_dis:0.1779076661914587
Traceback (most recent call last):
  File "./tools/train.py", line 237, in <module>
    main()
  File "./tools/train.py", line 140, in main
    loss, dis, new_points, new_target = criterion(pred_r, pred_t, pred_c, target, model_points, idx,     points, opt.w, opt.refine_start)
  File "/home/qingqing/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line     489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/qingqing/Downloads/qingqing_disk/p4600_disk/DenseFusion/lib/loss.py", line 83, in forward
    return loss_calculation(pred_r, pred_t, pred_c, target, model_points, idx, points, w, refine, self.num_pt_mesh, self.sym_list)
  File "/home/qingqing/Downloads/qingqing_disk/p4600_disk/DenseFusion/lib/loss.py", line 44, in loss_calculation
    inds = knn(target.unsqueeze(0), pred.unsqueeze(0))
  File "/home/qingqing/Downloads/qingqing_disk/p4600_disk/DenseFusion/lib/knn/__init__.py", line 23, in forward
    knn_pytorch.knn(ref, query, inds)
AttributeError: module 'lib.knn.knn_pytorch' has no attribute 'knn'

confusion about get_bbox function

The code is in eval_ycb.py, and datasets/ycb/dataset.py.

I think it takes the ROI from the semantic segmentation as input and rounds the new width (height) up to the smallest value in border_list that is not less than the old one (like a ceil function), so the output ROI can only have a fixed set of sizes.

Why not use the ROIs from the segmentation directly?

Thank you and looking forward to your reply.

border_list = [-1, 40, 80, 120, 160, 200, 240, 280, 320, 360, 400, 440, 480, 520, 560, 600, 640, 680]
def get_bbox(posecnn_rois):
    rmin = int(posecnn_rois[idx][3]) + 1
    rmax = int(posecnn_rois[idx][5]) - 1
    cmin = int(posecnn_rois[idx][2]) + 1
    cmax = int(posecnn_rois[idx][4]) - 1
    r_b = rmax - rmin
    for tt in range(len(border_list)):
        if r_b > border_list[tt] and r_b < border_list[tt + 1]:
            r_b = border_list[tt + 1]
            break
    c_b = cmax - cmin
    for tt in range(len(border_list)):
        if c_b > border_list[tt] and c_b < border_list[tt + 1]:
            c_b = border_list[tt + 1]
            break
    center = [int((rmin + rmax) / 2), int((cmin + cmax) / 2)]
    rmin = center[0] - int(r_b / 2)
    rmax = center[0] + int(r_b / 2)
    cmin = center[1] - int(c_b / 2)
    cmax = center[1] + int(c_b / 2)
    if rmin < 0:
        delt = -rmin
        rmin = 0
        rmax += delt
    if cmin < 0:
        delt = -cmin
        cmin = 0
        cmax += delt
    if rmax > img_width:
        delt = rmax - img_width
        rmax = img_width
        rmin -= delt
    if cmax > img_length:
        delt = cmax - img_length
        cmax = img_length
        cmin -= delt
    return rmin, rmax, cmin, cmax
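
For illustration, a hypothetical call with made-up values (idx, img_width and img_length are module-level globals in eval_ycb.py; img_width = 480 counts rows and img_length = 640 counts columns, as assumed here):

import numpy as np

img_width, img_length = 480, 640
idx = 0
# Fake posecnn_rois row: columns 2..5 hold cmin, rmin, cmax, rmax.
posecnn_rois = np.array([[0.0, 0.0, 100.0, 120.0, 180.0, 230.0]])

# The 108x78 ROI is snapped up to the next sizes in border_list (120 and 80),
# re-centred and clipped to the image, giving (115, 235, 100, 180).
print(get_bbox(posecnn_rois))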

Output Vanilla SegNet

I'm trying to use the output of the vanilla SegNet network to label YCB-Video images, but I can't find an efficient way to transform the 22x640x480 output into a single 640x480 label image.
For the moment I'm using something like this:

seg_data = seg(rgb) #  output SegNet
seg_data = seg_data.detach().cpu().numpy()[0]
seg_image = np.zeros((480, 640))
obj_list = []
for i in range(480):
    for j in range(640):
        prob_max = 0
        label = 0
        for r in range(22):
            if seg_data[r][i][j] > prob_max:
                label = r
                prob_max = seg_data[r][i][j]
        seg_image[i][j] = label
        if label not in obj_list:
            obj_list.append(label)

How do you use the output for fast segmentation of an RGB image?
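
The nested loops just pick the channel with the highest score at each pixel, so they can be replaced by an argmax over the class dimension. A minimal sketch (my own suggestion, with a random tensor standing in for the SegNet output of assumed shape (1, 22, 480, 640)):

import torch
import numpy as np

seg_data = torch.rand(1, 22, 480, 640)          # stands in for seg(rgb)
labels = torch.argmax(seg_data[0], dim=0)       # (480, 640) label map
obj_list = torch.unique(labels).tolist()        # object labels present in the image
seg_image = labels.cpu().numpy().astype(np.uint8)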

Potential bug in lib.loss.loss_calculation

Disclaimer: I haven't read the paper

That being said, this looks very suspicious.

pred = torch.add(torch.bmm(model_points, base), points + pred_t)

It should be only

pred = torch.add(torch.bmm(model_points, base), pred_t)

like you have in loss_refiner.py. I don't see a reason to add the points acquired from the camera here, especially because you compute the point-to-point distance error a couple of lines below:

dis = torch.mean(torch.norm((pred - target), dim=2), dim=1)

About the refine_margin

@j96w Thanks for your work!
Here is my question. The condition "best_test < opt.refine_margin (0.013)" is reached after about 10 epochs, which means PoseNet is only trained for about 10 epochs. However, I found that best_test is about 0.006 after training the RefineNet for more than 400 epochs. So, is the refine_margin a little too large? Can I set it smaller (e.g. 0.008) to train PoseNet longer? Or have you found that a refine_margin smaller than 0.013 leads to overfitting of PoseNet?

PointNet implementation

Hi, the paper says a PointNet-based network processes each point in the masked 3D point cloud into a geometric feature embedding. But in the code implementation there is only convolution and ReLU applied to the point cloud x in network.py: no spatial transformer network (STN) and no max-pooling over the point cloud, which are two of the most important features of PointNet.

class PoseNetFeat(nn.Module):
    def __init__(self, num_points):
        super(PoseNetFeat, self).__init__()
        self.conv1 = torch.nn.Conv1d(3, 64, 1)
        self.conv2 = torch.nn.Conv1d(64, 128, 1)

        self.e_conv1 = torch.nn.Conv1d(32, 64, 1)
        self.e_conv2 = torch.nn.Conv1d(64, 128, 1)

        self.conv5 = torch.nn.Conv1d(256, 512, 1)
        self.conv6 = torch.nn.Conv1d(512, 1024, 1)

        self.ap1 = torch.nn.AvgPool1d(num_points)
        self.num_points = num_points

    def forward(self, x, emb):
        x = F.relu(self.conv1(x))
        emb = F.relu(self.e_conv1(emb))
        pointfeat_1 = torch.cat((x, emb), dim=1)

        x = F.relu(self.conv2(x))
        emb = F.relu(self.e_conv2(emb))
        pointfeat_2 = torch.cat((x, emb), dim=1)

        x = F.relu(self.conv5(pointfeat_2))
        x = F.relu(self.conv6(x))

        ap_x = self.ap1(x)

        ap_x = ap_x.view(-1, 1024, 1).repeat(1, 1, self.num_points)
        return torch.cat([pointfeat_1, pointfeat_2, ap_x], 1)  # 128 + 256 + 1024

I am confused. Would you explain this to me?
Thanks in advance.

RuntimeError: the derivative for 'index' is not implemented.

I followed @Mars-y470's suggestion and tried to recompile DenseFusion/lib/knn. The problem from #33 ("module 'lib.knn.knn_pytorch' has no attribute 'knn'") was solved.
However, during YCB training (I ran ./experiments/scripts/train_ycb.sh), I got the error 'RuntimeError: the derivative for 'index' is not implemented'. The details are as follows:

2019-05-11 22:29:54,240 : Test time 08h 36m 53s Test Frame No.2948 dis:0.005122609902173281
2019-05-11 22:29:54,300 : Test time 08h 36m 53s Epoch 33 TEST FINISH Avg dis: 0.01275980792574587
33 >>>>>>>>----------BEST TEST MODEL SAVED---------<<<<<<<<
96189
2949
>>>>>>>>----------Dataset loaded!---------<<<<<<<<
length of the training set: 96189
length of the testing set: 2949
number of sample points on mesh: 2600
symmetry object list: [12, 15, 18, 19, 20]
2019-05-11 22:29:54,795 : Train time 08h 36m 53s, Training started
Traceback (most recent call last):
  File "./tools/train.py", line 237, in <module>
    main()
  File "./tools/train.py", line 145, in main
    dis, new_points, new_target = criterion_refine(pred_r, pred_t, new_target, model_points, idx, new_points)
  File "/home/qingqing/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/qingqing/Downloads/qingqing_disk/p4600_disk/DenseFusion/lib/loss_refiner.py", line 76, in forward
    return loss_calculation(pred_r, pred_t, target, model_points, idx, points, self.num_pt_mesh, self.sym_list)
  File "/home/qingqing/Downloads/qingqing_disk/p4600_disk/DenseFusion/lib/loss_refiner.py", line 45, in loss_calculation
    target = torch.index_select(target, 1, inds.view(-1) - 1)
RuntimeError: the derivative for 'index' is not implemented

It seems that the refinement stage of the network fails.
Could you give me some suggestions? @j96w
Did you meet the same problem? @Mars-y470
Thanks!!

KNN segmentation fault

When I try to train on LineMOD, I run into this:

./experiments/scripts/train_linemod.sh: line 10: 10128 Segmentation fault python3 ./tools/train.py --dataset linemod --dataset_root ./datasets/linemod/Linemod_preprocessed

I have verified that it happens at this line

Could you point me to any solutions? Thanks!

About data augmentation implementation

In readme, it says:

For the YCB_Video dataset, since the synthetic data do not contain background. We randomly select the real training data as the background. In each frame, we also randomly select two instances segmentation clips from another synthetic training image to mask at the front of the input RGB-D image, so that more occlusion situations can be generated.

But I did not find such an implementation in the data loader. Is it somewhere else?
By the way, this work is awesome!

ImportError: torch.utils.ffi requires the cffi package

Hi,
when trying to run the training script experiments/scripts/train_ycb.sh on Ubuntu with Python 2.7.12,
it fails with the stack trace:

set -e
export PYTHONUNBUFFERED=True
EXPORT CUDA_VISIBLE_DEVICES=0
python2 ./tools/train.py --dataset ycb --dataset_root ./datasets/ycb/YCB_VIdeoDataset
/root/catkin_ws/densefusion/DenseFusion/lib/transformations.py:1912: UserWarning: failed to import module _transformations

warnings.warn('failed to import module %s' % name)

Traceback (most recent call last):
File "./tools/train.py", line 26, in
from lib.loss import Loss
File "/root/catkin_ws/densefusion/DenseFusion/lib/loss.py", line 9, in
from lib.knn.init import KNearestNeighbor
File "/root/catkin_ws/densefusion/DenseFusion/lib/knn/init.py", line 7, in
from lib.knn import knn_pytorch as knn_pytorch
File "/root/catkin_ws/densefusion/DenseFusion/lib/knn/knn_pytorch/init.py", line 2, in
from torch.utils.ffi import _wrap_function
File "/usr/local/lib/python2.7/dist-packages/torch/utils/ffi/init.py", line 14, in
raise ImportError("torch.utils.ffi requires the cffi package")

ImportError: torch.utils.ffi requires the cffi package

I went into the DenseFusion/lib/knn folder, changed the Makefile to PYTHON := python2,
and ran make:

python2 build_ffi.py
Traceback (most recent call last):
File "build_ffi.py", line 5, in
from torch.utils.ffi import create_extension
File "/usr/local/lib/python2.7/dist-packages/torch/utils/ffi/init.py", line 14, in
raise ImportError("torch.utils.ffi requires the cffi package")
ImportError: torch.utils.ffi requires the cffi package
Makefile:31: recipe for target 'build/knn_pytorch/_knn_pytorch.so' failed
make: *** [build/knn_pytorch/_knn_pytorch.so] Error 1

Problem with Segmentation on LineMod Dataset

I just wonder how you trained the segmentation network on the LineMOD dataset.
The LineMOD dataset doesn't contain segmentation ground truth for multiple objects to be detected in the same picture, and the masks you used, preprocessed from singleshotpose, provide only one mask for one object in each picture.
So did you train a separate segmentation network for each object using these masks as ground truth? Or, if you trained one segmentation network to detect all the objects simultaneously, how did you get the segmentation ground truth?
Thank you!!!

Training DenseFusion with another dataset

Hey,
I developed a dataset based on a single object, a pink cube, whose data share the same format as YCB (640x480 png images for color, depth and segmentation, and .mat files for the ground-truth values). When I tried to train your pose-estimation network on it, running the .py file with modifications for the number of objects and the location of the files to load, most of the time it failed between the 5th and the 15th batch:


2019-06-13 15:08:29,614 : Train time 00h 00m 00s Epoch 1 Batch 1 Frame 8 Avg_dis:11.477169811725616
2019-06-13 15:08:29,961 : Train time 00h 00m 01s Epoch 1 Batch 2 Frame 16 Avg_dis:10.867223545908928
2019-06-13 15:08:30,334 : Train time 00h 00m 01s Epoch 1 Batch 3 Frame 24 Avg_dis:14.736500158905983
2019-06-13 15:08:30,701 : Train time 00h 00m 01s Epoch 1 Batch 4 Frame 32 Avg_dis:7.393726512789726
2019-06-13 15:08:31,080 : Train time 00h 00m 02s Epoch 1 Batch 5 Frame 40 Avg_dis:9.500172346830368
2019-06-13 15:08:31,442 : Train time 00h 00m 02s Epoch 1 Batch 6 Frame 48 Avg_dis:11.712907552719116
2019-06-13 15:08:31,813 : Train time 00h 00m 02s Epoch 1 Batch 7 Frame 56 Avg_dis:34.04928183555603
2019-06-13 15:08:32,182 : Train time 00h 00m 03s Epoch 1 Batch 8 Frame 64 Avg_dis:62.857853412628174
2019-06-13 15:08:32,557 : Train time 00h 00m 03s Epoch 1 Batch 9 Frame 72 Avg_dis:12.033220887184143
2019-06-13 15:08:32,917 : Train time 00h 00m 04s Epoch 1 Batch 10 Frame 80 Avg_dis:nan
2019-06-13 15:08:33,348 : Train time 00h 00m 04s Epoch 1 Batch 11 Frame 88 Avg_dis:nan
2019-06-13 15:08:33,703 : Train time 00h 00m 04s Epoch 1 Batch 12 Frame 96 Avg_dis:nan
2019-06-13 15:08:34,071 : Train time 00h 00m 05s Epoch 1 Batch 13 Frame 104 Avg_dis:nan
2019-06-13 15:08:34,489 : Train time 00h 00m 05s Epoch 1 Batch 14 Frame 112 Avg_dis:nan
2019-06-13 15:08:34,865 : Train time 00h 00m 05s Epoch 1 Batch 15 Frame 120 Avg_dis:nan

NaN values continue like that until the end.

I have no problem training on YCB the way you did, so I think the trouble appears because I am using another dataset.

EDIT:
It looks like the problem could come from the pred_c values.
At the beginning they are too close to 0. During the loss calculation there is this line:
loss = torch.mean((dis * pred_c - w * torch.log(pred_c)), dim=0)
and if at least one pred_c value is treated as equal to 0, then torch.log returns -inf, torch.mean returns inf, and during the optimizer step a NaN value appears.

However, I don't know how to handle this issue; do you have any advice?

Thank you for your help
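
One possible workaround (my own suggestion, not from the authors) is to clamp the predicted confidence before the log so that values at or near 0 cannot produce -inf. A self-contained sketch with dummy tensors standing in for dis, pred_c and w:

import torch

dis = torch.rand(8)              # stands in for the per-point distances
pred_c = torch.zeros(8)          # worst case: all confidences collapse to 0
w = 0.015                        # stands in for opt.w

pred_c = pred_c.clamp(min=1e-6)  # hypothetical lower bound
loss = torch.mean(dis * pred_c - w * torch.log(pred_c), dim=0)
print(loss)                      # finite instead of inf/NaN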

what is "start_epoch" used for?

Hey, I want to ask what the option "start_epoch" is used for. My understanding is that if the model was trained until, for example, epoch 42 and the process was terminated, training can resume from epoch 42 next time by passing "--start_epoch=42". Is that correct?

I have observed that the "Avg_dis" is much higher than it was before the process was terminated, e.g. the model was trained until epoch 42 with an Avg_dis around 0.00xx, but after restarting with "--start_epoch=42" it is around 0.0x. Is that normal?

Thanks in advance!

make with conda (Python 2) doesn't work

Hi @j96w, I'm working with conda and Python 2.7, so I built knn_pytorch with Python 2 inside the conda environment, and I always get this problem with modules:

set -e
export PYTHONUNBUFFERED=True
PYTHONUNBUFFERED=True
export CUDA_VISIBLE_DEVICES=0
CUDA_VISIBLE_DEVICES=0
python ./tools/train.py --dataset linemod --dataset_root ./datasets/linemod/Linemod_preprocessed
Traceback (most recent call last):
File "./tools/train.py", line 12, in
import numpy as np
ImportError: No module named numpy

This is the make output with Python 2:
..$ make
python2 build_ffi.py
generating /tmp/tmpmsC9ic/_knn_pytorch.c
setting the current directory to '/tmp/tmpmsC9ic'
running build_ext
building '_knn_pytorch' extension
creating home
creating home/hamza
creating home/hamza/Téléchargements
creating home/hamza/Téléchargements/DenseFusion-master
creating home/hamza/Téléchargements/DenseFusion-master/lib
creating home/hamza/Téléchargements/DenseFusion-master/lib/knn
creating home/hamza/Téléchargements/DenseFusion-master/lib/knn/src
gcc -pthread -B /home/hamza/anaconda2/envs/py27/compiler_compat -Wl,--sysroot=/ -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/home/hamza/anaconda2/envs/py27/lib/python2.7/site-packages/torch/utils/ffi/../../lib/include -I/home/hamza/anaconda2/envs/py27/lib/python2.7/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/hamza/anaconda2/envs/py27/lib/python2.7/site-packages/torch/utils/ffi/../../lib/include/THC -I/usr/local/cuda/include -I/home/hamza/Téléchargements/DenseFusion-master/lib/knn/include -I/home/hamza/anaconda2/envs/py27/include/python2.7 -c _knn_pytorch.c -o ./_knn_pytorch.o -std=c99
gcc -pthread -B /home/hamza/anaconda2/envs/py27/compiler_compat -Wl,--sysroot=/ -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/home/hamza/anaconda2/envs/py27/lib/python2.7/site-packages/torch/utils/ffi/../../lib/include -I/home/hamza/anaconda2/envs/py27/lib/python2.7/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/hamza/anaconda2/envs/py27/lib/python2.7/site-packages/torch/utils/ffi/../../lib/include/THC -I/usr/local/cuda/include -I/home/hamza/Téléchargements/DenseFusion-master/lib/knn/include -I/home/hamza/anaconda2/envs/py27/include/python2.7 -c /home/hamza/Téléchargements/DenseFusion-master/lib/knn/src/knn_pytorch.c -o ./home/hamza/Téléchargements/DenseFusion-master/lib/knn/src/knn_pytorch.o -std=c99
gcc -pthread -shared -B /home/hamza/anaconda2/envs/py27/compiler_compat -L/home/hamza/anaconda2/envs/py27/lib -Wl,-rpath=/home/hamza/anaconda2/envs/py27/lib -Wl,--no-as-needed -Wl,--sysroot=/ ./_knn_pytorch.o ./home/hamza/Téléchargements/DenseFusion-master/lib/knn/src/knn_pytorch.o /home/hamza/Téléchargements/DenseFusion-master/lib/knn/build/knn_cuda_kernel.so /usr/local/cuda/lib64/libnppitc_static.a /usr/local/cuda/lib64/libcublas_device.a /usr/local/cuda/lib64/libnppc_static.a /usr/local/cuda/lib64/libcusparse_static.a /usr/local/cuda/lib64/libcublas_static.a /usr/local/cuda/lib64/libcufftw_static.a /usr/local/cuda/lib64/libnppicom_static.a /usr/local/cuda/lib64/libnppicc_static.a /usr/local/cuda/lib64/libcufft_static.a /usr/local/cuda/lib64/libcudnn_static.a /usr/local/cuda/lib64/libnppist_static.a /usr/local/cuda/lib64/libcurand_static.a /usr/local/cuda/lib64/libnpps_static.a /usr/local/cuda/lib64/libnppig_static.a /usr/local/cuda/lib64/libcudadevrt.a /usr/local/cuda/lib64/libnppidei_static.a /usr/local/cuda/lib64/libnppisu_static.a /usr/local/cuda/lib64/libnvgraph_static.a /usr/local/cuda/lib64/libcusolver_static.a /usr/local/cuda/lib64/libnppif_static.a /usr/local/cuda/lib64/libnppim_static.a /usr/local/cuda/lib64/libculibos.a /usr/local/cuda/lib64/libcudart_static.a /usr/local/cuda/lib64/libnppial_static.a -L/home/hamza/anaconda2/envs/py27/lib -lpython2.7 -o ./_knn_pytorch.so

I'm working with the LineMOD data.

I also tried it outside the conda environment, and I always get the same problem. I don't know if I'm missing something!
Thank you in advance.

Data visualisation

Hi @j96w, thanks for your work. I've been stuck on the visualization part for days. Could you explain to me (if possible, with enough detail) how to show the detected LineMOD objects on video files (or image data) in real time?

I trained and evaluated, and the results were as expected, but I want the real-time visualization part: how to overlay the object mesh on videos, and also how to work with a camera in real time.

Thank you in advance.
Hamza

Testing on my own data is not correct

Thank you very much for sharing the code. I have run and tested the code on YCB data correctly, and I can also draw a correct 3D box and point cloud on the 2D image plane. But when I use my own data to test the model trained on YCB, the segmentation result is correct but the pose estimation is not. The depth image is stored in png format, uint16, shown below. Could you give me some advice?
[attached images: Predicted_3DBox_0001, Predicted_Pose_0001, Normalized_depth_0001, Predicted_Label_0001]
