j96w / densefusion
"DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion" code repository
Home Page: https://sites.google.com/view/densefusion
License: MIT License
"DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion" code repository
Home Page: https://sites.google.com/view/densefusion
License: MIT License
The documentation for torchvision.models says:
The images have to be loaded in to a range of [0, 1] and then normalized using mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].
But here the cropped image is normalized directly, without first being scaled into [0, 1].
Is this a bug, even though it works?
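For reference, a minimal sketch of the convention the torchvision docs describe (standard torchvision API, not this repo's loader): ToTensor() does the [0, 1] scaling that Normalize() expects.

import torchvision.transforms as transforms

preprocess = transforms.Compose([
    transforms.ToTensor(),  # maps uint8 [0, 255] pixels to float [0, 1]
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])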
Hello, thanks for sharing the code
I am new to PyTorch. As far as I understand, the provided code (the DataParallel call) should run PoseNet training on multiple GPUs; however, in my case it uses only one of four GPUs. I have a GPU cluster of 4x 1080 Ti, and torch.cuda.device_count() shows 4. Could you please tell me whether you used multiple GPUs for training, and whether there is something I am missing here?
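One thing worth checking, offered as a hedged note: the training scripts quoted later on this page export CUDA_VISIBLE_DEVICES=0, which pins the process to a single GPU no matter what DataParallel does. A minimal multi-GPU sketch (not the repo's exact training script):

import torch
import torch.nn as nn

model = nn.Linear(10, 10)  # stand-in for PoseNet
if torch.cuda.device_count() > 1:
    # replicates across all GPUs visible to the process;
    # CUDA_VISIBLE_DEVICES controls which GPUs are visible
    model = nn.DataParallel(model, device_ids=list(range(torch.cuda.device_count())))
model = model.cuda()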
Hi, Thanks for sharing the code!
I'm also trying to use your code in a ROS environment for robot manipulation with objects from the YCB dataset. However, inference in DenseFusion requires segmentation to generate the pose, and it is very time-consuming to train a segmentation model on all the training images in the YCB-Video dataset: I tried the vanilla segmentation code and found that even one epoch takes around 10 hours on a single GPU, and we don't have many GPU resources. It would be great if you could share the trained segmentation model for the YCB-Video dataset!
Thanks a lot!
On the LineMOD dataset, we evaluated the provided model (trained_models/linemod/pose_model_9_0.01310166542980859.pth) without refinement, and the success rate is 0.83169, which does not match what is claimed in the paper (per-pixel: 86.2). Is the provided PoseNet model the one used to evaluate the per-pixel performance? If not, could you please provide the model used for the per-pixel evaluation?
Thank you!
Hi, Thanks for sharing the code!!!
I ran the download.sh file successfully, so I ran:
sh ./experiments/scripts/train_linemod.sh
After the object buffer loaded, I get this error:
>>>>>>>>----------Dataset loaded!---------<<<<<<<<
length of the training set: 2373
length of the testing set: 1336
number of sample points on mesh: 500
symmetry object list: [7, 8]
2019-04-08 03:59:01,660 : Train time 00h 00m 00s, Training started
/home/user/miniconda/envs/py36/lib/python3.6/site-packages/torch/nn/functional.py:1749: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
"See the documentation of nn.Upsample for details.".format(mode))
/home/user/miniconda/envs/py36/lib/python3.6/site-packages/torch/nn/modules/container.py:91: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
input = module(input)
Segmentation fault (core dumped)
How can I solve this error and run the training code?
When trying to run the training script ./train_ycb.sh on Ubuntu with Python 2.7.12, it fails with a stack trace. I tried to address this by building _knn_pytorch.so with Python 2, but it still fails with the stack trace below:
python2 build_ffi.py
Traceback (most recent call last):
File "build_ffi.py", line 19, in
include_dirs=[osp.join(abs_path, 'include')]
File "/usr/local/lib/python2.7/dist-packages/torch/utils/ffi/init.py", line 176, in create_extension
ffi = cffi.FFI()
File "/usr/local/lib/python2.7/dist-packages/cffi/api.py", line 46, in init
import _cffi_backend as backend
ImportError: /usr/local/lib/python2.7/dist-packages/_cffi_backend.so: undefined symbol: PyUnicodeUCS2_FromUnicode
Makefile:31: recipe for target 'build/knn_pytorch/_knn_pytorch.so' failed
make: *** [build/knn_pytorch/_knn_pytorch.so] Error 1
Hi Jeremy,
I tried to run the evaluation on the YCB dataset with your MATLAB code.
It reports a missing folder called 'results_3DCoordinate', coming from the code below:
Thank you.
Regards,
Stacey
Hi @j96w, I just wanted to know how to get the number of epochs, in order to estimate our training time, because I'm waiting for it to finish. Based on README.txt I assumed epochs = 30, but that's not right.
2019-03-24 17:38:09,448 : Train time 16h 25m 47s Epoch 39 Batch 8046 Frame 32184 Avg_dis:0.0042653061100281775
2019-03-24 17:38:09,549 : Train time 16h 25m 47s Epoch 39 Batch 8047 Frame 32188 Avg_dis:0.0038781535113230348
2019-03-24 17:38:09,660 : Train time 16h 25m 47s Epoch 39 Batch 8048 Frame 32192 Avg_dis:0.003846941574010998
2019-03-24 17:38:09,777 : Train time 16h 25m 47s Epoch 39 Batch 8049 Frame 32196 Avg_dis:0.002740198280662298
2019-03-24 17:38:09,915 : Train time 16h 25m 47s Epoch 39 Batch 8050 Frame 32200 Avg_dis:0.003869467240292579
2019-03-24 17:38:10,033 : Train time 16h 25m 47s Epoch 39 Batch 8051 Frame 32204 Avg_dis:0.004200253228191286
How many epochs are fixed for LineMOD training? Thank you in advance.
I used Pytorch-1.0 branch and ran ./experiments/scripts/train_ycb.sh.
Then I got the error:
pred = torch.add(torch.bmm(model_points, base), points + pred_t)
RuntimeError: cublas runtime error : the GPU program failed to execute at /pytorch/aten/src/THC/THCBlas.cu:441
My system:
I notice there are two .yml files; where do they come from?
And are there any other differences?
Thank You!
Thank you for your help. Sorry to bother you again. I still have some questions.
1. The default maximum number of epochs to train is 500. When training 104 epochs on YCB, the dis is 0.0090381440936, but it takes a long time, and the YCB trained model you provide is pose_refine_model_69_0.009449292959118935.pth. So how is the number of training epochs for the pose refine model determined? Is a lower dis always a better model? How do you prevent overfitting?
2. When evaluating on the LineMOD dataset, the final content of eval_result_logs.txt is as follows:
No.13390 NOT Pass! Distance: 0.0217275395989
No.13391 Pass! Distance: 0.0048154219985
No.13392 Pass! Distance: 0.0142814125866
No.13393 Pass! Distance: 0.00356977432966
No.13394 Pass! Distance: 0.00472941761836
No.13395 Pass! Distance: 0.00784354563802
No.13396 Pass! Distance: 0.0127922594547
No.13397 Pass! Distance: 0.00581285078079
No.13398 NOT Pass! Distance: 0.0267377775162
No.13399 Pass! Distance: 0.00628646928817
No.13400 Pass! Distance: 0.0146940667182
No.13401 NOT Pass! Distance: 0.0340396799147
No.13402 NOT Pass! Distance: 0.0713591426611
No.13403 NOT Pass! Distance: 0.0522820688784
No.13404 Pass! Distance: 0.0038491380401
No.13405 NOT Pass! Distance: 0.0213586390018
No.13406 Pass! Distance: 0.00927203428
Object 1 success rate: 0
Object 2 success rate: 0
Object 4 success rate: 0
Object 5 success rate: 0
Object 6 success rate: 0
Object 8 success rate: 0
Object 9 success rate: 0
Object 10 success rate: 0
Object 11 success rate: 1
Object 12 success rate: 0
Object 13 success rate: 0
Object 14 success rate: 0
Object 15 success rate: 0
ALL success rate: 0
Is anything wrong with the evaluation on the LineMOD dataset? How can I get an evaluation result similar to the YCB_Video one produced with MATLAB?
Thank you in advance.
Originally posted by @sunshantong in #7 (comment)
Equation (2) in the paper finds, for each ground-truth model point, the closest predicted model point for symmetric shapes, which is consistent with equation (6) in the PoseCNN paper. But in the code implementation (lines 44-47 in lib/loss.py) you instead find, for each predicted model point, the closest ground-truth point, which is the opposite direction. Can you explain? Those two metrics are different. Thanks!
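For clarity, a small self-contained illustration of the two matching directions being compared (brute-force distances; this is not the repo's KNN code):

import torch

pred = torch.rand(500, 3)     # predicted model points
target = torch.rand(500, 3)   # ground-truth model points

d = torch.cdist(pred, target)  # (500, 500) pairwise distances

# Eq. (2) / PoseCNN Eq. (6): for each ground-truth point, the closest prediction
loss_gt_to_pred = d.min(dim=0).values.mean()

# lib/loss.py as described above: for each prediction, the closest ground-truth point
loss_pred_to_gt = d.min(dim=1).values.mean()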
In eval_linemod.py, the code still uses
rmin, rmax, cmin, cmax = get_bbox(meta['obj_bb'])
to get rmin, rmax, cmin, cmax, which is the important step for the image crop. In my opinion, gt.yaml is the ground truth for the objects, and obj_bb is the 2D bounding box, so the evaluation crop comes from ground-truth information. I don't know whether the code is right; maybe I am wrong.
Thank you.
Hi,
Thank you for sharing the codes and I have several questions on the loss calculation :
Q1.
Line 58 in 42a21e7
Q2.
Line 67 in 42a21e7
Following is some of my understanding:
In the code, the point cloud and the target are updated in the same way, which is hard for me to understand. Could you please help to explain?
Thank you and looking forward to your reply.
Best,
Stacey
When I try to run
python3 train.py --dataset_root=./datasets/ycb/YCB_Video_Dataset
I get the following issue :
Traceback (most recent call last):
File "train.py", line 69, in <module>
for i, data in enumerate(dataloader, 0):
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 336, in __next__
return self._process_next_batch(batch)
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 357, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
FileNotFoundError: Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 106, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 106, in <listcomp>
samples = collate_fn([dataset[i] for i in batch_indices])
File "/media/intern/disk2/DenseFusion/vanilla_segmentation/data_controller.py", line 47, in __getitem__
label = np.array(Image.open('{0}/{1}-label.png'.format(self.root, self.path[index])))
File "/usr/local/lib/python3.5/dist-packages/PIL/Image.py", line 2652, in open
fp = builtins.open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: './datasets/ycb/YCB_Video_Dataset/data_syn/002715-label.png'
I tried several times and got the same issue with different #-label.png files, and the files are in the folder, so I don't really understand the error. It is always the 22nd call to __getitem__ that crashes. Any idea?
Thanks
Thanks for sharing the code, amazing work!
I am reading your code, and I found that in loss.py, line 38, when you calculate the loss you also add points; see the copied code below:
pred = torch.add(torch.bmm(model_points, base), points + pred_t)
Hi dear authors,
I was thinking of using the method for rather small objects (chips, rings, metal pieces, etc. found in devices such as this) and wondered whether the method could actually generalize, or do we really need a precise 3D model of whatever we are looking for? It is impossible to have models for each and every object in this domain, so at some point the method should generalize. Would it actually follow this line of thought?
Regards,
Hi, the paper says a PointNet-based network processes each point in the masked 3D point cloud into a geometric feature embedding. But in the code implementation (network.py) there are only convolutions and ReLUs over the point cloud x: no spatial transformer networks (STN) and no max-pooling over the point cloud, which are the two most important features of PointNet.
class PoseNetFeat(nn.Module):
    def __init__(self, num_points):
        super(PoseNetFeat, self).__init__()
        self.conv1 = torch.nn.Conv1d(3, 64, 1)
        self.conv2 = torch.nn.Conv1d(64, 128, 1)
        self.e_conv1 = torch.nn.Conv1d(32, 64, 1)
        self.e_conv2 = torch.nn.Conv1d(64, 128, 1)
        self.conv5 = torch.nn.Conv1d(256, 512, 1)
        self.conv6 = torch.nn.Conv1d(512, 1024, 1)
        self.ap1 = torch.nn.AvgPool1d(num_points)
        self.num_points = num_points

    def forward(self, x, emb):
        x = F.relu(self.conv1(x))
        emb = F.relu(self.e_conv1(emb))
        pointfeat_1 = torch.cat((x, emb), dim=1)
        x = F.relu(self.conv2(x))
        emb = F.relu(self.e_conv2(emb))
        pointfeat_2 = torch.cat((x, emb), dim=1)
        x = F.relu(self.conv5(pointfeat_2))
        x = F.relu(self.conv6(x))
        ap_x = self.ap1(x)
        ap_x = ap_x.view(-1, 1024, 1).repeat(1, 1, self.num_points)
        return torch.cat([pointfeat_1, pointfeat_2, ap_x], 1)  # 128 + 256 + 1024
I am confused. Would you explain this to me?
Thanks in advance.
I have a related question about this part. What is the meaning of
pointfeat_1 = torch.cat((x, emb), dim=1)
and
return torch.cat([pointfeat_1, pointfeat_2, ap_x], 1)  # 128 + 256 + 1024
Would you explain this part?
Thanks.
Originally posted by @trevor-taeyeop in #34 (comment)
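A small illustration of those two lines (shapes taken from the quoted layers; this is just a sketch, not an authoritative answer): dim=1 is the channel axis of Conv1d activations, so the geometric features x and the color-embedding features emb are stacked per point rather than mixed across points.

import torch

x = torch.rand(1, 64, 500)    # geometric features: 64 channels, 500 points
emb = torch.rand(1, 64, 500)  # color-embedding features, same points

pointfeat_1 = torch.cat((x, emb), dim=1)  # (1, 128, 500): per-point fusion
# the final torch.cat([pointfeat_1, pointfeat_2, ap_x], 1) stacks the 128- and
# 256-channel per-point features with the 1024-channel global feature
# (repeated for every point), giving 1408 channels per point.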
Hi @j96w ! Thank you for your work.
Lines 131~154 in train.py:
for i, data in enumerate(dataloader, 0):
#…………
train_count += 1
if train_count % opt.batch_size == 0:
logger.info('Train time {0} Epoch {1} Batch {2} Frame {3} Avg_dis:{4}'.format(time.strftime("%Hh %Mm %Ss", time.gmtime(time.time() - st_time)), epoch, int(train_count / opt.batch_size), train_count, train_dis_avg / opt.batch_size))
I think one iteration is one batch, so the batch number should be equal to train_count.
In logger.info, why does the batch number equal int(train_count / opt.batch_size) while the frame number equals train_count?
Hello,
There are many 3D bboxes in the figures of this paper. Could you tell me how to draw the 3D bbox from the output of this network (R and T)?
Thanks for the help!
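A hedged sketch of one common way to do this (not code from this repo; the camera intrinsics K and the corner ordering are assumptions): transform the 8 corners of the model's bounding box by (R, T), project them with the intrinsics, and connect the 12 edges.

import numpy as np
import cv2

def draw_bbox(img, corners, R, T, K):
    """corners: (8, 3) box corners ordered by (x, y, z) min/max bits;
    R: (3, 3) rotation; T: (3,) translation; K: (3, 3) camera intrinsics."""
    cam = corners @ R.T + T                   # model frame -> camera frame
    uv = cam @ K.T
    uv = (uv[:, :2] / uv[:, 2:]).astype(int)  # perspective projection
    edges = [(0, 1), (1, 3), (3, 2), (2, 0),  # one face
             (4, 5), (5, 7), (7, 6), (6, 4),  # opposite face
             (0, 4), (1, 5), (2, 6), (3, 7)]  # connecting edges
    for i, j in edges:
        cv2.line(img, tuple(uv[i]), tuple(uv[j]), (0, 255, 0), 2)
    return img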
@j96w
Thank you for sharing your codes.
I have some confusion about the dis calculation in the pose estimation stage.
pred = torch.add(torch.bmm(model_points, base), points + pred_t)
why not:
pred = torch.add(torch.bmm(model_points, base), pred_t)
Thank you and looking forward to your reply.
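One reading of that line, offered as a hedged sketch rather than the authors' answer: pred_t is predicted per dense point as an offset from that point's 3D location, so the absolute translation of each pose hypothesis is points + pred_t (shapes simplified to a single hypothesis below):

import torch

model_points = torch.rand(1, 500, 3)  # sampled mesh points
base = torch.eye(3).unsqueeze(0)      # rotation built from pred_r (identity here)
point = torch.rand(1, 1, 3)           # one observed point from the depth cloud
offset = torch.rand(1, 1, 3)          # the network's pred_t for that point

# absolute translation of this hypothesis = observed point + predicted offset
pred = torch.bmm(model_points, base) + (point + offset)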
What should I do if I want to use ResNet-101 instead of ResNet-18? Thanks for your answer.
Dear sir, I am new to 6D pose estimation and am trying to learn from your code. I need to build my own dataset for a specific object (my own thing or something else). How do I make my own dataset?
Would it be possible for you to include the ROS package used in the demo video?
Was the inference time affected when the code was deployed to ROS?
Thank you.
Hi, when I use the YCB_Video_toolbox to run the MATLAB code "YCB_Video_toolbox/evaluate_poses_keyframe.m", I meet this problem. Who can help me? Thanks a lot in advance!
Error using load
Unable to read file 'Densefusion_iterative_result/0026.mat'. No such file or directory.
Error in evaluate_poses_keyframe (line 50)
result_my = load(filename);
I followed issue #33, extracted the egg file, and moved *.so and knn_python.py to the root knn directory.
After that, when I run ./experiments/scripts/train_ycb.sh, it shows the error below:
My system is:
Ubuntu 16.04
GPU: RTX 2080 Ti
CUDA 10.0
Python 3.6.8
PyTorch 1.0.1
Any ideas would be much appreciated.
Thanks.
Hello, thanks for sharing your code!!
I tried to visualize the output on the YCB test set, but the result doesn't align with Fig. 4 in your paper.
Here is one of my visualization results.
The top-left image is generated using the ground-truth R and T from xxx-meta.mat, and it is correct. The bottom-left and bottom-right images are generated by simply replacing R and T with those in the .mat files in result_wo_refine_dir and result_refine_dir; the pose estimation in these two images seems to go wrong.
The visualization process transforms the points in points.xyz with R and T, then scales and translates those points to fit the object's tight bounding box.
I used the trained checkpoints you provided.
May I ask for your suggestions on what the problem might be?
Thanks for your time.
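A hedged sanity-check sketch for this kind of visualization (the intrinsics are the commonly quoted YCB-Video keyframe values, and the (R, T) layout of the result .mat is an assumption): project the transformed points.xyz directly with the camera matrix instead of fitting the points to a 2D box.

import numpy as np

K = np.array([[1066.778, 0.0, 312.9869],
              [0.0, 1067.487, 241.3109],
              [0.0, 0.0, 1.0]])

def project_model(model_xyz, R, T):
    """model_xyz: (N, 3) from points.xyz; returns (N, 2) pixel coordinates."""
    cam = model_xyz @ R.T + T     # model frame -> camera frame (meters)
    uv = cam @ K.T
    return uv[:, :2] / uv[:, 2:]  # perspective divide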
Hello.
Thank you for sharing the good code.
I want to test the model trained with my own data.
But the evaluation code seems to require information from the meta file ('cam_t_m2c', 'cam_R_m2c', ...).
I think it is difficult to test the model in real time with this code.
Is there code that only produces poses in real time? If not, how can I modify it to try that?
I'll be waiting for the reply.
Thank you.
Hi, thanks for sharing your code!
I downloaded the latest version of DenseFusion-Pytorch-1.0. When I ran ./experiments/scripts/train_ycb.sh, I got the error 'AttributeError: module 'lib.knn.knn_pytorch' has no attribute 'knn''. My Python version is 3.6.8 (Anaconda custom, 64-bit) and the PyTorch version is 1.0.1.post2.
I tried to insert pdb.set_trace() at line 20 of ./lib/knn/__init__.py:
inds = torch.empty(query.shape[0], self.k, query.shape[2]).long().cuda()
#import pdb
#pdb.set_trace()
knn_pytorch.knn(ref, query, inds)
return inds
I print dir(knn_pytorch) which shows the following messages:
(Pdb) p dir(knn_pytorch)
['__doc__', '__loader__', '__name__', '__package__', '__path__', '__spec__']
It seems that the module knn_pytorch doesn't have a knn function. How can I solve this error? Please help me.
The details are as follows:
+ set -e
+ export PYTHONUNBUFFERED=True
+ PYTHONUNBUFFERED=True
+ export CUDA_VISIBLE_DEVICES=0
+ CUDA_VISIBLE_DEVICES=0
+ python3 ./tools/train.py --dataset ycb --dataset_root ./datasets/ycb/YCB_Video_Dataset
/home/qingqing/Downloads/qingqing_disk/p4600_disk/DenseFusion/lib/transformations.py:1912: UserWarning: failed to import module _transformations
warnings.warn('failed to import module %s' % name)
96189
2949
>>>>>>>>----------Dataset loaded!---------<<<<<<<<
length of the training set: 96189
length of the testing set: 2949
number of sample points on mesh: 500
symmetry object list: [12, 15, 18, 19, 20]
/home/qingqing/anaconda3/lib/python3.6/site-packages/torch/nn/_reduction.py:49: UserWarning: size_average and reduce args will be deprecated, please use reduction='mean' instead.
warnings.warn(warning.format(ret))
2019-04-23 10:20:35,077 : Train time 00h 00m 00s, Training started
/home/qingqing/anaconda3/lib/python3.6/site-packages/torch/nn/functional.py:2351: UserWarning: nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.
warnings.warn("nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.")
/home/qingqing/anaconda3/lib/python3.6/site-packages/torch/nn/functional.py:2423: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
"See the documentation of nn.Upsample for details.".format(mode))
/home/qingqing/anaconda3/lib/python3.6/site-packages/torch/nn/modules/upsampling.py:129: UserWarning: nn.Upsample is deprecated. Use nn.functional.interpolate instead.
warnings.warn("nn.{} is deprecated. Use nn.functional.interpolate instead.".format(self.name))
/home/qingqing/anaconda3/lib/python3.6/site-packages/torch/nn/modules/container.py:92: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
input = module(input)
2019-04-23 10:20:36,413 : Train time 00h 00m 01s Epoch 1 Batch 1 Frame 8 Avg_dis:0.1779076661914587
Traceback (most recent call last):
File "./tools/train.py", line 237, in <module>
main()
File "./tools/train.py", line 140, in main
loss, dis, new_points, new_target = criterion(pred_r, pred_t, pred_c, target, model_points, idx, points, opt.w, opt.refine_start)
File "/home/qingqing/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home/qingqing/Downloads/qingqing_disk/p4600_disk/DenseFusion/lib/loss.py", line 83, in forward
return loss_calculation(pred_r, pred_t, pred_c, target, model_points, idx, points, w, refine, self.num_pt_mesh, self.sym_list)
File "/home/qingqing/Downloads/qingqing_disk/p4600_disk/DenseFusion/lib/loss.py", line 44, in loss_calculation
inds = knn(target.unsqueeze(0), pred.unsqueeze(0))
File "/home/qingqing/Downloads/qingqing_disk/p4600_disk/DenseFusion/lib/knn/__init__.py", line 23, in forward
knn_pytorch.knn(ref, query, inds)
AttributeError: module 'lib.knn.knn_pytorch' has no attribute 'knn'
Please provide details on how to use output from label fusion.
The code is in eval_ycb.py and datasets/ycb/dataset.py.
I think it takes the ROI from the semantic segmentation and grows the width (height) to the smallest value in border_list that is not less than the old one (like a ceil function), so the output ROI can only take a fixed set of sizes.
Why not use the ROIs from segmentation directly?
Thank you and looking forward to your reply.
border_list = [-1, 40, 80, 120, 160, 200, 240, 280, 320, 360, 400, 440, 480, 520, 560, 600, 640, 680]
def get_bbox(posecnn_rois):
    # idx, img_width (480) and img_length (640) are defined at module level
    # in the dataset file
    rmin = int(posecnn_rois[idx][3]) + 1
    rmax = int(posecnn_rois[idx][5]) - 1
    cmin = int(posecnn_rois[idx][2]) + 1
    cmax = int(posecnn_rois[idx][4]) - 1
    # grow each side to the next size in border_list (ceil-like snapping)
    r_b = rmax - rmin
    for tt in range(len(border_list)):
        if r_b > border_list[tt] and r_b < border_list[tt + 1]:
            r_b = border_list[tt + 1]
            break
    c_b = cmax - cmin
    for tt in range(len(border_list)):
        if c_b > border_list[tt] and c_b < border_list[tt + 1]:
            c_b = border_list[tt + 1]
            break
    # re-center the snapped box, then shift it back inside the image
    center = [int((rmin + rmax) / 2), int((cmin + cmax) / 2)]
    rmin = center[0] - int(r_b / 2)
    rmax = center[0] + int(r_b / 2)
    cmin = center[1] - int(c_b / 2)
    cmax = center[1] + int(c_b / 2)
    if rmin < 0:
        delt = -rmin
        rmin = 0
        rmax += delt
    if cmin < 0:
        delt = -cmin
        cmin = 0
        cmax += delt
    if rmax > img_width:
        delt = rmax - img_width
        rmax = img_width
        rmin -= delt
    if cmax > img_length:
        delt = cmax - img_length
        cmax = img_length
        cmin -= delt
    return rmin, rmax, cmin, cmax
I'm trying to use the output of the vanilla SegNet network to label YCB-Video images, but I haven't found an efficient way to transform the 22-channel 480x640 output into a single 480x640 label image.
For the moment I'm using something like this:
seg_data = seg(rgb)  # SegNet output
seg_data = seg_data.detach().cpu().numpy()[0]
seg_image = np.zeros((480, 640))
obj_list = []
for i in range(480):
    for j in range(640):
        prob_max = 0
        label = 0
        for r in range(22):
            if seg_data[r][i][j] > prob_max:
                label = r
                prob_max = seg_data[r][i][j]
        seg_image[i][j] = label
        if label not in obj_list:
            obj_list.append(label)
How do you use the output for fast segmentation of an rgb image ?
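A vectorized alternative with the same semantics as the loops above (a sketch; seg and rgb as in the snippet): argmax over the 22 class channels gives the label image directly.

import numpy as np

seg_data = seg(rgb).detach().cpu().numpy()[0]  # (22, 480, 640) class scores
seg_image = np.argmax(seg_data, axis=0)        # (480, 640) label image
obj_list = np.unique(seg_image).tolist()       # labels present in the frame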
Disclaimer: I haven't read the paper
That being said, this looks very suspicious.
Line 38 in 42a21e7
It should be only
pred = torch.add(torch.bmm(model_points, base), pred_t)
like you have in loss_refiner.py. I don't see a reason to add the points you acquired from the camera here, especially because you compute the point-to-point distance error a couple of lines below:
Line 49 in 42a21e7
@j96w Thanks for your work!
Here is my question. The condition best_test < opt.refine_margin (0.013) is reached after about 10 epochs, which means PoseNet is trained for only about 10 epochs. However, I found that best_test is about 0.006 after training the RefineNet for more than 400 epochs. So, is refine_margin a little too big? Can I set it smaller (e.g., 0.008) to train PoseNet for more epochs? Or have you found that a refine_margin smaller than 0.013 leads to overfitting of PoseNet?
Hi, the paper says a PointNet-based network processes each point in the masked 3D point cloud into a geometric feature embedding. But in the code implementation (network.py) there are only convolutions and ReLUs over the point cloud x: no spatial transformer networks (STN) and no max-pooling over the point cloud, which are the two most important features of PointNet.
class PoseNetFeat(nn.Module):
    def __init__(self, num_points):
        super(PoseNetFeat, self).__init__()
        self.conv1 = torch.nn.Conv1d(3, 64, 1)
        self.conv2 = torch.nn.Conv1d(64, 128, 1)
        self.e_conv1 = torch.nn.Conv1d(32, 64, 1)
        self.e_conv2 = torch.nn.Conv1d(64, 128, 1)
        self.conv5 = torch.nn.Conv1d(256, 512, 1)
        self.conv6 = torch.nn.Conv1d(512, 1024, 1)
        self.ap1 = torch.nn.AvgPool1d(num_points)
        self.num_points = num_points

    def forward(self, x, emb):
        x = F.relu(self.conv1(x))
        emb = F.relu(self.e_conv1(emb))
        pointfeat_1 = torch.cat((x, emb), dim=1)
        x = F.relu(self.conv2(x))
        emb = F.relu(self.e_conv2(emb))
        pointfeat_2 = torch.cat((x, emb), dim=1)
        x = F.relu(self.conv5(pointfeat_2))
        x = F.relu(self.conv6(x))
        ap_x = self.ap1(x)
        ap_x = ap_x.view(-1, 1024, 1).repeat(1, 1, self.num_points)
        return torch.cat([pointfeat_1, pointfeat_2, ap_x], 1)  # 128 + 256 + 1024
I am confused. Would you explain this to me?
Thanks in advance.
I followed @Mars-y470's suggestion and tried to recompile DenseFusion/lib/knn. The problem from #33 ('module 'lib.knn.knn_pytorch' has no attribute 'knn'') was solved.
However, during YCB training (I ran ./experiments/scripts/train_ycb.sh), I got the error 'RuntimeError: the derivative for 'index' is not implemented'. The details are as follows:
2019-05-11 22:29:54,240 : Test time 08h 36m 53s Test Frame No.2948
dis:0.005122609902173281
2019-05-11 22:29:54,300 : Test time 08h 36m 53s Epoch 33 TEST FINISH Avg dis:
0.01275980792574587
33 >>>>>>>>----------BEST TEST MODEL SAVED---------<<<<<<<<
96189
2949
>>>>>>>>----------Dataset loaded!---------<<<<<<<<
length of the training set: 96189
length of the testing set: 2949
number of sample points on mesh: 2600
symmetry object list: [12, 15, 18, 19, 20]
2019-05-11 22:29:54,795 : Train time 08h 36m 53s, Training started
Traceback (most recent call last):
File "./tools/train.py", line 237, in <module>
main()
File "./tools/train.py", line 145, in main
dis, new_points, new_target = criterion_refine(pred_r, pred_t, new_target, model_points, idx,
new_points)
File "/home/qingqing/anaconda3/lib/python3.6/site-
packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home/qingqing/Downloads/qingqing_disk/p4600_disk/DenseFusion/lib/loss_refiner.py",
line 76, in forward
return loss_calculation(pred_r, pred_t, target, model_points, idx, points, self.num_pt_mesh,
self.sym_list)
File "/home/qingqing/Downloads/qingqing_disk/p4600_disk/DenseFusion/lib/loss_refiner.py",
line 45, in loss_calculation
target = torch.index_select(target, 1, inds.view(-1) - 1)
RuntimeError: the derivative for 'index' is not implemented
It seems that the refinement stage of the network failed.
Could you give me some suggestions? @j96w
Did you meet the same problem? @Mars-y470
Thanks!!
When I try to train on LineMOD, I get this:
./experiments/scripts/train_linemod.sh: line 10: 10128 Segmentation fault python3 ./tools/train.py --dataset linemod --dataset_root ./datasets/linemod/Linemod_preprocessed
I have verified that it happens at this line
Could you point me to any solutions? Thanks!
Can you share your visualization code? If you can, thank you very much.
I'm using PyTorch 1.0.1 instead of 0.4.1. When I ran ./experiments/scripts/train_linemod.sh, "ImportError: torch.utils.ffi is deprecated. Please use cpp extensions instead" appeared. Are there any solutions other than changing the PyTorch version?
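The Pytorch-1.0 branch mentioned elsewhere on this page already ports the repo to PyTorch 1.x. If you need to rebuild the KNN op yourself, here is a minimal setup.py sketch using the cpp-extension API (the source file names are assumptions, not the repo's actual layout):

from setuptools import setup
from torch.utils.cpp_extension import CUDAExtension, BuildExtension

setup(
    name='knn_pytorch',
    ext_modules=[
        # hypothetical source names; point these at the repo's knn sources
        CUDAExtension('knn_pytorch', ['src/knn.cpp', 'src/knn_cuda_kernel.cu']),
    ],
    cmdclass={'build_ext': BuildExtension},
)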
In readme, it says:
For the YCB_Video dataset, since the synthetic data do not contain backgrounds, we randomly select real training images as the background. In each frame, we also randomly select two instance segmentation clips from another synthetic training image to paste in front of the input RGB-D image, so that more occlusion situations can be generated.
But I did not find such an implementation in the data loader. Is it somewhere else?
BTW, this work is awesome!
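A hedged sketch of the augmentation the README describes (not the repo's exact loader code; the function and file naming are illustrative): the synthetic frame's background pixels, identified by its label image, are replaced with a randomly chosen real training frame.

import numpy as np
from PIL import Image

def composite_synthetic(syn_rgb, syn_label, real_rgb):
    """All three arguments are file paths; syn_label has 0 for background."""
    syn = np.array(Image.open(syn_rgb).convert('RGB'))
    label = np.array(Image.open(syn_label))
    back = np.array(Image.open(real_rgb).convert('RGB'))
    out = syn.copy()
    out[label == 0] = back[label == 0]  # fill background with the real image
    return out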
There is only code for training the vanilla SegNet; would it be possible to share the test code and a trained segmentation model?
Hi
when trying to run the training script experiments/scripts/train_ycb.sh on ubuntu with python2.7.12,
it fails with the stack trace:
set -e
export PYTHONUNBUFFERED=True
export CUDA_VISIBLE_DEVICES=0
python2 ./tools/train.py --dataset ycb --dataset_root ./datasets/ycb/YCB_Video_Dataset
/root/catkin_ws/densefusion/DenseFusion/lib/transformations.py:1912: UserWarning: failed to import module _transformations
warnings.warn('failed to import module %s' % name)
Traceback (most recent call last):
File "./tools/train.py", line 26, in
from lib.loss import Loss
File "/root/catkin_ws/densefusion/DenseFusion/lib/loss.py", line 9, in
from lib.knn.__init__ import KNearestNeighbor
File "/root/catkin_ws/densefusion/DenseFusion/lib/knn/__init__.py", line 7, in
from lib.knn import knn_pytorch as knn_pytorch
File "/root/catkin_ws/densefusion/DenseFusion/lib/knn/knn_pytorch/__init__.py", line 2, in
from torch.utils.ffi import _wrap_function
File "/usr/local/lib/python2.7/dist-packages/torch/utils/ffi/__init__.py", line 14, in
raise ImportError("torch.utils.ffi requires the cffi package")
ImportError: torch.utils.ffi requires the cffi package
I went into the DenseFusion/lib/knn folder and changed the Makefile to PYTHON := python2.
Then I ran $ make:
python2 build_ffi.py
Traceback (most recent call last):
File "build_ffi.py", line 5, in
from torch.utils.ffi import create_extension
File "/usr/local/lib/python2.7/dist-packages/torch/utils/ffi/init.py", line 14, in
raise ImportError("torch.utils.ffi requires the cffi package")
ImportError: torch.utils.ffi requires the cffi package
Makefile:31: recipe for target 'build/knn_pytorch/_knn_pytorch.so' failed
make: *** [build/knn_pytorch/_knn_pytorch.so] Error 1
I just wonder how you trained the segmentation network on the LineMOD dataset.
The LineMOD dataset doesn't contain segmentation ground truth for multiple objects to be detected in the same picture, and you used the masks preprocessed by singleshotpose, but there is only one mask for one object in each picture.
So did you train separate segmentation networks for each object, using these masks as ground truth? Or, if you trained one segmentation network to detect all the objects simultaneously, how did you get the segmentation ground truth?
Thank you !!!
Hey,
I developed a dataset based on a single object, a pink cube, whose data share the same format as YCB (640x480 PNG images for color, depth, and segmentation, plus .mat files for the ground-truth values). When I tried to train your pose estimation network on it, running the .py file with modifications for the number of objects and the locations of the files to load, it failed most of the time between the 5th and 15th batch:
2019-06-13 15:08:29,614 : Train time 00h 00m 00s Epoch 1 Batch 1 Frame 8 Avg_dis:11.477169811725616
2019-06-13 15:08:29,961 : Train time 00h 00m 01s Epoch 1 Batch 2 Frame 16 Avg_dis:10.867223545908928
2019-06-13 15:08:30,334 : Train time 00h 00m 01s Epoch 1 Batch 3 Frame 24 Avg_dis:14.736500158905983
2019-06-13 15:08:30,701 : Train time 00h 00m 01s Epoch 1 Batch 4 Frame 32 Avg_dis:7.393726512789726
2019-06-13 15:08:31,080 : Train time 00h 00m 02s Epoch 1 Batch 5 Frame 40 Avg_dis:9.500172346830368
2019-06-13 15:08:31,442 : Train time 00h 00m 02s Epoch 1 Batch 6 Frame 48 Avg_dis:11.712907552719116
2019-06-13 15:08:31,813 : Train time 00h 00m 02s Epoch 1 Batch 7 Frame 56 Avg_dis:34.04928183555603
2019-06-13 15:08:32,182 : Train time 00h 00m 03s Epoch 1 Batch 8 Frame 64 Avg_dis:62.857853412628174
2019-06-13 15:08:32,557 : Train time 00h 00m 03s Epoch 1 Batch 9 Frame 72 Avg_dis:12.033220887184143
2019-06-13 15:08:32,917 : Train time 00h 00m 04s Epoch 1 Batch 10 Frame 80 Avg_dis:nan
2019-06-13 15:08:33,348 : Train time 00h 00m 04s Epoch 1 Batch 11 Frame 88 Avg_dis:nan
2019-06-13 15:08:33,703 : Train time 00h 00m 04s Epoch 1 Batch 12 Frame 96 Avg_dis:nan
2019-06-13 15:08:34,071 : Train time 00h 00m 05s Epoch 1 Batch 13 Frame 104 Avg_dis:nan
2019-06-13 15:08:34,489 : Train time 00h 00m 05s Epoch 1 Batch 14 Frame 112 Avg_dis:nan
2019-06-13 15:08:34,865 : Train time 00h 00m 05s Epoch 1 Batch 15 Frame 120 Avg_dis:nan
Nan values continue like that until the end.
I have no problem training with YCB the way you did so I think the trouble appeared because I am using an other dataset.
EDIT:
It looks like the problem could come from the pred_c values.
At the beginning they are too close to 0. The loss calculation contains this line:
loss = torch.mean((dis * pred_c - w * torch.log(pred_c)), dim=0)
If at least one pred_c value is treated as exactly 0, then torch.log gives -inf, torch.mean returns inf, and a NaN appears during the optimizer step.
However, I don't know how to manage this issue; do you have any advice?
Thank you for your help
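One common guard for this failure mode, as a sketch rather than the repo's fix: clamp pred_c away from zero before the log term, so torch.log cannot return -inf.

import torch

def confidence_loss(dis, pred_c, w, eps=1e-6):
    # clamping keeps torch.log finite when a confidence underflows to zero
    pred_c = torch.clamp(pred_c, min=eps)
    return torch.mean(dis * pred_c - w * torch.log(pred_c), dim=0)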
Hey, I want to ask what the option start_epoch is used for. My understanding is that if the model was trained until, say, epoch 42 and the process was terminated, it can be restarted from epoch 42 next time by passing --start_epoch=42; is that correct?
I have observed that the Avg_dis is much higher than it was before the process terminated: e.g., the model was trained until epoch 42 with an Avg_dis around 0.00xx, but after restarting with --start_epoch=42 it is around 0.0x. Is that normal?
Thanks in advance!
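A hedged note on one likely cause, judging from the options in this repo's tools/train.py: --start_epoch only restarts the epoch counter, and the weights are reloaded only when a resume checkpoint is also passed (e.g. --resume_posenet); otherwise training effectively restarts from scratch. A minimal reload sketch (checkpoint name illustrative):

import torch
from lib.network import PoseNet  # repo module; import path assumed

estimator = PoseNet(num_points=1000, num_obj=21)  # YCB settings
estimator.cuda()
estimator.load_state_dict(torch.load('trained_models/ycb/pose_model_current.pth'))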
Hi @j96w, I'm working in a conda Python 2.7 environment, so I built knn_pytorch with Python 2 inside the conda environment, and I always get this module problem:
set -e
export PYTHONUNBUFFERED=True
PYTHONUNBUFFERED=True
export CUDA_VISIBLE_DEVICES=0
CUDA_VISIBLE_DEVICES=0
python ./tools/train.py --dataset linemod --dataset_root ./datasets/linemod/Linemod_preprocessed
Traceback (most recent call last):
File "./tools/train.py", line 12, in
import numpy as np
ImportError: No module named numpy
This is the make output with python2:
..$ make
python2 build_ffi.py
generating /tmp/tmpmsC9ic/_knn_pytorch.c
setting the current directory to '/tmp/tmpmsC9ic'
running build_ext
building '_knn_pytorch' extension
creating home
creating home/hamza
creating home/hamza/Téléchargements
creating home/hamza/Téléchargements/DenseFusion-master
creating home/hamza/Téléchargements/DenseFusion-master/lib
creating home/hamza/Téléchargements/DenseFusion-master/lib/knn
creating home/hamza/Téléchargements/DenseFusion-master/lib/knn/src
gcc -pthread -B /home/hamza/anaconda2/envs/py27/compiler_compat -Wl,--sysroot=/ -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/home/hamza/anaconda2/envs/py27/lib/python2.7/site-packages/torch/utils/ffi/../../lib/include -I/home/hamza/anaconda2/envs/py27/lib/python2.7/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/hamza/anaconda2/envs/py27/lib/python2.7/site-packages/torch/utils/ffi/../../lib/include/THC -I/usr/local/cuda/include -I/home/hamza/Téléchargements/DenseFusion-master/lib/knn/include -I/home/hamza/anaconda2/envs/py27/include/python2.7 -c _knn_pytorch.c -o ./_knn_pytorch.o -std=c99
gcc -pthread -B /home/hamza/anaconda2/envs/py27/compiler_compat -Wl,--sysroot=/ -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/home/hamza/anaconda2/envs/py27/lib/python2.7/site-packages/torch/utils/ffi/../../lib/include -I/home/hamza/anaconda2/envs/py27/lib/python2.7/site-packages/torch/utils/ffi/../../lib/include/TH -I/home/hamza/anaconda2/envs/py27/lib/python2.7/site-packages/torch/utils/ffi/../../lib/include/THC -I/usr/local/cuda/include -I/home/hamza/Téléchargements/DenseFusion-master/lib/knn/include -I/home/hamza/anaconda2/envs/py27/include/python2.7 -c /home/hamza/Téléchargements/DenseFusion-master/lib/knn/src/knn_pytorch.c -o ./home/hamza/Téléchargements/DenseFusion-master/lib/knn/src/knn_pytorch.o -std=c99
gcc -pthread -shared -B /home/hamza/anaconda2/envs/py27/compiler_compat -L/home/hamza/anaconda2/envs/py27/lib -Wl,-rpath=/home/hamza/anaconda2/envs/py27/lib -Wl,--no-as-needed -Wl,--sysroot=/ ./_knn_pytorch.o ./home/hamza/Téléchargements/DenseFusion-master/lib/knn/src/knn_pytorch.o /home/hamza/Téléchargements/DenseFusion-master/lib/knn/build/knn_cuda_kernel.so /usr/local/cuda/lib64/libnppitc_static.a /usr/local/cuda/lib64/libcublas_device.a /usr/local/cuda/lib64/libnppc_static.a /usr/local/cuda/lib64/libcusparse_static.a /usr/local/cuda/lib64/libcublas_static.a /usr/local/cuda/lib64/libcufftw_static.a /usr/local/cuda/lib64/libnppicom_static.a /usr/local/cuda/lib64/libnppicc_static.a /usr/local/cuda/lib64/libcufft_static.a /usr/local/cuda/lib64/libcudnn_static.a /usr/local/cuda/lib64/libnppist_static.a /usr/local/cuda/lib64/libcurand_static.a /usr/local/cuda/lib64/libnpps_static.a /usr/local/cuda/lib64/libnppig_static.a /usr/local/cuda/lib64/libcudadevrt.a /usr/local/cuda/lib64/libnppidei_static.a /usr/local/cuda/lib64/libnppisu_static.a /usr/local/cuda/lib64/libnvgraph_static.a /usr/local/cuda/lib64/libcusolver_static.a /usr/local/cuda/lib64/libnppif_static.a /usr/local/cuda/lib64/libnppim_static.a /usr/local/cuda/lib64/libculibos.a /usr/local/cuda/lib64/libcudart_static.a /usr/local/cuda/lib64/libnppial_static.a -L/home/hamza/anaconda2/envs/py27/lib -lpython2.7 -o ./_knn_pytorch.so
I'm working with the LineMOD data.
I also tried outside the conda environment; it's always the same problem. I don't know if I'm missing something!
Thank you in advance.
Hi @j96w, thanks for your work. I've been stuck on the visualization part for days. Could you explain to me (with enough detail, if possible) how to show the detected LineMOD objects on video files (or photos) in real time?
I trained and evaluated, and the results were as expected, but I want the real-time visualization part: how to see the object mesh on videos, and also how to work with a camera in real time.
Thank you in advance.
Hamza
Thank you very much for sharing the code. I have run and tested it on YCB data correctly, and I can also draw correct 3D boxes and point clouds on the 2D image plane. But when I use my own data to test the model trained on YCB, the segmentation result is correct while the pose estimation is not. The depth image is stored in PNG format (uint16), shown below. Could you give me some advice?