
3d_detection's Introduction

3D_detection

This work is inspired by image-to-3d-bbox (https://github.com/experiencor/image-to-3d-bbox), which is an implementation of the paper "3D Bounding Box Estimation Using Deep Learning and Geometry" (https://arxiv.org/abs/1612.00496).

Instead of using KITTI's 3-D ground truth, I mainly make two supplements:
1. Compute the 3-D box center from the 2-D box and the network's output.
2. Compute theta_ray from the 2-D box center.
Besides, I made some changes to the code structure.
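Of these, theta_ray is the angle between the camera's optical axis and the ray through the 2-D box center. A minimal sketch of that geometry under a pinhole model (not the repo's exact code; fx and cx stand for the focal length and principal point in pixels, e.g. P2[0, 0] and P2[0, 2] from a KITTI calib file):

```python
import numpy as np

def compute_theta_ray(box_2D, fx, cx):
    """Angle between the optical axis and the ray through the 2-D box center.

    box_2D: [xmin, ymin, xmax, ymax]; fx, cx: focal length and principal
    point in pixels (e.g. P2[0, 0] and P2[0, 2] from a KITTI calib file).
    """
    u_center = (box_2D[0] + box_2D[2]) / 2.0
    # Pinhole model: pixel u maps to normalized x = (u - cx) / fx, so the
    # ray's angle about the camera's vertical axis is the arctangent of that.
    return np.arctan2(u_center - cx, fx)
```

A box centered on the principal point gives theta_ray = 0; boxes to the right of it give positive angles.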

At present, there are still several problems to be solved, for example:
1. The number of situations is 256 in this work, whereas it is 64 in the paper.
2. During detection, I use the objects' truncated and occluded levels from KITTI's label files to decide whether to generate a 3-D box, whereas it would be more reasonable to predict these with the trained neural network.

This is just a raw version; you are welcome to share your ideas to improve it!

Results on KITTI:
000254.jpg
000074.jpg
000154.jpg

Usage:

If you want to train, after fixing the paths in train.py, just run:

python3 train.py

In this way, you can get your own weights file; alternatively, you can download the pretrained weights from https://pan.cstcloud.cn/web/share.html?hash=7dct49xER5w
For detection, after fixing the paths in detection.py, just run:

python3 detection.py

3d_detection's People

Contributors

cersar


3d_detection's Issues

Where can I get the test dataset?

I have tried to run detection.py, but I found that there are no testing labels or calib.txt in the KITTI datasets. Can you give me a link to download them?

How long does training take?

I am training on an Azure VM with the same number of images that you provided in the dataset folder. However, each epoch shows an ETA of 11 hours. Is there any way to reduce the training time?

There is an error when I try the project

When I download your pretrained weights file and run detection.py, there is an error: "Dimension 1 in both shapes must be equal, but are 4 and 12. Shapes are [256,4] and [256,12]. for 'Assign_99' (op: 'Assign') with input shapes: [256,4], [256,12]." What should I do about this error? Thanks a lot!

Please check the way the anchors are computed

Thanks for your great work, it's been very helpful to understand 3D Detection!

One difference I found between your work and the original paper is the way the anchors are computed in def compute_anchors.
According to the paper, we need to find the residual correction angle that must be applied to the center of the bin. But it seems that you're computing the angle from each left/right boundary to the ground-truth angle. Shouldn't we take the angle from the center of the bin to the ground-truth angle?

Please share your opinions, @cersar thanks in advance!
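For reference, a minimal sketch of the paper's formulation, where the residual is measured from each bin center rather than from the bin boundaries (the even bin layout over [0, 2*pi) is an assumption for illustration, not taken from this repo's code):

```python
import numpy as np

def bin_residuals(angle, num_bins=2):
    """Residual correction angle from each bin center, per the MultiBin idea.

    angle is assumed to lie in [0, 2*pi); bins are evenly spaced.
    """
    bin_width = 2.0 * np.pi / num_bins
    centers = np.arange(num_bins) * bin_width  # bin centers
    # Wrap each difference into [-pi, pi) so every residual is the
    # smallest signed angle from the bin center to the ground truth.
    residuals = (angle - centers + np.pi) % (2.0 * np.pi) - np.pi
    return residuals
```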

What's the way to calculate the 'new_alpha'?

First, thanks for your amazing work!
I wonder, in the code:

            # line[3] is KITTI's alpha, the observation angle in [-pi, pi];
            # shift it by pi/2 ...
            new_alpha = float(line[3]) + np.pi / 2.
            if new_alpha < 0:
                new_alpha = new_alpha + 2. * np.pi
            # ... and wrap the result into [0, 2*pi)
            new_alpha = new_alpha - int(new_alpha / (2. * np.pi)) * (2. * np.pi)

Can you please explain how this calculation has been made?

Not able to detect new images

When I pass a new image that is not from the KITTI dataset, an error occurs saying it cannot find the corresponding label file.
Why do we need a label file for testing purposes? Kindly help me out, sir.

compile environment

Could you tell me the environment? Which versions of Keras, TensorFlow, and Python? Thanks!

Regarding the usage of calib.txt file data

Hi,
Thank you so much for sharing this work with us, it is remarkable.
I have a question regarding the calibration file's data.

  1. As I read the README file, they recommend using the P_rect_xx data. I am wondering why your work only uses P2; is it because the data was collected via the second camera?
  2. If I want to perform training/testing on a different camera, how should I modify the P2 data? I am not sure about the meaning of the elements of this matrix. Could you please give some suggestions based on your experience and provide a more detailed description of each element?
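For background, and not as an official answer: P2 in KITTI's calib.txt is the 3x4 projection matrix of the left color camera (camera 2), which is the camera the 2-D labels refer to; a point in that camera's coordinates is projected as below. The matrix values here are illustrative, not taken from a real calib file:

```python
import numpy as np

def project_to_image(point_3D, P):
    """Project a 3-D point in camera coordinates with a 3x4 matrix P.

    P packs the intrinsics [fx 0 cx; 0 fy cy; 0 0 1] plus a small
    baseline translation, following KITTI's calib.txt row layout.
    """
    point_h = np.append(point_3D, 1.0)   # homogeneous coordinates
    u, v, w = P @ point_h                # image-plane projection
    return np.array([u / w, v / w])      # perspective divide

# Illustrative P2-like matrix (focal 700 px, principal point (600, 180)):
P2 = np.array([[700.0, 0.0, 600.0, 0.0],
               [0.0, 700.0, 180.0, 0.0],
               [0.0, 0.0, 1.0, 0.0]])
```

A point straight ahead on the optical axis lands on the principal point, which is a quick sanity check for any replacement matrix.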

I am looking forward to your reply soon! :)

Kind Regards

Can you explain post_processing.py?

First, thanks for your great work @cersar. It's really worth following compared to other works.

I've been trying to understand your intentions and most of the code, and it's very readable and easy to follow.
One thing I have difficulty understanding is compute_center() in post_processing.py.
Can you please add comments on the core lines explaining what each one does?

I really need your help, looking forward to getting your reply.
Thanks in advance!
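In the meantime, the core of compute_center() is an over-determined linear system W x = y solved by least squares (the repo's solve_least_squre uses an SVD, as a traceback in another issue shows). The underlying technique can be sketched as follows, with an illustrative function name:

```python
import numpy as np

def solve_least_squares(W, y):
    """Least-squares solution of W x = y via the SVD pseudo-inverse.

    W: (m, n) with m >= n; minimizes ||W x - y||_2, which is how the
    3-D box center is recovered from the 2-D box constraints.
    """
    W = np.asarray(W, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    U, sigma, VT = np.linalg.svd(W, full_matrices=False)
    # x = V * diag(1/sigma) * U^T * y
    return VT.T @ ((U.T @ y) / sigma)
```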

Much larger distance error compared to the paper

Thanks for the great work @cersar; it's been so helpful for understanding the principles of 3-D vision processing.
While I've been working with your project, I've noticed that the 3-D location error is much larger than the paper reports. The x and y distance errors are relatively fine, but the z-coordinate error (forward direction) is quite different from the original paper. The paper shows about 1 m of error at 10-20 m distance and 2 m at 20-30 m, but I usually get between 2 m and 10 m of error from your code.

The result of the paper:
screenshot

The result of mine:

file:  000127.png
box_2D:  [591.44 175.51 657.28 239.12]
center:  [ 0.27  1.18 15.25]
center_gt:  [ 0.38  1.7  20.03]
dimensions:  [1.02 1.47 2.6 ]
dimensions_gt:  [1.61 1.66 3.2 ]

Could you look into this issue?

Overall, the orientation and dimension estimates seem great, and the 3-D to 2-D projection also looks right in the image, but when I check the bird's-eye view, which includes the distance information, I see a huge error between the ground truth and the estimated value.

detection.py No loop matching

Hi, thanks for your work.
When I run detection.py, I get the error below; could you give any advice?
Traceback (most recent call last):
  File "/home/3D_detection-master/detection.py", line 83, in <module>
    points2D = gen_3D_box(yaw, dims, cam_to_img, box_2D)
  File "/home/3D_detection-master/util/post_processing.py", line 112, in gen_3D_box
    center = compute_center(points3D, rot_M, cam_to_img, box_2D, inds)
  File "/home/3D_detection-master/util/post_processing.py", line 75, in compute_center
    result = solve_least_squre(W, y)
  File "/home/3D_detection-master/util/post_processing.py", line 28, in solve_least_squre
    U, Sigma, VT = np.linalg.svd(W)
  File "/home/.local/lib/python3.6/site-packages/numpy/linalg/linalg.py", line 1612, in svd
    u, s, vh = gufunc(a, signature=signature, extobj=extobj)
TypeError: No loop matching the specified signature and casting
was found for ufunc svd_n_f
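For anyone hitting this: that NumPy TypeError typically means the array passed to np.linalg.svd has a non-numeric dtype (often object, e.g. when W was assembled from mixed Python lists), and casting to float before the decomposition is the usual fix. A minimal reproduction under that assumption:

```python
import numpy as np

# An object-dtype array (e.g. built from mixed Python lists) makes
# np.linalg.svd raise "No loop matching the specified signature ...".
W = np.array([[1, 2], [3, 4]], dtype=object)
try:
    np.linalg.svd(W)
    svd_failed = False
except TypeError:
    svd_failed = True

# Casting to a floating dtype restores the expected behavior.
U, sigma, VT = np.linalg.svd(W.astype(np.float64))
```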
