
Comments (3)

ShreyasSkandanS commented on August 23, 2024

Hi @dattadebrup, creation of ground truth LiDAR points from the raw KITTI data is outside the scope of this paper / repository, but I will try to address some of your concerns anyway.

  1. The ground truth KITTI data was generated by accumulating a sequence of individual LiDAR scans and then filtering this accumulated set of depth points. The LiDAR points are then projected onto the respective camera frame. The filtering step is not described in great detail, but I believe the gist of it is that the accumulated points are compared with stereo depth from Semi-Global Matching (SGM) and any points that disagree strongly are discarded (see the sketch after this list). The authors also mention a fair amount of manual clean-up of the ground truth data.

  2. I will assume that the script you have designed handles the projection of N different, sequential LiDAR frames onto a single camera frame, and that you are seeing a lot of noise in the resulting "ground truth" image. Yes, this is to be expected: it comes from mild mis-calibration, partial occlusions, etc., which is why the authors post-process the resulting image.

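Here is the kind of projection-and-consistency check I mean. This is only a minimal sketch and not the KITTI devkit code; the calibration inputs (P_rect, the Velodyne-to-camera transform), the SGM disparity map, and the 3 px agreement threshold are illustrative assumptions.

```python
import numpy as np

def project_and_filter(points_velo, T_velo_to_cam, P_rect, sgm_disp,
                       focal, baseline, max_disp_diff=3.0):
    """Project accumulated LiDAR points into the left camera and keep only
    points whose implied disparity roughly agrees with SGM stereo.

    points_velo    : (N, 3) accumulated LiDAR points in the Velodyne frame
    T_velo_to_cam  : (4, 4) rigid transform Velodyne -> rectified left camera
    P_rect         : (3, 4) rectified projection matrix of the left camera
    sgm_disp       : (H, W) disparity map from Semi-Global Matching
    focal, baseline: stereo focal length [px] and baseline [m]
    max_disp_diff  : reject points whose LiDAR disparity deviates more than
                     this many pixels from the SGM disparity (illustrative)
    """
    h, w = sgm_disp.shape

    # Homogeneous coordinates, then transform into the camera frame
    pts_h = np.hstack([points_velo, np.ones((points_velo.shape[0], 1))])
    pts_cam = (T_velo_to_cam @ pts_h.T).T          # (N, 4)
    pts_cam = pts_cam[pts_cam[:, 2] > 0.5]         # keep points in front of the camera

    # Project into the image plane
    uvw = (P_rect @ pts_cam.T).T                   # (N, 3)
    u = (uvw[:, 0] / uvw[:, 2]).astype(int)
    v = (uvw[:, 1] / uvw[:, 2]).astype(int)
    z = pts_cam[:, 2]

    # Drop points that fall outside the image
    in_img = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    u, v, z = u[in_img], v[in_img], z[in_img]

    # Convert LiDAR depth to disparity and compare against SGM
    lidar_disp = focal * baseline / z
    keep = np.abs(lidar_disp - sgm_disp[v, u]) < max_disp_diff

    # Build a sparse "ground truth" disparity image from the surviving points;
    # a real implementation would also keep only the nearest point per pixel.
    gt_disp = np.zeros((h, w), dtype=np.float32)
    gt_disp[v[keep], u[keep]] = lidar_disp[keep]
    return gt_disp
```
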
In case you haven't read these papers, I would recommend the following (a rough sketch of the accumulation step they describe comes after the quotes):
1. Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite - paper

To obtain a high stereo and optical flow ground truth density, we register a set of consecutive frames (5 before and 5 after the frame of interest) using ICP. We project the accumulated point clouds onto the image and automatically remove points falling outside the image. We then manually remove all ambiguous image regions such as windows and fences. Given the camera calibration, the corresponding disparity maps are readily computed.

2. Object Scene Flow for Autonomous Vehicles - paper

In absence of appropriate public datasets we annotated 400 dynamic scenes from the KITTI raw dataset with optical flow and disparity ground truth in two consecutive frames. The process of ground truth generation is especially challenging in the presence of individually moving objects since they cannot be easily recovered from laser scanner data alone due to the rolling shutter of the Velodyne and the low frame rate (10 fps). Our annotation work-flow consists of two major steps: First, we recover the static background of the scene by removing all dynamic objects and compensating for the vehicle’s ego motion. Second, we re-insert the dynamic objects by fitting detailed CAD models to the point clouds in each frame.

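To give a rough idea of the accumulation step in the first quote, here is a sketch using Open3D's point-to-point ICP. This is not the authors' pipeline; the correspondence distance, the identity initialization, and the choice of the middle scan as reference are illustrative assumptions.

```python
import numpy as np
import open3d as o3d

def accumulate_scans(scans, max_corr_dist=0.5):
    """Register a list of LiDAR scans (each an (N, 3) NumPy array) to the
    middle scan with point-to-point ICP and return the accumulated cloud.
    In practice the vehicle ego-motion (e.g. from the OXTS data in KITTI Raw)
    would give a much better initial guess than the identity used here.
    """
    ref_idx = len(scans) // 2
    target = o3d.geometry.PointCloud()
    target.points = o3d.utility.Vector3dVector(scans[ref_idx])

    accumulated = np.asarray(target.points)
    for i, scan in enumerate(scans):
        if i == ref_idx:
            continue
        source = o3d.geometry.PointCloud()
        source.points = o3d.utility.Vector3dVector(scan)
        reg = o3d.pipelines.registration.registration_icp(
            source, target, max_corr_dist, np.eye(4),
            o3d.pipelines.registration.TransformationEstimationPointToPoint())
        source.transform(reg.transformation)   # align scan i to the reference frame
        accumulated = np.vstack([accumulated, np.asarray(source.points)])
    return accumulated
```
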
  1. More importantly, for this paper and repository I assume that ground truth data already exists, which is the case for both KITTI and Middlebury. For the PMD Monstar dataset in this paper, I provide only qualitative results and do not have ground truth.

  2. If you're interested in pre-processed KITTI Raw data, I would also take a look at the KITTI Depth Completion benchmark data - Depth Completion

I'm closing this issue since it isn't relevant to this repository or paper, but feel free to raise another issue if you have any questions related to this work.


ShreyasSkandanS commented on August 23, 2024

Hi @dattadebrup,

gt_disparity.png is the ground truth image from the KITTI Stereo 2015 dataset (http://www.cvlibs.net/datasets/kitti/eval_scene_flow.php?benchmark=stereo). You shouldn't need to generate this image. Can you provide more details about the code you're trying to run?

Best regards,
SS


dattadebrup commented on August 23, 2024

Hi @ShreyasSkandanS,
Thanks for the response. I am sorry for not explaining the problem in detail. I am actually trying to implement your fusion algorithm on the raw KITTI data (http://www.cvlibs.net/datasets/kitti/raw_data.php), so I am trying to create the gt_disparity.png image with a script of my own, following this step from your paper:

Additionally, using the calibrated intrinsics and extrinsics, we convert the depth sensor’s range measurements into a depth image in the left camera’s reference frame with matching focal length.

I have also tried my best to shift the depth values of the depth image produced from the 3D LiDAR point cloud to be consistent with the stereo-only depth image (a rough sketch of my conversion is below), but the resulting fusion images are still bad.
So how should I properly create the gt_disparity.png image from the 3D LiDAR point cloud data?
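
For reference, this is roughly the conversion I am using. It is only a sketch with my own variable names, and it assumes the LiDAR depth image (in meters) has already been rendered in the rectified left camera frame, with focal_px and baseline_m taken from the KITTI calibration files.

```python
import numpy as np

def depth_to_disparity(depth_m, focal_px, baseline_m):
    """Convert a (sparse) depth image in meters to a disparity image in
    pixels via d = f * B / Z; zero-depth pixels stay invalid (zero)."""
    disp = np.zeros_like(depth_m, dtype=np.float32)
    valid = depth_m > 0
    disp[valid] = focal_px * baseline_m / depth_m[valid]
    return disp
```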

