mobmonrob / cnnregistrationstudien

Student research project on point cloud registration using Convolutional Neural Networks

License: MIT License

Python 0.07% Jupyter Notebook 99.92% Shell 0.01%

cnnregistrationstudien's Introduction

Registration of point clouds using Convolutional Neural Networks

A brief explanation of the project

This is a student project on the registration of point clouds based on Convolutional Neural Networks.

FCGF stands for "Fully Convolutional Geometric Features". It is a method for generating point-wise features for point cloud data. The features are trained to be robust to rotation and translation, so that corresponding points in two scans receive similar features, and they can be used for tasks such as point cloud registration, classification, and segmentation.

One of the advantages of FCGF is that it does not rely on hand-crafted geometric features, which can be difficult to design for complex point cloud shapes. Instead, the method learns the features directly from the data using deep learning techniques. This makes it more flexible and adaptable to a wide range of point cloud data.

FCGF (Fully Convolutional Geometric Features) extracts its features with a fully convolutional neural network that operates on the whole point cloud in a single pass. The original authors use a ResUNet architecture built on MinkowskiEngine for this purpose; PointNet, by contrast, is the basis of earlier descriptors such as PPFNet. This implementation also uses the ResUNet architecture.

ResUNet is a neural network architecture that combines the idea of residual networks (ResNets) and U-nets. ResNets use residual blocks to allow for the successful training of very deep neural networks, while U-nets are commonly used for segmentation tasks in biomedical image analysis. By combining these two architectures, ResUNet is able to effectively learn features at different levels of abstraction while preserving fine-grained details.

In the context of the FCGF model, ResUNet is used to extract features from the point cloud data, which are then used to perform geometric matching and registration tasks.
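The matching step can be illustrated with a small sketch. It assumes that each point already has a feature vector (32-dimensional here, as in the FCGF abstract) and pairs points across the two clouds by mutual nearest neighbor in feature space. The function name `mutual_nearest_neighbors`, the mutual check, and the toy data are assumptions made for illustration; they are not taken from the project notebooks.

```python
import numpy as np

def mutual_nearest_neighbors(feats_src, feats_tgt):
    """Return index pairs (i, j) where source feature i and target feature j
    are each other's nearest neighbor in feature space."""
    # Pairwise squared Euclidean distances, shape (n_src, n_tgt).
    diff = feats_src[:, None, :] - feats_tgt[None, :, :]
    dist2 = np.sum(diff * diff, axis=2)
    nn_src_to_tgt = np.argmin(dist2, axis=1)  # best target for each source
    nn_tgt_to_src = np.argmin(dist2, axis=0)  # best source for each target
    src_idx = np.arange(feats_src.shape[0])
    # Keep a pair only if the match agrees in both directions.
    keep = nn_tgt_to_src[nn_src_to_tgt] == src_idx
    return np.stack([src_idx[keep], nn_src_to_tgt[keep]], axis=1)

# Toy data (hypothetical): the source features are a shuffled, slightly
# perturbed copy of the target features.
rng = np.random.default_rng(0)
feats_tgt = rng.normal(size=(50, 32))
perm = rng.permutation(50)
feats_src = feats_tgt[perm] + 0.01 * rng.normal(size=(50, 32))

matches = mutual_nearest_neighbors(feats_src, feats_tgt)
print(matches.shape)
```

The brute-force distance matrix is fine for a toy example; for full scans a k-d tree or a GPU nearest-neighbor search would be the more practical choice.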

To estimate the transformation between two point clouds we use RANSAC (RANdom SAmple Consensus), an iterative robust estimation algorithm that is widely used in computer vision. For point cloud alignment, RANSAC works on candidate correspondences between the two clouds, here obtained by matching the extracted features. In each iteration the algorithm randomly selects a small subset of correspondences and estimates the rigid transformation (rotation and translation) from those points alone. It then checks how many of the remaining correspondences are aligned within a certain error threshold under the estimated transformation. If the hypothesis aligns more correspondences than the best one found so far, it is kept; otherwise it is rejected. The algorithm then selects a new random subset, and after a fixed number of iterations it returns the transformation with the most aligned correspondences.
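As a minimal sketch of this procedure, the following NumPy code runs RANSAC over a set of putative correspondences, where `src[i]` is assumed to correspond to `tgt[i]`. The rigid transformation for each sample is estimated with the Kabsch (SVD-based) method, which is a choice made here for illustration. The function names, the threshold, and the iteration count are assumptions as well and do not reflect the settings used in the notebooks.

```python
import numpy as np

def estimate_rigid_transform(src, tgt):
    """Least-squares R, t such that R @ src[i] + t ~ tgt[i] (Kabsch method)."""
    src_c, tgt_c = src.mean(axis=0), tgt.mean(axis=0)
    H = (src - src_c).T @ (tgt - tgt_c)       # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_c - R @ src_c
    return R, t

def ransac_registration(src, tgt, n_iters=1000, threshold=0.05,
                        sample_size=3, seed=0):
    """RANSAC over correspondences src[i] <-> tgt[i]; returns R, t, inlier mask."""
    rng = np.random.default_rng(seed)
    n = src.shape[0]
    best_R, best_t = np.eye(3), np.zeros(3)
    best_inliers = np.zeros(n, dtype=bool)
    for _ in range(n_iters):
        idx = rng.choice(n, size=sample_size, replace=False)
        R, t = estimate_rigid_transform(src[idx], tgt[idx])
        residuals = np.linalg.norm(src @ R.T + t - tgt, axis=1)
        inliers = residuals < threshold
        if inliers.sum() > best_inliers.sum():  # keep the best hypothesis so far
            best_R, best_t, best_inliers = R, t, inliers
    if best_inliers.sum() >= sample_size:       # refit on all inliers
        best_R, best_t = estimate_rigid_transform(src[best_inliers],
                                                  tgt[best_inliers])
    return best_R, best_t, best_inliers

# Toy data (hypothetical): 100 correspondences, 30 of them wrong.
rng = np.random.default_rng(1)
src = rng.uniform(-1, 1, size=(100, 3))
angle = np.deg2rad(30)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.5, -0.2, 0.1])
tgt = src @ R_true.T + t_true
tgt[:30] = rng.uniform(-1, 1, size=(30, 3))

R_est, t_est, inliers = ransac_registration(src, tgt)
print(int(inliers.sum()), "inliers")
```

Three non-collinear correspondences are the minimum needed to determine a rigid transformation in 3D, which is why the sample size defaults to 3. Refitting on all inliers at the end is a common refinement rather than a required part of RANSAC.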

FCGF was introduced in the paper "Fully Convolutional Geometric Features" by Christopher Choy, Jaesik Park, and Vladlen Koltun, which was presented at the International Conference on Computer Vision (ICCV) in 2019.

Requirements

  • Ubuntu 20.04 or higher
  • CUDA 11.8 or higher
  • Python v3.7 or higher
  • PyTorch v2.0.0 or higher
  • MinkowskiEngine v0.5.4

Setting up the requirements

The project was implemented on Google Colab. To set up the project open the Jupyter Notebook (FCGF.ipynb) and follow the steps.

Google Colab Notebooks

  • Registration 3DMatch Redkitchen - Trying the demo data from the 3DMatch data set
  • Registration vol. 1 - First try to align 2 point clouds of 2 objects (Ruler, Headphones). The attempt was unsuccessful due to too much background noise.
  • Registration vol. 2 - Alignment of 3 different objects (R2D2, Drone, Rabbit). The overlap between the respective 2 point clouds was 90%, 60%, 30%.
  • Registration vol. 3 - Alignment of 5 different objects (Backpack, Fanny pack, Shoes, Keyboard, Skeleton). The overlap between the respective 2 point clouds was 60%, 30%, 10%, 5%.
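The notebooks list the overlap between each pair of point clouds as a percentage. This README does not state how those values were measured, so the following is only one plausible definition, given as a sketch: the fraction of source points that have a target point within a distance threshold once both clouds are in the same coordinate frame. The function name `overlap_ratio` and the threshold are hypothetical.

```python
import numpy as np

def overlap_ratio(src, tgt, threshold):
    """Fraction of src points with at least one tgt point closer than threshold.

    Both clouds are assumed to be expressed in the same coordinate frame,
    e.g. after applying a known ground-truth alignment.
    """
    diff = src[:, None, :] - tgt[None, :, :]
    nearest = np.sqrt(np.min(np.sum(diff * diff, axis=2), axis=1))
    return float(np.mean(nearest < threshold))

# Toy data (hypothetical): ten points on a line, and a copy shifted by four
# grid steps, so six of the ten source points still have a partner.
xs = np.arange(10, dtype=float)
src = np.stack([xs, np.zeros(10), np.zeros(10)], axis=1)
tgt = src + np.array([4.0, 0.0, 0.0])

ratio = overlap_ratio(src, tgt, threshold=0.5)
print(ratio)  # 0.6
```

Note that this measure is not symmetric in general: swapping `src` and `tgt` can give a different value when the clouds differ in size or density.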

Fully Convolutional Geometric Features, ICCV, 2019

Extracting geometric features from 3D scans or point clouds is the first step in applications such as registration, reconstruction, and tracking. State-of-the-art methods require computing low-level features as input or extracting patch-based features with limited receptive field. In this work, we present fully-convolutional geometric features, computed in a single pass by a 3D fully-convolutional network. We also present new metric learning losses that dramatically improve performance. Fully-convolutional geometric features are compact, capture broad spatial context, and scale to large scenes. We experimentally validate our approach on both indoor and outdoor datasets. Fully-convolutional geometric features achieve state-of-the-art accuracy without requiring preprocessing, are compact (32 dimensions), and are 600 times faster than the most accurate prior method.

ICCV'19 Paper

3D Feature Accuracy vs. Speed

[Figures: comparison table; accuracy vs. speed plot]

Feature-match recall and speed (log scale) on the 3DMatch benchmark. FCGF is the most accurate and the fastest method. The gray region shows the Pareto frontier of the prior methods; FCGF (red triangle) lies beyond that frontier, improving on the prior methods in both accuracy and speed.
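Feature-match recall is not defined in this README. As commonly used on the 3DMatch benchmark, it is the fraction of fragment pairs for which a sufficient share of feature matches is geometrically correct under the ground-truth pose. The sketch below follows that definition. The thresholds `tau1` (distance, in meters) and `tau2` (minimum inlier ratio) are set to commonly reported values and should be treated as assumptions, as should the function name and the toy numbers.

```python
import numpy as np

def feature_match_recall(pair_distances, tau1=0.1, tau2=0.05):
    """Fraction of fragment pairs whose inlier ratio exceeds tau2.

    pair_distances: one 1-D array per fragment pair, holding the distance
    between matched points after applying the ground-truth transformation.
    A match counts as an inlier if that distance is below tau1.
    """
    inlier_ratios = np.array([np.mean(d < tau1) for d in pair_distances])
    return float(np.mean(inlier_ratios > tau2))

# Toy data (hypothetical): three fragment pairs.
pair_a = np.array([0.01, 0.02, 0.50, 0.80])          # 50% inliers -> counted
pair_b = np.array([0.30, 0.40, 0.90])                # 0% inliers  -> missed
pair_c = np.concatenate([[0.01, 0.03], np.full(18, 1.0)])  # 10% -> counted

recall = feature_match_recall([pair_a, pair_b, pair_c])
print(round(recall, 3))
```

The low value of `tau2` reflects that a robust estimator such as RANSAC can still recover a pose when only a small share of the matches is correct.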

Reading material

  • 2020-10-02 Measure the FCGF speedup on v0.5 on MinkowskiEngineBenchmark. The speedup ranges from 2.7x to 7.7x depending on the batch size.
  • 2020-09-04 Updates on ME v0.5 further speed up the inference time from 13.2ms to 11.8ms. As a reference, ME v0.4 takes 37ms.
  • 2020-08-18 Merged the v0.5 to the master with v0.5 installation. You can now use the full GPU support for sparse tensor hi-COO representation for faster training and inference.
  • 2020-08-07 MinkowskiEngine v0.5 improves the FCGF inference speed by a factor of 2.8: the feed-forward time for ResUNetBN2C on the 3DMatch kitchen point cloud ID-20 drops from 37ms (ME v0.4.3) to 13.2ms (ME v0.5.0). Measured on TitanXP, Ryzen-3700X.
  • 2020-06-15 Source code for Deep Global Registration, CVPR'20 Oral has been released. Please refer to the repository and the paper for using FCGF for registration.
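The speedup figures in the 2020-08-07 and 2020-09-04 entries follow directly from the reported feed-forward times. The short calculation below reproduces them; the helper names are made up for this example.

```python
def speedup(old_ms, new_ms):
    """How many times faster the new timing is compared to the old one."""
    return old_ms / new_ms

def time_reduction(old_ms, new_ms):
    """Share of the original run time that was saved."""
    return 1.0 - new_ms / old_ms

# Feed-forward times quoted in the entries above, in milliseconds.
me_v04, me_v050, me_v05_updated = 37.0, 13.2, 11.8

print(round(speedup(me_v04, me_v050), 2))         # 2.8
print(round(speedup(me_v04, me_v05_updated), 2))  # 3.14
print(round(time_reduction(me_v04, me_v050), 2))  # 0.64
```

A factor of 2.8 therefore corresponds to roughly 64% less time per forward pass, which is a clearer way to read the figure than a percentage speed-up.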

Related Works

3DMatch by Zeng et al. uses a Siamese convolutional network to learn 3D patch descriptors. CGF by Khoury et al. maps 3D oriented histograms to a low-dimensional feature space using multi-layer perceptrons. PPFNet and PPF-FoldNet by Deng et al. adapt the PointNet architecture for geometric feature description. 3DFeat-Net by Yew and Lee uses a PointNet to extract features in outdoor scenes.

Our work addressed a number of limitations in the prior work. First, all prior approaches extract a small 3D patch or a set of points and map it to a low-dimensional space. This not only limits the receptive field of the network but is also computationally inefficient since all intermediate representations are computed separately even for overlapping 3D regions. Second, using expensive low-level geometric signatures as input can slow down feature computation. Lastly, limiting feature extraction to a subset of interest points results in lower spatial resolution for subsequent matching stages and can thus reduce registration accuracy.

Related Projects

Projects using FCGF

@inproceedings{FCGF2019,
    author = {Christopher Choy and Jaesik Park and Vladlen Koltun},
    title = {Fully Convolutional Geometric Features},
    booktitle = {ICCV},
    year = {2019},
}

cnnregistrationstudien's People

Contributors

floxdeveloper, monikag14
