DenseFusion-with-sphercial-convolution-encoder

This repository contains a 6D pose estimation method developed during a master's course internship supervised by @drmateo. The method is based on the architecture of DenseFusion (https://github.com/j96w/DenseFusion), with the addition of a spherical-convolution-based encoder inspired by DualPoseNet (https://github.com/Gorilla-Lab-SCUT/DualPoseNet). It has been developed for the YCB-Video dataset, with a possible extension to LineMod in the future. The original encoder is written in TensorFlow, whereas the DenseFusion architecture is written in PyTorch, so the new encoder does not use the same spherical convolution function as the original encoder; it uses the PyTorch implementation from s2cnn (https://github.com/jonkhler/s2cnn/tree/master/s2cnn/soft) instead.
Developing this method involved studying many existing methods, which are listed in the citation file. These methods use visual detection, tactile detection, or both. The repository also provides a tool for visualising the results as a video on the YCB-Video dataset.
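Spherical convolutions operate on signals defined on a sphere, so a point cloud has to be converted into such a signal before it reaches the encoder. The sketch below shows one common way to do this, and it is an assumption rather than this repository's confirmed preprocessing: the points are centred, the sphere is divided into elevation/azimuth cells, and each cell keeps the largest distance to the centre among the points that fall into it. The function name `points_to_spherical_signal` and the grid sizes are hypothetical.

```python
import math

def points_to_spherical_signal(points, n_elev=8, n_azim=16):
    """Bin 3D points onto an n_elev x n_azim spherical grid (illustrative only).

    Each cell stores the largest distance to the centroid among the points
    falling into it, or 0.0 if the cell is empty.
    """
    n = len(points)
    # Centre the cloud on its centroid so the angles are object-relative.
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    cz = sum(p[2] for p in points) / n

    grid = [[0.0] * n_azim for _ in range(n_elev)]
    for x, y, z in points:
        dx, dy, dz = x - cx, y - cy, z - cz
        r = math.sqrt(dx * dx + dy * dy + dz * dz)
        if r == 0.0:
            continue  # a point at the centroid has no direction
        # Polar angle in [0, pi], azimuth in [0, 2*pi).
        elev = math.acos(max(-1.0, min(1.0, dz / r)))
        azim = math.atan2(dy, dx) % (2.0 * math.pi)
        # Map the angles to cell indices, clamping the upper boundary.
        i = min(int(elev / math.pi * n_elev), n_elev - 1)
        j = min(int(azim / (2.0 * math.pi) * n_azim), n_azim - 1)
        grid[i][j] = max(grid[i][j], r)
    return grid
```

The resulting 2D grid plays the role that an image plays for a planar CNN. The actual grid resolution and the features stored per cell in this project may differ.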

Requirements

  • Python 2.7/3.5/3.6 (to run this repo with Python 2.7, rebuild lib/knn/ with PyTorch 0.4.1)
  • PyTorch 0.4.1 (PyTorch 1.0 branch)
  • PIL
  • scipy
  • numpy
  • pyyaml
  • logging
  • matplotlib
  • CUDA 7.5/8.0/9.0 (Required. CPU-only training is extremely slow because of the loss computation for symmetric objects, which is a pixel-wise nearest-neighbour loss.)

The requirements listed above are for running the network. Visualizing the results additionally requires:

  • Python 3.7
  • OpenCV

Datasets

This work is tested on the following 6D object pose estimation dataset:

  • YCB_Video Dataset: The training and testing sets follow PoseCNN. The training set includes 80 training videos, 0000-0047 and 0060-0091 (sampled with a gap of 7 frames for training), and synthetic data 000000-079999. The testing set includes 2949 keyframes from the 12 testing videos 0048-0059.
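The split described above can be written down as a short sketch. The helper names `ycb_video_split` and `subsample_frames` are hypothetical and are not taken from this repository; which frame the 7-frame sampling starts from is also an assumption.

```python
def ycb_video_split():
    """Return the video and synthetic-image identifiers of the split."""
    # Real training videos: 0000-0047 and 0060-0091.
    train_ids = list(range(0, 48)) + list(range(60, 92))
    train_videos = ["%04d" % v for v in train_ids]
    # Testing videos: 0048-0059.
    test_videos = ["%04d" % v for v in range(48, 60)]
    # Synthetic training images: 000000-079999.
    synthetic = ["%06d" % i for i in range(80000)]
    return train_videos, test_videos, synthetic

def subsample_frames(frame_ids, gap=7):
    """Keep one frame every `gap` frames, starting from the first one."""
    return frame_ids[::gap]

train_videos, test_videos, synthetic = ycb_video_split()
print(len(train_videos), train_videos[0], train_videos[-1])
print(test_videos[0], test_videos[-1])
print(subsample_frames(list(range(1, 22))))
```

Sampling with a gap reduces redundancy between consecutive video frames, which are nearly identical, while the synthetic images are used in full.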

Architecture

[Figure: architecture diagram]

Results

[Figure: segmentation results]
