geometry_processing's Introduction

Setup

Clone the repo, add the package to the Python path, and install the Python dependencies.

export PYTHONPATH=$(pwd):$PYTHONPATH
git clone https://github.com/bradyz/geometry_processing.git

pip install -r requirements.txt
or
pip install --user -r requirements.txt

Data Dependencies

ModelNet with 25 viewpoints per mesh - https://drive.google.com/open?id=0B0d9M5p2RxBqN0IzOXpudjMyTDQ

Our model's weights - https://drive.google.com/open?id=0B0d9M5p2RxBqMlNZOFg1YmlYR3c

Contents

  1. View Generator - take 2D projections of mesh files.
  2. Train CNN - fine-tune a VGG-16 CNN on the new images.
  3. Classifier - train an SVM on the CNN features.
  4. References - papers and resources used.

View Generator

Given a model and a list of viewpoints, the view generator writes one .png image file per viewpoint, each containing the corresponding 2D projection of the mesh.

Preprocessing consists of centering the mesh, uniformly scaling the bounding box to a unit cube, and taking viewpoints that are centered at the centroid.
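The centering and scaling steps can be sketched with NumPy as below. This is a minimal illustration, not the repository's code: the function name `normalize_vertices` is hypothetical, and it assumes the centroid is the mean of the vertex positions.

```python
import numpy as np

def normalize_vertices(vertices):
    """Center a mesh and scale it uniformly so its bounding box fits a unit cube."""
    vertices = np.asarray(vertices, dtype=float)
    # Assumption: the centroid is the mean of the vertex positions.
    centered = vertices - vertices.mean(axis=0)
    # Longest side of the axis-aligned bounding box.
    extent = (centered.max(axis=0) - centered.min(axis=0)).max()
    if extent == 0:
        return centered  # degenerate mesh: every vertex coincides
    # Dividing by one scalar keeps the aspect ratio (uniform scaling).
    return centered / extent

# Toy "mesh" of four vertices; a real mesh would be loaded from a file.
vertices = np.array([
    [0.0, 0.0, 0.0],
    [2.0, 0.0, 0.0],
    [0.0, 4.0, 0.0],
    [0.0, 0.0, 1.0],
])
normalized = normalize_vertices(vertices)
```

After this step the longest bounding-box side has length 1 and the vertex mean sits at the origin, so cameras can all look at the origin.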

Currently 25 viewpoints are generated on the unit sphere, one for each combination of 5 phi values and 5 theta values (spherical coordinates).
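One way to lay out such a 5 x 5 grid of camera positions is sketched below. The exact angles are not stated here, so the spacing is an assumption: phi (polar angle) is sampled strictly between the poles so that no two viewpoints coincide, and theta (azimuth) is evenly spaced around the full circle. The function name `generate_viewpoints` is hypothetical.

```python
import numpy as np

def generate_viewpoints(n_phi=5, n_theta=5, radius=1.0):
    """Return an (n_phi * n_theta, 3) array of camera positions on a sphere."""
    # Assumption: drop the two endpoints so phi never hits a pole,
    # where every theta would map to the same point.
    phis = np.linspace(0.0, np.pi, n_phi + 2)[1:-1]
    # Assumption: azimuth evenly spaced over [0, 2*pi).
    thetas = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    points = []
    for phi in phis:
        for theta in thetas:
            points.append([
                radius * np.sin(phi) * np.cos(theta),
                radius * np.sin(phi) * np.sin(theta),
                radius * np.cos(phi),
            ])
    return np.array(points)

viewpoints = generate_viewpoints()
```

Each row is a camera position; pointing every camera at the origin matches the preprocessing, which centers the mesh there.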

Train CNN

The model used in this project is a VGG-16 with pretrained ImageNet weights and two additional fully connected layers, fc1 (2048) and fc2 (1024).

Training ran for 10 epochs on 100k training images (4000 meshes) covering the 10 labels of ModelNet10. The images were 224 x 224 RGB. Cross-entropy loss was used with an SGD optimizer and a batch size of 64. Training took approximately 5 hours on an NVIDIA K40 GPU.
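As a quick sanity check on these figures, the arithmetic below relates meshes, views, batch size, and epochs. It assumes 25 views per mesh (the number the view generator produces) and that the final partial batch of each epoch is kept; neither detail is confirmed for the actual training script.

```python
import math

meshes = 4000
views_per_mesh = 25   # assumption: every mesh contributes all 25 views
batch_size = 64
epochs = 10

train_images = meshes * views_per_mesh                  # 100000 images
# Assumption: the last, smaller batch of each epoch is not dropped.
steps_per_epoch = math.ceil(train_images / batch_size)  # 1563 steps
total_steps = steps_per_epoch * epochs                  # 15630 steps

# Rough time per SGD step given about 5 hours of training.
seconds_per_step = 5 * 3600 / total_steps               # about 1.15 s
```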

After training, classification accuracy from a single pose is 80% on a test set of 20k images.

Classifier

The question is: given a mesh and several viewpoints, does it help to use all of the viewpoints (as in MVCNN), or does a selected subset of size k give better accuracy?

We use a one-vs-rest linear SVM, similar to MVCNN, to classify activation values of the final fc layer.
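A minimal sketch of this step with scikit-learn follows. Everything in it is illustrative: the features are synthetic Gaussian clusters standing in for CNN activations, the feature size is shrunk to 64 to keep the example fast, and summing per-view scores to get one label per mesh is an assumed aggregation rule, not necessarily the one used in this project.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n_classes, n_per_class, feature_dim = 10, 30, 64

# Synthetic stand-in for fc-layer activations: one Gaussian cluster per class.
centers = rng.normal(scale=5.0, size=(n_classes, feature_dim))
labels = np.repeat(np.arange(n_classes), n_per_class)
features = centers[labels] + rng.normal(size=(labels.size, feature_dim))

# LinearSVC handles multiclass problems with a one-vs-rest scheme by default.
svm = LinearSVC(C=1.0, max_iter=5000)
svm.fit(features, labels)

# Hypothetical mesh of class 3 seen from 25 views.
mesh_views = centers[3] + rng.normal(size=(25, feature_dim))
scores = svm.decision_function(mesh_views)  # one row of class scores per view
# Assumption: aggregate views by summing their per-class scores.
mesh_prediction = int(np.argmax(scores.sum(axis=0)))
```

Restricting `mesh_views` to a subset of rows before summing is where a view-selection strategy would plug in.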

The candidate methods for selecting the k views are the following (currently unimplemented) -

  • Sort by minimized entropy
  • Random K
  • FPS (farthest point selection) on sorted
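Since these strategies are listed as unimplemented, the sketch below is only one possible reading of them. All function names are hypothetical. It assumes "entropy" means the Shannon entropy of each view's predicted class distribution, and that "FPS on sorted" means greedy farthest point selection over camera positions, seeded with the lowest-entropy view.

```python
import numpy as np

def entropy(probabilities):
    """Shannon entropy of each row of class probabilities."""
    p = np.clip(probabilities, 1e-12, 1.0)  # avoid log(0)
    return -(p * np.log(p)).sum(axis=-1)

def select_min_entropy(view_probs, k):
    """Indices of the k views with the most confident (lowest entropy) predictions."""
    return np.argsort(entropy(view_probs))[:k]

def select_random(n_views, k, rng):
    """Indices of k distinct views chosen uniformly at random."""
    return rng.choice(n_views, size=k, replace=False)

def select_fps(viewpoints, k, start=0):
    """Greedy farthest point selection: repeatedly add the view farthest from those chosen."""
    chosen = [start]
    dist = np.linalg.norm(viewpoints - viewpoints[start], axis=1)
    while len(chosen) < k:
        nxt = int(np.argmax(dist))
        chosen.append(nxt)
        # Track each view's distance to its nearest already-chosen view.
        dist = np.minimum(dist, np.linalg.norm(viewpoints - viewpoints[nxt], axis=1))
    return np.array(chosen)

# Toy data: 4 views, 4 classes, and one camera position per view.
view_probs = np.array([
    [0.25, 0.25, 0.25, 0.25],
    [0.97, 0.01, 0.01, 0.01],
    [0.70, 0.10, 0.10, 0.10],
    [0.40, 0.30, 0.20, 0.10],
])
viewpoints = np.array([[1, 0, 0], [0, 1, 0], [-1, 0, 0], [0, 0, 1]], dtype=float)

by_entropy = select_min_entropy(view_probs, 2)
by_random = select_random(len(view_probs), 2, np.random.default_rng(0))
# Assumption: seed FPS with the most confident view.
by_fps = select_fps(viewpoints, 2, start=int(by_entropy[0]))
```

Each function returns indices into the list of views, so the three strategies can be swapped without changing the classifier that consumes the selected features.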

References

Multi-view Convolutional Neural Networks for 3D Shape Recognition - https://arxiv.org/pdf/1505.00880.pdf

Princeton ModelNet - http://modelnet.cs.princeton.edu/
