Coder Social home page Coder Social logo

caffe-theano-conversion's Introduction

caffe_theano_conversion

This is part of a project for CS231N at Stanford University, written by Ankit Kumar, Mathematics major, Class of 2015

This is a repository that allows you to convert pretrained caffe models into models in Lasagne, a thin wrapper around Theano. You can also convert a caffe model's architecture to an equivalent one in Lasagne. You do not need caffe installed to use this module.

Currently, the following caffe layers are supported:

* Convolution
* LRN
* Pooling
* Inner Product
* Relu
* Softmax

You can also load in a mean file using conversion's convert_mean_file function. Future work is to put that in the conversion of net architectures itself, which might have a data layer with transformparameter. That can be automated.

Right now, you have to put the cost layer on yourself, as well as do the backprop code. This is a future step for me, however, Lasagne is very easy to use and you can learn how to add your own stuff very easily. I want to keep this as configurable as possible because that's the benefit of theano.

DEPENDENCIES:

Theano (http://deeplearning.net/software/theano/) needs to be bleeding-edge:

pip install --upgrade --no-deps git+git://github.com/Theano/Theano.git

Lasagne (https://github.com/benanne/Lasagne), Google protobuf. Pylearn2 for cuda convnet wrappers (see below). If you have Caffe installed, you already have google protobuf, otherwise see here: https://code.google.com/p/protobuf/

##USING CUDA CONVNET WRAPPERS:

Install pylearn2 as develop:

git clone git://github.com/lisa-lab/pylearn2.git
cd pylearn2
python setup.py develop

The cuda-convnet wrappers in pylearn2 are much faster than the GPU implementations of convolutions in Theano. Lasagne has cuda-convnet layers, and I have created a caffe version of these layers. However, they require you to go into pylearn2 and change some of the files. I don't know what the best way to package that change in this repo is, so until someone tells me a better way I'll just describe what to do:

In this file: https://github.com/lisa-lab/pylearn2/blob/master/pylearn2/sandbox/cuda_convnet/base_acts.py, change:

class BaseActs(GpuOp):
    """
    Shared code for wrapping various convnet operations.
    """
    def __init__(self, pad=0, partial_sum=None, stride=1):

to:

class BaseActs(GpuOp):
    """
    Shared code for wrapping various convnet operations.
    """
    def __init__(self, pad=0, partial_sum=None, stride=1, numGroups=1):

Then, in that init function, change:

self.dense_connectivity = True

to:

if numGroups == 1:
	self.dense_connectivity = True
else:
	self.dense_connectivity = False

and add a line:

self.numGroups = numGroups

then, in this file: https://github.com/lisa-lab/pylearn2/blob/master/pylearn2/sandbox/cuda_convnet/filter_acts.py, change FilterActs c_code function from:

if self.dense_connectivity:
	basic_setup += """
    #define numGroups 1
    """

to:

if self.dense_connectivity:
	basic_setup += """
    #define numGroups 1
    """
else:
	basic_setup += """
	#define numGroups %s
	""" % self.numGroups

You should be able to not have that if statement at all, but I kept it in. You will have to do the same change (the #define numGroups) change for ImageActs and WeightActs as well if you want to backprop through the convolutional layers; if you just want to run the forward function, that isn't needed.

USAGE:

You can test the repo by python tests.py. All the tests should pass, but terminal is out of GPUs, so I haven't been able to run the tests.py script on GPU. However, I have used this repo with GPUs, and it worked, the only question is if it still works after I moved files around. If it doesn't let me know.

The file conversion.py has a function convert that takes a prototxt file and a caffemodel file, and returns a lasagne base model. I have plans to superclass base models with other models, for the purposes of training, but I haven't yet best figured out how to do this. This repo is still in active development while I use it for my project. If you have ideas for additions, let me know.

usage:

>>> from conversion import convert
>>> from models import *
>>> lmodel = convert('/path/to/deploy.prototxt', '/path/to/pretrained_model.caffemodel')
>>> dump(lmodel, 'filename')

caffe-theano-conversion's People

Contributors

an-kumar avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.