:iphone: Binarized Neural Network TF training code + C matrix / eval library.

Home Page: https://gist.github.com/codekansas/3cb447e3d95ccac4c5a56ea7ffb079ce

C 50.21% Python 49.79%

tinier-nn

A tinier framework for deploying binarized neural networks on embedded systems.

I wrote a better version in C++ as a gist: https://gist.github.com/codekansas/3cb447e3d95ccac4c5a56ea7ffb079ce

About

The core of this framework is the Binarized Neural Network (BNN) described in the paper "Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1". BNNs seemed ideal for embedded systems such as an Arduino (or Raspberry Pi), but to my knowledge no implementation for them was already available.

The system consists of two parts:

  • train: TensorFlow code for building a BNN and saving it in a specific format.
  • eval: The inference part, which runs on the embedded system, is written in plain C. It reads the model into SRAM and performs matrix multiplications using bitwise XOR, which (probably) leads to a big improvement in time and power consumption (although I haven't benchmarked anything).

The two sample scripts, train/model.py and eval/run.c, demonstrate how to train a model to learn an XOR function. The model uses far more weights than this task theoretically needs, but together the scripts show how to adapt the code to other use cases.

Demo

To run the demo, run:

make eval/run
cat models/model.def | eval/run

The outputs show the predictions for an XOR function.

To train the model, run:

python train/model.py --save_path models/model.def

This is how the models/model.def file was generated.

Math Notes

Weights and activations with values of -1 and 1 are encoded as binary values: -1 -> 0, 1 -> 1. Matrix multiplication is then done using the XOR operation. Here's an example:

Using binary weights and activations of -1 and 1:

  • Vector-matrix operation is [1, -1] * [1, -1; -1, 1] = [1 * 1 + -1 * -1, 1 * -1 + -1 * 1] = [2, -2]
  • Applying the binary activation function x > 0 ? 1 : -1 gives [1, -1]

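Using binary weights and activations of 0 and 1:

  • Encoding the inputs and weights as binary values: [1, 0] * [1, 0; 0, 1]
  • Applying XOR + sum: [1 ^ 1 + 0 ^ 0, 1 ^ 0 + 0 ^ 1] = [0, 2]
  • The sum counts mismatched bits, so the activation function becomes x < (2 / 2) ? 1 : 0 (fewer mismatches than half the vector length, here 2), which gives [1, 0]. This decodes back to [1, -1].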
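The equivalence between the two forms can be checked with a short Python sketch. It is not code from this repository; the function names sign_matvec and xor_matvec are made up for illustration. It relies on the identity dot = n - 2 * mismatches for vectors of length n, so a positive dot product corresponds to fewer than n / 2 mismatched bits.

```python
def sign_matvec(x, w):
    """Row vector times matrix with +1/-1 values, then the sign activation."""
    n_out = len(w[0])
    sums = [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(n_out)]
    return [1 if s > 0 else -1 for s in sums]


def xor_matvec(x_bits, w_bits):
    """Same operation on 0/1 encodings: XOR, count mismatches, threshold."""
    n = len(x_bits)
    n_out = len(w_bits[0])
    mismatches = [
        sum(x_bits[i] ^ w_bits[i][j] for i in range(n)) for j in range(n_out)
    ]
    # dot = n - 2 * mismatches, so dot > 0 is the same as mismatches < n / 2.
    return [1 if 2 * m < n else 0 for m in mismatches]


def encode(values):
    """Map -1 -> 0 and 1 -> 1."""
    return [1 if v == 1 else 0 for v in values]


x = [1, -1]
w = [[1, -1], [-1, 1]]
signed = sign_matvec(x, w)
binary = xor_matvec(encode(x), [encode(row) for row in w])
print(signed, binary)
```

Both paths agree once the signed result is encoded with the same -1 -> 0, 1 -> 1 mapping.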

Because the operations are done this way, matrix dimensions must be multiples of the integer size used to pack the bits. Padding can be used to make the data line up correctly (although if someone wants to change this, let me know).
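One way to picture the padding is the sketch below. It assumes a 16-bit word (WORD_BITS is an assumption, as are the helper names pack_bits and mismatches) and zero-pads both operands identically; it is not necessarily how the eval code handles padding.

```python
WORD_BITS = 16  # assumed word size, chosen only for illustration


def pack_bits(bits, word_bits=WORD_BITS):
    """Pack a list of 0/1 values into integers, zero-padding the last word."""
    padded = bits + [0] * (-len(bits) % word_bits)
    words = []
    for start in range(0, len(padded), word_bits):
        word = 0
        for bit in padded[start:start + word_bits]:
            word = (word << 1) | bit
        words.append(word)
    return words


def mismatches(x_words, w_words):
    """Count differing bits between two packed vectors (XOR, then popcount)."""
    return sum(bin(a ^ b).count("1") for a, b in zip(x_words, w_words))


x = pack_bits([1, 0, 1])
w = pack_bits([1, 1, 0])
print(mismatches(x, w))
```

Because the padding bits are 0 in both operands, their XOR is 0 and the mismatch count is unchanged. In this sketch the activation threshold would still have to use the true vector length rather than the padded one.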

To Do

  • On most Arduinos, Flash memory is bigger than SRAM by about a factor of 32, so it's not too bad to encode models as characters instead of bits (and it makes them easier to debug), although this could be improved.
  • More examples would be awesome.
  • Matrix multiplication could be better, maybe.
  • Architectures besides feed-forward networks would be good.
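For the first To Do item, the trade-off between character and bit encodings can be sketched as follows. The string of '0' and '1' characters is a hypothetical stand-in, not the actual models/model.def layout, and chars_to_packed is a made-up helper name.

```python
def chars_to_packed(text):
    """Convert a string of '0'/'1' characters to bytes, 8 weights per byte."""
    bits = [1 if c == "1" else 0 for c in text if c in "01"]
    bits += [0] * (-len(bits) % 8)  # zero-pad to a whole number of bytes
    out = bytearray()
    for start in range(0, len(bits), 8):
        byte = 0
        for bit in bits[start:start + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


weights_as_text = "1000000111110000"  # hypothetical weights, one char each
packed = chars_to_packed(weights_as_text)
print(len(weights_as_text), len(packed))
```

Storing one character per weight costs a full byte for each weight, while the packed form holds eight weights per byte, which is where the room for improvement comes from.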

This was a project that I worked on for CalHacks 3.0 (although I never submitted it).

tinier-nn's Issues

The number of hidden neuron warning?

In lines 86-87:
warnings.warn('Hidden layers should be multiples of '
              '16, not %d' % num_hidden)
Why is there a restriction on the number of hidden neurons?
I could not find any such limitation in the paper.
Could you please explain why?

Thanks
Sudarshan

Hi, tf.gradients vs tf.compute_gradients()

for p, prev_layer, w, bin_w in updates[::-1]:
    w_grad, loss_grad = tf.gradients(p, [bin_w, prev_layer], loss_grad)
    backprop_updates.append((w_grad, w))

Does tf.gradients have the same effect as using optimizer.compute_gradients(loss, var_list)?

The activation function needs to be changed to Htanh() instead of tanh.

p = tf.tanh(wx)
needs to be replaced by
p = tf.clip_by_value(wx, -1, 1)
so that it becomes the hard tanh function described in the paper, which is what makes the straight-through estimator valid.

I. Hubara et al., "Binarized Neural Networks: Training Neural Networks with Weights and Activations Constrained to +1 or -1."
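The difference between the two activations can be illustrated in plain Python, without TensorFlow. This is only a sketch of the math; the function names are made up, and hard_tanh_grad follows the paper's straight-through estimator, which passes the gradient through where |x| <= 1 and cancels it elsewhere.

```python
import math


def hard_tanh(x):
    """Clip x to the range [-1, 1]."""
    return max(-1.0, min(1.0, x))


def hard_tanh_grad(x):
    """Gradient gate of the straight-through estimator: 1 if |x| <= 1, else 0."""
    return 1.0 if -1.0 <= x <= 1.0 else 0.0


def tanh_grad(x):
    """Derivative of tanh, which shrinks smoothly but is never exactly 0 or 1
    away from the origin."""
    return 1.0 - math.tanh(x) ** 2


for value in (-2.5, -0.3, 0.5, 3.0):
    print(value, hard_tanh(value), hard_tanh_grad(value), tanh_grad(value))
```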

Thanks
-Sudarshan

PS:- Great work done!
