
BitsMNIST.jl


Handwritten numbers predicted by bit neural networks

Introduction

Bit Neural Networks (BNNs) are a low-memory alternative to float32 neural networks (FNNs) that is friendly to low-end processors. They use a single bit per parameter (weights, biases, and features), packed into 64-bit words instead of one 32-bit float per parameter. Because of that, BNNs can achieve up to 64 times less memory consumption and up to 32 times speedup when compared to FNNs.
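
For intuition, here is a quick sketch of the packing effect using Julia's built-in BitVector (byte counts are approximate; exact sizes include a small object header):

using Random #for bitrand

floats = rand(Float32, 64_000) #one 32-bit float per parameter
bits = bitrand(64_000) #one bit per parameter, packed into 64-bit words

Base.summarysize(floats) #~256 KB
Base.summarysize(bits) #~8 KB, roughly 32 times smaller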

Usage

Downloading datasets

Binary neural networks can accept floats as features. However, preprocessing the dataset by explicitly defining which values become 0 or 1 (bits) makes it explicit which pixels are treated as relevant. You can download the datasets with the commands below:

Bits MNIST

The regular MNIST dataset with bits defined by the rule: if pixel > avg_of_pixels_greater_than_zero, then 1, else 0.

dataset = BitsMNIST.Datasets.mnist()
Dict{String, Any} with 4 entries:
  "train_y" => [5, 0, 4, 1, 9,   
  "train_x" => BitVector[[0, 0, 0, 0, 0, ...
  "test_y"  => [7, 2, 1, 0, 4, ...
  "test_x"  => BitVector[[0, 0, 0, 0, 0, ...

Noisy Bits MNIST

The previous dataset with noise added to it: if rand() > 0.3, then pixel = !pixel.

dataset = BitsMNIST.Datasets.noisymnist()
Dict{String, Any} with 4 entries:
  "train_y" => [-1, -1, -1, -1, -1, ...
  "train_x" => BitVector[[0, 0, 0, 0, 0, ... 
  "test_y"  => [-1, -1, -1, -1, -1, ...
  "test_x"  => BitVector[[0, 0, 0, 0, 0, ...

All noisymnist labels take the value defined by the constant BitsMNIST.Datasets.NOISE_LABEL.
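
The flipping rule can be sketched as a one-liner (noisify is a hypothetical helper, not part of the package):

noisify(x::BitVector) = x .⊻ (rand(length(x)) .> 0.3) #XOR with 1 flips a pixel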

Once you've downloaded a dataset, it is stored in a cache folder, so you won't need to download it again.

ZeroOne

Predicting digits from 0 to 9 can be a CPU-intensive task. A simpler case is predicting whether a digit is a 0 or a 1. Let's check out how to do this.

First step: Download the dataset

dset = BitsMNIST.Datasets.mnist()

Sampling

After downloading the dataset, you'll have to take a sample containing only zeros and ones. Happily, there's a sample function that extracts these examples in a 50/50 proportion.

Second step: Sampling

sx, sy = BitsMNIST.ZeroOne.sample(dset["train_x"], dset["train_y"], 0.01)
#0.01 is the fraction of the entire dataset
#Since the dataset has 60000 examples, 0.01*60000 yields 600 examples

Defining your model

Through TinyML [3] you can use bit layers to define your bit neural network. You can, and should, use it together with Flux [4].

model = Chain(BitDense(784, 800), BitDense(800, 2, true, σ=sigmoid))
#784 is the number of pixels of an example
#800 is the number of hidden neurons
#2 is the number of classes we want to predict as outputs (0 or 1).

You don't have to import these tools yourself; they are reexported by this project for you to work with.

Training Setup

There is a difficulty regarding BNN training: the steps of gradient-based training are too small to adjust the parameters, so an alternative training method must be used. Remember, BNN parameters can only assume 0 or 1, which means, for example, that an adjustment of 0.1 cannot actually be applied.

Gradient

In fact, modifying the gradient to approximate its steps into bits is a possibility [1] [2]. However, this approach is not yet implemented.

Reinforcement

As an alternative, reinforcement learning turns out to be a possibility, since the search space for these networks is dramatically reduced.
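
To make the idea concrete, a genetic algorithm can explore this space through bit-flip mutations, roughly like the sketch below (mutate is a hypothetical helper; TinyML's actual operators may differ):

#Flip each bit with probability rate; this replaces the gradient step
mutate(w::BitVector, rate=0.05) = w .⊻ (rand(length(w)) .< rate)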

Evaluation function

The first step towards reinforcement learning is to define an evaluation function in order to distinguish when one model is better suited than another. Currently, you can do this by using one of two functions.

score_fitness = BitsMNIST.ZeroOne.Reinforcement.generate_score_fitness(sx, sy)

This first function increases the score of a model by summing the value of the respective output whenever it predicts correctly: if predicted_correctly, then score += max(model_output).
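
Conceptually, it behaves like this sketch (illustrative; the actual function is generated from the sample sx, sy):

function score_fitness(model, sx, sy)
    score = 0.0
    for (x, y) in zip(sx, sy)
        out = model(x)
        if argmax(out) - 1 == y #output indices 1 and 2 map to classes 0 and 1
            score += maximum(out)
        end
    end
    return score
end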

mcc_fitness = BitsMNIST.ZeroOne.Reinforcement.generate_mcc_fitness(sx, sy)

This second function scores a model by applying the Matthews correlation coefficient (MCC) [5].
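
The MCC itself follows the standard confusion-matrix formula [5]:

#Standard MCC over true/false positives and negatives
function mcc(tp, tn, fp, fn)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return denom == 0 ? 0.0 : (tp * tn - fp * fn) / denom
end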

TrainingSet

Another required step before training is to configure our genetic algorithm. We do this by creating a TinyML Genetic.TrainingSet:

tset = Genetic.TrainingSet(
	model, #The model we are going to train
	model.layers, #The layers we want it to optimize
	mutationRate=0.05) #Mutation rate reduced to 0.05 for this problem

Other properties can also be configured, but this is enough for what we want to test in this example. Check out these settings on the TinyML page [3].

Training (The hardest part)

After all these steps we can finally train our model.

Genetic.train!(tset, genNumber=10)

The most boring part is waiting for it to finish...

Statistics

Checklist: model defined - true, model trained - true. Wait, how can we say our model is trained without a metric? In this case, we can call the functions inside the Statistics module to test how well our model is performing. Let's use the ZeroOne example to try this out.

Error

An easy metric to visualize is the error. The error is defined as the fraction of wrong predictions over the total number of examples.

BitsMNIST.Statistics.error(model, sx, sy)
# This computes the fraction of wrong predictions on the sample.
0.05333333333333334
#This means 5.33% of the 600 examples were predicted wrongly.
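
The computation behind it is essentially this one-liner (error_rate is a hypothetical helper, not the package's code):

error_rate(model, sx, sy) = count(argmax(model(x)) - 1 != y for (x, y) in zip(sx, sy)) / length(sy)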

IO

Let's say you liked your model so much that you want to send it to a friend. Well, that's possible through the IO module.

Save

BitsMNIST.IO.save("./mymodel.jld2", model, tset)

Load

mymodel = BitsMNIST.IO.load("./mymodel.jld2")
Dict{String, Any} with 2 entries:
  "model" => Chain(BitDense(784, 800), BitDense(800, 2, σ=σ))
  "trainingset" => TrainingSet(popSize=100)

References

[1] Binary Neural Networks: A Survey

[2] XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks

[3] TinyML

[4] Flux

[5] Matthews correlation coefficient
