Building Neural Network from Scratch.

1 : We will create a neural network

image

The network in the image consists of:

  • Inputs : i1, i2
  • First hidden layer before activation : h1,h2
  • First hidden layer after activation : a_h1, a_h2
  • Output before activation function : o1, o2
  • Output after activation : a_o1, a_o2
  • Weights : w1, w2, w3, w4, w5, w6, w7, w8
  • Error from first output : E1
  • Error from second output : E2
  • Total error : E_total
  • Target : we have two targets, t1 = 0.01 and t2 = 0.99. A target is the desired value of an output; a_o1 is compared against t1 and a_o2 against t2.

Note: The network has no bias terms. The activation function is applied in every layer, not just the hidden layer.

Formulas :

  • h1 : w1*i1 + w2*i2
  • h2 : w3*i1 + w4*i2
  • a_h1 : σ(h1)
  • a_h2 : σ(h2)
  • o1 : w5*a_h1 + w6*a_h2
  • o2 : w7*a_h1 + w8*a_h2
  • a_o1 : σ(o1)
  • a_o2 : σ(o2)
  • E1 = ½*(t1-a_o1)²
  • E2 = ½*(t2-a_o2)²
  • E_total = E1 + E2

2 : Calculate all values and put them in the table.

image

Forward Propagation

Values Given,

  • inputs : i1 = 0.05, i2 = 0.1
  • targets : t1 = 0.01, t2 = 0.99
  • Weights : w1 = 0.15, w2 = 0.2, w3 = 0.25, w4 = 0.3, w5 = 0.4, w6 = 0.45, w7 = 0.5, w8 = 0.55

Values to be calculated,

  • We will calculate these values in our Excel sheet using the formulas above.
  • h1, h2, a_h1, a_h2, o1, o2 , a_o1, a_o2 , E1, E2, E_total
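The same forward pass can be reproduced outside the spreadsheet. The following is a minimal Python sketch, assuming σ is the logistic sigmoid (the text does not define σ explicitly, but the derivative σ(o1) * (1 - σ(o1)) used in the backward pass matches it). The helper name `sigmoid` and the printing loop are illustrative choices, not part of the original.

```python
import math

def sigmoid(x):
    # Assumed activation: logistic sigmoid, 1 / (1 + e^(-x))
    return 1.0 / (1.0 + math.exp(-x))

# Values given in the text
i1, i2 = 0.05, 0.1
t1, t2 = 0.01, 0.99
w1, w2, w3, w4 = 0.15, 0.2, 0.25, 0.3
w5, w6, w7, w8 = 0.4, 0.45, 0.5, 0.55

# Hidden layer, before and after activation
h1 = w1 * i1 + w2 * i2
h2 = w3 * i1 + w4 * i2
a_h1, a_h2 = sigmoid(h1), sigmoid(h2)

# Output layer, before and after activation
o1 = w5 * a_h1 + w6 * a_h2
o2 = w7 * a_h1 + w8 * a_h2
a_o1, a_o2 = sigmoid(o1), sigmoid(o2)

# Per-output errors and total error
E1 = 0.5 * (t1 - a_o1) ** 2
E2 = 0.5 * (t2 - a_o2) ** 2
E_total = E1 + E2

for name, value in [("h1", h1), ("h2", h2), ("a_h1", a_h1), ("a_h2", a_h2),
                    ("o1", o1), ("o2", o2), ("a_o1", a_o1), ("a_o2", a_o2),
                    ("E1", E1), ("E2", E2), ("E_total", E_total)]:
    print(f"{name:8s} = {value:.6f}")
```

With the given values, h1 = 0.0275 and h2 = 0.0425, and E_total comes out at about 0.2425, which should match the first row of the table.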

So far we have been calculating forward propagation. Now we will calculate backward propagation.

Backward propagation

To do backward propagation, we will start with ∂E_total/∂w5. E2 does not depend on w5, so it drops out of the derivative and only E1 remains. w5 does not act on E1 directly: o1 and a_o1 sit in between, so we follow the route from w5 through o1 and a_o1 to E1.

  • ∂E_total/∂w5 = ∂(E1 + E2)/∂w5
  • ∂E_total/∂w5 = ∂(E1)/∂w5

Chain rule,

∂(E1)/∂w5 = ∂(E1)/∂(a_o1) * ∂(a_o1)/∂(o1) * ∂(o1)/∂w5

Now we will work out each of the three factors on the right-hand side.

1.

  • ∂(E1)/∂(a_o1) = ∂(½*(t1-a_o1)²) / ∂(a_o1)
  • ∂(E1)/∂(a_o1) = (t1 - a_o1) * (-1)
  • ∂(E1)/∂(a_o1) = a_o1 - t1

2.

  • ∂(a_o1)/∂(o1) = ∂(σ(o1)) / ∂(o1)
  • ∂(a_o1)/∂(o1) = σ(o1) * (1 - σ(o1))
  • ∂(a_o1)/∂(o1) = a_o1 * (1 - a_o1)
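The step from σ(o1) to σ(o1) * (1 - σ(o1)) relies on a standard identity. Assuming σ is the logistic sigmoid (an assumption, since the text does not spell out its definition), the derivation can be sketched as:

```latex
\begin{align}
\sigma(x) &= \frac{1}{1 + e^{-x}} \\
\frac{d\sigma}{dx} &= \frac{e^{-x}}{\left(1 + e^{-x}\right)^{2}}
  = \frac{1}{1 + e^{-x}} \cdot \frac{e^{-x}}{1 + e^{-x}} \\
  &= \sigma(x)\,\bigl(1 - \sigma(x)\bigr)
\end{align}
```

The last line uses 1 - σ(x) = e^{-x} / (1 + e^{-x}). Substituting x = o1 and σ(o1) = a_o1 gives a_o1 * (1 - a_o1).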

3.

  • ∂(o1)/∂w5 = ∂(w5 * a_h1 + w6 * a_h2) / ∂w5
  • ∂(o1)/∂w5 = a_h1

Now, the equation will be:

  • ∂E_total / ∂w5 = (a_o1 - t1) * a_o1 * (1 - a_o1) * a_h1

Similarly, we will find the equations for w6, w7, w8.

  • ∂E_total / ∂w6 = (a_o1 - t1) * a_o1 * (1 - a_o1) * a_h2
  • ∂E_total / ∂w7 = (a_o2 - t2) * a_o2 * (1 - a_o2) * a_h1
  • ∂E_total / ∂w8 = (a_o2 - t2) * a_o2 * (1 - a_o2) * a_h2
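One way to gain confidence in the four output-layer formulas is to compare them with a numerical derivative. The sketch below assumes the logistic sigmoid for σ and the values given earlier; the function names `total_error` and `output_gradients` and the step size `eps` are hypothetical choices made for this illustration.

```python
import math

def sigmoid(x):
    # Assumed activation: logistic sigmoid
    return 1.0 / (1.0 + math.exp(-x))

def total_error(w, a_h1, a_h2, t1=0.01, t2=0.99):
    """E_total as a function of the output-layer weights w = [w5, w6, w7, w8]."""
    w5, w6, w7, w8 = w
    a_o1 = sigmoid(w5 * a_h1 + w6 * a_h2)
    a_o2 = sigmoid(w7 * a_h1 + w8 * a_h2)
    return 0.5 * (t1 - a_o1) ** 2 + 0.5 * (t2 - a_o2) ** 2

def output_gradients(w, a_h1, a_h2, t1=0.01, t2=0.99):
    """Analytic gradients for w5..w8, following the four formulas in the text."""
    w5, w6, w7, w8 = w
    a_o1 = sigmoid(w5 * a_h1 + w6 * a_h2)
    a_o2 = sigmoid(w7 * a_h1 + w8 * a_h2)
    d1 = (a_o1 - t1) * a_o1 * (1 - a_o1)  # factor shared by w5 and w6
    d2 = (a_o2 - t2) * a_o2 * (1 - a_o2)  # factor shared by w7 and w8
    return [d1 * a_h1, d1 * a_h2, d2 * a_h1, d2 * a_h2]

# Hidden activations from the forward pass with the given inputs and w1..w4
a_h1 = sigmoid(0.15 * 0.05 + 0.2 * 0.1)
a_h2 = sigmoid(0.25 * 0.05 + 0.3 * 0.1)
w = [0.4, 0.45, 0.5, 0.55]

analytic = output_gradients(w, a_h1, a_h2)

# Central finite differences: nudge one weight at a time
eps = 1e-6
numeric = []
for k in range(4):
    up, down = list(w), list(w)
    up[k] += eps
    down[k] -= eps
    numeric.append((total_error(up, a_h1, a_h2)
                    - total_error(down, a_h1, a_h2)) / (2 * eps))

for name, a, n in zip(["w5", "w6", "w7", "w8"], analytic, numeric):
    print(f"dE/d{name}: analytic={a:+.8f} numeric={n:+.8f}")
```

The gradients for w5 and w6 are positive because a_o1 is above its target t1, while those for w7 and w8 are negative because a_o2 is below t2.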

Now, we will calculate values for w1, w2, w3, w4.

Before looking at w1, we will look at a_h1, because w1 reaches E_total through two routes: one via a_o1 and one via a_o2. So we will first calculate ∂E_total/∂a_h1 and then move on to w1.

∂E_total / ∂a_h1 = ∂(E1 + E2) / ∂(a_h1)

This time we are having E1 as well as E2, as we have two routes.

  • ∂(E1) / ∂(a_h1) = ∂E1/∂a_o1 * ∂a_o1/∂o1 * ∂o1/∂a_h1
  • ∂(E1) / ∂(a_h1) = (a_o1 - t1) * a_o1 * (1 - a_o1) * w5
  • ∂(E2) / ∂(a_h1) = ∂E2/∂a_o2 * ∂a_o2/∂o2 * ∂o2/∂a_h1
  • ∂(E2) / ∂(a_h1) = (a_o2 - t2) * a_o2 * (1 - a_o2) * w7
  • ∂E_total / ∂a_h1 = ∂(E1 + E2) / ∂(a_h1)
  • ∂E_total / ∂a_h1 = (a_o1 - t1) * a_o1 * (1 - a_o1) * w5 + (a_o2 - t2) * a_o2 * (1 - a_o2) * w7

Similarly, we will calculate ∂E_total / ∂a_h2:

  • ∂E_total / ∂a_h2 = (a_o1 - t1) * a_o1 * (1 - a_o1) * w6 + (a_o2 - t2) * a_o2 * (1 - a_o2) * w8
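The two-route sums for a_h1 and a_h2 can be checked the same way, by treating the hidden activations as free variables and differentiating numerically. This is a sketch under the same assumptions as before (logistic sigmoid, the given weights and targets); `error_from_hidden` and `hidden_activation_gradients` are hypothetical helper names.

```python
import math

def sigmoid(x):
    # Assumed activation: logistic sigmoid
    return 1.0 / (1.0 + math.exp(-x))

t1, t2 = 0.01, 0.99
w5, w6, w7, w8 = 0.4, 0.45, 0.5, 0.55

def error_from_hidden(a_h1, a_h2):
    """E_total viewed as a function of the hidden activations only."""
    a_o1 = sigmoid(w5 * a_h1 + w6 * a_h2)
    a_o2 = sigmoid(w7 * a_h1 + w8 * a_h2)
    return 0.5 * (t1 - a_o1) ** 2 + 0.5 * (t2 - a_o2) ** 2

def hidden_activation_gradients(a_h1, a_h2):
    """Sum of the E1 route and the E2 route for each hidden activation."""
    a_o1 = sigmoid(w5 * a_h1 + w6 * a_h2)
    a_o2 = sigmoid(w7 * a_h1 + w8 * a_h2)
    d1 = (a_o1 - t1) * a_o1 * (1 - a_o1)  # route through o1
    d2 = (a_o2 - t2) * a_o2 * (1 - a_o2)  # route through o2
    dE_da_h1 = d1 * w5 + d2 * w7
    dE_da_h2 = d1 * w6 + d2 * w8
    return dE_da_h1, dE_da_h2

# Hidden activations from the forward pass with the given inputs and w1..w4
a_h1 = sigmoid(0.15 * 0.05 + 0.2 * 0.1)
a_h2 = sigmoid(0.25 * 0.05 + 0.3 * 0.1)

dE_da_h1, dE_da_h2 = hidden_activation_gradients(a_h1, a_h2)

# Central finite differences on each activation in turn
eps = 1e-6
num_h1 = (error_from_hidden(a_h1 + eps, a_h2)
          - error_from_hidden(a_h1 - eps, a_h2)) / (2 * eps)
num_h2 = (error_from_hidden(a_h1, a_h2 + eps)
          - error_from_hidden(a_h1, a_h2 - eps)) / (2 * eps)

print(f"dE/da_h1: analytic={dE_da_h1:+.8f} numeric={num_h1:+.8f}")
print(f"dE/da_h2: analytic={dE_da_h2:+.8f} numeric={num_h2:+.8f}")
```

Dropping either route (for example, keeping only the w5 term for a_h1) makes the analytic and numeric values disagree, which is a quick way to see why both E1 and E2 are needed here.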

Now we will calculate ∂E_total/∂w1

  • ∂E_total/∂w1 = (∂E1/∂a_o1 * ∂a_o1/∂o1 * ∂o1/∂a_h1 + ∂E2/∂a_o2 * ∂a_o2/∂o2 * ∂o2/∂a_h1) * ∂a_h1/∂h1 * ∂h1/∂w1
  • ∂E_total/∂w1 = ∂E_total/∂a_h1 * ∂a_h1/∂h1 * ∂h1/∂w1
  • ∂E_total/∂w1 = ∂E_total/∂a_h1 * a_h1 * (1 - a_h1) * ∂h1/∂w1
  • ∂E_total/∂w1 = ∂E_total/∂a_h1 * a_h1 * (1 - a_h1) * i1

Similarly calculate for w2, w3 and w4,

  • ∂E_total/∂w2 = ∂E_total/∂a_h1 * a_h1 * (1 - a_h1) * i2
  • ∂E_total/∂w3 = ∂E_total/∂a_h2 * a_h2 * (1 - a_h2) * i1
  • ∂E_total/∂w4 = ∂E_total/∂a_h2 * a_h2 * (1 - a_h2) * i2
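Putting the pieces together, the hidden-layer gradients can be computed and verified against the full network. The following sketch assumes the logistic sigmoid and the given values; `forward`, `hidden_weight_gradients`, and `numeric_gradient` are illustrative names, and the weight list order [w1, ..., w8] is an assumed layout.

```python
import math

def sigmoid(x):
    # Assumed activation: logistic sigmoid
    return 1.0 / (1.0 + math.exp(-x))

I1, I2 = 0.05, 0.1
T1, T2 = 0.01, 0.99

def forward(w):
    """Forward pass for w = [w1, ..., w8]; returns activations and E_total."""
    w1, w2, w3, w4, w5, w6, w7, w8 = w
    a_h1 = sigmoid(w1 * I1 + w2 * I2)
    a_h2 = sigmoid(w3 * I1 + w4 * I2)
    a_o1 = sigmoid(w5 * a_h1 + w6 * a_h2)
    a_o2 = sigmoid(w7 * a_h1 + w8 * a_h2)
    e_total = 0.5 * (T1 - a_o1) ** 2 + 0.5 * (T2 - a_o2) ** 2
    return a_h1, a_h2, a_o1, a_o2, e_total

def hidden_weight_gradients(w):
    """Analytic gradients for w1..w4, following the formulas in the text."""
    w5, w6, w7, w8 = w[4:]
    a_h1, a_h2, a_o1, a_o2, _ = forward(w)
    d1 = (a_o1 - T1) * a_o1 * (1 - a_o1)
    d2 = (a_o2 - T2) * a_o2 * (1 - a_o2)
    dE_da_h1 = d1 * w5 + d2 * w7
    dE_da_h2 = d1 * w6 + d2 * w8
    g1 = dE_da_h1 * a_h1 * (1 - a_h1)  # shared by w1 and w2
    g2 = dE_da_h2 * a_h2 * (1 - a_h2)  # shared by w3 and w4
    return [g1 * I1, g1 * I2, g2 * I1, g2 * I2]

def numeric_gradient(w, k, eps=1e-6):
    """Central finite difference of E_total with respect to weight index k."""
    up, down = list(w), list(w)
    up[k] += eps
    down[k] -= eps
    return (forward(up)[-1] - forward(down)[-1]) / (2 * eps)

weights = [0.15, 0.2, 0.25, 0.3, 0.4, 0.45, 0.5, 0.55]
analytic = hidden_weight_gradients(weights)
numeric = [numeric_gradient(weights, k) for k in range(4)]

for name, a, n in zip(["w1", "w2", "w3", "w4"], analytic, numeric):
    print(f"dE/d{name}: analytic={a:+.8f} numeric={n:+.8f}")
```

Because i2 is twice i1, the gradient for w2 is exactly twice that for w1, and likewise for w4 and w3.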

That completes the derivation. Using these formulas we can fill in every column of the table. Once the table is complete, the error column shows that the error is decreasing. Next we will try different learning rates.

We will see how the learning rate affects the rate of convergence. The learning rate scales each weight update (in standard gradient descent, each weight w is replaced by w - η * ∂E_total/∂w, where η is the learning rate). If the learning rate is too small, training is slow; if it is too high, the loss can diverge instead of decreasing. That is why we need to find a suitable learning rate.
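The effect of the learning rate can be explored with a short training loop. This is a sketch, not the spreadsheet itself: it assumes plain gradient descent (each weight moves by the learning rate times its gradient), the logistic sigmoid, and the given starting values. The list of learning rates and the number of steps are illustrative choices, not taken from the original.

```python
import math

def sigmoid(x):
    # Assumed activation: logistic sigmoid
    return 1.0 / (1.0 + math.exp(-x))

I1, I2 = 0.05, 0.1
T1, T2 = 0.01, 0.99
INITIAL_WEIGHTS = [0.15, 0.2, 0.25, 0.3, 0.4, 0.45, 0.5, 0.55]

def step(w, lr):
    """One forward and backward pass; returns (E_total before update, new weights)."""
    w1, w2, w3, w4, w5, w6, w7, w8 = w
    a_h1 = sigmoid(w1 * I1 + w2 * I2)
    a_h2 = sigmoid(w3 * I1 + w4 * I2)
    a_o1 = sigmoid(w5 * a_h1 + w6 * a_h2)
    a_o2 = sigmoid(w7 * a_h1 + w8 * a_h2)
    e_total = 0.5 * (T1 - a_o1) ** 2 + 0.5 * (T2 - a_o2) ** 2

    d1 = (a_o1 - T1) * a_o1 * (1 - a_o1)
    d2 = (a_o2 - T2) * a_o2 * (1 - a_o2)
    g1 = (d1 * w5 + d2 * w7) * a_h1 * (1 - a_h1)
    g2 = (d1 * w6 + d2 * w8) * a_h2 * (1 - a_h2)
    grads = [g1 * I1, g1 * I2, g2 * I1, g2 * I2,
             d1 * a_h1, d1 * a_h2, d2 * a_h1, d2 * a_h2]

    # Assumed update rule: plain gradient descent
    new_w = [wi - lr * gi for wi, gi in zip(w, grads)]
    return e_total, new_w

def train(lr, steps=100):
    """Run `steps` updates from the initial weights; returns the error history."""
    w = list(INITIAL_WEIGHTS)
    history = []
    for _ in range(steps):
        e_total, w = step(w, lr)
        history.append(e_total)
    return history

final_errors = {}
for lr in [0.1, 0.2, 0.5, 0.8, 1.0, 2.0]:
    history = train(lr)
    final_errors[lr] = history[-1]
    print(f"lr={lr:<4} first E_total={history[0]:.6f} last E_total={history[-1]:.6f}")
```

Comparing the last E_total across rates shows how much faster the error falls as the learning rate grows within this range. Plotting each `history` list gives one error curve per learning rate.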

image

image

image

image

image

image

That covers building a neural network from scratch.

Contributor: priyasharma2427
