dbn's Introduction

DBN (Deep Belief Network)

When trained on a set of examples without supervision, a DBN can learn to probabilistically reconstruct its inputs. The layers then act as feature detectors. After this learning step, a DBN can be further trained with supervision to perform classification. DBNs can be viewed as a composition of simple, unsupervised networks such as restricted Boltzmann machines (RBMs) or autoencoders, where each sub-network's hidden layer serves as the visible layer for the next. An RBM is an undirected, generative energy-based model with a "visible" input layer and a hidden layer, with connections between but not within layers. This composition leads to a fast, layer-by-layer unsupervised training procedure, where contrastive divergence is applied to each sub-network in turn, starting from the "lowest" pair of layers (the lowest visible layer is a training set).

Training :-

The training method for RBMs proposed by Geoffrey Hinton for use with training "Product of Experts" models is called contrastive divergence (CD). CD provides an approximation to the maximum likelihood method that would ideally be applied for learning the weights. In training a single RBM, weight updates follow the gradient of the log-likelihood (gradient ascent on $\log p(v)$, equivalently gradient descent on its negative) via the following equation:

$$w_{ij}(t+1) = w_{ij}(t) + \eta \, \frac{\partial \log p(v)}{\partial w_{ij}}$$

where $p(v)$ is the probability of a visible vector, given by $p(v) = \frac{1}{Z} \sum_h e^{-E(v,h)}$. $Z$ is the partition function (used for normalizing) and $E(v,h)$ is the energy function assigned to the state of the network. A lower energy indicates the network is in a more "desirable" configuration. The gradient $\partial \log p(v) / \partial w_{ij}$ has the simple form $\langle v_i h_j \rangle_{\text{data}} - \langle v_i h_j \rangle_{\text{model}}$, where $\langle \cdots \rangle_p$ represents an average with respect to distribution $p$. The issue arises in sampling $\langle v_i h_j \rangle_{\text{model}}$, because this requires extended alternating Gibbs sampling. CD replaces this step by running alternating Gibbs sampling for $n$ steps ($n = 1$ performs well). After $n$ steps, the data are sampled and that sample is used in place of $\langle v_i h_j \rangle_{\text{model}}$.

The CD procedure works as follows:

• Initialize the visible units to a training vector.

• Update the hidden units in parallel given the visible units:
    ◦ $p(h_j = 1 \mid V) = \sigma(b_j + \sum_i v_i w_{ij})$

where $\sigma$ is the sigmoid function and $b_j$ is the bias of $h_j$.

• Update the visible units in parallel given the hidden units:
    ◦ $p(v_i = 1 \mid H) = \sigma(a_i + \sum_j h_j w_{ij})$

where $a_i$ is the bias of $v_i$. This step is known as the reconstruction step.

• Re-update the hidden units in parallel given the reconstructed visible units, using the same equation as in step 2.
• Perform the weight update:
    ◦ $\Delta w_{ij} \propto \langle v_i h_j \rangle_{\text{data}} - \langle v_i h_j \rangle_{\text{reconstruction}}$
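
The five steps above can be sketched in plain Python for a single training vector. This is a minimal illustration, not the repository's implementation: the function names (`hidden_probs`, `visible_probs`, `cd1_update`), the learning rate `lr`, and the list-of-lists weight layout `W[i][j]` (visible unit `i`, hidden unit `j`) are all assumptions made for the example. Two choices go beyond the text: the bias updates (the steps above only describe the weight update) and the use of hidden probabilities rather than sampled states when forming the pairwise statistics.

```python
import math
import random


def sigmoid(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def sample_bernoulli(probs, rng):
    """Draw a 0/1 state for each unit from its activation probability."""
    return [1 if rng.random() < p else 0 for p in probs]


def hidden_probs(v, W, b):
    """p(h_j = 1 | v) = sigmoid(b_j + sum_i v_i * w_ij)."""
    return [sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(b))]


def visible_probs(h, W, a):
    """p(v_i = 1 | h) = sigmoid(a_i + sum_j h_j * w_ij)."""
    return [sigmoid(a[i] + sum(h[j] * W[i][j] for j in range(len(h))))
            for i in range(len(a))]


def cd1_update(v0, W, a, b, lr, rng):
    """Apply one CD-1 update in place for a single training vector v0."""
    ph0 = hidden_probs(v0, W, b)       # step 2: hidden units given the data
    h0 = sample_bernoulli(ph0, rng)
    pv1 = visible_probs(h0, W, a)      # step 3: reconstruction
    v1 = sample_bernoulli(pv1, rng)
    ph1 = hidden_probs(v1, W, b)       # step 4: hidden units given the reconstruction
    for i in range(len(a)):
        for j in range(len(b)):
            # step 5: <v_i h_j>_data - <v_i h_j>_reconstruction
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
    # Bias updates: an assumption of this sketch, not stated in the steps above.
    for i in range(len(a)):
        a[i] += lr * (v0[i] - v1[i])
    for j in range(len(b)):
        b[j] += lr * (ph0[j] - ph1[j])


if __name__ == "__main__":
    rng = random.Random(0)
    n_visible, n_hidden = 6, 2
    W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)] for _ in range(n_visible)]
    a = [0.0] * n_visible
    b = [0.0] * n_hidden
    data = [[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]]  # toy binary patterns
    for _ in range(200):
        for v in data:
            cd1_update(v, W, a, b, 0.1, rng)
    print(hidden_probs(data[0], W, b))
```

Step 1 (initializing the visible units to a training vector) corresponds to passing `v0` into `cd1_update`. In practice the update would typically be averaged over a mini-batch and vectorized; the nested loops here are kept only to mirror the per-unit equations.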

Once an RBM is trained, another RBM is "stacked" atop it, taking its input from the final trained layer. The new visible layer is initialized to a training vector, and values for the units in the already-trained layers are assigned using the current weights and biases. The new RBM is then trained with the procedure above. This whole process is repeated until the desired stopping criterion is met.
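
The stacking procedure can be sketched as a greedy loop: train one RBM, push the training data up through it, and use the result as the visible data for the next RBM. The sketch below is illustrative only. The names `train_rbm`, `train_dbn`, and `transform`, the hyperparameters, and the choice to feed hidden probabilities (rather than sampled binary states) to the next layer are assumptions of the example, and a fixed number of epochs stands in for the unspecified stopping criterion.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def up(v, W, b):
    """Hidden activation probabilities p(h_j = 1 | v)."""
    return [sigmoid(b[j] + sum(vi * row[j] for vi, row in zip(v, W)))
            for j in range(len(b))]


def down(h, W, a):
    """Visible activation probabilities p(v_i = 1 | h)."""
    return [sigmoid(a[i] + sum(hj * wij for hj, wij in zip(h, W[i])))
            for i in range(len(a))]


def sample(probs, rng):
    """Draw a 0/1 state for each unit."""
    return [1 if rng.random() < p else 0 for p in probs]


def train_rbm(data, n_hidden, epochs, lr, rng):
    """Train one RBM with CD-1 and return its parameters (W, a, b)."""
    n_visible = len(data[0])
    W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)] for _ in range(n_visible)]
    a = [0.0] * n_visible
    b = [0.0] * n_hidden
    for _ in range(epochs):
        for v0 in data:
            ph0 = up(v0, W, b)
            v1 = sample(down(sample(ph0, rng), W, a), rng)  # reconstruction
            ph1 = up(v1, W, b)
            for i in range(n_visible):
                a[i] += lr * (v0[i] - v1[i])
                for j in range(n_hidden):
                    W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
            for j in range(n_hidden):
                b[j] += lr * (ph0[j] - ph1[j])
    return W, a, b


def train_dbn(data, layer_sizes, epochs=50, lr=0.1, seed=0):
    """Greedily train a stack of RBMs, one per entry in layer_sizes."""
    rng = random.Random(seed)
    layers = []
    inputs = data
    for n_hidden in layer_sizes:
        W, a, b = train_rbm(inputs, n_hidden, epochs, lr, rng)
        layers.append((W, a, b))
        # The trained hidden layer becomes the visible layer of the next RBM.
        inputs = [up(v, W, b) for v in inputs]
    return layers


def transform(v, layers):
    """Propagate one input vector up through every trained layer."""
    for W, _a, b in layers:
        v = up(v, W, b)
    return v


if __name__ == "__main__":
    data = [[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]]  # toy binary patterns
    dbn = train_dbn(data, layer_sizes=[4, 2])
    print(transform(data[0], dbn))
```

Each RBM is trained in isolation, so lower layers are frozen once the next one starts training. The output of `transform` is the top-level feature vector that a supervised classifier could be trained on afterwards.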

dbn's People

Contributors

rahul-singh98
