Comments (5)
The code always builds a Theano graph for both the dropout and non-dropout versions of the network. The choice between dropout and no dropout at training time is made here https://github.com/mdenil/dropout/blob/master/mlp.py#L260 and here https://github.com/mdenil/dropout/blob/master/mlp.py#L314
For an explanation of dropout itself, read the arXiv paper linked in the README.
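To make that pattern concrete, here is a minimal, self-contained sketch. The two cost expressions are stand-ins (in mlp.py they are the negative log likelihoods of the non-dropout and dropout output layers); the key point is that dropout is a plain Python bool, so the choice is baked into the graph once, at build time:

    import numpy as np
    import theano
    import theano.tensor as T

    x = T.matrix('x')
    W = theano.shared(np.ones((3, 2), dtype=theano.config.floatX), name='W')

    # Stand-ins for the two costs; both expressions always exist in the graph.
    cost = T.sum(T.dot(x, W) ** 2)      # left pathway (no dropout)
    dropout_cost = T.sum(T.dot(x, W))   # right pathway (dropout applied)

    # An ordinary Python bool: the selection happens while the graph is
    # being built, not at run time (cf. mlp.py#L260).
    dropout = True
    gparam = T.grad(dropout_cost if dropout else cost, W)
    train_step = theano.function([x], gparam)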
I think I should've phrased the question better. Even in your plain backprop network, the hidden layers share weights with the dropout layers, as seen here: https://github.com/mdenil/dropout/blob/master/mlp.py#L130. Is there some motive behind this? Normally the backprop hidden layers would initialize their own weights automatically, without W and b being passed in manually from the dropout layers, right?
The dropout and non-dropout layers share weights so that the same network can be evaluated both with and without dropout. If you compute dropout_cost you get a forward pass with dropout applied, but if you compute cost you get a forward pass through the same network with no dropout (and with appropriately scaled weights).
The computational graph looks like this:
    cost/errors            dropout_cost/dropout_errors
         |                              |
    HiddenLayers           DropoutHiddenLayers   <--- these share weights
           \___________   ___________/
                       Input
This means that (when dropout=True) we can differentiate with respect to the right pathway to get gradients (https://github.com/mdenil/dropout/blob/master/mlp.py#L260), but we can compute test error using the left pathway (https://github.com/mdenil/dropout/blob/master/mlp.py#L239). When dropout=False the right pathway isn't used at all, but the code still builds the whole graph anyway.
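Here is a minimal, self-contained Theano sketch of that weight sharing for a single layer. The sizes, the tanh nonlinearity, and the 0.5 rate are illustrative choices, not the repo's code; the HiddenLayer/DropoutHiddenLayer classes in mlp.py implement the same idea:

    import numpy as np
    import theano
    import theano.tensor as T
    from theano.tensor.shared_randomstreams import RandomStreams

    rng = np.random.RandomState(1234)
    srng = RandomStreams(rng.randint(999999))

    x = T.matrix('x')
    p = 0.5  # dropout rate on this layer's inputs (illustrative)

    # One shared parameter set underlies both pathways.
    W = theano.shared(np.asarray(rng.uniform(-0.1, 0.1, (784, 500)),
                                 dtype=theano.config.floatX), name='W')
    b = theano.shared(np.zeros(500, dtype=theano.config.floatX), name='b')

    # Right pathway (training): drop inputs with probability p.
    mask = srng.binomial(n=1, p=1 - p, size=x.shape)
    dropout_input = x * T.cast(mask, theano.config.floatX)
    dropout_output = T.tanh(T.dot(dropout_input, W) + b)

    # Left pathway (test): the same W, scaled by the retention probability
    # 1 - p, so expected pre-activations match the dropout pathway.
    output = T.tanh(T.dot(x, W * (1 - p)) + b)

    # Differentiating through the right pathway updates the shared W...
    grad_W = T.grad(dropout_output.sum(), W)
    # ...and the left pathway sees those updates automatically.

Because both pathways read from the same shared variable, training through the dropout graph and evaluating through the plain graph is just a matter of which expression you compile into a function.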
Thanks for the explanation! It's definitely more comprehensible now. What's the reason, though, for letting someone who is using dropout compute costs from the left pathway as well?
I use the left pathway to compute test error.
Related Issues (14)
- no bias in mlp.py HOT 2
- About the Resample Issue HOT 1
- dropout training doesn't work with over 3 hidden layers
- Dropout rate should be set to 0 if not using dropout HOT 3
- Do all the weights multiply the included probability p during testing? HOT 1
- Why set the W by this formula W=layer.W / (1 - dropout_rates[layer_counter]) in testing? HOT 1
- License HOT 1
- Incorrect weight scaling on inputs
- Momentum bug
- Constrain weight matrix columns instead of rows HOT 1
- Random dropout at each mini-batch? HOT 8
- Momentum again HOT 2
- dropping output units rather than connections HOT 1