Comments (3)
Hi @zeka0, you're right - the implementation is the most basic version. As stated in the paper, a couple of extensions are possible:
- Frame-wise batch normalization before or after Recurrent + ReLU
- Sequence-wise batch normalization before or after Recurrent + ReLU
- Residual connections between layers
Not all of these are feasible to incorporate into the cell code. For example, sequence-wise batch normalization needs access to the activations of all time steps. However, in TensorFlow the cells (LSTMCell, BasicRNNCell, etc.) compute the activation for only one time step at a time. See examples/sequential_mnist.py for an example of how to implement sequence-wise batch normalization.
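To make the distinction concrete, here is a framework-agnostic NumPy sketch (the `[batch, time, features]` layout and the helper names are my own, not from the repo): frame-wise BN only needs the batch axis, so it could in principle run inside a per-step cell, while sequence-wise BN pools statistics over batch *and* time and therefore needs all step activations up front.

```python
import numpy as np

def frame_wise_bn(x, eps=1e-5):
    """Normalize each time step separately: statistics are taken
    over the batch axis only, so they are computable one step at
    a time. x has shape [batch, time, features]."""
    mean = x.mean(axis=0, keepdims=True)       # [1, time, features]
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def sequence_wise_bn(x, eps=1e-5):
    """Share statistics across the whole sequence: mean/variance
    over batch AND time axes, which requires the activations of
    all time steps -- hence it cannot live inside the cell."""
    mean = x.mean(axis=(0, 1), keepdims=True)  # [1, 1, features]
    var = x.var(axis=(0, 1), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

x = np.random.randn(8, 100, 32)
y = sequence_wise_bn(x)
```

(Learnable scale/offset parameters are omitted for brevity.)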
Implementing residual connections should work with the current implementation, as that doesn't depend on the actual cell (it could also be an LSTM cell, for example). tf.nn.rnn_cell.ResidualWrapper provides a good solution.
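Conceptually, all a residual wrapper does is add the cell's input to its output at each step. A minimal NumPy sketch of one wrapped step (the plain ReLU recurrence below stands in for the real cell; all names are illustrative, not the repo's API):

```python
import numpy as np

def indrnn_step(x, h_prev, W, u, b):
    """One IndRNN step: h_t = ReLU(W x_t + u * h_{t-1} + b),
    where u is an element-wise (per-neuron) recurrent weight."""
    return np.maximum(0.0, x @ W + u * h_prev + b)

def residual_step(x, h_prev, W, u, b):
    """What a residual wrapper does: add the step's input to the
    cell output. Requires matching input and output sizes."""
    return x + indrnn_step(x, h_prev, W, u, b)

n = 16
x = np.random.randn(4, n)          # [batch, features]
h = np.zeros((4, n))               # initial hidden state
W = np.random.randn(n, n) * 0.1
u = np.random.uniform(-1.0, 1.0, size=n)
b = np.zeros(n)
out = residual_step(x, h, W, u, b)
```

Because the skip connection is outside the cell, the same wrapper works for any cell with matching sizes, which is why it composes cleanly with the existing IndRNNCell.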
What might make sense to incorporate into the model is frame-wise batch normalization before the Recurrent + ReLU part (doing BN after it is already possible). I will look into that; the downside is that it bloats the code a bit for a specific use case.
I hope this helps!
By the way, did you fix the NaN problem you mentioned earlier?
from indrnn.
Thanks @batzner! Your comment really taught me something new!
The NaN problem is fixed if I set recurrent_max_abs to 1.
When it is set to pow(2, 1.0 / seq_length), the loss becomes NaN (in my code, seq_length is 100).
(I have also tried a smaller learning rate, but it didn't solve the problem.)
Actually, this is the reason why I posted this issue: I think batch normalization might fix the NaN problem as well.
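For reference, pow(2, 1.0 / seq_length) is the bound that keeps the per-neuron recurrent weight's effect over the whole sequence within a factor of 2, since |u|**seq_length <= 2. A quick sketch of the arithmetic (the function name is mine):

```python
import math

def recurrent_max(seq_length, gamma=2.0):
    """Upper bound on |u| such that |u| ** seq_length <= gamma.
    gamma = 2 matches the pow(2, 1.0 / seq_length) setting above."""
    return math.pow(gamma, 1.0 / seq_length)

bound = recurrent_max(100)  # ~1.00696, i.e. only slightly above 1
```

So with seq_length = 100 the constraint only barely relaxes the hard cap of 1, yet that small margin is apparently enough to let activations blow up in this setup.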
@zeka0 Hi, the paper says: "batch normalization, denoted as “BN”, can also be employed in the IndRNN network before or after the activation function". I did not try using two BNs, but I don't think there is a reason to do both. There is a comment on BN in the IndRNN in this repo. Generally, with sequence-wise BN, applying BN after the activation probably gives higher performance; with frame-wise BN, applying it before the activation is probably better.