amarshah / complex_rnn

unitary matrix for hidden to hidden layer

License: BSD 3-Clause "New" or "Revised" License
Hi there, thanks for making your code public! I have been translating it to TensorFlow as an exercise and have therefore been reading it quite carefully.
I'm a little confused by some of the mathematics in the reflection function times_reflection (https://github.com/amarshah/complex_RNN/blob/master/models.py#L49). Basically, either I'm making a mistake in my own calculations, I've misunderstood something, or the code is not quite right. So, apologies in advance for how long this is; I'm trying to be comprehensive.
As I understand it, this function applies the typical (Householder) reflection operator about a complex vector v to an input vector h:

R(v) h = (I - 2 v v† / ‖v‖²) h
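As a side note, a quick NumPy sketch (my own, not from the repo) confirms the basic properties this operator should have, namely that it is unitary and its own inverse:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# A random complex reflection vector v and input h (illustrative values).
v = rng.standard_normal(n) + 1j * rng.standard_normal(n)
h = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# The typical (Householder) reflection: R = I - 2 v v† / (v† v).
# np.vdot conjugates its first argument, so np.vdot(v, v) = ‖v‖².
R = np.eye(n) - 2.0 * np.outer(v, v.conj()) / np.vdot(v, v).real

# R should be unitary (R† R = I) and an involution (R R = I).
assert np.allclose(R.conj().T @ R, np.eye(n))
assert np.allclose(R @ R, np.eye(n))

# Reflecting twice recovers h.
assert np.allclose(R @ (R @ h), h)
```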
At the end of the above-mentioned function, there is:
output = T.inc_subtensor(output[:, :n_hidden], - 2. / vstarv * (a + b))
output = T.inc_subtensor(output[:, n_hidden:], - 2. / vstarv * (d - c))
(where vstarv is the squared norm of v). So I conclude that

output_re = input_re - (2 / ‖v‖²) (a + b)
output_im = input_im - (2 / ‖v‖²) (d - c)

(let's call this equation 1)
Doing a bit of expanding (sorry for ugly LaTeX): with h = h_re + i h_im and v = v_re + i v_im,

v† h = (v_re · h_re + v_im · h_im) + i (v_re · h_im - v_im · h_re)

Simplifying the appearance of this expression by rewriting the inner product terms (removing the dot and the vector symbol) to remind us these are just real scalars at this point, and gathering real and imaginary components of v (v† h):

Re(v (v† h)) = (h_re v_re + h_im v_im) v_re - (h_im v_re - h_re v_im) v_im
Im(v (v† h)) = (h_re v_re + h_im v_im) v_im + (h_im v_re - h_re v_im) v_re

So now we have:

Re(R h) = h_re - (2 / ‖v‖²) [ (h_re v_re + h_im v_im) v_re - (h_im v_re - h_re v_im) v_im ]
Im(R h) = h_im - (2 / ‖v‖²) [ (h_re v_re + h_im v_im) v_im + (h_im v_re - h_re v_im) v_re ]

(let's call this equation 2)
Looking back at the code, we have
a = T.outer(input_re_reflect_re - input_im_reflect_im, reflect_re)
b = T.outer(input_re_reflect_im + input_im_reflect_re, reflect_im)
c = T.outer(input_re_reflect_re - input_im_reflect_im, reflect_im)
d = T.outer(input_re_reflect_im + input_im_reflect_re, reflect_re)
LaTeXing these up for easier comparison (the outer, I think, should just be accounting for the fact that the first dimension of these input_... variables is the batch size, so this is all batched, which shouldn't matter for these calculations), and remembering that input is h and reflect is v in my notation:

a = (h_re v_re - h_im v_im) v_re
b = (h_re v_im + h_im v_re) v_im
c = (h_re v_re - h_im v_im) v_im
d = (h_re v_im + h_im v_re) v_re

Putting it together to get expressions for a + b and d - c:

a + b = (h_re v_re - h_im v_im) v_re + (h_re v_im + h_im v_re) v_im
d - c = (h_re v_im + h_im v_re) v_re - (h_re v_re - h_im v_im) v_im
According to equation 1 the right hand side here should equal the right hand side of equation 2, but... it doesn't seem to. What is going on? Some of the signs are wrong. Did I make a mistake?
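To double-check my algebra, here is a small NumPy sketch (mine, with the batch dimension dropped so T.dot becomes a scalar and T.outer a scalar-times-vector) comparing the code's expressions against the typical reflection:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4

# Random real/imaginary parts for v and h (illustrative values).
v_re, v_im = rng.standard_normal(n), rng.standard_normal(n)
h_re, h_im = rng.standard_normal(n), rng.standard_normal(n)
v = v_re + 1j * v_im
h = h_re + 1j * h_im
vstarv = (v_re ** 2 + v_im ** 2).sum()

# The four expressions from the code (batch dimension dropped).
a = (h_re @ v_re - h_im @ v_im) * v_re
b = (h_re @ v_im + h_im @ v_re) * v_im
c = (h_re @ v_re - h_im @ v_im) * v_im
d = (h_re @ v_im + h_im @ v_re) * v_re

out_code = (h_re - 2.0 / vstarv * (a + b)) \
    + 1j * (h_im - 2.0 / vstarv * (d - c))

# The typical reflection (I - 2 v v† / ‖v‖²) h.
out_true = h - 2.0 / vstarv * v * np.vdot(v, h)
print(np.allclose(out_code, out_true))  # False for generic v, h

# The code instead seems to match h - (2 / ‖v‖²) conj(v) (vᵀ h).
out_alt = h - 2.0 / vstarv * v.conj() * (v @ h)
print(np.allclose(out_code, out_alt))   # True in this check
```

For what it's worth, in this check the code coincides with h - (2 / ‖v‖²) conj(v) (vᵀ h) rather than with the typical reflection; I may of course be misreading the batching.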
I thought that perhaps the operator in question is not the typical reflection operator but some variant involving a conjugate transpose (the dagger there denoting the conjugate transpose)... but this also seems not to reproduce the expressions in the code [proof left to the reader]. I might also have made an error checking that, but my question would then be: if that is the intended operator, why the conjugate transpose? Where does this operator come from?
What are the parameters you use for memory_problem? Using the defaults, and changing the type of x and y to float32, I'm getting
TypeError: Cannot convert Type TensorType(float32, matrix) (of Variable Subtensor{::, int32:int32:}.0) into Type TensorType(float32, 3D). You can try to manually convert Subtensor{::, int32:int32:}.0 into a TensorType(float32, 3D).
at
train = theano.function([index], costs[0], givens=givens, updates=updates)
I like the simple, flat Theano code you've got here, but I am scared to write my own code with any reference to it because you haven't included a license (and that could mean you retain full copyright). Could you please put a license in the repository so I know to what extent I or anyone else can use this code?
At the line h_0_batch = T.tile(h_0, [x.shape[1], 1]), x.shape[1] is Subtensor{int64}.0 and x is <TensorType(int32, matrix)>. Executing T.tile gives the following error:
ValueError: reps argument to tile must be a constant (e.g. tuple, list of integers)
I'm using all the default parameters when running memory_problem.py. What am I doing wrong?
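For context (my reading of the code, with made-up sizes): the failing line only broadcasts the initial hidden state across the batch, and the ValueError is telling us that Theano's T.tile wants a compile-time constant for reps, whereas x.shape[1] is symbolic. In NumPy terms, the intended result is:

```python
import numpy as np

n_hidden, batch = 3, 5            # made-up sizes for illustration
h_0 = np.zeros((1, n_hidden))     # initial hidden state, a single row

# What T.tile(h_0, [x.shape[1], 1]) is meant to produce: one copy of
# h_0 per example in the batch (x.shape[1] plays the role of batch).
h_0_batch = np.tile(h_0, (batch, 1))
print(h_0_batch.shape)  # (5, 3)
```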
Hi Amar and Martin
I've read your uRNN paper and the following Reddit discussion with great interest. On Reddit you mention that you have code that is 4x faster than the one on GitHub:
https://www.reddit.com/r/MachineLearning/comments/3uk2q5/151106464_unitary_evolution_recurrent_neural/cxfwsqw
Is the current GitHub version the faster code, or is that not released yet?
BR Casper