
complex_rnn's People

Contributors

amarshah, martinarjovsky, yoshua


complex_rnn's Issues

mathematics in reflection operator

Hi there, thanks for making your code public! I have been translating it to TensorFlow as an exercise and have therefore been reading it quite carefully.

I'm a little confused by some of the mathematics in the reflection function times_reflection (https://github.com/amarshah/complex_RNN/blob/master/models.py#L49). Either I'm making a mistake in my own calculations, I've misunderstood something, or the code is not quite right. Apologies in advance for how long this is; I'm trying to be comprehensive.

As I understand it, this function applies the typical reflection operator determined by a complex vector $v$, e.g. on an input vector $h$:

$$h' = \left(I - \frac{2\, v v^* }{v^* v}\right) h = h - \frac{2\,(v^* h)}{v^* v}\, v$$

(with $v^*$ denoting the conjugate transpose of $v$)

At the end of the above-mentioned function, there is:

output = T.inc_subtensor(output[:, :n_hidden], - 2. / vstarv * (a + b))
output = T.inc_subtensor(output[:, n_hidden:], - 2. / vstarv * (d - c))

(where vstarv is the squared norm of v)

So I conclude that

$$\mathrm{Re}(h') = \mathrm{Re}(h) - \frac{2}{v^* v}\,(a + b), \qquad \mathrm{Im}(h') = \mathrm{Im}(h) - \frac{2}{v^* v}\,(d - c)$$

(let's call this equation 1)

Doing a bit of expanding (sorry for the ugly LaTeX), writing $h = \vec{h}_R + i\,\vec{h}_I$ and $v = \vec{v}_R + i\,\vec{v}_I$:

$$h - \frac{2\,(v^* h)}{v^* v}\, v = (\vec{h}_R + i\,\vec{h}_I) - \frac{2}{v^* v}\left[(\vec{v}_R \cdot \vec{h}_R + \vec{v}_I \cdot \vec{h}_I) + i\,(\vec{v}_R \cdot \vec{h}_I - \vec{v}_I \cdot \vec{h}_R)\right](\vec{v}_R + i\,\vec{v}_I)$$

Simplifying the appearance of this expression by rewriting the inner product terms (removing the dot and the vector symbol) to remind us these are just real scalars at this point, and gathering real and imaginary components:

So now we have:

$$\mathrm{Re}(h') = \vec{h}_R - \frac{2}{v^* v}\left[(v_R h_R + v_I h_I)\,\vec{v}_R - (v_R h_I - v_I h_R)\,\vec{v}_I\right]$$

$$\mathrm{Im}(h') = \vec{h}_I - \frac{2}{v^* v}\left[(v_R h_R + v_I h_I)\,\vec{v}_I + (v_R h_I - v_I h_R)\,\vec{v}_R\right]$$

(let's call this equation 2)

Looking back at the code, we have

a = T.outer(input_re_reflect_re - input_im_reflect_im, reflect_re)
b = T.outer(input_re_reflect_im + input_im_reflect_re, reflect_im)
c = T.outer(input_re_reflect_re - input_im_reflect_im, reflect_im)
d = T.outer(input_re_reflect_im + input_im_reflect_re, reflect_re)

LaTeXing these up for easier comparison (the outer, I think, is just accounting for the fact that the first dimension of these input_... variables is the batch size, so this is all batched, which shouldn't matter for these calculations), and remembering that input is $h$ and reflect is $v$ in my notation:

$$a = (h_R v_R - h_I v_I)\,\vec{v}_R \qquad b = (h_R v_I + h_I v_R)\,\vec{v}_I$$

$$c = (h_R v_R - h_I v_I)\,\vec{v}_I \qquad d = (h_R v_I + h_I v_R)\,\vec{v}_R$$

Putting it together to get expressions for $a + b$ and $d - c$:

$$a + b = (h_R v_R - h_I v_I)\,\vec{v}_R + (h_R v_I + h_I v_R)\,\vec{v}_I$$

$$d - c = (h_R v_I + h_I v_R)\,\vec{v}_R - (h_R v_R - h_I v_I)\,\vec{v}_I$$

According to equation 1 the right hand side here should equal the right hand side of equation 2, but... it doesn't seem to. What is going on? Some of the signs are wrong. Did I make a mistake?

I thought that perhaps the operator in question is not the typical reflection operator, but is in fact

(the dagger there denoting the conjugate transpose)
... but this also seems not to reproduce the expressions in the code [proof left to the reader]. I might also have made an error checking that, but my question would then be: if that is the intended operator, why conjugate transpose? Where does this operator come from?
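To make the discrepancy concrete, here is a small numpy sketch (variable names are mine) that transcribes the code's arithmetic term by term and compares it against the standard complex reflection. If I've transcribed things correctly, the code disagrees with the standard operator and instead matches $h - \frac{2\,(v^T h)}{v^* v}\,\bar{v}$, i.e. an unconjugated inner product paired with the conjugated vector:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
h = rng.normal(size=n) + 1j * rng.normal(size=n)  # input vector (my notation)
v = rng.normal(size=n) + 1j * rng.normal(size=n)  # reflection vector

# Standard complex Householder reflection: h' = h - 2 (v* h / v* v) v
# (np.vdot conjugates its first argument)
standard = h - 2.0 * (np.vdot(v, h) / np.vdot(v, v)) * v

# Term-by-term transcription of a, b, c, d from models.py
hR, hI, vR, vI = h.real, h.imag, v.real, v.imag
vstarv = (vR**2 + vI**2).sum()
a = (hR @ vR - hI @ vI) * vR
b = (hR @ vI + hI @ vR) * vI
c = (hR @ vR - hI @ vI) * vI
d = (hR @ vI + hI @ vR) * vR
code = (hR - 2.0 / vstarv * (a + b)) + 1j * (hI - 2.0 / vstarv * (d - c))

print(np.allclose(code, standard))  # False in my runs

# What the code seems to compute instead: h - 2 (v^T h / v* v) * conj(v)
guess = h - 2.0 * ((v @ h) / np.vdot(v, v)) * np.conj(v)
print(np.allclose(code, guess))     # True
```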

Parameters for memory_problem

What are the parameters you use for memory_problem? Using the defaults, and changing the type of x and y to float32, I'm getting

TypeError: Cannot convert Type TensorType(float32, matrix) (of Variable Subtensor{::, int32:int32:}.0) into Type TensorType(float32, 3D). You can try to manually convert Subtensor{::, int32:int32:}.0 into a TensorType(float32, 3D).

at

train = theano.function([index], costs[0], givens=givens, updates=updates)
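I don't know the intended parameters, but the error itself is a rank mismatch: the givens substitution hands the compiled graph a 2-D slice (a matrix) where the graph was built expecting a 3-D tensor. A numpy sketch of the same mismatch (shapes made up for illustration):

```python
import numpy as np

# The dataset is stored as a matrix, so any slice of it is still 2-D ...
train_x = np.zeros((100, 20), dtype=np.float32)
batch = train_x[:, 0:10]
print(batch.ndim)  # 2

# ... but a graph input declared as a 3-D tensor needs rank-3 data,
# e.g. (time, batch, features), so the slice would have to be reshaped:
batch3 = batch.reshape(batch.shape[0], batch.shape[1], 1)
print(batch3.ndim)  # 3
```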

Could you please add a license?

I like the simple, flat Theano code you've got here, but I'm scared to write my own code with any reference to it because you haven't included a license (which means, by default, you retain full copyright). Could you please put a license in the repository so I know to what extent I or anyone else can use this code?

Crashing at T.tile

At the line

h_0_batch = T.tile(h_0, [x.shape[1], 1])

x.shape[1] is Subtensor{int64}.0 and x is <TensorType(int32, matrix)>.

Executing T.tile gives the following error:

ValueError: reps argument to tile must be a constant (e.g. tuple, list of integers)

I'm using all the default parameters when running memory_problem.py. What am I doing wrong?
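For what it's worth, that line is only trying to copy the initial hidden state across the batch; in numpy terms the intent is just (shapes made up for illustration):

```python
import numpy as np

n_hidden = 3
batch_size = 5
h_0 = np.zeros((1, 2 * n_hidden))          # initial hidden state (re/im parts stacked)
h_0_batch = np.tile(h_0, (batch_size, 1))  # one copy per batch element
print(h_0_batch.shape)  # (5, 6)
```

Older Theano releases only accept constant reps in T.tile; if upgrading Theano doesn't help, a workaround I've seen for symbolic batch sizes is broadcasting, e.g. `T.ones((x.shape[1], 1)) * h_0`, though I haven't tested that against this repo.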
