memn2n's Issues

To see memory slot probabilities

I am trying to see the memory slot probabilities (the probabilities associated with different sentences) for a particular query. Is there a way to visualize them? Please help.

Thanks,
Joe
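One common approach, sketched here with illustrative names rather than the repo's actual API: expose the per-hop attention scores (the dot products between the query embedding and each memory slot) and normalize them with a softmax, then print or plot one probability per stored sentence.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

# Suppose the model exposes the query/memory dot products for one hop
# (hypothetical values; the real scores would come from the model).
hop_scores = np.array([2.0, 0.5, -1.0, 0.5])
probs = softmax(hop_scores)  # one probability per stored sentence

# These can then be printed or plotted against the story's sentences:
for i, p in enumerate(probs):
    print(f"sentence {i}: {p:.3f}")
```

The same idea extends to multiple hops: collect one probability vector per hop and display them side by side.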

Test

Should probably have a test somewhere!

Position Encoding

Hi domluna,
How did you get the equation in position_encoding? It seems different from the one in the paper, unless I made a silly algebra mistake...
Even then, is there an advantage to splitting the equation up the way you wrote it? Some sort of optimization?
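For comparison, the paper (Sukhbaatar et al., "End-To-End Memory Networks") gives the position encoding as l_kj = (1 - j/J) - (k/d)(1 - 2j/J), where j runs over word positions and k over embedding dimensions. A direct NumPy transcription of that formula, useful for checking the repo's rewritten version against it:

```python
import numpy as np

def position_encoding_paper(J, d):
    """l_kj = (1 - j/J) - (k/d) * (1 - 2j/J), with 1-based j (word) and k (dim)."""
    j = np.arange(1, J + 1)[:, None]  # word position,      shape (J, 1)
    k = np.arange(1, d + 1)[None, :]  # embedding dimension, shape (1, d)
    return (1 - j / J) - (k / d) * (1 - 2 * j / J)

enc = position_encoding_paper(J=6, d=4)  # shape (6, 4)
```

If the repo's version is an algebraic rearrangement of this, the two matrices should agree element-wise, which is easy to verify numerically.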

Compare Results

Hello @domluna!
Thanks for your nice scripts. I have one question about this model: do you know why some of the task results here are very different from Facebook's Matlab implementation (e.g. tasks 11, 13, and 16)? Is it because of the model initialization?
https://github.com/vinhkhuc/MemN2N-babi-python/tree/master/bechmarks
Thank you for your response :)

something wrong at nonlin

nonlinearity

            if self._nonlin:
                u_k = nonlin(u_k)

            u.append(u_k)

Unresolved reference to nonlin. How do I fix it?
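Since `self._nonlin` is what holds the nonlinearity, the call presumably needs to go through `self` as well. A minimal sketch of the corrected lookup (class name and surrounding structure are assumed from the snippet above, not the repo's actual code):

```python
import numpy as np

class MemN2NSketch:
    """Toy stand-in showing the corrected attribute lookup."""

    def __init__(self, nonlin=None):
        # Store the optional nonlinearity (e.g. np.tanh) on the instance.
        self._nonlin = nonlin

    def apply_hop(self, u_k):
        # Fix: call self._nonlin, not the undefined free name `nonlin`.
        if self._nonlin:
            u_k = self._nonlin(u_k)
        return u_k

model = MemN2NSketch(nonlin=np.tanh)
out = model.apply_hop(np.array([0.0, 1.0]))
```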

Separate memory from model?

Assuming I could define a custom gradient for the nil embedding, the memory (i.e. the variables A, B, TA, and TB) could live in a separate memory component.

The main benefit of this would be making it easier to experiment with different models built around the memory.
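As a rough sketch of what such a separation might look like, with hypothetical names and plain NumPy arrays standing in for the TensorFlow variables:

```python
import numpy as np

class Memory:
    """Hypothetical container for the embedding and temporal matrices."""

    def __init__(self, vocab_size, embedding_size, memory_size, seed=0):
        rng = np.random.default_rng(seed)
        # A, B: word embedding matrices; TA, TB: temporal encodings.
        self.A = rng.normal(0, 0.1, (vocab_size, embedding_size))
        self.B = rng.normal(0, 0.1, (vocab_size, embedding_size))
        self.TA = rng.normal(0, 0.1, (memory_size, embedding_size))
        self.TB = rng.normal(0, 0.1, (memory_size, embedding_size))

mem = Memory(vocab_size=20, embedding_size=8, memory_size=10)
```

Different model classes could then take a `Memory` instance in their constructor and share or swap it freely.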

Support for Ragged/Jagged arrays

On this line, it is mentioned that there is no support for jagged arrays, but the new TensorFlow v2.1.0 has introduced RaggedTensor.
It would be nice if support for this feature could be added to the current codebase.
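Without ragged support, jagged story arrays have to be padded to a fixed shape by hand before being fed to the model. A minimal sketch of that workaround in plain NumPy (this is the step that `tf.RaggedTensor` would make unnecessary):

```python
import numpy as np

def pad_jagged(rows, pad_value=0):
    """Pad a list of unequal-length lists into a dense 2-D int array."""
    width = max(len(r) for r in rows)
    out = np.full((len(rows), width), pad_value, dtype=int)
    for i, row in enumerate(rows):
        out[i, :len(row)] = row
    return out

stories = [[1, 2, 3], [4], [5, 6]]  # jagged rows of word ids
dense = pad_jagged(stories)         # shape (3, 3), short rows padded with 0
```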

running joint.py throws an error

Traceback (most recent call last):
  File "joint.py", line 121, in <module>
    for start in range(0, n_train, n_train/20):
TypeError: 'float' object cannot be interpreted as an integer

This shows up after a few runs. single.py runs fine. Any idea why this could happen?

The full log is:

(mem-tf) skc@Ultron:~/Projects/qa-mem/tf-memn2n$ python joint.py
Started Joint Model
/Users/skc/anaconda/envs/mem-tf/lib/python3.5/re.py:203: FutureWarning: split() requires a non-empty pattern match.
return _compile(pattern, flags).split(string, maxsplit)
Longest sentence length 11
Longest story length 228
Average story length 9
Training Size 18000
Validation Size 2000
Testing Size 20000
(18000, 50, 11) (2000, 50, 11) (20000, 50, 11)
(18000, 11) (2000, 11) (20000, 11)
(18000, 175) (2000, 175) (20000, 175)
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
Traceback (most recent call last):
  File "joint.py", line 121, in <module>
    for start in range(0, n_train, n_train/20):
TypeError: 'float' object cannot be interpreted as an integer

Dialog tasks

Hi!

I have a question, can this model be used for the Dialog tasks?
My main concern is that Dialog tasks assume working in seq2seq mode, and I'm not sure if it's the same for the QA task.
Could you please provide some info on this?

fix 0 logits in input module

Currently, because the nil embedding is 0 (which is fine) and stories are padded to a specified memory size, we tend to have a bunch of memories that are empty: [0 0 ... 0]. The problem is that we feed these into a softmax as-is, and exp(0) = 1, so on the output the empty memories get a uniform probability. This is problematic because it distorts the probabilities of the non-empty memories.

So the solution is to add a largish negative number to the empty memories before the softmax is applied. Then the exp() of the value will be 0, or close enough.

This issue is particularly evident in task 4, where each story consists of 2 sentences. If we make the memory size large, say 50 (only 2 are needed), two things tend to occur:

  1. We converge at a much slower rate
  2. We get a worse error rate

An alternative solution would be to make all batches size 1 (at least at a low level; a higher-level API can make this nicer). That way the memory can be of any size, since nothing in the underlying algorithm relies on the memory being a fixed size (at least I think that is the case; I have to double check!).
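A minimal NumPy sketch of the masking idea described above (the -1e9 constant and the all-zeros test for emptiness are assumptions for illustration, not the repo's actual code):

```python
import numpy as np

def masked_softmax(logits, memories):
    """Softmax over memory logits, pushing empty (all-zero) slots toward 0."""
    # A slot counts as "empty" if its padded memory vector is all zeros.
    empty = np.all(memories == 0, axis=-1)
    # Add a largish negative number so exp() of those slots is ~0.
    masked = np.where(empty, -1e9, logits)
    e = np.exp(masked - masked.max())
    return e / e.sum()

memories = np.array([[1., 2.], [0., 0.], [3., 1.]])  # slot 1 is padding
probs = masked_softmax(np.zeros(3), memories)
```

With the mask, the padding slot gets essentially zero probability instead of sharing mass uniformly with the real memories.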

Difference between code and paper

Hi, thank you for your code! It is very helpful.

I noticed a difference between your code and the original paper. The paper uses a separate embedding to get c for each story, and directly adds o and u to get the input of the prediction layer (or the u for the next layer, in the multi-hop case). In your code, c is given the same value as m rather than being recomputed, and o is multiplied by a matrix you call H before being added to u. I am wondering why you do it this way? I haven't tested the difference. Will it influence the performance?
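For reference, a toy NumPy sketch of the two hop updates being compared (shapes and the hop-wise H are assumptions based on the question, not the repo's exact code):

```python
import numpy as np

d = 4                    # embedding size
rng = np.random.default_rng(0)
u = rng.normal(size=d)   # query embedding for this hop
o = rng.normal(size=d)   # weighted memory output for this hop
H = np.eye(d)            # hop-to-hop linear map; the paper's variant omits H

u_paper = o + u          # paper: next state is simply o + u
u_code = o @ H + u       # this repo: map o through H first

# With H initialized to the identity, the two variants coincide;
# a trained H gives the model an extra learned linear transform per hop.
```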

Change name to more general

Probably a good idea to change the name so other models can be incorporated; maybe memory_models would make more sense?

Found joint.py errors

n_train/20, n_val/20, and n_test/20 cause errors in Python 3.

I modified
n_train/20 -> n_train//20
n_val/20 -> n_val//20
n_test/20 -> n_test//20
and it works now.
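This is the Python 3 behavior the fix addresses: `/` always performs true division and returns a float, and `range()` rejects float arguments, so floor division `//` is needed. A quick illustration:

```python
n_train = 18000

# Python 3: true division returns a float, which range() rejects.
step_bad = n_train / 20   # 900.0 (float) -> TypeError inside range()
step_ok = n_train // 20   # 900 (int)     -> works

starts = list(range(0, n_train, step_ok))
```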

tokenize function code in data_utils.py is incorrect

Given the intended behavior:

>>> tokenize('Bob dropped the apple. Where is the apple?')
    ['Bob', 'dropped', 'the', 'apple', '.', 'Where', 'is', 'the', 'apple', '?']

we should write it like this:

import re

def tokenize(sent):
    return re.findall(r"\w+(?:'\w+)?|[^\w\s]", sent)

Puzzled about the attention part

m_C = tf.reduce_sum(m_emb_C * self._encoding, 2)
c_temp = tf.transpose(m_C, [0, 2, 1])

Here the first line with reduce_sum should turn the tensor into 2 dimensions, so I think the transposition in the second line won't work. I am not sure if I am misunderstanding something.
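A quick way to check is to trace the shapes with NumPy. If m_emb_C has shape (batch, memory_size, sentence_size, embedding_size), then summing over axis 2 leaves a 3-D tensor, not a 2-D one, so the [0, 2, 1] transpose is valid (the shapes below are assumptions based on the usual memn2n layout, not verified against the repo):

```python
import numpy as np

batch, mem, sent, emb = 2, 5, 11, 4
m_emb_C = np.ones((batch, mem, sent, emb))
encoding = np.ones((sent, emb))         # position encoding, broadcasts over the last two axes

m_C = (m_emb_C * encoding).sum(axis=2)  # shape (batch, mem, emb): still 3-D
c_temp = np.transpose(m_C, (0, 2, 1))   # shape (batch, emb, mem): transpose is fine
```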

Add tutorial to download data before running

I followed the instructions to run single.py but it fails. It would be better to add a tutorial on how and where to download the data.

$ python ./single.py
Started Task: 1
Traceback (most recent call last):
  File "./single.py", line 32, in <module>
    train, test = load_task(FLAGS.data_dir, FLAGS.task_id)
  File "/home/tobe/code/memn2n/data_utils.py", line 14, in load_task
    files = os.listdir(data_dir)
OSError: [Errno 2] No such file or directory: 'data/tasks_1-20_v1-2/en/'
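Until a tutorial is added, a small guard before loading could turn the raw OSError into an actionable message. A hedged sketch (the function name is hypothetical; the path is the one single.py's traceback shows it expects):

```python
import os

DATA_DIR = "data/tasks_1-20_v1-2/en/"  # path the traceback shows single.py expects

def check_babi_data(data_dir=DATA_DIR):
    """Return True if the bAbI task files are present, else print a hint."""
    if os.path.isdir(data_dir):
        return True
    print(f"bAbI data not found at {data_dir!r}. Download the "
          "tasks_1-20_v1-2 archive from the bAbI project page and "
          "extract it into ./data/ before running single.py.")
    return False

ok = check_babi_data("definitely/not/a/real/dir")
```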
