memn2n's Issues

To see memory slot probabilities

I am trying to see the memory slot probabilities (the probabilities associated with different sentences) for a particular query. Is there a way to visualize them? Please help.

Thanks,
Joe
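One common approach, sketched here with illustrative names rather than the repo's actual API: expose the per-hop attention scores (the dot products between the query embedding and each memory slot) and normalize them with a softmax, then print or plot one probability per stored sentence.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

# Suppose the model exposes the query/memory dot products for one hop
# (hypothetical values; the real scores would come from the model).
hop_scores = np.array([2.0, 0.5, -1.0, 0.5])
probs = softmax(hop_scores)  # one probability per stored sentence

# These can then be printed or plotted against the story's sentences:
for i, p in enumerate(probs):
    print(f"sentence {i}: {p:.3f}")
```

The same idea extends to multiple hops: collect one probability vector per hop and display them side by side.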

Test

Should probably have a test somewhere!

Position Encoding

Hi domluna,
How did you get the equation in position_encoding? It seems different from the one in the paper, unless I made a silly algebra mistake...
Even then, is there an advantage to splitting the equation up the way you wrote it? Some sort of optimization?
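For comparison, the paper (Sukhbaatar et al., "End-To-End Memory Networks") gives the position encoding as l_kj = (1 - j/J) - (k/d)(1 - 2j/J), where j runs over word positions and k over embedding dimensions. A direct NumPy transcription of that formula, useful for checking the repo's rewritten version against it:

```python
import numpy as np

def position_encoding_paper(J, d):
    """l_kj = (1 - j/J) - (k/d) * (1 - 2j/J), with 1-based j (word) and k (dim)."""
    j = np.arange(1, J + 1)[:, None]  # word position,      shape (J, 1)
    k = np.arange(1, d + 1)[None, :]  # embedding dimension, shape (1, d)
    return (1 - j / J) - (k / d) * (1 - 2 * j / J)

enc = position_encoding_paper(J=6, d=4)  # shape (6, 4)
```

If the repo's version is an algebraic rearrangement of this, the two matrices should agree element-wise, which is easy to verify numerically.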

Compare Results

Hello @domluna!
Thanks for your nice scripts. I have one question about this model: do you know why some of the task results here are very different from Facebook's Matlab implementation (e.g. tasks 11, 13, and 16)? Is it because of the model initialization?
https://github.com/vinhkhuc/MemN2N-babi-python/tree/master/bechmarks
Thank you for your response :)

something wrong at nonlin

nonlinearity

            if self._nonlin:
                u_k = nonlin(u_k)

            u.append(u_k)

Unresolved reference to nonlin. How do I fix it?
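Since `self._nonlin` is what holds the nonlinearity, the call presumably needs to go through `self` as well. A minimal sketch of the corrected lookup (class name and surrounding structure are assumed from the snippet above, not the repo's actual code):

```python
import numpy as np

class MemN2NSketch:
    """Toy stand-in showing the corrected attribute lookup."""

    def __init__(self, nonlin=None):
        # Store the optional nonlinearity (e.g. np.tanh) on the instance.
        self._nonlin = nonlin

    def apply_hop(self, u_k):
        # Fix: call self._nonlin, not the undefined free name `nonlin`.
        if self._nonlin:
            u_k = self._nonlin(u_k)
        return u_k

model = MemN2NSketch(nonlin=np.tanh)
out = model.apply_hop(np.array([0.0, 1.0]))
```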

Separate memory from model?

Assuming I could define a custom gradient for the nil embedding, the memory (i.e. the variables A, B, TA, and TB) could live in a separate memory component.

The main benefit of this would be making it easier to experiment with different models built around the memory.
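As a rough sketch of what such a separation might look like, with hypothetical names and plain NumPy arrays standing in for the TensorFlow variables:

```python
import numpy as np

class Memory:
    """Hypothetical container for the embedding and temporal matrices."""

    def __init__(self, vocab_size, embedding_size, memory_size, seed=0):
        rng = np.random.default_rng(seed)
        # A, B: word embedding matrices; TA, TB: temporal encodings.
        self.A = rng.normal(0, 0.1, (vocab_size, embedding_size))
        self.B = rng.normal(0, 0.1, (vocab_size, embedding_size))
        self.TA = rng.normal(0, 0.1, (memory_size, embedding_size))
        self.TB = rng.normal(0, 0.1, (memory_size, embedding_size))

mem = Memory(vocab_size=20, embedding_size=8, memory_size=10)
```

Different model classes could then take a `Memory` instance in their constructor and share or swap it freely.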

Support for Ragged/Jagged arrays

On this line, it is mentioned that there is no support for jagged arrays, but the new TensorFlow v2.1.0 has introduced RaggedTensor.
It would be nice if support for this feature could be added to the current codebase.
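Without ragged support, jagged story arrays have to be padded to a fixed shape by hand before being fed to the model. A minimal sketch of that workaround in plain NumPy (this is the step that `tf.RaggedTensor` would make unnecessary):

```python
import numpy as np

def pad_jagged(rows, pad_value=0):
    """Pad a list of unequal-length lists into a dense 2-D int array."""
    width = max(len(r) for r in rows)
    out = np.full((len(rows), width), pad_value, dtype=int)
    for i, row in enumerate(rows):
        out[i, :len(row)] = row
    return out

stories = [[1, 2, 3], [4], [5, 6]]  # jagged rows of word ids
dense = pad_jagged(stories)         # shape (3, 3), short rows padded with 0
```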

running joint.py throws an error

Traceback (most recent call last):
  File "joint.py", line 121, in <module>
    for start in range(0, n_train, n_train/20):
TypeError: 'float' object cannot be interpreted as an integer

This shows up after a few runs. single.py runs fine. Any idea why this could happen?

The full log is:

(mem-tf) skc@Ultron:~/Projects/qa-mem/tf-memn2n$ python joint.py
Started Joint Model
/Users/skc/anaconda/envs/mem-tf/lib/python3.5/re.py:203: FutureWarning: split() requires a non-empty pattern match.
return _compile(pattern, flags).split(string, maxsplit)
Longest sentence length 11
Longest story length 228
Average story length 9
Training Size 18000
Validation Size 2000
Testing Size 20000
(18000, 50, 11) (2000, 50, 11) (20000, 50, 11)
(18000, 11) (2000, 11) (20000, 11)
(18000, 175) (2000, 175) (20000, 175)
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
Traceback (most recent call last):
  File "joint.py", line 121, in <module>
    for start in range(0, n_train, n_train/20):
TypeError: 'float' object cannot be interpreted as an integer

Dialog tasks

Hi!

I have a question, can this model be used for the Dialog tasks?
My main concern is that Dialog tasks assume working in seq2seq mode, and I'm not sure if it's the same for the QA task.
Could you please provide some info on this?

fix 0 logits in input module

Currently, because the nil embedding is 0 (which is fine) and stories are padded to a specified memory size, we tend to have a bunch of memories that are empty: [0 0 ... 0]. The problem is that we feed these into a softmax as-is, and exp(0) = 1, so on the output the empty memories get a uniform probability. This is problematic because it distorts the probabilities of the non-empty memories.

So the solution is to add a largish negative number to the empty memories before the softmax is applied. Then the exp() of the value will be 0, or close enough.

This issue is particularly evident in task 4, where each story consists of 2 sentences. If we make the memory size large, say 50 (only 2 are needed), two things tend to occur:

  1. We converge at a much slower rate
  2. We get a worse error rate

An alternative solution would be to make all batches size 1 (at least at a low level; a higher-level API can make this nicer). That way the memory can be of any size, since nothing in the underlying algorithm relies on the memory being a fixed size (at least I think that is the case; I have to double check!).
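A minimal NumPy sketch of the masking idea described above (the -1e9 constant and the all-zeros test for emptiness are assumptions for illustration, not the repo's actual code):

```python
import numpy as np

def masked_softmax(logits, memories):
    """Softmax over memory logits, pushing empty (all-zero) slots toward 0."""
    # A slot counts as "empty" if its padded memory vector is all zeros.
    empty = np.all(memories == 0, axis=-1)
    # Add a largish negative number so exp() of those slots is ~0.
    masked = np.where(empty, -1e9, logits)
    e = np.exp(masked - masked.max())
    return e / e.sum()

memories = np.array([[1., 2.], [0., 0.], [3., 1.]])  # slot 1 is padding
probs = masked_softmax(np.zeros(3), memories)
```

With the mask, the padding slot gets essentially zero probability instead of sharing mass uniformly with the real memories.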

Difference between code and paper

Hi, thank you for your code! It is very helpful.

I noticed a difference between your code and the original paper. The paper uses a separate embedding to get c for each story, and directly adds o and u to get the input of the prediction layer (or the u for the next layer, in the multi-hop case). In your code, c is given the same value as m rather than being recomputed, and o is multiplied by a matrix you call H before being added to u. I am wondering why you do it this way? I haven't tested the difference. Will it influence the performance?
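For reference, a toy NumPy sketch of the two hop updates being compared (shapes and the hop-wise H are assumptions based on the question, not the repo's exact code):

```python
import numpy as np

d = 4                    # embedding size
rng = np.random.default_rng(0)
u = rng.normal(size=d)   # query embedding for this hop
o = rng.normal(size=d)   # weighted memory output for this hop
H = np.eye(d)            # hop-to-hop linear map; the paper's variant omits H

u_paper = o + u          # paper: next state is simply o + u
u_code = o @ H + u       # this repo: map o through H first

# With H initialized to the identity, the two variants coincide;
# a trained H gives the model an extra learned linear transform per hop.
```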

Change name to more general

Probably a good idea to change the name so other models can be incorporated; maybe memory_models would make more sense?

Found joint.py errors

n_train/20, n_val/20, and n_test/20 cause errors in Python 3.

I modified
n_train/20 -> n_train//20
n_val/20 -> n_val//20
n_test/20 -> n_test//20
and it works now.
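This is the Python 3 behavior the fix addresses: `/` always performs true division and returns a float, and `range()` rejects float arguments, so floor division `//` is needed. A quick illustration:

```python
n_train = 18000

# Python 3: true division returns a float, which range() rejects.
step_bad = n_train / 20   # 900.0 (float) -> TypeError inside range()
step_ok = n_train // 20   # 900 (int)     -> works

starts = list(range(0, n_train, step_ok))
```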

tokenize function code in data_utils.py is incorrect

Given the intended behavior:

>>> tokenize('Bob dropped the apple. Where is the apple?')
    ['Bob', 'dropped', 'the', 'apple', '.', 'Where', 'is', 'the', 'apple', '?']

we should write it like this:

import re

def tokenize(sent):
    return re.findall(r"\w+(?:'\w+)?|[^\w\s]", sent)

Puzzled about the attention part

m_C = tf.reduce_sum(m_emb_C * self._encoding, 2)
c_temp = tf.transpose(m_C, [0, 2, 1])

Here the first line with reduce_sum should turn the tensor into 2 dimensions, so I think the transposition in the second line won't work. I am not sure if I am misunderstanding something.
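A quick way to check is to trace the shapes with NumPy. If m_emb_C has shape (batch, memory_size, sentence_size, embedding_size), then summing over axis 2 leaves a 3-D tensor, not a 2-D one, so the [0, 2, 1] transpose is valid (the shapes below are assumptions based on the usual memn2n layout, not verified against the repo):

```python
import numpy as np

batch, mem, sent, emb = 2, 5, 11, 4
m_emb_C = np.ones((batch, mem, sent, emb))
encoding = np.ones((sent, emb))         # position encoding, broadcasts over the last two axes

m_C = (m_emb_C * encoding).sum(axis=2)  # shape (batch, mem, emb): still 3-D
c_temp = np.transpose(m_C, (0, 2, 1))   # shape (batch, emb, mem): transpose is fine
```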

Add tutorial to download data before running

I followed the instructions to run single.py but it fails. It would be better to add a tutorial on how and where to download the data.

$ python ./single.py
Started Task: 1
Traceback (most recent call last):
  File "./single.py", line 32, in <module>
    train, test = load_task(FLAGS.data_dir, FLAGS.task_id)
  File "/home/tobe/code/memn2n/data_utils.py", line 14, in load_task
    files = os.listdir(data_dir)
OSError: [Errno 2] No such file or directory: 'data/tasks_1-20_v1-2/en/'
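Until a tutorial is added, a small guard before loading could turn the raw OSError into an actionable message. A hedged sketch (the function name is hypothetical; the path is the one single.py's traceback shows it expects):

```python
import os

DATA_DIR = "data/tasks_1-20_v1-2/en/"  # path the traceback shows single.py expects

def check_babi_data(data_dir=DATA_DIR):
    """Return True if the bAbI task files are present, else print a hint."""
    if os.path.isdir(data_dir):
        return True
    print(f"bAbI data not found at {data_dir!r}. Download the "
          "tasks_1-20_v1-2 archive from the bAbI project page and "
          "extract it into ./data/ before running single.py.")
    return False

ok = check_babi_data("definitely/not/a/real/dir")
```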
