domluna / memn2n
End-To-End Memory Network using Tensorflow
License: MIT License
I am trying to see the memory slot probabilities (i.e. the attention probabilities associated with the different sentences) for a particular query. Is there a way to visualize them? Please help.
Thanks,
Joe
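For what it's worth, a minimal numpy sketch of what those slot probabilities are: one softmax over the query-memory inner products per hop. This assumes you can pull the embedded memories and query out of the model; the names `m` and `u` here are illustrative, not tensors the repo exposes directly.

```python
import numpy as np

def attention_probs(m, u):
    """Memory-slot attention for one hop: a softmax over the inner
    products between each memory embedding and the query embedding."""
    scores = m @ u                 # (num_memories,)
    scores = scores - scores.max() # numerical stability
    e = np.exp(scores)
    return e / e.sum()

# toy example: 4 memory slots (sentences), embedding size 3
rng = np.random.default_rng(0)
m = rng.normal(size=(4, 3))   # embedded story sentences
u = rng.normal(size=3)        # embedded query
p = attention_probs(m, u)
print(p)                      # one probability per sentence
```

Plotting `p` per hop (e.g. as a bar chart per sentence) is one simple way to visualize which memories the model attends to.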
Should probably have a test somewhere!
Hi domluna,
How did you get the equation in position_encoding? It seems different from the one in the paper, unless I made a silly algebra mistake...
Even then, is there an advantage to splitting the equation up the way you wrote it? Some sort of optimization?
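For reference, here is a direct transcription of the position-encoding formula as written in the End-To-End Memory Networks paper (Sukhbaatar et al., 2015), l_kj = (1 - j/J) - (k/d)(1 - 2j/J), with 1-based word position j and embedding dimension k. Comparing it numerically against the repo's position_encoding is left to the reader.

```python
import numpy as np

def paper_position_encoding(sentence_size, embedding_size):
    """Position encoding exactly as in the paper:
        l_kj = (1 - j/J) - (k/d) * (1 - 2j/J)
    with 1-based j (word position, up to J) and k (dimension, up to d)."""
    J, d = sentence_size, embedding_size
    enc = np.zeros((d, J))
    for k in range(1, d + 1):
        for j in range(1, J + 1):
            enc[k - 1, j - 1] = (1 - j / J) - (k / d) * (1 - 2 * j / J)
    return enc

enc = paper_position_encoding(sentence_size=3, embedding_size=4)
print(enc)
```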
Hello @domluna!
Thanks for your nice scripts. I have one question about this model: do you know why some task results here are very different from the Facebook MATLAB ones (e.g. tasks 11, 13, and 16)? Could it be the initialization of the model?
https://github.com/vinhkhuc/MemN2N-babi-python/tree/master/bechmarks
Thank you for your response :)
if self._nonlin:
    u_k = nonlin(u_k)
u.append(u_k)
"nonlin" is an unresolved reference here; how can this be fixed? (Presumably it should be self._nonlin(u_k), matching the attribute checked on the line above.)
Assuming I could define a custom gradient for the nil embedding, the memory, i.e. the variables A, B, TA, and TB, could live in a separate memory component.
The main benefit of this would be to more easily play around with different models around the memory.
On this line, it is mentioned that there is no support for jagged arrays, but the new TensorFlow v2.1.0 has introduced RaggedTensor.
It would be nice if support for this feature could be provided in the current codebase.
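For context, here is a minimal sketch of the padding workaround that RaggedTensor support would replace: jagged stories are padded with the nil word (id 0) into a dense (num_stories, memory_size, sentence_size) array. The helper name and shapes are illustrative, not the repo's actual functions.

```python
import numpy as np

def pad_stories(stories, memory_size, sentence_size):
    """Pad a jagged list of stories (each a list of word-id sentences
    of varying length) into a dense int array, using 0 (the nil word)
    as padding. This is the fixed-shape workaround for jagged input."""
    out = np.zeros((len(stories), memory_size, sentence_size), dtype=np.int64)
    for i, story in enumerate(stories):
        # keep only the most recent `memory_size` sentences
        for j, sent in enumerate(story[-memory_size:]):
            out[i, j, :len(sent)] = sent[:sentence_size]
    return out

stories = [[[1, 2, 3], [4, 5]], [[6]]]   # two stories, jagged
dense = pad_stories(stories, memory_size=2, sentence_size=4)
print(dense.shape)   # (2, 2, 4)
```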
Traceback (most recent call last):
File "joint.py", line 121, in
for start in range(0, n_train, n_train/20):
TypeError: 'float' object cannot be interpreted as an integer
This shows up after a few runs. single.py runs fine. Any idea why this could happen?
The full log is:
(mem-tf) skc@Ultron:~/Projects/qa-mem/tf-memn2n$ python joint.py
Started Joint Model
/Users/skc/anaconda/envs/mem-tf/lib/python3.5/re.py:203: FutureWarning: split() requires a non-empty pattern match.
return _compile(pattern, flags).split(string, maxsplit)
Longest sentence length 11
Longest story length 228
Average story length 9
Training Size 18000
Validation Size 2000
Testing Size 20000
(18000, 50, 11) (2000, 50, 11) (20000, 50, 11)
(18000, 11) (2000, 11) (20000, 11)
(18000, 175) (2000, 175) (20000, 175)
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
Traceback (most recent call last):
File "joint.py", line 121, in
for start in range(0, n_train, n_train/20):
TypeError: 'float' object cannot be interpreted as an integer
Hi!
I have a question: can this model be used for the Dialog tasks?
My main concern is that the Dialog tasks assume working in seq2seq mode, and I'm not sure whether that matches the QA setup here.
Could you please provide some info on this?
Currently, because the nil embedding is 0 (which is fine) and we pad every story to a specified memory size, we tend to have a bunch of memories which are empty [0 0 ... 0]. The problem is that we feed these into a softmax as-is, and exp(0) = 1, so on the output the empty memories get a uniform probability. This is problematic because it alters the probabilities of the non-empty memories.
So the solution is to add a largish negative number to the scores of empty memories before the softmax is applied. Then the exp() of that value will be 0, or close enough.
This issue is particularly evident in task 4, where each story consists of 2 sentences. If we make the memory size large, say 50 (only 2 are needed), 2 things tend to occur:
An alternative solution would be to make the batch size 1 (at least at a low level; a higher-level API can make this nicer). This way the memory can be of any size, since nothing in the underlying algorithm relies on the memory being a fixed size (at least I think this is the case, have to double check!).
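A minimal numpy sketch of the masking fix (all names hypothetical): empty slots get a large negative constant added to their scores before the softmax, so they receive essentially zero probability.

```python
import numpy as np

def masked_softmax(scores, is_empty, neg=-1e9):
    """Softmax over memory scores where empty slots are first pushed
    to a large negative value, so exp() maps them to ~0 probability."""
    scores = np.where(is_empty, neg, scores)
    scores = scores - scores.max()   # numerical stability
    e = np.exp(scores)
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.0, 0.0])           # last two slots are all-nil memories
is_empty = np.array([False, False, True, True])
p = masked_softmax(scores, is_empty)
print(p)   # empty slots get effectively zero probability
```

Without the mask, the two empty slots would each receive exp(0) = 1 worth of unnormalized mass and distort the distribution over the real memories.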
Hi, thank you for your code! It is very helpful.
I noticed a difference between your code and the original paper. The paper uses a separate embedding to get c for each story, and directly adds o and u to get the input of the prediction layer (or the u for the next layer, in the multi-hop case). In your code, c is given the same value as m rather than being recalculated, and o is multiplied by a matrix you call H before being added to u. I am wondering why you do it this way? I haven't tested the difference; will it influence the performance?
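As I understand it, this corresponds to the layer-wise (RNN-like) weight tying described in the paper, where the next query is u' = H u + o rather than u' = u + o. Below is a toy numpy sketch of one hop under that reading; all shapes and names are illustrative, not the repo's actual code.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def hop(u, m, c, H):
    """One memory hop with layer-wise weight tying:
    p = softmax(u . m_i), o = sum_i p_i c_i, next query u' = H u + o."""
    p = softmax(m @ u)   # attention over memory slots
    o = p @ c            # weighted sum of output memories
    return H @ u + o

rng = np.random.default_rng(0)
d, n = 3, 5                    # embedding size, number of memories
m = rng.normal(size=(n, d))    # input memories (from embedding A)
c = m                          # the repo's choice: c reuses m instead of its own embedding
H = rng.normal(size=(d, d))    # linear map applied to u between hops
u = rng.normal(size=d)
u_next = hop(u, m, c, H)
print(u_next.shape)            # (3,)
```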
Probably a good idea to change the name so other models can be incorporated; maybe memory_models would make more sense?
n_train/20, n_val/20, and n_test/20 cause errors in Python 3.
I modified
n_train/20 -> n_train//20
n_val/20 -> n_val//20
n_test/20 -> n_test//20
and it works.
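For reference, a sketch of the corrected batching loop using the training size from the log above: floor division guarantees range() receives an int step under Python 3.

```python
n_train = 18000                # training size reported in the log

# Python 3: / returns a float, so use // (floor division) for range()
batch_size = n_train // 20     # 900
batch_starts = list(range(0, n_train, batch_size))
print(len(batch_starts))       # 20 batches
```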
Given the intended test behaviour:
>>> tokenize('Bob dropped the apple. Where is the apple?')
['Bob', 'dropped', 'the', 'apple', '.', 'Where', 'is', 'the', 'apple', '?']
we should write it like this:
import re

def tokenize(sent):
    return re.findall(r"\w+(?:'\w+)?|[^\w\s]", sent)
https://github.com/domluna/memn2n/blob/master/joint.py#L163
(needs import pandas as pd)
m_C = tf.reduce_sum(m_emb_C * self._encoding, 2)
c_temp = tf.transpose(m_C, [0, 2, 1])
In this part, the first line with reduce_sum should turn the matrix into 2 dimensions, so I think the transposition in the second line won't work. I'm not sure if I'm getting something wrong.
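A quick shape check in numpy suggests the transpose does work, assuming m_emb_C is 4-D with shape (batch, memory_size, sentence_size, embedding_size): summing over axis 2 leaves a 3-D tensor, not a 2-D one, so the [0, 2, 1] permutation is valid. The sizes below are made up for illustration.

```python
import numpy as np

# Assumed (hypothetical) shapes: embedded memories are 4-D
batch, memory_size, sentence_size, embed = 2, 5, 11, 20
m_emb_C = np.ones((batch, memory_size, sentence_size, embed))
encoding = np.ones((sentence_size, embed))   # position encoding, broadcasts over the first two axes

m_C = (m_emb_C * encoding).sum(axis=2)       # -> (batch, memory_size, embed): still 3-D
c_temp = np.transpose(m_C, (0, 2, 1))        # -> (batch, embed, memory_size)
print(m_C.shape, c_temp.shape)
```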
I followed the instructions to run single.py,
but it fails. It would be better to add a tutorial about how and where to download the data.
$ python ./single.py
Started Task: 1
Traceback (most recent call last):
File "./single.py", line 32, in <module>
train, test = load_task(FLAGS.data_dir, FLAGS.task_id)
File "/home/tobe/code/memn2n/data_utils.py", line 14, in load_task
files = os.listdir(data_dir)
OSError: [Errno 2] No such file or directory: 'data/tasks_1-20_v1-2/en/'