seominjoon / qrn
Query-Reduction Networks (QRN)
Home Page: http://uwnlp.github.io/qrn/
License: MIT License
Hello,
I ran into the problem below when running Task 5 in babi-dialog (Tasks 1-4 work fine). I checked the code, and it seems like the loss is NaN in this case. Could you please help me with this issue?
InvalidArgumentError (see above for traceback): Nan in summary histogram for: HistogramSummary_8
[[Node: HistogramSummary_8 = HistogramSummary[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](HistogramSummary_8/tag, gpu_sync/average_gradients/Mean_8)]]
My Python version is 3.5 and my TensorFlow version is 0.11.0.
@shmsw25 @seominjoon can you help?
Config ID <absl.flags._flag.Flag object at 0x7f3fcc0c47b8>, task <absl.flags._flag.Flag object at 0x7f3fcc0c42b0>, 1 trials
Traceback (most recent call last):
File "/home/aniket/anaconda3/envs/py305/lib/python3.5/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/aniket/anaconda3/envs/py305/lib/python3.5/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/aniket/qrn/babi/main.py", line 272, in <module>
tf.app.run()
File "/home/aniket/anaconda3/envs/py305/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 126, in run
_sys.exit(main(argv))
File "/home/aniket/qrn/babi/main.py", line 181, in main
summary = _main(config, num_trials)
File "/home/aniket/qrn/babi/main.py", line 191, in _main
load_metadata(config)
File "/home/aniket/qrn/babi/main.py", line 135, in load_metadata
data_dir = os.path.join(config.data_dir, config.lang + ("-10k" if config.large else ""))
TypeError: unsupported operand type(s) for +: 'Flag' and 'str'
How can I solve this error? Please help me.
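One possible workaround, offered only as a sketch: the "Config ID <absl.flags._flag.Flag object ...>" line suggests that the config holds absl Flag objects rather than their values, so reading each flag's .value before using it should avoid the TypeError. The FakeFlag class and the unwrap_flags helper below are hypothetical stand-ins for illustration and are not part of the qrn code; the flag values are made up.

```python
import os


class FakeFlag:
    """Hypothetical stand-in for an absl Flag object; only .value is modelled."""

    def __init__(self, value):
        self.value = value


def unwrap_flags(raw):
    """Return a plain dict, replacing every flag-like object by its .value."""
    return {key: (val.value if hasattr(val, "value") else val)
            for key, val in raw.items()}


# Assumed example values; the real ones come from the command-line flags.
raw_config = {
    "data_dir": FakeFlag("data/babi"),
    "lang": FakeFlag("en"),
    "large": FakeFlag(False),
}

config = unwrap_flags(raw_config)

# The same expression as in load_metadata now works on plain strings.
data_dir = os.path.join(config["data_dir"],
                        config["lang"] + ("-10k" if config["large"] else ""))
print(data_dir)
```

Pinning TensorFlow to the version the repository was written for is the other obvious option, since the traceback shows a newer tensorflow/python/platform/app.py than the one in the other reports here.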
python3 -m babi_rnn.main --noload --task 3
qrn/my/tensorflow/rnn.py", line 459, in _assert_has_shape
return logging_ops.Assert(
AttributeError: module 'tensorflow.python.ops.logging_ops' has no attribute 'Assert'
It looks like there is an error in the babi-rnn and dialog cases.
I get the following error when training the model for bAbI dialog task 5.
The command line args used are:
python dialog/main.py --load=False --task 5 --num_epochs 2 --data_dir "data/dialog-babi-tasks" --val_period 1 --save_period 1 --train=True --draft=True
The exact error is:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Received a label value of 292 which is outside the valid range of [0, 10). Label values: 0 0 0 0 0 0 0 4 227 292 0 0 0 4 0 0 0 7 0 1 0 32 9 0 0 0 0 0 0 0 0 0
[[Node: towers/gpu_0/loss/ans_loss/SparseSoftmaxCrossEntropyWithLogits_1/SparseSoftmaxCrossEntropyWithLogits = SparseSoftmaxCrossEntropyWithLogits[T=DT_FLOAT, Tlabels=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](towers/gpu_0/class/Linear_1/out1, towers/gpu_0/loss/ans_loss/Gather_2)]]
Going through the code, I found that the answers placeholder is split into 8 pieces, each referring to a different part of the answer; see https://github.com/uwnlp/qrn/blob/master/prepro-dialog.py#L232
So we get logits for each part separately, as follows:
0 = {Tensor} Tensor("towers/gpu_0/class/Linear/out0:0", shape=(32, 15), dtype=float32, device=/device:GPU:0)
1 = {Tensor} Tensor("towers/gpu_0/class/Linear_1/out1:0", shape=(32, 10), dtype=float32, device=/device:GPU:0)
2 = {Tensor} Tensor("towers/gpu_0/class/Linear_2/out2:0", shape=(32, 10), dtype=float32, device=/device:GPU:0)
3 = {Tensor} Tensor("towers/gpu_0/class/Linear_3/out3:0", shape=(32, 4), dtype=float32, device=/device:GPU:0)
4 = {Tensor} Tensor("towers/gpu_0/class/Linear_4/out4:0", shape=(32, 3), dtype=float32, device=/device:GPU:0)
5 = {Tensor} Tensor("towers/gpu_0/class/Linear_5/out5:0", shape=(32, 674), dtype=float32, device=/device:GPU:0)
6 = {Tensor} Tensor("towers/gpu_0/class/Linear_6/out6:0", shape=(32, 645), dtype=float32, device=/device:GPU:0)
7 = {Tensor} Tensor("towers/gpu_0/class/Linear_7/out7:0", shape=(32, 2), dtype=float32, device=/device:GPU:0)
where the 2nd dimension refers to num_classes for that piece of the answer, if/when applicable. The 2nd dimension matches the dictionary sizes computed for the various answer positions when pre-processing the dataset:
<class 'list'>: [15, 10, 10, 4, 3, 674, 645, 2]
But when I run the code, it throws the error mentioned above.
I'm using TensorFlow 0.12.1, as 0.11 is deprecated now and there are no significant changes between the two releases, as per https://github.com/tensorflow/tensorflow/blob/master/RELEASE.md#release-0120
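To narrow this down, it may help to check the labels against the per-piece class counts before they reach the loss. The sketch below is plain Python and does not use the qrn code; find_out_of_range is a hypothetical helper, and the label rows are assumed example data built from the first ten values in the error message above.

```python
def find_out_of_range(labels_per_piece, num_classes):
    """Return (piece, row, label) for every label outside [0, num_classes[piece])."""
    bad = []
    for piece, (labels, n) in enumerate(zip(labels_per_piece, num_classes)):
        for row, label in enumerate(labels):
            if not 0 <= label < n:
                bad.append((piece, row, label))
    return bad


# Class counts per answer piece, as listed above.
num_classes = [15, 10, 10, 4, 3, 674, 645, 2]

# Assumed example batch: all zeros except piece 1, which holds the first
# ten label values reported in the error message.
labels = [[0] * 10 for _ in num_classes]
labels[1] = [0, 0, 0, 0, 0, 0, 0, 4, 227, 292]

bad = find_out_of_range(labels, num_classes)
print(bad)
```

Values such as 227 and 292 would be valid for the pieces with 674 or 645 classes, so one thing worth checking is whether the label columns and the logits are paired in the same order. That is only a guess from the numbers, not something confirmed in the code.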
In the implementation of QRN for bAbI dialog, it seems that the examples only include the bot (system) responses as the previous utterances (x_1, x_2, ..., x_T) in the dialog. Shouldn't it take both the user utterances and the system utterances as the previous set of utterances?
Thanks in advance.
When I try to visualize the results for the Dialog dataset after training, I get the following error:
Traceback (most recent call last):
File "/home/prayalankar/anaconda3/envs/tyu/lib/python3.5/runpy.py", line 184, in _run_module_as_main
"__main__", mod_spec)
File "/home/prayalankar/anaconda3/envs/tyu/lib/python3.5/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/prayalankar/qrn/dialog/visualize_result.py", line 174, in <module>
list_results(ARGS)
File "/home/prayalankar/qrn/dialog/visualize_result.py", line 88, in list_results
X, Q, Y, Y1, Y2, Y3, Y4, Y5, Y6, Y7 = data[:10]
ValueError: not enough values to unpack (expected 10, got 4)
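A small diagnostic sketch, in case it helps: the unpacking expects ten fields but the loaded data only has four, so naming the missing fields makes it easier to see what the saved results lack. The unpack_fields helper is hypothetical, and the four empty lists stand in for whatever the results file actually contains.

```python
def unpack_fields(data, names):
    """Map the leading entries of `data` to `names`, failing with a clear message."""
    if len(data) < len(names):
        missing = names[len(data):]
        raise ValueError("expected {} fields, got {}; missing: {}".format(
            len(names), len(data), ", ".join(missing)))
    return dict(zip(names, data))


# Field names taken from the failing line in visualize_result.py.
FIELDS = ["X", "Q", "Y", "Y1", "Y2", "Y3", "Y4", "Y5", "Y6", "Y7"]

# Assumed stand-in for the loaded data: four entries instead of ten.
loaded = [[], [], [], []]

try:
    fields = unpack_fields(loaded, FIELDS)
except ValueError as err:
    message = str(err)
    print(message)
```

Whether the fix is to save more fields during training or to make the visualizer accept fewer is not clear from the traceback alone.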
python3 -m babi_rnn.main --noload --task 3
.....
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
Traceback (most recent call last):
File "/usr/local/Cellar/python3/3.6.0_1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/local/Cellar/python3/3.6.0_1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/Users//qrn/babi_rnn/main.py", line 249, in <module>
tf.app.run()
File "/Users//ve_tf0.11_py3/venv/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "/Users//qrn/babi_rnn/main.py", line 165, in main
summary = _main(config, num_trials)
File "/Users//qrn/babi_rnn/main.py", line 217, in _main
runner.initialize()
File "/Users//qrn/babi_rnn/base_model.py", line 71, in initialize
tower.initialize()
File "/Users//qrn/babi_rnn/model.py", line 165, in initialize
sequence_length=m_length, dtype='float', num_layers=L)
File "/Users//qrn/my/tensorflow/rnn.py", line 634, in dynamic_bidirectional_rnn
time_major=time_major, feed_prev_out=feed_prev_out, scope='FW')
File "/Users//qrn/my/tensorflow/rnn.py", line 488, in dynamic_rnn
swap_memory=swap_memory, sequence_length=sequence_length, feed_prev_out=feed_prev_out)
File "/Users/qrn/my/tensorflow/rnn.py", line 606, in _dynamic_rnn_loop
swap_memory=swap_memory)
File "/Users//ve_tf0.11_py3/venv/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2518, in while_loop
result = context.BuildLoop(cond, body, loop_vars, shape_invariants)
File "/Users//ve_tf0.11_py3/venv/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2356, in BuildLoop
pred, body, original_loop_vars, loop_vars, shape_invariants)
File "/Users//ve_tf0.11_py3/venv/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2337, in _BuildLoop
_EnforceShapeInvariant(m_var, n_var)
File "/Users//ve_tf0.11_py3/venv/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 578, in _EnforceShapeInvariant
% (merge_var.name, m_shape, n_shape))
ValueError: The shape for towers/gpu_0/networks/Bi-RNN/layer_0/FW/while/Merge_3:0 is not an invariant for the loop. It enters the loop with shape (32, 91), but has shape (32, 122) after one iteration. Provide shape invariants using either the shape_invariants argument of tf.while_loop or set_shape() on the loop variables.
(venv) ali-186590cc37a5:qrn$
Everything works for me except babi-dialog task 6.
python -m prepro-dialog --task 6
python -m dialog.main --noload --task 6
Error message here:
Traceback (most recent call last):
File "/home/jason/anaconda2/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/home/jason/anaconda2/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/jason/qrn/dialog/main.py", line 281, in <module>
tf.app.run()
File "/home/jason/anaconda2/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "/home/jason/qrn/dialog/main.py", line 172, in main
summary = _main(config, num_trials)
File "/home/jason/qrn/dialog/main.py", line 238, in _main
runner.initialize()
File "dialog/base_model.py", line 65, in initialize
tower.initialize()
File "dialog/model.py", line 182, in initialize
A = Alist[0] if self.rnn else Alist[i]
IndexError: list index out of range
Can you help? Thanks :)