pbhatia243 / neural_conversation_models
Tensorflow based Neural Conversation Models
License: Apache License 2.0
Hi, I am new to deep learning.
I am trying to run train.py obtained from https://github.com/jocicmarko/ultrasound-nerve-segmentation/train.py
Can anybody help me fix this problem? The error is at the following line
in get_unet()
67 print(conv5)
68
---> 69 up6 = concatenate([(UpSampling2D(size=(2, 2))(conv5), conv4)], axis=3)
70 conv6 = Conv2D(256, (3, 3), activation='relu', padding='same')(up6)
71 conv6 = Conv2D(256, (3, 3), activation='relu', padding='same')(conv6)
Thanks in advance
python neural_conversation_model.py --train_dir ubuntu/ --en_vocab_size 60000 --size 512 --data_path ubuntu/train.tsv --dev_data ubuntu/valid.tsv --vocab_path ubuntu/60k_vocan.en --attention
Right now, it takes 3 days with a GTX 1080 (8 GB).
Running epochs
global step 372800 learning rate 0.1905 step-time 0.91 perplexity 1.00
eval: bucket 0 perplexity 15943204.21
eval: bucket 1 perplexity 12256693.99
eval: bucket 2 perplexity 4716758.96
eval: bucket 3 perplexity 46789217.08
Running epochs
global step 373200 learning rate 0.1905 step-time 1.04 perplexity 1.00
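A note on reading these numbers: assuming the script keeps the convention of the TensorFlow seq2seq tutorial it appears to be based on (an assumption, not confirmed in this thread), the logged perplexity is the exponential of the average per-token loss. Under that reading, a training perplexity of 1.00 corresponds to a loss of about 0, while the eval buckets correspond to losses of roughly 15 to 18, a gap that would be consistent with heavy overfitting. A minimal sketch of the conversion, where the function names and the overflow cap of 300 are illustrative choices:

```python
import math


def perplexity(avg_loss, cap=300.0):
    """Perplexity as exp(average per-token cross-entropy, in nats)."""
    # The cap (an assumed threshold) avoids OverflowError for huge losses.
    return math.exp(avg_loss) if avg_loss < cap else float("inf")


def loss_from_perplexity(ppl):
    """Inverse of perplexity(): recover the average loss from a logged value."""
    return math.log(ppl)


# Values copied from the log above.
train_loss = loss_from_perplexity(1.00)
eval_losses = [
    loss_from_perplexity(p)
    for p in (15943204.21, 12256693.99, 4716758.96, 46789217.08)
]

print("train loss:", train_loss)
print("eval losses:", [round(l, 2) for l in eval_losses])
```

If this reading is right, comparing the two losses directly is often easier than comparing perplexities, since the exponential exaggerates the gap.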
rzai@rzai00:/prj/Neural_Conversation_Models$ python neural_conversation_model.py --train_dir ubuntu/ --en_vocab_size 60000 --size 512 --data_path ubuntu/train.tsv --dev_data ubuntu/valid.tsv --vocab_path ubuntu/60k_vocan.en --attention --decode --beam_search --beam_size 25
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.so locally
I tensorflow/core/common_runtime/gpu/gpu_device.cc:951] Found device 0 with properties:
name: GeForce GTX 1080
major: 6 minor: 1 memoryClockRate (GHz) 1.7335
pciBusID 0000:02:00.0
Total memory: 7.92GiB
Free memory: 7.81GiB
W tensorflow/stream_executor/cuda/cuda_driver.cc:572] creating context when one is currently active; existing: 0x30e25d0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:951] Found device 1 with properties:
name: GeForce GTX 1080
major: 6 minor: 1 memoryClockRate (GHz) 1.7335
pciBusID 0000:01:00.0
Total memory: 7.92GiB
Free memory: 7.33GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:972] DMA: 0 1
I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] 0: Y Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] 1: Y Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:02:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:1) -> (device: 1, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
WARNING:tensorflow:tf.op_scope(values, name, default_name) is deprecated, use tf.name_scope(name, default_name, values)
Attention Model
Symbols
60000
60000
Tensor("model_with_buckets/embedding_attention_seq2seq/concat:0", shape=(?, 5, 512), dtype=float32)
Check number of symbols
60000
Initial_state
Traceback (most recent call last):
File "neural_conversation_model.py", line 323, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "neural_conversation_model.py", line 318, in main
decode()
File "neural_conversation_model.py", line 230, in decode
model = create_model(sess, True, beam_search=beam_search, beam_size=beam_size, attention=attention)
File "neural_conversation_model.py", line 104, in create_model
forward_only=forward_only, beam_search=beam_search, beam_size=beam_size, attention=attention)
File "/home/rzai/prj/Neural_Conversation_Models/seq2seq_model.py", line 138, in __init__
softmax_loss_function=softmax_loss_function)
File "/home/rzai/prj/Neural_Conversation_Models/my_seq2seq.py", line 1044, in decode_model_with_buckets
decoder_inputs[:bucket[1]])
File "/home/rzai/prj/Neural_Conversation_Models/seq2seq_model.py", line 137, in <lambda>
self.target_weights, buckets, lambda x, y: seq2seq_f(x, y, True),
File "/home/rzai/prj/Neural_Conversation_Models/seq2seq_model.py", line 101, in seq2seq_f
beam_size=beam_size )
File "/home/rzai/prj/Neural_Conversation_Models/my_seq2seq.py", line 839, in embedding_attention_seq2seq
initial_state_attention=initial_state_attention, beam_search=beam_search, beam_size=beam_size)
File "/home/rzai/prj/Neural_Conversation_Models/my_seq2seq.py", line 756, in embedding_attention_decoder
initial_state_attention=initial_state_attention, output_projection=output_projection, beam_size=beam_size)
File "/home/rzai/prj/Neural_Conversation_Models/my_seq2seq.py", line 600, in beam_attention_decoder
state_size = int(initial_state.get_shape().with_rank(2)[1])
AttributeError: 'tuple' object has no attribute 'get_shape'
rzai@rzai00:
Hi, I'm trying to get access to the logit scores of the reply variants produced by the beam search. I was expecting the variable 'output_logits' returned by the model.step function in neural_conversation_model.py ( https://github.com/pbhatia243/Neural_Conversation_Models/blob/master/neural_conversation_model.py#L253 ) to contain this information, but this doesn't seem to be the case in the block where beam search is turned on.
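For context, here is a generic sketch of how a beam search can carry a cumulative log-probability per hypothesis alongside back-pointers and chosen symbols. It is not the repository's implementation: `beam_step`, `backtrack`, and the toy probabilities are all hypothetical, and it only illustrates what kind of per-hypothesis score one might want the decoder to expose.

```python
import math


def beam_step(log_probs_per_beam, beam_scores, beam_size):
    """Expand every hypothesis by every token and keep the best beam_size.

    log_probs_per_beam[b][v] is log P(token v | hypothesis b).
    Returns (scores, parents, symbols) for the surviving hypotheses.
    """
    candidates = []
    for b, log_probs in enumerate(log_probs_per_beam):
        for v, lp in enumerate(log_probs):
            candidates.append((beam_scores[b] + lp, b, v))
    candidates.sort(key=lambda c: c[0], reverse=True)
    top = candidates[:beam_size]
    return [c[0] for c in top], [c[1] for c in top], [c[2] for c in top]


def backtrack(parents_per_step, symbols_per_step, beam_index):
    """Follow back-pointers from the last step to recover one token sequence."""
    tokens = []
    cur = beam_index
    for parents, symbols in zip(reversed(parents_per_step),
                                reversed(symbols_per_step)):
        tokens.append(symbols[cur])
        cur = parents[cur]
    return tokens[::-1]


def logs(probs):
    return [math.log(p) for p in probs]


# Toy run: vocabulary of 3 tokens, beam size 2, two decoding steps.
scores, parents1, symbols1 = beam_step([logs([0.6, 0.3, 0.1])], [0.0], 2)
scores, parents2, symbols2 = beam_step(
    [logs([0.1, 0.2, 0.7]), logs([0.9, 0.05, 0.05])], scores, 2)

best = backtrack([parents1, parents2], [symbols1, symbols2], 0)
second = backtrack([parents1, parents2], [symbols1, symbols2], 1)
print(best, math.exp(scores[0]))
print(second, math.exp(scores[1]))
```

In this sketch the `scores` list after the last step is the quantity asked about: one cumulative log-probability per surviving reply.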
I'm using Windows 10 with Python 3.x and TensorFlow 0.12.
It turns out that in the function beam_attention_decoder, the statement
outputs.append(tf.argmax(nn_ops.xw_plus_b( output, output_projection[0], output_projection[1]), dimension=1))
should be outputs.append(output); otherwise the shapes don't match in tf.nn.sampled_softmax_loss.
Even after I fixed this, the code still fails in model.step. I found that prev in beam_attention_decoder had shape (batch_size, input_size), which was (32, 32). But after the loop_function it had shape (beam_size, input_size), i.e. (10, 32), so the run failed.
Is this code really usable?
if beam_search:
    self.outputs, self.beam_path, self.beam_symbol = decode_model_with_buckets(
        self.encoder_inputs, self.decoder_inputs, targets,
        self.target_weights, buckets, lambda x, y: seq2seq_f(x, y, True),
        softmax_loss_function=softmax_loss_function)
else:
    # print self.decoder_inputs
    self.outputs, self.losses = model_with_buckets(
        self.encoder_inputs, self.decoder_inputs, targets,
        self.target_weights, buckets, lambda x, y: seq2seq_f(x, y, True),
        softmax_loss_function=softmax_loss_function)
    # If we use output projection, we need to project outputs for decoding.
    if output_projection is not None:
        for b in xrange(len(buckets)):
            self.outputs[b] = [
                tf.matmul(output, output_projection[0]) + output_projection[1]
                for output in self.outputs[b]
            ]
I updated tensorflow (now tensorflow-gpu==0.12.0rc0) and now I'm unable to use beam search while decoding. It throws this error:
Traceback (most recent call last):
File "neural_conversation_model.py", line 323, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv[:1] + flags_passthrough))
File "neural_conversation_model.py", line 318, in main
decode()
File "neural_conversation_model.py", line 230, in decode
model = create_model(sess, True, beam_search=beam_search, beam_size=beam_size, attention=attention)
File "neural_conversation_model.py", line 104, in create_model
forward_only=forward_only, beam_search=beam_search, beam_size=beam_size, attention=attention)
File "/home/ubuntu/reminders_dump/seq_to_seq/seq2seq_model.py", line 138, in __init__
softmax_loss_function=softmax_loss_function)
File "/home/ubuntu/reminders_dump/seq_to_seq/my_seq2seq.py", line 1044, in decode_model_with_buckets
decoder_inputs[:bucket[1]])
File "/home/ubuntu/reminders_dump/seq_to_seq/seq2seq_model.py", line 137, in <lambda>
self.target_weights, buckets, lambda x, y: seq2seq_f(x, y, True),
File "/home/ubuntu/reminders_dump/seq_to_seq/seq2seq_model.py", line 112, in seq2seq_f
beam_size=beam_size )
File "/home/ubuntu/reminders_dump/seq_to_seq/my_seq2seq.py", line 365, in embedding_rnn_seq2seq
feed_previous=feed_previous, beam_search=beam_search, beam_size=beam_size)
File "/home/ubuntu/reminders_dump/seq_to_seq/my_seq2seq.py", line 299, in embedding_rnn_decoder
loop_function=loop_function,output_projection=output_projection, beam_size=beam_size)
File "/home/ubuntu/reminders_dump/seq_to_seq/my_seq2seq.py", line 203, in beam_rnn_decoder
state_size = int(initial_state.get_shape().with_rank(2)[1])
AttributeError: 'tuple' object has no attribute 'get_shape'
Maybe this is related to tensorflow/tensorflow#3056
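The traceback is what one would see when the decoder's initial state is a tuple (for example an LSTM (c, h) pair) rather than a single 2-D tensor, since a plain tuple has no get_shape method. One workaround often used with TensorFlow versions of that era was to build the cells with state_is_tuple=False; whether that suits this repository is not confirmed here. The sketch below reproduces the failure without TensorFlow and shows a helper that tolerates tuple states. `FakeTensor` and `state_width` are hypothetical names, and summing the widths is only one possible interpretation of "state size".

```python
class FakeTensor:
    """Stand-in for a 2-D tensor; it only mimics get_shape() (hypothetical)."""

    def __init__(self, batch, width):
        self._shape = (batch, width)

    def get_shape(self):
        return self._shape


def state_width(state):
    """Sum the second dimension over a tensor or a (nested) tuple of tensors."""
    if isinstance(state, (tuple, list)):
        return sum(state_width(part) for part in state)
    return int(state.get_shape()[1])


# An LSTM-style state: a (c, h) pair instead of one concatenated tensor.
lstm_state = (FakeTensor(32, 512), FakeTensor(32, 512))

try:
    lstm_state.get_shape()  # same call pattern as in the traceback
    error_message = ""
except AttributeError as err:
    error_message = str(err)

width = state_width(lstm_state)
print(error_message)
print(width)
```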
mldl@mldlUB1604:/media/mldl/data1t/os_prj/Neural_Conversation_Models$ sudo python neural_conversation_model.py --train_dir ubuntu/ --en_vocab_size 60000 --size 512 --data_path ubuntu/train.tsv --dev_data ubuntu/valid.tsv --vocab_path ubuntu/60k_vocan.en --attention --decode --beam_search --beam_size 25
[sudo] password for mldl:
2017-02-03 02:41:44: I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcublas.so.8.0 locally
2017-02-03 02:41:44: I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcudnn.so.5 locally
2017-02-03 02:41:44: I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcufft.so.8.0 locally
2017-02-03 02:41:44: I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcuda.so.1 locally
2017-02-03 02:41:44: I tensorflow/stream_executor/dso_loader.cc:125] successfully opened CUDA library libcurand.so.8.0 locally
Traceback (most recent call last):
File "neural_conversation_model.py", line 29, in <module>
from seq2seq_model import *
File "/media/mldl/data1t/os_prj/Neural_Conversation_Models/seq2seq_model.py", line 12, in <module>
from my_seq2seq import *
File "/media/mldl/data1t/os_prj/Neural_Conversation_Models/my_seq2seq.py", line 41, in <module>
from tensorflow.python.ops import rnn_cell
ImportError: cannot import name rnn_cell
mldl@mldlUB1604:/media/mldl/data1t/os_prj/Neural_Conversation_Models$
Could you please update the project with license clarification as I don't see any license terms on the project page. Thanks, Shobana
else:
    if beam_search:
        return outputs[0], outputs[1], outputs[2:]  # No gradient norm, loss, outputs.
I have a question here: why are you returning 3 values when the comment says just 2 values?
Hi,
Thanks for sharing the code. I have some questions.
Does this model handle common issues with seq2seq chatbots, such as generic responses (e.g. 'Okay', 'No', 'Yes') and inconsistent replies to paraphrased contexts that have the same meaning?
For example:
Q-where do you live now?
A-I live in Chicago.
Q-In which country do you live now?
A-Iceland, you?
The replies are inconsistent.
Are you handling the above issues? If not, what are your thoughts on how we can handle them?
These are definitely not issues with the code; I just have no other way to ask you these questions.
Thanks,
muna
Hi everyone,
I trained the model for a few epochs and tried to run it in interactive mode.
I used this command
python neural_conversation_model.py --train_dir ubuntu/ --en_vocab_size 60000 --size 512 --data_path ubuntu/train.tsv --dev_data ubuntu/valid.tsv --vocab_path ubuntu/60k_vocan.en --attention --decode --beam_search --beam_size 25
But I got this error
state_size = int(initial_state.get_shape().with_rank(2)[1])
AttributeError: 'tuple' object has no attribute 'get_shape'
Do you have any idea why?
Thanks!
Cool code. In beam_attention_decoder, I get an error at line 615
s = math_ops.reduce_sum(
v[a] * math_ops.tanh(hidden_features[a] + y), [2, 3])
It looks like maybe y takes the beams into account but hidden_features[a] does not.
The stack trace looks like
File "/Users/jmugan/Box Sync/workspace/DeepLearning/Git/21CT_Translate/ObjectTranslatorTF/ncm_seq2seq.py", line 615, in attention
v[a] * math_ops.tanh(hidden_features[a] + y), [2, 3])
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/math_ops.py", line 518, in binary_op_wrapper
return func(x, y, name=name)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_math_ops.py", line 44, in add
return _op_def_lib.apply_op("Add", x=x, y=y, name=name)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/op_def_library.py", line 655, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2156, in create_op
set_shapes_for_outputs(ret)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1612, in set_shapes_for_outputs
shapes = shape_func(op)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/math_ops.py", line 1390, in _BroadcastShape
% (shape_x, shape_y))
ValueError: Incompatible shapes for broadcasting: (32, 2, 1, 300) and (320, 1, 1, 300)
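The message can be reproduced without TensorFlow. Below is a small sketch of the right-aligned broadcasting rule (the helper name `broadcast_shape` is hypothetical). It shows that the two shapes from the trace clash in the leading dimension, 32 versus 320, which would fit the guess that one operand was expanded per beam and the other was not (reading 320 as 32 × 10, batch size times beam size, is an assumption).

```python
def broadcast_shape(shape_x, shape_y):
    """Return the broadcast result shape, or raise ValueError if incompatible."""
    # Right-align the shapes by padding the shorter one with leading 1s.
    n = max(len(shape_x), len(shape_y))
    x = (1,) * (n - len(shape_x)) + tuple(shape_x)
    y = (1,) * (n - len(shape_y)) + tuple(shape_y)
    result = []
    for dx, dy in zip(x, y):
        if dx == dy or dy == 1:
            result.append(dx)
        elif dx == 1:
            result.append(dy)
        else:
            raise ValueError(
                "Incompatible shapes for broadcasting: %s and %s"
                % (tuple(shape_x), tuple(shape_y)))
    return tuple(result)


# Shapes copied from the stack trace above.
try:
    broadcast_shape((32, 2, 1, 300), (320, 1, 1, 300))
    compatible = True
except ValueError as err:
    compatible = False
    print(err)

# If both operands shared the leading dimension, the add would broadcast.
tiled = broadcast_shape((320, 2, 1, 300), (320, 1, 1, 300))
print(tiled)
```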
hello
Replies --------------------------------------->
pcm-2 socah pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 pcm-2
pcm-2 pcm-2 socah pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 pcm-2
pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 pcm-2
pcm-2 socah socah pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 pcm-2
pcm-2 socah pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 lng
pcm-2 socah socah pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 lng
pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 lng
pcm-2 pcm-2 socah pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 lng
reborn socah pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 pcm-2
reborn pcm-2 socah pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 pcm-2
reborn pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 pcm-2
reborn socah socah pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 pcm-2
pcm-2 socah socah socah pcm-2 pcm-2 pcm-2 pcm-2 pcm-2
pcm-2 socah pcm-2 socah pcm-2 pcm-2 pcm-2 pcm-2 pcm-2
pcm-2 pcm-2 pcm-2 socah pcm-2 pcm-2 pcm-2 pcm-2 pcm-2
pcm-2 pcm-2 socah socah pcm-2 pcm-2 pcm-2 pcm-2 pcm-2
reborn pcm-2 socah pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 lng
pcm-2 pcm-2 socah pcm-2 pcm-2 pcm-2 pcm-2 lng pcm-2
reborn socah pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 lng
pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 lng pcm-2
reborn socah socah pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 lng
reborn pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 lng
pcm-2 socah pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 lng pcm-2
pcm-2 socah socah pcm-2 pcm-2 pcm-2 pcm-2 lng pcm-2
appended pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 pcm-2 pcm-2
Hey.
Could you please give us some examples of the outputs your code generates?
I'm running on tensorflow 1.2.0-rc2 and got the following error:
File "......\my_seq2seq.py", line 48, in <module>
from tensorflow.python.ops.rnn_cell import _linear as linear
ImportError: cannot import name '_linear'
It also reported:
AttributeError: module 'tensorflow.python.ops.rnn_cell' has no attribute 'linear'
While trying to decode a model (trained using https://github.com/cmusphinx/g2p-seq2seq) with beam search, I get the following error:
Replies --------------------------------------->
Traceback (most recent call last):
File "neural_conversation_model.py", line 323, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv))
File "neural_conversation_model.py", line 318, in main
decode()
File "neural_conversation_model.py", line 275, in decode
rec = " ".join([tf.compat.as_str(rev_vocab[output]) for output in foutputs])
IndexError: list index out of range
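An IndexError at this line means some decoded id is not a valid index into rev_vocab. One plausible cause, offered only as a guess, is that the model was built with a larger vocabulary size than the vocabulary file that was loaded. The sketch below shows a defensive lookup that maps out-of-range ids to an unknown-word token instead of crashing; `ids_to_tokens` and the "_UNK" placeholder are illustrative assumptions, not code from the repository.

```python
def ids_to_tokens(ids, rev_vocab, unk_token="_UNK"):
    """Map token ids to strings, replacing out-of-range ids with unk_token."""
    tokens = []
    for token_id in ids:
        # Reject negative ids too: Python would otherwise silently wrap around.
        if 0 <= token_id < len(rev_vocab):
            tokens.append(rev_vocab[token_id])
        else:
            tokens.append(unk_token)
    return tokens


# Toy vocabulary; a real one would be read from the vocabulary file.
rev_vocab = ["_PAD", "_GO", "_EOS", "_UNK", "hello"]
decoded = ids_to_tokens([4, 7, -1], rev_vocab)
print(" ".join(decoded))
```

Logging the offending ids together with len(rev_vocab) before substituting would help confirm whether a vocabulary size mismatch is really the cause.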