martin-gorner / tensorflow-rnn-shakespeare
Code from the "Tensorflow and deep learning - without a PhD, Part 2" session on Recurrent Neural Networks.
License: Apache License 2.0
I receive this error:
ValueError: A logdir must be specified when db is not specified. Run
tensorboard --help for details and examples.
when I enter the command:
tensorboard --log-dir=log
How can I fix this?
When I run rnn_train.py, I get the following error:
Traceback (most recent call last):
File "/tensorflow-rnn-shakespeare/rnn_train.py", line 148, in <module>
txt.print_learning_learned_comparison(x, y, l, bookranges, bl, acc, epoch_size, step, epoch)
File "/tensorflow-rnn-shakespeare/my_txtutils.py", line 180, in print_learning_learned_comparison
footer = format_string.format('INDEX', 'BOOK NAME', 'TRAINING SEQUENCE', 'PREDICTED SEQUENCE', 'LOSS')
ValueError: Invalid conversion specification
I use Python 2.7.6 and tensorflow 1.1.0 on Ubuntu 14.04. How can I fix this? Any reply will be very much appreciated.
Hi Martin,
Thanks for sharing this fun project. I downloaded the checkpoints. rnn_play.py has now been running for about 2 hours (Windows 10, Python 3.5, TensorFlow 1.1). My computer took less than 30 minutes to run the TensorFlow MNIST deep learning code. Is something wrong, and how long should rnn_play.py take to run?
I just downloaded for fun, not a programmer, but I get:
File "rnn_train.py", line 176
print(chr(txt.convert_to_alphabet(rc)), end="")
^
SyntaxError: invalid syntax
When trying to run the training file.
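This SyntaxError is what a Python 2 interpreter reports for the Python 3 style call print(..., end=""), so the script is most likely being run with Python 2. Running it with python3 avoids the problem. As a minimal sketch, assuming the rest of the script is otherwise Python 2 compatible (not verified here), the Python 3 print function can also be enabled at the top of the file. The emit helper below is hypothetical and exists only for illustration.

```python
# Must be the first statement in the file; it has no effect on Python 3.
from __future__ import print_function

import io


def emit(chars, stream):
    """Write characters one at a time without newlines (hypothetical helper)."""
    for c in chars:
        print(c, end="", file=stream)


buf = io.StringIO()
emit(u"To be", buf)
result = buf.getvalue()
```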
I have a question about the level at which the library trains on text:
Is it word level or character level?
If it is word level, why are some of the generated words incorrect (based on my dataset)?
Thanks
Hello Martin, thanks a lot for the awesome videos and resources.
Currently I can get my train file to run and pick up the books. However, when printing what it has read so far, it stops with the following error:
Traceback (most recent call last):
File "C:\Users\Pc\Desktop\ensorflow-rnn-shakespeare-master\rnn_train.py", line 150, in <module>
txt.print_learning_learned_comparison(x, y, l, bookranges, bl, acc, epoch_size, step, epoch)
File "C:\Users\Pc\Desktop\ensorflow-rnn-shakespeare-master\my_txtutils.py", line 175, in print_learning_learned_comparison
print(footer)
File "C:\Users\Pc\AppData\Local\Programs\Python\Python36\lib\encodings\cp1252.py", line 19, in encode
return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-7: character maps to <undefined>
Can you please give me a hand to clear this up so I can continue learning this awesome content?
German G
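This kind of UnicodeEncodeError means the console encoding (cp1252 here) cannot represent some of the characters being printed, most likely the box-drawing characters in the training progress table. The sketch below is one possible workaround, not the repository's own fix: a hypothetical helper, printable, that replaces unencodable characters before printing. Setting the standard PYTHONIOENCODING environment variable to utf-8 before launching Python is another common option.

```python
import sys


def printable(text, encoding=None):
    """Return text with characters the target encoding cannot represent replaced by '?'."""
    # Fall back to ASCII when the stream does not report an encoding.
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    return text.encode(encoding, errors="replace").decode(encoding)


row = u"\u2502 1kinghenryiv \u2502"   # U+2502 is a box-drawing vertical bar
safe_row = printable(row, "cp1252")
print(safe_row)
```

Wrapping the strings passed to print() in my_txtutils.py with such a helper would trade the table borders for question marks on consoles that cannot display them.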
Following up on our Twitter chat: I want to save a trained model from my text as a TensorFlow SavedModel. It'd be especially cool if that SavedModel can be accepted by tfjs-converter and run in the browser.
How I tried to export the model:
builder = tf.saved_model.builder.SavedModelBuilder('./sample')
with tf.Session() as sess:
    # current code
    ...
    ...
    builder.add_meta_graph_and_variables(sess, [tf.saved_model.tag_constants.TRAINING])
builder.save()
I was able to see and export one tag-set (train) but it came with no SignatureDefs / MetaGraphDef tags, which is what I'm supposed to select in this process: https://github.com/tensorflow/tfjs-converter
In "TensorFlow and Deep Learning without a PhD, Part 2" you talked about an RNN reading "Michel C was born in Paris, France...." and then predicting "his mother tongue". Can that be done with a model similar to this example? Is it suitable for a question-answering model?
I'm sorry for adding this here. I saw your demo at Next Tel Aviv and I didn't understand whether it is the same kind of problem.
This isn't an issue but a question on model understanding - please let me know if I should raise this somewhere else.
When training, we input a string of characters (length SEQLEN) and predict the next character one at a time. Our prediction at each step is a softmax, i.e. the probabilities that the prediction is each of the ALPHASIZE characters. During training, we take the arg max of this distribution and calculate accuracy by comparing that prediction with the ground truth. However, accuracy plateaus at ~65%, and if we look at the predictions they are not fluent English (with 35% of characters being wrong), even after many epochs.
For inference, we start with a random input (some character) and generate characters one at a time, using each generated character as the input to the next time step. Here, our prediction is not the arg max of the softmax distribution; instead we randomly choose from the 'top n' probabilities (the 'sample_from_probabilities' function in my_txtutils.py). Because of this, at inference time the same weights that couldn't produce fluent English in training (and reached only 65% accuracy) can produce completely fluent English words and phrases, even after a few batches. What's the reason for this difference?
I thought the intention of 'sample_from_probabilities' is just to introduce randomness, so we can generate lots of different samples. However, arg max doesn't generate fluent English while 'sample_from_probabilities' does, so I'm confused how it does this.
Please let me know if I can clarify or if I've misunderstood anything.
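The two selection rules described above can be sketched in plain Python. This is an illustration only, not the code of sample_from_probabilities in my_txtutils.py; the name sample_top_n and the example probabilities are assumptions.

```python
import random


def sample_top_n(probabilities, topn, rng=random):
    """Pick an index at random among the topn most probable, weighted by probability."""
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: probabilities[i], reverse=True)
    kept = ranked[:topn]
    weights = [probabilities[i] for i in kept]   # choices() renormalises the weights
    return rng.choices(kept, weights=weights, k=1)[0]


probs = [0.05, 0.50, 0.30, 0.15]

# Arg max, as used for the accuracy metric during training.
greedy = max(range(len(probs)), key=lambda i: probs[i])

# Top-n sampling, as used when generating text.
draws = [sample_top_n(probs, topn=2) for _ in range(1000)]
```

With topn=1 the sampler reduces to arg max. Note also that the two measurements are not directly comparable: accuracy scores each prediction against one specific ground-truth character, whereas generated text only has to be plausible, not match any particular reference.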
I can successfully train on a different corpus with rnn_train.py and get these files in /checkpoints:
Unfortunately I am unable to use the saved checkpoint with rnn_play.py.
I changed the filepaths to the .meta and .data files above in rnn_play.py but get this error:
DataLossError (see above for traceback): Unable to open table file .\rnn_train_1487755124-1500000.data-00000-of-00001: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
I already checked GitHub and SO for possible answers but couldn't solve it that way.
How can I fix this? Any help is very much appreciated.
I am not quite sure if this is possible, I might just not have understood the doc well.
So is it possible to continue training from a previous checkpoint?
I couldn't figure out how to modify rnn_play.py to work with checkpoints saved via rnn_train_stateistuple.py
The feed_dict needs a different initial state instead of 'Hin:0': h
but I don't know how to get the equivalent of zerostate = multicell.zero_state()
When I run this example I get the following error:
ValueError: Attempt to reuse RNNCell <tensorflow.contrib.rnn.python.ops.core_rnn_cell_impl.GRUCell object at 0x115eb2630> with a different variable scope than its first use. First use of cell was with scope 'rnn/multi_rnn_cell/cell_0/gru_cell', this attempt is with scope 'rnn/multi_rnn_cell/cell_1/gru_cell'. Please create a new instance of the cell if you would like it to use a different set of weights. If before you were using: MultiRNNCell([GRUCell(...)] * num_layers), change to: MultiRNNCell([GRUCell(...) for _ in range(num_layers)]). If before you were using the same cell instance as both the forward and reverse cell of a bidirectional RNN, simply create two instances (one for forward, one for reverse). In May 2017, we will start transitioning this cell's behavior to use existing stored weights, if any, when it is called with scope=None (which can lead to silent model degradation, so this error will remain until then.)
I tried updating line 79 in rnn_train.py
to:
multicell = rnn.MultiRNNCell([dropcell for _ in range(NLAYERS)], state_is_tuple=False)
but that did not change anything.
I am running TensorFlow 1.1.0 on mac with Python 3.5 in a conda environment.
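A likely reason the change had no effect is that the comprehension still repeats the one existing dropcell object, which is exactly what [dropcell] * NLAYERS does. The error message asks for a new cell instance per layer, i.e. the constructor call has to happen inside the comprehension. The sketch below demonstrates the difference with a hypothetical stand-in class, Cell, rather than real TensorFlow cells.

```python
class Cell(object):
    """Hypothetical stand-in for an RNN cell object such as a GRUCell."""

    def __init__(self, size):
        self.size = size


NLAYERS = 3
shared = Cell(512)

# Both of these repeat the SAME object NLAYERS times.
by_multiplication = [shared] * NLAYERS
by_comprehension = [shared for _ in range(NLAYERS)]

# Calling the constructor inside the comprehension builds a new object per layer.
fresh = [Cell(512) for _ in range(NLAYERS)]

same_ids = len({id(c) for c in by_comprehension})
fresh_ids = len({id(c) for c in fresh})
```

Applied to rnn_train.py, this would mean constructing the GRUCell and its DropoutWrapper inside the comprehension, following the pattern the error message itself gives: MultiRNNCell([GRUCell(...) for _ in range(num_layers)]).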
Hello,
After running
python rnn_train.py
I have the error
UnicodeEncodeError : 'charmap' codec can't encode character '\u2502' in position 44 : character maps to <undefined>
I tried a lot of things, but I am a beginner and can't resolve this issue...
rnn_play.py generates great text, but I have the following questions:
1- Can I provide seed text (a word or several words) so that text is generated based on it?
2- If the answer to question 1 is no, what evaluation method can be used to evaluate the generated text?
To explain further: I want to generate, for example, an advertising campaign
based on an advertising campaign dataset. How can I evaluate the generated text without seed text?
Thank you
Can you provide an example on how to freeze the shakespeare-rnn? It would be nice to save the model into a .pb file.
I am hoping to use this in an application where the algorithm completes a sentence for you. Where in rnn_play.py is the seed text given?
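As a general sketch of how sentence completion with a character-level recurrent model is usually done: feed the seed text one character at a time to warm up the state, then feed each generated character back in. This does not show where rnn_play.py sets its starting character. The step callable, the generate function, and toy_step are all hypothetical; in a real setup, step would wrap one run of the trained network.

```python
def generate(step, seed, length, initial_state=None):
    """Prime a character model with seed text, then generate `length` more characters.

    `step(char, state)` is a hypothetical callable returning (next_char, new_state).
    """
    if not seed:
        raise ValueError("seed must contain at least one character")
    state = initial_state
    next_char = None
    for c in seed:
        # Warm up the state on the seed; only the last prediction is kept.
        next_char, state = step(c, state)
    out = []
    for _ in range(length):
        # Feed each generated character back in as the next input.
        out.append(next_char)
        next_char, state = step(next_char, state)
    return seed + "".join(out)


def toy_step(char, state):
    """Toy stand-in for a model: always predicts the next letter of the alphabet."""
    return chr((ord(char) - ord("a") + 1) % 26 + ord("a")), state


completed = generate(toy_step, "abc", 4)
```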
I can easily understand Part 1, which recognizes MNIST handwritten digits with up to 99.51% accuracy. I enjoyed experimenting with all the tips: learning rate, dropout, up to batch normalization. But I can't see what Part 2 is doing at all. I would appreciate it if anyone could explain it briefly.
Hi Martin,
first of all - thanks for the great job on the TF materials! Code, YT, slides, etc.. Very high quality and very well presented. IMHO best high-level TF materials available.
All the MNIST convolutional examples from Part 1 work flawlessly, but unfortunately the RNN training won't start, which I suspect is due to a library inconsistency introduced by some recent upgrades. This affects my_txtutils.py
Btw, the rnn_play with downloaded checkpoints works perfectly.
Any clue of a quick fix to start training?
thx
commit: 5c3b931
(outrun) pawel@paweldebian:tensorflow-rnn-shakespeare$ python3 rnn_train.py
Traceback (most recent call last):
File "rnn_train.py", line 51, in <module>
codetext, valitext, bookranges = txt.read_data_files(shakedir, validation=True)
File "/home/pawel/M/outrun/tensorflow-rnn-shakespeare/my_txtutils.py", line 247, in read_data_files
shakelist = glob.glob(directory, recursive=True)
TypeError: glob() got an unexpected keyword argument 'recursive'
(outrun) pawel@paweldebian:tensorflow-rnn-shakespeare$ python3 --version
Python 3.4.2
(outrun) pawel@paweldebian:tensorflow-rnn-shakespeare$ git log | head -6
commit 5c3b9313b023a15d3f3ab786617c79b60cd043a1
Merge: 43b83b6 9a7fd70
Author: Martin Görner
Date: Tue May 23 10:59:56 2017 +0200
Updates for TF 1.1
(outrun) pawel@paweldebian:tensorflow-rnn-shakespeare$
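The recursive keyword of glob.glob was added in Python 3.5, which is why it fails on Python 3.4.2. If the pattern passed in does not use "**", simply dropping recursive=True may be enough. Otherwise, a fallback based on os.walk works on older versions. The sketch below is one such fallback under those assumptions; find_files is a hypothetical helper and not part of my_txtutils.py.

```python
import fnmatch
import os
import tempfile


def find_files(root, pattern="*.txt"):
    """Recursively list files under root whose names match pattern (works before Python 3.5)."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in fnmatch.filter(filenames, pattern):
            matches.append(os.path.join(dirpath, name))
    return sorted(matches)


# Small self-contained demonstration on a temporary directory tree.
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, "plays"))
for rel in ("hamlet.txt", os.path.join("plays", "macbeth.txt"), "notes.md"):
    with open(os.path.join(demo_root, rel), "w") as f:
        f.write("x")

found = [os.path.relpath(p, demo_root) for p in find_files(demo_root)]
```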
After cloning the repo and running python3 rnn_train.py
I get the following error:
Traceback (most recent call last):
File "rnn_train.py", line 148, in <module>
txt.print_learning_learned_comparison(x, y, l, bookranges, bl, acc, epoch_size, step, epoch)
File "/global/project/projectdirs/m1532/rafael/note_generator/my_txtutils.py", line 159, in print_learning_learned_comparison
print(print_string.format(decx, decy, loss_string))
UnicodeEncodeError: 'ascii' codec can't encode character '\u2502' in position 28: ordinal not in range(128)
I am using tensorflow version 1.8.
I've confirmed with another person that running this on Windows hits this error, so it isn't just me. It is probably a TensorFlow issue, since it works on Ubuntu, but I thought I'd create this issue in case others hit it.
tensorflow-rnn-shakespeare>python rnn_train.py
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:128] successfully opened CUDA library cublas64_80.dll locally
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:128] successfully opened CUDA library cudnn64_5.dll locally
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:128] successfully opened CUDA library cufft64_80.dll locally
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:128] successfully opened CUDA library nvcuda.dll locally
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\dso_loader.cc:128] successfully opened CUDA library curand64_80.dll locally
Loading file shakespeare\1kinghenryiv.txt
Loading file shakespeare\1kinghenryvi.txt
...snip...
Loading file shakespeare\winterstale.txt
Training text size is 4.90MB with 142.38KB set aside for validation. There will be 1712 batches per epoch
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:885] Found device 0 with properties:
name: GeForce GTX 1070
major: 6 minor: 1 memoryClockRate (GHz) 1.7845
pciBusID 0000:01:00.0
Total memory: 8.00GiB
Free memory: 6.68GiB
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:906] DMA: 0
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:916] 0: Y
I c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0)
E c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\common_runtime\gpu\gpu_device.cc:586] Could not identify NUMA node of /job:localhost/replica:0/task:0/gpu:0, defaulting to 0. Your kernel may not have been built with NUMA support.
E c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\cuda\cuda_blas.cc:372] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
W c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\stream.cc:1390] attempting to perform BLAS operation using StreamExecutor without BLAS support
Traceback (most recent call last):
File "C:\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1021, in _do_call
return fn(*args)
File "C:\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1003, in _run_fn
status, run_metadata)
File "C:\Anaconda3\lib\contextlib.py", line 66, in __exit__
next(self.gen)
File "C:\Anaconda3\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 469, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InternalError: Blas SGEMM launch failed : a.shape=(100, 610), b.shape=(610, 1024), m=100, n=1024, k=610
[[Node: RNN/while/MultiRNNCell/Cell0/GRUCell/Gates/Linear/MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/gpu:0"](RNN/while/MultiRNNCell/Cell0/GRUCell/Gates/Linear/concat, RNN/while/MultiRNNCell/Cell0/GRUCell/Gates/Linear/MatMul/Enter)]]
[[Node: Y/_19 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_2756_Y", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
(errors continue)
Looks like cuda_blas.cc:372] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
is the first error.
I'm running on Windows 10, Anaconda3 4.2.0, Python 3.5.2, tensorflow 0.12.0.rc0
Traceback (most recent call last):
File "rnn_train.py", line 76, in <module>
onecell = rnn.GRUCell(INTERNALSIZE)
AttributeError: module 'tensorflow.contrib.rnn' has no attribute 'GRUCell'
Hi! I've added a bunch of texts in Russian for training, just to see what happens, but it seems there is some kind of limitation on UTF-8 in my_txtutils.py.
How can I remove this limitation?
Or is there any way around it?
Any hints on what to change to accommodate more than US ASCII? I am working with cookbook text that freely mingles French words (like entrée and à la môde) as well as the "vulgar fraction" characters (like ⅔ and ¼). They're Unicode characters, UTF-8 encoded. It seems like there are two parallel functions (convert_from_alphabet() and convert_to_alphabet()) that need to be adjusted manually to match. I don't really feel like enumerating every single possible Unicode character I might encounter and putting it in the alphabet manually, though. Is there a simpler way?
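One possible alternative to a hand-written alphabet is to derive it from the corpus: collect the distinct characters, assign each a compact integer code, and keep the reverse table for decoding. The sketch below shows that idea only; it is not how my_txtutils.py works, the function names are hypothetical, and the model's ALPHASIZE would have to be set to the resulting alphabet size.

```python
def build_alphabet(text):
    """Map every distinct character in text to a compact integer code, and back."""
    chars = sorted(set(text))
    to_code = {c: i for i, c in enumerate(chars)}
    return to_code, chars


def encode(text, to_code, unknown=0):
    """Convert a string to a list of integer codes; unseen characters map to `unknown`."""
    return [to_code.get(c, unknown) for c in text]


def decode(codes, chars):
    """Convert a list of integer codes back to a string."""
    return "".join(chars[i] for i in codes)


corpus = u"entr\u00e9e \u00e0 la mode, \u2154 cup, \u00bc tsp"
to_code, chars = build_alphabet(corpus)
codes = encode(corpus, to_code)
roundtrip = decode(codes, chars)
alphabet_size = len(chars)
```

The alphabet has to be saved alongside the checkpoint, since a model trained with one character-to-code mapping cannot be used with another.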
Must use MultiRNNCell on multiple instances of base cells, not multiple copies of the same cell
investigate if this does not have something to do with constant data in the tensorflow graph
I ran into some version errors with Tensorflow 1.0
AttributeError: module 'tensorflow.python.ops.nn' has no attribute 'rnn_cell'
(already mentioned by rogerallen)
ValueError: Only call softmax_cross_entropy_with_logits with named arguments (labels=..., logits=..., ...)
I had to change lines 75-78 of rnn_train.py to
onecell = tf.contrib.rnn.GRUCell(INTERNALSIZE)
dropcell = tf.contrib.rnn.DropoutWrapper(onecell, input_keep_prob=pkeep)
multicell = tf.contrib.rnn.MultiRNNCell([dropcell]*NLAYERS, state_is_tuple=False)
multicell = tf.contrib.rnn.DropoutWrapper(multicell, output_keep_prob=pkeep)
and line 100 to
loss = tf.nn.softmax_cross_entropy_with_logits(logits = Ylogits, labels = Yflat_)
And btw: Thank you very much for your excellent presentation! Extremely helpful. And fun to watch...
I tried to run the code, but I hit this error. My TensorFlow version is 1.0.1. Can you help me fix it?
Traceback (most recent call last):
File "rnn_train.py", line 76, in <module>
onecell = rnn.GRUCell(INTERNALSIZE)
AttributeError: module 'tensorflow.contrib.rnn' has no attribute 'GRUCell'