deepgram / kur

Descriptive Deep Learning

License: Apache License 2.0

Python 100.00%
deep-learning deep-neural-networks speech-recognition deep-learning-tutorial machine-learning neural-networks neural-network image-recognition speech-to-text

kur's Introduction


Kur: Descriptive Deep Learning


Introduction

Welcome to Kur! You've found the future of deep learning!

  • Install Kur easily with pip install kur.
  • Design, train, and evaluate models without ever needing to code.
  • Describe your model with easily understandable concepts.
  • Quickly explore better versions of your model with the power of the Jinja2 templating engine.
  • Supports Theano, TensorFlow, and PyTorch, and supports multi-GPU out-of-the-box.
  • COMING SOON: Share your models with the community, making it incredibly easy to collaborate on sophisticated models.

Go ahead and give it a whirl: Get the Code and then jump into the Examples! Then build your own model in our Tutorial. Remember to check out our homepage for complete documentation and the latest news.


What is Kur?

Kur is a system for quickly building and applying state-of-the-art deep learning models to new and exciting problems. Kur was designed to appeal to the entire machine learning community, from novices to veterans. It uses specification files that are simple to read and author, meaning that you can get started building sophisticated models without ever needing to code. Even so, Kur exposes a friendly and extensible API to support advanced deep learning architectures or workflows. Excited? Jump straight into the Examples.

Get the Code

Kur is really easy to install! You can pick either one of these two options for installing Kur.

NOTE: Kur requires Python 3.4 or greater. Take a look at our installation guide for step-by-step instructions for installing Kur and setting up a virtual environment.

Latest Pip Release

If you know what you are doing, then this is easy:

Latest Development Release

Just check it out and run the setup script:

Quick Start: Or, if you already have Python 3 installed, then here are a few quick-start lines to get you training your first model:

Quick Start For Using pip:

Quick Start For Using git:

Usage

If everything has gone well, you should be able to use Kur:

You'll typically be using Kur in commands like kur train model.yml or kur test model.yml. You'll see these in the Examples, which is where you should head to next!

Troubleshooting

If you run into any problems installing or using Kur, please check out our troubleshooting page for lots of useful help. And if you want more detailed installation instructions, with help on setting up your environment, be sure to see our installation page.

Examples

Let's look at some examples of how fun and easy Kur makes state-of-the-art deep learning.

MNIST: Handwriting recognition

Let's jump right in and see how awesome Kur is! The first example we'll look at is Yann LeCun's MNIST dataset. This is a dataset of 28x28 pixel images of individual handwritten digits between 0 and 9. The goal of our model will be to perform image recognition, tagging the image with the most likely digit it represents.

NOTE: As with most command line examples, lines preceded by $ are lines that you are supposed to type (followed by the ENTER key). Lines without an initial $ are lines which are printed to the screen (you don't type them).

First, you need to Get the Code! If you installed via pip, you'll need to check out the examples directory from the repository, like this:

If you installed via git, then you already have the examples directory locally, so just move into the example directory:

Now let's train the MNIST model. This will download the data directly from the web, and then start training for 10 epochs.

$ kur train mnist.yml
Downloading: 100%|█████████████████████████████████| 9.91M/9.91M [03:44<00:00, 44.2Kbytes/s]
Downloading: 100%|█████████████████████████████████| 28.9K/28.9K [00:00<00:00, 66.1Kbytes/s]
Downloading: 100%|█████████████████████████████████| 1.65M/1.65M [00:31<00:00, 52.6Kbytes/s]
Downloading: 100%|█████████████████████████████████| 4.54K/4.54K [00:00<00:00, 19.8Kbytes/s]

Epoch 1/10, loss=1.524: 100%|███████████████████████| 480/480 [00:02<00:00, 254.97samples/s]
Validating, loss=0.829: 100%|█████████████████████| 3200/3200 [00:03<00:00, 889.91samples/s]

Epoch 2/10, loss=0.628: 100%|███████████████████████| 480/480 [00:02<00:00, 228.25samples/s]
Validating, loss=0.533: 100%|████████████████████| 3200/3200 [00:03<00:00, 1046.12samples/s]

Epoch 3/10, loss=0.547: 100%|███████████████████████| 480/480 [00:02<00:00, 185.77samples/s]
Validating, loss=0.491: 100%|████████████████████| 3200/3200 [00:03<00:00, 1030.57samples/s]

Epoch 4/10, loss=0.488: 100%|███████████████████████| 480/480 [00:02<00:00, 225.42samples/s]
Validating, loss=0.443: 100%|████████████████████| 3200/3200 [00:03<00:00, 1046.23samples/s]

Epoch 5/10, loss=0.464: 100%|███████████████████████| 480/480 [00:03<00:00, 115.17samples/s]
Validating, loss=0.403: 100%|█████████████████████| 3200/3200 [00:04<00:00, 799.46samples/s]

Epoch 6/10, loss=0.486: 100%|███████████████████████| 480/480 [00:03<00:00, 183.11samples/s]
Validating, loss=0.400: 100%|████████████████████| 3200/3200 [00:02<00:00, 1134.17samples/s]

Epoch 7/10, loss=0.369: 100%|███████████████████████| 480/480 [00:02<00:00, 214.10samples/s]
Validating, loss=0.366: 100%|█████████████████████| 3200/3200 [00:04<00:00, 735.61samples/s]

Epoch 8/10, loss=0.353: 100%|███████████████████████| 480/480 [00:03<00:00, 204.33samples/s]
Validating, loss=0.351: 100%|████████████████████| 3200/3200 [00:02<00:00, 1147.05samples/s]

Epoch 9/10, loss=0.399: 100%|███████████████████████| 480/480 [00:02<00:00, 219.17samples/s]
Validating, loss=0.343: 100%|████████████████████| 3200/3200 [00:02<00:00, 1149.07samples/s]

Epoch 10/10, loss=0.307: 100%|██████████████████████| 480/480 [00:02<00:00, 220.97samples/s]
Validating, loss=0.324: 100%|████████████████████| 3200/3200 [00:02<00:00, 1142.78samples/s]

What just happened? Kur downloaded the MNIST dataset from LeCun's website, and then trained a model for ten epochs. Awesome!

Now let's see how well our model actually performs:

Wow! Across the board, we already have 90% accuracy for recognizing handwritten digits, and we only used 0.8% of the training set! That's how awesome Kur is.

Excited yet? Read on!

NOTE: Clever readers will notice that each training epoch only used 480 training samples. But MNIST provides 60,000 training samples total, so what gives? Simple: lots of us are running this code on consumer hardware; in fact, I'm running this example on my tiny ultrabook on an Intel Core m7 CPU. As you'll see in Under the Hood, I truncate the training process to only train on 10 batches of 32 samples each, just to make the training loop finish in a reasonable amount of time. It's not cheating: you still get 90% accuracy! But if you have awesome hardware, or just want to see how good your accuracy can get, then by all means read on and we'll show you how to modify that.

Under the Hood

So what exactly is going on here? Let's take a look at the MNIST example specification file:

This is just plain, old YAML, a markup language meant to be easy for humans to interpret (for a good overview of YAML language features, look at the Ansible overview).

There's a section to put the data. That's this:

And then there's a spot to define your model:

And there is an "include" part that just contains some default settings (advanced users might want to tweak these---don't worry, it's still simple):

Very simple! Kur downloaded our data directly from LeCun's website for us, so that part is easy. But what goes into a Kur model? Just a nice, gentle list of things you want your deep learning model to do. Let's break it down:

  • We have an input called images (yep, it's the same images from our train section).
  • We pass the input to a convolution layer.
  • We add a rectified linear unit ("ReLU") activation.
  • We collapse (flatten) the high-dimensional output of a convolution into a nice, flat, 1-dimensional shape appropriate for sending into the fully-connected layers.
  • We add a fully-connected (dense) layer with 10 outputs.
  • We add a softmax activation (appropriate for classification tasks like MNIST), and mark it as producing labels (name: labels).
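
As a rough illustration, the list above can be written down as plain Python data, mirroring how the YAML specification might parse. The exact keys and the convolution parameters (kernels, size) are assumptions made for this sketch, not values quoted from mnist.yml, and layer_kinds is a hypothetical helper.

```python
# A sketch of the model described above as plain Python data.
# The keys and the convolution parameters are illustrative assumptions.
model = [
    {"input": "images"},
    {"convolution": {"kernels": 64, "size": [2, 2]}},
    {"activation": "relu"},
    {"flatten": None},
    {"dense": 10},
    {"activation": "softmax", "name": "labels"},
]


def layer_kinds(spec):
    """Return the layer type of each entry, ignoring the optional 'name' key."""
    return [next(key for key in entry if key != "name") for entry in spec]


print(layer_kinds(model))
```

Reading the printed list top to bottom gives the same order as the bullets: input, convolution, activation, flatten, dense, activation.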

And that's it! It's pretty naïve: one convolution + activation + fully-connected + activation. But it works: we got 90% accuracy after only showing it a small subset of the training set.

But let's think about making it more complicated. What if we want two convolutional layers instead? Easy! Just add another convolution section to the model. We'll also add in another non-linearity (ReLU activation) between the two convolutions.

We can also add more dense (fully-connected) layers. You probably want them separated by activation layers, too. So if we add a 32-node fully-connected layer to our model, it now looks like this:

Let's give it a try! Save your changes, and just run the same kur train mnist.yml and kur evaluate mnist.yml commands from before.

NOTE: A more complex model will likely need more data. So be sure to look at the tip in More Advanced Things to train on more of the data set.

If you want to know more, the YAML specification that Kur uses is described in greater detail in our Using Kur page.

More Advanced Things

The one line in the mnist.yml specification that we didn't cover is the include: mnist-defaults.yml line. This is just a convenient way for us to separate out the default behavior of the MNIST example.

If you tweak this file, probably the big thing you want to remove is the num_batches: 10 line, which is what limits training to just the first 10 batches every epoch. Just delete the line or comment it out, and Kur will train on the whole dataset.

A Better MNIST

90% is pretty good! But can we do better? Absolutely! Let's see how.

We need to build a more expressive, deeper model. We will use more convolutional layers, with occasional pooling layers.

So we have three convolutions with a 3-by-3 pooling layer in the middle, and two fully-connected layers. Try training this model: kur train mnist.yml. Then evaluate it to see how it does: kur evaluate mnist.yml. We got better than 95% by training on only 0.8% of the training set.

What happens if we give it more data? Like we mentioned above, we can adjust the amount of data we give Kur by twiddling the num_batches entry in the train section of mnist-defaults.yml. Let's try using 5% of the dataset. To do this, we'll set num_batches: 94 (because 5% of 60,000 is 3000, and for the default batch size of 32, this comes out to about 94 batches). Now try training and evaluating again. We got almost 98%!
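
The batch arithmetic above can be checked with a few lines of Python. This is only a sketch of the calculation; batches_for_percent is a hypothetical helper, not part of Kur.

```python
import math

TRAINING_SAMPLES = 60000  # size of the MNIST training set
BATCH_SIZE = 32           # default batch size quoted above


def batches_for_percent(percent, total=TRAINING_SAMPLES, batch_size=BATCH_SIZE):
    """Number of batches needed to cover `percent` percent of the training set."""
    samples = total * percent // 100          # e.g. 5% of 60,000 is 3,000
    return math.ceil(samples / batch_size)    # round up to whole batches


print(batches_for_percent(5))
```

The same helper gives a starting value for num_batches at any other percentage you want to try.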

Don't stop now, let's train on the whole thing (just remove the num_batches line altogether, or set num_batches: null). Still training only 10 epochs, we got 98.6%. Wow. Let's compare this to state of the art, which Yann LeCun tracks on the MNIST website. It looks like the best error rate also uses convolutions and achieved a 0.23% error rate (so 99.77% accuracy). With just a couple tweaks, we are already only a percent away from the world's best. Kur rocks.

CIFAR-10: Image Classification

Okay, MNIST was pretty cool, but Kur can do much, much more. Imagine if you wanted to have an arbitrary number of convolution layers. Imagine if each convolution should have a different number of kernels. Imagine if you truly want flexibility. You've come to the right place.

Flexibility: Variables

Kur uses an engine to determine how to do variable substitution. Jinja2 is the default templating engine, and it is very powerful and extensible. Let's see how to use it!

Let's look at the CIFAR-10 dataset. This is an image classification dataset of small 32 by 32 pixel color (RGB) images, each with one of ten classes (airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck). You might decide to start with a very similar model to the MNIST example:

We will start with a simple modification: let's make the convolution size a variable, so we can easily change it later. We can do it like this:

Okay, what just happened? First, we added a settings: section. This section is the appropriate place to declare variables, settings, and hyperparameters that will be used by the model (or for training, evaluation, etc.). We declared a variable named cnn with a nested size variable. In Python, this would be equivalent to a dictionary: {"cnn": {"size": [3, 3]}}.

Then we used the variable in the model's convolution layer: size: "{{ cnn.size }}". This is standard Jinja2 grammar. The double braces indicate that variable substitution should take place (without the braces, we would accidentally assign size to the literal string "cnn.size", which doesn't make sense). The variable we grab is cnn.size, corresponding to the variables we added in the settings section.
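
To make the substitution step concrete, here is a small standard-library sketch of what resolving "{{ cnn.size }}" against the settings dictionary amounts to. It is a simplified stand-in for Jinja2, not Jinja2 itself and not Kur's code: resolve is a hypothetical helper that only handles a single dotted lookup.

```python
import re

settings = {"cnn": {"size": [3, 3]}}

# Matches a string that consists solely of one '{{ dotted.name }}' reference.
PATTERN = re.compile(r"^\{\{\s*([\w.]+)\s*\}\}$")


def resolve(value, variables):
    """Replace a whole-string '{{ dotted.name }}' reference with its value."""
    if not isinstance(value, str):
        return value
    match = PATTERN.match(value)
    if match is None:
        return value  # no braces: the string stays a literal
    result = variables
    for part in match.group(1).split("."):
        result = result[part]  # walk cnn -> size
    return result


print(resolve("{{ cnn.size }}", settings))  # looked up in settings
print(resolve("cnn.size", settings))        # left as a literal string
```

The second call shows the pitfall mentioned above: without the braces, the value is just the text cnn.size.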

Cool! So we can use variables now. But how does that help us? It seems like we just made it more complicated. Well, let's imagine if we added another convolution layer. We already know how to add extra convolutions by just adding another convolution block (and usually you want another activation: relu layer, too). So this would look like:

Ah! So now we can see why variablizing the convolution size was nice: if we want to play with a model that uses different size kernels, we only need to edit one line instead of two.

But there are still two problems we might encounter:

  • What if we wanted to try out lots of models with different numbers of convolutions?
  • What if we wanted to use different size or kernel values in each convolution?

Kur can do it!

Flexibility: Loops

Let's address the first problem: what if we want to vary the number of convolutions? Kur supports many "meta-layers" that it calls "operators." A very simple operator is the classic "for" loop. This allows us to add many convolution + activation layers at once. It looks like this:

This is equivalent to the version without the "for" loop. The for: loop tells us to do everything in the iterate: section twice. (Why twice? Because range: 2.) And of course, we can variablize the number of iterations like this:

Think about this for a minute. Does it make sense? It should. The model looks like this:

  • An input layer of images.
  • A number of convolution and activation layers. How many? cnn.layers, so 2.
  • The rest of the model is as expected: a dense operation followed by an activation.
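
One way to picture what the "for" operator does is to unroll it by hand. The sketch below treats the model as Python data and flattens any for/range/iterate entry into repeated layers. The expand helper and the layer parameters are assumptions made for illustration; this is not Kur's implementation.

```python
import copy


def expand(layers):
    """Flatten 'for' operators into a plain list of layers (illustrative only)."""
    flat = []
    for layer in layers:
        if "for" in layer:
            loop = layer["for"]
            for _ in range(loop["range"]):
                # Copy so that each iteration gets its own layer objects.
                flat.extend(expand(copy.deepcopy(loop["iterate"])))
        else:
            flat.append(layer)
    return flat


model = [
    {"input": "images"},
    {"for": {"range": 2, "iterate": [
        {"convolution": {"kernels": 64, "size": [3, 3]}},
        {"activation": "relu"},
    ]}},
    {"flatten": None},
    {"dense": 10},
    {"activation": "softmax", "name": "labels"},
]

for layer in expand(model):
    print(layer)
```

With range set to 2, the convolution + activation pair appears twice in the unrolled list, which is the equivalence described above.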

Flexibility: Variable-length Loops

So we solved the problem of allowing for a variable number of convolutions. But what if each convolution should use a different number of kernels (or sizes, etc.)? Well, Kur can happily handle this, too. In fact, the for: loop already does most of the work. Every for: loop creates its own "local" variable to let you know which iteration it is on. The default name for this variable is index. So if we want to use a different number of kernels for each convolution, we can do this:

Again, this is just Jinja2 substitution: we are asking for the index-th element of the cnn.kernels list. Each iteration of the for: loop therefore grabs a different value for kernels:. Cool, huh?

But we can do one better.

Flexibility: Filters

The annoying thing about our current model is that nothing forces the layers value to be the same as the length of the kernels variable. If you make kernels really long (like, length seventeen) but leave layers at two, you probably made a mistake. (Why did you put in seventeen kernels but then only use the first two in the loop?) What you really want is to make sure that layers is set to the length of the kernels list. Or put another way, you want to add as many convolutions as you have kernels in the list.

Jinja2 supports a concept called "filters," which are basically functions that you can apply to objects. You can even define your own filters. But what we want right now is a way to get the length of a list. It's easy and it looks like this:

You'll notice that the layers variable is gone, and we have this funky |length thing in the "for" loop's range. This is standard Jinja2: the length filter returns the length of a list. So now we are asking the "for" loop to iterate as many times as we have kernels in the list.
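
In plain Python terms, the length filter plays the role of len(), and the loop's index variable picks one entry per iteration. The following sketch shows that idea only; build_convolutions and the kernel counts are hypothetical and not taken from cifar.yml.

```python
def build_convolutions(kernels, size):
    """One convolution + ReLU pair per entry in `kernels` (illustrative only)."""
    layers = []
    for index in range(len(kernels)):  # plays the role of cnn.kernels|length
        layers.append({"convolution": {"kernels": kernels[index], "size": size}})
        layers.append({"activation": "relu"})
    return layers


two_layers = build_convolutions([64, 32], [2, 2])
three_layers = build_convolutions([64, 32, 16], [2, 2])
print(len(two_layers), len(three_layers))
```

Adding one more entry to the list is the only change needed to get one more convolution.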

This is really cool if you think about it. You want to add another convolution to the network? All you do is add its size to the kernels list. And look! Your model is now more general, more reusable. You could have used the same model for MNIST! Or CIFAR! Or many different applications.

This is the heart of the Kur philosophy: you should describe your model once and simply. The specification describes your model: a bunch of convolutions and then a fully-connected layer. You can specify the details (how many convolutions, their parameters, etc.) elsewhere. The model should stay elegant.

NOTE: Of course, it isn't always easy to write reusable models. And the learning curve can get in the way. When we say that models should be "simple," we don't mean that you don't need to think about it. We mean that it should be simple to use, simple to modify, and simple to share. A more general model is elegant: making changes to it is easy (you only modify the settings). And this makes it easier to reuse in new contexts or to share with the community. Simplicity is power.

Actually Training a CIFAR-10 Model

Great, we now have a simple, but powerful and general model. Let's train it. As before, you'll need to cd examples first.

Again, evaluation is just as simple:

Advanced Features

The cifar.yml specification file is more complicated than the MNIST one, mostly to expose you to some more knobs you can tweak. For example, you'll see these lines in the train section:

As in the MNIST case, num_batches tells Kur to only train on that many batches of data each epoch (mostly so that if you don't have a nice GPU, the example still finishes in a reasonable amount of time). The batch_size value indicates the number of training samples that should be used in each batch.

The train section also has a log: cifar-log line. This tells Kur to save a log file to cifar-log (in the current working directory). This log contains lots of interesting information about current training loss, batch loss, and the number of epochs. By default, they are binary-encoded files, but you can load them using the Kur API (in Python 3):

where LOG_PATH is the path to the log file (e.g., cifar-log) and STATISTIC is one of the logged statistics. data will be a Numpy array. To find available statistics, just list the available files in the LOG_PATH, like this:
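
If you prefer to do that listing from Python rather than the shell, a sketch could look like the following. It assumes, as described above, that the log path is a directory holding one file per statistic. available_statistics is a hypothetical helper, and the file names in the demo are made up.

```python
import os
import tempfile


def available_statistics(log_path):
    """Return the sorted names of the files stored in a log directory."""
    return sorted(
        name for name in os.listdir(log_path)
        if os.path.isfile(os.path.join(log_path, name))
    )


# Demo on a throwaway directory; the file names are made up for illustration.
with tempfile.TemporaryDirectory() as log_path:
    for name in ("example_statistic_a", "example_statistic_b"):
        open(os.path.join(log_path, name), "wb").close()
    found = available_statistics(log_path)

print(found)
```

In real use you would pass the path from the log: line of the specification (for example cifar-log) instead of the temporary directory.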

For an example of using this log data, see our Tutorial.

Another difference from the MNIST examples is that there are more files referring to weights in the CIFAR specification. For example, in the validate section there is:

This tells Kur to save the best model weights (corresponding to the lowest loss on the validation set) to cifar.best.valid.w. Similarly, in the train section there is this:

The initial key tells Kur to try and load cifar.best.valid.w (the best weights with respect to the validation loss) at the beginning of training. If this file doesn't exist, nothing happens. This means that if you run the training cycle many times (with many calls to kur train cifar.yml), you always "restart" from the best model weights.

We are also saving the best weights (with respect to the training loss) to cifar.best.train.w. The most recent weights are saved to cifar.last.w.

NOTE: The weights depend on the model architecture. Say you train CIFAR and produce cifar.best.valid.w. Then you tweak the model in the specification file. If you try to resume training (kur train cifar.yml), Kur will try to load cifar.best.valid.w. But the weights may not fit the new architecture! So, to be safe, you should always delete (or back up) your weight files before trying to train a fresh, tweaked model. In a production environment, you probably want to have different sub-directories for each variation/tweak to the model so that you never run into this problem.
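
One possible way to get a separate sub-directory per model variation is to derive the directory name from the architecture itself. The sketch below hashes a Python representation of the model section. weights_dir, the "weights" root, and the example layers are all assumptions made for illustration; Kur does not do this for you.

```python
import hashlib
import json
import os


def weights_dir(model_spec, root="weights"):
    """Derive a sub-directory name from the model architecture."""
    # Serialize deterministically so the same architecture always hashes the same.
    canonical = json.dumps(model_spec, sort_keys=True)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]
    return os.path.join(root, digest)


original = [{"input": "images"}, {"dense": 10}]
tweaked = [{"input": "images"}, {"dense": 32}, {"dense": 10}]

print(weights_dir(original))
print(weights_dir(tweaked))
```

Because the name changes whenever the architecture changes, weights saved for one variation are never picked up by another.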

The CIFAR-10 example also explicitly specifies an optimizer in the train section:

The optimizer function is set in the name field and all other parameters (such as learning_rate) are defined in the other fields. You can safely change the optimizer without breaking backwards-compatibility with older weight files.

kur's People

Contributors

ajsyp, antho-rousseau, embracelife, gnarlymedia, greedyuser, ipeevski, janowiesniak, joshgev, mattphotonman, navinpai, noajshu, scottstephenson


kur's Issues

Validation loss starts to rise

I have trained the deepspeech model for about 48 hours and I found the loss is gradually going up. I am using 100 hour train data and have run 20 epochs. Have I got enough training and can stop training? Does this program include language model such as n-gram? If not, I probably need not expect the perfect results. Thanks!
Here are the last two results:
[INFO 2017-03-02 18:11:38,332 kur.model.executor:175] Validation loss: 155.686
Prediction: "i know wat mammock ind a forto guervev en iwile set she give tat"
Truth: "i know what mamma can afford to give and i will see she gives it"
[INFO 2017-03-02 20:43:30,224 kur.model.executor:175] Validation loss: 159.761
Prediction: "i se at call tad his vholly sendy oll o oadanetural ho those obof"
Truth: "i see him called tad his voice sounding hollow and unnatural to those above"

speech evaluate terminate by 97% without result

I downloaded the 50p (4.5 GB) training data and changed batch_size to 8. While training, there was no validation result. After 6 epochs, "kur -v evaluate speech.yml" gives me the following result:

...
[DEBUG 2017-02-24 10:56:07,746 kur.providers.batch_provider:185] Next batch of data has been prepared.
Evaluating: 97%|█████████████████████████████▏| 264/271 [00:30<00:00, 10.79samples/s]
sama@sama:~/ex/kur/speech$

No result!

any hints will be helpful

Not getting a prediction on speech example.

Is this normal after all this time? I know it's the small dataset, but it hasn't given a single prediction since the beginning, just a blank space.

Epoch 88/inf, loss=6.077: 97%|█████████▋| 2352/2432 [25:49<00:45, 1.74samples/s]
[INFO 2017-04-01 23:52:55,254 kur.model.executor:369] Training loss: 6.077
Validating, loss=372.212: 94%|█████████▍| 256/271 [00:45<00:02, 6.13samples/s]
[INFO 2017-04-01 23:53:40,300 kur.model.executor:206] Validation loss: 372.212
Prediction: ""
Truth: "that's it on your account"
Total wall-clock time: 38h 20m 31s
Training wall-clock time: 37h 08m 23s
Validation wall-clock time: 01h 12m 08s
Batch wall-clock time: 37h 07m 42s

Thanks for all your help in advance.

plot_weights hook: plot unlimited weights; access weights directly from Executor.model not external files

Now, I have managed to plot the convolutional weights by the following kurfile:

hooks:
    - plot_weights:
        weight_file: cifar.best.valid.w
        weight_keywords1: ["convolution.0", "kernel"]
        weight_keywords2: ["convolution.1", "kernel"]

In my plot_weights_hook.py, I get weight_keywords1, weight_keywords2 into hooks through __init__():

def __init__(self, weight_file, weight_keywords1, weight_keywords2, *args, **kwargs):
		""" Creates a new plotting hook, get plot filenames and matplotlib ready.
		"""

my question:

if I want to plot more convolutional weights, say weight_keywords3, weight_keywords4, weight_keywords5, do I have to change the source code, by adding them into __init__ like above?

Can **kwargs somehow help me avoid changing source every time I want to plot more weights? If so, how?

Thanks!

warning: divided by zero when `plot_hook.py` is executed using pytorch backend

Whenever I run kur train with hooks: plot in keras backend, everything is fine; but in pytorch backend, I got the following warning:

[WARNING 2017-05-08 18:04:00,276 py.warnings _showwarnmsg:99] /Users/Natsume/Documents/kur_experiment/kur/model/hooks/plot_hook.py:287: RuntimeWarning: divide by zero encountered in true_divide
  throughput = numpy.diff(batch) / numpy.diff(time)

Should we be worried about this?

dataset file checksum fail

Md5 checksum is failing on all four dataset files.

ganga$md5 lsc100-10p-train.tar.gz
MD5 (lsc100-10p-train.tar.gz) = 16139ef9fc3c58035ee225fc22eb95be

Training on TIMIT

Hey guys,

Super new to kur (like literally looked at it for the first time today), so perhaps I'm missing something simple but here is my issue:

I've created a conversion script for the TIMIT dataset to get the dataset to match what kur is expecting: https://github.com/kentsommer/TIMIT-to-Kur/blob/master/to_kur_dataset.py

The jsonl file with the labels and everything which (along with the audio folder) is returned by running the above script on the TIMIT dataset can be found here: https://github.com/kentsommer/TIMIT-to-Kur/releases/download/v0.1/timit_train.jsonl

However, after trying to run $ kur train speech.yml, I get the following (note training on the standard lsc100 works fine):

Downloading: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████| 316M/316M [01:11<00:00, 4.41Mbytes/s]
Downloading: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████| 115M/115M [00:29<00:00, 3.87Mbytes/s]
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/framework/op_kernel.cc:993] Invalid argument: Saw a non-null label (index >= num_classes - 1) following a null label, batch: 1 num_classes: 29 labels: 26,25,0,13,14,24,0,23,10,24,20,17,26,25,14,20,19,0,13,6,23,9,17
[ERROR 2017-02-21 19:53:11,741 kur.model.executor:224] Exception raised during training.
Traceback (most recent call last):
  File "/home/kent/.virtualenvs/kur/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 1022, in _do_call
    return fn(*args)
  File "/home/kent/.virtualenvs/kur/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 1004, in _run_fn
    status, run_metadata)
  File "/usr/lib/python3.4/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/home/kent/.virtualenvs/kur/lib/python3.4/site-packages/tensorflow/python/framework/errors_impl.py", line 469, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Saw a non-null label (index >= num_classes - 1) following a null label, batch: 1 num_classes: 29 labels: 26,25,0,13,14,24,0,23,10,24,20,17,26,25,14,20,19,0,13,6,23,9,17
	 [[Node: CTCLoss = CTCLoss[ctc_merge_repeated=true, preprocess_collapse_repeated=false, _device="/job:localhost/replica:0/task:0/cpu:0"](Log/_401, ToInt64/_403, GatherNd, Squeeze_2/_405)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/kent/.virtualenvs/kur/lib/python3.4/site-packages/kur/model/executor.py", line 221, in train
    **kwargs
  File "/home/kent/.virtualenvs/kur/lib/python3.4/site-packages/kur/model/executor.py", line 537, in wrapped_train
    self.compile('train', with_provider=provider)
  File "/home/kent/.virtualenvs/kur/lib/python3.4/site-packages/kur/model/executor.py", line 107, in compile
    **kwargs
  File "/home/kent/.virtualenvs/kur/lib/python3.4/site-packages/kur/backend/keras_backend.py", line 639, in compile
    self.wait_for_compile(model, key)
  File "/home/kent/.virtualenvs/kur/lib/python3.4/site-packages/kur/backend/keras_backend.py", line 668, in wait_for_compile
    self.run_batch(model, batch, key)
  File "/home/kent/.virtualenvs/kur/lib/python3.4/site-packages/kur/backend/keras_backend.py", line 708, in run_batch
    outputs = compiled['func'](inputs)
  File "/home/kent/.virtualenvs/kur/lib/python3.4/site-packages/keras/backend/tensorflow_backend.py", line 1943, in __call__
    feed_dict=feed_dict)
  File "/home/kent/.virtualenvs/kur/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 767, in run
    run_metadata_ptr)
  File "/home/kent/.virtualenvs/kur/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 965, in _run
    feed_dict_string, options, run_metadata)
  File "/home/kent/.virtualenvs/kur/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 1015, in _do_run
    target_list, options, run_metadata)
  File "/home/kent/.virtualenvs/kur/lib/python3.4/site-packages/tensorflow/python/client/session.py", line 1035, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Saw a non-null label (index >= num_classes - 1) following a null label, batch: 1 num_classes: 29 labels: 26,25,0,13,14,24,0,23,10,24,20,17,26,25,14,20,19,0,13,6,23,9,17
	 [[Node: CTCLoss = CTCLoss[ctc_merge_repeated=true, preprocess_collapse_repeated=false, _device="/job:localhost/replica:0/task:0/cpu:0"](Log/_401, ToInt64/_403, GatherNd, Squeeze_2/_405)]]

Caused by op 'CTCLoss', defined at:
  File "/home/kent/.virtualenvs/kur/bin/kur", line 11, in <module>
    load_entry_point('kur==0.3.0', 'console_scripts', 'kur')()
  File "/home/kent/.virtualenvs/kur/lib/python3.4/site-packages/kur/__main__.py", line 382, in main
    sys.exit(args.func(args) or 0)
  File "/home/kent/.virtualenvs/kur/lib/python3.4/site-packages/kur/__main__.py", line 62, in train
    func(step=args.step)
  File "/home/kent/.virtualenvs/kur/lib/python3.4/site-packages/kur/kurfile.py", line 371, in func
    return trainer.train(**defaults)
  File "/home/kent/.virtualenvs/kur/lib/python3.4/site-packages/kur/model/executor.py", line 221, in train
    **kwargs
  File "/home/kent/.virtualenvs/kur/lib/python3.4/site-packages/kur/model/executor.py", line 537, in wrapped_train
    self.compile('train', with_provider=provider)
  File "/home/kent/.virtualenvs/kur/lib/python3.4/site-packages/kur/model/executor.py", line 107, in compile
    **kwargs
  File "/home/kent/.virtualenvs/kur/lib/python3.4/site-packages/kur/backend/keras_backend.py", line 581, in compile
    self.process_loss(model, loss)
  File "/home/kent/.virtualenvs/kur/lib/python3.4/site-packages/kur/backend/keras_backend.py", line 500, in process_loss
    self.find_compiled_layer_by_name(model, target)
  File "/home/kent/.virtualenvs/kur/lib/python3.4/site-packages/kur/loss/ctc.py", line 232, in get_loss
    transcript_length
  File "/home/kent/.virtualenvs/kur/lib/python3.4/site-packages/keras/backend/tensorflow_backend.py", line 3042, in ctc_batch_cost
    sequence_length=input_length), 1)
  File "/home/kent/.virtualenvs/kur/lib/python3.4/site-packages/tensorflow/python/ops/ctc_ops.py", line 145, in ctc_loss
    ctc_merge_repeated=ctc_merge_repeated)
  File "/home/kent/.virtualenvs/kur/lib/python3.4/site-packages/tensorflow/python/ops/gen_ctc_ops.py", line 164, in _ctc_loss
    name=name)
  File "/home/kent/.virtualenvs/kur/lib/python3.4/site-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
    op_def=op_def)
  File "/home/kent/.virtualenvs/kur/lib/python3.4/site-packages/tensorflow/python/framework/ops.py", line 2395, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/kent/.virtualenvs/kur/lib/python3.4/site-packages/tensorflow/python/framework/ops.py", line 1264, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): Saw a non-null label (index >= num_classes - 1) following a null label, batch: 1 num_classes: 29 labels: 26,25,0,13,14,24,0,23,10,24,20,17,26,25,14,20,19,0,13,6,23,9,17
	 [[Node: CTCLoss = CTCLoss[ctc_merge_repeated=true, preprocess_collapse_repeated=false, _device="/job:localhost/replica:0/task:0/cpu:0"](Log/_401, ToInt64/_403, GatherNd, Squeeze_2/_405)]]

Any ideas?

UnboundLocalError: local variable 'status' referenced before assignment

error info:
Exception ignored in: <bound method BaseSession.__del__ of <tensorflow.python.client.session.Session object at 0x2aeb78884a90>>
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 582, in __del__
UnboundLocalError: local variable 'status' referenced before assignment

kur version 0.5.1

a tiny error in character_rnn example kurfile

Hi @noajshu ,
Thanks for your character example, it is great!

@ajsyp pointed out to me that something is not quite right with the weights, and I have now spotted the tiny issue. Inside kurfile.yaml, are the following weights settings what you intended, instead of what is there now?

train:
  data:
    - jsonl: ../data/train.jsonl
  epochs: 5
  weights:
    initial: best.w.kur
    best: best.w.kur
    last: last.w.kur
  log: log


validate:
  data:
    - jsonl: ../data/validate.jsonl
  weights: best.w.kur


test:
  data:
    - jsonl: ../data/test.jsonl
  weights: best.w.kur

pytorch: must set `border: valid` for demo to work? what if users want `same`?

With the cifar.yml demo inside kur/, after changing the backend to pytorch as below:

backend: 
  name: pytorch

I got the following error message:

(dlnd-tf-lab)  ->kur train cifar.yml
Traceback (most recent call last):
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/bin/kur", line 11, in <module>
    load_entry_point('kur', 'console_scripts', 'kur')()
  File "/Users/Natsume/Downloads/kur/kur/__main__.py", line 491, in main
    sys.exit(args.func(args) or 0)
  File "/Users/Natsume/Downloads/kur/kur/__main__.py", line 63, in train
    func = spec.get_training_function()
  File "/Users/Natsume/Downloads/kur/kur/kurfile.py", line 385, in get_training_function
    model = self.get_model(provider)
  File "/Users/Natsume/Downloads/kur/kur/kurfile.py", line 176, in get_model
    self.model.build()
  File "/Users/Natsume/Downloads/kur/kur/model/model.py", line 286, in build
    self.build_graph(input_nodes, output_nodes, network)
  File "/Users/Natsume/Downloads/kur/kur/model/model.py", line 337, in build_graph
    for layer in node.container.build(self):
  File "/Users/Natsume/Downloads/kur/kur/containers/container.py", line 306, in build
    self._built = list(self._build(model))
  File "/Users/Natsume/Downloads/kur/kur/containers/layers/convolution.py", line 223, in _build
    raise ValueError('PyTorch convolutions cannot use "same" '
ValueError: PyTorch convolutions cannot use "same" border mode when the receptive field "size" is even.

After changing border from the default to valid, it works fine.

My question
If a user wants to use border: same, what needs to be done to make PyTorch work?

Since the error says the receptive field size must not be even, I tried setting it to an odd value:

  cnn:
    kernels: [64, 32]
    size: [3, 3]
    strides: [1, 1]

But I got a new error this time:

(dlnd-tf-lab)  ->kur train cifar.yml
Traceback (most recent call last):
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/bin/kur", line 11, in <module>
    load_entry_point('kur', 'console_scripts', 'kur')()
  File "/Users/Natsume/Downloads/kur/kur/__main__.py", line 491, in main
    sys.exit(args.func(args) or 0)
  File "/Users/Natsume/Downloads/kur/kur/__main__.py", line 64, in train
    func(step=args.step)
  File "/Users/Natsume/Downloads/kur/kur/kurfile.py", line 393, in func
    model.restore(initial_weights)
  File "/Users/Natsume/Downloads/kur/kur/model/model.py", line 234, in restore
    self.backend.restore(self, filename)
  File "/Users/Natsume/Downloads/kur/kur/backend/pytorch_backend.py", line 176, in restore
    model.data.model.load_state_dict(state)
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/site-packages/torch/nn/modules/module.py", line 316, in load_state_dict
    own_state[name].copy_(param)
RuntimeError: inconsistent tensor size at /Users/soumith/code/pytorch-builder/wheel/pytorch-src/torch/lib/TH/generic/THTensorCopy.c:51
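
The reason behind the first error can be seen from the convolution output-size formula. Below is a minimal sketch in plain Python; the helper names `conv_output_size` and `symmetric_same_padding` are made up for illustration and are not part of Kur or PyTorch. With stride 1 and the same padding `p` on both sides, the output length is `n + 2p - k + 1`, so keeping the size unchanged needs `p = (k - 1) / 2`, which is a whole number only when `k` is odd.

```python
def conv_output_size(n, kernel, padding, stride=1):
    """Output length of a 1-D convolution with symmetric zero padding."""
    return (n + 2 * padding - kernel) // stride + 1


def symmetric_same_padding(kernel):
    """Per-side padding that keeps the size at stride 1, or None if impossible."""
    if kernel % 2 == 0:
        return None  # (kernel - 1) / 2 is not an integer for even kernels
    return (kernel - 1) // 2


for k in (2, 3):
    p = symmetric_same_padding(k)
    if p is None:
        print("size", k, "-> no symmetric 'same' padding")
    else:
        print("size", k, "-> padding", p, "output", conv_output_size(32, k, p))
```

As for the second error, one possible explanation (an assumption, not confirmed in this thread) is that weights saved while the kernels had a different size are being restored into the resized layers, so the tensor shapes no longer match.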

Does Deepgram support multi-GPUs? Thanks!

It seems to take a long time to train a model such as the deepspeech example on one GPU. Can I add more GPUs to speed it up? I could not find this described anywhere. Thank you!

How to get every activation layer output?

Given a model (weights, biases) and a single data sample, how can I get the output of each activation layer?

The workflow:
a data sample --> reshape (if needed) --> dot(weights) + biases -->
--> apply activation function --> activation layer output

Is there an easy way of getting activation layer output in kur?

Below is my exploration so far, which has not been successful.

What I know:

  • model(weights, biases) at each epoch or even each batch can be accessed
  • I don't know how to access each activation tensor directly
    • it seems kur uses keras (not sure about pytorch) to directly offer prediction and loss through keras_backend.run_batch
    • I don't see an obvious way to access the activation values directly
  • But indirectly, activation values can be calculated using a sample, weights & biases, and activation functions
    • Can I use a kur.layer.activation object to perform activation on a numpy.array?

Logic flow: get output of each layer

  • when do I need output from each layer?
    • at the end of a batch or an epoch
  • how do I get the output of each layer?
    • recompute it: trained weights + inputs + activation ==> layer output
  • how to get input?
    • through batch_provider, which yields one batch at a time
  • how to get weights?
    • after a batch or epoch of training, or after a certain number of them
    • create a temp folder and save model weights into it
    • select a particular layer's weights + bias files from the folder and open them with idx.open, to get numpy arrays of weights and biases
  • how to get activation?
    • through numpy functions to create the activation operator (the more likely choice), e.g., tanh: np.tanh; relu: np.relu (does not exist)
    • use a keras.activation operator if possible?
      • How to use relu activation directly from keras
        • from keras import backend as K
        • K.relu(x)
    • use a pytorch activation operator if possible? (more likely to be the easier solution found here)
    • best solution?
      • apply numpy arrays to kur.layer.Activation objects, such as softmax, leakyrelu
      • so that there is no need to convert a numpy array to a Theano, TensorFlow, or PyTorch tensor and back?
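
The indirect route described above (sample, then weights and biases, then activation) can be sketched in plain NumPy. This is only an illustration for dense layers: the function names and the random weights are made up, and nothing here uses Kur's own API. In practice the weights would come from the saved weight files mentioned in the list.

```python
import numpy as np


def relu(x):
    # NumPy has no np.relu; np.maximum against zero gives the same result.
    return np.maximum(x, 0.0)


def softmax(x):
    # Subtract the row maximum for numerical stability.
    shifted = x - x.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def dense_activations(sample, layers):
    """Return the activation output of every dense layer for one sample.

    `layers` is a list of (weights, biases, activation) tuples.
    """
    outputs = []
    x = sample
    for weights, biases, activation in layers:
        x = activation(x @ weights + biases)
        outputs.append(x)
    return outputs


# Made-up weights standing in for arrays loaded from saved weight files.
rng = np.random.default_rng(0)
layers = [
    (rng.normal(size=(4, 3)), np.zeros(3), relu),
    (rng.normal(size=(3, 2)), np.zeros(2), softmax),
]
sample = rng.normal(size=(1, 4))
for i, out in enumerate(dense_activations(sample, layers)):
    print("layer", i, "shape", out.shape)
```

Staying in NumPy avoids converting arrays to backend tensors and back, at the cost of re-implementing each activation by hand.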

keras_backend.py disables GPU for theano, while tensorflow backend selected

When calling kur with tensorflow backend, the keras_backend.py script will override the CUDA_VISIBLE_DEVICES environment variable if theano cannot use the GPU, even though tensorflow is correctly configured to use the GPU.

        if not self.devices:
                replace_theano_flag('device', 'cpu')
                env['CUDA_VISIBLE_DEVICES'] = '100'
                logger.info('Requesting CPU')

CUDA_VISIBLE_DEVICES shouldn't be overwritten in this fashion. There should be a check for which backend is used beforehand and only override if the selected backend can't access the GPU.
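
A rough sketch of the kind of check being asked for, in standalone Python. The function `build_env_overrides` and its arguments are hypothetical and do not mirror the real keras_backend.py; the only point is that the CUDA variable is touched based on the selected backend's devices, and Theano's CPU fallback does not affect TensorFlow.

```python
def build_env_overrides(keras_backend, devices):
    """Return environment overrides for the selected Keras backend.

    `keras_backend` is 'theano' or 'tensorflow'; `devices` is the list of
    GPU indices the selected backend may use (empty means CPU only).
    """
    env = {}
    if devices:
        # The selected backend has GPUs: expose exactly those.
        env['CUDA_VISIBLE_DEVICES'] = ','.join(str(d) for d in devices)
    elif keras_backend == 'theano':
        # Only Theano is told to fall back to the CPU; the CUDA
        # variable is left untouched.
        env['THEANO_FLAGS'] = 'device=cpu'
    else:
        # The selected backend itself has no usable GPU: hide them all.
        env['CUDA_VISIBLE_DEVICES'] = ''
    return env


print(build_env_overrides('tensorflow', [0, 1]))
print(build_env_overrides('theano', []))
```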

Kur mishandles exception when a kurfile requests a nonexistent supplier

supplier.py:51

		raise ValueError('Ambiguous supplier type in an element of the '
			'"input" list. Exactly one of the following keys must be '
			'present: {}'.format(', '.join(Supplier.get_all_suppliers())))

but Supplier.get_all_suppliers returns a list of types so you get this error:

  File "/Users/go-bro/.virtualenvs/kur/bin/kur", line 11, in <module>
    load_entry_point('kur==0.3.0', 'console_scripts', 'kur')()
  File "/Users/go-bro/.virtualenvs/kur/lib/python3.6/site-packages/kur/__main__.py", line 382, in main
    sys.exit(args.func(args) or 0)
  File "/Users/go-bro/.virtualenvs/kur/lib/python3.6/site-packages/kur/__main__.py", line 61, in train
    func = spec.get_training_function()
  File "/Users/go-bro/.virtualenvs/kur/lib/python3.6/site-packages/kur/kurfile.py", line 259, in get_training_function
    provider = self.get_provider('train')
  File "/Users/go-bro/.virtualenvs/kur/lib/python3.6/site-packages/kur/kurfile.py", line 230, in get_provider
    Supplier.from_specification(x, kurfile=self) for x in supplier_list
  File "/Users/go-bro/.virtualenvs/kur/lib/python3.6/site-packages/kur/kurfile.py", line 230, in <listcomp>
    Supplier.from_specification(x, kurfile=self) for x in supplier_list
  File "/Users/go-bro/.virtualenvs/kur/lib/python3.6/site-packages/kur/supplier/supplier.py", line 50, in from_specification
    ', '.join(Supplier.get_all_suppliers())
TypeError: sequence item 0: expected str instance, type found

so the error is a bit confusing -- should I submit a PR?
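
The failure is easy to reproduce outside Kur: `str.join` only accepts strings, so joining a list of classes raises the `TypeError` shown above. In the sketch below the two supplier classes are hypothetical stand-ins, and using `__name__` is just one way to get a string per class; a real fix would presumably use whatever name Kur registers each supplier under.

```python
class JSONLSupplier:  # hypothetical stand-in, not the real Kur class
    pass


class MNISTSupplier:  # hypothetical stand-in, not the real Kur class
    pass


suppliers = [JSONLSupplier, MNISTSupplier]

try:
    ', '.join(suppliers)  # joining classes instead of strings
except TypeError as exc:
    print("join on classes fails:", exc)

# Convert each class to a string before joining.
names = ', '.join(cls.__name__ for cls in suppliers)
print(names)
```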

How to make variable-length input for seq2seq model

Hello, I am new to Kur and trying to make a sequence-to-sequence model similar to this for machine translation:
tensorflow seq2seq model

I need to have variable-length input and variable-length output. I think for output sequence I can use CTC like in the ASR example. To test, I tried setting the input to transcript just like the output of the speech model. This is the error I get:

kur -vvv train translator.yaml
[INFO 2017-02-21 15:29:03,435 kur.kurfile:699] Parsing source: translator.yaml, included by top-level.
[INFO 2017-02-21 15:29:03,455 kur.kurfile:82] Parsing Kurfile...
[DEBUG 2017-02-21 15:29:03,455 kur.kurfile:784] Parsing Kurfile section: settings
[DEBUG 2017-02-21 15:29:03,461 kur.kurfile:784] Parsing Kurfile section: train
[DEBUG 2017-02-21 15:29:03,466 kur.kurfile:784] Parsing Kurfile section: validate
[DEBUG 2017-02-21 15:29:03,469 kur.kurfile:784] Parsing Kurfile section: test
[DEBUG 2017-02-21 15:29:03,472 kur.kurfile:784] Parsing Kurfile section: evaluate
[DEBUG 2017-02-21 15:29:03,476 kur.containers.layers.placeholder:63] Using short-hand name for placeholder: transcript
[DEBUG 2017-02-21 15:29:03,476 kur.containers.layers.placeholder:97] Placeholder "transcript" has a deferred shape.
[DEBUG 2017-02-21 15:29:03,483 kur.containers.layers.output:50] Using short-hand name for output: decoding
[DEBUG 2017-02-21 15:29:03,484 kur.kurfile:784] Parsing Kurfile section: loss
[INFO 2017-02-21 15:29:03,486 kur.loggers.binary_logger:71] Loading log data: log
[DEBUG 2017-02-21 15:29:03,486 kur.loggers.binary_logger:78] Loading old-style binary logger.
[DEBUG 2017-02-21 15:29:03,487 kur.loggers.binary_logger:184] Loading binary column: training_loss_total
[DEBUG 2017-02-21 15:29:03,487 kur.loggers.binary_logger:192] No such log column exists: log/training_loss_total
[DEBUG 2017-02-21 15:29:03,487 kur.loggers.binary_logger:184] Loading binary column: training_loss_batch
[DEBUG 2017-02-21 15:29:03,487 kur.loggers.binary_logger:192] No such log column exists: log/training_loss_batch
[DEBUG 2017-02-21 15:29:03,487 kur.loggers.binary_logger:184] Loading binary column: validation_loss_total
[DEBUG 2017-02-21 15:29:03,487 kur.loggers.binary_logger:192] No such log column exists: log/validation_loss_total
[DEBUG 2017-02-21 15:29:03,487 kur.loggers.binary_logger:184] Loading binary column: validation_loss_batch
[DEBUG 2017-02-21 15:29:03,487 kur.loggers.binary_logger:192] No such log column exists: log/validation_loss_batch
[DEBUG 2017-02-21 15:29:05,391 kur.utils.package:233] File exists and passed checksum: /Users/noajshu/kur/lsdc-train.tar.gz
[DEBUG 2017-02-21 15:29:05,391 kur.supplier.speechrec:587] Unpacking input data: /Users/noajshu/kur/lsdc-train.tar.gz
[DEBUG 2017-02-21 15:29:18,819 kur.supplier.speechrec:612] Looking for metadata file.
[DEBUG 2017-02-21 15:29:18,819 kur.supplier.speechrec:647] Found metadata file: /Users/noajshu/kur/lsdc-train/lsdc-train.jsonl
[DEBUG 2017-02-21 15:29:18,819 kur.supplier.speechrec:648] Inferred source path: /Users/noajshu/kur/lsdc-train/audio
[DEBUG 2017-02-21 15:29:18,819 kur.supplier.speechrec:650] Scanning metadata file.
[DEBUG 2017-02-21 15:29:18,833 kur.supplier.speechrec:652] Entries counted: 2432
[DEBUG 2017-02-21 15:29:18,833 kur.supplier.speechrec:654] Loading metadata.
[DEBUG 2017-02-21 15:29:18,865 kur.supplier.speechrec:685] Entries kept: 2432
[DEBUG 2017-02-21 15:29:18,866 kur.supplier.speechrec:501] Using all available data.
[DEBUG 2017-02-21 15:29:18,866 kur.supplier.speechrec:442] Creating sources.
[INFO 2017-02-21 15:29:18,866 kur.supplier.speechrec:144] Restoring normalization statistics: norm.yml
[INFO 2017-02-21 15:29:18,866 kur.utils.normalize:185] Restoring normalization state from: norm.yml
[INFO 2017-02-21 15:29:24,688 kur.supplier.speechrec:307] Inferring vocabulary from data set.
[INFO 2017-02-21 15:29:24,710 kur.supplier.speechrec:342] Loaded a 28-word vocabulary.
[DEBUG 2017-02-21 15:29:24,711 kur.providers.batch_provider:57] Batch size set to: 16
[DEBUG 2017-02-21 15:29:24,712 kur.providers.batch_provider:64] Batch provider will force batches of exactly 16 samples.
[DEBUG 2017-02-21 15:29:24,946 kur.utils.package:233] File exists and passed checksum: /Users/noajshu/kur/lsdc-test.tar.gz
[DEBUG 2017-02-21 15:29:24,946 kur.supplier.speechrec:587] Unpacking input data: /Users/noajshu/kur/lsdc-test.tar.gz
[DEBUG 2017-02-21 15:29:26,521 kur.supplier.speechrec:612] Looking for metadata file.
[DEBUG 2017-02-21 15:29:26,521 kur.supplier.speechrec:647] Found metadata file: /Users/noajshu/kur/lsdc-test/lsdc-test.jsonl
[DEBUG 2017-02-21 15:29:26,521 kur.supplier.speechrec:648] Inferred source path: /Users/noajshu/kur/lsdc-test/audio
[DEBUG 2017-02-21 15:29:26,521 kur.supplier.speechrec:650] Scanning metadata file.
[DEBUG 2017-02-21 15:29:26,521 kur.supplier.speechrec:652] Entries counted: 271
[DEBUG 2017-02-21 15:29:26,521 kur.supplier.speechrec:654] Loading metadata.
[DEBUG 2017-02-21 15:29:26,524 kur.supplier.speechrec:685] Entries kept: 271
[DEBUG 2017-02-21 15:29:26,524 kur.supplier.speechrec:501] Using all available data.
[DEBUG 2017-02-21 15:29:26,524 kur.supplier.speechrec:442] Creating sources.
[INFO 2017-02-21 15:29:26,524 kur.supplier.speechrec:144] Restoring normalization statistics: norm.yml
[INFO 2017-02-21 15:29:26,524 kur.utils.normalize:185] Restoring normalization state from: norm.yml
[INFO 2017-02-21 15:29:32,311 kur.supplier.speechrec:307] Inferring vocabulary from data set.
[INFO 2017-02-21 15:29:32,313 kur.supplier.speechrec:342] Loaded a 28-word vocabulary.
[DEBUG 2017-02-21 15:29:32,314 kur.providers.batch_provider:57] Batch size set to: 16
[DEBUG 2017-02-21 15:29:32,314 kur.providers.batch_provider:64] Batch provider will force batches of exactly 16 samples.
[DEBUG 2017-02-21 15:29:32,314 kur.backend.backend:187] Using backend: keras
[INFO 2017-02-21 15:29:32,315 kur.backend.backend:80] Creating backend: keras
[INFO 2017-02-21 15:29:32,315 kur.backend.backend:83] Backend variants: none
[INFO 2017-02-21 15:29:32,315 kur.backend.keras_backend:81] The tensorflow backend for Keras has been requested.
[DEBUG 2017-02-21 15:29:32,315 kur.backend.keras_backend:189] Overriding environmental variables: {'KERAS_BACKEND': 'tensorflow', 'THEANO_FLAGS': None, 'TF_CPP_MIN_LOG_LEVEL': '1'}
[INFO 2017-02-21 15:29:34,063 kur.backend.keras_backend:195] Keras is loaded. The backend is: tensorflow
[INFO 2017-02-21 15:29:34,064 kur.model.model:260] Enumerating the model containers.
[INFO 2017-02-21 15:29:34,064 kur.model.model:265] Assembling the model dependency graph.
[DEBUG 2017-02-21 15:29:34,064 kur.model.model:272] Assembled Node: transcript
[DEBUG 2017-02-21 15:29:34,064 kur.model.model:274]   Uses:
[DEBUG 2017-02-21 15:29:34,065 kur.model.model:276]   Used by: ..recurrent.0
[DEBUG 2017-02-21 15:29:34,065 kur.model.model:277]   Aliases: transcript
[DEBUG 2017-02-21 15:29:34,065 kur.model.model:272] Assembled Node: ..recurrent.0
[DEBUG 2017-02-21 15:29:34,065 kur.model.model:274]   Uses: transcript
[DEBUG 2017-02-21 15:29:34,065 kur.model.model:276]   Used by: ..batch_normalization.0
[DEBUG 2017-02-21 15:29:34,065 kur.model.model:277]   Aliases: ..recurrent.0
[DEBUG 2017-02-21 15:29:34,065 kur.model.model:272] Assembled Node: ..batch_normalization.0
[DEBUG 2017-02-21 15:29:34,066 kur.model.model:274]   Uses: ..recurrent.0
[DEBUG 2017-02-21 15:29:34,066 kur.model.model:276]   Used by: ..recurrent.1
[DEBUG 2017-02-21 15:29:34,066 kur.model.model:277]   Aliases: ..batch_normalization.0
[DEBUG 2017-02-21 15:29:34,066 kur.model.model:272] Assembled Node: ..recurrent.1
[DEBUG 2017-02-21 15:29:34,066 kur.model.model:274]   Uses: ..batch_normalization.0
[DEBUG 2017-02-21 15:29:34,066 kur.model.model:276]   Used by: ..batch_normalization.1
[DEBUG 2017-02-21 15:29:34,066 kur.model.model:277]   Aliases: ..recurrent.1
[DEBUG 2017-02-21 15:29:34,066 kur.model.model:272] Assembled Node: ..batch_normalization.1
[DEBUG 2017-02-21 15:29:34,067 kur.model.model:274]   Uses: ..recurrent.1
[DEBUG 2017-02-21 15:29:34,067 kur.model.model:276]   Used by: ..recurrent.2
[DEBUG 2017-02-21 15:29:34,067 kur.model.model:277]   Aliases: ..batch_normalization.1
[DEBUG 2017-02-21 15:29:34,067 kur.model.model:272] Assembled Node: ..recurrent.2
[DEBUG 2017-02-21 15:29:34,067 kur.model.model:274]   Uses: ..batch_normalization.1
[DEBUG 2017-02-21 15:29:34,067 kur.model.model:276]   Used by: ..batch_normalization.2
[DEBUG 2017-02-21 15:29:34,068 kur.model.model:277]   Aliases: ..recurrent.2
[DEBUG 2017-02-21 15:29:34,068 kur.model.model:272] Assembled Node: ..batch_normalization.2
[DEBUG 2017-02-21 15:29:34,068 kur.model.model:274]   Uses: ..recurrent.2
[DEBUG 2017-02-21 15:29:34,068 kur.model.model:276]   Used by: ..activation.0
[DEBUG 2017-02-21 15:29:34,068 kur.model.model:277]   Aliases: ..batch_normalization.2, ..for.0
[DEBUG 2017-02-21 15:29:34,068 kur.model.model:272] Assembled Node: ..activation.0
[DEBUG 2017-02-21 15:29:34,068 kur.model.model:274]   Uses: ..batch_normalization.2
[DEBUG 2017-02-21 15:29:34,069 kur.model.model:276]   Used by: decoding
[DEBUG 2017-02-21 15:29:34,069 kur.model.model:277]   Aliases: ..activation.0
[DEBUG 2017-02-21 15:29:34,069 kur.model.model:272] Assembled Node: decoding
[DEBUG 2017-02-21 15:29:34,069 kur.model.model:274]   Uses: ..activation.0
[DEBUG 2017-02-21 15:29:34,069 kur.model.model:276]   Used by:
[DEBUG 2017-02-21 15:29:34,069 kur.model.model:277]   Aliases: decoding
[INFO 2017-02-21 15:29:34,069 kur.model.model:280] Connecting the model graph.
[DEBUG 2017-02-21 15:29:34,070 kur.model.model:311] Building node: transcript
[DEBUG 2017-02-21 15:29:34,070 kur.model.model:312]   Aliases: transcript
[DEBUG 2017-02-21 15:29:34,070 kur.model.model:313]   Inputs:
[DEBUG 2017-02-21 15:29:34,070 kur.containers.layers.placeholder:117] Creating placeholder for "transcript" with data type "float32".
[DEBUG 2017-02-21 15:29:34,070 kur.model.model:125] Trying to infer shape for input "transcript"
[DEBUG 2017-02-21 15:29:34,070 kur.model.model:143] Inferred shape for input "transcript": (None,)
[DEBUG 2017-02-21 15:29:34,070 kur.containers.layers.placeholder:127] Inferred shape: (None,)
[DEBUG 2017-02-21 15:29:34,098 kur.model.model:382]   Value: Tensor("transcript:0", shape=(?, ?), dtype=float32)
[DEBUG 2017-02-21 15:29:34,098 kur.model.model:311] Building node: ..recurrent.0
[DEBUG 2017-02-21 15:29:34,098 kur.model.model:312]   Aliases: ..recurrent.0
[DEBUG 2017-02-21 15:29:34,098 kur.model.model:313]   Inputs:
[DEBUG 2017-02-21 15:29:34,099 kur.model.model:315]   - transcript: Tensor("transcript:0", shape=(?, ?), dtype=float32)
Traceback (most recent call last):
  File "/Users/noajshu/.virtualenvs/rnn-translator/bin/kur", line 11, in <module>
    load_entry_point('kur==0.3.0', 'console_scripts', 'kur')()
  File "/Users/noajshu/.virtualenvs/rnn-translator/lib/python3.6/site-packages/kur/__main__.py", line 382, in main
    sys.exit(args.func(args) or 0)
  File "/Users/noajshu/.virtualenvs/rnn-translator/lib/python3.6/site-packages/kur/__main__.py", line 61, in train
    func = spec.get_training_function()
  File "/Users/noajshu/.virtualenvs/rnn-translator/lib/python3.6/site-packages/kur/kurfile.py", line 329, in get_training_function
    model = self.get_model(provider)
  File "/Users/noajshu/.virtualenvs/rnn-translator/lib/python3.6/site-packages/kur/kurfile.py", line 152, in get_model
    self.model.build()
  File "/Users/noajshu/.virtualenvs/rnn-translator/lib/python3.6/site-packages/kur/model/model.py", line 282, in build
    self.build_graph(input_nodes, output_nodes, network)
  File "/Users/noajshu/.virtualenvs/rnn-translator/lib/python3.6/site-packages/kur/model/model.py", line 339, in build_graph
    target=layer
  File "/Users/noajshu/.virtualenvs/rnn-translator/lib/python3.6/site-packages/kur/backend/keras_backend.py", line 242, in connect
    return target(inputs)
  File "/Users/noajshu/.virtualenvs/rnn-translator/lib/python3.6/site-packages/keras/engine/topology.py", line 529, in __call__
    self.assert_input_compatibility(x)
  File "/Users/noajshu/.virtualenvs/rnn-translator/lib/python3.6/site-packages/keras/engine/topology.py", line 469, in assert_input_compatibility
    str(K.ndim(x)))
ValueError: Input 0 is incompatible with layer ..recurrent.0: expected ndim=3, found ndim=2

Here is my Kurfile:

settings:

  # Deep learning model
  # cnn:
  #   kernels: 1000
  #   size: 11
  #   stride: 2
  rnn:
    size: 1000
    depth: 3
  vocab:
    # Need for CTC
    size: 28

  # Setting up the backend.
  backend:
    name: keras
    backend: tensorflow

  # Batch sizes
  provider: &provider
    batch_size: 16
    force_batch_size: yes

  # Where to put the data.
  data: &data
    path: "~/kur"
    type: spec
    max_duration: 50
    max_frequency: 8000
    normalization: norm.yml

  # Where to put the weights
  weights: &weights weights

###############################################################################
model:

  # This is Baidu's DeepSpeech model:
  #   https://arxiv.org/abs/1412.5567
  # Kur makes prototyping different versions of it incredibly easy.

  # The model input is audio data (called utterances).
  - input: transcript

  # One-dimensional, variable-size convolutional layers to extract more
  # efficient representation of the data.
  # - convolution:
  #     kernels: "{{ cnn.kernels }}"
  #     size: "{{ cnn.size }}"
  #     strides: "{{ cnn.stride }}"
  #     border: valid
  # - activation: relu
  # - batch_normalization

  # A series of recurrent layers to learn temporal sequences.
  - for:
      range: "{{ rnn.depth }}"
      iterate:
        - recurrent:
            size: "{{ rnn.size }}"
            sequence: yes
        - batch_normalization

  # # A dense layer to get everything into the right output shape.
  # - parallel:
  #     apply:
  #       - dense: "{{ vocab.size + 1 }}"
  - activation: softmax

  # The output is the transcription.
  - output: decoding

###############################################################################
train:

  data:
    # A "speech_recognition" data supplier will create these data sources:
    #   utterance, utterance_length, transcript, transcript_length, duration
    - speech_recognition:
        <<: *data
        url: "http://kur.deepgram.com/data/lsdc-train.tar.gz"
        checksum: >-
          fc414bccf4de3964f895eaa9d0e245ea28810a94be3079b55505cf0eb1644f94

  weights: *weights
  provider:
    <<: *provider
    sortagrad: duration

  log: log

  optimizer:
    name: sgd
    nesterov: yes
    learning_rate: 2e-4
    momentum: 0.9
    clip:
      norm: 100

###############################################################################
validate: &validate
  data:
    - speech_recognition:
        <<: *data
        url: "http://kur.deepgram.com/data/lsdc-test.tar.gz"
        checksum: >-
          e1c8cf9cd57e8c1ae952b6e4e40dcb5c8e3932c81ecd52c090e4a05c8ebbea2b

  weights: *weights
  provider: *provider

  hooks:
    - transcript

###############################################################################
test: *validate

###############################################################################
evaluate: *validate

###############################################################################
loss:
  - name: ctc
    # The model's output (its best-guess transcript).
    target: decoding
    # How long the corresponding audio utterance is.
    input_length: transcript_length
    relative_to: transcript
    # How long the ground-truth transcript is.
    output_length: transcript_length
    # The ground-truth transcript itself.
    output: transcript

...

Could you give me some advice on getting this network configured to use character text sequence input? Do I need to write a new supplier?
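
On the ValueError itself: the recurrent layer expects a 3-D input of shape (batch, timesteps, features), while the transcript placeholder is 2-D, (batch, timesteps), as the `shape=(?, ?)` in the log shows. One possible way to bridge that gap is to one-hot encode the character indices. The NumPy sketch below only illustrates the shape change; `one_hot_sequences` is a made-up helper, and whether this belongs in a new supplier or elsewhere in Kur is left open.

```python
import numpy as np


def one_hot_sequences(indices, vocab_size):
    """Turn (batch, timesteps) integer labels into (batch, timesteps, vocab)."""
    indices = np.asarray(indices)
    # Indexing the identity matrix picks one one-hot row per label.
    return np.eye(vocab_size, dtype=np.float32)[indices]


batch = [[3, 0, 27], [1, 1, 5]]
encoded = one_hot_sequences(batch, vocab_size=28)
print(encoded.shape)  # (2, 3, 28)
```

Sequences of different lengths would still need padding to a common length within each batch before this step.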

Is character-level RNN example not ready to use yet?

It seems the data is neither connected to nor processed for the model yet.

Simply running kur -v train kurfile.yml in the language model example, I got the following error message:

Traceback (most recent call last):
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/bin/kur", line 11, in <module>
    load_entry_point('kur', 'console_scripts', 'kur')()
  File "/Users/Natsume/Downloads/kur_road/kur/kur/__main__.py", line 382, in main
    sys.exit(args.func(args) or 0)
  File "/Users/Natsume/Downloads/kur_road/kur/kur/__main__.py", line 61, in train
    func = spec.get_training_function()
  File "/Users/Natsume/Downloads/kur_road/kur/kur/kurfile.py", line 259, in get_training_function
    provider = self.get_provider('train')
  File "/Users/Natsume/Downloads/kur_road/kur/kur/kurfile.py", line 240, in get_provider
    sources=Supplier.merge_suppliers(suppliers),
  File "/Users/Natsume/Downloads/kur_road/kur/kur/supplier/supplier.py", line 130, in merge_suppliers
    sources = supplier.get_sources()
  File "/Users/Natsume/Downloads/kur_road/kur/kur/supplier/jsonl_supplier.py", line 80, in get_sources
    self._load()
  File "/Users/Natsume/Downloads/kur_road/kur/kur/supplier/jsonl_supplier.py", line 59, in _load
    with open(self.source, 'r') as infile:
FileNotFoundError: [Errno 2] No such file or directory: '../data/train.jsonl'

When and how can I try this example? Thanks a lot!

What is required in order to contribute to kur development?

Hi Adam,
I think being able to contribute to the kur code is not only very cool but also useful for my journey in deep learning. What is an efficient learning path from kur user to kur developer?

A little of my background: I have earned several Coursera machine learning certificates and am doing the Udacity deep learning foundation nanodegree, so I know some ML and DL with Python. However, I am only at the level of using them in simple projects.

I have time and am willing to put a lot of effort into deep learning with kur. Could you give me some suggestions on how to move from where I am to becoming a kur developer?

Thanks a lot
Daniel

a bug in standard version Kur when `kur data mnist.yml`

By standard version, I mean the one installed with pip install kur.

There is a bug when running kur data mnist.yml:


In [3]: !kur data mnist.yml
Traceback (most recent call last):
  File "/Users/Natsume/miniconda2/envs/tempy3.4/bin/kur", line 11, in <module>
    sys.exit(main())
  File "/Users/Natsume/miniconda2/envs/tempy3.4/lib/python3.4/site-packages/kur/__main__.py", line 385, in main
    sys.exit(args.func(args) or 0)
  File "/Users/Natsume/miniconda2/envs/tempy3.4/lib/python3.4/site-packages/kur/__main__.py", line 174, in prepare_data
    keys = sorted(batch.keys())
AttributeError: 'str' object has no attribute 'keys'

It can be easily fixed, but I guess it can also be fixed by updating from the standard version of kur to the development version.

dependency graph resolution

Hi

I have run into the following ValueError:

ValueError: No change during dependency graph resolution. There is something wrong with the graph.

It occurs when trying to use kurfile.get_model() on the following .yml:

settings:
  layer_reps: 2
  module_reps: 3
  cs: 16


model:

 - input:
    shape: [512,512,1]
   name: in

 - for:
    range: "{{module_reps}}"
    with_index: j
    iterate:

     - for:
        range: "{{layer_reps}}"
        with_index: i
        iterate:

         - convolution:
            kernels: "{{cs*2**j}}"
            size: [3,3]
           name: "m{{j}}_c{{i}}"
           sink: yes

         - activation: relu

     - pool:
        size: [2,2]
        strides: [2,2]
        type: max
       name: "m{{j}}_p"
       sink: yes

     - dense:
        size: "{{cs*2**module_reps}}"
       sink: yes
       name: middle

Now the issue seems to be in the naming of the dense layer. Without the name everything works fine...

Thanks Josh
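
One possible reading of the Kurfile above, offered as an assumption that the thread does not confirm: the `dense` entry is indented at the same level as `pool`, so it sits inside the outer `for` loop and is instantiated once per `module_reps` pass with the same literal name `middle`, whereas the other names include a loop index. The plain-Python sketch below just enumerates the names that layout would produce and counts repeats.

```python
from collections import Counter

layer_reps, module_reps = 2, 3

names = []
for j in range(module_reps):
    for i in range(layer_reps):
        names.append("m{}_c{}".format(j, i))  # convolution names are unique
    names.append("m{}_p".format(j))           # pool names are unique
    names.append("middle")                    # same literal name on every pass

duplicates = {n: c for n, c in Counter(names).items() if c > 1}
print(duplicates)  # {'middle': 3}
```

If that is the cause, either moving the dense layer out of the loop or giving it a templated name such as "middle{{j}}" would make the names unique.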

How to use models on other data

Let's consider the speech example. I've trained a model and want to try it on some wav files I've recorded. Is there any way to run the model on a specified file? Either a command-line call or a Python snippet would be highly appreciated. Or maybe there is a way to export the model as a .py file plus a TensorFlow model file, so it can be served over REST or via RabbitMQ.

Also, a side question: what does the "kur build" action do?

Unable to train mnist model

Hi, I am trying to run the example mnist model and I get the following error:

[INFO 2017-03-15 14:43:14,295 kur.model.model:280] Connecting the model graph.
[DEBUG 2017-03-15 14:43:14,295 kur.model.model:311] Building node: images
[DEBUG 2017-03-15 14:43:14,295 kur.model.model:312] Aliases: images
[DEBUG 2017-03-15 14:43:14,295 kur.model.model:313] Inputs:
[DEBUG 2017-03-15 14:43:14,295 kur.containers.layers.placeholder:116] Creating placeholder for "images" with data type "float32".
[DEBUG 2017-03-15 14:43:14,295 kur.model.model:125] Trying to infer shape for input "images"
[DEBUG 2017-03-15 14:43:14,295 kur.model.model:143] Inferred shape for input "images": (28, 28, 1)
[DEBUG 2017-03-15 14:43:14,295 kur.containers.layers.placeholder:126] Inferred shape: (28, 28, 1)
[DEBUG 2017-03-15 14:43:14,305 kur.model.model:382] Value: Tensor("images:0", shape=(?, 28, 28, 1), dtype=float32)
[DEBUG 2017-03-15 14:43:14,306 kur.model.model:311] Building node: ..convolution.0
[DEBUG 2017-03-15 14:43:14,306 kur.model.model:312] Aliases: ..convolution.0
[DEBUG 2017-03-15 14:43:14,306 kur.model.model:313] Inputs:
[DEBUG 2017-03-15 14:43:14,306 kur.model.model:315] - images: Tensor("images:0", shape=(?, 28, 28, 1), dtype=float32)
[WARNING 2017-03-15 14:43:14,306 py.warnings:86] /home/josh/anaconda/envs/tensorflow/lib/python3.5/site-packages/keras/legacy/interfaces.py:86: UserWarning: Update your Conv2D call to the Keras 2 API: Conv2D(strides=[1, 1], activation="linear", name="..convolution.0", padding="same", kernel_size=(2, 2), filters=64)
'` call to the Keras 2 API: ' + signature)

Traceback (most recent call last):
File "/home/josh/anaconda/envs/tensorflow/bin/kur", line 11, in <module>
sys.exit(main())
File "/home/josh/anaconda/envs/tensorflow/lib/python3.5/site-packages/kur/__main__.py", line 269, in main
sys.exit(args.func(args) or 0)
File "/home/josh/anaconda/envs/tensorflow/lib/python3.5/site-packages/kur/__main__.py", line 48, in train
func = spec.get_training_function()
File "/home/josh/anaconda/envs/tensorflow/lib/python3.5/site-packages/kur/kurfile.py", line 282, in get_training_function
model = self.get_model(provider)
File "/home/josh/anaconda/envs/tensorflow/lib/python3.5/site-packages/kur/kurfile.py", line 148, in get_model
self.model.build()
File "/home/josh/anaconda/envs/tensorflow/lib/python3.5/site-packages/kur/model/model.py", line 282, in build
self.build_graph(input_nodes, output_nodes, network)
File "/home/josh/anaconda/envs/tensorflow/lib/python3.5/site-packages/kur/model/model.py", line 339, in build_graph
target=layer
File "/home/josh/anaconda/envs/tensorflow/lib/python3.5/site-packages/kur/backend/keras_backend.py", line 238, in connect
return target(inputs)
File "/home/josh/anaconda/envs/tensorflow/lib/python3.5/site-packages/keras/engine/topology.py", line 545, in __call__
output = self.call(inputs, **kwargs)
File "/home/josh/anaconda/envs/tensorflow/lib/python3.5/site-packages/keras/layers/convolutional.py", line 164, in call
dilation_rate=self.dilation_rate)
File "/home/josh/anaconda/envs/tensorflow/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 2856, in conv2d
x = _preprocess_conv2d_input(x, data_format)
File "/home/josh/anaconda/envs/tensorflow/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 2729, in _preprocess_conv2d_input
if dtype(x) == 'float64':
File "/home/josh/anaconda/envs/tensorflow/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 470, in dtype
return x.dtype.name
AttributeError: 'list' object has no attribute 'dtype'
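
The last frames of the traceback suggest that the convolution layer received a Python list where the Keras backend expected a single tensor: x.dtype fails because x is a list. The sketch below reproduces that failure mode with stand-in objects. FakeTensor, dtype_name, and unwrap_single_input are hypothetical names used for illustration; none of this is Kur or Keras code, and it is not presented as the actual fix.

```python
class FakeDType:
    name = "float32"


class FakeTensor:
    """Stand-in for a backend tensor; only the dtype attribute matters here."""
    dtype = FakeDType()


def dtype_name(x):
    # Mirrors the failing line in the traceback: `return x.dtype.name`
    return x.dtype.name


def unwrap_single_input(inputs):
    """Pass a lone tensor through instead of a one-element list."""
    if isinstance(inputs, (list, tuple)) and len(inputs) == 1:
        return inputs[0]
    return inputs


inputs = [FakeTensor()]

# Passing the list itself triggers the same kind of AttributeError.
try:
    dtype_name(inputs)
    error = None
except AttributeError as exc:
    error = str(exc)
print(error)

# Unwrapping the single element first avoids the error.
result = dtype_name(unwrap_single_input(inputs))
print(result)
```

Since the environment listed in this issue pairs Keras 2.0.0 with kur 0.3.0, and the log already warns about Keras 2 API changes, a version mismatch between the two is a plausible but unconfirmed explanation.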

It is probably an issue with my setup, so here is my environment:

appdirs 1.4.0
astra-toolbox 1.8 np111py35_1 astra-toolbox
blas 1.1 openblas conda-forge
bleach 1.5.0 py35_0 conda-forge
ca-certificates 2016.9.26 0 conda-forge
cairo 1.14.6 0 conda-forge
certifi 2016.9.26 py35_0 conda-forge
click 6.7 py35_0 conda-forge
cycler 0.10.0 py35_0 conda-forge
dbus 1.10.10 1 conda-forge
decorator 4.0.11 py35_0 conda-forge
entrypoints 0.2.2 py35_1 conda-forge
expat 2.1.0 2 conda-forge
flask 0.11.1 py35_0 conda-forge
flask-mongo-sessions 0.2.1
Flask-PyMongo 0.4.1
fontconfig 2.11.1 6 conda-forge
freetype 2.6.3 1 conda-forge
future 0.15.2 py35_0 conda-forge
gettext 0.19.7 1 conda-forge
glib 2.51.0 2 conda-forge
gmp 6.1.2 0 conda-forge
gst-plugins-base 1.8.0 0 conda-forge
gstreamer 1.8.0 1 conda-forge
h5py 2.6.0 np111py35_7 conda-forge
harfbuzz 1.3.4 0 conda-forge
hdf5 1.8.17 9 conda-forge
html5lib 0.999 py35_0
icu 56.1 4 conda-forge
ipykernel 4.5.2 py35_0 conda-forge
ipython 5.1.0 py35_2 conda-forge
ipython_genutils 0.1.0 py35_0 conda-forge
itsdangerous 0.24 py35_0 conda-forge
jinja2 2.8 py35_1 conda-forge
jpeg 9b 0 conda-forge
jsonschema 2.5.1 py35_0 conda-forge
jupyter_client 4.4.0 py35_0 conda-forge
jupyter_console 5.0.0 py35_0 conda-forge
jupyter_core 4.2.1 py35_0 conda-forge
Keras 2.0.0
kur 0.3.0
libastra 1.8 0 astra-toolbox
libffi 3.2.1 3 conda-forge
libgcc 5.2.0 0
libgfortran 3.0.0 1
libiconv 1.14 4 conda-forge
libpng 1.6.28 0 conda-forge
libsodium 1.0.10 0 conda-forge
libtiff 4.0.6 7 conda-forge
libxcb 1.12 0 conda-forge
libxml2 2.9.4 3 conda-forge
markupsafe 0.23 py35_1 conda-forge
matplotlib 2.0.0 np111py35_1 conda-forge
mistune 0.7.3 py35_0 conda-forge
mkl 2017.0.1 0
nbconvert 5.1.1 py35_0 conda-forge
nbformat 4.2.0 py35_0 conda-forge
ncurses 5.9 10 conda-forge
nomkl 1.0 0
notebook 4.2.3 py35_0 conda-forge
numpy 1.12.0
numpy 1.11.3 py35_blas_openblas_200 [blas_openblas] conda-forge
odl 0.5.3 py35_0 odlgroup
openblas 0.2.19 0 conda-forge
openssl 1.0.2h 3 conda-forge
packaging 16.8
pandas 0.19.2 np111py35_1 conda-forge
pandoc 1.19.1 0 conda-forge
pandocfilters 1.4.1 py35_0 conda-forge
pango 1.40.3 0 conda-forge
pcre 8.39 0 conda-forge
pexpect 4.2.1 py35_0 conda-forge
pickleshare 0.7.3 py35_0 conda-forge
pillow 4.0.0 py35_0 conda-forge
pip 9.0.1 py35_0 conda-forge
pixman 0.34.0 0 conda-forge
prompt_toolkit 1.0.9 py35_0 conda-forge
protobuf 3.1.0.post1
ptyprocess 0.5.1 py35_0 conda-forge
py 1.4.31 py35_0 conda-forge
pydicom 0.9.9
pydub 0.18.0
pygments 2.2.0 py35_0 conda-forge
pymongo 3.2.2 py35_0 conda-forge
pyparsing 2.1.10
pyparsing 2.1.10 py35_0 conda-forge
pyqt 4.11.4 py35_2 conda-forge
pytest 3.0.6 py35_0 conda-forge
python 3.5.3 0 conda-forge
python-dateutil 2.6.0 py35_0 conda-forge
python-magic 0.4.12
python-speech-features 0.5
pytz 2016.10 py35_0 conda-forge
PyYAML 3.12
pyzmq 16.0.2 py35_0 conda-forge
qt 4.8.7 3 conda-forge
qtconsole 4.2.1 py35_0 conda-forge
readline 6.2 0 conda-forge
scipy 0.18.1 np111py35_blas_openblas_201 [blas_openblas] conda-forge
setuptools 33.1.0 py35_0 conda-forge
setuptools 34.0.2
simplegeneric 0.8.1 py35_0 conda-forge
sip 4.18 py35_1 conda-forge
six 1.10.0 py35_1 conda-forge
six 1.10.0
sqlite 3.13.0 1 conda-forge
tensorflow 0.12.1
terminado 0.6 py35_0 conda-forge
testpath 0.3 py35_0 conda-forge
Theano 0.8.2
tk 8.5.19 1 conda-forge
tornado 4.4.2 py35_0 conda-forge
tqdm 4.11.2
traitlets 4.3.0 py35_0 conda-forge
wcwidth 0.1.7 py35_0 conda-forge
werkzeug 0.11.10 py35_0 conda-forge
Werkzeug 0.11.15
wheel 0.29.0 py35_0 conda-forge
wheel 0.29.0
widgetsnbextension 1.2.6
xz 5.2.2 0 conda-forge
zeromq 4.1.5 0 conda-forge
zlib 1.2.11 0 conda-forge

Please add some debug info to help find the cause of the error

During training, I hit an error. Have you seen this before, and what could the problem be? Is it a training data problem? I've checked that there is no empty string in the audio transcripts. Could there be another reason? Or is there any debug log output that would show which transcript (for example, by uuid) triggered this error?

[ERROR 2017-03-16 22:28:55,510 kur.model.executor:236] Exception raised during training.
Traceback (most recent call last):
File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/client/session.py", line 1021, in _do_call
return fn(*args)
File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/client/session.py", line 1003, in _run_fn
status, run_metadata)
File "/usr/lib/python3.4/contextlib.py", line 66, in __exit__
next(self.gen)
File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/errors_impl.py", line 469, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Labels length is zero in batch 17
[[Node: CTCLoss = CTCLoss[ctc_merge_repeated=true, preprocess_collapse_repeated=false, _device="/job:localhost/replica:0/task:0/cpu:0"](Log/_659, ToInt64/_661, GatherNd, Squeeze_2/_663)]]
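
A transcript can be a non-empty string and still produce zero labels if every character in it falls outside the label vocabulary. The sketch below is one way to look for such entries before training. It rests on assumptions not stated in the issue: that each transcript is available as a dict with "uuid" and "text" keys, and that the vocabulary is lowercase letters, space, and apostrophe. find_empty_label_entries is a hypothetical helper, not a Kur function.

```python
import string

# Assumed label vocabulary: lowercase letters, space, and apostrophe.
VOCAB = set(string.ascii_lowercase + " '")


def find_empty_label_entries(entries, vocab=VOCAB):
    """Return the uuids of entries whose text maps to zero labels."""
    suspects = []
    for entry in entries:
        text = entry.get("text", "").lower()
        # Keep only the characters the vocabulary can encode.
        labels = [ch for ch in text if ch in vocab]
        if not labels:
            suspects.append(entry.get("uuid"))
    return suspects


# Made-up sample data for illustration.
sample = [
    {"uuid": "a1", "text": "hello world"},
    {"uuid": "b2", "text": "???"},
    {"uuid": "c3", "text": ""},
]
suspects = find_empty_label_entries(sample)
print(suspects)
```

Here "b2" is flagged even though its text is not an empty string, which is the case a plain empty-string check would miss.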

Problem with Sentence generation

Hi,

I'm new to Kur and I'm trying to use it for Sentence Generation. Here's my flow:

1. Input data

# Imports
from keras.utils import np_utils
import numpy
import pickle

# Load data
filename = "shakespeare.txt"
raw_text = open(filename).read()
raw_text = raw_text.lower()

# create mapping of unique chars to integers
chars = sorted(list(set(raw_text)))
char_to_int = dict((c, i) for i, c in enumerate(chars))

# summarize the loaded data
n_chars = len(raw_text)
n_vocab = len(chars)
print "Total Characters: ", n_chars
print "Total Vocab: ", n_vocab

# prepare the dataset of input to output pairs encoded as integers
seq_length = 100
dataX = []
dataY = []
for i in range(0, n_chars - seq_length, 1):
    seq_in = raw_text[i:i + seq_length]
    seq_out = raw_text[i + seq_length]
    dataX.append([char_to_int[char] for char in seq_in])
    dataY.append(char_to_int[seq_out])
n_patterns = len(dataX)
print "Total Patterns: ", n_patterns

# reshape X to be [samples, time steps, features]
X = numpy.reshape(dataX, (n_patterns, seq_length, 1))
# normalize
X = X / float(n_vocab)
# one hot encode the output variable
y = np_utils.to_categorical(dataY)

# Data to file
output_file = 'shakespeare_data'
with open(output_file, 'wb') as fh:
    fh.write(pickle.dumps({'in' : X, 'out' : y}))

# Test out pickle data
with open('shakespeare_data', 'rb') as fh:
    data = pickle.loads(fh.read())
print data.keys()

# in
print 'IN'
print data['in'][:1], type(data['in'])

# out
print 'OUT'
print data['out'][:1], type(data['out'])

print 'Input shape: {} {}'.format(X.shape[1], X.shape[2])
print 'Output shape: {}'.format(y.shape[1])

2. Kurfile

---
model:
  - input:
      shape: [100, 1]
    name: in
  - recurrent:
      size: 256
      type: lstm
  - dropout: 0.2
  - dense: 65
  - activation: softmax
    name: out

train:
  data: 
    - pickle: shakespeare_data
  epochs: 10
  log: shakespeare-log
  optimizer: adam

loss:
  - target: out
    name: categorical_crossentropy
...

3. Train with optimizer

kur train kurfile.yml

kur -vv train shakespeare.yml
[INFO 2017-04-09 19:35:44,036 kur.kurfile:754] Parsing source: shakespeare.yml, included by top-level.
[INFO 2017-04-09 19:35:44,040 kur.kurfile:87] Parsing Kurfile...
[DEBUG 2017-04-09 19:35:44,041 kur.kurfile:905] Parsing Kurfile section: settings
[DEBUG 2017-04-09 19:35:44,041 kur.kurfile:905] Parsing Kurfile section: train
[DEBUG 2017-04-09 19:35:44,046 kur.kurfile:905] Parsing Kurfile section: validate
[DEBUG 2017-04-09 19:35:44,046 kur.kurfile:905] Parsing Kurfile section: test
[DEBUG 2017-04-09 19:35:44,046 kur.kurfile:905] Parsing Kurfile section: evaluate
[DEBUG 2017-04-09 19:35:44,048 kur.kurfile:905] Parsing Kurfile section: loss
[INFO 2017-04-09 19:35:44,050 kur.loggers.binary_logger:71] Loading log data: shakespeare-log
[DEBUG 2017-04-09 19:35:44,050 kur.loggers.binary_logger:78] Loading old-style binary logger.
[DEBUG 2017-04-09 19:35:44,050 kur.loggers.binary_logger:187] Loading binary column: training_loss_total
[DEBUG 2017-04-09 19:35:44,050 kur.loggers.binary_logger:195] No such log column exists: shakespeare-log/training_loss_total
[DEBUG 2017-04-09 19:35:44,050 kur.loggers.binary_logger:187] Loading binary column: training_loss_batch
[DEBUG 2017-04-09 19:35:44,050 kur.loggers.binary_logger:195] No such log column exists: shakespeare-log/training_loss_batch
[DEBUG 2017-04-09 19:35:44,050 kur.loggers.binary_logger:187] Loading binary column: training_loss_time
[DEBUG 2017-04-09 19:35:44,051 kur.loggers.binary_logger:195] No such log column exists: shakespeare-log/training_loss_time
[DEBUG 2017-04-09 19:35:44,051 kur.loggers.binary_logger:187] Loading binary column: validation_loss_total
[DEBUG 2017-04-09 19:35:44,051 kur.loggers.binary_logger:195] No such log column exists: shakespeare-log/validation_loss_total
[DEBUG 2017-04-09 19:35:44,051 kur.loggers.binary_logger:187] Loading binary column: validation_loss_batch
[DEBUG 2017-04-09 19:35:44,051 kur.loggers.binary_logger:195] No such log column exists: shakespeare-log/validation_loss_batch
[DEBUG 2017-04-09 19:35:44,051 kur.loggers.binary_logger:187] Loading binary column: validation_loss_time
[DEBUG 2017-04-09 19:35:44,051 kur.loggers.binary_logger:195] No such log column exists: shakespeare-log/validation_loss_time
[WARNING 2017-04-09 19:35:44,686 kur.supplier.pickle_supplier:67] We needed to explicitly set a "latin1" encoding to properly load the pickled data. This is probably because the pickled data was created in Python 2. You really should switch over to Python 3 in order to ensure future compatibility.
[DEBUG 2017-04-09 19:35:44,706 kur.providers.batch_provider:57] Batch size set to: 32
[DEBUG 2017-04-09 19:35:44,707 kur.backend.backend:270] Using backend: keras
[DEBUG 2017-04-09 19:35:44,707 kur.backend.backend:69] No execution device indicated to backend. Checking available devices...
[DEBUG 2017-04-09 19:35:44,707 kur.utils.cuda:161] Loading NVIDIA ML library.
[DEBUG 2017-04-09 19:35:44,708 kur.backend.backend:75] Failed to initialize CUDA. Falling back to CPU.
[INFO 2017-04-09 19:35:44,708 kur.backend.backend:154] Creating backend: keras
[INFO 2017-04-09 19:35:44,708 kur.backend.backend:157] Backend variants: none
[INFO 2017-04-09 19:35:44,708 kur.backend.keras_backend:124] No particular backend for Keras has been requested.
[DEBUG 2017-04-09 19:35:44,709 kur.backend.keras_backend:126] Using the system-default Keras backend.
[INFO 2017-04-09 19:35:44,709 kur.backend.keras_backend:175] Requesting CPU
[DEBUG 2017-04-09 19:35:44,710 kur.backend.keras_backend:186] Overriding environmental variables: {'KERAS_BACKEND': None, 'THEANO_FLAGS': 'force_device=true,device=cpu', 'CUDA_VISIBLE_DEVICES': '100', 'TF_CPP_MIN_LOG_LEVEL': '1'}
[INFO 2017-04-09 19:35:46,222 kur.backend.keras_backend:192] Keras is loaded. The backend is: tensorflow
[INFO 2017-04-09 19:35:46,223 kur.model.model:261] Enumerating the model containers.
[INFO 2017-04-09 19:35:46,223 kur.model.model:266] Assembling the model dependency graph.
[DEBUG 2017-04-09 19:35:46,223 kur.model.model:273] Assembled Node: in
[DEBUG 2017-04-09 19:35:46,223 kur.model.model:275]   Uses:
[DEBUG 2017-04-09 19:35:46,223 kur.model.model:277]   Used by: ..recurrent.0
[DEBUG 2017-04-09 19:35:46,223 kur.model.model:278]   Aliases: in
[DEBUG 2017-04-09 19:35:46,223 kur.model.model:273] Assembled Node: ..recurrent.0
[DEBUG 2017-04-09 19:35:46,223 kur.model.model:275]   Uses: in
[DEBUG 2017-04-09 19:35:46,224 kur.model.model:277]   Used by: ..dropout.0
[DEBUG 2017-04-09 19:35:46,224 kur.model.model:278]   Aliases: ..recurrent.0
[DEBUG 2017-04-09 19:35:46,224 kur.model.model:273] Assembled Node: ..dropout.0
[DEBUG 2017-04-09 19:35:46,224 kur.model.model:275]   Uses: ..recurrent.0
[DEBUG 2017-04-09 19:35:46,224 kur.model.model:277]   Used by: ..dense.0
[DEBUG 2017-04-09 19:35:46,224 kur.model.model:278]   Aliases: ..dropout.0
[DEBUG 2017-04-09 19:35:46,224 kur.model.model:273] Assembled Node: ..dense.0
[DEBUG 2017-04-09 19:35:46,224 kur.model.model:275]   Uses: ..dropout.0
[DEBUG 2017-04-09 19:35:46,224 kur.model.model:277]   Used by: out
[DEBUG 2017-04-09 19:35:46,225 kur.model.model:278]   Aliases: ..dense.0
[DEBUG 2017-04-09 19:35:46,225 kur.model.model:273] Assembled Node: out
[DEBUG 2017-04-09 19:35:46,225 kur.model.model:275]   Uses: ..dense.0
[DEBUG 2017-04-09 19:35:46,225 kur.model.model:277]   Used by:
[DEBUG 2017-04-09 19:35:46,225 kur.model.model:278]   Aliases: out
[INFO 2017-04-09 19:35:46,225 kur.model.model:281] Connecting the model graph.
[DEBUG 2017-04-09 19:35:46,226 kur.model.model:312] Building node: in
[DEBUG 2017-04-09 19:35:46,226 kur.model.model:313]   Aliases: in
[DEBUG 2017-04-09 19:35:46,226 kur.model.model:314]   Inputs:
[DEBUG 2017-04-09 19:35:46,226 kur.containers.layers.placeholder:161] Creating placeholder for "in" with data type "float32".
[DEBUG 2017-04-09 19:35:46,226 kur.model.model:126] Trying to infer shape for input "in"
[DEBUG 2017-04-09 19:35:46,226 kur.model.model:144] Inferred shape for input "in": (100, 1)
[DEBUG 2017-04-09 19:35:46,249 kur.model.model:394]   Value: Tensor("in:0", shape=(?, 100, 1), dtype=float32)
[DEBUG 2017-04-09 19:35:46,250 kur.model.model:312] Building node: ..recurrent.0
[DEBUG 2017-04-09 19:35:46,250 kur.model.model:313]   Aliases: ..recurrent.0
[DEBUG 2017-04-09 19:35:46,250 kur.model.model:314]   Inputs:
[DEBUG 2017-04-09 19:35:46,250 kur.model.model:316]   - in: Tensor("in:0", shape=(?, 100, 1), dtype=float32)
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
[DEBUG 2017-04-09 19:35:46,713 kur.model.model:394]   Value: Tensor("..recurrent.0/transpose_1:0", shape=(?, ?, 256), dtype=float32)
[DEBUG 2017-04-09 19:35:46,713 kur.model.model:312] Building node: ..dropout.0
[DEBUG 2017-04-09 19:35:46,713 kur.model.model:313]   Aliases: ..dropout.0
[DEBUG 2017-04-09 19:35:46,713 kur.model.model:314]   Inputs:
[DEBUG 2017-04-09 19:35:46,713 kur.model.model:316]   - ..recurrent.0: Tensor("..recurrent.0/transpose_1:0", shape=(?, ?, 256), dtype=float32)
[DEBUG 2017-04-09 19:35:46,736 kur.model.model:394]   Value: Tensor("..dropout.0/cond/Merge:0", shape=(?, ?, 256), dtype=float32)
[DEBUG 2017-04-09 19:35:46,736 kur.model.model:312] Building node: ..dense.0
[DEBUG 2017-04-09 19:35:46,736 kur.model.model:313]   Aliases: ..dense.0
[DEBUG 2017-04-09 19:35:46,736 kur.model.model:314]   Inputs:
[DEBUG 2017-04-09 19:35:46,736 kur.model.model:316]   - ..dropout.0: Tensor("..dropout.0/cond/Merge:0", shape=(?, ?, 256), dtype=float32)
[DEBUG 2017-04-09 19:35:46,786 kur.model.model:394]   Value: Tensor("..dense.0/add:0", shape=(?, 100, 65), dtype=float32)
[DEBUG 2017-04-09 19:35:46,786 kur.model.model:312] Building node: out
[DEBUG 2017-04-09 19:35:46,786 kur.model.model:313]   Aliases: out
[DEBUG 2017-04-09 19:35:46,786 kur.model.model:314]   Inputs:
[DEBUG 2017-04-09 19:35:46,786 kur.model.model:316]   - ..dense.0: Tensor("..dense.0/add:0", shape=(?, 100, 65), dtype=float32)
[DEBUG 2017-04-09 19:35:46,800 kur.model.model:394]   Value: Tensor("out/truediv:0", shape=(?, 100, 65), dtype=float32)
[INFO 2017-04-09 19:35:46,800 kur.model.model:285] Model inputs:  in
[INFO 2017-04-09 19:35:46,800 kur.model.model:286] Model outputs: out
Traceback (most recent call last):
  File "/Users/nakul/.virtualenvs/kur/bin/kur", line 11, in <module>
    sys.exit(main())
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/__main__.py", line 385, in main
    sys.exit(args.func(args) or 0)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/__main__.py", line 62, in train
    func = spec.get_training_function()
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/kurfile.py", line 373, in get_training_function
    trainer = self.get_trainer()
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/kurfile.py", line 502, in get_trainer
    optimizer=self.get_optimizer() if with_optimizer else None
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/kurfile.py", line 517, in get_optimizer
    spec = dict(self.data['train'].get('optimizer', {}))
ValueError: dictionary update sequence element #0 has length 1; 2 is required
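
The ValueError above can be reproduced in plain Python: the last frame calls dict() on the optimizer value, and dict() on a bare string such as "adam" fails in exactly this way. The sketch below shows the failure and one way a string could be normalized into a mapping. normalize_optimizer is a hypothetical helper, and the {"name": ...} layout is an assumption about what the Kurfile expects; check the Kur documentation for the exact optimizer syntax.

```python
def normalize_optimizer(spec):
    """Turn a bare optimizer name into a mapping; copy mappings as-is.

    Hypothetical helper; the "name" key is an assumed convention.
    """
    if isinstance(spec, str):
        return {"name": spec}
    return dict(spec or {})


# What the train section looks like after parsing `optimizer: adam`.
train_section = {"optimizer": "adam"}

# Same call shape as the failing line in the traceback.
try:
    dict(train_section.get("optimizer", {}))
    message = None
except ValueError as exc:
    message = str(exc)
print(message)

print(normalize_optimizer("adam"))
print(normalize_optimizer({"name": "adam", "learning_rate": 0.001}))
```

Under that assumption, writing the optimizer as a mapping with a name key in the Kurfile, rather than as a bare string, would avoid the error.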

4. Train without optimizer

kur train kurfile.yml

[INFO 2017-04-09 21:13:32,555 kur.kurfile:754] Parsing source: shakespeare.yml, included by top-level.
[INFO 2017-04-09 21:13:32,560 kur.kurfile:87] Parsing Kurfile...
[DEBUG 2017-04-09 21:13:32,560 kur.kurfile:905] Parsing Kurfile section: settings
[DEBUG 2017-04-09 21:13:32,560 kur.kurfile:905] Parsing Kurfile section: train
[DEBUG 2017-04-09 21:13:32,564 kur.kurfile:905] Parsing Kurfile section: validate
[DEBUG 2017-04-09 21:13:32,564 kur.kurfile:905] Parsing Kurfile section: test
[DEBUG 2017-04-09 21:13:32,564 kur.kurfile:905] Parsing Kurfile section: evaluate
[DEBUG 2017-04-09 21:13:32,567 kur.kurfile:905] Parsing Kurfile section: loss
[INFO 2017-04-09 21:13:32,568 kur.loggers.binary_logger:71] Loading log data: shakespeare-log
[DEBUG 2017-04-09 21:13:32,568 kur.loggers.binary_logger:78] Loading old-style binary logger.
[DEBUG 2017-04-09 21:13:32,568 kur.loggers.binary_logger:187] Loading binary column: training_loss_total
[DEBUG 2017-04-09 21:13:32,568 kur.loggers.binary_logger:195] No such log column exists: shakespeare-log/training_loss_total
[DEBUG 2017-04-09 21:13:32,568 kur.loggers.binary_logger:187] Loading binary column: training_loss_batch
[DEBUG 2017-04-09 21:13:32,568 kur.loggers.binary_logger:195] No such log column exists: shakespeare-log/training_loss_batch
[DEBUG 2017-04-09 21:13:32,568 kur.loggers.binary_logger:187] Loading binary column: training_loss_time
[DEBUG 2017-04-09 21:13:32,568 kur.loggers.binary_logger:195] No such log column exists: shakespeare-log/training_loss_time
[DEBUG 2017-04-09 21:13:32,568 kur.loggers.binary_logger:187] Loading binary column: validation_loss_total
[DEBUG 2017-04-09 21:13:32,568 kur.loggers.binary_logger:195] No such log column exists: shakespeare-log/validation_loss_total
[DEBUG 2017-04-09 21:13:32,569 kur.loggers.binary_logger:187] Loading binary column: validation_loss_batch
[DEBUG 2017-04-09 21:13:32,569 kur.loggers.binary_logger:195] No such log column exists: shakespeare-log/validation_loss_batch
[DEBUG 2017-04-09 21:13:32,569 kur.loggers.binary_logger:187] Loading binary column: validation_loss_time
[DEBUG 2017-04-09 21:13:32,569 kur.loggers.binary_logger:195] No such log column exists: shakespeare-log/validation_loss_time
[WARNING 2017-04-09 21:13:33,539 kur.supplier.pickle_supplier:67] We needed to explicitly set a "latin1" encoding to properly load the pickled data. This is probably because the pickled data was created in Python 2. You really should switch over to Python 3 in order to ensure future compatibility.
[DEBUG 2017-04-09 21:13:33,552 kur.providers.batch_provider:57] Batch size set to: 32
[DEBUG 2017-04-09 21:13:33,554 kur.backend.backend:270] Using backend: keras
[DEBUG 2017-04-09 21:13:33,555 kur.backend.backend:69] No execution device indicated to backend. Checking available devices...
[DEBUG 2017-04-09 21:13:33,555 kur.utils.cuda:161] Loading NVIDIA ML library.
[DEBUG 2017-04-09 21:13:33,555 kur.backend.backend:75] Failed to initialize CUDA. Falling back to CPU.
[INFO 2017-04-09 21:13:33,555 kur.backend.backend:154] Creating backend: keras
[INFO 2017-04-09 21:13:33,555 kur.backend.backend:157] Backend variants: none
[INFO 2017-04-09 21:13:33,555 kur.backend.keras_backend:124] No particular backend for Keras has been requested.
[DEBUG 2017-04-09 21:13:33,556 kur.backend.keras_backend:126] Using the system-default Keras backend.
[INFO 2017-04-09 21:13:33,556 kur.backend.keras_backend:175] Requesting CPU
[DEBUG 2017-04-09 21:13:33,556 kur.backend.keras_backend:186] Overriding environmental variables: {'KERAS_BACKEND': None, 'THEANO_FLAGS': 'force_device=true,device=cpu', 'CUDA_VISIBLE_DEVICES': '100', 'TF_CPP_MIN_LOG_LEVEL': '1'}
[INFO 2017-04-09 21:13:35,016 kur.backend.keras_backend:192] Keras is loaded. The backend is: tensorflow
[INFO 2017-04-09 21:13:35,016 kur.model.model:261] Enumerating the model containers.
[INFO 2017-04-09 21:13:35,016 kur.model.model:266] Assembling the model dependency graph.
[DEBUG 2017-04-09 21:13:35,017 kur.model.model:273] Assembled Node: in
[DEBUG 2017-04-09 21:13:35,017 kur.model.model:275]   Uses:
[DEBUG 2017-04-09 21:13:35,017 kur.model.model:277]   Used by: ..recurrent.0
[DEBUG 2017-04-09 21:13:35,017 kur.model.model:278]   Aliases: in
[DEBUG 2017-04-09 21:13:35,017 kur.model.model:273] Assembled Node: ..recurrent.0
[DEBUG 2017-04-09 21:13:35,017 kur.model.model:275]   Uses: in
[DEBUG 2017-04-09 21:13:35,017 kur.model.model:277]   Used by: ..dropout.0
[DEBUG 2017-04-09 21:13:35,018 kur.model.model:278]   Aliases: ..recurrent.0
[DEBUG 2017-04-09 21:13:35,018 kur.model.model:273] Assembled Node: ..dropout.0
[DEBUG 2017-04-09 21:13:35,018 kur.model.model:275]   Uses: ..recurrent.0
[DEBUG 2017-04-09 21:13:35,018 kur.model.model:277]   Used by: ..dense.0
[DEBUG 2017-04-09 21:13:35,018 kur.model.model:278]   Aliases: ..dropout.0
[DEBUG 2017-04-09 21:13:35,018 kur.model.model:273] Assembled Node: ..dense.0
[DEBUG 2017-04-09 21:13:35,018 kur.model.model:275]   Uses: ..dropout.0
[DEBUG 2017-04-09 21:13:35,018 kur.model.model:277]   Used by: out
[DEBUG 2017-04-09 21:13:35,018 kur.model.model:278]   Aliases: ..dense.0
[DEBUG 2017-04-09 21:13:35,019 kur.model.model:273] Assembled Node: out
[DEBUG 2017-04-09 21:13:35,019 kur.model.model:275]   Uses: ..dense.0
[DEBUG 2017-04-09 21:13:35,019 kur.model.model:277]   Used by:
[DEBUG 2017-04-09 21:13:35,019 kur.model.model:278]   Aliases: out
[INFO 2017-04-09 21:13:35,019 kur.model.model:281] Connecting the model graph.
[DEBUG 2017-04-09 21:13:35,025 kur.model.model:312] Building node: in
[DEBUG 2017-04-09 21:13:35,026 kur.model.model:313]   Aliases: in
[DEBUG 2017-04-09 21:13:35,026 kur.model.model:314]   Inputs:
[DEBUG 2017-04-09 21:13:35,026 kur.containers.layers.placeholder:161] Creating placeholder for "in" with data type "float32".
[DEBUG 2017-04-09 21:13:35,026 kur.model.model:126] Trying to infer shape for input "in"
[DEBUG 2017-04-09 21:13:35,026 kur.model.model:144] Inferred shape for input "in": (100, 1)
[DEBUG 2017-04-09 21:13:35,047 kur.model.model:394]   Value: Tensor("in:0", shape=(?, 100, 1), dtype=float32)
[DEBUG 2017-04-09 21:13:35,047 kur.model.model:312] Building node: ..recurrent.0
[DEBUG 2017-04-09 21:13:35,047 kur.model.model:313]   Aliases: ..recurrent.0
[DEBUG 2017-04-09 21:13:35,048 kur.model.model:314]   Inputs:
[DEBUG 2017-04-09 21:13:35,048 kur.model.model:316]   - in: Tensor("in:0", shape=(?, 100, 1), dtype=float32)
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
[DEBUG 2017-04-09 21:13:35,497 kur.model.model:394]   Value: Tensor("..recurrent.0/transpose_1:0", shape=(?, ?, 256), dtype=float32)
[DEBUG 2017-04-09 21:13:35,497 kur.model.model:312] Building node: ..dropout.0
[DEBUG 2017-04-09 21:13:35,497 kur.model.model:313]   Aliases: ..dropout.0
[DEBUG 2017-04-09 21:13:35,497 kur.model.model:314]   Inputs:
[DEBUG 2017-04-09 21:13:35,498 kur.model.model:316]   - ..recurrent.0: Tensor("..recurrent.0/transpose_1:0", shape=(?, ?, 256), dtype=float32)
[DEBUG 2017-04-09 21:13:35,518 kur.model.model:394]   Value: Tensor("..dropout.0/cond/Merge:0", shape=(?, ?, 256), dtype=float32)
[DEBUG 2017-04-09 21:13:35,518 kur.model.model:312] Building node: ..dense.0
[DEBUG 2017-04-09 21:13:35,518 kur.model.model:313]   Aliases: ..dense.0
[DEBUG 2017-04-09 21:13:35,518 kur.model.model:314]   Inputs:
[DEBUG 2017-04-09 21:13:35,518 kur.model.model:316]   - ..dropout.0: Tensor("..dropout.0/cond/Merge:0", shape=(?, ?, 256), dtype=float32)
[DEBUG 2017-04-09 21:13:35,551 kur.model.model:394]   Value: Tensor("..dense.0/add:0", shape=(?, 100, 65), dtype=float32)
[DEBUG 2017-04-09 21:13:35,551 kur.model.model:312] Building node: out
[DEBUG 2017-04-09 21:13:35,551 kur.model.model:313]   Aliases: out
[DEBUG 2017-04-09 21:13:35,551 kur.model.model:314]   Inputs:
[DEBUG 2017-04-09 21:13:35,552 kur.model.model:316]   - ..dense.0: Tensor("..dense.0/add:0", shape=(?, 100, 65), dtype=float32)
[DEBUG 2017-04-09 21:13:35,559 kur.model.model:394]   Value: Tensor("out/truediv:0", shape=(?, 100, 65), dtype=float32)
[INFO 2017-04-09 21:13:35,559 kur.model.model:285] Model inputs:  in
[INFO 2017-04-09 21:13:35,559 kur.model.model:286] Model outputs: out
[INFO 2017-04-09 21:13:35,559 kur.model.executor:591] No historical training loss available from logs.
[INFO 2017-04-09 21:13:35,560 kur.model.executor:599] No historical validation loss available from logs.
[INFO 2017-04-09 21:13:35,560 kur.model.executor:612] No previous epochs.
[DEBUG 2017-04-09 21:13:35,560 kur.model.executor:108] Recompiling the model.
[DEBUG 2017-04-09 21:13:35,560 kur.backend.keras_backend:592] Instantiating a Keras model.
[DEBUG 2017-04-09 21:13:35,561 kur.backend.keras_backend:603] _________________________________________________________________
[DEBUG 2017-04-09 21:13:35,561 kur.backend.keras_backend:603] Layer (type)                 Output Shape              Param #
[DEBUG 2017-04-09 21:13:35,561 kur.backend.keras_backend:603] =================================================================
[DEBUG 2017-04-09 21:13:35,561 kur.backend.keras_backend:603] in (InputLayer)              (None, 100, 1)            0
[DEBUG 2017-04-09 21:13:35,561 kur.backend.keras_backend:603] _________________________________________________________________
[DEBUG 2017-04-09 21:13:35,561 kur.backend.keras_backend:603] ..recurrent.0 (LSTM)         (None, 100, 256)          264192
[DEBUG 2017-04-09 21:13:35,561 kur.backend.keras_backend:603] _________________________________________________________________
[DEBUG 2017-04-09 21:13:35,561 kur.backend.keras_backend:603] ..dropout.0 (Dropout)        (None, 100, 256)          0
[DEBUG 2017-04-09 21:13:35,561 kur.backend.keras_backend:603] _________________________________________________________________
[DEBUG 2017-04-09 21:13:35,562 kur.backend.keras_backend:603] ..dense.0 (Dense)            (None, 100, 65)           16705
[DEBUG 2017-04-09 21:13:35,562 kur.backend.keras_backend:603] _________________________________________________________________
[DEBUG 2017-04-09 21:13:35,562 kur.backend.keras_backend:603] out (Activation)             (None, 100, 65)           0
[DEBUG 2017-04-09 21:13:35,562 kur.backend.keras_backend:603] =================================================================
[DEBUG 2017-04-09 21:13:35,562 kur.backend.keras_backend:603] Total params: 280,897.0
[DEBUG 2017-04-09 21:13:35,562 kur.backend.keras_backend:603] Trainable params: 280,897.0
[DEBUG 2017-04-09 21:13:35,562 kur.backend.keras_backend:603] Non-trainable params: 0.0
[DEBUG 2017-04-09 21:13:35,562 kur.backend.keras_backend:603] _________________________________________________________________
[DEBUG 2017-04-09 21:13:35,562 kur.backend.keras_backend:603]
[DEBUG 2017-04-09 21:13:35,562 kur.backend.keras_backend:645] Assembling a training function from the model.
[DEBUG 2017-04-09 21:13:35,580 kur.backend.keras_backend:575] Adding additional inputs: out
[DEBUG 2017-04-09 21:13:37,026 kur.backend.keras_backend:668] Additional inputs for log functions: out
[DEBUG 2017-04-09 21:13:37,026 kur.backend.keras_backend:685] Expected input shapes: in=(None, 100, 1), out=(None, None, None)
[DEBUG 2017-04-09 21:13:37,027 kur.backend.keras_backend:703] Compiled model: {'func': <keras.backend.tensorflow_backend.Function object at 0x11146de48>, 'names': {'input': ['in', 'out'], 'output': ['out', 'out']}, 'shapes': {'input': [(None, 100, 1), (None, None, None)]}}
[INFO 2017-04-09 21:13:37,754 kur.backend.keras_backend:736] Waiting for model to finish compiling...
[DEBUG 2017-04-09 21:13:37,754 kur.providers.batch_provider:139] Preparing next batch of data...
[DEBUG 2017-04-09 21:13:37,754 kur.providers.batch_provider:204] Next batch of data has been prepared.
[ERROR 2017-04-09 21:13:38,016 kur.model.executor:293] Exception raised during training.
Traceback (most recent call last):
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1022, in _do_call
    return fn(*args)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1004, in _run_fn
    status, run_metadata)
  File "/Users/nakul/.pyenv/versions/3.6.1/lib/python3.6/contextlib.py", line 89, in __exit__
    next(self.gen)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 469, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [2,65,1] vs. [2,100,65]
	 [[Node: gradients/mul_grad/BroadcastGradientArgs = BroadcastGradientArgs[T=DT_INT32, _class=["loc:@mul"], _device="/job:localhost/replica:0/task:0/cpu:0"](gradients/mul_grad/Shape, gradients/mul_grad/Shape_1)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/model/executor.py", line 290, in train
    **kwargs
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/model/executor.py", line 725, in wrapped_train
    self.compile('train', with_provider=provider)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/model/executor.py", line 114, in compile
    **kwargs
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/backend/keras_backend.py", line 708, in compile
    self.wait_for_compile(model, key)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/backend/keras_backend.py", line 738, in wait_for_compile
    self.run_batch(model, batch, key, False)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/backend/keras_backend.py", line 780, in run_batch
    outputs = compiled['func'](inputs)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2073, in __call__
    feed_dict=feed_dict)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 767, in run
    run_metadata_ptr)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 965, in _run
    feed_dict_string, options, run_metadata)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1015, in _do_run
    target_list, options, run_metadata)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1035, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [2,65,1] vs. [2,100,65]
	 [[Node: gradients/mul_grad/BroadcastGradientArgs = BroadcastGradientArgs[T=DT_INT32, _class=["loc:@mul"], _device="/job:localhost/replica:0/task:0/cpu:0"](gradients/mul_grad/Shape, gradients/mul_grad/Shape_1)]]

Caused by op 'gradients/mul_grad/BroadcastGradientArgs', defined at:
  File "/Users/nakul/.virtualenvs/kur/bin/kur", line 11, in <module>
    sys.exit(main())
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/__main__.py", line 385, in main
    sys.exit(args.func(args) or 0)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/__main__.py", line 63, in train
    func(step=args.step)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/kurfile.py", line 414, in func
    return trainer.train(**defaults)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/model/executor.py", line 290, in train
    **kwargs
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/model/executor.py", line 725, in wrapped_train
    self.compile('train', with_provider=provider)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/model/executor.py", line 114, in compile
    **kwargs
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/backend/keras_backend.py", line 653, in compile
    compiled.trainable_weights, total_loss
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/optimizer/optimizer.py", line 47, in optimize
    return keras_optimizer.get_updates(weights, [], loss)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/keras/optimizers.py", line 381, in get_updates
    grads = self.get_gradients(loss, params)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/keras/optimizers.py", line 47, in get_gradients
    grads = K.gradients(loss, params)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2108, in gradients
    return tf.gradients(loss, variables, colocate_gradients_with_ops=True)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/ops/gradients_impl.py", line 482, in gradients
    in_grads = grad_fn(op, *out_grads)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/ops/math_grad.py", line 610, in _MulGrad
    rx, ry = gen_array_ops._broadcast_gradient_args(sx, sy)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 411, in _broadcast_gradient_args
    name=name)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
    op_def=op_def)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2395, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1264, in __init__
    self._traceback = _extract_stack()

...which was originally created as op 'mul', defined at:
  File "/Users/nakul/.virtualenvs/kur/bin/kur", line 11, in <module>
    sys.exit(main())
[elided 5 identical lines from previous traceback]
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/model/executor.py", line 114, in compile
    **kwargs
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/backend/keras_backend.py", line 650, in compile
    self.process_loss(model, loss)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/backend/keras_backend.py", line 566, in process_loss
    self.find_compiled_layer_by_name(model, target)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/loss/categorical_crossentropy.py", line 36, in get_loss
    return keras_wrap(model, target, output, 'categorical_crossentropy')
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/loss/loss.py", line 37, in keras_wrap
    out = loss(ins[0][1], output)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/keras/losses.py", line 37, in categorical_crossentropy
    return K.categorical_crossentropy(y_pred, y_true)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2552, in categorical_crossentropy
    return - tf.reduce_sum(target * tf.log(output),
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 884, in binary_op_wrapper
    return func(x, y, name=name)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 1105, in _mul_dispatch
    return gen_math_ops._mul(x, y, name=name)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/ops/gen_math_ops.py", line 1625, in _mul
    result = _op_def_lib.apply_op("Mul", x=x, y=y, name=name)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
    op_def=op_def)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2395, in create_op
    original_op=self._default_original_op, op_def=op_def)

InvalidArgumentError (see above for traceback): Incompatible shapes: [2,65,1] vs. [2,100,65]
	 [[Node: gradients/mul_grad/BroadcastGradientArgs = BroadcastGradientArgs[T=DT_INT32, _class=["loc:@mul"], _device="/job:localhost/replica:0/task:0/cpu:0"](gradients/mul_grad/Shape, gradients/mul_grad/Shape_1)]]

Traceback (most recent call last):
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1022, in _do_call
    return fn(*args)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1004, in _run_fn
    status, run_metadata)
  File "/Users/nakul/.pyenv/versions/3.6.1/lib/python3.6/contextlib.py", line 89, in __exit__
    next(self.gen)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 469, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [2,65,1] vs. [2,100,65]
	 [[Node: gradients/mul_grad/BroadcastGradientArgs = BroadcastGradientArgs[T=DT_INT32, _class=["loc:@mul"], _device="/job:localhost/replica:0/task:0/cpu:0"](gradients/mul_grad/Shape, gradients/mul_grad/Shape_1)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/nakul/.virtualenvs/kur/bin/kur", line 11, in <module>
    sys.exit(main())
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/__main__.py", line 385, in main
    sys.exit(args.func(args) or 0)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/__main__.py", line 63, in train
    func(step=args.step)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/kurfile.py", line 414, in func
    return trainer.train(**defaults)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/model/executor.py", line 290, in train
    **kwargs
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/model/executor.py", line 725, in wrapped_train
    self.compile('train', with_provider=provider)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/model/executor.py", line 114, in compile
    **kwargs
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/backend/keras_backend.py", line 708, in compile
    self.wait_for_compile(model, key)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/backend/keras_backend.py", line 738, in wait_for_compile
    self.run_batch(model, batch, key, False)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/backend/keras_backend.py", line 780, in run_batch
    outputs = compiled['func'](inputs)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2073, in __call__
    feed_dict=feed_dict)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 767, in run
    run_metadata_ptr)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 965, in _run
    feed_dict_string, options, run_metadata)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1015, in _do_run
    target_list, options, run_metadata)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1035, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [2,65,1] vs. [2,100,65]
	 [[Node: gradients/mul_grad/BroadcastGradientArgs = BroadcastGradientArgs[T=DT_INT32, _class=["loc:@mul"], _device="/job:localhost/replica:0/task:0/cpu:0"](gradients/mul_grad/Shape, gradients/mul_grad/Shape_1)]]

Caused by op 'gradients/mul_grad/BroadcastGradientArgs', defined at:
  File "/Users/nakul/.virtualenvs/kur/bin/kur", line 11, in <module>
    sys.exit(main())
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/__main__.py", line 385, in main
    sys.exit(args.func(args) or 0)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/__main__.py", line 63, in train
    func(step=args.step)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/kurfile.py", line 414, in func
    return trainer.train(**defaults)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/model/executor.py", line 290, in train
    **kwargs
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/model/executor.py", line 725, in wrapped_train
    self.compile('train', with_provider=provider)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/model/executor.py", line 114, in compile
    **kwargs
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/backend/keras_backend.py", line 653, in compile
    compiled.trainable_weights, total_loss
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/optimizer/optimizer.py", line 47, in optimize
    return keras_optimizer.get_updates(weights, [], loss)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/keras/optimizers.py", line 381, in get_updates
    grads = self.get_gradients(loss, params)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/keras/optimizers.py", line 47, in get_gradients
    grads = K.gradients(loss, params)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2108, in gradients
    return tf.gradients(loss, variables, colocate_gradients_with_ops=True)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/ops/gradients_impl.py", line 482, in gradients
    in_grads = grad_fn(op, *out_grads)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/ops/math_grad.py", line 610, in _MulGrad
    rx, ry = gen_array_ops._broadcast_gradient_args(sx, sy)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 411, in _broadcast_gradient_args
    name=name)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
    op_def=op_def)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2395, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1264, in __init__
    self._traceback = _extract_stack()

...which was originally created as op 'mul', defined at:
  File "/Users/nakul/.virtualenvs/kur/bin/kur", line 11, in <module>
    sys.exit(main())
[elided 5 identical lines from previous traceback]
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/model/executor.py", line 114, in compile
    **kwargs
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/backend/keras_backend.py", line 650, in compile
    self.process_loss(model, loss)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/backend/keras_backend.py", line 566, in process_loss
    self.find_compiled_layer_by_name(model, target)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/loss/categorical_crossentropy.py", line 36, in get_loss
    return keras_wrap(model, target, output, 'categorical_crossentropy')
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/kur/loss/loss.py", line 37, in keras_wrap
    out = loss(ins[0][1], output)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/keras/losses.py", line 37, in categorical_crossentropy
    return K.categorical_crossentropy(y_pred, y_true)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2552, in categorical_crossentropy
    return - tf.reduce_sum(target * tf.log(output),
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 884, in binary_op_wrapper
    return func(x, y, name=name)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 1105, in _mul_dispatch
    return gen_math_ops._mul(x, y, name=name)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/ops/gen_math_ops.py", line 1625, in _mul
    result = _op_def_lib.apply_op("Mul", x=x, y=y, name=name)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
    op_def=op_def)
  File "/Users/nakul/.virtualenvs/kur/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2395, in create_op
    original_op=self._default_original_op, op_def=op_def)

InvalidArgumentError (see above for traceback): Incompatible shapes: [2,65,1] vs. [2,100,65]
	 [[Node: gradients/mul_grad/BroadcastGradientArgs = BroadcastGradientArgs[T=DT_INT32, _class=["loc:@mul"], _device="/job:localhost/replica:0/task:0/cpu:0"](gradients/mul_grad/Shape, gradients/mul_grad/Shape_1)]]

I'm unsure how to fix these errors. The kurfile above is not the final model. Ideally, I would like to implement the LSTM model shown below.

image

Thanks
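One way to read the `Incompatible shapes: [2,65,1] vs. [2,100,65]` error above: the traceback shows `categorical_crossentropy` multiplying the target by the log of the model output elementwise, and the model summary reports an output of shape `(None, 100, 65)`. The sketch below applies the usual broadcasting rule to the two reported shapes in plain Python. The helper name `can_broadcast` is made up for illustration and is not part of Kur, Keras, or TensorFlow, and treating the first shape as the target is an assumption.

```python
from itertools import zip_longest

def can_broadcast(shape_a, shape_b):
    """Return True if two shapes are compatible under NumPy-style broadcasting."""
    # Compare dimensions from the trailing axis backwards; a missing axis counts as 1.
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a != b and a != 1 and b != 1:
            return False
    return True

target_shape = (2, 65, 1)     # first shape in the error message (assumed to be the target)
output_shape = (2, 100, 65)   # second shape; matches the (None, 100, 65) output layer

reported_ok = can_broadcast(target_shape, output_shape)
one_hot_ok = can_broadcast((2, 100, 65), output_shape)
print(reported_ok, one_hot_ok)
```

If that reading is right, a target holding one 65-way one-hot vector per timestep, that is, shape `(batch, 100, 65)`, would line up with the output layer.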

post processing code for tutorial from scratch has some error

In the post-processing step of the from-scratch tutorial, the entries in output.pkl are not both numpy.array, so the code below does not work.

I ran the code myself; here is where it goes wrong:
screen shot 2017-02-26 at 9 51 59 pm

I managed to get some of the code working, but not the rest:

import numpy as np

# Original line from the tutorial, which does not work here:
# diff = numpy.abs(data['truth']['above'] - data['result']['above']) < 1
# Modified version: compare element by element, using the first entry of each result.
# With this change the accuracy is 98%, not 99.7%.
diff = np.array([np.abs(a - b[0]) for a, b in zip(data['truth']['above'], data['result']['above'])]) < 1
correct = diff.sum()
total = len(diff)
correct / total * 100
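A possibly cleaner variant of the workaround above, assuming (as the `b[0]` indexing suggests) that each truth entry is a scalar and each result entry is a length-1 sequence. The data here is synthetic and only stands in for the contents of `output.pkl`.

```python
import numpy as np

# Synthetic stand-in for the unpickled data; the real values come from output.pkl.
data = {
    'truth':  {'above': [1.0, 0.0, 1.0, 0.0]},
    'result': {'above': [[0.9], [0.2], [-0.4], [0.1]]},
}

# Flatten both sides to 1-D float arrays so the subtraction is elementwise.
truth = np.asarray(data['truth']['above'], dtype=float).reshape(-1)
result = np.asarray(data['result']['above'], dtype=float).reshape(-1)

diff = np.abs(truth - result) < 1
accuracy = diff.mean() * 100
print(accuracy)
```

Flattening with `reshape(-1)` avoids an accidental `(N,)` against `(N, 1)` subtraction, which NumPy would silently broadcast to an `(N, N)` matrix.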

kur deepgram10 error

I was following the instructions and encountered this error. I am using Ubuntu 16.04 with CPU only.

(kur) saurabh@saurabh-Inspiron-5559:~/kur/examples$  kur train speech.yml
[WARNING 2017-03-20 12:51:58,410 py.warnings:86] /home/saurabh/.virtualenvs/kur/lib/python3.5/site-packages/keras/legacy/interfaces.py:86: UserWarning: Update your `Conv1D` call to the Keras 2 API: `Conv1D(padding="valid", strides=2, activation="linear", filters=1000, kernel_size=11, name="..convolution.0")`
  '` call to the Keras 2 API: ' + signature)

Traceback (most recent call last):
  File "/home/saurabh/.virtualenvs/kur/lib/python3.5/site-packages/tensorflow/python/framework/tensor_shape.py", line 557, in merge_with
    self.assert_same_rank(other)
  File "/home/saurabh/.virtualenvs/kur/lib/python3.5/site-packages/tensorflow/python/framework/tensor_shape.py", line 603, in assert_same_rank
    "Shapes %s and %s must have the same rank" % (self, other))
ValueError: Shapes (11, 161, 1000) and (?, ?, ?, ?) must have the same rank

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/saurabh/.virtualenvs/kur/lib/python3.5/site-packages/tensorflow/python/framework/tensor_shape.py", line 633, in with_rank
    return self.merge_with(unknown_shape(ndims=rank))
  File "/home/saurabh/.virtualenvs/kur/lib/python3.5/site-packages/tensorflow/python/framework/tensor_shape.py", line 564, in merge_with
    (self, other))
ValueError: Shapes (11, 161, 1000) and (?, ?, ?, ?) are not compatible

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/saurabh/.virtualenvs/kur/bin/kur", line 11, in <module>
    sys.exit(main())
  File "/home/saurabh/.virtualenvs/kur/lib/python3.5/site-packages/kur/__main__.py", line 269, in main
    sys.exit(args.func(args) or 0)
  File "/home/saurabh/.virtualenvs/kur/lib/python3.5/site-packages/kur/__main__.py", line 48, in train
    func = spec.get_training_function()
  File "/home/saurabh/.virtualenvs/kur/lib/python3.5/site-packages/kur/kurfile.py", line 282, in get_training_function
    model = self.get_model(provider)
  File "/home/saurabh/.virtualenvs/kur/lib/python3.5/site-packages/kur/kurfile.py", line 148, in get_model
    self.model.build()
  File "/home/saurabh/.virtualenvs/kur/lib/python3.5/site-packages/kur/model/model.py", line 282, in build
    self.build_graph(input_nodes, output_nodes, network)
  File "/home/saurabh/.virtualenvs/kur/lib/python3.5/site-packages/kur/model/model.py", line 339, in build_graph
    target=layer
  File "/home/saurabh/.virtualenvs/kur/lib/python3.5/site-packages/kur/backend/keras_backend.py", line 238, in connect
    return target(inputs)
  File "/home/saurabh/.virtualenvs/kur/lib/python3.5/site-packages/keras/engine/topology.py", line 554, in __call__
    output = self.call(inputs, **kwargs)
  File "/home/saurabh/.virtualenvs/kur/lib/python3.5/site-packages/keras/layers/convolutional.py", line 156, in call
    dilation_rate=self.dilation_rate[0])
  File "/home/saurabh/.virtualenvs/kur/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 2824, in conv1d
    data_format=tf_data_format)
  File "/home/saurabh/.virtualenvs/kur/lib/python3.5/site-packages/tensorflow/python/ops/nn_ops.py", line 639, in convolution
    op=op)
  File "/home/saurabh/.virtualenvs/kur/lib/python3.5/site-packages/tensorflow/python/ops/nn_ops.py", line 308, in with_space_to_batch
    return op(input, num_spatial_dims, padding)
  File "/home/saurabh/.virtualenvs/kur/lib/python3.5/site-packages/tensorflow/python/ops/nn_ops.py", line 631, in op
    name=name)
  File "/home/saurabh/.virtualenvs/kur/lib/python3.5/site-packages/tensorflow/python/ops/nn_ops.py", line 87, in _non_atrous_convolution
    filter_shape = filter.get_shape().with_rank(input.get_shape().ndims)
  File "/home/saurabh/.virtualenvs/kur/lib/python3.5/site-packages/tensorflow/python/framework/tensor_shape.py", line 635, in with_rank
    raise ValueError("Shape %s must have rank %d" % (self, rank))
ValueError: Shape (11, 161, 1000) must have rank 4

How to add `negative_slope` to `leaky_relu` activation of pytorch backend?

I have managed to add LeakyReLU for both the Keras and PyTorch backends (see code below). I also want to add an argument to this activation: alpha in Keras, or negative_slope in PyTorch (equivalent, I guess).

I could add alpha to the Keras LeakyReLU, but failed to add it for PyTorch. If the argument is not exposed, how can users set a value for alpha or negative_slope when they need to? That is why I want to make the argument available in Kur. Or is it that in most cases there is no need to change alpha or negative_slope, so there is no point in adding them?

Below is the source code I changed to support LeakyReLU and alpha, but not yet negative_slope for PyTorch. Could you check it for me and shed some light on how to add negative_slope for PyTorch? Thanks.

	def _parse(self, engine):
		""" Parse the layer.
		"""

		if not isinstance(self.args, dict):
			self.type = self.args
		else:
			self.type = self.args['name']
		if self.type == 'leakyrelu':
			# If alpha is missing or None, fall back to the default value of 0.3.
			if isinstance(self.args, dict) and self.args.get('alpha') is not None:
				self.alpha = self.args['alpha']
			else:
				self.alpha = 0.3


	###########################################################################
	def _build(self, model):
		""" Create the backend-specific placeholder.
		"""
		backend = model.get_backend()
		if backend.get_name() == 'keras':

			import keras.layers as L			# pylint: disable=import-error

			if self.type != "leakyrelu":
				yield L.Activation(
					'linear' if self.type == 'none' or self.type is None \
						else self.type,
					name=self.name,
					trainable=not self.frozen
				)
			# if advanced activation in keras, like LeakyReLU
			else:
				yield L.LeakyReLU(alpha=self.alpha)

		elif backend.get_name() == 'pytorch':

			import torch.nn.functional as F		# pylint: disable=import-error
			func = {
				'relu' : F.relu,
				'tanh' : F.tanh,
				'sigmoid' : F.sigmoid,
				'softmax' : F.log_softmax,
				'leakyrelu' : F.leaky_relu
			}.get(self.type.lower())
			if func is None:
				raise ValueError('Unsupported activation function "{}" for '
					'backend "{}".'.format(self.type, backend.get_name()))

			def connect(inputs):
				""" Connects the layer.
				"""
				assert len(inputs) == 1
				return {
					'shape' : self.shape([inputs[0]['shape']]),
					'layer' : model.data.add_operation(func)(
						inputs[0]['layer']
					)
				}

			yield connect

		else:
			raise ValueError(
				'Unknown or unsupported backend: {}'.format(backend))
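One possible approach for the PyTorch branch, sketched under assumptions: `torch.nn.functional.leaky_relu` accepts a `negative_slope` keyword, so the slope could be bound with `functools.partial` before the function is handed to `model.data.add_operation`. Whether `add_operation` accepts an arbitrary callable is an assumption based on the code above. To keep the example runnable without PyTorch, `leaky_relu` below is a scalar stand-in with the same keyword, and `make_activation` is a hypothetical helper name.

```python
import functools

def leaky_relu(x, negative_slope=0.01):
    """Scalar stand-in that mirrors the keyword of torch.nn.functional.leaky_relu."""
    return x if x >= 0 else negative_slope * x

def make_activation(name, alpha=None):
    """Look up an activation by name and bind the slope if one was given."""
    table = {'leakyrelu': leaky_relu}
    func = table.get(name.lower())
    if func is None:
        raise ValueError('Unsupported activation function "{}"'.format(name))
    if name.lower() == 'leakyrelu' and alpha is not None:
        # Bind the slope now; the result still takes a single input argument.
        func = functools.partial(func, negative_slope=alpha)
    return func

activation = make_activation('leakyrelu', alpha=0.3)
negative_out = activation(-2.0)
positive_out = activation(1.5)
print(negative_out, positive_out)
```

In the code above, the equivalent step would be wrapping `func` after the dictionary lookup, using `self.alpha` as the bound value, so that `connect` can keep calling it with a single input.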

Issues install kur on Ubuntu instance

Created the virtual env
I cloned the repo
Then did "sudo pip3 install -e ."
cd into examples
When I run "kur -v train speech.yml" I get this error.

Traceback (most recent call last):
File "/home/tracy/.virtualenvs/kur/lib/python3.4/site-packages/pkg_resources.py", line 2716, in
working_set.require(requires)
File "/home/tracy/.virtualenvs/kur/lib/python3.4/site-packages/pkg_resources.py", line 685, in require
needed = self.resolve(parse_requirements(requirements))
File "/home/tracy/.virtualenvs/kur/lib/python3.4/site-packages/pkg_resources.py", line 592, in resolve
raise VersionConflict(dist,req) # XXX put more info here
pkg_resources.VersionConflict: (kur 0.5.2 (/home/ubuntu/data/tracy/kur), Requirement.parse('kur==0.3.0'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/tracy/.virtualenvs/kur/bin/kur", line 5, in
from pkg_resources import load_entry_point
File "/home/tracy/.virtualenvs/kur/lib/python3.4/site-packages/pkg_resources.py", line 2720, in
parse_requirements(requires), Environment()
File "/home/tracy/.virtualenvs/kur/lib/python3.4/site-packages/pkg_resources.py", line 588, in resolve
raise DistributionNotFound(req)
pkg_resources.DistributionNotFound: kur==0.3.0

When I try to run "kur --version" I get the same thing.

I had it running before, so I don't know what happened.
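The `VersionConflict` above indicates that the `kur` launcher script was generated for `kur==0.3.0`, while the distribution found on the path is `kur 0.5.2`. A small check like the following, run with the virtualenv's own Python, shows which interpreter is active and which Kur version it sees. It relies on `importlib.metadata`, which needs Python 3.8 or newer (the traceback above is from Python 3.4, so this is only a sketch for a newer setup). `installed_version` is a hypothetical helper name.

```python
import sys
from importlib import metadata  # available from Python 3.8

def installed_version(package):
    """Return the version of `package` visible to this interpreter, or None."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

# The interpreter path shows whether the virtualenv or the system Python is in use.
print(sys.executable)
print(installed_version('kur'))
```

If the two lines point at different environments than expected, that may be related to the install having been run with `sudo`, which typically uses the system Python rather than the virtualenv.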

Changing speech example to use LSTM instead of GRU predicts empty strings.

I'm experimenting with different configurations of the speech.yml example and can't seem to change the rnn type from gru to lstm. When I do, I get the following output at each validation step.

[INFO 2017-01-26 18:16:20,580 kur.model.executor:172] Validation loss: 834.754
Prediction: " "
Truth: "the great state of virginia mother of presidents went out of the union at last and north carolina tennessee and arkansas followed her but maryland kentucky and missouri still hung in the balance"

The prediction is always " " regardless of the truth value.

Any ideas?

Validation loss rockets up for speech example

Hi guys!
I'm trying to train the speech example. According to your blog post it should learn in about 2 days (by the way, on what hardware?). I'm training on an Amazon p2.xlarge. After 20 hours the training loss dropped to 6-7, yet the validation loss just keeps growing (it is already above 770!). I find it rather weird.
Here is the history of validation losses:
array([ 434.28509521, 317.38632202, 309.28720093, 312.42532349,
313.19784546, 324.18057251, 343.06167603, 349.9078064 ,
381.07891846, 386.22546387, 405.9574585 , 417.4850769 ,
424.54220581, 456.63919067, 461.6697998 , 476.76068115,
485.60809326, 485.19570923, 503.05889893, 506.81793213,
518.48876953, 514.61688232, 554.96228027, 564.0111084 ,
574.05865479, 566.97747803, 589.06658936, 585.39562988,
600.09417725, 590.19805908, 617.77484131, 607.53161621,
637.23944092, 628.73828125, 623.25372314, 640.79986572,
637.19677734, 669.72247314, 630.89300537, 682.78503418,
677.91912842, 682.71673584, 674.5670166 , 687.17169189,
654.59844971, 705.17877197, 731.55621338, 704.7802124 ,
687.76977539, 732.31542969, 747.31225586, 691.43365479,
724.47790527, 739.60638428, 731.09661865, 753.55401611,
751.44567871, 751.18878174, 777.97332764, 715.65222168,
768.37219238, 711.67407227, 757.25933838, 774.5269165 ,
786.63122559], dtype=float32)

This was trained on current master. I'll try training with the pip version of kur; maybe there will be a difference.
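The history above bottoms out early and rises from there on. A minimal sketch of locating that minimum, using only the first eight values of the array (rounded to two decimals); the lowest value in the full array is also its third entry. The variable names are illustrative only.

```python
# First eight validation losses from the history above, rounded to two decimals.
val_losses = [434.29, 317.39, 309.29, 312.43, 313.20, 324.18, 343.06, 349.91]

# Index of the lowest validation loss.
best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
best_loss = val_losses[best_index]

# Number of validation runs since the best one (a common early-stopping signal).
runs_since_best = len(val_losses) - 1 - best_index
print(best_index, best_loss, runs_since_best)
```

A validation loss that keeps climbing after such an early minimum, while training loss falls, is the usual pattern checked for when deciding whether to stop or keep a checkpoint.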

speech validating loss abnormal?

Epoch 70/inf, loss=6.643: 100%|████████| 2432/2432 [17:24<00:00, 2.29samples/s]
Validating, loss=806.257: 94%|█████████▍| 256/271 [00:40<00:02, 6.63samples/s]
Prediction: "lok h n v eon owlt im on nonaih"
Truth: "he was in a mood for music was he not"

The training loss is decreasing while the validation loss is increasing. Is that not OK?

Error installing kur via pip (keras>=1.1.2 not found)

Hi, I'm getting the following error when attempting to install kur via pip.

Could not find a version that satisfies the requirement keras>=1.1.2 (from kur) (from versions: 0.1.3)
No matching distribution found for keras>=1.1.2 (from kur)

Here's my setup:

python --version
Python 3.5.2

pip --version
pip 9.0.1 from /Users/ben/virtualenvs/kur/lib/python3.5/site-packages (python 3.5)

Seems like a cool framework and would really like to try it out.

Illegal hardware instruction error

I am getting [1] 4589 illegal hardware instruction kur train mnist.yml when I run kur train mnist.yml. This happens after data download. I am running this on a virtual environment (Python 3.6.0 (brew installed), Mac OS 10.12.2), and here is my pip freeze:

Jinja2==2.9.4
Keras==1.2.0
kur==0.2.0
MarkupSafe==0.23
numpy==1.12.0
pydub==0.16.7
python-magic==0.4.12
python-speech-features==0.4
PyYAML==3.12
scipy==0.18.1
six==1.10.0
Theano==0.8.2
tqdm==4.11.0

I tried installing the most recent development version, to no avail. Any idea why this may be happening?

Speech Recognition Seems To Overfit

Hi, I don't know if this is an issue with the framework, but I did not know where else to ask. I have been training the speech recognition example (speech.yml) for about 80 epochs on a Titan X within a tensorflow-gpu-py3 based docker image. The training loss has gone way down, but my validation loss is still very high and the sample predictions it prints are gibberish.

Example output:

Epoch 80/inf, loss=5.845: 100%|##########| 2432/2432 [07:04<00:00,  7.80samples/s]
Validating, loss=766.122:  94%|#########4| 256/271 [00:21<00:01, 13.70samples/s]
Prediction: "thg asi bw a hnta e tb incotnetk rndegibnrtrlan ty bmna ekftett trelaob"
Truth: "and what inquired missus macpherson has mary ann given you her love"
  1. Is this sort of behavior expected this early in training?
  2. How long would you expect to have to train this model to start getting reasonable results?

bug in version 0.6.0 when run `kur data mnist.yml`, and a fix proposed

Error message:

(dlnd-tf-lab)  ->kur data mnist.yml
Traceback (most recent call last):
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/bin/kur", line 11, in <module>
    load_entry_point('kur', 'console_scripts', 'kur')()
  File "/Users/Natsume/Downloads/kur_road/kur/kur/__main__.py", line 396, in main
    sys.exit(args.func(args) or 0)
  File "/Users/Natsume/Downloads/kur_road/kur/kur/__main__.py", line 185, in prepare_data
    keys = sorted(batch.keys())
AttributeError: 'str' object has no attribute 'keys'

I fixed it in the following way; please see whether this is acceptable:

Replace the original lines 167-168 with the following lines:

	for k in provider:
		provider = provider[k]
		for batch in provider:
			break
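
The AttributeError above is consistent with a general Python behavior: iterating over a dict yields its keys, which are strings here. The sketch below illustrates that, and why indexing by key first (as the proposed fix does) yields a batch that has `.keys()`. The `provider` layout is a hypothetical stand-in inferred from the traceback, not Kur's actual data structure.

```python
# Hypothetical stand-in for the object used in prepare_data: a dict mapping a
# section name to an iterable of batches (an assumption, not Kur's real type).
provider = {
    "train": [{"images": [0.0, 1.0], "labels": [1, 0]}],
}

# Iterating over a dict yields its keys, so `batch` is the string "train".
for batch in provider:
    break
first_item = batch

# Indexing by key first, as in the proposed fix, yields a real batch dict.
for key in provider:
    for batch in provider[key]:
        break
    break
batch_keys = sorted(batch.keys())
```

Here `first_item` is a string with no `keys` attribute, while `batch_keys` is the sorted list of data names in the first batch.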

error when using uppercase transcripts

This is perhaps less of an issue than a "heads up" to others.

I have transcripts with all uppercase letters, but this seems to have caused the following error:

InvalidArgumentError (see above for traceback): Labels length is zero in batch 0
         [[Node: CTCLoss = CTCLoss[ctc_merge_repeated=true, preprocess_collapse_repeated=false, _device="/job:localhost/replica:0/task:0/cpu:0"](Log/_213, ToInt64/_215, GatherNd, Squeeze_2/_217)]]

So it appears that unless I'm missing a configuration setting somewhere, all transcripts must be lowercase.
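
One possible workaround is to normalize the transcripts before training. The sketch below lowercases each transcript and drops characters outside an assumed vocabulary. The function name `normalize_transcript` and the `ALLOWED` character set are illustrative assumptions; the real vocabulary depends on the Kurfile and dataset in use.

```python
import string

# Assumed vocabulary: lowercase letters, space and apostrophe.
ALLOWED = set(string.ascii_lowercase + " '")

def normalize_transcript(text):
    """Lowercase a transcript and drop characters outside ALLOWED."""
    lowered = text.lower()
    kept = "".join(ch for ch in lowered if ch in ALLOWED)
    # Collapse runs of whitespace left behind by removed characters.
    return " ".join(kept.split())
```

If uppercase characters are missing from the vocabulary, every character of an all-uppercase transcript would be filtered out, which would be consistent with the "Labels length is zero" message above.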

Error after updating Keras to 2.0.2

[WARNING 2017-03-23 20:43:09,774 py.warnings:87] /usr/local/lib/python3.5/dist-packages/kur/containers/layers/convolution.py:165: UserWarning: Update your Conv1D call to the Keras 2 API: Conv1D(padding="valid", name="..convolution.0", kernel_size=11, strides=2, filters=1000, activation="linear")
yield func(**kwargs)

Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/tensor_shape.py", line 557, in merge_with
self.assert_same_rank(other)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/tensor_shape.py", line 603, in assert_same_rank
"Shapes %s and %s must have the same rank" % (self, other))
ValueError: Shapes (11, 161, 1000) and (?, ?, ?, ?) must have the same rank

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/tensor_shape.py", line 633, in with_rank
return self.merge_with(unknown_shape(ndims=rank))
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/tensor_shape.py", line 564, in merge_with
(self, other))
ValueError: Shapes (11, 161, 1000) and (?, ?, ?, ?) are not compatible

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/bin/kur", line 11, in
sys.exit(main())
File "/usr/local/lib/python3.5/dist-packages/kur/main.py", line 269, in main
sys.exit(args.func(args) or 0)
File "/usr/local/lib/python3.5/dist-packages/kur/main.py", line 48, in train
func = spec.get_training_function()
File "/usr/local/lib/python3.5/dist-packages/kur/kurfile.py", line 282, in get_training_function
model = self.get_model(provider)
File "/usr/local/lib/python3.5/dist-packages/kur/kurfile.py", line 148, in get_model
self.model.build()
File "/usr/local/lib/python3.5/dist-packages/kur/model/model.py", line 282, in build
self.build_graph(input_nodes, output_nodes, network)
File "/usr/local/lib/python3.5/dist-packages/kur/model/model.py", line 339, in build_graph
target=layer
File "/usr/local/lib/python3.5/dist-packages/kur/backend/keras_backend.py", line 238, in connect
return target(inputs)
File "/usr/local/lib/python3.5/dist-packages/keras/engine/topology.py", line 554, in call
output = self.call(inputs, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/keras/layers/convolutional.py", line 156, in call
dilation_rate=self.dilation_rate[0])
File "/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py", line 2822, in conv1d
data_format=tf_data_format)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/nn_ops.py", line 639, in convolution
op=op)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/nn_ops.py", line 308, in with_space_to_batch
return op(input, num_spatial_dims, padding)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/nn_ops.py", line 631, in op
name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/nn_ops.py", line 87, in _non_atrous_convolution
filter_shape = filter.get_shape().with_rank(input.get_shape().ndims)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/tensor_shape.py", line 635, in with_rank
raise ValueError("Shape %s must have rank %d" % (self, rank))
ValueError: Shape (11, 161, 1000) must have rank 4

Readme links go to absolute paths

Currently, https://kur.deepgram.com is not responding. According to http://www.isitdownrightnow.com/kur.deepgram.com.html, the site has been down for a few hours. This breaks the documentation because when I'm reading https://github.com/deepgram/kur/Readme.rst the link to https://kur.deepgram.com/troubleshooting is dead.

My expectation is that the links should be relative, so that if you're reading the page on the deepgram website, the links would go to other pages on the deepgram website. But if you're reading the page on GitHub, they should go to other pages on GitHub.

Tutorial crashes with ValueError: Dimension

Trying to follow the Tutorial From Scratch: Data and Model, but it crashes with the following error:

ValueError: Dimension 0 in both shapes must be equal, but are 2 and 1 for 'Assign' (op: 'Assign') with input shapes: [2,128], [1,10].

ValueError: operands could not be broadcast together with shapes (155,81) (161,)

When trying to use my own data for a speech example, I get this issue very early on:

ValueError: operands could not be broadcast together with shapes (155,81) (161,) 

I looked through the log, and I see that the model inferred an input dimension of 161. And so it's clear that when it goes to load a batch of data with a different dimension (in this case 155), it fails.

So I have two questions:

  1. What does that dimension represent (I'm using data.type=spec)?
  2. How does the model "infer" that dimension should be 161?
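
A note on reading the error: NumPy compares shapes from the trailing dimension backwards, so the mismatch in (155,81) versus (161,) is between 81 and 161, not between 155 and 161. The sketch below implements that rule in plain Python and shows one way both numbers could arise as spectrogram frequency bins. The 20 ms window and the 16 kHz and 8 kHz sample rates are assumptions chosen because they reproduce 161 and 81; they are not confirmed from Kur's source.

```python
def broadcastable(shape_a, shape_b):
    """Return True if two shapes are compatible under NumPy's broadcasting rule."""
    # Shapes are aligned at the trailing dimension; each pair of sizes must be
    # equal, or one of them must be 1.
    for a, b in zip(reversed(shape_a), reversed(shape_b)):
        if a != b and a != 1 and b != 1:
            return False
    return True

def rfft_bins(sample_rate_hz, window_ms):
    """Number of frequency bins from a real FFT over one analysis window."""
    window_samples = sample_rate_hz * window_ms // 1000
    return window_samples // 2 + 1

bins_16k = rfft_bins(16000, 20)  # 320-sample window
bins_8k = rfft_bins(8000, 20)    # 160-sample window
```

Under these assumptions, a per-bin vector of length `bins_16k` cannot be combined with features that have `bins_8k` columns, which is the kind of failure the error reports. If so, checking the sample rate of the audio files would be a reasonable first step.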

Including keras Conv2DTranspose

Hi,

Is it difficult to add new layers to the yml specification? I have had a brief look at the code for the dense layer and will have a go at implementing one myself, but any hints or pointers would be very helpful! Specifically, I would really like to have access to transposed convolutions (https://keras.io/layers/convolutional/#conv2dtranspose). Perhaps they are already available?

Thanks

Josh

How to convert gan_mnist example from tensorflow to kur?

I want to convert a gan_mnist example from TensorFlow to Kur.

At the moment, all I know about a model in Kur is the following.
The model's life cycle:

  1. model as dict in kurfile -->
  2. model as list of kur containers or layers -->
  3. model as layers of keras -->
  4. training with a function defined in keras (which I cannot inspect from kur, but would love to be able to) -->
  5. the training process (forward pass to get the loss, backward pass to update the weights) is sealed inside keras like a black box (I can't see it in the kur source code), so I can only access the loss and weights at the end of each training batch, but nothing from an individual layer.

It seems to me that I can't access each layer's output directly, such as the logits of each model below. I hope I am wrong. Is there a way to access each layer's output directly with kur? Or can I write some additional functions in kur to access the outputs of each layer of the models?

Another difficulty I have is writing the models in a kurfile. Does the kurfile below make sense, and is it valid in logic and style? I prefer a kurfile over using the kur API directly, but I often don't know what to put in it. At the moment, I am confused about when to use - and when not to; I have marked the places where I am particularly confused with ????.

There are two sections below: 1. parts of the kurfile; 2. the corresponding parts in tensorflow

Section 1: some key sections of the gan_mnist pseudo-kurfile

How would you write this GAN kurfile? I would like to see what it would look like (it need not be working code; I just want to see the proper pseudo-kurfile you might write).

model:  # see model code in tensorflow below
  generator:
    - input: input_z # shape (?, 100)
    - dense: 128 # g_hidden_size = 128
    - activation:
        name: leakyrelu
        alpha: 0.01
    - dense: 784 # out_dim (generator) = input_size (real image) = 784
    - activation:
        name: tanh
    - output: # output of the latest layer
        name: g_out # shape (?, 784)

  discriminator_real:
    - input: input_real # or images # shape (?, 784)
    - dense: 128 # d_hidden_size
    - activation:
        name: leakyrelu
        alpha: 0.01
    - dense: 1 # shrink nodes from 128 to 1, for 2_labels classification with sigmoid (non softmax)
        logits: d_logits_real # can I output logits here
# do I need to output logits
    - activation:
        name: sigmoid
    - output: # output of the latest layer
# can logits in the layer before the latest layer be accessed from here?
        name: d_out_real # not used at all ?

  discriminator_fake:
    - input: g_out # shape (?, 784)
    - dense: 128 # d_hidden_size
    - activation:
        name: leakyrelu
        alpha: 0.01
    - dense: 1 # shrink nodes from 128 to 1, for 2_labels classification with sigmoid (non softmax)
        logits: d_logits_fake # can I output logits here
# do I need to output logits
    - activation:
        name: sigmoid
    - output: # output of the latest layer
# can logits in the layer before the latest layer be accessed from here?
        name: d_out_fake # not used at all?

# https://kur.deepgram.com/specification.html?highlight=loss#loss
loss:  # see loss code in tensorflow below
  generator:
    - target: labels_g   # labels=tf.ones_like(d_logits_fake), it can be defined as one input data 
    - logits: d_logits_fake # when to use `-`, when not????
      name: categorical_crossentropy
      g_loss: g_loss
  discriminator_real:
    - target: labels_d_real # labels=tf.ones_like(d_logits_real) * (1 - smooth)
    - logits: d_logits_real
      name: categorical_crossentropy
      d_loss_real: d_loss_real
  discriminator_fake:
    - target: labels_d_fake # labels=tf.zeros_like(d_logits_fake)
    - logits: d_logits_fake
      name: categorical_crossentropy
      d_loss_fake: d_loss_fake

train:
  optimizer: # see the optimizers tensorflow code below
    - opt_discriminator:
        name: adam
        learning_rate: 0.001
        d_loss: d_loss #  d_loss = d_loss_real + d_loss_fake
        d_trainable: d_vars
    - opt_generator:
        name: adam
        learning_rate: 0.001
        g_loss: g_loss
        g_trainable: g_vars
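
On the question of when to use `-`: in YAML, each `-` starts a new item of a list, and an indented key on the following line joins the same item. A line with no `-` is a key of a mapping. The sketch below shows the Python objects a YAML parser would produce for three small fragments (the YAML is in the comments). It reuses key names from the pseudo-kurfile above purely for illustration and says nothing about which keys Kur actually accepts.

```python
# - target: labels_g
#   name: categorical_crossentropy
# One "-" starts one list item; the indented key beneath it joins that item.
one_item = [{"target": "labels_g", "name": "categorical_crossentropy"}]

# - target: labels_g
# - name: categorical_crossentropy
# Two "-" markers give two separate single-key items.
two_items = [{"target": "labels_g"}, {"name": "categorical_crossentropy"}]

# target: labels_g
# name: categorical_crossentropy
# No "-" at all gives a plain mapping, not a list.
mapping = {"target": "labels_g", "name": "categorical_crossentropy"}
```

By this reading, a sequence of layers is naturally a list (one `-` per layer), while the settings of a single layer or loss are keys of one mapping.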

Section 2: the key parts (d_model, g_model, losses, optimizers ...) in tensorflow

Inputs for generator and discriminator

import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.layers)

def model_inputs(real_dim, z_dim):
    # real_dim is 784 (28 x 28 MNIST images, flattened)
    inputs_real = tf.placeholder(tf.float32, (None, real_dim), name='input_real')

    # z_dim is set to 100, but can be almost any number
    inputs_z = tf.placeholder(tf.float32, (None, z_dim), name='input_z')

    return inputs_real, inputs_z

Generator model

def generator(z, out_dim, n_units=128, reuse=False, alpha=0.01):
    with tf.variable_scope('generator', reuse=reuse):
        # Hidden layer
        h1 = tf.layers.dense(z, n_units, activation=None)
        # Leaky ReLU
        h1 = tf.maximum(alpha * h1, h1)

        # Logits and tanh output
        logits = tf.layers.dense(h1, out_dim, activation=None)
        out = tf.tanh(logits)

        return out

Discriminator model

def discriminator(x, n_units=128, reuse=False, alpha=0.01):
    with tf.variable_scope('discriminator', reuse=reuse):
        # Hidden layer
        h1 = tf.layers.dense(x, n_units, activation=None)
        # Leaky ReLU
        h1 = tf.maximum(alpha * h1, h1)

        logits = tf.layers.dense(h1, 1, activation=None)
        out = tf.sigmoid(logits)

        return out, logits
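
The `tf.maximum(alpha * h1, h1)` line in both networks is a leaky ReLU written by hand. A scalar sketch in plain Python, assuming 0 < alpha < 1 as in the hyperparameters of this example:

```python
def leaky_relu(x, alpha=0.01):
    """max(alpha * x, x): identity for x >= 0, slope alpha for x < 0."""
    return max(alpha * x, x)
```

For a positive input the second argument wins and the value passes through unchanged; for a negative input `alpha * x` is the larger (less negative) value, so a small gradient still flows.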

Hyperparameters

# Size of input image to discriminator
input_size = 784
# Size of latent vector to generator
# The latent sample is a random vector the generator uses to construct its fake images. As the generator learns through training, it figures out how to map these random vectors to recognizable images that can fool the discriminator.
z_size = 100 # not 784! so it can be any number?
# Sizes of hidden layers in generator and discriminator
g_hidden_size = 128
d_hidden_size = 128
# Leak factor for leaky ReLU
alpha = 0.01
# Smoothing
smooth = 0.1

Build network

tf.reset_default_graph()

# Create our input placeholders
input_real, input_z = model_inputs(input_size, z_size)

# Build the model
g_out = generator(input_z, input_size)
# g_out is the generator output tensor, not a model object

# discriminate on real images, get output and logits
d_out_real, d_logits_real = discriminator(input_real)
# discriminate on generated images, get output and logits
d_out_fake, d_logits_fake = discriminator(g_out, reuse=True)

Calculate losses

# loss measuring how well the discriminator works on real images
d_loss_real = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(
        logits=d_logits_real,
        # labels are all true (1s), with label smoothing: * (1 - smooth)
        labels=tf.ones_like(d_logits_real) * (1 - smooth)))

# loss measuring how well the discriminator works on generated images
d_loss_fake = tf.reduce_mean(  # mean over all the images in the batch
    tf.nn.sigmoid_cross_entropy_with_logits(
        logits=d_logits_fake,
        # labels are all false (0s)
        labels=tf.zeros_like(d_logits_fake)))

# total discriminator loss
d_loss = d_loss_real + d_loss_fake

# loss measuring how well the generator produces images that pass as real
g_loss = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(
        logits=d_logits_fake,
        # the generator wants its images judged real, so labels are all 1s
        labels=tf.ones_like(d_logits_fake)))
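
To make the loss terms concrete, here is a plain-Python sketch of the same arithmetic on a handful of made-up logits. It uses the numerically stable form max(x, 0) - x * z + log(1 + exp(-|x|)) that the TensorFlow documentation gives for `tf.nn.sigmoid_cross_entropy_with_logits`. The helper names and the logit values are illustrative assumptions, not part of the original example.

```python
import math

def sigmoid_cross_entropy_with_logits(logit, label):
    """Stable form: max(x, 0) - x * z + log(1 + exp(-|x|))."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

def mean_loss(logits, labels):
    """Average the per-example losses, like tf.reduce_mean over a batch."""
    losses = [sigmoid_cross_entropy_with_logits(x, z) for x, z in zip(logits, labels)]
    return sum(losses) / len(losses)

smooth = 0.1
d_logits_real = [2.0, -1.0, 0.5]   # made-up discriminator logits on real images
d_logits_fake = [-2.0, 0.3, -0.7]  # made-up discriminator logits on generated images

# Real images: smoothed "true" labels. Generated images: "false" labels.
d_loss_real = mean_loss(d_logits_real, [1.0 - smooth] * len(d_logits_real))
d_loss_fake = mean_loss(d_logits_fake, [0.0] * len(d_logits_fake))
d_loss = d_loss_real + d_loss_fake

# The generator is scored on the same fake logits, but against "true" labels.
g_loss = mean_loss(d_logits_fake, [1.0] * len(d_logits_fake))
```

Note that `d_loss_fake` and `g_loss` are computed from the same logits with opposite labels, which is what sets the two networks against each other.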

Optimizers

# Optimizers
learning_rate = 0.002

# Get the trainable_variables, split into G and D parts
t_vars = tf.trainable_variables()
g_vars = [var for var in t_vars if var.name.startswith('generator')]
d_vars = [var for var in t_vars if var.name.startswith('discriminator')]

# update only the discriminator weights (d_vars)
d_train_opt = tf.train.AdamOptimizer(learning_rate).minimize(d_loss, var_list=d_vars)

# update only the generator weights (g_vars)
g_train_opt = tf.train.AdamOptimizer(learning_rate).minimize(g_loss, var_list=g_vars)
