neurowriter issue discussions from albarji

SubwordTokenizer produces tokens with fewer repetitions than the allowed minimum

Example: with the Quijote corpus, symbols such as the following are generated (each line shows the repetition count followed by the symbol):

  9   "zcan",
  9   "Ver",
  9   "onada",
  9   "nis",
  9   "menti",
  9   "memor",
  9   "llen",
  9   "guas",
  9   "enga",
  9   "encias",
  9   "cue",
  9   "áronle",
  9   "amen",
  9   "all",
  8   "yos",
  8   "tencia",
  8   "suje",
  8   "senti",
  8   "ólo",
  8   "neces",
  8   "nadas",
  8   "mun",
  8   "mez",
  8   "laron",
  8   "idor",
  8   "idades",
  8   "empera",
  8   "discre",
  8   "bastan",
  8   "alu",
  7   "vence",
  7   "Ú",
  7   "tentar",
  7   "rosos",

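This is probably a side effect of merging symbols together: when occurrences of a symbol are absorbed into a longer merged symbol, the count of the shorter symbol can drop below the allowed minimum. In the prune phase, low-frequency symbols should be deleted by breaking them down into individual characters.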
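A minimal sketch of such a prune step, assuming the corpus is already available as a flat list of symbols. The function name `prune_rare_symbols` and the `min_count` parameter are hypothetical and are not taken from the SubwordTokenizer code.

```python
from collections import Counter


def prune_rare_symbols(tokens, min_count):
    """Split every multi-character symbol seen fewer than min_count times.

    Hypothetical helper: rare symbols are replaced by their individual
    characters, while single characters are always kept as they are.
    """
    counts = Counter(tokens)
    pruned = []
    for token in tokens:
        if len(token) > 1 and counts[token] < min_count:
            pruned.extend(token)  # break the rare symbol into characters
        else:
            pruned.append(token)
    return pruned


corpus = ["que", "que", "que", "zcan", "que"]
pruned_corpus = prune_rare_symbols(corpus, min_count=2)
```

A single pass is enough here because splitting a rare symbol only adds single characters, and those are never pruned.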

Conv+LSTM model

Instead of using a stacked LSTM model, which is powerful but slow, a viable alternative might be a Conv+LSTM model. A working example of this architecture can be seen in the encoder module of the Tacotron 2 network (https://research.googleblog.com/2017/12/tacotron-2-generating-human-like-speech.html). Main points of this architecture:

Character-level embeddings of size 512
3 layers of Conv, each with 512 filters and shape 5 x 1 (5-chars span) + BatchNormalization + ReLU + Dropout 0.5
Bidirectional LSTM with 256 units in each direction + ZoneOut 0.1

Except for the ZoneOut, all of these can be implemented here.
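To get a feel for the size of this encoder, the arithmetic below estimates how many characters each convolutional output sees and how many weights the main layers hold. It is a rough sketch that assumes stride-1 convolutions and a standard LSTM with one bias vector per gate; the helper names are hypothetical, and BatchNormalization and embedding parameters are left out.

```python
def conv_stack_receptive_field(num_layers, kernel_size):
    """Input positions seen by one output of stacked stride-1 convolutions."""
    return 1 + num_layers * (kernel_size - 1)


def conv1d_params(kernel_size, in_channels, filters):
    """Weights plus biases of a single 1D convolution layer."""
    return kernel_size * in_channels * filters + filters


def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights and a bias."""
    return 4 * (units * (input_dim + units) + units)


# 3 conv layers with 5-character filters
receptive_field = conv_stack_receptive_field(num_layers=3, kernel_size=5)
# 512 filters per layer, fed by 512-dimensional embeddings
conv_total = 3 * conv1d_params(kernel_size=5, in_channels=512, filters=512)
# 256 units in each of the two directions
bilstm_total = 2 * lstm_params(input_dim=512, units=256)

print(receptive_field, conv_total, bilstm_total)
```

Under these assumptions each position entering the LSTM already summarizes a 13-character window.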

Cudnn layers don't work on CPU architectures

The CuDNN version of the LSTM layers does not work when running on CPU. The software should detect whether it is running on CPU only and, in that case, replace those layers with standard LSTM layers.
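One possible shape for this check is sketched below. It assumes a TensorFlow backend that provides `tf.config.list_physical_devices`, and it falls back to the CPU-safe choice whenever detection is not possible. The helper names `gpu_available` and `lstm_layer_name` are hypothetical, and the returned strings are only the names of the Keras layer classes to instantiate.

```python
def gpu_available():
    """Best-effort GPU check; returns False if detection is not possible."""
    try:
        import tensorflow as tf
    except ImportError:
        return False
    try:
        return len(tf.config.list_physical_devices("GPU")) > 0
    except AttributeError:
        # Older TensorFlow releases lack this call: take the CPU-safe path.
        return False


def lstm_layer_name(use_gpu):
    """Pick the LSTM layer class name that works on the current hardware."""
    return "CuDNNLSTM" if use_gpu else "LSTM"


if __name__ == "__main__":
    print(lstm_layer_name(gpu_available()))
```

Keeping the selection separate from the detection makes it easy to force the standard LSTM layers in tests or on machines where the GPU should not be used.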

Attention Model

The attention model proposed in the project https://github.com/minimaxir/textgenrnn seems to work well.

Code Climate failing

Invalid configuration.
Errors:

syntax error: (): did not find expected key while parsing a block mapping at line 1 column 1

Review on practical tips about text generation

This paper might contain useful insights for improving this application: https://arxiv.org/abs/1711.09534

SGD+Nesterov optimizer

Recent results show that while adaptive optimization methods do reach better minima of the training loss function, they are more prone to overfitting, especially in networks with more parameters than training data, which might well be the case here.

To avoid this, it would be useful to add SGD and SGD+Nesterov as optimizers in the hypertuning procedure.
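For reference, the Nesterov update differs from plain momentum SGD only in where the gradient is evaluated. The sketch below uses the textbook formulation on a one-dimensional quadratic; the function name and the hyperparameter values are illustrative assumptions, and deep learning frameworks may implement an equivalent rearrangement of this rule.

```python
def sgd_nesterov(grad, w0, lr=0.1, momentum=0.9, steps=200):
    """Minimize a 1D function with SGD plus Nesterov momentum."""
    w, velocity = w0, 0.0
    for _ in range(steps):
        # Evaluate the gradient at the look-ahead point, not at w itself.
        lookahead = w + momentum * velocity
        velocity = momentum * velocity - lr * grad(lookahead)
        w = w + velocity
    return w


def quadratic_grad(w):
    """Gradient of 0.5 * (w - 3)^2, whose minimum is at w = 3."""
    return w - 3.0


w_final = sgd_nesterov(quadratic_grad, w0=0.0)
print(w_final)
```

Setting `momentum=0.0` reduces the loop to plain SGD, so both candidate optimizers can share the same code path.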

Iterated Dilated Convolutions

These work very well on sequence tagging tasks: https://arxiv.org/pdf/1702.02098.pdf

Better WaveNet parameters

Try using the WaveNet parameters used in https://github.com/buriburisuri/speech-to-text-wavenet

Try native keras multiGPU model

https://github.com/fchollet/keras/blob/3dd3e8331677e68e7dec6ed4a1cbf16b7ef19f7f/keras/utils/training_utils.py#L56-L75

SubWordTokenizer with fixed dictionary of subwords

Some studies have been carried out on subword decompositions apt for poetry.

Spanish: http://www.elcastellano.org/ns/edicion/2014/abril/silabas.html

It would be useful to adapt SubWordTokenizer so that it accepts a list of desired subword tokens.
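One way such a fixed dictionary could be applied is greedy longest-match tokenization with a fallback to single characters, sketched below. This is an assumption about how the feature might work rather than a description of the current SubWordTokenizer; the function name and the sample syllables are hypothetical.

```python
def tokenize_with_vocabulary(text, subwords):
    """Greedy longest-match tokenization against a fixed set of subwords.

    Characters not covered by any subword are emitted one at a time.
    """
    subwords = set(subwords)
    max_len = max((len(s) for s in subwords), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, down to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in subwords:
                tokens.append(piece)
                i += size
                break
    return tokens


syllables = ["can", "tar", "ca", "sa"]
tokens = tokenize_with_vocabulary("cantar casa", syllables)
print(tokens)
```

Greedy matching is simple but not always linguistically correct, so a real syllable list may need a smarter segmentation rule.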

make build-image GPU=1

The GPU flag is not being passed on to the appropriate make commands in the Makefile.

SpatialDropout

It seems that vanilla Dropout is not very effective for convolutional layers. A better approach might be to use SpatialDropout (https://faroit.github.io/keras-docs/1.2.0/layers/core/#spatialdropout2d), or to use no Dropout at all in those layers and keep it only for the LSTM and Dense layers.

Unfortunately this layer is not yet implemented in Keras.
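As a framework-independent illustration of the idea, the sketch below applies spatial dropout to a sequence stored as nested lists: whole channels are dropped for all timesteps at once, instead of dropping individual activations. The function name and the list-based data layout are assumptions made for the example.

```python
import random


def spatial_dropout_1d(sequence, rate, rng):
    """Drop entire channels of a (timesteps x channels) nested list.

    Kept channels are rescaled by 1 / (1 - rate), as in ordinary
    inverted dropout. The rate must be below 1.
    """
    if not sequence:
        return []
    channels = len(sequence[0])
    # One keep/drop decision per channel, shared by all timesteps.
    keep = [rng.random() >= rate for _ in range(channels)]
    scale = 1.0 / (1.0 - rate)
    return [
        [value * scale if keep[c] else 0.0 for c, value in enumerate(step)]
        for step in sequence
    ]


rng = random.Random(0)
features = [[1.0, 2.0, 3.0, 4.0] for _ in range(5)]
dropped = spatial_dropout_1d(features, rate=0.5, rng=rng)
```

Because neighbouring timesteps of a convolutional feature map are strongly correlated, dropping a whole channel removes information that per-element dropout would leave recoverable from adjacent positions.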
