
Comments (15)

fchollet avatar fchollet commented on May 4, 2024 51

In addition, here are a few quick examples of solutions to your problem:

Zero-padding

X = keras.preprocessing.sequence.pad_sequences(sequences, maxlen=100)
model.fit(X, y, batch_size=32, nb_epoch=10)
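For illustration, here is roughly what pad_sequences does (a minimal NumPy sketch of the default pre-padding/pre-truncating behavior; pad_to_length is a hypothetical helper, not Keras API):

```python
import numpy as np

def pad_to_length(sequences, maxlen, value=0):
    """Sketch of zero-padding: left-pad (and left-truncate) each
    sequence to exactly maxlen, as pad_sequences does by default."""
    out = np.full((len(sequences), maxlen), value)
    for i, seq in enumerate(sequences):
        trunc = seq[-maxlen:]  # keep only the most recent maxlen steps
        out[i, maxlen - len(trunc):] = trunc
    return out

X = pad_to_length([[1, 2, 3], [4, 5]], maxlen=4)
# Every row now has length 4, padded on the left with zeros.
```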

Batches of size 1

for seq, label in zip(sequences, y):
   model.train(np.array([seq]), [label])

from keras.

louisabraham avatar louisabraham commented on May 4, 2024 13

I had the same problem. You have basically two solutions (say the dimension of your input is n):

Either you use the parameters:

batch_input_shape=(1, 1, n), stateful=True

Then you train with:

for _ in range(nb_epoch):
    model.fit(X, Y, batch_size=1, nb_epoch=1, shuffle=False)
    model.reset_states()

or with:

for _ in range(nb_epoch):
    for x, y in zip(X, Y):
        model.fit(x, y, batch_size=1, nb_epoch=1, shuffle=False)
    model.reset_states()

and with X of shape (length, 1, n).

I don't know if the two methods are equivalent, though; maybe the second performs more gradient updates.

Or you define the model with

input_shape=(None, n)

(and stateful=False by default)
and you train with:

model.fit(X, Y, nb_epoch=nb_epoch)

and X has shape (1, length, n).
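As a shape illustration for this second approach (a minimal NumPy sketch; n and the sequence lengths are made up), each variable-length sequence is wrapped into its own batch of size 1:

```python
import numpy as np

n = 3  # feature dimension per timestep (illustrative)
# Two sequences of different lengths, each of shape (length, n):
sequences = [np.random.rand(5, n), np.random.rand(8, n)]

# Wrap each sequence as a batch of one: shape (1, length, n).
batches = [seq[np.newaxis, ...] for seq in sequences]
# Each batch can then be passed to the model in its own fit /
# train_on_batch call, since the time dimension declared as None
# may differ between calls.
```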

shamitlal avatar shamitlal commented on May 4, 2024 8

@fchollet In the case of the batch size 1 method, what should be assigned to the input_length parameter in the model? Or should it be set to None in this case?

visionscaper avatar visionscaper commented on May 4, 2024 7

@fchollet, just for my understanding: when you use pad_sequences, the padded zeros are fed through the sequence network (e.g. a recurrent NN), correct?

What I was looking for is a method where this doesn't happen; I only want to input the real sequences, each with its own length, and subsequently use the output for further processing.

It seems to me that padding the sequences will make it harder to learn the task at hand, since the zeros don't provide information but are encoded by the network anyway.

patyork avatar patyork commented on May 4, 2024 4

Sequences should be grouped together by length, and segmented manually into batches by that length before being sent to Keras.

Alternatively (or in addition to the above, to get more sequences of the same length), if it does not break the logic in the cost function, the sequences can be padded with 0s (or the equivalent non-entity).

The reason that lists are not supported is that Theano builds everything as tensors (n-dimensional arrays), so everything must have the same dimensionality (Theano does not assume it should pad with 0s where lengths differ).
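The grouping step described above can be sketched in plain Python (group_by_length is a hypothetical helper, not part of Keras):

```python
from collections import defaultdict

def group_by_length(sequences):
    """Sketch: bucket sequences by their length so each bucket can
    be stacked into one fixed-shape batch before being sent to Keras."""
    buckets = defaultdict(list)
    for seq in sequences:
        buckets[len(seq)].append(seq)
    return dict(buckets)

batches = group_by_length([[1, 2], [3, 4, 5], [6, 7], [8]])
# Sequences of equal length end up in the same bucket/batch.
```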

vsoto avatar vsoto commented on May 4, 2024 3

What's the difference between using a Masking layer for sequences of different lengths and setting the mask_zero field to True?

patyork avatar patyork commented on May 4, 2024 2

@visionscaper to follow up: "padding" your input is necessary, but it should be done in a way that makes sense. For example:

  • with video: pad the input with the equivalent of black frames of video
  • with audio: pad the input with the equivalent of silence

Padding with straight zeros will, as you guessed, more than likely encode unnecessary, if not incorrect, information into the network. Padding with a "neutral" frame of data is the correct approach.
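A minimal sketch of padding with a domain-neutral value instead of a bare zero (the SILENCE constant and pad_with_neutral helper are illustrative; Keras' pad_sequences exposes the same idea through its value argument):

```python
import numpy as np

SILENCE = -1.0  # illustrative "neutral" value, e.g. an audio silence level

def pad_with_neutral(sequences, maxlen, value=SILENCE):
    """Sketch: post-pad each sequence with a domain-appropriate
    neutral value rather than a plain zero."""
    out = np.full((len(sequences), maxlen), value, dtype=float)
    for i, seq in enumerate(sequences):
        out[i, :len(seq)] = seq[:maxlen]
    return out

X = pad_with_neutral([[0.2, 0.5], [0.1, 0.4, 0.9]], maxlen=4)
# Short sequences are filled out with the neutral value, not 0.
```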

zumpchke avatar zumpchke commented on May 4, 2024 1

Or use the Masking layer.

#3086

carlthome avatar carlthome commented on May 4, 2024 1

@patyork, sorry, but wouldn't Masking() take care of this? In any minibatch, some sequences end earlier than the longest one and should thus carry no weight in the forward pass: the padded timesteps are set to zero, and the loss is reduced accordingly before the weights are updated in the backward pass. I guess it would be extremely important to normalize the data if masking is being used.

hqsiswiliam avatar hqsiswiliam commented on May 4, 2024

@fchollet sorry to bother you, but I can't find model.train(np.array([seq]), [label]) in the Keras documentation for batch size 1.

 avatar commented on May 4, 2024

@visionscaper yes, the padding still goes through the network. If you don't want this, you might want to look into sequence-to-sequence learning, e.g. with farizrahman4u/seq2seq. This paper explains the idea.

LopezGG avatar LopezGG commented on May 4, 2024

@patyork: You mentioned

Sequences should be grouped together by length, and segmented manually into batches by that length before being sent to Keras.
I initialize my model as :
model_LSTM = Sequential()
model_LSTM.add(LSTM(lstm_hidden_units, input_shape=(X.shape[1], X.shape[2])))

and I plan on calling model.train_on_batch(X, y) on every batch. The problem is: how can I initialize input_shape in the LSTM when it varies across batches?

patyork avatar patyork commented on May 4, 2024

@LopezGG The shape of a single temporal (or any other kind of) "frame" of the input must be the same across samples; the varying dimension is the length, i.e. the number of "frames" each sample has.

For example, excluding anything to do with batches and batch size: a set of video clips all have a resolution of 1920×1080 pixels but can vary in duration. In this case, the input shape is 1920 by 1080, which is the "frame" size, and the varying dimension is the duration/length of the video, such as 120 frames / 4 seconds of video. The sequence for this example video would be 120 frames of 1920×1080 pixels. Any length of video can be fed through this network, so long as it is a 1920×1080 feed.

Going one step further: if you want to use batches of videos to train concurrently, the sequences in each batch must be the same length. One way to accomplish this is to predefine a few "buckets" of temporal length, for example "up to 2 seconds, up to 4 seconds, etc". You can then bucket your video clips, padding when necessary (with all black frames) to get all of the clips to the cutoff/bucket duration.
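The bucketing idea can be sketched as follows (a minimal NumPy sketch; pad_to_bucket and the frame-count cutoffs are illustrative, with tiny 2×2 "frames" standing in for 1920×1080):

```python
import numpy as np

def pad_to_bucket(clips, cutoffs=(2, 4)):
    """Sketch of duration bucketing: assign each clip, of shape
    (frames, h, w), to the smallest cutoff that fits, padding with
    all-black (zero) frames up to that cutoff length. Cutoffs here
    are illustrative frame counts, not seconds."""
    buckets = {c: [] for c in cutoffs}
    for clip in clips:
        for c in cutoffs:
            if len(clip) <= c:
                pad = np.zeros((c - len(clip),) + clip.shape[1:])
                buckets[c].append(np.concatenate([clip, pad]))
                break
    # Clips in the same bucket now share a length and can be stacked.
    return {c: np.stack(v) for c, v in buckets.items() if v}

clips = [np.ones((1, 2, 2)), np.ones((3, 2, 2))]  # 1-frame and 3-frame clips
batches = pad_to_bucket(clips)
```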

patyork avatar patyork commented on May 4, 2024

@carlthome Yes, the Masking layer appears to be exactly what is needed. This thread predates that layer addition, and I was unaware of its intended use.

The masking layer looks great for most applications, but I would think the "pad with neutral data" approach should be kept in mind for some applications. Specifically, I would think that for speech recognition, it would be good to embed the idea of "silence" as a valid input sequence, and a very expected input sequence.

wangqianwen0418 avatar wangqianwen0418 commented on May 4, 2024

@hqsiswiliam

but I can't find model.train(np.array([seq]), [label]) in keras document for batch size 1.

it should be model.train_on_batch
