
convlstm_pytorch's Introduction

ConvLSTM_pytorch

This file contains the implementation of Convolutional LSTM in PyTorch made by me and DavideA.

We started from this implementation and heavily refactored it and added features to match our needs.

Please note that in this repository we implement the following dynamics: [image: CLSTM_dynamics]

which is a bit different from the one in the original paper.
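Judging from the gate names that appear in the code fragments quoted in the issues further down (`cc_i, cc_f, cc_o, cc_g` and `c_next = f * c_cur + i * g`), the dynamics can probably be written as follows. This is a reconstruction, not a copy of the original figure: the weight symbols $W_{x\cdot}$, $W_{h\cdot}$ and the biases $b_\cdot$ are notation assumed here, and the last line is the standard LSTM output rule, also assumed. $*$ denotes convolution and $\odot$ the Hadamard product.

```latex
\begin{aligned}
i_t &= \sigma(W_{xi} * X_t + W_{hi} * H_{t-1} + b_i) \\
f_t &= \sigma(W_{xf} * X_t + W_{hf} * H_{t-1} + b_f) \\
o_t &= \sigma(W_{xo} * X_t + W_{ho} * H_{t-1} + b_o) \\
g_t &= \tanh(W_{xg} * X_t + W_{hg} * H_{t-1} + b_g) \\
C_t &= f_t \odot C_{t-1} + i_t \odot g_t \\
H_t &= o_t \odot \tanh(C_t)
\end{aligned}
```

Under this reading, the difference from the original paper would be that the gates carry no peephole terms of the form $W_{c\cdot} \odot C$.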

How to Use

The ConvLSTM module derives from nn.Module so it can be used as any other PyTorch module.

The ConvLSTM class supports an arbitrary number of layers. For each layer you can specify the hidden dimension (that is, the number of channels) and the kernel size. If there are several layers but only a single value is provided, that value is replicated for all layers. For example, in the following snippet the three layers have their own hidden dimensions (64, 64 and 128) but share the same kernel size.

Example usage:

model = ConvLSTM(input_dim=channels,
                 hidden_dim=[64, 64, 128],
                 kernel_size=(3, 3),
                 num_layers=3,
                 batch_first=True,
                 bias=True,
                 return_all_layers=False)
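The replication rule described above can be sketched in plain Python. The helper name `extend_for_multilayer` is hypothetical and only illustrates the idea; it is not claimed to be the code used inside the module.

```python
def extend_for_multilayer(param, num_layers):
    """Replicate a single value so that every layer gets its own entry.

    Hypothetical helper for illustration; not taken from the repository.
    """
    if not isinstance(param, list):
        # A single value (e.g. 64 or (3, 3)) is shared by all layers.
        return [param] * num_layers
    if len(param) != num_layers:
        raise ValueError(f"expected {num_layers} values, got {len(param)}")
    return param


hidden_dims = extend_for_multilayer([64, 64, 128], num_layers=3)
kernel_sizes = extend_for_multilayer((3, 3), num_layers=3)
print(hidden_dims)   # [64, 64, 128]
print(kernel_sizes)  # [(3, 3), (3, 3), (3, 3)]
```

A list is taken as one value per layer, while anything else (an int or a tuple) is treated as a single value to be shared.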

TODO (in progress...)

  • Comment code
  • Add docs
  • Add example usage on a toy problem
  • Implement stateful mechanism
  • ...

Disclaimer

This is still a work in progress and is far from being perfect: if you find any bug please don't hesitate to open an issue.

convlstm_pytorch's People

Contributors

dansyu, ndrplz, stefanopini

convlstm_pytorch's Issues

Understanding the algo.

Hello there, I am really interested in your work but I am still trying to understand the implementation. Can you explain the following: you use the stacked version of x_(t) and h_(t-1) for the convolution, according to this:
Screenshot from 2023-09-22 12-02-18

but in the implementation you used h_cur and c_cur rather than h_(t-1)?
Screenshot from 2023-09-22 12-06-11

Will really appreciate your quick answer.

Question: Shouldn't each layer have multiple cells?

I noticed in the code that the input sequence updates the same cell iteratively before the hidden state is passed to the next layer. My understanding of LSTMs is that there can be multiple cells per layer. These cells then act as a sliding window over time, where cell 0 takes input from t0, cell 1 takes input from t1 and the hidden state from cell 0, and so on. Could you clarify this?

computation graph broken

I am trying to train a simple model:

model = seq2seq(input_dim=1,
                hidden_dim=[64],
                num_layers=1,
                kernel_size=(3, 3),
                batch_first=True,
                bias=True,
                return_all_layers=False).to(device)

but the computational graph is broken somewhere and I can't seem to find the place.

ReadMe example missing a comma

model = ConvLSTM(input_dim=channels,
                 hidden_dim=[64, 64, 128],
                 kernel_size=(3, 3),
                 num_layers=3,
                 batch_first=True[missing comma here]
                 bias=True,
                 return_all_layers=False)

multi-gpu use

Hi,

I just wanted to know whether this implementation would work well in a DataParallel model and whether any adaptation is needed.

Thanks

hidden state raises NotImplementedError()

Hey,
I am trying to integrate your convLSTM cell into an existing model I have.
I did this in the following way:

    def forward(self,x): 
        out = my_model(x)
        out = out.unsqueeze(1) #make sure dimensions fit
        out, self.hidden = self.convlstm(out, self.hidden)
        return out[-1].squeeze(1)

self.hidden is None in the first run, but not None in the second, leading to:

    # Implement stateful ConvLSTM
    if hidden_state is not None:
        raise NotImplementedError()

which is in your ConvLSTM module (line 141, 142)

would this be implemented just by:

    # Implement stateful ConvLSTM
    if hidden_state is not None:
        hidden_state = hidden_state

or am I misunderstanding something?
What is supposed to happen here?

Thanks for any help :-D

kernel_size = (2, 2) causes dimension issues

This code snippet taken and changed slightly from the docstring fails for me:

import torch
from CONVLSTM_Implementation import ConvLSTM
x = torch.rand((32, 10, 64, 128, 128))
convlstm = ConvLSTM(64, 16, (2, 2), 1, True, True, False)
_, last_states = convlstm(x)

with the following Error:

c_next = f * c_cur + i * g
RuntimeError: The size of tensor a (129) must match the size of tensor b (128) at non-singleton dimension 3

But I don't really know why. I guess I will use a different kernel_size for now.
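One plausible explanation, assuming the cell pads each convolution with `kernel_size // 2` and uses stride 1 (the padding code is not shown here, so this is an assumption): with an even kernel that padding makes the output one pixel larger than the input, so the gates no longer match `c_cur`. The arithmetic can be checked with the usual convolution size formula:

```python
def conv_output_size(size, kernel, padding, stride=1):
    """Spatial output size of a convolution (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


for kernel in (2, 3, 5):
    padding = kernel // 2  # assumed "same"-style padding rule
    print(kernel, conv_output_size(128, kernel, padding))
```

For an input of 128 this gives 129 for a kernel of 2 and 128 for kernels of 3 and 5, which is consistent with the 129 vs 128 mismatch in the error message. Under that assumption, odd kernel sizes are the safe choice.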

RuntimeError

When I added the code below, I got an error: "RuntimeError: expected stride to be a single integer value or a list of 3 values to match the convolution dimensions, but got stride=[1, 1]". Can anyone tell me why this happens?

def main():

    model = ConvLSTM(input_dim=3,
                     hidden_dim=[64, 64, 128, 128],
                     kernel_size=(3, 3, 3),
                     num_layers=4,
                     batch_first=True,
                     bias=True,
                     return_all_layers=False
                     )

    print(model)

    x = torch.randn((32, 10, 64, 128, 128))

    now_states, last_states = model(x)

if __name__ == "__main__":
    main()

torch split into the four gates

Hey I was not sure how the logic in line 49 works

cc_i, cc_f, cc_o, cc_g = torch.split(combined_conv, self.hidden_dim, dim=1)

Since each of the 4 gates has operations of weights with inputs, how is the order of split determined? Why not something like cc_g, cc_f, cc_i, cc_go = torch.split(combined_conv, self.hidden_dim, dim=1)? I am a bit confused how the LSTM equations in https://pytorch.org/docs/stable/nn.html#torch.nn.LSTMCell are implemented here.

Thanks in advance.

output shape problem

Sorry, I didn't really understand ConvLSTM.
When I use the Keras layer ConvLSTM2D(filters = 128, kernel_size=(3, 3), padding='same', return_sequences = False, go_backwards = True,kernel_initializer = 'he_normal' ) with an input of shape (batch, 2, h, w, channel), where I guess 2 is the time dimension, the output is (batch, h, w, 128).
But with your code I don't get the same shape. Can you help me get the same shape as the Keras ConvLSTM? Thanks.

split_size_or_sections

Sorry, my English is not very good. Why is split_size_or_sections set to self.hidden_dim? There are only 4 variables to receive the result, so I think self.hidden_dim should be changed to 4.

cc_i, cc_f, cc_o, cc_g = torch.split(combined_conv, self.hidden_dim, dim=1)
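For context: when the second argument of `torch.split` is an integer, it is the size of each chunk, not the number of chunks. Assuming the convolution emits `4 * hidden_dim` channels (an assumption about code not shown here), splitting with `hidden_dim` yields exactly four chunks. The pure-Python helper `split_sizes` below is hypothetical and only mirrors how the chunk sizes are computed:

```python
def split_sizes(total, split_size):
    """Chunk sizes produced when splitting `total` items into pieces of `split_size`.

    Mirrors the documented behaviour of torch.split with an integer argument:
    equal chunks, with a smaller final chunk if `total` is not divisible.
    """
    full, rest = divmod(total, split_size)
    return [split_size] * full + ([rest] if rest else [])


hidden_dim = 16
channels = 4 * hidden_dim  # assumed: one convolution emits all four gates
print(split_sizes(channels, hidden_dim))  # [16, 16, 16, 16]
print(len(split_sizes(channels, 4)))      # 16 chunks of size 4
```

With a split size of 4 there would be sixteen chunks, so unpacking into four names would fail.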

What is the difference between hidden_state and hidden_dim?

I saw that in the code, hidden_state is not implemented:

    def forward(self, input_tensor, hidden_state=None):
        """

        Parameters
        ----------
        input_tensor: todo
            5-D Tensor either of shape (t, b, c, h, w) or (b, t, c, h, w)
        hidden_state: todo
            None. todo implement stateful

meanwhile, hidden_dim is given.
What is the difference between those two variables?

Various time step of input samples

I am not sure how to handle input samples with varying numbers of time steps. I think padding is not okay for my problem. Is there a better way to solve it?

A question

I have a question regarding your implementation:
As I understood the original convolutional lstm formulation is as follows:
Screen Shot 2019-06-12 at 2 24 27 PM

But in your implementation, you used only one convolution layer. I don't understand how the two correspond, because in the formulation c is only used in the Hadamard product and not in convolutions, but here c and h are both used in convolutions.
In fact, all weights are shared across the 4 formulas, although there are 11 weights in the original formula.

Reproducing Moving Mnist with ConvLSTM

Hello, I am trying to reproduce the results of the paper on Moving MNIST.
I developed my own implementation but it didn't converge. Now I will try with your proposed implementation.

I was wondering, have you managed to use it on any dataset like Moving MNIST?

Does it converge?

By the way, I am not 100% sure how you implement the input-to-state transitions and the final state-to-output transition. Do you do it via 2D convs (per time step) or 3D convs (over the whole sequence)?

Any ideas?
N.A.
