
pytorch-transfomer's Issues

Decoder takes same inputs as Encoder

Very nice implementation. One question though: in the forward function of the Transformer class (line 79), could you please explain why you feed the decoder only the last "n" timesteps of the encoder's input (see the code snippet below)? I thought the decoder had to be fed some kind of additional target variable, for instance the next "n" timesteps.

[79] d = self.decs[0](self.dec_input_fc(x[:,-self.dec_seq_len:]), e)
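To make the slice in question concrete, here is a minimal sketch of what `x[:, -dec_seq_len:]` selects. The tensor shape `[batch, enc_seq_len, n_features]` and the sizes are assumptions chosen for illustration, not values taken from the repository.

```python
import torch

# Assumed sizes, for illustration only.
batch, enc_seq_len, dec_seq_len, n_features = 2, 6, 2, 1

# A dummy encoder input of shape [batch, enc_seq_len, n_features].
x = torch.arange(batch * enc_seq_len * n_features, dtype=torch.float32)
x = x.reshape(batch, enc_seq_len, n_features)

# Same slice as in line 79: the last dec_seq_len time steps of the input itself.
dec_in = x[:, -dec_seq_len:]
print(dec_in.shape)  # torch.Size([2, 2, 1])
```

So the decoder sees the tail of the observed window, not future targets, which is what the question above is asking about.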

something wrong about the prediction

Hi, I think there is a problem in the forecasting code:
if(output_sequence_length == 1):
    x[0].append([t(q).detach().squeeze().numpy()])
else:
    for a in t(q).detach().squeeze().numpy():
        x[0].append([a])
Here x is the test data. Supposing output_sequence_length is 5, then when you forecast the last 5 numbers you shouldn't feed in the 5 observed values; doing so is wrong.
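One way to read this concern is that a multi-step forecast should only ever be extended with the model's own predictions. The sketch below shows such a rollout under that assumption. `rollout` and `LastValue` are hypothetical names, and `LastValue` is a stand-in model, not the repository's Transformer.

```python
import torch

def rollout(model, history, window, steps):
    """Forecast `steps` blocks, feeding predictions (never future observations) back in."""
    series = list(history)  # starts with observed values only
    for _ in range(steps):
        q = torch.tensor(series[-window:], dtype=torch.float32).reshape(1, window, 1)
        with torch.no_grad():
            pred = model(q).reshape(-1)  # one block of predicted values
        series.extend(pred.tolist())     # append predictions, not observations
    return series[len(history):]

class LastValue(torch.nn.Module):
    """Hypothetical stand-in model: repeats the last input value `horizon` times."""
    def __init__(self, horizon):
        super().__init__()
        self.horizon = horizon

    def forward(self, q):
        return q[:, -1, :].repeat(1, self.horizon)

forecast = rollout(LastValue(horizon=5), history=[1.0, 2.0, 3.0], window=3, steps=2)
print(len(forecast))  # 10
```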

Training on GPU

Hi, first of all, thank you for the amazing work you have done. I have been playing with this architecture for a while now and have taken huge inspiration from your code. I have been trying to train this model on GPU but always end up with the error RuntimeError: Tensor for argument #3 'mat2' is on CPU, but expected it to be on GPU (while checking arguments for addmm). Do you know how this can be solved?
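A common cause of this kind of error in PyTorch, which may or may not be what happens in this repository, is that sub-layers are stored in a plain Python list. Such layers are not registered with the parent module, so `.cuda()` or `.to(device)` never moves their weights. The sketch below uses hypothetical class names to show the difference from `nn.ModuleList`.

```python
import torch
from torch import nn

class PlainListBlock(nn.Module):
    """Hypothetical block: heads kept in a plain list are NOT registered."""
    def __init__(self, n_heads, dim):
        super().__init__()
        self.heads = [nn.Linear(dim, dim) for _ in range(n_heads)]

class ModuleListBlock(nn.Module):
    """Hypothetical block: nn.ModuleList registers every head."""
    def __init__(self, n_heads, dim):
        super().__init__()
        self.heads = nn.ModuleList([nn.Linear(dim, dim) for _ in range(n_heads)])

plain = PlainListBlock(n_heads=2, dim=4)
registered = ModuleListBlock(n_heads=2, dim=4)

# Only registered parameters are moved by .cuda() / .to(device).
n_plain = len(list(plain.parameters()))
n_registered = len(list(registered.parameters()))
print(n_plain, n_registered)  # 0 4
```

If the model's parameter count looks too small, wrapping the lists of layers in `nn.ModuleList` is worth trying before anything else.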

Multi dimensional input

Hello,

Are there adjustments that I would need to make for the model to accept multi dimensional input?
i.e. 2 time series, so that for every time step my model would have 2 input features instead of 1?

Thanks
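As a rough sketch, assuming the model first projects each time step with a linear layer (as `dec_input_fc` in the snippet from line 79 suggests), accepting two features per step mostly means setting that layer's input size to 2. The names `n_features`, `dim_val` and `input_fc` below are illustrative, not confirmed to match the repository.

```python
import torch
from torch import nn

# Assumed sizes: 2 input features per time step, projected to dim_val channels.
n_features, dim_val = 2, 10
input_fc = nn.Linear(n_features, dim_val)

# A batch of 8 windows, 12 time steps each, 2 features per step.
x = torch.randn(8, 12, n_features)

# nn.Linear acts on the last dimension, so every time step is projected independently.
h = input_fc(x)
print(h.shape)  # torch.Size([8, 12, 10])
```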

Question regarding architecture

Hello,

I would like to ask what the rationale was for changing ReLU to ELU and for not using dropout when implementing the Transformer. I would appreciate an answer, and thanks in advance for the clarification.

damaged .ipynb file?

Nice work. But it seems that some of the code is missing from the .ipynb file. Would you please check it or upload it again? Thank you very much.

Trying to use GPU, but getting an error

I am using a GPU, so I call model.cuda() and train.cuda(), but I get this error:

Traceback (most recent call last):
  File "/home/woon/anaconda3/envs/pytorch/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3417, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 10, in
    train_predict_outputs = model(train_data)
  File "/home/woon/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/woon/06_pytorch/Reference_code_/Pytorch-Transfomer-master/Network.py", line 74, in forward
    e = self.encs0
  File "/home/woon/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/woon/06_pytorch/Reference_code_/Pytorch-Transfomer-master/Network.py", line 19, in forward
    a = self.attn(x)
  File "/home/woon/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/woon/06_pytorch/Reference_code_/Pytorch-Transfomer-master/utils.py", line 50, in forward
    a.append(h(x, kv = kv))
  File "/home/woon/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/woon/06_pytorch/Reference_code_/Pytorch-Transfomer-master/utils.py", line 30, in forward
    return attention(self.query(x), self.key(x), self.value(x))
  File "/home/woon/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/woon/06_pytorch/Reference_code_/Pytorch-Transfomer-master/utils.py", line 97, in forward
    x = self.fc1(x)
  File "/home/woon/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/woon/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/linear.py", line 87, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/woon/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/functional.py", line 1612, in linear
    output = input.matmul(weight.t())
RuntimeError: Expected object of device type cuda but got device type cpu for argument #2 'mat2' in call to _th_mm
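One detail worth checking with calls like `train.cuda()`: unlike modules, tensors are not moved in place, so the returned tensor has to be kept. The sketch below is a generic device-handling pattern with a stand-in `nn.Linear` model, not the repository's network. Since the failing argument in the traceback appears to be a layer weight, this alone may not resolve the error.

```python
import torch
from torch import nn

# Fall back to CPU so the sketch also runs on machines without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in model; nn.Module.to() moves registered parameters in place.
model = nn.Linear(3, 1).to(device)

train_data = torch.randn(4, 3)
# Tensor.to() / Tensor.cuda() return a new tensor, so the result must be reassigned.
train_data = train_data.to(device)

out = model(train_data)
print(out.device.type == device.type)  # True
```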

Change the network ??

model.eval() does not reach internal layers

Dear all,

I've been trying your code. I've adapted it to my data and trained a model. But at inference, if the random seed is not fixed, the model produces different results each time.
I think that the model.eval() method, which should change the behavior of the layers that perform mini-batch normalization (inside the encoder, for instance), is not reaching that part of the model.

Best.
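The symptom described here can be reproduced in isolation. `model.eval()` only propagates to registered submodules, so a normalization layer kept in a plain Python list stays in training mode. The `Encoder` class below is a hypothetical minimal example, and whether the repository stores its layers this way is an assumption.

```python
import torch
from torch import nn

class Encoder(nn.Module):
    """Hypothetical encoder holding one normalization layer."""
    def __init__(self, registered):
        super().__init__()
        layers = [nn.BatchNorm1d(4)]
        # A plain list hides the layer from eval()/train(); nn.ModuleList registers it.
        self.layers = nn.ModuleList(layers) if registered else layers

plain = Encoder(registered=False)
registered = Encoder(registered=True)
plain.eval()
registered.eval()

# The `training` flag shows whether eval() reached the inner layer.
print(plain.layers[0].training, registered.layers[0].training)  # True False
```

Checking the `training` flag of the inner layers after calling `model.eval()` is a quick way to confirm or rule out this explanation.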

The output is nearly the same

Hello, thanks for sharing the implementation! I have been stuck on a question for several days. I trained the model on my data, whose size is [batch, 12, 6]. But after two epochs, all the outputs become nearly the same. When I test with a random tensor, the output is also nearly the same. Could you please give me some advice about it?
Sincerely

multiple input output

Thank you very much for sharing! I want to predict from multi-feature input: the input size is [batch_size, 6, 7] and the prediction is [batch_size, 7]. How do I modify the hyperparameters? How should the prediction part be modified?
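On the output side, one possible design (an assumption, not necessarily what the repository does) is to flatten the decoder output over time and map it to 7 values with a final linear layer. `d`, `dec_seq_len`, `dim_val` and `out_fc` are illustrative names.

```python
import torch
from torch import nn

# Assumed sizes: decoder output of 2 steps with 16 channels, 7 target values.
batch_size, dec_seq_len, dim_val, n_targets = 4, 2, 16, 7

# Stand-in for the decoder output, shape [batch_size, dec_seq_len, dim_val].
d = torch.randn(batch_size, dec_seq_len, dim_val)

# Flatten time and channel dimensions, then project to one value per target.
out_fc = nn.Linear(dec_seq_len * dim_val, n_targets)
y = out_fc(d.flatten(start_dim=1))
print(y.shape)  # torch.Size([4, 7])
```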

look ahead masking

Hi,
Why did you not implement look-ahead masking in your model?
I am new to transformers.
Regards
Arjun
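For readers new to the term, here is a minimal sketch of look-ahead (causal) masking in scaled dot-product self-attention. It is a generic illustration, not code from this repository, and the function names are made up for the example.

```python
import torch

def look_ahead_mask(seq_len):
    # True above the diagonal: position i may not attend to positions j > i.
    return torch.ones(seq_len, seq_len).triu(diagonal=1).bool()

def masked_self_attention(q, k, v):
    # Scaled dot-product scores, shape [batch, seq_len, seq_len].
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    # Masked positions get -inf, so softmax assigns them zero weight.
    scores = scores.masked_fill(look_ahead_mask(q.size(-2)), float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    return weights @ v, weights

torch.manual_seed(0)
q = k = v = torch.randn(2, 4, 8)
out, weights = masked_self_attention(q, k, v)
print(weights[0])  # lower-triangular attention weights
```

With the mask in place, the first time step can only attend to itself, so its output equals its own value vector.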
