liammaclean216 / pytorch-transfomer
My implementation of the transformer architecture from the "Attention Is All You Need" paper, applied to time series.
Very nice implementation. One question though: in the Transformer class's forward function (line 79), could you please explain why you feed the decoder only the last "n" timesteps of the encoder's input? (see the code snippet below). I thought the decoder had to be fed some kind of additional target variable, for instance the next "n" timesteps.
[79] d = self.decs[0](self.dec_input_fc(x[:,-self.dec_seq_len:]), e)
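For reference, that slice simply takes the trailing dec_seq_len steps along the time dimension; a minimal sketch (the shapes here are illustrative, not from the repo):

```python
import torch

# Illustrative shapes: batch of 2, 10 input timesteps, 1 feature.
x = torch.randn(2, 10, 1)
dec_seq_len = 3

# The decoder is fed the last dec_seq_len timesteps of the encoder's own
# input, not a separate target sequence.
dec_input = x[:, -dec_seq_len:]
print(dec_input.shape)  # torch.Size([2, 3, 1])
```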
Hi, I think there is a problem in the forecasting code:

if output_sequence_length == 1:
    x[0].append([t(q).detach().squeeze().numpy()])
else:
    for a in t(q).detach().squeeze().numpy():
        x[0].append([a])

Supposing output_sequence_length is 5: when you forecast the last 5 numbers of the test data x, you shouldn't feed in the 5 observed values, which is what this does.
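A forecast that does not peek at the held-out observations would feed each prediction back in as input. A minimal sketch of that walk-forward loop (the rollout helper and the one-step model interface are assumptions, not the repo's API):

```python
import torch

def rollout(model, history, steps):
    """Autoregressive forecast: each prediction is fed back as input
    instead of the observed future values. `model` is assumed to map a
    [batch, seq_len, 1] window to one next value per batch element."""
    window = history.clone()
    preds = []
    for _ in range(steps):
        y = model(window)                                  # one step ahead
        preds.append(y)
        # slide the window: drop the oldest step, append the prediction
        window = torch.cat([window[:, 1:], y.view(-1, 1, 1)], dim=1)
    return torch.stack(preds, dim=1)

# demo with a stand-in one-step model that predicts the window mean
demo = rollout(lambda w: w.mean(dim=(1, 2)), torch.ones(1, 4, 1), 3)
print(demo.shape)  # torch.Size([1, 3])
```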
Nice work, but I found that the repo doesn't implement the testing part, i.e. once training is done, we can't use the model for evaluation, right?
Hi, first of all, thank you for the amazing work you have done. I have been playing with this architecture for a while now and have taken huge inspiration from your code. I have been trying to train this model on a GPU but always end up with the error RuntimeError: Tensor for argument #3 'mat2' is on CPU, but expected it to be on GPU (while checking arguments for addmm). Do you know how this can be solved?
Hello,
Are there adjustments I would need to make for the model to accept multi-dimensional input?
i.e. 2 time series, so that at every time step the model would have 2 input features instead of 1?
Thanks
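For what it's worth, the usual change is to widen the model's input projection from 1 feature to 2; a hedged sketch (the name input_fc here is illustrative, not necessarily the repo's):

```python
import torch
import torch.nn as nn

# Sketch: with 2 parallel time series, each timestep carries 2 features,
# so the first projection maps 2 -> d_model instead of 1 -> d_model.
n_features, d_model = 2, 32
input_fc = nn.Linear(n_features, d_model)

x = torch.randn(8, 24, n_features)   # [batch, timesteps, features]
h = input_fc(x)                      # applied along the last dimension
print(h.shape)  # torch.Size([8, 24, 32])
```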
Hello,
I would like to ask about the rationale for changing ReLU to ELU and for not using dropout in your Transformer implementation. I would appreciate an answer; thanks in advance for the clarification.
Nice work, but it seems that some code is missing from the .ipynb file. Would you please check it or upload it again? Thank you very much.
I am using a GPU, so I call model.cuda() and train.cuda(), but I get this error:
Traceback (most recent call last):
  File "/home/woon/anaconda3/envs/pytorch/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3417, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 10, in <module>
    train_predict_outputs = model(train_data)
  File "/home/woon/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/woon/06_pytorch/Reference_code_/Pytorch-Transfomer-master/Network.py", line 74, in forward
    e = self.encs[0](x)
  File "/home/woon/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/woon/06_pytorch/Reference_code_/Pytorch-Transfomer-master/Network.py", line 19, in forward
    a = self.attn(x)
  File "/home/woon/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/woon/06_pytorch/Reference_code_/Pytorch-Transfomer-master/utils.py", line 50, in forward
    a.append(h(x, kv = kv))
  File "/home/woon/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/woon/06_pytorch/Reference_code_/Pytorch-Transfomer-master/utils.py", line 30, in forward
    return attention(self.query(x), self.key(x), self.value(x))
  File "/home/woon/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/woon/06_pytorch/Reference_code_/Pytorch-Transfomer-master/utils.py", line 97, in forward
    x = self.fc1(x)
  File "/home/woon/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/woon/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/linear.py", line 87, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/woon/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/functional.py", line 1612, in linear
    output = input.matmul(weight.t())
RuntimeError: Expected object of device type cuda but got device type cpu for argument #2 'mat2' in call to _th_mm
Do I need to change the network?
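One plausible reading of the traceback: a Linear layer's weight is still on the CPU, which happens when submodules are stored in a plain Python list (invisible to .cuda()) or when an input tensor is created after the move. A sketch of the usual remedy, with a stand-in model in place of the repo's Transformer:

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Move BOTH the model and every batch to the same device. Note that
# model.cuda() only reaches registered submodules; layers kept in a plain
# Python list must be wrapped in nn.ModuleList to be moved along.
model = nn.Linear(4, 2).to(device)   # stand-in for the Transformer
train_data = torch.randn(8, 4).to(device)

out = model(train_data)
print(out.shape)  # torch.Size([8, 2])
```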
Dear all,
I've been trying your code. I adapted it to my data and trained a model. But at inference, if the random seed is not fixed, the model produces different results each time.
I think that model.eval(), which should change the behavior of the layers that perform mini-batch normalization (inside the encoder, for instance), is not reaching that part of the model.
Best.
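That hypothesis is easy to test: model.eval() only recurses into registered submodules, so a layer kept in a plain Python list stays in training mode (dropout, for example, then keeps injecting randomness at inference). A small sketch with hypothetical module names:

```python
import torch.nn as nn

# eval() reaches submodules registered as attributes, nn.ModuleList, or
# nn.Sequential; a plain Python list of layers would be skipped.
parent = nn.Module()
parent.layers = nn.ModuleList([nn.Dropout(0.5)])
parent.eval()
print(all(not m.training for m in parent.modules()))  # True
```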
Hello, thanks for sharing the implementation! I have had a question for several days. I trained the model on my own data, with size [batch, 12, 6]. But after two epochs, all the outputs become nearly the same. At test time, even with a random input tensor, the output is also nearly the same. Could you please give me some advice?
Sincerely
Thank you very much for sharing! I want to predict with multivariate input: the input size is [batch_size, 6, 7] and the prediction is [batch_size, 7]. How do I modify the hyperparameters? And how should the prediction part be modified?
Hi,
Why did you not implement look-ahead masking in your model?
I am new to transformers.
Regards
Arjun
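For context, a look-ahead (causal) mask is the standard construction sketched below; whether to apply it is a design choice this repo leaves out (illustrative code, not the repo's):

```python
import torch

def look_ahead_mask(size):
    """Boolean upper-triangular mask: True marks positions a query may
    NOT attend to, i.e. position i only sees positions <= i."""
    return torch.triu(torch.ones(size, size, dtype=torch.bool), diagonal=1)

mask = look_ahead_mask(4)
# Masked scores are typically set to -inf before the softmax, e.g.
# scores.masked_fill(mask, float("-inf"))
print(mask)
```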