liammaclean216 / pytorch-transfomer
My implementation of the transformer architecture from the "Attention Is All You Need" paper, applied to time series.
Very nice implementation. One question though: in the Transformer class's forward function (line 79), could you please explain why you feed the decoder only the last "n" timesteps of the encoder's input? (see the code snippet below). I thought the decoder had to be fed some kind of additional target variable, for instance the next "n" timesteps.
[79] d = self.decs[0](self.dec_input_fc(x[:,-self.dec_seq_len:]), e)
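For reference, that slice simply takes the trailing dec_seq_len steps along the time dimension; a minimal sketch (the shapes here are illustrative, not from the repo):

```python
import torch

# Illustrative shapes: batch of 2, 10 input timesteps, 1 feature.
x = torch.randn(2, 10, 1)
dec_seq_len = 3

# The decoder is fed the last dec_seq_len timesteps of the encoder's own
# input, not a separate target sequence.
dec_input = x[:, -dec_seq_len:]
print(dec_input.shape)  # torch.Size([2, 3, 1])
```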
Hi, I think there is a problem in the forecasting code:

if output_sequence_length == 1:
    x[0].append([t(q).detach().squeeze().numpy()])
else:
    for a in t(q).detach().squeeze().numpy():
        x[0].append([a])

Supposing output_sequence_length is 5: when you forecast the last 5 numbers of the test data x, you shouldn't feed in the 5 observed values, which is what this does.
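A forecast that does not peek at the held-out observations would feed each prediction back in as input. A minimal sketch of that walk-forward loop (the rollout helper and the one-step model interface are assumptions, not the repo's API):

```python
import torch

def rollout(model, history, steps):
    """Autoregressive forecast: each prediction is fed back as input
    instead of the observed future values. `model` is assumed to map a
    [batch, seq_len, 1] window to one next value per batch element."""
    window = history.clone()
    preds = []
    for _ in range(steps):
        y = model(window)                                  # one step ahead
        preds.append(y)
        # slide the window: drop the oldest step, append the prediction
        window = torch.cat([window[:, 1:], y.view(-1, 1, 1)], dim=1)
    return torch.stack(preds, dim=1)

# demo with a stand-in one-step model that predicts the window mean
demo = rollout(lambda w: w.mean(dim=(1, 2)), torch.ones(1, 4, 1), 3)
print(demo.shape)  # torch.Size([1, 3])
```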
Nice work, but I found that the repo doesn't implement the testing part, i.e. once training is done, we can't use the model for evaluation, right?
Hi, first of all, thank you for the amazing work you have done. I have been playing with this architecture for a while now and have taken huge inspiration from your code. I have been trying to train this model on a GPU but always end up with the error RuntimeError: Tensor for argument #3 'mat2' is on CPU, but expected it to be on GPU (while checking arguments for addmm). Do you know how this can be solved?
Hello,
Are there adjustments I would need to make for the model to accept multi-dimensional input?
i.e. 2 time series, so that at every time step the model would have 2 input features instead of 1?
Thanks
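For what it's worth, the usual change is to widen the model's input projection from 1 feature to 2; a hedged sketch (the name input_fc here is illustrative, not necessarily the repo's):

```python
import torch
import torch.nn as nn

# Sketch: with 2 parallel time series, each timestep carries 2 features,
# so the first projection maps 2 -> d_model instead of 1 -> d_model.
n_features, d_model = 2, 32
input_fc = nn.Linear(n_features, d_model)

x = torch.randn(8, 24, n_features)   # [batch, timesteps, features]
h = input_fc(x)                      # applied along the last dimension
print(h.shape)  # torch.Size([8, 24, 32])
```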
Hello,
I would like to ask about the rationale for changing ReLU to ELU and for not using dropout in your Transformer implementation. I would appreciate an answer; thanks in advance for the clarification.
Nice work, but it seems that some code is missing from the .ipynb file. Would you please check it or upload it again? Thank you very much.
I am using a GPU, so I call model.cuda() and train.cuda(), but I get this error:
Traceback (most recent call last):
  File "/home/woon/anaconda3/envs/pytorch/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3417, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 10, in <module>
    train_predict_outputs = model(train_data)
  File "/home/woon/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/woon/06_pytorch/Reference_code_/Pytorch-Transfomer-master/Network.py", line 74, in forward
    e = self.encs[0](x)
  File "/home/woon/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/woon/06_pytorch/Reference_code_/Pytorch-Transfomer-master/Network.py", line 19, in forward
    a = self.attn(x)
  File "/home/woon/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/woon/06_pytorch/Reference_code_/Pytorch-Transfomer-master/utils.py", line 50, in forward
    a.append(h(x, kv = kv))
  File "/home/woon/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/woon/06_pytorch/Reference_code_/Pytorch-Transfomer-master/utils.py", line 30, in forward
    return attention(self.query(x), self.key(x), self.value(x))
  File "/home/woon/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/woon/06_pytorch/Reference_code_/Pytorch-Transfomer-master/utils.py", line 97, in forward
    x = self.fc1(x)
  File "/home/woon/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/woon/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/linear.py", line 87, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/woon/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/functional.py", line 1612, in linear
    output = input.matmul(weight.t())
RuntimeError: Expected object of device type cuda but got device type cpu for argument #2 'mat2' in call to _th_mm
Do I need to change the network?
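One plausible reading of the traceback: a Linear layer's weight is still on the CPU, which happens when submodules are stored in a plain Python list (invisible to .cuda()) or when an input tensor is created after the move. A sketch of the usual remedy, with a stand-in model in place of the repo's Transformer:

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Move BOTH the model and every batch to the same device. Note that
# model.cuda() only reaches registered submodules; layers kept in a plain
# Python list must be wrapped in nn.ModuleList to be moved along.
model = nn.Linear(4, 2).to(device)   # stand-in for the Transformer
train_data = torch.randn(8, 4).to(device)

out = model(train_data)
print(out.shape)  # torch.Size([8, 2])
```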
Dear all,
I've been trying your code. I adapted it to my data and trained a model. But at inference, if the random seed is not fixed, the model produces different results each time.
I think that model.eval(), which should change the behavior of the layers that perform mini-batch normalization (inside the encoder, for instance), is not reaching that part of the model.
Best.
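That hypothesis is easy to test: model.eval() only recurses into registered submodules, so a layer kept in a plain Python list stays in training mode (dropout, for example, then keeps injecting randomness at inference). A small sketch with hypothetical module names:

```python
import torch.nn as nn

# eval() reaches submodules registered as attributes, nn.ModuleList, or
# nn.Sequential; a plain Python list of layers would be skipped.
parent = nn.Module()
parent.layers = nn.ModuleList([nn.Dropout(0.5)])
parent.eval()
print(all(not m.training for m in parent.modules()))  # True
```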
Hello, thanks for sharing the implementation! I have had a question for several days. I trained the model on my own data, with size [batch, 12, 6]. But after two epochs, all the outputs become nearly the same. At test time, even with a random input tensor, the output is also nearly the same. Could you please give me some advice?
Sincerely
Thank you very much for sharing! I want to predict with multivariate input: the input size is [batch_size, 6, 7] and the prediction is [batch_size, 7]. How do I modify the hyperparameters? And how should the prediction part be modified?
Hi,
Why did you not implement look-ahead masking in your model?
I am new to transformers.
Regards
Arjun
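For context, a look-ahead (causal) mask is the standard construction sketched below; whether to apply it is a design choice this repo leaves out (illustrative code, not the repo's):

```python
import torch

def look_ahead_mask(size):
    """Boolean upper-triangular mask: True marks positions a query may
    NOT attend to, i.e. position i only sees positions <= i."""
    return torch.triu(torch.ones(size, size, dtype=torch.bool), diagonal=1)

mask = look_ahead_mask(4)
# Masked scores are typically set to -inf before the softmax, e.g.
# scores.masked_fill(mask, float("-inf"))
print(mask)
```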