oliverguhr / transformer-time-series-prediction
proof of concept for a transformer-based time series prediction model
License: MIT License
I reproduced the plot in the readme for the multistep predictor of a sinusoid, but after changing some hyperparameters I'm seeing a mismatch between the loss and the predictive power. Below are the losses for a run with default parameters and a run with another set of hyperparameters labelled "best-1lyr" (lr=0.00843, decay factor=0.9673, num features=110):
Both converge to a stable result after ~60 epochs; however, their predictions are not stable, nor do they line up with predictive power. Below are GIFs of the predictor output for the default parameters and my other set of parameters, respectively:
The run with default parameters appears to jump out of one locally convex region and into another around the 50th epoch. It actually does this twice, and the 100th-epoch prediction is the one with higher-magnitude noise at the start and end of the prediction. The run with the new parameters seems to remain in a fixed region of the cost surface, yet it has consistently much lower predictive power than the run with default parameters while at the same time achieving a lower loss. Any ideas what issue(s) I might be running into?
One thing to note is that there appears to be some randomness in training even though the code sets random seeds for torch and numpy. I get different loss curves for multiple runs of the default parameters, but, oddly, they only diverge after exactly 15 epochs; otherwise the training curves look pretty much the same.
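If it helps, nondeterministic GPU kernels can cause exactly this kind of run-to-run divergence even when torch and numpy are seeded. A minimal sketch (assuming a reasonably recent PyTorch; not code from this repository) of forcing deterministic behavior:

```python
import random
import numpy as np
import torch

def seed_everything(seed: int = 42):
    """Seed the common RNG sources and opt into deterministic kernels."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)  # also seeds CUDA generators on recent versions
    # Raise an error whenever a nondeterministic op would run
    torch.use_deterministic_algorithms(True)
    torch.backends.cudnn.benchmark = False

seed_everything(0)
print(torch.rand(2))  # identical across runs with the same seed
```

With this in place, two runs that still diverge point to something outside the RNGs (e.g. data ordering or nondeterministic reductions).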
I wonder whether it would be possible to add covariates as input.
Hi, thanks for providing the code.
In transformer-singlestep.py, I change model = TransAm().to(device) to model = TransAm(num_layers=5).to(device). When I train the model, I find that it outputs the same value (0) for all inputs. I thought that increasing the number of layers would make the model more expressive, but it results in worse performance.
Have you met this problem before? I am not sure whether I need to change the training settings.
(BTW, I also tried the suggestion in another issue: I removed the self. in self.encoder_layer, but still can't make training converge.)
Prediction for epoch 100:
Hello, I want to ask: in this code the prediction is one-dimensional, but in my problem the prediction is three-dimensional. How should I modify this code?
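One possible direction, sketched under the assumption that the model ends with a Linear decoder as TransAm does: widen the final layer so it emits 3 values per timestep (the class and parameter names below are illustrative, not the repository's):

```python
import torch
import torch.nn as nn

class ThreeDimHead(nn.Module):
    """Illustrative head: project d_model encoder outputs to 3 target dims."""
    def __init__(self, d_model=64, out_dim=3):
        super().__init__()
        self.decoder = nn.Linear(d_model, out_dim)

    def forward(self, encoded):
        # encoded: (seq_len, batch, d_model) -> (seq_len, batch, 3)
        return self.decoder(encoded)
```

The loss would then be computed against targets of shape (seq_len, batch, 3); the input side needs a matching 3-feature projection.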
I believe that
input = torch.stack([item[0] for item in data]).view((input_window,batch_len,1))
should be changed to
input = torch.stack([item[0] for item in data]).T.unsqueeze(-1)
This is because the ordering is incorrect if the first form is used: the inputs that go into the transformer are scrambled, i.e. they do not proceed as t=0, t=1, t=2, ...
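A tiny demonstration of the difference: view() merely reinterprets the row-major buffer, while the transpose actually puts the time axis first (toy shapes, not the repository's data):

```python
import torch

batch_len, input_window = 2, 3
# Each row is one sequence: [[0, 1, 2], [3, 4, 5]]
stacked = torch.arange(batch_len * input_window).reshape(batch_len, input_window)

# view() reinterprets the flat buffer, interleaving the two sequences
wrong = stacked.view((input_window, batch_len, 1))
# transpose puts time first, then add the feature dimension
right = stacked.T.unsqueeze(-1)

print(wrong[:, 0, 0].tolist())  # [0, 2, 4] -- not a real sequence
print(right[:, 0, 0].tolist())  # [0, 1, 2] -- the first sequence, in order
```

So with view(), "sequence 0" seen by the transformer mixes timesteps from both batch elements.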
Thanks very much for your code. However, there are some differences between your code and the PyTorch tutorial "Sequence-to-Sequence Modeling with nn.Transformer and torchtext" in the class TransAm.
According to https://www.zhihu.com/question/67209417/answer/1264503855, adding self. to encoder_layers causes self.encoder_layers to be registered among the module's parameters, but only self.transformer_encoder is used in the forward pass; the actual weights are the nlayers copies of the nn.TransformerEncoderLayer that the encoder makes internally.
That is to say, self.encoder_layers does not participate in the model's computation, so it receives no gradient in backward, which leads to training errors.
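For context, the pattern in question can be sketched like this (a minimal stand-in, not the repository's TransAm class): nn.TransformerEncoder deep-copies the template layer nlayers times, so the template itself should stay a local variable rather than become a registered submodule.

```python
import torch
import torch.nn as nn

class TinyEncoder(nn.Module):
    def __init__(self, d_model=32, nhead=4, nlayers=2):
        super().__init__()
        # Template layer: TransformerEncoder deep-copies it nlayers times.
        # Writing self.encoder_layer here would register extra parameters
        # that are never used in forward() and thus never get gradients.
        encoder_layer = nn.TransformerEncoderLayer(d_model, nhead)
        self.transformer_encoder = nn.TransformerEncoder(encoder_layer, nlayers)

    def forward(self, src):
        # src: (seq_len, batch, d_model)
        return self.transformer_encoder(src)
```

With this layout, every parameter in named_parameters() belongs to the copies inside transformer_encoder and participates in backward.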
Can I export the output as a CSV file instead of a PNG image?
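One way to do this is to write the arrays that the plotting code receives with numpy instead of (or in addition to) pyplot.savefig. A sketch, where predictions and truth are placeholder arrays rather than variables from the repository:

```python
import numpy as np

# Placeholder data standing in for the model's output and ground truth
predictions = np.array([0.1, 0.5, 0.9])
truth = np.array([0.0, 0.6, 1.0])

# Two columns, one row per timestep
np.savetxt("output.csv",
           np.column_stack([truth, predictions]),
           delimiter=",",
           header="truth,prediction",
           comments="")  # comments="" keeps the header line unprefixed
```

The resulting file opens directly in any spreadsheet or with pandas.read_csv.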
The create_inout_sequences function returns labels with the same length as the input sequence.
Like
But I saw that many LSTM works make the data sequences like:
so I changed the create_inout_sequences function to return a 100-step sequence and 1 label for each sample. It raises an error:
Traceback (most recent call last):
File "E:/Trans.py", line 254, in
train_data, val_data = get_data()
File "E:/Trans.py", line 120, in get_data
train_sequence = create_inout_sequences(train_data, input_window)
File "E:/Trans.py", line 98, in create_inout_sequences
return torch.FloatTensor(inout_seq)
ValueError: expected sequence of length 100 at dim 2 (got 1)
I don't know how to fix it. I also wonder why the label length is made the same as the sequence length. Is it possible to make the samples consist of a 100-step sequence and a 1-step label?
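For what it's worth, the ValueError comes from building a single FloatTensor out of pairs whose two elements have different lengths (100 vs. 1), which is ragged. A hedged sketch of a variant (function name and return convention are mine, not the repository's) that sidesteps this by returning two tensors:

```python
import torch

def create_inout_sequences_1label(data, input_window):
    """100-step input, 1-step label. Returns two rectangular tensors
    instead of one ragged FloatTensor (the ragged case is what raises
    'expected sequence of length 100 ... (got 1)')."""
    seqs, labels = [], []
    for i in range(len(data) - input_window):
        seqs.append(data[i:i + input_window])
        labels.append(data[i + input_window])
    return torch.stack(seqs), torch.stack(labels).unsqueeze(-1)

x = torch.arange(105.)
seqs, labels = create_inout_sequences_1label(x, 100)
print(seqs.shape, labels.shape)  # torch.Size([5, 100]) torch.Size([5, 1])
```

The rest of get_batch would then need to index the two tensors separately rather than unpack (input, target) pairs.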
Thank you very much for the code. I applied the model to load decomposition, but I found that the final output is a straight line. Is it because there is no connection between the input data and the label? In the prediction problem, the input and the label differ only in time step, but in the decomposition task it is difficult to establish a connection between the input values and the label values. I would like to know how to deal with this situation.
The two example scripts shown handle only the sin data, not the temperature data.
Where is the code for the temperature dataset?
Thanks
127 seq_len = min(batch_size, len(source) - 1 - i)
128 data = source[i:i+seq_len]
--> 129 input = torch.stack(torch.stack([item[0] for item in data]).chunk(input_window, 1)) # 1 is feature size
130 target = torch.stack(torch.stack([item[1] for item in data]).chunk(input_window, 1))
131 return input, target
IndexError: invalid index of a 0-dim tensor. Use tensor.item() in Python or tensor.item&lt;T&gt;() in C++ to convert a 0-dim tensor to a number
Which version of torch are you using? I use the newest version in Colab, but an error occurs. Please help me.
I have not been able to get the same results as those presented here, not even after 1000 epochs.
In the code the number of epochs is 10 and 100 for single-step and multi-step respectively. I was wondering if those were the lengths you trained for when posting the results?
I was wondering why the input to the multistep transformer has zeros of length output_window appended. Is there a reason why we can't do it the same way as for the single-step transformer? That is, instead of [0 1 2 3 4 0 0] as input and [0 1 2 3 4 5 6] as labels, use [0 1 2 3 4] for the input and [2 3 4 5 6] for the labels.
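To make the format being asked about concrete, here is the zero-padded multistep pair from the example above, built as tensors (toy values, not the repository's data pipeline):

```python
import torch

output_window = 2
series = torch.arange(7.)  # the window [0 1 2 3 4 5 6]

# Multistep format: the future output_window steps are replaced by zeros
# in the input, while the label keeps the true future values.
src = series.clone()
src[-output_window:] = 0
tgt = series

print(src.tolist())  # [0.0, 1.0, 2.0, 3.0, 4.0, 0.0, 0.0]
print(tgt.tolist())  # [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
```

The shifted-label alternative in the question would instead pair series[:-output_window] with series[output_window:].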
I was reading and debugging the multi-step implementation to understand it better, and I've come across an interesting thing: it seems like the features and labels in training and evaluation are the same. Is this behavior correct? I thought that in a multi-step prediction problem the input features are delayed relative to the wanted labels; this way we have a window of the data's past behavior and we aim to predict its future behavior.
Hello, for multi-step prediction, how do you get the x and label y tensors? Could you use the sequence 1-100 to demonstrate it?
As the title says.
This seems to be a single-variable prediction, which uses only the sequence information of the time signal's own variable and does not use other features.
Thank you for uploading the code. I think the input data is history data, which shouldn't need to be masked; I don't understand this.
Thanks for your great work!
I have tried your single-step script to train and test on a certain time series dataset.
However, I noticed that although the predicted curve is close to the ground truth by mean squared error, it always seems to fall behind in predicting turning points of the curve.
For example, if the actual turning point appears at time point 10, my predicted turning point will probably appear at time point 11 or 12; the model seems unable to predict the turning point at the actual time.
Did you run into the same issue by any chance? Do you have any suggestions?
Hi, thanks for your work. May I ask what this line is used for? It seems you made the dataset smaller?
train_sequence = train_sequence[:-output_window]
I was wondering if this uses teacher forcing during training? And what terms did you use as the SOS and EOS tokens? :)
I have been trying to get the transformer to work on time series for over a month now, and it seems nearly impossible using the nn.Transformer model provided by PyTorch. Did you by any chance get the decoder in the original transformer to work as well?
Hello, the current situation is input [1 × 100] -> target [1 × 100], but what if the data is a CSV table, such as:

date        featuresA  featuresB  featuresC
2021/1/10   .........
2021/1/11   ........
2021/1/12   ...........
.......

Then the input is batch × 3 × 100 (3 features, 100 timesteps). How should I change your code? Thank you!
Hi,
For now, PositionalEncoding can only map a one-dimensional input to the encoder dimension. Please provide an efficient way to map multivariate features to the encoder dimension.
Thanks
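A common approach for the multivariate requests above (sketched here as an assumption, not code from this repository) is a learned Linear projection from n_features per timestep to d_model, applied before the positional encoding is added:

```python
import torch
import torch.nn as nn

class MultivariateEmbedding(nn.Module):
    """Hypothetical input projection: maps n_features per timestep to
    the encoder dimension d_model, so PositionalEncoding can then be
    added to a (seq_len, batch, d_model) tensor as usual."""
    def __init__(self, n_features=3, d_model=64):
        super().__init__()
        self.proj = nn.Linear(n_features, d_model)

    def forward(self, x):
        # x: (seq_len, batch, n_features) -> (seq_len, batch, d_model)
        return self.proj(x)
```

This replaces the implicit "1 feature = 1 scalar" assumption: the batch × features × timesteps data from the CSV case would be permuted to (timesteps, batch, features) and passed through this layer first.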