
transformer-time-series-prediction's People

Contributors

chenhui-x, oliverguhr, xanyv


transformer-time-series-prediction's Issues

Mismatch between loss and multistep sinusoid predictor

I reproduced the plot in the readme for the multistep predictor of a sinusoid, but after changing some hyperparameters I'm seeing a mismatch between the loss and the predictive power. Below are the losses for a run with default params and a run with another set of hyperparameters labelled "best-1lyr" (lr=0.00843, decay factor=.9673, num features=110):
[plot: validation loss for the best-1lyr and default runs]

Both losses converge to a stable value after ~60 epochs; however, the predictions are not stable, nor do the losses line up with predictive power. Below are GIFs of the predictor output for the default params and my other set of params, respectively:
[gif: predictor output over training, default params]
[gif: predictor output over training, best-1lyr params]

The run with default parameters appears to jump out of one locally convex region and into another around the 50th epoch. It actually does this twice, and the 100th-epoch prediction is the one with higher-magnitude noise at the start and end of the prediction. The run with the new parameters seems to remain in a fixed region of the cost surface, yet it has consistently much lower predictive power than the run with default parameters while achieving a lower loss. Any ideas what issue(s) I might be running into?

One thing to note is that there appears to be some randomness in training even though the code sets random seeds for torch and numpy. I get different loss curves for multiple runs of the default params, but, oddly, they only diverge after exactly 15 epochs. Note also that the training curves look pretty much the same.
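
In case it helps anyone debugging the same non-determinism: a minimal sketch of stricter seeding than torch/numpy seeds alone, assuming a CUDA run (the helper name seed_everything is mine, not from the repo):

    import random
    import numpy as np
    import torch

    def seed_everything(seed=0):
        random.seed(seed)                            # Python's own RNG
        np.random.seed(seed)                         # numpy RNG
        torch.manual_seed(seed)                      # CPU RNG
        torch.cuda.manual_seed_all(seed)             # all GPU RNGs
        torch.backends.cudnn.deterministic = True    # force deterministic cuDNN kernels
        torch.backends.cudnn.benchmark = False       # disable non-deterministic autotuning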

Covariates?

I wonder whether it would be possible to add covariates as input.
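
For what it's worth, a minimal sketch of one common way to do this, assuming covariates are stacked as extra channels per time step and projected to the model dimension (the class name CovariateEmbedding and the sizes are illustrative, not part of the repo):

    import torch
    import torch.nn as nn

    class CovariateEmbedding(nn.Module):
        def __init__(self, n_covariates, d_model=250):
            super().__init__()
            # each time step carries [target_value, covariate_1, ..., covariate_n]
            self.input_projection = nn.Linear(1 + n_covariates, d_model)

        def forward(self, src):
            # src: (seq_len, batch, 1 + n_covariates) -> (seq_len, batch, d_model)
            return self.input_projection(src)

    # usage sketch
    embed = CovariateEmbedding(n_covariates=2)
    src = torch.randn(100, 32, 3)   # 100 steps, batch of 32, value + 2 covariates
    out = embed(src)                # (100, 32, 250), ready for a transformer encoder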

Model collapse after adding encoder layers.

Hi, thanks for providing the code.

In transformer-singlestep.py, I changed model = TransAm().to(device) to model = TransAm(num_layers=5).to(device). When I train the model, it outputs the same value (0) for all inputs. I thought that increasing the number of layers would make the model more expressive, but it results in worse performance.

Have you run into this problem before? I am not sure whether I need to change the training settings.

(BTW, I also tried the suggestion from another issue and removed the self. prefix from self.encoder_layer, but training still does not converge.)

Prediction for epoch 100:

[image: prediction at epoch 100]

How to predict in multi-dimension

Hello, in this code the prediction is one-dimensional, but in my problem the prediction is three-dimensional. How should I modify the code?
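
A minimal sketch of the obvious modification, assuming the model ends in a Linear layer that maps the encoder output to one value per time step (as the decoder in TransAm appears to); the variable names here are illustrative:

    import torch.nn as nn

    d_model = 250                # use whatever feature size the model is built with
    n_outputs = 3                # predict three values per time step instead of one
    decoder = nn.Linear(d_model, n_outputs)
    # the targets then need shape (seq_len, batch, 3), so create_inout_sequences /
    # get_batch must also carry three values per step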

A bug when I add encoder layers

Thanks very much for your code. However, there are some differences between your code and the PyTorch tutorial "Sequence-to-Sequence Modeling with nn.Transformer and torchtext" in the class TransAm.

According to https://www.zhihu.com/question/67209417/answer/1264503855, adding the self. prefix to encoder_layer registers it as a submodule, so its parameters are counted as parameters of the model. However, only self.transformer_encoder is used in the forward pass, and nn.TransformerEncoder operates on its own num_layers copies of the encoder layer. In other words, self.encoder_layer itself does not participate in the computation, so it receives no gradient in backward, which leads to training errors.
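
A minimal sketch of the difference being described, assuming a constructor along the lines of TransAm (the exact defaults may differ from the actual file):

    import torch.nn as nn

    class TransAmSketch(nn.Module):
        def __init__(self, feature_size=250, num_layers=1, dropout=0.1):
            super().__init__()
            # Registering the layer as an attribute adds its parameters to the module,
            # but nn.TransformerEncoder only runs its own deep copies in forward(),
            # so those registered parameters never receive a gradient:
            #   self.encoder_layer = nn.TransformerEncoderLayer(d_model=feature_size, nhead=10, dropout=dropout)
            #   self.transformer_encoder = nn.TransformerEncoder(self.encoder_layer, num_layers=num_layers)

            # Keeping the layer local avoids registering unused parameters:
            encoder_layer = nn.TransformerEncoderLayer(d_model=feature_size, nhead=10, dropout=dropout)
            self.transformer_encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)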

csv

Can I export the output as a CSV file instead of a PNG image?
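
A minimal sketch of writing the series to CSV instead of saving a plot, assuming the evaluation loop produces two 1-D tensors (here called truth and test_result, which is how I read the plotting helper; the function name save_csv is mine):

    import numpy as np

    def save_csv(truth, test_result, epoch):
        # stack ground truth and prediction side by side, one row per time step
        data = np.column_stack((truth.cpu().numpy(), test_result.cpu().numpy()))
        np.savetxt("transformer-epoch%d.csv" % epoch, data,
                   delimiter=",", header="truth,prediction", comments="")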

Why make label length same as seq length?

The create_inout_sequences function returns labels with the same length as the input sequence, like:

in -> [0..99]

target -> [1..100]

But I saw many LSTM works make the data sequence like:

in -> [0..99]

target -> [100]

so I changed the create_inout_sequences function to return a 100-step sequence and a single label for each window. It raises this error:

Traceback (most recent call last):
File "E:/Trans.py", line 254, in
train_data, val_data = get_data()
File "E:/Trans.py", line 120, in get_data
train_sequence = create_inout_sequences(train_data, input_window)
File "E:/Trans.py", line 98, in create_inout_sequences
return torch.FloatTensor(inout_seq)
ValueError: expected sequence of length 100 at dim 2 (got 1)

I don't know how to fix it. I wonder why the label length is made the same as the sequence length? Is it possible to build the sequences from 100 input steps and 1 label?
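
For reference, a minimal sketch of a 100-in / 1-out variant, under the assumption that the original function stacks equal-length (seq, label) pairs into a single FloatTensor, which is exactly what fails once the label has a different length. Keeping each pair as two tensors sidesteps the error (the downstream get_batch would then need to stack seq and label separately):

    import torch

    def create_inout_sequences_one_label(input_data, tw):
        inout_seq = []
        L = len(input_data)
        for i in range(L - tw):
            train_seq = input_data[i:i + tw]              # e.g. steps 0..99
            train_label = input_data[i + tw:i + tw + 1]   # step 100 only
            inout_seq.append((torch.FloatTensor(train_seq),
                              torch.FloatTensor(train_label)))
        return inout_seq   # list of (seq, label) pairs instead of one stacked tensor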

About model output

Thank you very much for the code. I applied the model to load decomposition, but I found that the final output is a straight line. Is it because there is no connection between the input data and the label? In the prediction problem the input and the label differ only by a time shift, whereas in the decomposition task it is difficult to establish a connection between the input and label values. I would like to know how to deal with this situation.

IndexError: invalid index of a 0-dim tensor.

127     seq_len = min(batch_size, len(source) - 1 - i)
128     data = source[i:i+seq_len]

--> 129 input = torch.stack(torch.stack([item[0] for item in data]).chunk(input_window, 1)) # 1 is feature size
130 target = torch.stack(torch.stack([item[1] for item in data]).chunk(input_window, 1))
131 return input, target

IndexError: invalid index of a 0-dim tensor. Use tensor.item() in Python or tensor.item<T>() in C++ to convert a 0-dim tensor to a number

Which torch version are you using? I use the newest version in Colab, but this error occurs. Please help me.
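
Not a verified fix for this particular traceback, but as background: recent PyTorch versions raise exactly this error whenever a 0-dim (scalar) tensor is indexed, and .item() is the supported way to read the value out:

    import torch

    x = torch.tensor(3.0)   # a 0-dim tensor
    # x[0]                  # IndexError: invalid index of a 0-dim tensor
    value = x.item()        # returns the Python float 3.0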

Result reproducibility

I have not been able to get the same results as those presented here, not even after 1000 epochs.

In the code the number of epochs is 10 and 100 for the single-step and multi-step scripts respectively. I was wondering whether those are the lengths you trained for when posting the results?

Multistep Transformer Input Zeroed

I was wondering why the input to the multistep transformer has zeroes of length output_window. Is there a reason why we can't do it the same way as for the single-step transformer, that is, instead of [0 1 2 3 4 0 0], we have [0 1 2 3 4] for the input and [2 3 4 5 6] for the labels instead of [0 1 2 3 4 5 6]?

New question regarding the multistep prediction strategy

I was reading and debugging the multi-step implementation to understand it better, and I've come across an interesting thing: it seems that the features and labels in training and evaluation are the same. Is this behavior correct? I thought that in a multi-step prediction problem the input features are delayed relative to the wanted labels, so that we have a window of the data's past behavior and aim to predict its future behavior.

how to get input tensor?

Hello, for multi-step prediction, how do I get the input x and label y tensors? Could you demonstrate it using the sequence 1-100?
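
My own reading of the multi-step windowing, as a sketch: I assume create_inout_sequences zero-pads the tail of each input window, which is how I read transformer-multistep.py (please correct me if the actual code differs; the window sizes below are illustrative):

    import numpy as np

    data = np.arange(1, 101)                 # the sequence 1..100
    input_window, output_window = 10, 5      # illustrative sizes

    i = 0
    window = data[i:i + input_window]                             # [1 .. 10]
    x = np.append(window[:-output_window], [0] * output_window)   # [1..5, 0, 0, 0, 0, 0]
    y = window                                                    # [1 .. 10]
    # x: the last output_window positions are zeroed placeholders the model must fill in;
    # y: the full window, so the loss on the final output_window steps is the forecast target.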

single-variable prediction

This seems to be a single-variable prediction: it only uses the sequence information of the time signal's own variable and does not use other features.

why encoder needs mask?

Thank you for uploading the code. I think the input data is historical data, which shouldn't need to be masked. I don't understand why the encoder needs a mask.
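
As background (this is the standard causal mask, not something specific to this repo): a square "subsequent" mask makes the encoder causal, so position t can only attend to positions up to t even inside the history window, which keeps each predicted step from peeking at later history. A minimal reconstruction:

    import torch

    def square_subsequent_mask(size):
        mask = torch.triu(torch.ones(size, size), diagonal=1)   # 1s strictly above the diagonal
        return mask.masked_fill(mask == 1, float('-inf'))       # -inf blocks attention to the future

    print(square_subsequent_mask(4))
    # tensor([[0., -inf, -inf, -inf],
    #         [0.,   0., -inf, -inf],
    #         [0.,   0.,   0., -inf],
    #         [0.,   0.,   0.,   0.]])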

is it possible to predict the turning point of time series?

Thanks for your great work!
I have tried your single-step script to train and test on a certain time series dataset.
However, I noticed that although the predicted curve is close to the ground truth in terms of mean squared error, it always seems to fall behind when predicting turning points of the curve.
For example, if the actual turning point appears at time point 10, my predicted turning point will probably appear at time point 11 or 12; the model seems unable to predict the turning point at the actual time.
Did you run into the same issue by any chance? Do you have any suggestions?

Understand multistep transformer

Hi, thanks for your work. May I ask what this line is used for? It seems you made the dataset smaller?
train_sequence = train_sequence[:-output_window]

General questions.

I was wondering if this uses teacher forcing during training? And what terms did you use as the SOS and EOS tokens? :)

I have been trying to get the transformer to work on time series for over a month now, and it seems nearly impossible using the nn.Transformer model provided by PyTorch. Did you by any chance get the decoder of the original transformer to work as well?

new question!!

Hello, the current situation is input [1 * 100] -> target [1 * 100], but what if the data is a CSV table, such as:

                featuresA    featuresB    featuresC
    2021/1/10   .........
    2021/1/11   ........
    2021/1/12   ...........
    .......

So the input is BATCH * 3 * 100 (3 features, 100 time steps). How should I change your code? Thank you!
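
A minimal sketch of feeding three features per time step, assuming one extra input projection in front of the encoder (the sizes and names are illustrative, not the repo's current code). nn.TransformerEncoderLayer expects (seq_len, batch, d_model), so a (batch, 3, 100) array needs to be permuted first:

    import torch
    import torch.nn as nn

    batch = torch.randn(32, 3, 100)        # (batch, features, time) as in the question
    src = batch.permute(2, 0, 1)           # -> (time=100, batch=32, features=3)
    input_projection = nn.Linear(3, 250)   # 250 = illustrative model dimension
    src = input_projection(src)            # -> (100, 32, 250), ready for the encoder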
