Comments (22)

kashif commented on June 18, 2024

I will try to get this working for DeepVAR first, by today, and then for the others afterwards.

from pytorch-ts.

StatMixedML commented on June 18, 2024

@kashif I'd be glad to, though probably not until next week.

NielsRogge commented on June 18, 2024

Thank you for the quick reply.

For people wondering (quick summary):

  • The model itself already creates a number of covariates, which are defined in the create_transformation function of the model. For example, DeepVAREstimator already creates covariates such as Fourier time features (if you do not provide time_features yourself when initializing the model), age features, and observed values, as seen here. Lagged features are also created (see the lags_seq variable) if you do not provide them yourself when initializing the model.
  • If you want to add additional covariates (such as holiday information or other dynamic real features), add them to your dataset objects as shown in section 1.3 of GluonTS' extended tutorial.
  • If you're using MultivariateGrouper to group the various time series, you have to add the features again after grouping (as shown above).
  • When initializing DeepVAREstimator, set use_feat_dynamic_real/use_feat_static_cat/use_feat_static_real to True (depending on which you are using). If you're using categorical features, also set cardinality, a list containing the number of unique values for each categorical feature, and embedding_dimension, a list with the embedding dimension to use for each categorical feature. For each additional categorical feature you add, an embedding layer is created, as seen here.
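As a rough sketch of the second and third bullets above, the snippet below uses plain Python dicts shaped like GluonTS dataset entries (`start`, `target`, `feat_dynamic_real`). It does not call GluonTS or PyTorch-TS. The helper `group_targets` is a hypothetical stand-in for a grouping step that keeps only the targets, and all values are made up for illustration.

```python
# Made-up data: two univariate series that share one dynamic real feature
# (for example a holiday indicator) of the same length as the target.
holiday = [0.0, 0.0, 1.0, 0.0]

univariate = [
    {"start": "2020-01-01", "target": [1.0, 2.0, 3.0, 4.0],
     "feat_dynamic_real": [holiday]},
    {"start": "2020-01-01", "target": [10.0, 20.0, 30.0, 40.0],
     "feat_dynamic_real": [holiday]},
]


def group_targets(entries):
    """Hypothetical stand-in for a grouping step: stack the targets into one
    multivariate entry and drop every other field."""
    return {
        "start": entries[0]["start"],
        "target": [entry["target"] for entry in entries],
    }


grouped = group_targets(univariate)

# The covariates did not survive grouping, so attach them again.
# Shape convention assumed here: (number of features, number of time steps).
grouped["feat_dynamic_real"] = [holiday]

# Keyword argument named in the summary above; passing it to the estimator
# is left out because this sketch does not import the library.
estimator_kwargs = {"use_feat_dynamic_real": True}
```

The point of the sketch is the order of operations: group first, then re-attach the features to the grouped entry.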

@kashif is there a reason use_feat_dynamic_cat is not supported in DeepVAR? Isn't holiday information a dynamic (i.e. time-dependent) categorical feature?

kashif commented on June 18, 2024

No. Holidays are converted to dynamic real features and, depending on the kernel you use, a particular date gets smoothed out so that the model knows when that date is approaching and when it has passed. Dynamic categorical features are not used because I have not found a need for them yet, but as soon as I do I will add them.
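To illustrate the idea of smoothing a holiday into a dynamic real feature, here is a minimal standard-library sketch. The kernel and the helper `holiday_feature` are written for this example and are assumptions, not the library's implementation; GluonTS ships its own holiday features and kernels in `gluonts.time_feature.holiday`.

```python
import math
from datetime import date, timedelta


def exponential_kernel(alpha=1.0):
    """Return a function mapping a distance in days to a weight in (0, 1]."""
    def kernel(distance_in_days):
        return math.exp(-alpha * abs(distance_in_days))
    return kernel


def holiday_feature(dates, holiday, kernel):
    """One real value per date: 1.0 on the holiday, decaying on both sides."""
    return [kernel((d - holiday).days) for d in dates]


start = date(2023, 12, 22)
dates = [start + timedelta(days=i) for i in range(7)]
christmas = date(2023, 12, 25)

feature = holiday_feature(dates, christmas, exponential_kernel(alpha=0.5))
```

The resulting list peaks on the holiday and falls off symmetrically, which is what lets a model see that the date is approaching or has just passed. It would go into the dataset as one row of `feat_dynamic_real`.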

kashif commented on June 18, 2024

Thanks for the question.

  • Categorical information is currently not used in the TempFlow model, even though the estimator is initialised to accept it. My reasoning was that the multivariate versions of the open datasets end up being a single time series, so I didn't see a need to distinguish the individual time series. Does that make sense? Also, things like age didn't make sense, since it would be a vector of sorts.

  • Yes, if you do not include any time features, the estimator creates the appropriate time features based on the frequency; see the fourier_time_features_from_frequency_str bit in the estimator.

  • Holiday features are not automatically added.

  • The lags are not automatic either: you need to provide lags_seq to the estimator.
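For readers unfamiliar with what a list of lags means in practice, the following is a minimal sketch in plain Python. The helper `lagged_features` and the data are hypothetical; the sketch only shows how a list such as `lags_seq` selects past values and is not the estimator's own code.

```python
def lagged_features(series, lags_seq, t):
    """Return the values of `series` at times t - lag for each lag in lags_seq."""
    if t - max(lags_seq) < 0:
        raise ValueError("not enough history for the largest lag")
    return [series[t - lag] for lag in lags_seq]


# 30 made-up monthly observations: 100, 101, ..., 129.
monthly = list(range(100, 130))

# Previous month, the month before that, and the same month one year earlier.
lags_seq = [1, 2, 12]

features = lagged_features(monthly, lags_seq, t=24)
```

The largest lag determines how much history is needed before the first usable time step.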

I do not remember which fields are supported but I can check. Hope that helps!

StatMixedML commented on June 18, 2024

Categorical information is currently not used in the TempFlow model, even though the estimator is initialised to accept it. My reasoning was that the multivariate versions of the open datasets end up being a single time series, so I didn't see a need to distinguish the individual time series...

I see your point. I'd assume that incorporating categorical information like State and Industry in the above example adds context for the model, since series within the same State and Industry might be more closely related and the model can pick that up if it is stated explicitly. Also, imagine we want to forecast a new State / Industry combination: putting it into the right "bucket" might improve accuracy. I am not sure, though, to what extent this is already captured by learning the conditional density using normalizing flows. I am also referring to your paper, Section 4.2:

We employ embeddings for categorical features (Charrington, 2018), which allows for relationships within a category, or its context, to be captured while training models. Combining these embeddings as features for time series forecasting yields powerful models like the first place winner of the Kaggle Taxi Trajectory Prediction challenge (De Brébisson et al., 2015).

Is there any other multivariate model available in PyTorch-TS that allows incorporating categorical information?

Many thanks!

kashif commented on June 18, 2024

In the paper I was referring to a situation where we have different multivariate time series, or where the time covariates can be embeddings rather than Fourier features.

Hmm, I think the best approach might be to add categorical embeddings to DeepVAR and use the full multivariate normal or the low-rank output to compare. You can have a look at DeepAR to see how the embedding layer is added.
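As a library-free illustration of what an embedding layer for categorical features does at lookup time, here is a small sketch. It is an assumption-laden toy: the tables are random and never trained, the helpers `make_embedding_tables` and `embed` are hypothetical, and the `cardinality` and `embedding_dimension` values are made up. In a real model the tables would be learned parameters, for example one `torch.nn.Embedding` per feature.

```python
import random


def make_embedding_tables(cardinality, embedding_dimension, seed=0):
    """One lookup table per categorical feature: `card` rows of `dim` floats."""
    rng = random.Random(seed)
    return [
        [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(card)]
        for card, dim in zip(cardinality, embedding_dimension)
    ]


def embed(tables, feat_static_cat):
    """Look up each integer code in its own table and concatenate the rows."""
    vector = []
    for table, code in zip(tables, feat_static_cat):
        vector.extend(table[code])
    return vector


cardinality = [8, 20]         # made-up: unique values per categorical feature
embedding_dimension = [3, 5]  # made-up: embedding size per categorical feature

tables = make_embedding_tables(cardinality, embedding_dimension)
vector = embed(tables, feat_static_cat=[2, 7])
```

The concatenated vector has length `sum(embedding_dimension)` and would be fed to the network alongside the other covariates.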

StatMixedML commented on June 18, 2024

In the paper I was referring to a situation where we have different multivariate time series.

I am not sure I fully understand what you are saying. May I ask you to clarify what "different multivariate time series" means?

kashif commented on June 18, 2024

I meant the situation where you have one multivariate time series from, say, system 1 and another from system 2, etc. (they have to have the same dimension). In that case you could have categorical covariates. Does that make sense?

StatMixedML commented on June 18, 2024

Let's see if I got it right: take the example from above, where we have 133 different time series, one per State/Industry combination. Would that be a single system?

Can you give an example of two different systems, maybe referring to the data set in your paper?

kashif commented on June 18, 2024

Ah OK, so you have 133 different multivariate time series. In that case, if all 133 time series have the same dimension, then yes, the categorical covariates will help.

StatMixedML commented on June 18, 2024

Yes, each series (133 different series in total) has 417 months of training observations (hence all have the same dimension) and is uniquely identified using two keys:

  • State: The Australian state (or territory)
  • Industry: The industry of retail trade

How much of an effort would it be for you to incorporate categorical covariate information into TransformerTempFlowEstimator?

StatMixedML commented on June 18, 2024

How much of an effort would it be for you to incorporate categorical covariate information into TransformerTempFlowEstimator?

Kindly asking for an update...

kashif commented on June 18, 2024

@StatMixedML would you be able to test the issue-3 branch?

StatMixedML commented on June 18, 2024

@kashif Re-reading our discussion, may I ask you to give a specific example (ideally including a data snippet) of your understanding of what a multivariate time series is? I am not sure we are on the same page :-)

So here is one definition:

A Multivariate time series has more than one time-dependent variable. Each variable depends not only on its past values but also has some dependency on other variables. This dependency is used for forecasting future values.

To be more specific, I am currently using the Australian retail trade turnover data set. Each series (133 univariate time-series) has 417 months of training observations and the data have the following columns:

  • State: The Australian state (or territory)
  • Industry: The industry of retail trade
  • Turnover: Retail turnover in $Million AUD
  • Date: Monthly Data (1982-04-01 to 2018-12-01)

The data look as follows in tabular format:

image

A subset plotted looks as follows:

image

So we have different univariate time series, one per State/Industry combination, that together constitute the multivariate data set. I'd assume that incorporating categorical information like State and Industry adds context for the model, since series within the same State and Industry might be more closely related and the model can pick that up if it is stated explicitly. Also, imagine we want to forecast a new State / Industry combination: putting it into the right "bucket" might improve accuracy.

As there is potentially some interdependence between some of the series, I believe DeepVAR and TransformerTempFlowEstimator are a good choice for modelling. I have seen that you've added support for use_feat_dynamic_real/use_feat_dynamic_cat/use_feat_static_cat to DeepVAR. Given that State/Industry doesn't change with time, I would start off and use use_feat_static_cat for testing.
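A minimal sketch of how the long-format table described above could be turned into one entry per State/Industry series with integer-coded static categorical features. The rows and turnover values below are made up purely for illustration and are not taken from the actual data set. The field names `start`, `target` and `feat_static_cat` follow GluonTS conventions, and the snippet itself uses only the standard library.

```python
from collections import defaultdict

# Made-up rows in the column order: State, Industry, Date, Turnover.
rows = [
    ("New South Wales", "Food retailing", "1982-04-01", 1.0),
    ("New South Wales", "Food retailing", "1982-05-01", 2.0),
    ("New South Wales", "Clothing retailing", "1982-04-01", 3.0),
    ("New South Wales", "Clothing retailing", "1982-05-01", 4.0),
    ("Victoria", "Food retailing", "1982-04-01", 5.0),
    ("Victoria", "Food retailing", "1982-05-01", 6.0),
]

# Map each category value to an integer code.
states = sorted({row[0] for row in rows})
industries = sorted({row[1] for row in rows})
state_code = {name: i for i, name in enumerate(states)}
industry_code = {name: i for i, name in enumerate(industries)}

# Collect the observations of each (State, Industry) series.
by_series = defaultdict(list)
for state, industry, day, turnover in rows:
    by_series[(state, industry)].append((day, turnover))

entries = []
for (state, industry), observations in sorted(by_series.items()):
    observations.sort()  # ISO dates sort chronologically as strings
    entries.append({
        "start": observations[0][0],
        "target": [value for _, value in observations],
        "feat_static_cat": [state_code[state], industry_code[industry]],
    })

# Number of unique values per categorical feature, in the same order
# as the codes in feat_static_cat.
cardinality = [len(states), len(industries)]
```

With the full data set there would be 133 entries, and `cardinality` would hold the number of distinct states and industries.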

StatMixedML commented on June 18, 2024

@kashif Just interested in your thoughts on #3 (comment)

lorrp1 commented on June 18, 2024

@StatMixedML have you managed to get it working?

NielsRogge commented on June 18, 2024

@kashif could you please clarify how we can add additional covariate information (which is not defined in create_transformation), such as holiday information, and let the model learn embeddings from them?

Does TransformerTempFlowEstimator support that?

Should we just define them as feat_static_cat/feat_static_real/feat_dynamic_cat and add them when creating the ListDataset?

kashif commented on June 18, 2024

@NielsRogge The normalizing flow models, which were written for the paper and run on the open datasets, did not have additional holiday or, let's say, dynamic real features, so I never added that to the model. However, if you have a look at the DeepVAR model you can see how one can add both categorical and dynamic features to a multivariate model. If it's something you want and cannot do without (e.g. by using DeepVAR with some distribution emission), let me know and I'll try to find some time to add that feature. Hope that helps!

NielsRogge commented on June 18, 2024

Ok so if I understand correctly, the features defined in create_transformation of TransformerTempFlow are used by the model, but if I want to add additional covariate information I should use DeepVAR?

kashif commented on June 18, 2024

Yes @NielsRogge, try to get it working with DeepVAR, as it supports covariates such as categorical and dynamic real features.
