Comments (22)
I will try to get this working for DeepVAR first by today, and then the others afterwards.
from pytorch-ts.
@kashif I'd be glad to, but potentially as of next week
from pytorch-ts.
Thank you for the quick reply.
For people wondering (quick summary):
- The model you use already creates a number of covariates, which are defined in the model's create_transformation function. For example, DeepVAREstimator already creates covariates such as Fourier time features (if you don't provide time_features yourself when initializing the model), age features, and observed values as seen here. Lagged features are also created (as seen by the lags_seq variable) if you don't provide them yourself when initializing the model.
- If you want to add additional covariates (such as holiday information or other dynamic real features), add them to your dataset objects as shown in section 1.3 of GluonTS' extended tutorial.
- If you're using MultivariateGrouper to group the various time series, you have to add the features again after grouping (as shown above).
- When initializing DeepVAREstimator, set use_feat_dynamic_real / use_feat_static_cat / use_feat_static_real to True (depending on which you are using). Also, if you're using categorical features, set cardinality, which is a list containing the number of unique values for each categorical feature, and embedding_dimension, which is a list with the embedding dimensions you want to use for each of the additional features. For each additional categorical feature you add, an embedding layer is created as seen here.
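To make the "add the features again after grouping" step concrete, here is a minimal sketch in plain Python. It does not call GluonTS or pytorch-ts: the grouped entry is just a dict using the GluonTS field names (start, target, feat_dynamic_real), the helper add_dynamic_features is hypothetical, and the numbers are made up. The estimator_kwargs dict only collects the flag named in the summary and is not passed to a real estimator here.

```python
# Toy univariate series (made-up values), all on the same monthly index.
univariate_targets = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [2.0, 1.0, 2.0, 1.0, 2.0, 1.0],
]

# Made-up dynamic real feature, e.g. a holiday indicator per time step.
holiday_indicator = [0.0, 0.0, 1.0, 0.0, 0.0, 1.0]

# After grouping, one entry holds a 2-D target of shape
# (num_series, num_time_steps); per-series features are not carried over.
grouped_entry = {"start": "2020-01-01", "target": univariate_targets}


def add_dynamic_features(entry, feature_rows):
    """Hypothetical helper: return a copy of the entry with feat_dynamic_real.

    feature_rows has shape (num_features, num_time_steps).
    """
    num_steps = len(entry["target"][0])
    for row in feature_rows:
        if len(row) != num_steps:
            raise ValueError("each feature row must match the target length")
    return {**entry, "feat_dynamic_real": feature_rows}


grouped_with_features = add_dynamic_features(grouped_entry, [holiday_indicator])

# Flag from the summary above; shown as a plain dict only.
estimator_kwargs = {"use_feat_dynamic_real": True}
```

Note that at prediction time dynamic features usually have to cover the forecast horizon as well, so the rows would be longer than the training target in that case.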
@kashif is there a reason use_feat_dynamic_cat is not supported in DeepVAR? Isn't holiday information a dynamic (i.e. time-dependent) categorical feature?
from pytorch-ts.
No, the holidays get converted to dynamic real features, and depending on the kernel you use, a particular date gets smoothed out so the model knows when that date is approaching and when it has passed. The reason dynamic cat is not used is that I have never found a need for it yet, but as soon as I do I will add it.
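As a rough illustration of what "smoothed out by a kernel" means, here is a small stdlib-only sketch. It is not the library's implementation: the function names, the choice of a squared-exponential kernel, and the alpha value are assumptions made for the example.

```python
import math
from datetime import date


def squared_exponential_kernel(distance_days, alpha=1.0):
    """Weight that peaks at 1.0 on the holiday and decays on both sides."""
    return math.exp(-alpha * distance_days ** 2)


def holiday_feature(dates, holiday, alpha=1.0):
    """Turn a single holiday into a real-valued feature over the given dates."""
    return [
        squared_exponential_kernel((day - holiday).days, alpha) for day in dates
    ]


christmas = date(2020, 12, 25)
days = [date(2020, 12, d) for d in range(22, 29)]  # 22 Dec .. 28 Dec

# The feature rises as the holiday approaches and falls after it has passed.
feature = holiday_feature(days, christmas, alpha=0.5)
```

A plain 0/1 indicator would only mark the day itself; the smoothed version also gives the model a signal on the neighbouring days.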
from pytorch-ts.
Thanks for the question.
- Categorical information is currently not used in the TempFlow model, even though the estimator is initialised to receive it. My reasoning was that the multivariate versions of the open datasets end up being a single time series, so I didn't see a need to distinguish the individual time series. Does that make sense? Also, things like age didn't make sense since it would be a vector of sorts.
- Yes, if you do not include any time features, it creates the appropriate time features based on the frequency; see the fourier_time_features_from_frequency_str bit in the estimator.
- Holiday features are not automatically added.
- The lags are also not automatic; you need to provide the lags_seq to the estimator.
I do not remember which fields are supported but I can check. Hope that helps!
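For anyone unfamiliar with Fourier time features, the idea can be sketched as follows. This is only an illustration of the general technique for a monthly frequency, not the code behind fourier_time_features_from_frequency_str; the function name and the fixed period of 12 are assumptions.

```python
import math


def fourier_month_features(months, period=12):
    """Encode month numbers (1..12) as a cosine row and a sine row.

    The encoding is cyclic, so December and January end up close together.
    """
    cos_row = [math.cos(2.0 * math.pi * m / period) for m in months]
    sin_row = [math.sin(2.0 * math.pi * m / period) for m in months]
    return [cos_row, sin_row]


# Shape: (2, num_time_steps), one column per observation.
features = fourier_month_features([1, 4, 7, 10, 12])
```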
from pytorch-ts.
Categorical information at the moment is not used in the TempFlow model even though the estimator is initialised to get it. My reasoning was that the multivariate versions of the open datasets end up being a single time series so I didn't see a need to distinguish the individual time series...
I see your point. I'd assume that incorporating categorical information like State and Industry in the above example adds additional context for the model, as series within the same State and Industry might be more related, and that the model is able to pick this up if it is stated explicitly. Also, imagine we want to forecast a new State / Industry combination. Putting them into the right "bucket" might add to accuracy. I am not sure, though, to what extent this is already captured by learning the conditional density using normalizing flows. I am also referring to your paper, Section 4.2:
We employ embeddings for categorical features (Charrington, 2018), which allows for relationships within a category, or its context, to be captured while training models. Combining these embeddings as features for time series forecasting yields powerful models like the first place winner of the Kaggle Taxi Trajectory Prediction challenge (De Brébisson et al., 2015).
Is there any other multivariate model available in PyTorch-TS that allows incorporating categorical information?
Many thanks!
from pytorch-ts.
Here in the paper I was referring to a situation where we have different multivariate time series, or where the time covariates can be embeddings rather than Fourier features.
Hmm, I think the best approach might be to try adding categorical embeddings to DeepVAR and use the full multivariate normal output or the low-rank one to compare. You can have a look at DeepAR to see how the embedding layer is added.
from pytorch-ts.
Here in the paper i was referring to a situation where we have different multi-variate timeseries.
I am not sure I fully understand what you are saying. May I ask you to clarify what "different multi-variate timeseries" means?
from pytorch-ts.
I meant the situation where you have a multivariate time series from, say, system 1 and another from system 2, etc. (they have to have the same number of dimensions). In that case you could have categorical covariates. Does that make sense?
from pytorch-ts.
Let's see if I got it right: take the example from above where we have 133 different time series that are within State/Industry combinations. Would that be a single system?
Can you give an example of two different systems, maybe referring to the data set in your paper?
from pytorch-ts.
ah ok so you have 133 different multi-variate time series... then in that case if all the 133 time series have the same dimension then yes the categorical covariates will help.
from pytorch-ts.
Yes, each series (133 different series in total) has 417 months of training observations (hence all have the same dimension) and is uniquely identified using two keys:
- State: The Australian state (or territory)
- Industry: The industry of retail trade
How much of an effort would it be for you to incorporate categorical covariate information into TransformerTempFlowEstimator?
from pytorch-ts.
How much of an effort would it be for you to incorporate categorical covariate information into TransformerTempFlowEstimator?
Kindly asking for an update...
from pytorch-ts.
@StatMixedML would you be able to test the issue-3 branch?
from pytorch-ts.
@kashif Re-reading our discussion, may I ask you to give a specific example (ideally including a data snippet) of your understanding of what a multivariate time series is? I am not sure we are on the same page :-)
So here is one definition:
A Multivariate time series has more than one time-dependent variable. Each variable depends not only on its past values but also has some dependency on other variables. This dependency is used for forecasting future values.
To be more specific, I am currently using the Australian retail trade turnover data set. Each series (133 univariate time-series) has 417 months of training observations and the data have the following columns:
- State: The Australian state (or territory)
- Industry: The industry of retail trade
- Turnover: Retail turnover in $Million AUD
- Date: Monthly Data (1982-04-01 to 2018-12-01)
The data look as follows in tabular format:
A subset plotted looks as follows:
So we have different univariate time-series combinations of State/Industry that constitute the multivariate data set. I'd assume that incorporating categorical information like State and Industry adds additional context for the model, as series within the same State and Industry might be more related and the model is able to pick this up if it is stated explicitly. Also, imagine we want to forecast a new State / Industry combination. Putting them into the right "bucket" might add accuracy.
As there is potentially some interdependence between some of the series, I believe DeepVAR and TransformerTempFlowEstimator are a good choice for modelling. I have seen that you've added support for use_feat_dynamic_real/use_feat_dynamic_cat/use_feat_static_cat to DeepVAR. Given that State/Industry doesn't change with time, I would start off and use use_feat_static_cat for testing.
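To illustrate what passing State/Industry as static categorical features would involve, here is a minimal stdlib-only sketch. The three State/Industry labels are illustrative stand-ins rather than an excerpt of the real data, build_vocab is a hypothetical helper, and the resulting lists are only shown next to the field and parameter names mentioned in this thread (feat_static_cat, cardinality); nothing is passed to an actual estimator.

```python
# Illustrative (State, Industry) keys; the real data set has 133 combinations.
series_keys = [
    ("New South Wales", "Food retailing"),
    ("New South Wales", "Clothing retailing"),
    ("Victoria", "Food retailing"),
]


def build_vocab(values):
    """Hypothetical helper: map each distinct label to an integer code."""
    return {label: code for code, label in enumerate(sorted(set(values)))}


state_vocab = build_vocab(state for state, _ in series_keys)
industry_vocab = build_vocab(industry for _, industry in series_keys)

# One [state_code, industry_code] pair per series.
feat_static_cat = [
    [state_vocab[state], industry_vocab[industry]]
    for state, industry in series_keys
]

# Number of unique values per categorical feature, in the same order.
cardinality = [len(state_vocab), len(industry_vocab)]
```

A new State/Industry combination can reuse the existing codes as long as both labels were seen during training; an unseen label would need its own code and a larger cardinality.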
from pytorch-ts.
@kashif Just interested in your thoughts on #3 (comment)
from pytorch-ts.
@StatMixedML have you managed to get it working?
from pytorch-ts.
@kashif could you please clarify how we can add additional covariate information (which is not defined in create_transformation), such as holiday information, and let the model learn embeddings from it?
Does TransformerTempFlowEstimator support that?
Should we just define them as feat_static_cat / feat_static_real / feat_dynamic_cat and add them when creating the ListDataset?
from pytorch-ts.
@NielsRogge the normalizing flow models, which were built for the paper and run on the open datasets, did not have additional holiday or, let's say, dynamic real features, so I never added that to the model. However, if you have a look at the DeepVAR model, you can see how one can add both categorical and dynamic features to a multivariate model. If it's something you want and cannot do without (e.g. by using DeepVAR with some distribution emission), then let me know and I'll try to find some time to add that feature. Hope that helps!
from pytorch-ts.
Ok so if I understand correctly, the features defined in create_transformation of TransformerTempFlow are used by the model, but if I want to add additional covariate information I should use DeepVAR?
from pytorch-ts.
Yes @NielsRogge, try to get it working with DeepVAR, as it supports covariates like categorical and dynamic real features.
from pytorch-ts.