
Comments (10)

kashif commented on June 15, 2024

Thanks @AlexMRuch, having a look!


kashif commented on June 15, 2024

@AlexMRuch thanks for the suggestions! I agree it needs more documentation. I have pushed an M5 data loader here which shows how one can create a dataset with categorical and dynamic features; it also shows how I created the metadata file.
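For reference, a rough sketch of such a dataset entry (not the actual M5 loader; the field names follow the GluonTS-style convention used by pts, and all values here are placeholders):

import numpy as np
from pts.dataset import ListDataset  # newer versions: gluonts.dataset.common

T = 100  # number of daily observations (illustrative)
sales = np.random.poisson(5.0, size=T).astype(float)  # 1D target series
prices = np.random.rand(1, T)  # one dynamic real feature, shape (num_features, T)

train_ds = ListDataset(
    [{
        "target": sales,
        "start": "2011-01-29",
        "feat_static_cat": [0, 3],  # e.g. encoded item and store ids (placeholders)
        "feat_dynamic_real": prices,
    }],
    freq="1D",
)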

Do you think it's worth having a quick call?


kashif commented on June 15, 2024

BTW, to fix the issue in your notebook you can set prediction_length equal to the number of extra days in the test dataset (for calculating metrics), or any length > 1. So for your example:

# Define the DL time series model
from pts.model.deepar import DeepAREstimator
from pts import Trainer

estimator = DeepAREstimator(
    freq=FREQ,
    prediction_length=time_range_full - time_range_split,  # or anything > 1
    input_size=37,
    trainer=Trainer(
        epochs=10,
        device=DEVICE
    )
)

and then everything else in your notebook should work.
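Training then returns a predictor as usual (a sketch, assuming your existing training_data object):

predictor = estimator.train(training_data=training_data)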


kashif commented on June 15, 2024

So the multivariate grouper is just a helper to make multivariate time series out of the open datasets, which are all univariate. If you prepare the dataset yourself, you can just build it via ListDataset, but then the target will be a 2D array and you will need to set the flag one_dim_target=False.
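A minimal sketch of that (the series values here are random placeholders; the import path may be gluonts.dataset.common in newer versions):

import numpy as np
from pts.dataset import ListDataset

target = np.random.rand(3, 100)  # 3 series stacked into a 2D target, 100 time steps each

train_ds = ListDataset(
    [{"target": target, "start": "2020-01-01"}],
    freq="1D",
    one_dim_target=False,  # required for a 2D target
)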



AlexMRuch commented on June 15, 2024

Thanks so much, @kashif! Really appreciate it!


AlexMRuch commented on June 15, 2024

I just wanted to follow up on my issue above and note that a substantial barrier to more widespread adoption of pytorch-ts may be that there just isn't enough commenting and instruction in your examples, or documentation for the library for that matter.

For example, I am working through the Multivariate-Flow-Solar.ipynb notebook now, and the cells within "Prepare data set" are very uninformative, especially if you're working with a dataset that isn't built into the library (as most people will be, including myself).

Even having comments for things like what MultivariateGrouper

train_grouper = MultivariateGrouper(max_target_dim=int(dataset.metadata.feat_static_cat[0].cardinality))

does (and how it differs from classes like ListDataset) would be tremendously helpful. However, when setting this up on my own data, I'm not sure how to even get my multivariate data into a pts dataset form that has the metadata attribute.

It's very frustrating and really makes it hard for me to justify using the library more (e.g., if it's this hard just to make a dataset, I can be pretty confident that lots of other things are going to be a challenge).

I am more than happy to use this library more and to push contributions; however, the learning curve for getting things even up and running with the presented examples has been a challenge πŸ˜•

Thanks for your patience and help with this. I really am excited to use this library and potentially to help it develop, but right now I'm kind of stuck: unless I go through each of the specific modules to tease apart what everything does, and how and when, I can only go with what's on the repo so far. It's a little perplexing given how well documented flair is (https://github.com/flairNLP/flair); then again, that may be why flair has over 9k stars and 400+ users – you can pick it up and run with it in under an hour. Would really love to see pytorch-ts get to that stage too (and maybe help in the process) πŸ˜„


kashif commented on June 15, 2024

So note that predictor.predict(test_data) will generate forecasts starting from where test_data ends, for prediction_length time points:

import matplotlib.pyplot as plt
from pts.dataset import to_pandas  # newer versions: gluonts.dataset.util

for test_entry, forecast in zip(test_data, predictor.predict(test_data)):
    to_pandas(test_entry)[-40:].plot(linewidth=2)  # last 40 observations
    forecast.plot(color='g', prediction_intervals=[50.0, 90.0])
plt.grid(which='both')

[plot: forecasts with 50%/90% prediction intervals extending beyond the end of the test series]

However, if you want to generate predictions in the test time range, you need to use the make_evaluation_predictions helper:

from pts.evaluation import make_evaluation_predictions  # newer versions: gluonts.evaluation

forecast_it, ts_it = make_evaluation_predictions(
    dataset=test_data,  # test dataset
    predictor=predictor,  # predictor
    num_samples=100,  # number of sample paths we want for evaluation
)

forecasts = list(forecast_it)
tss = list(ts_it)

and then you can plot the predictions for each entry in your dataset together with the unseen test data via something like:

def plot_prob_forecasts(ts_entry, forecast_entry):
    plot_length = 50
    prediction_intervals = (50.0, 90.0)
    legend = ["observations", "median prediction"] + [f"{k}% prediction interval" for k in prediction_intervals][::-1]

    fig, ax = plt.subplots(1, 1, figsize=(10, 7))
    ts_entry[-plot_length:].plot(ax=ax)  # plot the time series
    forecast_entry.plot(prediction_intervals=prediction_intervals, color='g')
    plt.grid(which="both")
    plt.legend(legend, loc="upper left")
    plt.show()

plot_prob_forecasts(tss[0], forecasts[0])

[plot: observations with the median prediction and 50%/90% prediction intervals over the test range]


AlexMRuch commented on June 15, 2024

Ah, yes! Your note on updating prediction_length did the trick! πŸŽ‰

I was under the impression that prediction_length was "how far into the future to predict given the freq parameter," so 1 in my case was meant to imply one day. So, to clarify: time_range_full - time_range_split == 40 (as of today), which is 40 days ahead (Oct. 27th, aligning with your forecast). Was my issue simply that prediction_length needs to be more than 1 day (e.g., 7 or 14 days)? I was just curious what your logic was for choosing time_range_full - time_range_split. I know you said this is done to set the prediction window "equal to the extra days in the test dataset," but I'm not sure if that's a general suggestion or something specific to pytorch-ts, given that 7 and 14 also work well. (Most of my work is with non-time-series DL, and my longitudinal data analysis background is with mixed effects modeling, so thanks for your thoughts here.)

I'm still unclear about why input_size is 37, given that my training_data object has 157 days and that each day only has one variable so far (a single float). I tried checking the source code for the model (https://github.com/zalandoresearch/pytorch-ts/blob/master/pts/model/deepar/deepar_estimator.py#L36), but it didn't describe it, so I'm not sure if this is a static variable or if I'll have to update it when I rerun this in the future (when more days are in the dataset).

Very helpful details on MultivariateGrouper and setting one_dim_target=False – thanks!

I can't even begin to thank you for the advice and examples on make_evaluation_predictions and plot_prob_forecasts – seriously, thank you.

Is there a way you know of offhand to plot both the make_evaluation_predictions and the predictor.predict() predictions in the same figure, to show evaluation and forecasting together?

Also, how can we get clear evaluation accuracy reports (e.g., RMSE, etc.) for the evaluation above? Is there a built-in pytorch-ts method?

Thanks again, @kashif!

I am going to give the multivariate and m5 notebooks a try next week (I can either post new issues should I hit them, or post them here since you laid out some tips above), but I would love to set up a time to chat with you in the near future about the library. I followed you on Twitter and can email you as well if you'd prefer.


kashif commented on June 15, 2024

Cool @AlexMRuch, let's try to catch up next week. So to answer some of your questions here:

Yes, prediction_length needs to be more than 1, and I chose it to be the number of extra days in your test set so that I can compare the forecasts to the ground truth. I could have set the prediction length to another number, but then the test dataset would not be of much use. So if you want to obtain metrics on the prediction compared to the ground-truth test set, it's a good idea to set prediction_length to the size by which the test set is larger. You typically set it to the number of time steps you would like to forecast for a specific problem or dataset.
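In code, that choice looks like this (a sketch using the numbers from this thread, assuming time_range_split marks the end of the 157-day training range):

# Choose the horizon so the forecasts line up with the held-out ground truth:
time_range_split = 157  # last day covered by the training data
time_range_full = 197   # last day with ground truth available
prediction_length = time_range_full - time_range_split  # = 40 days at freq "1D"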

So input_size is the size of the feature vector, which consists of the 1-dim target (as you correctly stated) together with other covariates: time-varying features like the encoding of the current time point, lag features, embeddings of a particular time series, etc. So for your current experiment, even though the target is 1-dim at every time point, the other features end up giving you a total feature size of 37. This is a bit clunky in PyTorch... I should be able to calculate it from the parameters of the dataset and other parameters of the model, but I haven't gotten around to it.
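As a rough illustration of how such a total can arise (the counts below are hypothetical placeholders, not the actual DeepAR defaults):

# Hypothetical decomposition of the per-step feature vector; the real
# counts depend on freq, lags_seq, and the embedding sizes:
n_lags = 27        # lagged target values fed at each step (placeholder)
n_time_feats = 4   # e.g. day-of-week, day-of-month, ... (placeholder)
n_embed = 5        # static categorical embedding dims (placeholder)
n_scale = 1        # e.g. a log-scale feature (placeholder)
input_size = n_lags + n_time_feats + n_embed + n_scale  # = 37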

So to get the metrics for univariate models you do:

from pts.evaluation import Evaluator
import json

evaluator = Evaluator()
agg_metrics, item_metrics = evaluator(iter(tss), iter(forecasts), num_series=len(test_data))

and that returns a tuple of metrics aggregated over all the time series of your dataset, as well as metrics for each individual series, e.g.

print(json.dumps(agg_metrics, indent=4))

{
    "MSE": 13505.071875,
    "abs_error": 3810.694580078125,
    "abs_target_sum": 26724.0,
    "abs_target_mean": 668.1,
    "seasonal_error": 492.5,
    "MASE": 0.19343627310041245,
    "MAPE": 0.1368498585666218,
    "sMAPE": 0.14887559716391757,
    "OWA": NaN,
    "MSIS": 2.412104219543147,
    "QuantileLoss[0.1]": 1330.3681396484376,
    "Coverage[0.1]": 0.075,
    "QuantileLoss[0.2]": 2218.410595703125,
    "Coverage[0.2]": 0.075,
    "QuantileLoss[0.3]": 2886.637915039062,
    "Coverage[0.3]": 0.15,
    "QuantileLoss[0.4]": 3453.1165405273437,
    "Coverage[0.4]": 0.15,
    "QuantileLoss[0.5]": 3810.694549560547,
    "Coverage[0.5]": 0.2,
    "QuantileLoss[0.6]": 4070.61484375,
    "Coverage[0.6]": 0.2,
    "QuantileLoss[0.7]": 4184.155157470703,
    "Coverage[0.7]": 0.275,
    "QuantileLoss[0.8]": 3994.054711914063,
    "Coverage[0.8]": 0.35,
    "QuantileLoss[0.9]": 3354.4697082519533,
    "Coverage[0.9]": 0.425,
    "RMSE": 116.21132421154145,
    "NRMSE": 0.17394300884828837,
    "ND": 0.1425944686453422,
    "wQuantileLoss[0.1]": 0.04978177442180952,
    "wQuantileLoss[0.2]": 0.08301192170719672,
    "wQuantileLoss[0.3]": 0.10801668593919556,
    "wQuantileLoss[0.4]": 0.12921406004068792,
    "wQuantileLoss[0.5]": 0.14259446750338822,
    "wQuantileLoss[0.6]": 0.15232056742067057,
    "wQuantileLoss[0.7]": 0.15656919463668248,
    "wQuantileLoss[0.8]": 0.14945572189470374,
    "wQuantileLoss[0.9]": 0.12552274016808687,
    "mean_wQuantileLoss": 0.12183190374804684,
    "MAE_Coverage": 0.2888888888888889
}
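The per-series numbers come back in item_metrics, a pandas DataFrame with one row per time series, so you can inspect it the usual way:

print(item_metrics.head())  # per-series MSE, MASE, sMAPE, quantile losses, ...
print(item_metrics["MSE"].describe())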

hope that helps!

