mattsherar / temporal_fusion_transform

Pytorch Implementation of Google's TFT

Python 45.36% Jupyter Notebook 54.64%

temporal_fusion_transform's Introduction

Temporal_Fusion_Transform

Original Github link: https://github.com/google-research/google-research/tree/master/tft

Paper link: https://arxiv.org/pdf/1912.09363.pdf

Abstract Multi-horizon forecasting problems often contain a complex mix of inputs -- including static (i.e. time-invariant) covariates, known future inputs, and other exogenous time series that are only observed historically -- without any prior information on how they interact with the target. While several deep learning models have been proposed for multi-step prediction, they typically comprise black-box models which do not account for the full range of inputs present in common scenarios. In this paper, we introduce the Temporal Fusion Transformer (TFT) -- a novel attention-based architecture which combines high-performance multi-horizon forecasting with interpretable insights into temporal dynamics. To learn temporal relationships at different scales, the TFT utilizes recurrent layers for local processing and interpretable self-attention layers for learning long-term dependencies. The TFT also uses specialized components for the judicious selection of relevant features and a series of gating layers to suppress unnecessary components, enabling high performance in a wide range of regimes. On a variety of real-world datasets, we demonstrate significant performance improvements over existing benchmarks, and showcase three practical interpretability use-cases of TFT.

temporal_fusion_transform's Issues

About the code for the add&norm

The code is very clear to read, but one line in the TFT class confuses me:

self.post_lstm_norm = TimeDistributed(nn.BatchNorm1d(self.hidden_size))

Is plain batch norm okay for the add & norm step there? I also don't see the batch_first parameter being set. Thanks for your kind reply.
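
For comparison, the TFT paper describes the add & norm step with layer normalization rather than batch normalization. The following is a minimal sketch of that variant, assuming inputs shaped (batch, time, hidden); the class name AddNorm and the tensor sizes are hypothetical and not taken from this repository.

```python
import torch
import torch.nn as nn


class AddNorm(nn.Module):
    """Residual addition followed by layer normalization (hypothetical helper)."""

    def __init__(self, hidden_size):
        super().__init__()
        # LayerNorm normalizes over the last dimension only.
        self.norm = nn.LayerNorm(hidden_size)

    def forward(self, x, skip):
        return self.norm(x + skip)


torch.manual_seed(0)
x = torch.randn(4, 10, 16)     # (batch, time, hidden), made-up sizes
skip = torch.randn(4, 10, 16)  # residual branch with the same shape
out = AddNorm(16)(x, skip)
print(out.shape)
```

Because nn.LayerNorm acts on the last dimension, it can be applied to a 3-D tensor directly, so neither a TimeDistributed wrapper nor a batch_first argument is needed in this form.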

I get an error when I run model.forward(batch)

I ran the following code and it raised an error:

output, encoder_output, decoder_output, \
attn, attn_output_weights, \
static_embedding, embeddings_encoder, embeddings_decoder = model.forward(batch)

How can I solve this problem?

A question about the paper: does it use past targets?

The task is formalized as

Here the past targets y are part of the inputs.

But in the figure, it seems that past targets are not in this model.

where

So where are the past targets (y)?

Hello! I get an error when I run trainer.ipynb

When I run trainer.ipynb, I get an error. The following call:

elect = ts_dataset.TSDataset(id_col, time_col, input_cols,
target_col, time_steps, max_samples,
input_size, num_encoder_steps, output_size, train)
raises the following error:

Since the definition of TSDataset is

the call is actually missing two items.
I would appreciate it very much if you could help me with this problem!

Is it possible to achieve the prediction accuracy in the paper using this TFT code?

I downloaded the traffic experiment data and tried to modify the TFT code for prediction, but I can't reach the prediction accuracy reported in the paper. Could the author please share the full prediction code?

Does this model use the original Multi-head attention?

The code in tft_model.py has:
self.multihead_attn = nn.MultiheadAttention(self.hidden_size, self.attn_heads)

So it uses the standard multi-head attention (PyTorch's nn.MultiheadAttention), not the Interpretable Multi-Head Attention from the TFT paper?
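
For reference, the paper's interpretable variant differs from standard multi-head attention in two ways: the value projection is shared across heads, and the head outputs are averaged rather than concatenated. Below is a minimal sketch of that idea, not this repository's code. The class name, the bias-free projections, the per-head size d_model // n_heads, and the batch-first layout are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


class InterpretableMultiHeadAttention(nn.Module):
    """Sketch: per-head Q/K projections, one shared V projection, averaged heads."""

    def __init__(self, d_model, n_heads):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.q_proj = nn.Linear(d_model, n_heads * self.d_head, bias=False)
        self.k_proj = nn.Linear(d_model, n_heads * self.d_head, bias=False)
        self.v_proj = nn.Linear(d_model, self.d_head, bias=False)  # shared by all heads
        self.out_proj = nn.Linear(self.d_head, d_model, bias=False)

    def forward(self, q, k, v, mask=None):
        b, tq, _ = q.shape
        tk = k.size(1)
        # (batch, heads, time, d_head)
        qh = self.q_proj(q).view(b, tq, self.n_heads, self.d_head).transpose(1, 2)
        kh = self.k_proj(k).view(b, tk, self.n_heads, self.d_head).transpose(1, 2)
        vs = self.v_proj(v).unsqueeze(1)  # (batch, 1, tk, d_head), broadcast over heads
        scores = qh @ kh.transpose(-2, -1) / self.d_head ** 0.5
        if mask is not None:
            # mask is boolean; True marks positions that must not be attended to
            scores = scores.masked_fill(mask, float("-inf"))
        attn = torch.softmax(scores, dim=-1)  # (batch, heads, tq, tk)
        heads = attn @ vs                     # (batch, heads, tq, d_head)
        # Average over heads instead of concatenating them.
        return self.out_proj(heads.mean(dim=1)), attn.mean(dim=1)


torch.manual_seed(0)
x = torch.randn(2, 6, 16)  # (batch, time, d_model), made-up sizes
causal_mask = torch.triu(torch.ones(6, 6, dtype=torch.bool), diagonal=1)
layer = InterpretableMultiHeadAttention(d_model=16, n_heads=4)
out, weights = layer(x, x, x, mask=causal_mask)
print(out.shape, weights.shape)
```

Since every head attends over the same value vectors, the averaged weights form a single attention pattern per time step, which is what makes them directly readable.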

Incorrect loss function

I believe the loss function is incorrect.
Currently calculated:

errors = target[:, i] - preds[:, i]
losses.append(torch.max((q-1) * errors, q * errors ).unsqueeze(1))

However, the Google GitHub repository calculates the loss as follows:
prediction_underflow = y - y_pred
q_loss = quantile * tf.maximum(prediction_underflow, 0.) + (1. - quantile) * tf.maximum(-prediction_underflow, 0.)

The correct calculation should be something like this:

losses.append(
    q * torch.max(errors, torch.zeros_like(errors)) + (1. - q) * torch.max(-errors, torch.zeros_like(errors))
)
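
As a sanity check, the two expressions can be compared numerically. The sketch below uses hypothetical function names and random errors. For a quantile q between 0 and 1 the two forms should agree elementwise: when errors is non-negative the max picks q * errors, and otherwise it picks (q-1) * errors, which equals (1-q) * (-errors). If the results differ in practice, the cause may lie elsewhere, for example in how the losses are indexed or reduced.

```python
import torch


def pinball_max_form(errors, q):
    # Single-max form, as in the snippet quoted from this repository.
    return torch.max((q - 1) * errors, q * errors)


def pinball_split_form(errors, q):
    # Two-term form, mirroring the TensorFlow expression quoted above.
    zeros = torch.zeros_like(errors)
    return q * torch.max(errors, zeros) + (1.0 - q) * torch.max(-errors, zeros)


torch.manual_seed(0)
errors = torch.randn(1000, dtype=torch.float64)  # stand-in for target - preds
quantiles = [0.1, 0.5, 0.9]

# Largest elementwise gap between the two forms, per quantile.
max_abs_diff = {
    q: (pinball_max_form(errors, q) - pinball_split_form(errors, q)).abs().max().item()
    for q in quantiles
}
print(max_abs_diff)
```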

Column "hours_from_start" is used 2x in the column definition

See:

Temporal_Fusion_Transform/data_formatters/traffic.py

Line 47 in 4fd9fdb

('hours_from_start', DataTypes.REAL_VALUED, InputTypes.TIME),

And:

Temporal_Fusion_Transform/data_formatters/traffic.py

Line 51 in 4fd9fdb

('hours_from_start', DataTypes.REAL_VALUED, InputTypes.KNOWN_INPUT),
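
A quick way to spot repeated names in a column definition is to count them. This is a small sketch under stated assumptions: plain strings stand in for the DataTypes and InputTypes enums, only the two 'hours_from_start' tuples come from the lines cited above, and the other entries and the helper name are made up for illustration.

```python
from collections import Counter

# Illustrative column definition; strings replace the repository's enums.
column_definition = [
    ("id", "REAL_VALUED", "ID"),                         # assumed entry
    ("hours_from_start", "REAL_VALUED", "TIME"),         # cited at line 47
    ("values", "REAL_VALUED", "TARGET"),                 # assumed entry
    ("hours_from_start", "REAL_VALUED", "KNOWN_INPUT"),  # cited at line 51
]


def find_duplicate_columns(definition):
    """Return column names that appear more than once, in first-seen order."""
    counts = Counter(name for name, _, _ in definition)
    return [name for name, n in counts.items() if n > 1]


duplicates = find_duplicate_columns(column_definition)
print(duplicates)
```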

Key Error: 'seq_length' & IndexError: index out of range in self

@mattsherar

When running the trainer notebook, in the code block where keys are being added to the 'config' dictionary, a 'seq_length' key was not added.

Do you know what value this key should hold?

When blindly testing with values related to the previous code & dataset (ex. 1000 b/c of max_samples, 192 b/c of time_steps, etc.), I am receiving the following: IndexError: index out of range in self

This happens when running the following piece of code:

I will continue to look into the issue but any help would be appreciated.

Could you please attach a requirements file listing the necessary libraries?

Hi, I would like to know which libraries need to be installed. Are they the same as the requirements of the original TensorFlow repository?

TFT_Dataset: Does the target variable need to be added to the dataset explicitly?

Hi all!

I was just wondering whether the target variable needs to be added to the dataset explicitly as "time_varying_unknown_reals", or whether the TFT considers the historic values of the target variable anyway, since I define it as "target" when creating the TimeSeriesDataSet object.

The question arose when I computed a prediction with and without explicitly adding the target to the dataset: sometimes the accuracy was actually better without it, and sometimes it made no difference.

Thanks a lot in advance!

seq_length has not been defined in trainer.ipynb config dict

Hi,
The config dict for the data in your trainer.ipynb notebook is as follows:

config = {}
config['static_variables'] = len(static_cols)
config['time_varying_categoical_variables'] = 1
config['time_varying_real_variables_encoder'] = 3#4
config['time_varying_real_variables_decoder'] = 2#3
config['num_masked_series'] = 1
config['static_embedding_vocab_sizes'] = [369]
config['time_varying_embedding_vocab_sizes'] = [369]
config['embedding_dim'] = 8
config['lstm_hidden_dimension'] = 160
config['lstm_layers'] = 1
config['dropout'] = 0.05
config['device'] = 'cpu'
config['batch_size'] = 64
config['encode_length'] = 168
config['attn_heads'] = 4
config['num_quantiles'] = 3
config['vailid_quantiles'] = [0.1,0.5,0.9]

seq_length is not passed in the config dict when initialising the TFT:

model = tft_model.TFT(config)

However, seq_length is required in the class:

class TFT(nn.Module):
    def __init__(self, config):
        super(TFT, self).__init__()

         ........

        self.seq_length = config['seq_length']

I've tried seq_length values from 0 to 1000, but for all of them I get the error below.

IndexError                                Traceback (most recent call last)
<ipython-input-280-9bd3ca615a58> in <module>
      1 output,encoder_output, decoder_output, \
      2 attn,attn_output_weights, \
----> 3 static_embedding, embeddings_encoder, embeddings_decoder = model.forward(batch)

~\Downloads\tft_model.py in forward(self, x)
    346         ##Embedding and variable selection
    347         static_embedding = torch.cat(embedding_vectors, dim=1)
--> 348         embeddings_encoder = self.apply_embedding(x['inputs'][:,:self.encode_length,:].float().to(self.device), static_embedding, apply_masking=False)
    349         embeddings_decoder = self.apply_embedding(x['inputs'][:,self.encode_length:,:].float().to(self.device), static_embedding, apply_masking=True)
    350         embeddings_encoder, encoder_sparse_weights = self.encoder_variable_selection(embeddings_encoder[:,:,:-(self.embedding_dim*self.static_variables)],embeddings_encoder[:,:,-(self.embedding_dim*self.static_variables):])

~\Downloads\tft_model.py in apply_embedding(self, x, static_embedding, apply_masking)
    299         time_varying_categoical_vectors = []
    300         for i in range(self.time_varying_categoical_variables):
--> 301             emb = self.time_varying_embedding_layers[i](x[:, :,self.time_varying_real_variables_encoder+i].view(x.size(0), -1, 1).long())
    302             time_varying_categoical_vectors.append(emb)
    303         time_varying_categoical_embedding = torch.cat(time_varying_categoical_vectors, dim=2)

C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py in _call_impl(self, *input, **kwargs)
   1128         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1129                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130             return forward_call(*input, **kwargs)
   1131         # Do not call functions when jit is used
   1132         full_backward_hooks, non_full_backward_hooks = [], []

~\Downloads\tft_model.py in forward(self, x)
     48         x_reshape = x.contiguous().view(-1, x.size(-1))  # (samples * timesteps, input_size)
     49 
---> 50         y = self.module(x_reshape)
     51 
     52         # We have to reshape Y

C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py in _call_impl(self, *input, **kwargs)
   1128         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1129                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130             return forward_call(*input, **kwargs)
   1131         # Do not call functions when jit is used
   1132         full_backward_hooks, non_full_backward_hooks = [], []

C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\sparse.py in forward(self, input)
    156 
    157     def forward(self, input: Tensor) -> Tensor:
--> 158         return F.embedding(
    159             input, self.weight, self.padding_idx, self.max_norm,
    160             self.norm_type, self.scale_grad_by_freq, self.sparse)

C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\functional.py in embedding(input, weight, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse)
   2197         # remove once script supports set_grad_enabled
   2198         _no_grad_embedding_renorm_(weight, input, max_norm, norm_type)
-> 2199     return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
   2200 
   2201 

IndexError: index out of range in self
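
The last frame of the traceback is torch.embedding, which on CPU raises this IndexError when an input index is negative or not smaller than the embedding's num_embeddings. That suggests checking the categorical column against the vocabulary size, which may matter more here than the value of seq_length. The sketch below is a hypothetical helper, not part of this repository; the embedding sizes are taken from the config above and the input values are made up.

```python
import torch
import torch.nn as nn


def find_out_of_range_indices(indices, embedding):
    """Return the distinct index values that the embedding layer cannot look up."""
    flat = indices.reshape(-1).long()
    bad = (flat < 0) | (flat >= embedding.num_embeddings)
    return torch.unique(flat[bad]).tolist()


# Vocabulary size 369 and embedding_dim 8, as in the config above.
embedding = nn.Embedding(num_embeddings=369, embedding_dim=8)

# Made-up categorical column of shape (batch, time); valid ids are 0..368.
categorical_column = torch.tensor([[0, 5, 368], [12, 369, 400]])

bad_values = find_out_of_range_indices(categorical_column, embedding)
print(bad_values)  # values that would trigger the IndexError
```

If the helper reports values on real data, either the vocabulary size is too small or the column being indexed is not the categorical one that was intended.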

Can Temporal Fusion Transform deal with variable-length inputs?

Does the model have a masked embedding layer that deals with inputs that have different time_steps? Thanks!

Can this project reproduce the results in the corresponding paper?

Hello, I am very interested in your code and would like to ask whether your implementation can reproduce the results in the paper.